Podcasts about Andrej Karpathy

AI researcher at Tesla

121 podcasts
200 episodes
48 min average duration
1 episode per week
Latest episode: Nov 15, 2025

POPULARITY (chart by year, 2017-2024)

Best podcasts about Andrej Karpathy

Dave Lee on Investing

10 episodes with Andrej Karpathy

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

12 episodes with Andrej Karpathy

Tesla Daily: Tesla News & Analysis

6 episodes with Andrej Karpathy

The Nonlinear Library

10 episodes with Andrej Karpathy

Lex Fridman Podcast

3 episodes with Andrej Karpathy

Ride the Lightning: Tesla Motors Unofficial Podcast

3 episodes with Andrej Karpathy

The AI Breakdown: Daily Artificial Intelligence News and Discussions

3 episodes with Andrej Karpathy

Lenny's Podcast: Product | Growth | Career

2 episodes with Andrej Karpathy

EV News Daily - Electric Car Podcast

2 episodes with Andrej Karpathy

Latest podcast episodes about Andrej Karpathy

Vibe Coding: Revolution and Challenges of AI-Assisted Programming

Jorge Borges

Nov 15, 2025 (13:22)

The podcast presents a detailed analysis of "vibe coding", a new paradigm in AI-assisted software development in which programmers describe their intent in natural language and the AI generates the code. The phenomenon, popularized by a tweet from Andrej Karpathy in 2025 and recognized as Word of the Year by Collins Dictionary, represents a democratization of programming and an unprecedented increase in prototyping speed, with startups reaching extraordinary milestones. However, the text also emphasizes the critical challenges of this approach, such as serious security vulnerabilities in generated code, difficulties with maintenance and debugging, and criticism from experts such as Andrew Ng about trivializing the intellectual effort required. Ultimately, the future of vibe coding is seen as a hybrid model in which humans act as architects and reviewers, ensuring the quality and security of the code.


'Vibe coding' makes word of the year by Collins, but what does it mean?

Tech and Science Daily | Evening Standard

Nov 6, 2025 (11:36)

Ever heard of “vibe coding”? It's been named Word of the Year by Collins Dictionary, but what does it mean? You can thank OpenAI's co-founder Andrej Karpathy, who came up with the phrase.

The World Weather Attribution has released new data revealing that climate change significantly amplified Hurricane Melissa's destructive winds and rainfall. We speak to the rapid study's co-author, climate scientist Theodore Keeping, from the World Weather Attribution team at Imperial College London.

Three Chinese astronauts are stuck in space for longer than expected, after an unidentified object hits the return spacecraft.

Also in this episode: UK energy supplier Tomato Energy has collapsed; Prince William honours young environmentalists at Earthshot Prize; the newly described species of toads that give birth to fully formed toadlets; AI chatbots "suffer from brainrot" too.


Why This CTO Says AI Coding Agents Are “Insidious”, Overhyped, and Nowhere Near Replacing Human Engineers

Crafted

Oct 31, 2025 (25:07)

AI coding assistants promise to write your code, speed up your sprint, and maybe even make engineers obsolete. But what if the people building with them every day see something very different?

In this special Halloween edition of CRAFTED. — which also marks the show's third anniversary! — a masked CTO shares what he can't say publicly: that these tools are powerful, but insidious. In his view, coding assistants are great for auto-complete, but they can't do what a human engineer does. He says they're terrible at starting from scratch and will often suggest code that “works in vacuum”, but not in context. And because AI can write so much code, so quickly, it's hard to catch errors. In short, he sees an increase in short-term velocity, at the expense of increased defects and an increasing dependency on systems that are untrustworthy.

I want to emphasize that this episode features the experience of one very experienced person. There are obviously others who disagree, who say AI coding agents are incredible, so long as they're managed well. However, there are also an increasing number of people questioning the sustainability of coding agents — they're incredibly expensive to run — and also how good they are in the first place.

For example, Andrej Karpathy, the guy who literally coined the phrase "vibe coding" and was early at OpenAI and Tesla, just said publicly on Dwarkesh Podcast that the path to AI agents is going to be a lot slower than people in the industry think it will be. He said coding agents are "not that good at writing code that's never been written before" and that there is too much hype right now about where AI really is, with people in the industry, quote, "trying to pretend like this is amazing, when it's not." And he said: "My Claude Code or Codex still feels like this elementary-grade student." Today's guest agrees with Karpathy on a lot of this.

Our guest has worked at startups, scale-ups, and big tech companies you've definitely heard of, and today he's at a very AI-forward company and using AI coding tools every day. Enjoy this special episode of CRAFTED.!

---
And pretty please...!
Share with a friend! Word of mouth is how podcasts grow!
Subscribe to the newsletter at https://www.crafted.fm
Share your feedback! I'm experimenting with new episode formats and would love your feedback on this and other episodes. DM me on LinkedIn or contact me by email via https://www.crafted.fm
Sponsor the show? I'm actively speaking to potential sponsors for 2026 episodes. Let's talk!
Get psyched!… There are some big updates to the show in 2026!
---

Key Quotes
03:16 The myth of AI replacement: “The idea that AI can actually supplant a software engineer in their current role is basically nonsense.”
06:29 Why AI struggles without human input: “If you remove the human engineer from the equation, there's no place to start from. The AI does not do well when you're starting from scratch because it doesn't have the real-world context or the continuous learning required to make that system better.”
12:21 The illusion of speed: “Coding assistants help you generate code very quickly. There's an illusion that your velocity increases. What actually happens is you're just shipping more bugs to production.”
13:30 More code than humans can review: “AI generates so much code that no human can keep that context in their head and review it in a meaningful way. At some point you just have to trust — but who are you trusting? You're trusting the AI, and the AI cannot be trusted.”
14:02 AI & Junior Engineer Hiring: “The narrative that hiring trends have anything to do with AI is absurd. It's not that AI is replacing junior engineers — it's that companies are running lean and don't have the bandwidth to train them.”
15:42 Where the AI Bulls and Bears Differ: “Whereas we see flawed systems that aren't ready for primetime [...] they view this as ‘oh, that's, that's insignificant. They will get better almost immediately. It's not a big deal.' But we've been repeating this cycle for years at this point.”
19:50 Where AI Excels: “Where review and revise are part of the process already, that's a really good place for generative AI because you already have a human in the loop.”
21:02 What builders need to unlearn: “To the extent that people think these things are thinking or reasoning or on any path to AGI at all — they should discard that. These models don't think. They're very sophisticated pattern-matching machines, and that's really it.”

AI Agents and Browsers: How Close Are We to AGI?

Algoritmi

Oct 31, 2025 (26:27)

OpenAI introduces ChatGPT Atlas, its first AI-first browser with agentic capabilities, designed to read, plan and act on the web. But behind the hype lies a bigger question: how close are we really to artificial general intelligence?

An experimental study, LLM Brain Rot, shows that language models can literally “degrade” when overexposed to junk content: they lose reasoning ability, context memory and stability.

And finally Andrej Karpathy, formerly of Tesla and OpenAI, puts it all in perspective: this is not the year of agents, but the decade in which we will have to learn to really build them. AI will not explode overnight: it will grow slowly, amid cognitive bugs, new frameworks and a good dose of humility.

And speaking of doing things well: we recently released Datapizza AI, our open-source framework for GenAI, and since last Monday it has been on Product Hunt. Try it, experiment, tell us what you think: your feedback helps us improve it!

GitHub → https://bit.ly/48ELY0O

For more content on the world of Tech, Data & AI, follow us on our channels!


Inside Amazon's $100B Automation Gamble

Dead Cat

Oct 30, 2025 (52:48)

Is the AI boom already peaking? In this episode of The Newcomer Podcast, Eric, Madeline and Tom take a hard look at the hype cycle driving Silicon Valley's latest gold rush — from Andreessen Horowitz's record-breaking $25 billion year to Amazon's push to automate its entire workforce.

We explore whether AI's trillion-dollar promise is real innovation, or if the cracks are already showing. From OpenAI's overblown math claims to Andrej Karpathy's “State of the Union” reflections, we break down what's really happening behind the headlines.



Episode 543: Arts and Crafts

Software Defined Talk

Oct 24, 2025 (66:34)

This week, we discuss OpenAI's new browser, AI trying to build spreadsheets, and when to use Claude skills. Plus, Coté explores the art of the perfect staycation.

Watch the YouTube Live Recording of Episode 543 (https://www.youtube.com/live/PnwoFl5JjNo?si=DS2CoIgHVlVU9Y3m)

Runner-up Titles
Firewire is dead
USB, what are you going to do?
It's like I tell my son: you know what to do, you chose not to do it.
I am just a guest.
I don't need helpful
An amazing hole.
Slides for nobody
You closed the loop
It's pretty amazing, but does it need to exist?
Slackhole

Rundown
OpenAI
Introducing ChatGPT Atlas (https://openai.com/index/introducing-chatgpt-atlas/)
OpenAI Is Building a Banker (https://www.bloomberg.com/opinion/newsletters/2025-10-21/openai-is-building-a-banker?srnd=undefined&embedded-checkout=true)
OpenAI has five years to turn $13 billion into $1 trillion (https://techcrunch.com/2025/10/14/openai-has-five-years-to-turn-13-billion-into-1-trillion/)
AI agents are not amazing, they are slop: says OpenAI cofounder Andrej Karpathy as he strongly disagrees with CEO Sam Altman on AGI timeline - The Times of India (https://timesofindia.indiatimes.com/technology/tech-news/ai-agents-are-not-amazing-they-are-slop-says-openai-cofounder-andrej-karpathy-as-he-strongly-disagrees-with-ceo-sam-altman-on-agi-timeline/articleshow/124720565.cms)
OpenAI's ChatGPT will soon allow ‘erotica' for adults in major policy shift (https://www.cnbc.com/2025/10/15/erotica-coming-to-chatgpt-this-year-says-openai-ceo-sam-altman.html)
OpenAI Inks Deal With Broadcom to Design Its Own Chips for A.I. (https://www.nytimes.com/2025/10/13/technology/openai-broadcom-chips-deal.html)
Claude Skills are awesome, maybe a bigger deal than MCP (https://simonwillison.net/2025/Oct/16/claude-skills/#atom-everything)
OpenStack Flamingo pays down technical debt as adoption continues to climb (https://www.networkworld.com/article/4066532/openstack-flamingo-pays-down-technical-debt-as-adoption-continues-to-climb.html)

Relevant to your Interests
Elon Musk will settle $128 million Twitter execs lawsuit (https://www.theverge.com/news/796239/elon-musk-x-128-million-twitter-exec-lawsuit-settlement)
GitHub Will Prioritize Migrating to Azure Over Feature Development (https://thenewstack.io/github-will-prioritize-migrating-to-azure-over-feature-development/)
The Discord Hack is Every User's Worst Nightmare (https://www.404media.co/the-discord-hack-is-every-users-worst-nightmare/)
Cursor-Maker Anysphere Considers Investment Offers at $30 Billion Valuation (https://www.theinformation.com/articles/cursor-maker-anysphere-considers-investment-offers-30-billion-valuation)
Rubygems.org AWS Root Access Event – September 2025 (https://rubycentral.org/news/rubygems-org-aws-root-access-event-september-2025/)
This Discord Zendesk compromise has gotten more silly (https://x.com/vxunderground/status/1976417029289607223)
WP Engine Vs Automattic & Mullenweg Is Back In Play (https://www.searchenginejournal.com/wp-engine-vs-automattic-mullenweg-is-back-in-play/557905/)
Windows 11 removes all bypass methods for Microsoft account setup, removing local accounts (https://alternativeto.net/news/2025/10/windows-11-now-blocks-all-microsoft-account-bypasses-during-setup/)
Introducing the React Foundation: The New Home for React & React Native (https://engineering.fb.com/2025/10/07/open-source/introducing-the-react-foundation-the-new-home-for-react-react-native/?utm_source=changelog-news)
Wiz Finds Critical Redis RCE Vulnerability: CVE‑2025‑49844 | Wiz Blog (https://www.wiz.io/blog/wiz-research-redis-rce-cve-2025-49844)
DevRel is -Unbelievably- Back (https://dx.tips/devrel-is-back)
The Ruby community has a DHH problem (https://tekin.co.uk/2025/09/the-ruby-community-has-a-dhh-problem)
YouTube rolls out its redesigned video player globally (https://www.engadget.com/entertainment/youtube/youtube-rolls-out-its-redesigned-video-player-globally-174609883.html)
Oracle stock rises as company confirms Meta cloud deal (https://www.cnbc.com/2025/10/16/oracle-confirms-meta-cloud-deal-.html)
Adiós, AirPods (https://www.theatlantic.com/technology/2025/10/apple-airpods-live-translation/684582/?gift=iWa_iB9lkw4UuiWbIbrWGV8Zzu9GF6V5YZpJtnAzcvU&utm_source=copy-link&utm_medium=social&utm_campaign=share)
NVIDIA shows off its first Blackwell wafer manufactured in the US (https://www.engadget.com/big-tech/nvidia-shows-off-its-first-blackwell-wafer-manufactured-in-the-us-192836249.html)
This Is How Much Anthropic and Cursor Spend On Amazon Web Services (https://www.wheresyoured.at/costs/)
Automattic CEO calls Tumblr his 'biggest failure' so far (https://techcrunch.com/2025/10/20/automattic-ceo-calls-tumblr-his-biggest-failure-so-far/)
Marc Benioff says Salesforce is saving about $100M a year by using AI tools in its customer service operations (https://www.bloomberg.com/news/articles/2025-10-14/salesforce-says-ai-customer-service-saves-100-million-annually | http://www.techmeme.com/251014/p32#a251014p32)
Amazon cloud computing outage disrupts Snapchat, Ring and many other online services (https://apnews.com/article/amazon-east-internet-services-outage-654a12ac9aff0bf4b9dc0e22499d92d7)
Amazon Outage Forces Hundreds of Websites Offline for Hours (https://www.nytimes.com/2025/10/20/business/aws-down-internet-outage.html)
Today is when Amazon brain drain finally caught up with AWS (https://www.theregister.com/2025/10/20/aws_outage_amazon_brain_drain_corey_quinn/)
AWS crash causes $2,000 Smart Beds to overheat and get stuck upright - Dexerto (https://www.dexerto.com/entertainment/aws-crash-causes-2000-smart-beds-to-overheat-and-get-stuck-upright-3272251/)

Nonsense
Streetlights Are Mysteriously Turning Purple. Here's Why (https://www.scientificamerican.com/article/streetlights-are-mysteriously-turning-purple-heres-why/)
Buc-ee's is not America's top convenience store; Midwest chain takes No. 1 spot (https://local12.com/news/nation-world/bucees-not-america-top-convenience-store-satisfaction-ratings-rankings-midwest-chain-kwik-trip-takes-number-one-spot-wawa-sheetz-quicktrip-cincinnati-ohio)
French post office rolls out croissant-scented stamp (https://www.ctvnews.ca/world/article/french-post-office-rolls-out-croissant-scented-stamp/)

Listener Feedback
Jeffrey is looking for college interns. (https://careers.blizzard.com/global/en/job/R025908/2026-US-Summer-Internships-Game-Engineering)

Conferences
Wiz Wizdom Conferences (https://www.wiz.io/wizdom), NYC November 3-5, London November 17-19
SREDay Amsterdam (https://sreday.com/2025-amsterdam-q4/), Coté speaking, November 7th.

SDT News & Community
Join our Slack community (https://softwaredefinedtalk.slack.com/join/shared_invite/zt-1hn55iv5d-UTfN7mVX1D9D5ExRt3ZJYQ#/shared-invite/email)
Email the show: questions@softwaredefinedtalk.com (mailto:questions@softwaredefinedtalk.com)
Free stickers: Email your address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com)
Follow us on social media: Twitter (https://twitter.com/softwaredeftalk), Threads (https://www.threads.net/@softwaredefinedtalk), Mastodon (https://hachyderm.io/@softwaredefinedtalk), LinkedIn (https://www.linkedin.com/company/software-defined-talk/), BlueSky (https://bsky.app/profile/softwaredefinedtalk.com)
Watch us on: Twitch (https://www.twitch.tv/sdtpodcast), YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured), Instagram (https://www.instagram.com/softwaredefinedtalk/), TikTok (https://www.tiktok.com/@softwaredefinedtalk)
Book offer: Use code SDT for $20 off "Digital WTF" by Coté (https://leanpub.com/digitalwtf/c/sdt)
Sponsor the show (https://www.softwaredefinedtalk.com/ads): ads@softwaredefinedtalk.com (mailto:ads@softwaredefinedtalk.com)

Recommendations
Brandon: The PR Guy Who Says the AI Boom Is a Bust (https://overcast.fm/+AAQL2e2DHQo)
Matt: Comfort Ear Grip Hooks (https://www.amazon.com.au/dp/B07YVDT3KT)
Coté: MSG on popcorn, Claude Skills, Masman Curry, Sora?

Photo Credits
Header (https://unsplash.com/photos/person-holding-white-and-gray-stone-OV44gxH71DU)

AI Round Up: Ari Morcos from Datology AI and Rob Toews from Radical VC on Karpathy Reactions, OpenAI's Dealmaking, & Bubble Reality Check

Unsupervised Learning

Oct 24, 2025 (76:53)

This episode features Rob Toews from Radical Ventures and Ari Morcos, Head of Research at Datology AI, reacting to Andrej Karpathy's recent statement that AGI is at least a decade away and that current AI capabilities are "slop." The discussion explores whether we're in an AI bubble, with both guests pushing back on overly bearish narratives while acknowledging legitimate concerns about hype and excessive CapEx spending. They debate the sustainability of AI scaling, examining whether continued progress will come from massive compute increases or from efficiency gains through better data quality, architectural innovations, and post-training techniques like reinforcement learning. The conversation also tackles which companies truly need frontier models versus those that can succeed with slightly-behind-the-curve alternatives, the surprisingly static landscape of AI application categories (coding, healthcare, and legal remain dominant), and emerging opportunities from brain-computer interfaces to more efficient scaling methods. 

(0:00) Intro
(1:04) Debating the AI Bubble
(1:50) Over-Hyping AI: Realities and Misconceptions
(3:21) Enterprise AI and Data Center Investments
(7:46) Consumer Adoption and Monetization Challenges
(8:55) AI in Browsers and the Future of Internet Use
(14:37) Deepfakes and Ethical Concerns
(26:29) AI's Impact on Job Markets and Training
(31:38) Google and Anthropic: Strategic Partnerships
(34:51) OpenAI's Strategic Deals and Future Prospects
(37:12) The Evolution of Vibe Coding
(44:35) AI Outside of San Francisco
(48:09) Data Moats in AI Startups
(50:38) Comparing AI to the Human Brain
(56:07) The Role of Physical Infrastructure in AI
(56:55) The Potential of Chinese AI Models
(1:03:15) Apple's AI Strategy
(1:12:35) The Future of AI Applications

With your co-hosts:
@jacobeffron - Partner at Redpoint, Former PM Flatiron Health
@patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia - Former COO Github, Founder Bitnami (acq'd by VMWare)
@jordan_segall - Partner at Redpoint

ChatGPT Atlas, OpenAI's new web browser

Mixture of Experts

Oct 24, 2025 (44:48)

OpenAI is back and coming for search. This week on Mixture of Experts, we debrief ChatGPT Atlas, OpenAI's new web browser and the impacts on search. Then, Andrej Karpathy is back with his pessimistic timeline to AGI. Later, we discuss DeepSeek-OCR. Finally, can your AI have brain rot? Join host Tim Hwang and panelists Aaron Baughman, Abraham Daniels and Martin Keen on this week's Mixture of Experts to find out.

00:00 – Intro
00:55 – Goldman AI, Groq and IBM, Military AI and Uber
02:05 – ChatGPT Atlas
14:23 – Karpathy's AGI timeline
23:52 – DeepSeek-OCR
34:30 – AI brain rot

The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.


#58 - Marina Vinyes - She recoded ChatGPT

Tronche de Tech

Oct 23, 2025 (90:27)

She left a dream job to code her own LLM. Because even the best AI engineers are being overtaken. And for good reason. No field moves faster than generative AI. Every week, a new model. One that codes. One that translates your videos. One that generates them… And one… called "Banana".

News AI 43/25: AGI Timeline // OpenAI Atlas // Claude Skills

programmier.bar – der Podcast für App- und Webentwicklung

Oct 23, 2025 (58:44)

The "programmier.con 2025 - Web & AI Edition" takes place on October 29 and 30, 2025. Get your tickets for the conference now on our website!

What if we had real AGI in less than two years?

Why an AGI Delay Doesn't Mean an AI Bubble

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Oct 21, 2025 (23:35)

Silicon Valley spent the weekend debating whether it's time to delay AGI expectations by a decade — and what that would mean for the so-called “AI bubble.” NLW breaks down the chain reaction: Microsoft's retreat from OpenAI's infrastructure arms race, an OpenAI math gaffe that went viral, and Andrej Karpathy's take on agent timelines — plus why none of it necessarily spells doom for real-world AI adoption.

Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? nlw@aidailybrief.ai


Andrej Karpathy — AGI is still a decade away

The Lunar Society

Oct 17, 2025 (145:19)

The Andrej Karpathy episode.

During this interview, Andrej explains why reinforcement learning is terrible (but everything else is much worse), why AGI will just blend into the previous ~2.5 centuries of 2% GDP growth, why self driving took so long to crack, and what he sees as the future of education.

It was a pleasure chatting with him.

Watch on YouTube; read the transcript.

Sponsors
* Labelbox helps you get data that is more detailed, more accurate, and higher signal than you could get by default, no matter your domain or training paradigm. Reach out today at labelbox.com/dwarkesh
* Mercury helps you run your business better. It's the banking platform we use for the podcast — we love that we can see our accounts, cash flows, AR, and AP all in one place. Apply online in minutes at mercury.com
* Google's Veo 3.1 update is a notable improvement to an already great model. Veo 3.1's generations are more coherent and the audio is even higher-quality. If you have a Google AI Pro or Ultra plan, you can try it in Gemini today by visiting https://gemini.google

Timestamps
(00:00:00) – AGI is still a decade away
(00:29:45) – LLM cognitive deficits
(00:40:05) – RL is terrible
(00:49:38) – How do humans learn?
(01:06:25) – AGI will blend into 2% GDP growth
(01:17:36) – ASI
(01:32:50) – Evolution of intelligence & culture
(01:42:55) – Why self driving took so long
(01:56:20) – Future of education

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe


Podlodka #436 – Math in AI

Podlodka Podcast

Aug 5, 2025 (86:43)

Many people know that when models are trained, matrices and tensors are multiplied somewhere under the hood, and that all of this is related to differentiation. Together with Denis Stepanov, we took on the difficult task of figuring out what exactly happens there!

We also look forward to seeing you, your likes, reposts and comments in messengers and on social networks!
Telegram chat: https://t.me/podlodka
Telegram channel: https://t.me/podlodkanews
Facebook page: www.facebook.com/podlodkacast/
Twitter account: https://twitter.com/PodcastPodlodka

Hosts in this episode: Zhenya Katella, Anya Simonova

Useful links:
Dive into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola (online book with code and formulas) https://d2l.ai/ https://www.amazon.com/s/ref=dp_byline_sr_book_2?ie=UTF8&field-author=Zachary+C.+Lipton&text=Zachary+C.+Lipton&sort=relevancerank&search-alias=books
Micrograd by Andrej Karpathy https://github.com/karpathy/micrograd
Andrej Karpathy builds GPT from scratch https://www.youtube.com/watch?v=kCc8FmEb1nY
Scott Aaronson on LLM Watermarking https://www.youtube.com/watch?v=YzuVet3YkkA
Annotated history of Modern AI and Deep Learning by Jurgen Schmidhuber https://people.idsia.ch/~juergen/deep-learning-history.html
Probabilistic Machine Learning: An Introduction, Kevin Patrick Murphy https://probml.github.io/pml-book/book1.html
Probabilistic Machine Learning: Advanced Topics, Kevin Patrick Murphy https://probml.github.io/pml-book/book2.html
Pattern Recognition and Machine Learning, Christopher Bishop https://www.microsoft.com/en-us/research/wp-content/uploads/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Deep Learning: Foundations and Concepts, Christopher Bishop, Hugh Bishop https://www.bishopbook.com/
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville https://www.deeplearningbook.org/
Deep Learning: Immersion in the World of Neural Networks (in Russian), S. Nikolenko, A. Kadurin, E. Arkhangelskaya https://www.k0d.cc/storage/books/AI,%20Neural%20Networks/%D0%93%D0%BB%D1%83%D0%B1%D0%BE%D0%BA%D0%BE%D0%B5%20%D0%BE%D0%B1%D1%83%D1%87%D0%B5%D0%BD%D0%B8%D0%B5%20(%D0%9D%D0%B8%D0%BA%D0%BE%D0%BB%D0%B5%D0%BD%D0%BA%D0%BE).pdf
Gonzo reviews of ML papers, Grigory Sapunov, Alexey Tikhonov https://t.me/gonzo_ML
Machine Learning Street Talk podcast https://www.youtube.com/c/machinelearningstreettalk
Feedforward NNs, Autograd, Backprop (Datalore report, Denis Stepanov) https://datalore.jetbrains.com/report/static/Ht_isxs4iB2.BNIqv-C3WUp/pEpNv2eMVU9tEkPsaboR9y
Softmax Regression, Adversarial Attacks (Datalore report, Denis Stepanov) https://datalore.jetbrains.com/report/static/Ht_isxs4iB2.BNIqv-C3WUp/cIvd6zX1B5I3kULNiVCEyy
Dual Numbers, PINN (Datalore report, Denis Stepanov) https://datalore.jetbrains.com/report/static/Ht_isxs4iB2.BNIqv-C3WUp/3oa1BNrPGpQ8uc82tCaz5d


How Attention to Detail Built a Unicorn | Notion's Ivan Zhao

Go To Market Grit

Aug 4, 2025 (88:36)

Ivan Zhao joins Joubin Mirzadegan on Grit to break down how the company's minimalist design became a strategic edge in a world overwhelmed by bloated software. He shares why the AI agent still hasn't arrived, and how Notion's modular approach might be the closest thing to making it real.

Guest: Ivan Zhao, co-founder and CEO of Notion

Mentioned in this episode: Fuzzy Khosrowshahi, Airbnb, Sequoia Capital, Linear, Figma, Apple, Things, Microsoft, BMW, Lumiere, The Beatles, The Rolling Stones, Eric Clapton, Rippling, Matt MacInnis, Inkling, Steve Jobs, Douglas Engelbart, Alan Kay, Bill Gates, OpenAI ChatGPT, Y Combinator, Andrej Karpathy, Toby Schachman, Simon Last, Spotify, Slack

Connect with Ivan Zhao: X, LinkedIn
Connect with Joubin: X, LinkedIn
Email: grit@kleinerperkins.com
Learn more about Kleiner Perkins

Talking to a billionaire about how he uses ChatGPT

My First Million

Jul 16, 2025 (56:15)

Want Sam's playbook to turn ChatGPT into your executive coach? Get it here: https://clickhubspot.com/sfb

Episode 726: Sam Parr ( https://x.com/theSamParr ) and Shaan Puri ( https://x.com/ShaanVP ) talk to Dharmesh Shah ( https://x.com/dharmesh ) about how he's using ChatGPT.

Show Notes:
(0:00) Intro
(2:00) Context windows
(5:26) Vector embeddings
(17:20) Automation and orchestration
(21:03) Tool calling
(28:14) Dharmesh's hot takes on AI
(33:06) Agentic managers
(39:41) Zuck poaches OpenAI talent w/ 9-figures
(49:33) Shaan makes a video game

Links:
• Agent.ai - https://agent.ai/
• Andrej Karpathy - https://www.youtube.com/andrejkarpathy

Check Out Shaan's Stuff:
• Shaan's weekly email - https://www.shaanpuri.com
• Visit https://www.somewhere.com/mfm to hire worldwide talent like Shaan and get $500 off for being an MFM listener. Hire developers, assistants, marketing pros, sales teams and more for 80% less than US equivalents.
• Mercury - Need a bank for your company? Go check out Mercury (mercury.com). Shaan uses it for all of his companies! Mercury is a financial technology company, not an FDIC-insured bank. Banking services provided by Choice Financial Group, Column, N.A., and Evolve Bank & Trust, Members FDIC

Check Out Sam's Stuff:
• Hampton - https://www.joinhampton.com/
• Ideation Bootcamp - https://www.ideationbootcamp.co/
• Copy That - https://copythat.com
• Hampton Wealth Survey - https://joinhampton.com/wealth
• Sam's List - http://samslist.co/

My First Million is a HubSpot Original Podcast // Brought to you by HubSpot Media // Production by Arie Desormeaux // Editing by Ezra Bakker Trupiano

101: Software 101

Silberbauer & Blomseth

Play Episode Listen Later Jul 8, 2025 64:04

Generative AI with the large language models, the LLMs, is the biggest thing to happen in software development in the four decades Thomas has spent writing instructions for computers to get them to do what he wants. And this is not about the software development behind the LLMs: it is about the fact that LLMs can be used to make software. He is not alone in that view within the software industry, and Klaus is not dismissive of the idea either. To better explain why, a recent talk by Andrej Karpathy (ex-OpenAI, ex-Tesla AI) on “Software 3.0” becomes the starting point for a trip up through history, from the hand-coded, aspirationally predictable software 1.0 to the probabilistic and confabulating 3.0. **** Now also on YouTube, if you prefer your podcast with video. https://youtube.com/@silberblom **** Who pays for Silberbauer & Blomseth? We do. Our content is in no way suitable for sponsors or ads for protein powder, VPN connections or e-books. So hosting, equipment and all the rest are at our own expense. The only thing we ask in return (if you like what we make, that is) is that you throw some stars, and maybe even a short recommendation, our way on Apple Podcast. It means the world. After all, we all crave recognition in one form or another. Remember to follow us on Bluesky (@silberblom) Linktree

How AI is rewriting the rules of product and distribution | Papo na Arena #86

Papo na Arena

Play Episode Listen Later Jul 1, 2025 37:53

In this week's episode, our hosts Arthur and Aíquis discuss Andrej Karpathy's talk at YC's AI Startup School and Brian Balfour's essay on the next great shift in distribution models.
#PLUG
ARENA SUBSCRIPTION WITH R$150 OFF: access to ALL courses, events and community meetups, plus a giveaway EVERY WEEK: link
Mentorship program for product leaders: PRE-REGISTRATION, COHORT Q3
Chapters:
- 00:00 Introduction
- 02:53 Paradigm shifts with AI
- 06:52 Commentary on Andrej Karpathy's video
- 15:55 Commentary on Brian Balfour's essay
- 31:51 Products of the Week
- 36:59 Wrap-up
Other links mentioned:
- [Andrej Karpathy: Software Is Changing (Again) - YouTube](https://www.youtube.com/watch?v=LCEmiRjPEtQ)
- [The Next Great Distribution Shift - Brian Balfour](https://blog.brianbalfour.com/p/the-next-great-distribution-shift)
- [As Upstarts: Como a Uber, o Airbnb e as killer companies do novo Vale do Silício estão mudando o mundo | Amazon.com.br](https://www.amazon.com.br/Upstarts-Airbnb-Companies-Sil%C3%ADcio-Mudando/dp/8551002082/)
- [Deep Dive into LLMs like ChatGPT - YouTube](https://www.youtube.com/watch?v=7xTGNNLPyMI)
- [Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task](https://arxiv.org/pdf/2506.08872v1)
Products of the Week:
- Substack
- Acquired Podcast
- Whispr Flow
- NotebookLM
- Gemini
- Youtube
- Electric wine opener from AliExpress
- Perplexity Labs
- Pedidos Ya
- Turbi
- CapCut
- Windsurf
- iFood
- Aíquis's Event Tracker
- Buffetmax
- TP-Link camera (Tapo)

AI, Agents and Software 3.0

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Play Episode Listen Later Jun 29, 2025 14:26

Andrej Karpathy's Software 3.0 talk reframes LLMs as a new kind of software: programmable, agent-native, and fundamentally different from past computing models. This episode breaks down his key ideas, from autonomy sliders to the need for new infrastructure designed for AI-first users.
Source: https://www.youtube.com/watch?v=LCEmiRjPEtQ
Get Ad Free AI Daily Brief: https://patreon.com/AIDailyBrief
Brought to you by:
Gemini - Supercharge your creativity and productivity - http://gemini.google/
KPMG - Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions.
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
AGNTCY - The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at agntcy.org - https://agntcy.org/?utm_campaign=fy25q4_agntcy_amer_paid-media_agntcy-aidailybrief_podcast&utm_channel=podcast&utm_source=podcast
Vanta - Simplify compliance - https://vanta.com/nlw
Plumb - The automation platform for AI experts and consultants - https://useplumb.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Interested in sponsoring the show? nlw@breakdown.network

AI Moves Off the Cloud, Google Breaks the Internet, Google-Wiz Deal Under Fire

Cloud Unplugged

Play Episode Listen Later Jun 27, 2025 38:46 Transcription Available

This week on Cloud Unplugged: AI goes local, Google Cloud breaks the internet, and the DOJ turns up the heat on Google's $32B Wiz acquisition. We're breaking down the biggest stories in cloud and AI:
Context & Qualcomm are teaming up to move AI agents off the cloud and onto your device. What does this mean for the future of local-first AI?
A major Google Cloud outage caused chaos across Cloudflare, Shopify, and Discord. We explain what went wrong and what it tells us about the risks of centralised cloud infrastructure.
The DOJ is investigating Google's acquisition of Wiz, raising questions about cloud security competition and antitrust concerns.
Plus: Andrej Karpathy's Software 3.0 vision. Is natural language the new programming interface?
Hosted by Lewis and Jon, two cloud-native veterans covering the real stories behind the hype in cloud, AI, and dev infrastructure.

Baby Registries, Cold Showers, and Launching opencode

How About Tomorrow?

Play Episode Listen Later Jun 24, 2025 52:26 Transcription Available

Links:
Dax's tweet about opencode rewrite
opencode.ai
Intro | opencode
GitHub - sst/opencode: AI coding agent, built for the terminal.
Andrej Karpathy: Software Is Changing (Again) - YouTube
Models.dev - An open-source database of AI models
Sponsor: Terminal now offers a monthly box called Cron.
Want to carry on the conversation? Join us in Discord. Or send us an email at sliceoffalittlepieceofbacon@tomorrow.fm.
Topics:
(00:00) - A Canadian standoff
(00:29) - Who's the resident nice guy around here?
(02:27) - Finding the Apple of carseats and baby strollers
(05:39) - Transitioning from walking to running
(08:03) - Sleeping struggles
(11:21) - Launching Opencode
(16:48) - Is starting a podcast the key to working well together as programmers? 4 out of 5 podcast editors say yes
(22:46) - Figuring out what to work on next in open source software
(32:29) - Dax is still living in oblivious bliss from Twitter
(33:45) - Andrej Karpathy on how Software Is Changing
(35:53) - Dax tries vibe coding
(46:30) - How much of a bet are we placing on the terminal?
(48:17) - Writing code for Frank
★ Support this podcast ★

AI's Jagged Age: Memory Limits, Retrieval Bots, and Legal Battles Over Encryption and Privacy

Business of Tech

Play Episode Listen Later Jun 13, 2025 18:01

AI models have a defined memory ceiling, which is reshaping the ongoing debates surrounding copyright and data privacy. Recent research from Meta, Google DeepMind, Cornell, and NVIDIA reveals that large language models have a fixed memorization capacity of approximately 3.6 bits per parameter. This finding clarifies the distinction between memorized data and generalized knowledge, indicating that larger datasets do not necessarily lead to increased memorization of specific data points. This understanding is crucial as it informs the operational mechanisms of AI models and addresses concerns related to copyright infringement.
Sundar Pichai, CEO of Google, has introduced the term "artificial jagged intelligence" to describe the current phase of AI development, highlighting the non-linear progress and the challenges faced by researchers despite significant advancements. Pichai's perspective reflects the mixed performance of AI models, which can exhibit extraordinary capabilities alongside notable errors. This sentiment is echoed by deep learning researcher Andrej Karpathy, who emphasizes the unpredictability of AI performance and the need for a more nuanced understanding of its capabilities.
The rise of AI retrieval bots is transforming how users access information online, with a significant increase in traffic from these bots. Companies like OpenAI and Anthropic are deploying these bots to summarize content in real time, moving away from traditional search methods that provide links to multiple sources. This shift poses challenges for content publishers, as the growth of retrieval bots indicates a changing economic landscape where content is increasingly consumed by AI first, with human users following. Publishers may need to rethink their engagement strategies to adapt to this new reality.
In the broader context of technology and cybersecurity, WhatsApp's intervention in a legal case concerning encryption and privacy rights highlights the growing role of platforms in surveillance debates. Additionally, the U.S. Cybersecurity and Infrastructure Security Agency faces leadership challenges amid a talent exodus, raising concerns about its operational effectiveness. As the IT services industry evolves, the integration of AI into various sectors, including hiring and cybersecurity, underscores the importance of execution, interoperability, and trust in automation. The future of technology will depend on how well businesses can navigate these changes and support their clients in making informed decisions.
Four things to know today
00:00 AI's Jagged Reality: Study Reveals Limits to Model Memory as Bots Redefine the Web Economy
05:35 Cybersecurity Crossroads: WhatsApp Joins Apple in Legal Fight as U.S. Agency Leadership Crumbles
08:29 AI Matures Into Infrastructure Layer as IT Vendors Shift Focus to Outcomes and Execution
11:51 Legal Tech, GenAI, and Fast Food Bots All Show One Thing: Hype Doesn't Equal Success
This is the Business of Tech.
Supported by: All our Sponsors: https://businessof.tech/sponsors/
Do you want the show on your podcast app or the written versions of the stories? Subscribe to the Business of Tech: https://www.businessof.tech/subscribe/
Looking for a link from the stories? The entire script of the show, with links to articles, is posted in each story on https://www.businessof.tech/
Support the show on Patreon: https://patreon.com/mspradio/
Want to be a guest on Business of Tech: Daily 10-Minute IT Services Insights? Send Dave Sobel a message on PodMatch, here: https://www.podmatch.com/hostdetailpreview/businessoftech
Want our stuff? Cool Merch?
Wear “Why Do We Care?” - Visit https://mspradio.myspreadshop.com
Follow us on:
LinkedIn: https://www.linkedin.com/company/28908079/
YouTube: https://youtube.com/mspradio/
Facebook: https://www.facebook.com/mspradionews/
Instagram: https://www.instagram.com/mspradio/
TikTok: https://www.tiktok.com/@businessoftech
Bluesky: https://bsky.app/profile/businessof.tech
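A rough way to read the memorization ceiling described in this episode's summary: if a model can store at most a fixed number of bits per parameter, its total memorization capacity grows linearly with parameter count. The sketch below only illustrates that arithmetic. The function names, the one-billion-parameter model and the 2.0 bits-per-parameter rate are placeholders chosen for illustration, not figures from the show; substitute the per-parameter figure quoted in the summary.

```python
def memorization_capacity_bits(num_parameters: int, bits_per_parameter: float) -> float:
    """Upper bound on memorized information, in bits, assuming a fixed
    per-parameter capacity (the linear model described in the summary)."""
    if num_parameters < 0 or bits_per_parameter < 0:
        raise ValueError("inputs must be non-negative")
    return num_parameters * bits_per_parameter


def bits_to_megabytes(bits: float) -> float:
    """Convert bits to megabytes, using 1 MB = 8,000,000 bits."""
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    # Hypothetical values: a 1-billion-parameter model and a placeholder rate.
    params = 1_000_000_000
    placeholder_rate = 2.0
    total_bits = memorization_capacity_bits(params, placeholder_rate)
    print(f"{bits_to_megabytes(total_bits):.1f} MB of memorized data at most")
```

Because the rate is an argument rather than a constant, the same two functions can be used to compare models of different sizes under whatever per-parameter estimate one trusts.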

Teaching AI to Understand the Physical World, with Dr. Fei-Fei Li of World Labs

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Play Episode Listen Later Jun 5, 2025 35:53

In this episode of No Priors, Sarah and Elad are joined by Dr. Fei-Fei Li, AI pioneer, co-director of Stanford's Human-Centered AI Institute, and founder of World Labs. Fei-Fei shares why she's building at the intersection of embodiment and intelligence, and what today's AI systems are still missing. From the early days of ImageNet to her vision for the next generation of robotics, she unpacks the human and technical motivations behind World Labs. They also discuss the challenges of 3D world modeling, her approach to building exceptional teams, and the special qualities that have led her students like Andrej Karpathy to make major breakthroughs. Show Notes: 0:00 Why and what Dr. Fei-Fei Li is building 3:00 World models at World Labs 6:44 Missing gaps in the AI future 9:16 Robotics and physical intelligence 16:15 Greatest challenges of 3D 19:08 Fei-Fei's work in PhD in ImageNet 23:05 Special moments in Dr. Li's career 29:33 Building teams 32:05 Human-centered AI

Vibe coding is having its moment

Marketplace Tech

Play Episode Listen Later May 8, 2025 7:38

Vibe coding is having a moment.
The buzzy new phrase was coined earlier this year by OpenAI co-founder Andrej Karpathy to describe his process of programming by prompting AI. It's been embraced by tech professionals and amateurs alike. Google, Microsoft and Apple have or are developing their own AI-assisted coding platforms while vibe coding startups like Cursor are raking in funding.
Marketplace's Meghan McCarty Carino recently spoke with Clarence Huang, vice president of technology at the financial software company Intuit and an early adopter of vibe coding, about how the practice has changed how he approaches building software.
More on this:
“What is vibe coding, exactly?” - from MIT Technology Review
“New ‘Slopsquatting’ Threat Emerges from AI-Generated Code Hallucinations” - from HackRead
“Three-minute explainer on… slopsquatting” - from Raconteur

Vibe coding is having its moment

Marketplace All-in-One

Play Episode Listen Later May 8, 2025 7:38

How Others Are Using AI - Claire and Greta

Don't Stop Us Now! Podcast

Play Episode Listen Later Apr 24, 2025 26:36

Our episode this week tackles that quiet question many of us ponder: are others using AI more effectively than we are?
We explore some fascinating new research just published by Harvard Business Review that reveals the top 100 AI use cases based on actual user reports. It shows some dramatic changes in how people are using AI compared with just one year ago. We know it will give you new insights on how you could be using AI right now too.
What's particularly interesting is that the top five use cases have been completely reshuffled, with entirely new entrants making their debut straight into the top five spots. From deeply personal uses to professional applications, we were surprised by the findings.
The research has a unique approach, which we explore in this episode. We also:
Explore the top 5 use case types and share prompts and experiences
Share personal examples of how others are using AI to save hours organising their lives
Discuss why certain uses have skyrocketed in popularity, and
Examine a thought-provoking observation from AI thought leader Andrej Karpathy about why AI is unfolding completely differently than any other tech.
If you've been fretting that your workplace is falling behind in the AI race, we share exactly why this might be. And if you have been wondering how your use of AI compares to other people's, then this episode may answer that question too. If you're looking for inspiration on how to use AI, then stay tuned for a wealth of practical ideas. Enjoy this episode.
Useful Links
Full list of HBR's Top 100 AI use cases - check our website for the complete list - www.dontstopusnow.co
Harvard Business Review article on Top 100 AI use cases
Andrej Karpathy's blog post on consumers as AI power users
Subscribe to Don't Stop Us Now – AI Edition wherever you get your podcasts to stay in the loop on what you need to know to remain relevant in this fast-changing world.
Hosted on Acast. See acast.com/privacy for more information.

Andrej Karpathy on How AI Empowers

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Play Episode Listen Later Apr 14, 2025 15:26

OpenAI cofounder Andrej Karpathy makes an argument that the normal patterns of technology diffusion have been upended with AI, to the benefit of regular people.
Source: https://x.com/karpathy/status/1909308143156240538
Get Ad Free AI Daily Brief: https://patreon.com/AIDailyBrief
Brought to you by:
KPMG - Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions.
Vanta - Simplify compliance - https://vanta.com/nlw
Plumb - The Automation Platform for AI Experts - https://useplumb.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown

OpenAI's CPO on how AI changes must-have skills, moats, coding, startup playbooks, more | Kevin Weil (CPO at OpenAI, ex-Instagram, Twitter)

Lenny's Podcast: Product | Growth | Career

Play Episode Listen Later Apr 10, 2025 91:41

Kevin Weil is the chief product officer at OpenAI, where he oversees the development of ChatGPT, enterprise products, and the OpenAI API. Prior to OpenAI, Kevin was head of product at Twitter, Instagram, and Planet, and was instrumental in the development of the Libra (later Novi) cryptocurrency project at Facebook.
In this episode, you'll learn:
1. How OpenAI structures its product teams and maintains agility while developing cutting-edge AI
2. The power of model ensembles: using multiple specialized models together like a company of humans with different skills
3. Why writing effective evals (AI evaluation tests) is becoming a critical skill for product managers
4. The surprisingly enduring value of chat as an interface for AI, despite predictions of its obsolescence
5. How “vibe coding” is changing how companies operate
6. What OpenAI looks for when hiring product managers (hint: high agency and comfort with ambiguity)
7. “Model maximalism” and why today's AI is the worst you'll ever use again
8. 
Practical prompting techniques that improve AI interactions, including example-based prompting—Brought to you by:• Eppo—Run reliable, impactful experiments• Persona—A global leader in digital identity verification• OneSchema—Import CSV data 10x faster—Where to find Kevin Weil:• X: https://x.com/kevinweil• LinkedIn: https://www.linkedin.com/in/kevinweil/—Where to find Lenny:• Newsletter: https://www.lennysnewsletter.com• X: https://twitter.com/lennysan• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/—In this episode, we cover:(00:00) Kevin's background(04:06) OpenAI's new image model(06:52) The role of chief product officer at OpenAI(10:18) His recruitment story and joining OpenAI(17:20) The importance of evals in AI(24:59) Shipping quickly and consistently(28:34) Product reviews and iterative deployment(39:35) Chat as an interface for AI(43:59) Collaboration between researchers and product teams(46:41) Hiring product managers at OpenAI(48:45) Embracing ambiguity in product management(51:41) The role of AI in product teams(53:21) Vibe coding and AI prototyping(55:55) The future of product teams and fine-tuned models(01:04:36) AI in education(01:06:42) Optimism and concerns about AI's future(01:16:37) Reflections on the Libra project(01:20:37) Lightning round and final thoughts—Referenced:• OpenAI: https://openai.com/• The AI-Generated Studio Ghibli Trend, Explained: https://www.forbes.com/sites/danidiplacido/2025/03/27/the-ai-generated-studio-ghibli-trend-explained/• Introducing 4o Image Generation: https://openai.com/index/introducing-4o-image-generation/• Waymo: https://waymo.com/• X: https://x.com• Facebook: https://www.facebook.com/• Instagram: https://www.instagram.com/• Planet: https://www.planet.com/• Sam Altman on X: https://x.com/sama• A conversation with OpenAI's CPO Kevin Weil, Anthropic's CPO Mike Krieger, and Sarah Guo: https://www.youtube.com/watch?v=IxkvVZua28k• OpenAI evals: https://github.com/openai/evals• Deep Research: 
https://openai.com/index/introducing-deep-research/• Ev Williams on X: https://x.com/ev• OpenAI API: https://platform.openai.com/docs/overview• Dwight Eisenhower quote: https://www.brainyquote.com/quotes/dwight_d_eisenhower_164720• Inside Bolt: From near-death to ~$40m ARR in 5 months—one of the fastest-growing products in history | Eric Simons (founder & CEO of StackBlitz): https://www.lennysnewsletter.com/p/inside-bolt-eric-simons• StackBlitz: https://stackblitz.com/• Claude 3.5 Sonnet: https://www.anthropic.com/news/claude-3-5-sonnet• Anthropic: https://www.anthropic.com/• Four-minute mile: https://en.wikipedia.org/wiki/Four-minute_mile• Chad: https://chatgpt.com/g/g-3F100ZiIe-chad-open-a-i• Dario Amodei on LinkedIn: https://www.linkedin.com/in/dario-amodei-3934934/• Figma: https://www.figma.com/• Julia Villagra on LinkedIn: https://www.linkedin.com/in/juliavillagra/• Andrej Karpathy on X: https://x.com/karpathy• Silicon Valley CEO says ‘vibe coding' lets 10 engineers do the work of 100—here's how to use it: https://fortune.com/2025/03/26/silicon-valley-ceo-says-vibe-coding-lets-10-engineers-do-the-work-of-100-heres-how-to-use-it/• Cursor: https://www.cursor.com/• Windsurf: https://codeium.com/windsurf• GitHub Copilot: https://github.com/features/copilot• Patrick Srail on X: https://x.com/patricksrail• Khan Academy: https://www.khanacademy.org/• CK-12 Education: https://www.ck12.org/• Sora: https://openai.com/sora/• Sam Altman's post on X about creative writing: https://x.com/sama/status/1899535387435086115• Diem (formerly known as Libra): https://en.wikipedia.org/wiki/Diem_(digital_currency)• Novi: https://about.fb.com/news/2020/05/welcome-to-novi/• David Marcus on LinkedIn: https://www.linkedin.com/in/dmarcus/• Peter Zeihan on X: https://x.com/PeterZeihan• The Wheel of Time on Prime Video: https://www.amazon.com/Wheel-Time-Season-1/dp/B09F59CZ7R• Top Gun: Maverick on Prime Video: https://www.amazon.com/Top-Gun-Maverick-Joseph-Kosinski/dp/B0DM2LYL8G• Thinking 
like a gardener not a builder, organizing teams like slime mold, the adjacent possible, and other unconventional product advice | Alex Komoroske (Stripe, Google): https://www.lennysnewsletter.com/p/unconventional-product-advice-alex-komoroske• MySQL: https://www.mysql.com/—Recommended books:• Co-Intelligence: Living and Working with AI: https://www.amazon.com/Co-Intelligence-Living-Working-Ethan-Mollick/dp/059371671X• The Accidental Superpower: Ten Years On: https://www.amazon.com/Accidental-Superpower-Ten-Years/dp/1538767341• Cable Cowboy: https://www.amazon.com/Cable-Cowboy-Malone-Modern-Business/dp/047170637X—Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.—Lenny may be an investor in the companies discussed. Get full access to Lenny's Newsletter at www.lennysnewsletter.com/subscribe

We need to talk about vibe coding

ThoughtWorks Podcast

Play Episode Listen Later Apr 2, 2025 36:53

The term 'vibe coding' — which first appeared in a post on X by Andrej Karpathy in early February 2025 — has set the software development world abuzz: everyone seems to have their own take on what it is, how it's done and whether it's a bold new chapter in the history of programming or an insult to anyone that's ever written a line of code. Clearly, then, we need to talk about vibe coding — and that's precisely what we do on this episode of the Technology Podcast. Featuring Thoughtworkers Birgitta Böckeler (AI for Software Delivery Lead) and Lilly Ryan (Cybersecurity Principal), who join hosts Neal Ford and Prem Chandrasekaran, we dive into the different understandings and applications of the concept, and discuss what happens when a meme collides with reality.

Vibe Coding

Let's Know Things

Play Episode Listen Later Apr 1, 2025 18:48

This week we talk about Studio Ghibli, Andrej Karpathy, and OpenAI.
We also discuss code abstraction, economic repercussions, and DOGE.
Recommended Book: How To Know a Person by David Brooks
Transcript
In late November of 2022, OpenAI released a demo version of a product they didn't think would have much potential, because it was kind of buggy and not very impressive compared to the other things they were working on at the time. This product was a chatbot interface for a generative AI model they had been refining, called ChatGPT.
This was basically just a chatbot that users could interact with, as if they were texting another human being. And the results were good enough (both in the sense that the bot seemed kinda sorta human-like, and in the sense that the bot could generate convincing-seeming text on all sorts of subjects) that people went absolutely gaga over it, and the company went full-bore on this category of products, dropping an enterprise version in August the following year and a search engine powered by the same general model in October of 2024. By 2025, upgraded versions of their core models were widely available, alongside paid, enhanced tiers for those who wanted higher-level processing behind the scenes: that upgraded version basically tapping a model with more feedstock, a larger training library and more intensive and refined training, but also, in some cases, a model that thinks longer, that can reach out and use the internet to research stuff it doesn't already know, and increasingly, that can produce other media, like images and videos.
During that time, this industry has absolutely exploded, and while OpenAI is generally considered to be one of the top dogs in this space, still, they've got enthusiastic and well-funded competition from pretty much everyone in the big tech world, like Google and Amazon and Meta, while also facing upstart competitors like Anthropic and Perplexity, alongside burgeoning Chinese competitors, like DeepSeek, and established 
Chinese tech giants like Tencent and Baidu.
It's been somewhat boggling watching this space develop: while there's a chance some of the valuations of AI-oriented companies are overblown, potentially leading to a correction or the popping of a valuation bubble at some point in the next few years, the underlying tech and the output of that tech really has been iterating rapidly. The state of the art in generative AI in particular is producing just staggeringly complex and convincing images, videos, audio, and text, but the lower-tier stuff, which is available to anyone who wants it, for free, is also valuable and useable for all sorts of purposes.
Just recently, at the tail end of March 2025, OpenAI announced new multimodal capabilities for its GPT-4o language model, which basically means this model, which could previously only generate text, can now produce images as well.
And the model has been lauded as a sort of sea change in the industry, allowing users to produce remarkable photorealistic images just by prompting the AI (telling it what you want, basically), usually with accurate, high-quality text in the image, which has been a problem for most image models up till this point. It also boasts the capacity to adjust existing images in all sorts of ways.
Case in point, it's possible to use this feature to take a photo of your family on vacation and have it rendered in the style of a Studio Ghibli cartoon; Studio Ghibli being the Japanese animation studio behind legendary films like My Neighbor Totoro, Spirited Away, and Princess Mononoke, among others.
This is partly the result of better capabilities by this model, compared to its precursors, but it's also the result of OpenAI loosening its policies to allow folks to prompt these models in this way; previously they disallowed this sort of power, due to copyright concerns. 
And the implications here are interesting, as this suggests the company is now comfortable showing that their models have been trained on these films, which has all sorts of potential copyright implications, depending on how pending court cases turn out, but also that they're no longer being as precious with potential scandals related to how their models are used.
It's possible to apply all sorts of distinctive styles to existing images, then, including South Park and the Simpsons, but Studio Ghibli's style has become a meme since this new capability was deployed, and users have applied it to images ranging from existing memes to their own self-portrait avatars, to things like the planes crashing into the Twin Towers on 9/11, JFK's assassination, and famous mass shootings and other murders.
It's also worth noting that the co-founder of Studio Ghibli, Hayao Miyazaki, has called AI-generated artwork “an insult to life itself.” That so many people are using this kind of AI-generated filter on these images is a jarring sort of celebration, then, as the person behind that style probably wouldn't appreciate it; many people are using it because they love the style and the movies in which it was born so much, though. 
An odd moral quandary that's emerged as a result of these new AI-provided powers.
What I'd like to talk about today is another burgeoning controversy within the AI space that's perhaps even larger in implications, and which is landing on an unprepared culture and economy just as rapidly as these new image capabilities and memes.
—
In February of 2025, Andrej Karpathy, the former AI head at Tesla, a founding team member at OpenAI, and the founder of a new, education-focused project called Eureka Labs, coined the term ‘vibe coding’ to refer to a trend he's noticed in himself and other developers, people who write code for a living: developing new projects using code-assistant AI tools in a manner that essentially abstracts away the code, allowing the developer to rely more on vibes in order to get their project out the door, using plain English rather than code or even code-speak.
So while a developer would typically need to invest a fair bit of time writing the underlying code for a new app or website or video game, someone who's vibe coding might instead focus on a higher, more meta-level of the project, worrying less about the coding parts, and instead just telling their AI assistant what they want to do. The AI then figures out the nuts and bolts, writes a bunch of code in seconds, and then the vibe coder can tweak the code, or have the AI tweak it for them, as they refine the concept, fix bugs, and get deeper into the nitty-gritty of things, all, again, in plain-spoken English.
There are now videos, posted in the usual places, all over YouTube and TikTok and such, where folks (some of whom are coders, some of whom are purely vibe coders, who wouldn't be able to program their way out of a cardboard box) produce entire functioning video games in a matter of minutes.
These games typically aren't very good, but they work. 
And reaching even that level of functionality would previously have taken days or weeks for an experienced, highly trained developer; now it takes mere minutes or moments, and can be achieved by the average, non-trained person who has a fundamental understanding of how to prompt AI to get what they want from these systems.

Ethan Mollick, who writes a fair bit on this subject and who keeps tabs on these sorts of developments in his newsletter, One Useful Thing, documented his attempts to make meaning from a pile of data he had sitting around, and which he hadn't made the time to dig through. Using plain English he was able to feed all that data to OpenAI's Deep Research model, interact with its findings, and further home in on meaningful directions suggested by the data.

He also built a simple game in which he drove a firetruck around a 3D city, trying to put out fires before a competing helicopter could do the same. He spent a total of about $13 in AI token fees to make the game, and he was able to do so despite not having any relevant coding expertise.

A guy named Pieter Levels, who's an experienced software engineer, was able to vibe-code a video game, which is a free-to-play, massively multiplayer online flying game, in just a month. Nearly all the code was written by Cursor and Grok 3, the former of which is a code-writing AI system, the latter of which is a ChatGPT-like generalist AI agent, and he's been able to generate something like $100k per month in revenue from this game just 17 days post-launch.

Now an important caveat here is that, first, this game received a lot of publicity, because Levels is a well-known name in this space, and he made this game as part of a 'Vibe Coding Game Jam,' which is an event focused on exactly this type of AI-augmented programming, in which all of the entries had to be at least 80% AI generated.
But he's also a very skilled programmer and game-maker, so this isn't the sort of outcome the average person could expect from these sorts of tools.

That said, it's an interesting case study that suggests a few things about where this category of tools is taking us, even if it's not representative of all programming spaces and would-be programmers.

One prediction that's been percolating in this space for years, even before ChatGPT was released, but especially after generative AI tools hit the mainstream, is that many jobs will become redundant, and as a result many people, especially those in positions that are easily and convincingly replicated using such tools, will be fired. Because why would you pay twenty people $100,000 a year to do basic coding work when you can have one person working part-time with AI tools vibe-coding their way to approximately the same outcome?

It's a fair question, and it's one that pretty much every industry is asking itself right now. And we've seen some early waves of firings based on this premise, most of which haven't gone great for the firing entity, as they've then had to backtrack and start hiring to fill those positions again: the software they expected to fill the gaps wasn't quite there yet, and their offerings suffered as a consequence of that gambit.

Some are still convinced this is the way things are going, though, including people like Elon Musk, who, as part of his Department of Government Efficiency, or DOGE, efforts in the US government, is basically stripping things down to the bare minimum, in part to weaken agencies he doesn't like, but also, ostensibly at least, to reduce bloat and redundancy, the premise being that a lot of this work can be done by fewer people, and in some cases can be automated entirely using AI-based systems.

This was the premise of his mass-firings at Twitter, now X, when he took over, and while there have been a lot of hiccups and issues resulting from that decision, the company is managing to
operate, even if less optimally than before, with about 20% of the staff it had before he took over: something like 1,500 people compared to 7,500.

Now, there are different ways of looking at that outcome, and Musk's activities since that acquisition will probably color some of our perceptions of his ambitions and level of success with that job-culling, as well. But the underlying theory that a company can do even 90% as well as it did before with just a fifth of the workforce is a compelling argument to many people, and that includes folks running governments, but also those in charge of major companies with huge rosters of employees that make up the vast majority of their operating expenses.

A major concern about all this, though, is that even if this theory works in broader practice, and all these companies and governments can function well enough with a dramatically reduced staff using AI tools to augment their capabilities and output, we may find ourselves in a situation in which the folks using said tools are more and more commodified: they'll be less specialized and have less education and expertise in the relevant areas, so they can be paid less, basically, the tools doing more and the humans mostly being paid to prompt and manage them. And as a result we may find ourselves in a situation where these people don't know enough to recognize when the AI is doing something wrong or weird, and we may even reach a point where the abstraction is so complete that very few humans even know how this code works, which leaves us increasingly reliant on these tools, but also more vulnerable to problems should they fail at a basic level, at which point there may not be any humans left who are capable of figuring out what went wrong, since all the jobs that would incentivize the acquisition of such knowledge and skill will have long since disappeared.

As I mentioned in the intro, these tools are being applied to images, videos, music, and everything else, as well.
Which means we could see vibe artists, vibe designers, vibe musicians and vibe filmmakers. All of which is arguably good in the sense that these mediums become more accessible to more people, allowing more voices to communicate in more ways than ever before.

But it's also arguably worrying in the sense that more communication might be filtered through the capabilities of these tools (which, by the way, are predicated on previous artists and writers and filmmakers' work, arguably stealing their styles and ideas and regurgitating them, rather than doing anything truly original), and that could lead to less originality in these spaces, but also a similar situation in which people forget how to make their own films, their own art, their own writing; a capability drain that gets worse with each new generation of people who are incentivized to hand those responsibilities off to AI tools; we'll all become AI prompters, rather than all the things we are, currently.

This has been the case with many technologies over the years; how many blacksmiths do we have in 2025, after all?
And how many people actually hand-code the 1s and 0s that all our coding languages eventually write, for us, after we work at a higher, more human-optimized level of abstraction?

But because our existing economies are predicated on a certain type of labor and certain number of people being employed to do said labor, even if those concerns ultimately don't end up being too big a deal, because the benefits are just that much more impactful than the downsides and other incentives to develop these or similar skills and understandings arise, it's possible we could experience a moment, years or decades long, in which the whole of the employment market is disrupted, perhaps quite rapidly, leaving a lot of people without income and thus a lot fewer people who can afford the products and services that are generated more cheaply using these tools.

A situation that's ripe with potential for those in a position to take advantage of it, but also a situation that could be devastating to those reliant on the current state of employment and income, which is the vast, vast majority of human beings on the planet.

Show Notes
https://en.wikipedia.org/wiki/X_Corp
https://devclass.com/2025/03/26/the-paradox-of-vibe-coding-it-works-best-for-those-who-do-not-need-it/
https://www.wired.com/story/doge-rebuild-social-security-administration-cobol-benefits/
https://www.wired.com/story/anthropic-benevolent-artificial-intelligence/
https://arstechnica.com/tech-policy/2025/03/what-could-possibly-go-wrong-doge-to-rapidly-rebuild-social-security-codebase/
https://en.wikipedia.org/wiki/Vibe_coding
https://www.newscientist.com/article/2473993-what-is-vibe-coding-should-you-be-doing-it-and-does-it-matter/
https://nmn.gl/blog/dangers-vibe-coding
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://arstechnica.com/ai/2025/03/is-vibe-coding-with-ai-gnarly-or-reckless-maybe-some-of-both/
https://www.creativebloq.com/3d/video-game-design/what-is-vibe-coding-and-is-it-really-the-future-of-app-and-game-development
https://arstechnica.com/ai/2025/03/openais-new-ai-image-generator-is-potent-and-bound-to-provoke/
https://en.wikipedia.org/wiki/Studio_Ghibli

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit letsknowthings.substack.com/subscribe

Vibe Coding

SEO Is Not That Hard

Mar 28, 2025 15:43

Vibe coding represents a revolutionary AI-driven approach to software development that allows anyone to create functional applications using natural language instead of traditional programming.

• Coined by Andrej Karpathy (former Tesla AI director and OpenAI co-founder) in February 2025
• Dramatically lowers the barrier to entry for software development
• Enables rapid prototyping and iteration through conversational feedback loops
• Particularly useful for creating "software for one" - personal tools to solve specific problems
• Major tools include Cursor, Replit, and even general AI assistants like ChatGPT and Claude
• SEO professionals can use vibe coding to build custom data processing and analysis tools
• Currently has limitations with complex systems and quality control
• Best used for prototyping or solving personal workflow challenges
• The technology is evolving rapidly, with capabilities expanding monthly

Try our SEO intelligence platform at keywordspeopleuse.com where we help you discover questions people ask online, organize them into topical groups, and optimize your content with personalized advice to grow your traffic.
SEO Is Not That Hard is hosted by Edd Dawson and brought to you by KeywordsPeopleUse.com
Help feed the algorithm and leave a review at ratethispodcast.com/seo
You can get your free copy of my 101 Quick SEO Tips at: https://seotips.edddawson.com/101-quick-seo-tips
To get a personal no-obligation demo of how KeywordsPeopleUse could help you boost your SEO and get a 7 day FREE trial of our Standard Plan book a demo with me now
See Edd's personal site at edddawson.com
Ask me a question and get on the show
Find Edd on Linkedin, Bluesky & Twitter
Find KeywordsPeopleUse on Twitter @kwds_ppl_use
"Werq" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/


Vibe Coding with Ryan Booth

Telemetry Now

Mar 20, 2025 47:07

In this episode, Phillip Gervasi and Ryan Booth dive into "vibe coding"—a new AI-assisted approach where LLMs generate code from natural language descriptions. Inspired by Andrej Karpathy's vision, vibe coding streamlines development but raises questions about debugging, best practices, and the future of software engineering.


How Pieter Levels Hit $67k MRR in 3 Weeks

Founder's Journal

Mar 10, 2025 21:06

Episode 140: Alex breaks down Pieter Levels' AI-coded flying game, which hit $67,000 in monthly revenue in just 3 weeks.

Here's what to expect:
• The stats & story behind Levels' flying game
• Key lessons to take from this business
• Understanding critiques of the game

Show Notes:
(0:00) A note from our sponsor
(2:26) Welcome back to Founder's Journal
(3:09) Pieter Levels' AI-coded flying game
(6:17) The power of AI in game development
(8:00) The value of trusted distribution
(13:24) Vibe marketing explained
(15:58) Addressing critiques
(20:15) Conclusion

Thanks to our presenting sponsor, Gusto. Head to www.gusto.com/alex

Episode Links:
• Flying game - https://fly.pieter.com/
• Levels on X - https://x.com/levelsio
• Andrej Karpathy - https://www.youtube.com/@AndrejKarpathy

Check Out Alex's Stuff:
• storyarb - https://www.storyarb.com/
• CTA - https://www.creatortalentagency.co/
• X - https://x.com/businessbarista
• Linkedin - https://www.linkedin.com/in/alex-lieberman/

Learn more about your ad choices. Visit megaphone.fm/adchoices


Tue. 02/18 – Grok-3

Techmeme Ride Home

Feb 18, 2025 16:44

xAI and Elon Musk have launched Grok-3, their cutting-edge AI model. Is it really a step forward? Is it really cutting-edge? Andrej Karpathy is gonna tell us. The first tri-foldable phone is here. Is there a new huge AI player? And how Apple's move to manufacture in India is going.

Sponsors:
MackWeldon.com Promocode: BRIAN

Links:
Elon Musk's xAI releases its latest flagship model, Grok 3 (TechCrunch)
Impressions of Grok-3 (@karpathy)
Huawei's trifold phone launches outside of China (The Verge)
Trump tariffs result in 10% laptop price hike in U.S. says Acer CEO (Tom's Hardware)
OpenAI Co-Founder Sutskever's Startup Is Fundraising at $30 Billion-Plus Valuation (Bloomberg)
Apple's quiet pivot to India (FT)

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.


AI ROLLUP #11: $97B Elon OpenAI Rumor | AI Crypto Rebound? | Virtuals on Solana | ARC Launchpad

Bankless

Feb 13, 2025 69:36

Ejaaz and David reunite to dissect the AI Crypto sector's rebound from a 70% crash, fueled by Elon's rumored $97B OpenAI bid and the relentless rise of open-source devs being heads down. They explore how tokens might be the most accessible path to AI exposure, why ARC's curated launchpad could elevate agent quality, and what Virtuals' move onto Solana means for cross-chain expansion. Meanwhile, X (Twitter) embraces a new wave of AI agents, and AI16z reorganizes to stay competitive. Is this the turning point for AI and crypto, or just another plateau? Tune in to find out, anon.


The AI Architect — Bret Taylor

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Feb 11, 2025 96:19

If you're in SF, join us tomorrow for a fun meetup at CodeGen Night!

If you're in NYC, join us for AI Engineer Summit! The Agent Engineering track is now sold out, but 25 tickets remain for AI Leadership and 5 tickets for the workshops. You can see the full schedule of speakers and workshops at https://ai.engineer!

It's exceedingly hard to introduce someone like Bret Taylor. We could recite his Wikipedia page, or his extensive work history through Silicon Valley's greatest companies, but everyone else already does that.

As a podcast by AI engineers for AI engineers, we had the opportunity to do something a little different. We wanted to dig into what Bret sees from his vantage point at the top of our industry for the last 2 decades, and how that explains the rise of the AI Architect at Sierra, the leading conversational AI/CX platform.

“Across our customer base, we are seeing a new role emerge - the role of the AI architect. These leaders are responsible for helping define, manage and evolve their company's AI agent over time. They come from a variety of both technical and business backgrounds, and we think that every company will have one or many AI architects managing their AI agent and related experience.”

In our conversation, Bret Taylor confirms the Paul Buchheit legend that he rewrote Google Maps in a weekend, armed with only the help of a then-nascent Google Closure Compiler and no other modern tooling. But what we find remarkable is that he was the PM of Maps, not an engineer, though of course he still identifies as one. We find this theme recurring throughout Bret's career and worldview.
We think it is plain as day that AI leadership will have to be hands-on and technical, especially when the ground is shifting as quickly as it is today:

“There's a lot of power in combining product and engineering into as few people as possible… few great things have been created by committee.”

“If engineering is an order taking organization for product you can sometimes make meaningful things, but rarely will you create extremely well crafted breakthrough products. Those tend to be small teams who deeply understand the customer need that they're solving, who have a maniacal focus on outcomes.”

“And I think the reason why is if you look at like software as a service five years ago, maybe you can have a separation of product and engineering because most software as a service created five years ago. I wouldn't say there's like a lot of technological breakthroughs required for most business applications. And if you're making expense reporting software or whatever, it's useful… You kind of know how databases work, how to build auto scaling with your AWS cluster, whatever, you know, it's just, you're just applying best practices to yet another problem.

“When you have areas like the early days of mobile development or the early days of interactive web applications, which I think Google Maps and Gmail represent, or now AI agents, you're in this constant conversation with what the requirements of your customers and stakeholders are and all the different people interacting with it and the capabilities of the technology. And it's almost impossible to specify the requirements of a product when you're not sure of the limitations of the technology itself.”

This is the first time the difference between technical leadership for “normal” software and for “AI” software was articulated this clearly for us, and we'll be thinking a lot about this going forward.
We left a lot of nuggets in the conversation, so we hope you'll just dive in with us (and thank Bret for joining the pod!)

Timestamps

* 00:00:02 Introductions and Bret Taylor's background
* 00:01:23 Bret's experience at Stanford and the dot-com era
* 00:04:04 The story of rewriting Google Maps backend
* 00:11:06 Early days of interactive web applications at Google
* 00:15:26 Discussion on product management and engineering roles
* 00:21:00 AI and the future of software development
* 00:26:42 Bret's approach to identifying customer needs and building AI companies
* 00:32:09 The evolution of business models in the AI era
* 00:41:00 The future of programming languages and software development
* 00:49:38 Challenges in precisely communicating human intent to machines
* 00:56:44 Discussion on Artificial General Intelligence (AGI) and its impact
* 01:08:51 The future of agent-to-agent communication
* 01:14:03 Bret's involvement in the OpenAI leadership crisis
* 01:22:11 OpenAI's relationship with Microsoft
* 01:23:23 OpenAI's mission and priorities
* 01:27:40 Bret's guiding principles for career choices
* 01:29:12 Brief discussion on pasta-making
* 01:30:47 How Bret keeps up with AI developments
* 01:32:15 Exciting research directions in AI
* 01:35:19 Closing remarks and hiring at Sierra

Transcript

[00:02:05] Introduction and Guest Welcome

[00:02:05] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host swyx, founder of smol.ai.

[00:02:17] swyx: Hey, and today we're super excited to have Bret Taylor join us. Welcome. Thanks for having me. It's a little unreal to have you in the studio.

[00:02:25] swyx: I've read about you so much over the years, like even before OpenAI, effectively. I mean, I use Google Maps to get here. So like, thank you for everything that you've done.
Like, like your story history, like, you know, I think people can find out what your greatest hits have been.

[00:02:40] Bret Taylor's Early Career and Education

[00:02:40] swyx: How do you usually like to introduce yourself when, you know, you talk about, you summarize your career, like, how do you look at yourself?

[00:02:47] Bret: Yeah, it's a great question. You know, before we went on the mics here, we were talking about the audience for this podcast being more engineering. And I do think depending on the audience, I'll introduce myself differently because I've had a lot of [00:03:00] corporate and board roles. I probably self-identify as an engineer more than anything else though.

[00:03:04] Bret: So even when I was at Salesforce, I was coding on the weekends. So I think of myself as an engineer and then all the roles that I do in my career sort of start with that just because I do feel like engineering is sort of a mindset and how I approach most of my life. So I'm an engineer first and that's how I describe myself.

[00:03:24] swyx: You majored in computer science, like 1998.

[00:03:28] Bret: And I was in high school, actually. My college degree was '02 undergrad, '03 master's. Right. That old.

[00:03:33] swyx: Yeah. I mean, no, I was going, I was going like 1998 to 2003, but like engineering wasn't as, wasn't a thing back then. Like we didn't have the title of senior engineer, you know, kind of like, it was just.

[00:03:44] swyx: You were a programmer, you were a developer, maybe. What was it like in Stanford? Like, what was that feeling like? You know, was it, were you feeling like on the cusp of a great computer revolution? Or was it just like a niche, you know, interest at the time?

[00:03:57] Stanford and the Dot-Com Bubble

[00:03:57] Bret: Well, I was at Stanford, as you said, from 1998 to [00:04:00] 2002.

[00:04:02] Bret: 1998 was near the peak of the dot-com bubble. So.
This is back in the day when most people did their coding in the computer lab, just because there were these Sun Microsystems Unix boxes there that most of us had to do our assignments on. And every single day there was a dot-com, like, buying pizza for everybody.

[00:04:20] Bret: I didn't have to, like, I got free food, like, my first two years of university, and then the dot-com bubble burst in the middle of my college career. And so by the end there was like tumbleweed going to the job fair, you know. It's hard to describe unless you were there at the time, the level of hype. Being a computer science major at Stanford was like a thousand opportunities.

[00:04:45] Bret: And then when I left, it was like Microsoft, IBM.

[00:04:49] Joining Google and Early Projects

[00:04:49] Bret: And then the two startups that I applied to were VMware and Google. And I ended up going to Google in large part because a woman named Marissa Mayer, who had been a teaching [00:05:00] assistant when I was what was called a section leader, which was like a junior teaching assistant kind of for one of the big intro classes.

[00:05:05] Bret: She had gone there. And she was recruiting me and I knew her and it sort of felt safe, you know. I don't know that I thought about it much, but it turned out to be a real blessing. I realized like, you know, you always want to think you'd pick Google if given the option, but no one knew at the time.

[00:05:20] Bret: And I wonder, if I'd graduated in like 1999, would I have been like, "Mom, I just got a job at Pets.com. It's good." But you know, at the end I just didn't have any options. So I was like, do I want to go make kernel software at VMware? Do I want to go build search at Google? And I chose Google. 50, 50 ball.

[00:05:36] Bret: I'm not really a 50, 50 ball.
So I feel very fortunate in retrospect that the economy collapsed because in some ways it forced me into like one of the greatest companies of all time, but I kind of lucked into it, I think.

[00:05:47] The Google Maps Rewrite Story

[00:05:47] Alessio: So the famous story about Google is that you rewrote the Google Maps back end in one week after the map quest quest maps acquisition. What was the story there?

[00:05:57] Alessio: Is it actually true? Is it [00:06:00] being glorified? Like how did that come to be? And is there any detail that maybe Paul hasn't shared before?

[00:06:06] Bret: It's largely true, but I'll give the color commentary. So it was actually the front end, not the back end, but it turns out for Google Maps, the front end was sort of the hard part just because Google Maps was

[00:06:17] Bret: largely the first-ish kind of really interactive web application. I say first-ish; I think Gmail certainly was, though with Gmail, probably a lot of people then who weren't engineers didn't appreciate its level of interactivity. It was just fast. But Google Maps, because you could drag the map and it was sort of graphical,

[00:06:38] Bret: it really, in the mainstream, I think...

[00:06:41] swyx: Was it MapQuest back then that was, you had the arrows up and down?

[00:06:44] Bret: It was up and down arrows. Each map was a single image and you'd just click left and then wait for a few seconds for the new map to load. It was really small too, because generating a big image was kind of expensive on computers of that day.

[00:06:57] Bret: So Google Maps was truly innovative in that [00:07:00] regard. The story on it: there was a small company called Where 2 Technologies, started by two Danish brothers, Lars and Jens Rasmussen, who are two of my closest friends now. They had made a Windows app called Expedition, which had beautiful maps. Even in 2004,

[00:07:18] Bret: or whenever we acquired or sort of acquired their company, Windows software was not particularly fashionable, but they were really passionate about mapping, and we had made a local search product that was kind of middling in terms of popularity, sort of like a Yellow Pages search product. So we wanted to really go into mapping.

[00:07:36] Bret: We'd started working on it. Their small team seemed passionate about it. So we're like, come join us. We can build this together.

[00:07:42] Technical Challenges and Innovations

[00:07:42] Bret: It turned out to be a great blessing that they had built a Windows app because you're less technically constrained when you're doing native code than you are building in a web browser, particularly back then when there weren't really interactive web apps, and it ended up

[00:07:56] Bret: changing the level of quality that we [00:08:00] wanted to hit with the app because we were shooting for something that felt like a native Windows application. So it was a really good fortune that, you know, their unusual technical choices turned out to be the greatest blessing. So we spent a lot of time basically saying, how can you make an interactive draggable map in a web browser?

[00:08:18] Bret: How do you progressively load, you know, new map tiles as you're dragging? Even things like, down in the weeds of the browser at the time, most browsers like Internet Explorer, which was dominant at the time, would only load two images at a time from the same domain. So we ended up making our map tile servers have like

[00:08:37] Bret: forty different subdomains so we could load maps in parallel. Like, lots of hacks. I'm happy to go into as much as like

[00:08:44] swyx: HTTP connections and stuff.

[00:08:46] Bret: They just like, there was just maximum parallelism of two.
And so if you had a map, a set of map tiles, like eight of them, so... So we were just down in the weeds of the browser anyway.

[00:08:56] Bret: So it was lots of plumbing. I know a lot more about browsers than [00:09:00] most people, but then by the end of it, it was a lot of duct tape on that code. If you've ever done an engineering project where you're not really sure of the path from point A to point B, it's almost like building a house by building one room at a time.

[00:09:14] Bret: There's not a lot of architectural cohesion at the end. And then we acquired a company called Keyhole, which became Google Earth, which was like that 3D, it was a native Windows app as well, separate app, great app, but with that, we got licenses to all this satellite imagery. And so in August of 2005, we added

[00:09:33] Bret: satellite imagery to Google Maps, which added even more complexity in the code base. And then we decided we wanted to support Safari. There were no mobile phones yet. So Safari was this like nascent browser on the Mac. And it turns out there's like a lot of decisions behind the scenes, sort of inspired by this Windows app, like heavy use of XML and XSLT and all these like

[00:09:54] Bret: technologies that were like briefly fashionable in the early two thousands and everyone hates now for good [00:10:00] reason. And it turns out that all of the XML functionality in Internet Explorer wasn't supported in Safari. So people are like re-implementing XML parsers. And it was just like this pile of s**t.

[00:10:11] Bret: And, can I say s**t on your pod?

[00:10:12] Alessio: Yeah, of course.

[00:10:13] Bret: So it went from this beautifully elegant application that everyone was proud of to something that probably had hundreds of K of JavaScript, which sounds like nothing now. We're talking like people have modems, you know, not all modems, but it was a big deal.

[00:10:29] Bret: So it was like slow.
It took a while to load and it just wasn't a great code base. Like, everything was fragile. So I just got super frustrated by it. And then one weekend I did rewrite all of it. And at the time the word JSON hadn't been coined yet too, just to give you a sense. So it's all XML.

[00:10:47] swyx: Yeah.

[00:10:47] Bret: So we used what you would now call JSON, but I just said, let's use eval so that we can parse the data fast. And again, it was literally JSON, but at the time there was no name for it. So we [00:11:00] just said, let's pass JavaScript from the server and eval it. And then somebody just refactored the whole thing.

[00:11:05] Bret: And it wasn't like I was some genius. It was just like, you know, if you knew everything you wished you had known at the beginning, and I knew all the functionality, 'cause I was one of the primary authors of the JavaScript. And I just drank a lot of coffee and stayed up all weekend.

[00:11:22] Bret: And then I guess I developed a bit of a reputation, and no one knew about this for a long time. And then Paul, who created Gmail, and I ended up starting a company with him too after all of this, told this on a podcast and now it's lore, but it's largely true. I did rewrite it, and my proudest thing,

[00:11:38] Bret: and I think JavaScript people appreciate this, is the un-gzipped bundle size for all of Google Maps: when I rewrote it, it was 20K. Gzipped, it was like much smaller for the entire application. It went down by like 10x. So what happened on Google? Google is a pretty mainstream company, and so our usage shot up because it turns out it's faster.

[00:11:57] Bret: Just being faster is worth a lot of [00:12:00] percentage points of growth at the scale of Google.

[00:12:03] swyx: So how much modern tooling did you have? Like test suites? No compilers?

[00:12:07] Bret: Actually, that's not true. We did have one thing.
So I actually think Google, you can download it. Google has a Closure Compiler, a Closure Compiler.[00:12:15] Bret: I don't know if anyone still uses it. It's gone. Yeah. Yeah. It's sort of gone out of favor. Yeah. Well, even until recently it was better than most JavaScript minifiers because it did a lot more renaming of variables and things. Most people use esbuild now just cause it's fast and Closure Compiler's built on Java and super slow and stuff like that.[00:12:37] Bret: But, so we did have that, that was it. Okay.[00:12:39] The Evolution of Web Applications[00:12:39] Bret: So and that was treated internally, you know, it was a really interesting time at Google at the time because there's a lot of teams working on fairly advanced JavaScript when no one was. So Google Suggest, which Kevin Gibbs was the tech lead for, was the first kind of type ahead, autocomplete, I believe in a web browser, and now it's just pervasive in search boxes that you sort of [00:13:00] see a type ahead there.[00:13:01] Bret: I mean, ChatGPT[00:13:01] swyx: just added it. It's kind of like a round trip.[00:13:03] Bret: Totally. No, it's now pervasive as a UI affordance, but that was like Kevin's 20 percent project. And then Gmail, Paul, you know, he tells the story better than anyone, but he, you know, basically was scratching his own itch, but what was really neat about it is email, because it's such a productivity tool, just needed to be faster.[00:13:21] Bret: So, you know, he was scratching his own itch of just making more stuff work on the client side. And then we, because of Lars and Jens sort of like setting the bar of this Windows app, were like, we need our maps to be draggable. So we ended up. 
Not only innovate in terms of having a big, what would be called a single page application today, but also all the graphical stuff, you know, we were crashing Firefox like it was going out of style because, you know, when you make a document object model with the idea that it's a document and then you layer on some JavaScript and then we're essentially abusing all of this, it just was running into code paths that were not.[00:13:56] Bret: Well trodden, you know, at this time. And so it was [00:14:00] super fun. And, you know, in the building you had, so you had compiler people helping minify JavaScript just practically, but there was a great engineering team. So that's why Closure Compiler is so good. It was like a person who actually knew about programming languages doing it, not just, you know, writing regular expressions.[00:14:17] Bret: And then the team that is now the Chrome team, I believe, and I don't know this for a fact, but I'm pretty sure Google was the main contributor to Firefox for a long time in terms of code. And a lot of browser people were there. So every time we would crash Firefox, we'd like walk up two floors and say like, what the hell is going on here?[00:14:35] Bret: And they would load their browser, like in a debugger. And we could like figure out exactly what was breaking. And you can't change the code, right? Cause it's the browser. It's like slow, right? I mean, slow to update. So, but we could figure out exactly where the bug was and then work around it in our JavaScript.[00:14:52] Bret: So it was just like new territory. Like so super, super fun time, just like a lot of great engineers figuring out [00:15:00] new things. 
And now, you know, this term is no longer in fashion, but the word Ajax, which was asynchronous JavaScript and XML, cause I'm telling you, XML, you see the word XML there, to be fair, the way you made HTTP requests from a client to server was this.[00:15:18] Bret: Object called XMLHttpRequest because Microsoft, in making Outlook Web Access back in the day, made this and it turns out to have nothing to do with XML. It's just a way of making HTTP requests because XML was like the fashionable thing. It was like that was the way you, you know, you did it. But JSON came out of that, you know, and then a lot of the best practices around building JavaScript applications are pre-React.[00:15:44] Bret: I think React was probably the big conceptual step forward that we needed. Even my first social network after Google, we used a lot of like HTML injection, and making real time updates was still very hand coded and it's really neat when you [00:16:00] see conceptual breakthroughs like React because, I just love those things where it's like obvious once you see it, but it's so not obvious until you do.[00:16:07] Bret: And actually, well, I'm sure we'll get into AI, but I sort of feel like we'll go through that evolution with AI agents as well, that I feel like we're missing a lot of the core abstractions that I think in 10 years we'll be like, gosh, how'd you make agents before that? You know, but it was kind of that early days of web applications.[00:16:22] swyx: There's a lot of contenders for the React.js of AI, but no clear winner yet. I would say one thing I was there for, I mean, there's so much we can go into there. 
You just covered so much.[00:16:32] Product Management and Engineering Synergy[00:16:32] swyx: One thing I just, I just observe is that I think the early Google days had this interesting mix of PM and engineer, which I think you are, you didn't, you didn't wait for PM to tell you these are my, this is my PRD.[00:16:42] swyx: This is my requirements.[00:16:44] mix: Oh,[00:16:44] Bret: okay.[00:16:45] swyx: I wasn't technically a software engineer. I mean,[00:16:48] Bret: by title, obviously. Right, right, right.[00:16:51] swyx: It's like a blend. And I feel like these days, product is its own discipline and its own lore and own industry and engineering is its own thing. And there's this process [00:17:00] that happens and they're kind of separated, but you don't produce as good of a product as if they were the same person.[00:17:06] swyx: And I'm curious, you know, if, if that, if that sort of resonates in, in, in terms of like comparing early Google versus modern startups that you see out there,[00:17:16] Bret: I certainly like wear a lot of hats. So, you know, sort of biased in this, but I really agree that there's a lot of power and combining product design engineering into as few people as possible because, you know few great things have been created by committee, you know, and so.[00:17:33] Bret: If engineering is an order taking organization for product you can sometimes make meaningful things, but rarely will you create extremely well crafted breakthrough products. Those tend to be small teams who deeply understand the customer need that they're solving, who have a. Maniacal focus on outcomes.[00:17:53] Bret: And I think the reason why it's, I think for some areas, if you look at like software as a service five years ago, maybe you can have a [00:18:00] separation of product and engineering because most software as a service created five years ago. I wouldn't say there's like a lot of like. 
Technological breakthroughs required for most, you know, business applications.[00:18:11] Bret: And if you're making expense reporting software or whatever, it's useful. I don't mean to be dismissive of expense reporting software, but you probably just want to understand like, what are the requirements of the finance department? What are the requirements of an individual filing an expense report? Okay.[00:18:25] Bret: Go implement that. And you kind of know how web applications are implemented. You kind of know how databases work, how to build auto scaling with your AWS cluster, whatever, you know, you're just applying best practices to yet another problem. When you have areas like the early days of mobile development or the early days of interactive web applications, which I think Google Maps and Gmail represent, or now AI agents, you're in this constant conversation with what the requirements of your customers and stakeholders are and all the different people interacting with it.[00:18:58] Bret: And the capabilities of the [00:19:00] technology. And it's almost impossible to specify the requirements of a product when you're not sure of the limitations of the technology itself. And that's why I use the word conversation. It's not literal. That's sort of funny to use that word in the age of conversational AI.[00:19:15] Bret: You're constantly sort of saying, like, ideally, you could sprinkle some magic AI pixie dust and solve all the world's problems, but it's not the way it works. And it turns out that actually, I'll just give an interesting example.[00:19:26] AI Agents and Modern Tooling[00:19:26] Bret: I think most people listening probably use copilots to code like Cursor or Devin or Microsoft Copilot or whatever.[00:19:34] Bret: Most of those tools, they're remarkable. I couldn't, you know, imagine development without them now, but they're not autonomous yet. 
Like I wouldn't let it just write most code without my interactively inspecting it. We just are somewhere between it's an amazing co pilot and it's an autonomous software engineer.[00:19:53] Bret: As a product manager, like your aspirations for what the product is are like kind of meaningful. But [00:20:00] if you're a product person, yeah, of course you'd say it should be autonomous. You should click a button and program should come out the other side. The requirements meaningless. Like what matters is like, what is based on the like very nuanced limitations of the technology.[00:20:14] Bret: What is it capable of? And then how do you maximize the leverage? It gives a software engineering team, given those very nuanced trade offs. Coupled with the fact that those nuanced trade offs are changing more rapidly than any technology in my memory, meaning every few months you'll have new models with new capabilities.[00:20:34] Bret: So how do you construct a product that can absorb those new capabilities as rapidly as possible as well? That requires such a combination of technical depth and understanding the customer that you really need more integration. Of product design and engineering. And so I think it's why with these big technology waves, I think startups have a bit of a leg up relative to incumbents because they [00:21:00] tend to be sort of more self actualized in terms of just like bringing those disciplines closer together.[00:21:06] Bret: And in particular, I think entrepreneurs, the proverbial full stack engineers, you know, have a leg up as well because. I think most breakthroughs happen when you have someone who can understand those extremely nuanced technical trade offs, have a vision for a product. And then in the process of building it, have that, as I said, like metaphorical conversation with the technology, right?[00:21:30] Bret: Gosh, I ran into a technical limit that I didn't expect. It's not just like changing that feature. 
You might need to refactor the whole product based on that. And I think that it's particularly important right now. So, you know, if you're building a big ERP system, probably there's a great reason to have product and engineering.[00:21:51] Bret: I think in general, the disciplines are there for a reason. I think when you're dealing with something as nuanced as the like technologies, like large language models today, there's a ton of [00:22:00] advantage in having individuals or organizations that integrate the disciplines more formally.[00:22:05] Alessio: That makes a lot of sense.[00:22:06] Alessio: I've run a lot of engineering teams in the past, and I think the product versus engineering tension has always been more about effort than like whether or not the feature is buildable. But I think, yeah, today you see a lot more of like, models actually cannot do that. And I think the most interesting thing is on the startup side, people don't yet know where a lot of the AI value is going to accrue.[00:22:26] Alessio: So you have this rush of people building frameworks, building infrastructure, layered things, but we don't really know the shape of the compute. I'm curious, at Sierra, like how you thought about building in-house a lot of the tooling for evals or like just, you know, building the agents and all of that.[00:22:41] Alessio: Versus how you see some of the startup opportunities that are maybe still out there.[00:22:46] Bret: We build most of our tooling in house at Sierra, not all. It's not like not invented here syndrome necessarily, though, maybe slightly guilty of that in some ways, but because we're trying to build a platform [00:23:00] that's enduring, you know, we really want to have control over our own destiny.[00:23:03] Bret: And you had made a comment earlier that, like, we're still trying to figure out what the React of agents is and the jury is still out. I would argue it hasn't been created yet. 
I don't think the jury is still out to go use that metaphor. We're sort of in the jQuery era of agents, not the react era.[00:23:19] Bret: And, and that's like a throwback for people listening,[00:23:22] swyx: we shouldn't rush it. You know?[00:23:23] Bret: No, yeah, that's my point is. And so. Because we're trying to create an enduring company at Sierra that outlives us, you know, I'm not sure we want to like attach our cart to some like to a horse where it's not clear that like we've figured out and I actually want as a company, we're trying to enable just at a high level and I'll, I'll quickly go back to tech at Sierra, we help consumer brands build customer facing AI agents.[00:23:48] Bret: So. Everyone from Sonos to ADT home security to Sirius XM, you know, if you call them on the phone and AI will pick up with you, you know, chat with them on the Sirius XM homepage. It's an AI agent called Harmony [00:24:00] that they've built on our platform. We're what are the contours of what it means for someone to build an end to end complete customer experience with AI with conversational AI.[00:24:09] Bret: You know, we really want to dive into the deep end of, of all the trade offs to do it. You know, where do you use fine tuning? Where do you string models together? You know, where do you use reasoning? Where do you use generation? How do you use reasoning? How do you express the guardrails of an agentic process?[00:24:25] Bret: How do you impose determinism on a fundamentally non deterministic technology? There's just a lot of really like as an important design space. And I could sit here and tell you, we have the best approach. Every entrepreneur will, you know. But I hope that in two years, we look back at our platform and laugh at how naive we were, because that's the pace of change broadly.[00:24:45] Bret: If you talk about like the startup opportunities, I'm not wholly skeptical of tools companies, but I'm fairly skeptical. 
There's always an exception for every rule, but I believe that certainly there's a big market for [00:25:00] frontier models, but largely for companies with huge CapEx budgets. So OpenAI and Microsoft, Anthropic and Amazon Web Services, Google Cloud, xAI, which is very well capitalized now, but I think the idea that a company can make money sort of pre-training a foundation model is probably not true.[00:25:20] Bret: It's hard to, you're competing with just, you know, unreasonably large CapEx budgets. And just like the cloud infrastructure market, I think it will be largely there. I also really believe in the applications of AI. And I define that not as like building agents or things like that. I define it much more as like, you're actually solving a problem for a business.[00:25:40] Bret: So it's what Harvey is doing in the legal profession or what Cursor is doing for software engineering or what we're doing for customer experience and customer service. The reason I believe in that is I do think that in the age of AI, what's really interesting about software is it can actually complete a task.[00:25:56] Bret: It can actually do a job, which is very different than what the value proposition of [00:26:00] software was in ancient history, two years ago. And as a consequence, I think the way you build a solution for a domain is very different than you would have before, which means that it's not obvious, like the incumbents have like a leg up, you know, necessarily. They certainly have some advantages, but there's just such a different form factor, you know, for providing a solution and it's just really valuable.[00:26:23] Bret: You know, like just think of how much money Cursor is saving software engineering teams or the alternative, how much revenue it can produce. Tool making is really challenging. 
If you look at the cloud market, just as an analog, there are a lot of like interesting tools companies, you know, Confluent monetized Kafka, Snowflake, Hortonworks, you know, there's a bunch of them.[00:26:48] Bret: A lot of them, you know, have that mix of sort of like, like Confluent, or have the open source or open core or whatever you call it. I'm not an expert in this area. You know, I do think [00:27:00] that developers are fickle. I think that in the tool space, I probably, like, default towards open source being like the area that will win.[00:27:09] Bret: It's hard to build a company around this and then you end up with companies sort of built around open source, too. That can work, don't get me wrong, but I just think that nowadays the tools are changing so rapidly that I'm like, not totally skeptical of tool makers, but I just think that open source will broadly win, but I think that the CapEx required for building frontier models is such that it will go to a handful of big companies.[00:27:33] Bret: And then I really believe in agents for specific domains which I think will, it's sort of the analog to software as a service in this new era. You know, it's like, if you just think of the cloud, you can lease a server. It's just a low level primitive, or you can buy an app like, you know, Shopify or whatever.[00:27:51] Bret: And most people building a storefront would prefer Shopify over hand rolling their e-commerce storefront. I think the same thing will be true of AI. So [00:28:00] I've. 
I tend to like, if an entrepreneur asks me for advice, I'm like, you know, move up the stack as far as you can towards a customer need.[00:28:09] Bret: Broadly, but it doesn't reduce my excitement about what is the React of building agents kind of thing, just because it is the right question to ask, but I think it'll probably play out in the open source space more than anything else.[00:28:21] swyx: Yeah, and it's not a priority for you. There's a lot in there.[00:28:24] swyx: I'm kind of curious about your idea maze towards, there are many customer needs. You happen to identify customer experience as yours, but it could equally have been coding assistance or whatever. I think for some, I'm just kind of curious at the top down, how do you look at the world in terms of the potential problem space?[00:28:44] swyx: Because there are many people out there who are very smart and pick the wrong problem.[00:28:47] Bret: Yeah, that's a great question.[00:28:48] Future of Software Development[00:28:48] Bret: By the way, I would love to talk about the future of software, too, because despite the fact I didn't pick coding, I have a lot of that, but I can answer your question, though, you know, I think when a technology is as [00:29:00] cool as large language models.[00:29:02] Bret: You just see a lot of people starting from the technology and searching for a problem to solve. And I think it's why you see a lot of tools companies, because as a software engineer, you start building an app or a demo and you encounter some pain points. You're like,[00:29:17] swyx: a lot of[00:29:17] Bret: people are experiencing the same pain point.[00:29:19] Bret: What if I make it? That it's just very incremental. And you know, I always like to use the metaphor, like you can sell coffee beans, roasted coffee beans. You can add some value. 
You took coffee beans and you roasted them and roasted coffee beans largely, you know, are priced relative to the cost of the beans.[00:29:39] Bret: Or you can sell a latte and a latte. Is rarely priced directly like as a percentage of coffee bean prices. In fact, if you buy a latte at the airport, it's a captive audience. So it's a really expensive latte. And there's just a lot that goes into like. How much does a latte cost? And I bring it up because there's a supply chain from growing [00:30:00] coffee beans to roasting coffee beans to like, you know, you could make one at home or you could be in the airport and buy one and the margins of the company selling lattes in the airport is a lot higher than the, you know, people roasting the coffee beans and it's because you've actually solved a much more acute human problem in the airport.[00:30:19] Bret: And, and it's just worth a lot more to that person in that moment. It's kind of the way I think about technology too. It sounds funny to liken it to coffee beans, but you're selling tools on top of a large language model yet in some ways your market is big, but you're probably going to like be price compressed just because you're sort of a piece of infrastructure and then you have open source and all these other things competing with you naturally.[00:30:43] Bret: If you go and solve a really big business problem for somebody, that's actually like a meaningful business problem that AI facilitates, they will value it according to the value of that business problem. And so I actually feel like people should just stop. You're like, no, that's, that's [00:31:00] unfair. If you're searching for an idea of people, I, I love people trying things, even if, I mean, most of the, a lot of the greatest ideas have been things no one believed in.[00:31:07] Bret: So I like, if you're passionate about something, go do it. Like who am I to say, yeah, a hundred percent. 
Or Gmail, like Paul as far, I mean, some of it's lore at this point, but like Gmail was Paul's own email for a long time, and then, amusingly, and Paul can correct me, I'm pretty sure he sent around a link and like the first comment was like, this is really neat.[00:31:26] Bret: It would be great if it was not your email, but my own. I don't know if it's a true story. I'm pretty sure it's, yeah, I've read that before. So scratch your own itch. Fine. Like it depends on what your goal is. If you wanna do like a venture backed company, if it's a passion project, f*****g passion, do it, like don't listen to anybody.[00:31:41] Bret: In fact, but if you're trying to start, you know, an enduring company, solve an important business problem. And I do think that in the world of agents, the software industry has shifted where you're not just helping people be more productive, but you're actually accomplishing tasks autonomously.[00:31:58] Bret: And as a consequence, I think the [00:32:00] addressable market has just greatly expanded just because software can actually do things now and actually accomplish tasks. And how much is coding autocomplete worth? A fair amount. How much is the eventual, I'm certain we'll have it, the software agent that actually writes the code and delivers it to you, that's worth a lot.[00:32:20] Bret: And so, you know, I would just maybe look up from the large language models and start thinking about the economy and, you know, think from first principles. I don't wanna get too far afield, but just think about which parts of the economy will benefit most from this intelligence and which parts can absorb it most easily.[00:32:38] Bret: And what would an agent in this space look like? Who's the customer of it? Is the technology feasible? And I would just start with these business problems more. And I think, you know, the best companies tend to have great engineers who happen to have great insight into a market. 
And it's that last part that I think some people.[00:32:56] Bret: Whether or not they have, it's like people start so much in the technology, they [00:33:00] lose the forest for the trees a little bit.[00:33:02] Alessio: How do you think about the model of still selling some sort of software versus selling more packaged labor? I feel like when people are selling the packaged labor, it's almost more stateless, you know, like it's easier to swap out if you're just putting an input and getting an output.[00:33:16] Alessio: If you think about coding, if there's no IDE, you're just putting a prompt and getting back an app. It doesn't really matter who generates the app, you know, you have less of a buy in. Versus the platform you're building, I'm sure on the backend customers have to like put on their documentation and they have, you know, different workflows that they can tie in. What's kind of like the line to draw there versus like going full, we're your managed customer support team as a service, outsourced, versus.[00:33:40] Alessio: This is the Sierra platform that you can build on. What was that decision? I'll sort of[00:33:44] Bret: like decouple the question in some ways, which is when you have something that's an agent, who is the person using it and what do they want to do with it? So let's just take your coding agent for a second. I will talk about Sierra as well.[00:33:59] Bret: Who's the [00:34:00] customer of an agent that actually produces software? Is it a software engineering manager? Is it a software engineer? And it's their, you know, intern, so to speak. I don't know. I mean, we'll figure this out over the next few years. Like what is that? And is it generating code that you then review?[00:34:16] Bret: Is it generating code with a set of unit tests that pass? What is the actual, for lack of a better word, contract? Like, how do you know that it did what you wanted it to do? 
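One way to picture the "contract" Bret is asking about is a set of acceptance cases agreed on before the agent writes anything, with the agent's output accepted only if it passes all of them. The following is a minimal sketch under that assumption; `slugify_from_agent`, `meets_contract`, and the cases in `CONTRACT` are hypothetical names and data made up for illustration, not anything described in the episode.

```python
from typing import Callable, Iterable

# Hypothetical acceptance contract: (input, expected output) pairs agreed on
# before the agent writes any code. The cases are illustrative only.
Contract = Iterable[tuple[str, str]]


def slugify_from_agent(title: str) -> str:
    """Stand-in for code an agent might hand back; not from any real tool."""
    # Replace every non-alphanumeric character with a space, lowercase the rest,
    # then join the remaining words with hyphens.
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return "-".join(words)


def meets_contract(candidate: Callable[[str], str], contract: Contract) -> bool:
    """Return True only if the candidate passes every agreed case."""
    return all(candidate(given) == expected for given, expected in contract)


CONTRACT = [
    ("Hello, World!", "hello-world"),
    ("  spaces   everywhere ", "spaces-everywhere"),
    ("already-a-slug", "already-a-slug"),
]

# The delivered function is accepted only when the whole contract holds.
accepted = meets_contract(slugify_from_agent, CONTRACT)
```

A contract like this only covers the cases someone thought to write down, which is part of why the question of how to verify agent-written code is raised again later in the conversation.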
And then I would say like the product and the pricing, the packaging model sort of emerge from that. And I don't think the world's figured it out.[00:34:33] Bret: I think it'll be different for every agent. You know, in our customer base, we do what's called outcome based pricing. So essentially every time the AI agent solves the problem or saves a customer or whatever it might be, there's a pre-negotiated rate for that. We do that because we think that that's sort of the correct way agents, you know, should be packaged.[00:34:53] Bret: I look back at the history of like cloud software and notably the introduction of the browser, which led to [00:35:00] software being delivered in a browser, like Salesforce famously invented sort of software as a service, which is both a technical delivery model through the browser, but also a business model, which is you subscribe to it rather than pay for a perpetual license.[00:35:13] Bret: Those two things are somewhat orthogonal, but not really. If you think about the idea of software running in a browser, that's hosted in a data center that you don't own, you sort of needed to change the business model because you can't really buy a perpetual license or something. Otherwise, like, how do you afford making changes to it?[00:35:31] Bret: So it only worked when you were buying like a new version every year or whatever. So to some degree, but then the business model shift actually changed business as we know it, because now, like, things like Adobe Photoshop you now subscribe to rather than purchase. So it ended up where you had a technical shift and a business model shift that were very logically intertwined, and actually the business model shift turned out to be as significant as the technical shift.[00:35:59] Bret: And I think with [00:36:00] agents, because they actually accomplish a job, I do think that it doesn't make sense to me that you'd pay for the privilege of like. 
Using the software. Like that coding agent, like if it writes really bad code, like fire it, you know, I don't know what the right metaphor is, like you should pay for a job.[00:36:17] Bret: Well done in my opinion. I mean, that's how you pay your software engineers, right? And[00:36:20] swyx: and well, not really. We pay to put them on salary and give them options and they vest over time. That's fair.[00:36:26] Bret: But my point is that you don't pay them for how many characters they write, which is sort of the token based, you know, whatever. Like, there's that famous Apple story where they were like asking for a report of how many lines of code you wrote.[00:36:40] Bret: And one of the engineers showed up with like a negative number cause he had just like done a big refactoring. It was like a big F you to management who didn't understand how software is written. You know, my sense is like the traditional usage based or seat based thing, it's just going to look really antiquated.[00:36:55] Bret: Cause it's like asking your software engineer, how many lines of code did you write today? Like who cares? Like, cause [00:37:00] absolutely no correlation. So my whole view is, I think it'll be different in every category, but I do think that, if an agent is doing a job, I think it properly incentivizes the maker of that agent and the customer, if you're paying for the job well done.[00:37:16] Bret: It's not always perfect to measure. It's hard to measure engineering productivity, but you should do something other than how many keys you typed, you know. Talk about perverse incentives for AI, right? Like I can write really long functions to do the same thing, right? So broadly speaking, you know, I do think that we're going to see a change in business models of software towards outcomes.[00:37:36] Bret: And I think you'll see a change in delivery models too. 
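The difference between token-based and outcome-based pricing can be made concrete with a little arithmetic. The sketch below is only an illustration: the `Conversation` type, both billing functions, and every rate in it are made-up assumptions, not Sierra's actual pricing or any vendor's real numbers.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    tokens_used: int  # tokens the agent consumed in this conversation
    resolved: bool    # did the agent actually solve the customer's problem?


# All rates below are invented for illustration, not real prices.
PRICE_PER_1K_TOKENS = 0.02   # hypothetical usage-based rate
PRICE_PER_RESOLUTION = 1.50  # hypothetical pre-negotiated outcome rate


def usage_based_bill(conversations: list[Conversation]) -> float:
    """Charge for every token, whether or not the job got done."""
    total_tokens = sum(c.tokens_used for c in conversations)
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS


def outcome_based_bill(conversations: list[Conversation]) -> float:
    """Charge only for conversations that ended in a resolution."""
    return sum(PRICE_PER_RESOLUTION for c in conversations if c.resolved)


month = [
    Conversation(tokens_used=4_000, resolved=True),
    Conversation(tokens_used=30_000, resolved=False),  # long and unhelpful
    Conversation(tokens_used=2_000, resolved=True),
]
```

With these invented numbers, the long unresolved conversation accounts for most of the usage-based bill and none of the outcome-based one, which is the perverse incentive being described: under token pricing, verbosity is rewarded whether or not the job gets done.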
And, you know, in our customer base, we empower our customers to really have their hands on the steering wheel of what the agent does. They want and need that. But the role is different. You know, at a lot of our customers, the customer experience operations folks have renamed themselves the AI architects, which I think is really cool.[00:37:55] Bret: And, you know, it's like in the early days of the Internet, there was the role of the webmaster. [00:38:00] And, I don't know, webmaster is not a fashionable, you know, term, nor is it a job anymore. I just, I don't know. Will the AI architect stand the test of time? Maybe, maybe not. But I do think that again, you know, because everyone listening right now is a software engineer.[00:38:14] Bret: Like what is the form factor of a coding agent? And actually I'll take a breath. Cause actually I have a bunch of opinions on that. Like I wrote a blog post right before Christmas, just on the future of software development. And one of the things that's interesting is like, if you look at the way I use Cursor today, as an example, it's inside of.[00:38:31] Bret: A repackaged Visual Studio Code environment. I sometimes use the sort of agentic parts of it, but it's largely, you know, I've sort of gotten a good routine of making it autocomplete code in the way I want through tuning it properly when it actually can write. I do wonder what like the future of development environments will look like.[00:38:55] Bret: And to your point on what is a software product, I think it's going to change a lot in [00:39:00] ways that will surprise us. But I use the metaphor in my blog post of, have you all driven around in a Waymo around here? Yeah, everyone has. 
And there are these Jaguars, the really nice cars, but it's funny because it still has a steering wheel, even though there's no one sitting there and the steering wheel's like turning and stuff. Clearly in the future.[00:39:16] Bret: Once we get to that being more ubiquitous, like why have the steering wheel and also why have all the seats facing forward? Maybe just for car sickness. I don't know, but you could totally rearrange the car. I mean, so much of the car is oriented around the driver, so it stands to reason to me that like, well, autonomous agents for software engineering run through Visual Studio Code.[00:39:37] Bret: That seems a little bit silly because having a single source code file open one at a time is kind of a goofy form factor for when like the code isn't being written primarily by you, but it begs the question of what's your relationship with that agent. And I think the same is true in our industry of customer experience, which is like.[00:39:55] Bret: Who are the people managing this agent? What tools do they need? And they definitely need [00:40:00] tools, but it's probably pretty different than the tools we had before. It's certainly different than training a contact center team. And as software engineers, I think that I would like to see particularly like on the passion project side or research side.[00:40:14] Bret: More innovation in programming languages. I think that we're bringing the cost of writing code down to zero. So the fact that we're still writing Python with AI cracks me up just cause it literally was designed to be ergonomic to write, not safe to run or fast to run. I would love to see more innovation in how we verify program correctness.[00:40:37] Bret: I studied formal verification in college a little bit, and it's not very fashionable because it's really like tedious and slow and doesn't work very well. 
If a lot of code is being written by a machine, you know, one of the primary values we can provide is verifying that it actually does what we intend that it does.

[00:40:56] Bret: I think there should be lots of interesting things in the software development life cycle, like how [00:41:00] we think of testing and everything else, because if we have to manually read every line of code that's coming out of machines, it will just rate limit how much the machines can do. The alternative is totally unsafe.

[00:41:13] Bret: So I wouldn't want to put code in production that didn't go through proper code review and inspection. So my whole view is, like, I actually think there's, like, an AI native, I don't think the coding agents work well enough to do this yet, but once they do, what is sort of an AI native software development life cycle, and how do you actually

[00:41:31] Bret: enable the creators of software to produce the highest quality, most robust, fastest software and know that it's correct? And I think that's an incredible opportunity. I mean, how much C code can we rewrite in Rust and make it safe so that there's fewer security vulnerabilities? Can we, like, have more efficient, safer code than ever before?

[00:41:53] Bret: And can you have someone who's like that guy in The Matrix, you know, like, staring at the little green things? Like, where could you have an operator [00:42:00] of a code generating machine be, like, superhuman? I think that's a cool vision. And I think too many people are focused on, like, autocomplete, you know, right now. I'm not even, I'm guilty as charged,

[00:42:10] Bret: I guess, in some ways, but I'd like to see some bolder ideas. And that's why when you were joking, you know, talking about what's the React of whatever, I think we're clearly in a local maximum, you know, metaphor, like, sort of conceptual local maximum. Obviously it's moving really fast.
I think we're moving out of it.

[00:42:26] Alessio: Yeah. At the end of '23, I read this blog post, from syntax to semantics. Like, if you think about Python, it's taking C and making it more semantic, and LLMs are like the ultimate semantic program, right? You can just talk to them and they can generate any type of syntax from your language. But again, the languages that they have to use were made for us, not for them.

[00:42:46] Alessio: But the problem is, like, as long as you will ever need a human to intervene, you cannot change the language under it. You know what I mean? So I'm curious at what point of automation we'll need to get to before we're going to be okay making changes to the underlying languages, [00:43:00] like the programming languages, versus just saying, hey, you just got to write Python because I understand Python and I'm more important at the end of the day than the model.

[00:43:08] Alessio: But I think that will change, but I don't know if it's like two years or five years. I think it's more nuanced actually.

[00:43:13] Bret: So I think there's a, some of the more interesting programming languages bring semantics into syntax. So let me, that's a little reductive, but like Rust as an example, Rust is memory safe

[00:43:25] Bret: statically, and that was a really interesting conceptual, but it's why it's hard to write Rust. It's why most people write Python instead of Rust. I think Rust programs are safer and faster than Python, probably slower to compile. But, like, broadly speaking, given the option, if you didn't have to care about the labor that went into it,

[00:43:45] Bret: you should prefer a program written in Rust over a program written in Python, just because it will run more efficiently. It's almost certainly safer, et cetera, et cetera, depending on how you define safe. But most people don't write Rust because it's kind of a pain in the ass.
And [00:44:00] the audience of people who can is smaller, but it's sort of better in most ways.

[00:44:05] Bret: And again, let's say you're making a web service and you didn't have to care about how hard it was to write. If you just got the output of the web service, the Rust one would be cheaper to operate. It's certainly cheaper and probably more correct, just because there's so much in the static analysis implied by the Rust programming language that it probably will have fewer runtime errors and things like that as well.

[00:44:25] Bret: So I just give that as an example, because Rust, at least my understanding is that it came out of the Mozilla team, because there's lots of security vulnerabilities in the browser and it needs to be really fast. They said, okay, we want to put more of a burden at the authorship time to have fewer issues at runtime.

[00:44:43] Bret: And we need the constraint that it has to be done statically because browsers need to be really fast. My sense is, if you just think about, like, the needs of a programming language today, where the role of a software engineer is [00:45:00] to use an AI to generate functionality and audit that it does in fact work as intended, maybe functionally, maybe from, like, a correctness standpoint, some combination thereof, how would you create a programming system that facilitated that?

[00:45:15] Bret: And, you know, I bring up Rust because I think it's a good example of, like, I think given a choice of writing in C or Rust, you should choose Rust today. I think most people would say that, even C aficionados, just because
C is largely less safe for very similar, you know, trade-offs, you know, for the system. And now with AI, it's like, okay, well, that just changes the game on writing these things.

[00:45:36] Bret: And so, like, I just wonder if a combination of programming languages that are more structurally oriented towards the values that we need from an AI generated program, verifiable correctness and all of that. If it's tedious to produce for a person, that maybe doesn't matter. But one thing, like, if I asked you, is this Rust program memory safe?

[00:45:58] Bret: You wouldn't have to read it, you just have [00:46:00] to compile it. So that's interesting. I mean, that's one example of a very modest form of formal verification. So I bring that up because I do think you have AI inspect AI, you can have AI do AI code reviews. It would disappoint me if the best we could get was AI reviewing Python. And having scaled a few very large

[00:46:21] Bret: websites that were written in Python, it's just, like, you know, expensive. And, trust me, every team who's written a big web service in Python has experimented with, like, PyPy and all these things just to make it slightly more efficient than it naturally is. You don't really have true multithreading anyway.

[00:46:36] Bret: It's just, like, clearly you do it just because it's convenient to write. And I just feel like, I don't want to say it's insane. I just mean I do think we're at a local maximum. And I would hope that we create a programming system, a combination of programming languages, formal verification, testing, automated code reviews, where you can use AI to generate software in a high scale way and trust it.

[00:46:59] Bret: And you're [00:47:00] not limited by your ability to read it necessarily. I don't know exactly what form that would take, but I feel like that would be a pretty cool world to live in.

[00:47:08] Alessio: Yeah. We had Chris Lattner on the podcast.
He's doing great work with Modular. I mean, I love LLVM. Yeah. Basically merging Rust and Python.

[00:47:15] Alessio: That's kind of the idea. Should be, but what I'm curious about is, like, for them a big use case was making it compatible with Python, same APIs, so that Python developers could use it. Yeah. And so I wonder at what point, well, yeah.

[00:47:26] Bret: At least my understanding is they're targeting the data science, yeah, machine learning crowd, which is all written in Python, so it still feels like a local maximum.

[00:47:34] Bret: Yeah.

[00:47:34] swyx: Yeah, exactly. I'll force you to make a prediction. You know, Python's roughly 30 years old. In 30 years from now, is Rust going to be bigger than Python?

[00:47:42] Bret: I don't know this, but just, I don't even know if this is a prediction. I just am sort of, like, saying stuff I hope is true. I would like to see an AI native programming language and programming system, and I use language because I'm not sure language is even the right thing, but I hope in 30 years, there's an AI native way we make [00:48:00] software that is wholly uncorrelated with the current set of programming languages,

[00:48:04] Bret: or not uncorrelated, but I think most programming languages today were designed to be efficiently authored by people, and some have different trade-offs.

[00:48:15] Evolution of Programming Languages

[00:48:15] Bret: You know, you have Haskell and others that were designed for abstractions for parallelism and things like that. You have programming languages like Python, which are designed to be very easily written, sort of like the Perl and Python lineage, which is why data scientists use it.

[00:48:31] Bret: It has an interactive mode, things like that. And I love, I'm a huge Python fan.
So despite all my Python trash talk, I'm a huge Python fan. At least two of my three companies were exclusively written in Python. And then C came out of the birth of Unix, and it wasn't the first, but certainly the most prominent first step after assembly language, right?

[00:48:54] Bret: Where you had higher level abstractions, going beyond goto to, like, abstractions [00:49:00] like the for loop and the while loop.

[00:49:01] The Future of Software Engineering

[00:49:01] Bret: So I just think that if the act of writing code is no longer a meaningful human exercise, maybe it will be, I don't know. I'm just saying it sort of feels like maybe it's one of those parts of history that just will sort of, like, go away, but there's still the role of the software engineer, like, the person actually building the system.

[00:49:20] Bret: Right. And what does a programming system for that form factor look like?

[00:49:25] React and Front-End Development

[00:49:25] Bret: And I just have a, I hope to be, just like I mentioned, I remember I was at Facebook in the very early days when what is now React was being created. And I remember when it was, like, released open source. I had left by that time, and I was just like, this is so f*****g cool.

[00:49:42] Bret: Like, you know, to basically model your app independent of the data flowing through it just made everything easier. And then now, you know, I can create, like, there's a lot of the front end software gym play is like a little chaotic for me, to be honest with you. It's sort of like [00:50:00] abstraction soup right now for me, but, like, some of those core ideas felt really ergonomic.

[00:50:04] Bret: I'm just looking forward to the day when someone comes up with a programming system that feels both really like an aha moment, but completely foreign to me at the same time. Because they created it, sort of, like, from first principles, recognizing that, like,
Authoring code in an editor is maybe not, like, the primary reason why a programming system exists anymore.

[00:50:26] Bret: And I think that would be a very exciting day for me.

[00:50:28] The Role of AI in Programming

[00:50:28] swyx: Yeah, I would say, like, the various versions of this discussion have happened. At the end of the day, you still need to precisely communicate what you want. As a manager of people, as someone who has done many, many legal contracts, you know how hard that is.

[00:50:42] swyx: And then now we have to talk to machines doing that, and AIs interpreting what we mean and reading our minds effectively. I don't know how to get across that barrier of translating human intent to instructions. And yes, it can be more declarative, but I don't know if it'll ever cross over from being [00:51:00] a programming language to something more than that.

[00:51:02] Bret: I agree with you. And I actually do think if you look at, like, a legal contract, you know, the imprecision of the English language, it's like a flaw in the system. How many

[00:51:12] swyx: holes there are.

[00:51:13] Bret: And I do think that when you're making a mission critical software system, I don't think it should be English language prompts.

[00:51:19] Bret: I think that is silly, because you want the precision of a programming language. My point was less about that and more about if the actual act of authoring it, like if you.

[00:51:32] Formal Verification in Software

[00:51:32] Bret: I think some embedded systems do use formal verification. I know it's very common in, like, security protocols now so that you can, because the importance of correctness is so great.

[00:51:41] Bret: My intellectual exercise is, like, why not do that for all software? I mean, probably that's silly, just literally to do what we literally do for these low level security protocols, but the only reason we don't is because it's hard and tedious, and hard and tedious are no longer factors.
So, like, if I could, I mean, [00:52:00] just think of, like, the silliest app on your phone right now. The idea that that app should be, like, formally verified for its correctness feels laughable right now because, like, God, why would you spend the time on it?

[00:52:10] Bret: But if it's zero cost, like, yeah, I guess so. I mean, it never crashed. That's probably good. You know, why not? I just want to, like, set our bars really high. Like, software has been amazing. Like, there's a Marc Andreessen blog post, software is eating the world. And, you know, our whole life is mediated digitally.

[00:52:26] Bret: And that's just increasing with AI. And now we'll have our personal agents talking to the agents on the CRO platform, and it's agents all the way down. You know, our core infrastructure is running on these digital systems. And we've had a shortage of software developers for my entire life.

[00:52:45] Bret: And as a consequence, you know, if you look, remember, like, healthcare.gov, that fiasco, security vulnerabilities leading to state actors getting access to critical infrastructure. I'm like, we now have, like, created this amazing system that can, [00:53:00] like, we can fix this, you know. And I'm excited about the productivity gains in the economy, but I just think as software engineers, we should be bolder.

[00:53:08] Bret: Like, we should have aspirations to fix these systems so that, like, in general, as you said, as precise as we want to be in the specification of the system, we can make it work correctly now. And I'm being a little bit hand-wavy, and I think we need some systems. I think that's where we should set the bar, especially when so much of our life depends on this critical digital infrastructure.

[00:53:28] Bret: So I'm just, like, super optimistic about it. But actually, let's go to w


#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Lex Fridman Podcast

Feb 3, 2025 (316:20)

Dylan Patel is the founder of SemiAnalysis, a research & analysis company specializing in semiconductors, GPUs, CPUs, and AI hardware. Nathan Lambert is a research scientist at the Allen Institute for AI (Ai2) and the author of a blog on AI called Interconnects. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep459-sc See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

CONTACT LEX:
- Feedback - give feedback to Lex: https://lexfridman.com/survey
- AMA - submit questions, videos or call-in: https://lexfridman.com/ama
- Hiring - join our team: https://lexfridman.com/hiring
- Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
- Dylan's X: https://x.com/dylan522p
- SemiAnalysis: https://semianalysis.com/
- Nathan's X: https://x.com/natolambert
- Nathan's Blog: https://www.interconnects.ai/
- Nathan's Podcast: https://www.interconnects.ai/podcast
- Nathan's Website: https://www.natolambert.com/
- Nathan's YouTube: https://youtube.com/@natolambert
- Nathan's Book: https://rlhfbook.com/

SPONSORS: To support this podcast, check out our sponsors & get discounts:
- Invideo AI: AI video generator. Go to https://invideo.io/i/lexpod
- GitHub: Developer platform and AI code editor. Go to https://gh.io/copilot
- Shopify: Sell stuff online. Go to https://shopify.com/lex
- NetSuite: Business management software. Go to http://netsuite.com/lex
- AG1: All-in-one daily nutrition drinks. Go to https://drinkag1.com/lex

OUTLINE:
(00:00) - Introduction
(13:28) - DeepSeek-R1 and DeepSeek-V3
(35:02) - Low cost of training
(1:01:19) - DeepSeek compute cluster
(1:08:52) - Export controls on GPUs to China
(1:19:10) - AGI timeline
(1:28:35) - China's manufacturing capacity
(1:36:30) - Cold war with China
(1:41:00) - TSMC and Taiwan
(2:04:38) - Best GPUs for AI
(2:19:30) - Why DeepSeek is so cheap
(2:32:49) - Espionage
(2:41:52) - Censorship
(2:54:46) - Andrej Karpathy and magic of RL
(3:05:17) - OpenAI o3-mini vs DeepSeek r1
(3:24:25) - NVIDIA
(3:28:53) - GPU smuggling
(3:35:30) - DeepSeek training on OpenAI data
(3:45:59) - AI megaclusters
(4:21:21) - Who wins the race to AGI?
(4:31:34) - AI agents
(4:40:16) - Programming and AI
(4:47:43) - Open source
(4:56:55) - Stargate
(5:04:24) - Future of AI

PODCAST LINKS:
- Podcast Website: https://lexfridman.com/podcast
- Apple Podcasts: https://apple.co/2lwqZIr
- Spotify: https://spoti.fi/2nEwCF8
- RSS: https://lexfridman.com/feed/podcast/
- Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
- Clips Channel: https://www.youtube.com/lexclips

[Ride Home] Simon Willison: Things we learned about LLMs in 2024

Latent Space: The AI Engineer Podcast - CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Jan 12, 2025 (73:23)

Due to overwhelming demand (>15x applications:slots), we are closing CFPs for AI Engineer Summit NYC today. Last call! Thanks, we'll be reaching out to all shortly!

The world's top AI blogger and friend of every pod, Simon Willison, dropped a monster 2024 recap: Things we learned about LLMs in 2024. Brian of the excellent Techmeme Ride Home pinged us for a connection and a special crossover episode, our first in 2025. The target audience for this podcast is a tech-literate, but non-technical one. You can see Simon's notes for AI Engineers in his World's Fair Keynote.

Timestamp

* 00:00 Introduction and Guest Welcome
* 01:06 State of AI in 2025
* 01:43 Advancements in AI Models
* 03:59 Cost Efficiency in AI
* 06:16 Challenges and Competition in AI
* 17:15 AI Agents and Their Limitations
* 26:12 Multimodal AI and Future Prospects
* 35:29 Exploring Video Avatar Companies
* 36:24 AI Influencers and Their Future
* 37:12 Simplifying Content Creation with AI
* 38:30 The Importance of Credibility in AI
* 41:36 The Future of LLM User Interfaces
* 48:58 Local LLMs: A Growing Interest
* 01:07:22 AI Wearables: The Next Big Thing
* 01:10:16 Wrapping Up and Final Thoughts

Transcript

[00:00:00] Introduction and Guest Welcome

[00:00:00] Brian: Welcome to the first bonus episode of the Techmeme Ride Home for the year 2025. I'm your host as always, Brian McCullough. Listeners to the pod over the last year know that I have made a habit of quoting from Simon Willison when new stuff happens in AI from his blog. Simon has become a go-to for many folks in terms of, you know, analyzing things, criticizing things in the AI space.

[00:00:33] Brian: I've wanted to talk to you for a long time, Simon. So thank you for coming on the show. No, it's a privilege to be here.
And the person that made this connection happen is our friend Swyx, who has been on the show back, even going back to the Twitter Spaces days, but also an AI guru in their own right. Swyx, thanks for coming on the show also.

[00:00:54] swyx (2): Thanks. I'm happy to be on and have been a regular listener, so just happy to [00:01:00] contribute as well.

[00:01:00] Brian: And a good friend of the pod, as they say. Alright, let's go right into it.

[00:01:06] State of AI in 2025

[00:01:06] Brian: Simon, I'm going to do the most unfair, broad question first, so let's get it out of the way. The year 2025. Broadly, what is the state of AI as we begin this year?

[00:01:20] Brian: Whatever you want to say, I don't want to lead the witness.

[00:01:22] Simon: Wow. So many things, right? I mean, the big thing is everything's got really good and fast and cheap. Like, that was the trend throughout all of 2024. The good models got so much cheaper, they got so much faster, they got multimodal, right? The image stuff isn't even a surprise anymore.

[00:01:39] Simon: They're growing video, all of that kind of stuff. So that's all really exciting.

[00:01:43] Advancements in AI Models

[00:01:43] Simon: At the same time, they didn't get massively better than GPT-4, which was a bit of a surprise. So that's sort of one of the open questions: are we going to see huge, but I kind of feel like that's a bit of a distraction because GPT-4, but way cheaper, much larger context lengths, and it [00:02:00] can do multimodal,

[00:02:01] Simon: is better, right? That's a better model, even if it's not.

[00:02:05] Brian: What people were expecting or hoping, maybe not expecting is not the right word, but hoping that we would see another step change, right? Right. From, like, GPT-2 to 3 to 4, we were expecting or hoping that maybe we were going to see the next evolution in that sort of, yeah.

[00:02:21] Simon: We did see that, but not in the way we expected.
We thought the model was just going to get smarter, and instead we got massive drops in price. We got all of these new capabilities. You can talk to the things now, right? They can do simulated audio input, all of that kind of stuff. And so it's interesting to me that the models improved in all of these ways we weren't necessarily expecting.

[00:02:43] Simon: I didn't know it would be able to do an impersonation of Santa Claus, like, you know, talk to it through my phone and show it what I was seeing, by the end of 2024. But yeah, we didn't get that GPT-5 step. And that's one of the big open questions: is that actually just around the corner, and we'll have a bunch of GPT-5 class models drop in the [00:03:00] next few months?

[00:03:00] Simon: Or is there a limit?

[00:03:03] Brian: If you were a betting man and wanted to put money on it, do you expect to see a phase change, step change in 2025?

[00:03:11] Simon: I don't particularly for that, like, the models, but smarter. I think all of the trends we're seeing right now are going to keep on going, especially the inference time compute, right?

[00:03:21] Simon: The trick that o1 and o3 are doing, which means that you can solve harder problems, but they cost more and it churns away for longer. I think that's going to happen because that's already proven to work. I don't know. I don't know. Maybe there will be a step change to a GPT-5 level, but honestly, I'd be completely happy if we got what we've got right now,

[00:03:41] Simon: but cheaper and faster and more capabilities and longer contexts and so forth. That would be thrilling to me.

[00:03:46] Brian: Digging into what you've just said, one of the things that, by the way, I hope to link in the show notes to Simon's year end post about what things we learned about LLMs in 2024.
Look for that in the show notes.

[00:03:59] Cost Efficiency in AI

[00:03:59] Brian: One of the things that you [00:04:00] did say, that you alluded to even right there, was that in the last year, you felt like the GPT-4 barrier was broken, like, i.e., other models, even open source ones, are now regularly matching sort of the state of the art.

[00:04:13] Simon: Well, it's interesting, right? So the GPT-4 barrier was, a year ago, the best available model was OpenAI's GPT-4 and nobody else had even come close to it.

[00:04:22] Simon: And they'd been in the lead for, like, nine months, right? That thing came out in, what, February, March of 2023. And for the rest of 2023, nobody else came close. And so at the start of last year, like a year ago, the big question was, why has nobody beaten them yet? Like, what do they know that the rest of the industry doesn't know?

[00:04:40] Simon: And today, I've counted 18 organizations other than OpenAI who've put out a model which clearly beats that GPT-4 from a year ago thing. Like, maybe they're not better than GPT-4o, but that barrier got completely smashed. And yeah, a few of those I've run on my laptop, which is wild to me.

[00:04:59] Simon: Like, [00:05:00] it was very, very wild. It felt very clear to me a year ago that if you want GPT-4, you need a rack of $40,000 GPUs just to run the thing. And that turned out not to be true. Like, this is that big trend from last year of the models getting more efficient, cheaper to run, just as capable with smaller weights and so forth.

[00:05:20] Simon: And I ran another GPT-4 model on my laptop this morning, right? Microsoft Phi-4 just came out. And that, if you look at the benchmarks, it's up there with GPT-4o. It's probably not as good when you actually get into the vibes of the thing, but it's a 14 gigabyte download and I can run it on a MacBook Pro.

[00:05:38] Simon: Like, who saw that coming?
The most exciting, like, the close of the year, on Christmas day, just a few weeks ago, was when DeepSeek dropped their DeepSeek v3 model on Hugging Face without even a readme file. It was just, like, a giant binary blob that I can't run on my laptop. It's too big. But in all of the benchmarks, it's now by far the best available [00:06:00] open weights model.

[00:06:01] Simon: Like, it's beating the Meta Llamas and so forth. And that was trained for five and a half million dollars, which is a tenth of the price that people thought it costs to train these things. So everything's trending smaller and faster and more efficient.

[00:06:15] Brian: Well, okay.

[00:06:16] Challenges and Competition in AI

[00:06:16] Brian: I kind of was going to get to that later, but let's combine this with what I was going to ask you next, which is, you know, you're talking also in the piece about the LLM prices crashing, which I've even seen in projects that I'm working on. But explain that to a general audience, because we hear all the time that LLMs are eye-wateringly expensive to run. But what we're suggesting, and we'll come back to the cheap Chinese LLM, but first of all, for the end user, what you're suggesting is that we're starting to see the cost come down sort of in the traditional technology way of costs coming down over time?

[00:06:49] Simon: Yes, but very aggressively.

[00:06:51] Simon: I mean, my favorite example here is if you look at GPT-3, so OpenAI's GPT-3, which was the best available model in [00:07:00] 2022 and through most of 2023. The models that we have today, the OpenAI models, are a hundred times cheaper. So there was a 100x drop in price for OpenAI from their best available model, like, two and a half years ago to today.

[00:07:14] Brian: And just to be clear, not to train the model, but for the use of tokens and things.
[00:07:20] Simon: Exactly, for running prompts through them. And then when you look at the really top tier model providers right now, I think, are OpenAI, Anthropic, Google, and Meta. And there are a bunch of others that I could list there as well.

[00:07:32] Simon: Mistral are very good. The DeepSeek and Qwen models have got great. There's a whole bunch of providers serving really good models. But even if you just look at the sort of big brand name providers, they all offer models now that are a fraction of the price of the models we were using last year.

[00:07:49] Simon: I think I've got some numbers that I threw into my blog entry here. Yeah. Like, Gemini 1.5 Flash, that's Google's fast, high quality model, is [00:08:00] how much is that? It's 0.075 dollars per million tokens. Like, these numbers are getting, so we just do cents per million now,

[00:08:09] swyx (2): cents per million,

[00:08:10] Simon: cents per million makes a lot more sense.

[00:08:12] Simon: Yeah, they have one model, 1.5 Flash 8B, the absolute cheapest of the Google models, is 27 times cheaper than GPT-3.5 Turbo was a year ago. That's it. And GPT-3.5 Turbo, that was the cheap model, right? Now we've got something 27 times cheaper, and this Google one can do image recognition, it can do million token context, all of those tricks.

[00:08:36] Simon: But it really is startling how inexpensive some of this stuff has got.

[00:08:41] Brian: Now, are we assuming that this happening is directly the result of competition?
Because again, you know, OpenAI, and probably they're doing this for their own almost political reasons, strategic reasons, keeps saying, we're losing money on everything, even the 200.

[00:08:56] Brian: So they probably wouldn't, the prices wouldn't be [00:09:00] coming down if there wasn't intense competition in this space.

[00:09:04] Simon: The competition is absolutely part of it, but I have it on good authority from sources I trust that Google Gemini is not operating at a loss. Like, the amount of electricity to run a prompt is less than they charge you.

[00:09:16] Simon: And the same thing for Amazon Nova. Like, somebody found an Amazon executive and got them to say, yeah, we're not losing money on this. I don't know about Anthropic and OpenAI, but clearly that demonstrates it is possible to run these things at these ludicrously low prices and still not be running at a loss, if you discount the army of PhDs and the training costs and all of that kind of stuff.

[00:09:36] Brian: One more for me before I let Swyx jump in here. To come back to DeepSeek and this idea that you could train, you know, a cutting edge model for 6 million. I was saying on the show, like, six months ago, that if we are getting to the point where each new model would cost a billion, ten billion, a hundred billion to train,

[00:09:54] Brian: at some point, only nation states would be able to train the new models. Do you [00:10:00] expect what DeepSeek and maybe others are proving to sort of blow that up? Or is there, like, some sort of a parallel track here that maybe I'm not technically, I don't have the mouse to understand the difference?

[00:10:11] Brian: Are the models going to go, you know, up to a hundred billion dollars, or can we get them down, sort of like DeepSeek has proven?

[00:10:18] Simon: So I'm the wrong person to answer that because I don't work in the lab training these models.
So I can give you my completely uninformed opinion, which is, I felt like the DeepSeek thing,
[00:10:27] Simon: that was a bombshell. That was an absolute bombshell when they came out and said, hey, look, we've trained one of the best available models and it cost us five and a half million dollars to do it. And one of the reasons it's so efficient is that we put all of these export controls in to stop Chinese companies from buying GPUs.
[00:10:44] Simon: So they were forced to go as efficient as possible. And yet the fact that they've demonstrated that that's possible to do, I think it does completely tear apart this mental model we had before that, yeah, the training runs just keep on getting more and more expensive and the number of [00:11:00] organizations that can afford to run these training runs keeps on shrinking.
[00:11:03] Simon: That's been blown out of the water. So yeah, again, this was our Christmas gift. This was the thing they dropped on Christmas Day. Yeah, it makes me really optimistic. It feels like there was so much low hanging fruit in terms of the efficiency of both inference and training, and we spent a whole bunch of last year exploring that and getting results from it.
[00:11:22] Simon: I think there's probably a lot left. I would not be surprised to see even better models trained spending even less money over the next six months.
[00:11:31] swyx (2): Yeah. So I think there's an unspoken angle here on what exactly the Chinese labs are trying to do, because DeepSeek made a lot of noise
[00:11:41] swyx (2): around the fact that they trained their model for six million dollars, and nobody quite believes them. Like it's very, very rare for a lab to trumpet the fact that they're doing it for so cheap. They're not trying to get anyone to buy them. So why [00:12:00] are they doing this?
They make it very, very obvious.
[00:12:05] swyx (2): DeepSeek is about 150 employees. It's an order of magnitude smaller than at least Anthropic, and maybe more so for OpenAI. And so what's the end game here? Are they just trying to show that the Chinese are better than us?
[00:12:21] Simon: So DeepSeek, it's the arm of a hedge, it's a quant fund, right?
[00:12:25] Simon: It's an algorithmic quant trading thing. So I would love to get more insight into how that organization works. My assumption from what I've seen is it looks like they're basically just flexing. They're like, hey, look at how utterly brilliant we are with this amazing thing that we've done. And it's working, right?
[00:12:43] Simon: But, and so is that it? Is this just their kind of like, this is why our company is so amazing, look at this thing that we've done, or? I don't know. I'd love to get some insight from within that industry as to how that's all playing out.
[00:12:57] swyx (2): The prevailing theory among the Local Llama [00:13:00] crew and the Twitter crew that I index for my newsletter is that there is some amount of copying going on.
[00:13:06] swyx (2): It's like Sam Altman, you know, tweeting about how they're being copied. And then also there are other sort of OpenAI employees that have said stuff that is similar, that DeepSeek's rate of progress is how U.S. intelligence estimates the number of foreign spies embedded in top labs.
[00:13:22] swyx (2): Because a lot of these ideas do spread around, but they surprisingly have a very high density of them in the DeepSeek v3 technical report. So it's interesting. We don't know how many tokens. I think that, you know, people have run analysis on how often DeepSeek thinks it is Claude or thinks it is OpenAI GPT-4.
[00:13:40] swyx (2): And we don't know. We don't know.
I think for me, like, yeah, we basically will never know as external commentators. I think what's interesting is, where does this go? Is there a logical floor or bottom? By my estimations, for the same amount of Elo, from the start of last year to the end of last year, cost went down by a thousand X for [00:14:00] GPT-4 intelligence.
[00:14:02] swyx (2): Do they go down a thousand X this year?
[00:14:04] Simon: That's a fascinating question. Yeah.
[00:14:06] swyx (2): Is there a Moore's law going on, or did we just get a one off benefit last year for some weird reason?
[00:14:14] Simon: My uninformed hunch is low hanging fruit. I feel like up until a year ago, people hadn't been focusing on efficiency at all. You know, it was all about, what can we get these weird shaped things to do?
[00:14:24] Simon: And now once we've sort of hit that, okay, we know that we can get them to do what GPT-4 can do, then thousands of researchers around the world all focus on, okay, how do we make this more efficient? Like, how do we strip out all of the weights that have stuff in that doesn't really matter?
[00:14:39] Simon: All of that kind of thing. So yeah, maybe that was it. Maybe 2024 was a freak year of all of the low hanging fruit coming out at once, and we'll actually see a reduction in that rate of improvement in terms of efficiency. I wonder, I mean, I think we'll know for sure in about three months' time if that trend's going to continue or not.
[00:14:58] swyx (2): I agree. You know, I [00:15:00] think the other thing that you mentioned, that DeepSeek v3 was the gift that was given from DeepSeek over Christmas, but I feel like the other thing that might be underrated was DeepSeek R1,
[00:15:11] Speaker 4: which is
[00:15:13] swyx (2): a reasoning model you can run on your laptop.
And I think that's something that a lot of people are looking ahead to this year.
[00:15:18] swyx (2): Oh, did they
[00:15:18] Simon: release the weights for that one?
[00:15:20] swyx (2): Yeah.
[00:15:21] Simon: Oh my goodness, I missed that. I've been playing with the Qwen. So the other great, the other big Chinese AI app is Alibaba's Qwen. Actually, yeah, I, sorry, R1 is an API available. Yeah. Exactly. When that's really cool. So Alibaba's Qwen have released two reasoning models that I've run on my laptop.
[00:15:38] Simon: Now the first one was QwQ. And then the second one was QVQ, because the second one's a vision model. So you can like give it vision puzzles and a prompt. These things, they are so much fun to run, because they think out loud. It's like the OpenAI o1 sort of hides its thinking process. The Qwen ones don't.
[00:15:59] Simon: They just [00:16:00] churn away. And so you'll give it a problem and it will output literally dozens of paragraphs of text about how it's thinking. My favorite thing that happened with QwQ is I asked it to draw me a pelican on a bicycle in SVG. That's like my standard stupid prompt. And for some reason it thought in Chinese.
[00:16:18] Simon: It spat out a whole bunch of like Chinese text onto my terminal on my laptop, and then at the end it gave me quite a good sort of artistic pelican on a bicycle. And I ran it all through Google Translate, and yeah, it was contemplating the nature of SVG files as a starting point. And the fact that my laptop can think in Chinese now is so delightful.
[00:16:40] Simon: It's so much fun watching it do that.
[00:16:43] swyx (2): Yeah, I think Andrej Karpathy was saying, you know, we know that we have achieved proper reasoning inside of these models when they stop thinking in English, and perhaps the best form of thought is in Chinese.
But yeah, for listeners who don't know Simon's blog, whenever a new model comes out, I don't know how you do it, but [00:17:00] you're always the first to run Pelican Bench on these models.
[00:17:02] swyx (2): I just did it for 5.
[00:17:05] Simon: Yeah.
[00:17:07] swyx (2): So I really appreciate that. You should check it out. These are not theoretical. Simon's blog actually shows them.
[00:17:12] Brian: Let me put on the investor hat for a second.
[00:17:15] AI Agents and Their Limitations
[00:17:15] Brian: Because from the investor side of things, a lot of the VCs that I know are really hot on agents, and this is the year of agents, but last year was supposed to be the year of agents as well. Lots of money flowing towards agentic startups.
[00:17:32] Brian: But in your piece that, again, we're hopefully going to have linked in the show notes, you sort of suggest there's a fundamental flaw in AI agents as they exist right now. Let me quote you, and then I'd love to dive into this. You said, I remain skeptical as to their ability based, once again, on the challenge of gullibility.
[00:17:49] Brian: LLMs believe anything you tell them. Any systems that attempt to make meaningful decisions on your behalf will run into the same roadblock. How good is a travel agent, or a digital assistant, or even a research tool, if it [00:18:00] can't distinguish truth from fiction?
So, essentially, what you're suggesting is that the state of the art now that allows agents is still that sort of 90 percent problem, the edge problem, getting to the... Or is there a deeper flaw?
[00:18:14] Brian: What are you saying there?
[00:18:16] Simon: So this is the fundamental challenge here, and honestly my frustration with agents is mainly around definitions. Like, if you ask anyone who says they're working on agents to define agents, you will get a subtly different definition from each person, but everyone always assumes that their definition is the one true one that everyone else understands. So I feel like in a lot of these agent conversations, people are talking past each other, because one person's talking about the sort of travel agent idea of something that books things on your behalf.
[00:18:41] Simon: Somebody else is talking about LLMs with tools running in a loop with a cron job somewhere, and all of these different things. You ask academics and they'll laugh at you, because they've been debating what agents mean for over 30 years at this point. It's like this long running, almost sort of an in-joke in that community.
[00:18:57] Simon: But if we assume that, for the purpose of this conversation, an [00:19:00] agent is something which you can give a job and it goes off and it does that thing for you, like booking travel or things like that, the fundamental challenge is the reliability thing, which comes from this gullibility problem.
[00:19:12] Simon: And a lot of my interest in this originally came from when I was thinking about prompt injections as a source of this form of attack against LLM systems, where you deliberately lay traps out there for this LLM to stumble across,
[00:19:24] Brian: and which I should say you have been banging this drum that no one's gotten very far, at least on solving this, that I'm aware of, right.
[00:19:31] Brian: Like that's still an open problem.
For two years.
[00:19:33] Simon: Yeah. Right. We've been talking about this problem, and like, a great illustration of this was Claude. So Anthropic released Claude computer use a few months ago. Fantastic demo. You could fire up a Docker container and you could literally tell it to do something and watch it open a web browser and navigate to a webpage and click around and so forth.
[00:19:51] Simon: Really, really interesting and fun to play with. And then, um, one of the first demos somebody tried was, what if you give it a web page that says download and run this [00:20:00] executable, and it did, and the executable was malware that added it to a botnet. So the very first, most obvious dumb trick that you could play on this thing just worked, right?
[00:20:10] Simon: So that's obviously a really big problem. If I'm going to send something out to book travel on my behalf, I mean, it's hard enough for me to figure out which airlines are trying to scam me and which ones aren't. Do I really trust a language model that believes the literal truth of anything that's presented to it to go out and do those things?
[00:20:29] swyx (2): Yeah, I definitely think it's interesting to see Anthropic doing this, because they used to be the safety arm of OpenAI that split out and said, you know, we're worried about letting this thing out in the wild, and here they are enabling computer use for agents. It feels like things have merged.
[00:20:49] swyx (2): You know, I'm also fairly skeptical about, you know, this always being the year of Linux on the desktop. And this is the equivalent of this being the year of agents, that people [00:21:00] are not predicting so much as wishfully thinking and hoping and praying for their companies and agents to work.
[00:21:05] swyx (2): But I feel like things are coming along a little bit. To me, it's kind of like self driving. I remember in 2014 saying that self driving was just around the corner.
And I mean, it kind of is, you know, like in the Bay Area. You
[00:21:17] Simon: get in a Waymo and you're like, oh, this works. Yeah, but it's a slow
[00:21:21] swyx (2): cook.
[00:21:21] swyx (2): It's a slow cook over the next 10 years. We're going to hammer out these things and the cynical people can just point to all the flaws, but like, there are measurable or concrete progress steps that are being made by these builders.
[00:21:33] Simon: There is one form of agent that I believe in. I mostly believe in the research assistant form of agents.
[00:21:39] Simon: The thing where you've got a difficult problem and, like, I'm on the beta for the Google Gemini 1.5 Pro with Deep Research. I think it's called that. Like these names, these names. Right. But I've been using that. It's good, right? You can give it a difficult problem and it tells you, okay, I'm going to look at 56 different websites, [00:22:00] and it goes away and it dumps everything to its context and it comes up with a report for you.
[00:22:04] Simon: And it won't work against adversarial websites, right? If there are websites with deliberate lies in them, it might well get caught out. Most things don't have that as a problem. And so I've had some answers from that which were genuinely really valuable to me. And that feels to me like, I can see how, given existing LLM tech, especially with Google Gemini with its like million token context and Google with their crawl of the entire web, they've got search, they've got a cache of every page and so forth.
[00:22:35] Simon: That makes sense to me. And what they've got right now, it's not as good as it can be, obviously, but it's a real useful thing, which they're going to start rolling out. So, you know, Perplexity have been building the same thing for a couple of years.
That I believe in.
[00:22:50] Simon: You know, if you tell me that you're going to have an agent that's a research assistant agent, great. The coding agents, I mean, ChatGPT Code Interpreter, nearly two years [00:23:00] ago, that thing started writing Python code, executing the code, getting errors, rewriting it to fix the errors. That pattern obviously works.
[00:23:07] Simon: That works really, really well. So, yeah, coding agents that do that sort of error message loop thing, those are proven to work. And they're going to keep on getting better, and that's going to be great. The research assistant agents are just beginning to get there. The things I'm critical of are the ones where you trust this thing to go out and act autonomously on your behalf, and make decisions on your behalf, especially involving spending money, like that.
[00:23:31] Simon: I don't see that working for a very long time. That feels to me like an AGI level problem.
[00:23:37] swyx (2): It's funny, because I think Stripe actually released an agent toolkit, which is one of the things I featured, that is trying to enable these agents each to have a wallet that they can go and spend, and basically, it's a virtual card.
[00:23:49] swyx (2): It's not that difficult with modern infrastructure.
[00:23:51] Simon: You can stick a $50 cap on it, then at least it can't lose more than $50.
[00:23:56] Brian: You know, I don't know if either of you know Rafat Ali, [00:24:00] he runs Skift, which is a travel news vertical. And he constantly laughs at the fact that every agent thing is, we're gonna get rid of booking a plane flight for you, you know?
[00:24:11] Brian: And I would point out that, like, historically, when the web started, the first thing everyone talked about is, you can go online and book a trip, right? So it's funny, for each generation of like technological advance, the thing they always want to kill is the travel agent.
And now they want to kill the webpage travel agent.
[00:24:29] Simon: Like, I use Google flight search. It's great, right? If you gave me an agent to do that for me, it would save me, I mean, maybe 15 seconds of typing in my things, but I still want to see what my options are and go, yeah, I'm not flying on that airline, no matter how cheap they are.
[00:24:44] swyx (2): Yeah. For listeners, go ahead.
[00:24:47] swyx (2): For listeners, I think, you know, I think both of you are pretty positive on NotebookLM. And you know, we actually interviewed the NotebookLM creators, and there are actually two internal agents going on internally. The reason it takes so long is because they're running an agent loop [00:25:00] inside that is fairly autonomous, which is kind of interesting.
[00:25:01] swyx (2): For one,
[00:25:02] Simon: for a definition of agent loop, if you picked that particularly well. For one definition. And you're talking about the podcast side of this, right?
[00:25:07] swyx (2): Yeah, the podcast side of things. There's going to be a new version coming out that we'll be featuring at our conference.
[00:25:14] Simon: That one's fascinating to me. Like NotebookLM, I think it's two products, right? On the one hand, it's actually a very good RAG product, right? You dump a bunch of things in, you can run searches, that it does a good job of. And then they added the podcast thing. It's a total gimmick, right?
[00:25:30] Simon: But that gimmick got them attention, because they had a great product that nobody paid any attention to at all. And then you add the unfeasibly good voice synthesis of the podcast. Like, it's the lesson.
[00:25:43] Brian: It's the lesson of Midjourney and stuff like that.
If you can create something that people can post on socials, you don't have to lift a finger again to do any marketing for what you're doing.
[00:25:53] Brian: Let me dig into NotebookLM just for a second as a podcaster. As a [00:26:00] gimmick, it makes sense, and then obviously, you know, you dig into it, it sort of has problems around the edges. It's like, it does the thing that all sort of LLMs kind of do, where it's like, oh, we want to wrap up with a conclusion.
[00:26:12] Multimodal AI and Future Prospects
[00:26:12] Brian: I always call that like the eighth grade book report paper problem, where it has to have an intro and then, you know. But that's sort of a thing where, because I think you spoke about this again in your piece at the year end, about how things are going multimodal in ways that you didn't expect, like, you know, vision and especially audio, I think. So that's another thing where, at least over the last year, there's been progress made that maybe you didn't think was coming as quick as it came.
[00:26:43] Simon: I don't know. I mean, a year ago, we had one really good vision model. We had GPT-4 Vision, which was very impressive. And Google Gemini had just dropped Gemini 1.0, which had vision, but nobody had really played with it yet. People weren't taking Gemini [00:27:00] seriously at that point. I feel like it was 1.5 Pro
[00:27:02] Simon: when it became apparent that actually they got over their hump and they were building really good models. And yeah, to be honest, the video models are mostly still using the same trick. The thing where you divide the video up into one image per second and you dump that all into the context.
[00:27:16] Simon: So maybe it shouldn't have been so surprising to us that long context models plus vision meant that video was starting to be solved. Of course, it didn't.
Not being, you, what you really want with videos, you want to be able to do the audio and the images at the same time. And I think the models are beginning to do that now.
[00:27:33] Simon: Like, Gemini 1.5 Pro originally ignored the audio. It just did the, like, one frame per second video trick. As far as I can tell, the most recent ones are actually doing pure multimodal. But the things that opens up are just extraordinary. Like, the ChatGPT iPhone app feature that they shipped as one of their 12 days of OpenAI, I really can be having a conversation and just turn on my video camera and go, hey, what kind of tree is [00:28:00] this?
[00:28:00] Simon: And so forth. And it works. And for all I know, that's just snapping a picture once a second and feeding it into the model. The things that you can do with that as an end user are extraordinary. I don't think most people have cottoned onto the fact that you can now stream video directly into a model, because it's only a few weeks old.
[00:28:22] Simon: Wow. That's a big boost in terms of what kinds of things you can do with this stuff. Yeah. For
[00:28:30] swyx (2): people who are not that close, I think Gemini Flash's free tier allows you to do something like capture a photo, one photo every second or a minute, and leave it on 24/7, and you can prompt it to do whatever.
[00:28:45] swyx (2): And so you can effectively have your own camera app or monitoring app that you just prompt, and it detects where it changes. It detects for, you know, alerts or anything like that, or describes your day.
You know, and the fact that this is free, I think [00:29:00] also leads into the previous point of the prices having come down a lot.
[00:29:05] Simon: And even if you're paying for this stuff, like a thing that I put in my blog entry is I ran a calculation on what it would cost to process 68,000 photographs in my photo collection, and for each one just generate a caption, and using Gemini 1.5 Flash 8B, it would cost me $1.68 to process 68,000 images, which is, I mean, that doesn't make sense.
[00:29:28] Simon: None of that makes sense. Like it's one four hundredth of a cent per image to generate captions now. So you can see why feeding in a day's worth of video just isn't even very expensive to process.
[00:29:40] swyx (2): Yeah, I'll tell you what is expensive. It's the other direction. So here, we're talking about consuming video.
[00:29:46] swyx (2): And this year, we also had a lot of progress, like probably one of the most excitedly anticipated launches of the year was Sora. We actually got Sora. And less exciting.
[00:29:55] Simon: We did, and then Veo 2, Google's Sora, came out like three [00:30:00] days later and upstaged it. Like, Sora was exciting until Veo 2 landed, which was just better.
[00:30:05] swyx (2): In general, I feel the media, or the social media, has been very unfair to Sora. Because what was released to the world, generally available, was Sora Lite. It's the distilled version of Sora, right? So you're, I did not
[00:30:16] Simon: realize that you're absolutely comparing
[00:30:18] swyx (2): the most cherry picked version of Veo 2, the one that they published on the marketing page, to the most embarrassing version of Sora.
[00:30:25] swyx (2): So of course it's gonna look bad, so, well, I got
[00:30:27] Simon: access to the Veo 2, I'm in the Veo 2 beta and I've been poking around with it and getting it to generate pelicans on bicycles and stuff. I would absolutely
I would absolutely[00:30:34] swyx (2): believe that[00:30:35] Simon: VL2 is actually better. Is Sora, so is full fat Sora coming soon? Do you know, when, when do we get to play with that one?[00:30:42] Simon: No one's[00:30:43] swyx (2): mentioned anything. I think basically the strategy is let people play around with Sora Lite and get info there. But the, the, keep developing Sora with the Hollywood studios. That's what they actually care about. Gotcha. Like the rest of us. Don't really know what to do with the video anyway. Right.[00:30:59] Simon: I mean, [00:31:00] that's my thing is I realized that for generative images and images and video like images We've had for a few years and I don't feel like they've broken out into the talented artist community yet Like lots of people are having fun with them and doing and producing stuff. That's kind of cool to look at but what I want you know that that movie everything everywhere all at once, right?[00:31:20] Simon: One, one ton of Oscars, utterly amazing film. The VFX team for that were five people, some of whom were watching YouTube videos to figure out what to do. My big question for, for Sora and and and Midjourney and stuff, what happens when a creative team like that starts using these tools? I want the creative geniuses behind everything, everywhere all at once.[00:31:40] Simon: What are they going to be able to do with this stuff in like a few years time? Because that's really exciting to me. That's where you take artists who are at the very peak of their game. Give them these new capabilities and see, see what they can do with them.[00:31:52] swyx (2): I should, I know a little bit here. So it should mention that, that team actually used RunwayML.[00:31:57] swyx (2): So there was, there was,[00:31:57] Simon: yeah.[00:31:59] swyx (2): I don't know how [00:32:00] much I don't. So, you know, it's possible to overstate this, but there are people integrating it. 
Generated video within their workflow, even pre-Sora. Right, because
[00:32:09] Brian: it's not the thing where it's like, okay, tomorrow we'll be able to do a full two hour movie that you prompt with three sentences.
[00:32:15] Brian: It is like, for the very first part of, you know, video effects in film, it's like, if you can get that three second clip, if you can get that 20 second thing that they did in The Matrix that blew everyone's minds and took a million dollars or whatever to do, like, it's the little bits and pieces that they can fill in now that it's probably already there.
[00:32:34] swyx (2): Yeah, it's like, I think actually having a layered view of what assets people need and letting AI fill in the low value assets. Right, like the background video, the background music and, you know, sometimes the sound effects. That may be more palatable, and maybe also changes the way that you evaluate the stuff that's coming out.
[00:32:57] swyx (2): Because people tend to, in social media, try to [00:33:00] emphasize foreground stuff, main character stuff. So you really care about consistency, and you really are bothered when, like, for example, Sora botches image generation of a gymnast doing flips, which is horrible. It's horrible. But for background crowds, like, who cares?
[00:33:18] Brian: And by the way, again, I was a film major way, way back in the day, like, that's how it started. Like things like Braveheart, where they filmed 10 people on a field, and then the computer could turn it into 1000 people on a field. Like, that's always been the way: it's around the margins and in the background that it first comes in.
[00:33:36] Brian: The
[00:33:36] Simon: Lord of the Rings movies were over 20 years ago. Although they have those giant battle sequences, which were very early, like, I mean, you could almost call it a generative AI approach, right?
They were using very sophisticated, like, algorithms to model out those different battles and all of that kind of stuff.
[00:33:52] Simon: Yeah, I know very little. I know basically nothing about film production, so I try not to commentate on it. But I am fascinated to [00:34:00] see what happens when these tools start being used by the people at the top of their game.
[00:34:05] swyx (2): I would say like there's a cultural war being fought here more than a technology war.
[00:34:11] swyx (2): Most of the Hollywood people are against any form of AI anyway, so they're busy fighting that battle instead of thinking about how to adopt it, and it's very fringe. I participated here in San Francisco in one generative AI video creative hackathon where the AI positive artists actually met with technologists like myself, and then we collaborated together to build short films, and that was really nice, and I think, you know, I'll be hosting some of those in my events going forward.
[00:34:38] swyx (2): One thing that I think, like, I want to leave people with a sense of: this is a recap of last year, but then sometimes it's useful to walk away as well with, like, what can we expect in the future? I don't know if you got anything. I would also call out that the Chinese models here have made a lot of progress. Hailuo and Kling and God knows who else in the video arena [00:35:00] are also making a lot of progress. Surprisingly, like, I think maybe China is actually ahead with regards to open weights at least, but also just like specific forms of video generation.
[00:35:12] Simon: Wouldn't it be interesting if a film industry sprung up in a country that we don't normally think of having a really strong film industry that was using these tools? Like, that would be a fascinating sort of angle on this. Mm hmm. Mm hmm.
[00:35:25] swyx (2): Agreed. I, I, I... Oh, sorry.
Go ahead.
[00:35:29] Exploring Video Avatar Companies
[00:35:29] swyx (2): Just to put it on people's radar as well, HeyGen, there's like a category of video avatar companies that don't specialize in general video.
[00:35:41] swyx (2): They only do talking heads, let's just say. And HeyGen sings very well.
[00:35:45] Brian: Swyx, you know that that's what I've been using, right? Like, have I, yeah, right. So, if you see some of my recent YouTube videos and things like that, because the beauty part of the HeyGen thing is, I don't want to use the robot voice, so [00:36:00] I record the mp3 file for my computer, and then I put that into HeyGen with the avatar that I've trained it on, and all it does is the lip sync.
[00:36:09] Brian: So it's not 100 percent uncanny valley beatable, but it's good enough that if you weren't looking for it, it's just me sitting there doing one of my clips from the show. And, yeah, so, by the way, HeyGen. Shout out to them.
[00:36:24] AI Influencers and Their Future
[00:36:24] swyx (2): So I would, you know, in terms of like the look ahead, like, reviewing 2024, looking at trends for 2025, I would, they basically call this out.
[00:36:33] swyx (2): Meta tried to introduce AI influencers and failed horribly because they were just bad at it. But at some point there will be more and more basically AI influencers. Not in a way that Simon is, but in a way that they are not human.
[00:36:50] Simon: Like the few of those that have done well, I always feel like they're doing well because it's a gimmick, right?
[00:36:54] Simon: It's novel and fun to, like, that AI Seinfeld thing [00:37:00] from last year, the Twitch stream, you know, like those, if you're the only one or one of just a few doing that, you'll attract an audience because it's an interesting new thing.
But I just, I don't know if that's going to be sustainable longer term or not.

[00:37:11] Simon: Like,

[00:37:12] Simplifying Content Creation with AI

[00:37:12] Brian: I'm going to tell you, because I've had discussions, I can't name the companies or whatever, but, so think about the workflow for this. Like, now we all know that on TikTok and Instagram, like, holding up a phone to your face and doing, like, an in-my-car video, or walking, a walk and talk, you know, that's, that's very common. But also, if you want to do a professional sort of talking head video, you still have to sit in front of a camera, you still have to do the lighting, you still have to do the video editing. Versus, if you can just record what I'm saying right now, the last 30 seconds, if you clip that out as an mp3 and you have a good enough avatar, then you can put that avatar in front of Times Square, on a beach, or whatever.

[00:37:50] Brian: So, like, again for creators, the reason I think, Simon, we're on the verge of something, it, it just, it's not going to, I think it's not, oh, we're going to have [00:38:00] AI avatars take over, it'll be one of those things where it takes another piece of the workflow out and simplifies it.

[00:38:07] Simon: I'm all for that. I, I always love this stuff.

[00:38:08] Simon: I like tools. Tools that help human beings do more. Do more ambitious things. I'm always in favor of, like, that, that, that's what excites me about this entire field.

[00:38:17] swyx (2): Yeah. We're, we're looking into basically creating one for my podcast. We have this guy Charlie, he's Australian.
He's, he's not real, but he pre, he opens every show and we are gonna have him present all the shorts.

[00:38:29] Simon: Yeah, go ahead.

[00:38:30] The Importance of Credibility in AI

[00:38:30] Simon: The thing that I keep coming back to is this idea of credibility. Like, in a world that is full of, like, AI-generated everything and so forth, it becomes even more important that people find the sources of information that they trust, and find people and find sources that are credible. And I feel like that's the one thing that LLMs and AI can never have: credibility, right?

[00:38:49] Simon: ChatGPT can never stake its reputation on telling you something useful and interesting, because that means nothing, right? It's a matrix multiplication. It depends on who prompted it and so forth. So [00:39:00] I'm always, and this is when I'm blogging as well, I'm always looking for, okay, who are the reliable people who will tell me useful, interesting information, who aren't just going to tell me whatever somebody's paying them to tell me, who aren't going to, like, type a one-sentence prompt into an LLM and spit out an essay and stick it online.

[00:39:16] Simon: And that, that to me, like, earning that credibility is really important. That's why a lot of my ethics around the way that I publish are based on the idea that I want people to trust me. I want to do things that, that gain credibility in people's eyes so they will come to me for information as a trustworthy source.

[00:39:32] Simon: And it's the same for the sources that I'm, I'm consulting as well.
So that's something I've, I've been thinking a lot about, that sort of credibility focus on this thing, for a while now.

[00:39:40] swyx (2): Yeah, you can layer or structure credibility, or decompose it. Like, so one thing I would put in front of you, I'm not saying that you should agree with this or accept this at all, is that you can use AI to generate different variations, and then you, as the final sort of last-mile person, pick the last output and [00:40:00] you put your stamp of credibility behind that. Like, everything's human-reviewed instead of human-origin.

[00:40:04] Simon: Yeah, if you publish something, you need to be able to put it on the ground, publishing it.

[00:40:08] Simon: You need to say, I will put my name to this. I will attach my credibility to this thing. And if you're willing to do that, then, then that's great.

[00:40:16] swyx (2): For creators, this is huge because there's a fundamental asymmetry between starting with a blank slate versus choosing from five different variations.

[00:40:23] Brian: Right.

[00:40:24] Brian: And also the key thing that you just said is, like, if everything that I do, if all of the words were generated by an LLM, if the voice is generated by an LLM, if the video is also generated by the LLM, then I haven't done anything, right? But if, if one or two of those, you take a shortcut, but it's still, I'm willing to sign off on it.

[00:40:47] Brian: Like, I feel like that's where I feel like people are coming around to, like, this is maybe acceptable, sort of.

[00:40:53] Simon: This is where I've been pushing the definition. I love the term slop.
Where I've been pushing the definition of slop as AI-generated [00:41:00] content that is both unrequested and unreviewed, and the unreviewed thing is really important. Like, that's the thing that elevates something from slop to not slop: if a human being has reviewed it and said, you know what, this is actually worth other people's time.

[00:41:12] Simon: And again, I'm willing to attach my credibility to it and say, hey, this is worthwhile.

[00:41:16] Brian: It's, it's, it's the curatorial and editorial part of it that, no matter what the tools are to do shortcuts, to do, as, as Swyx is saying, choose between different edits or different cuts, but in the end, if there's a curatorial mind, or editorial mind, behind it.

[00:41:32] Brian: Let me, I want to wedge this in before we start to close.

[00:41:36] The Future of LLM User Interfaces

[00:41:36] Brian: One of the things, coming back to your year-end piece, that has been something that I've been banging the drum about is when you're talking about LLMs getting harder to use. You said most users are thrown in at the deep end.

[00:41:48] Brian: The default LLM chat UI is like taking brand new computer users, dropping them into a Linux terminal and expecting them to figure it all out. I mean, it's, it's literally going back to the command line. The command line was defeated [00:42:00] by the GUI interface. And this is what I've been banging the drum about is, like, this cannot be.

[00:42:05] Brian: The user interface, what we have now, cannot be the end result. Do you see any hints or seeds of a GUI moment for LLM interfaces?

[00:42:17] Simon: I mean, it has to happen. It absolutely has to happen. The, the usability of these things is turning into a bit of a crisis. And we are at least seeing some really interesting innovation in little directions.

[00:42:28] Simon: Just like OpenAI's ChatGPT Canvas thing that they just launched. That is at least
Going a little bit more interesting than just chat, chats and responses. You know, you can, they're exploring that space where you're collaborating with an LLM. You're both working in the, on the same document. That makes a lot of sense to me.

[00:42:44] Simon: Like that, that feels really smart. One of the best things is still, who was it who did the, the UI where you could, they had a drawing UI where you draw an interface and click a button? tldraw would then make it a real thing. That was spectacular, [00:43:00] absolutely spectacular, like, alternative vision of how you'd interact with these models.

[00:43:05] Simon: Because yeah, the, and that's, you know, so I feel like there is so much scope for innovation there and it is beginning to happen. Like, like, I, I feel like most people do understand that we need to do better in terms of interfaces that both help explain what's going on and give people better tools for working with models.

[00:43:23] Brian: I was going to say, I want to dig a little deeper into this, because think of the conceptual idea behind the GUI, which is instead of typing into a command line, open word.exe, it's, you, you click an icon, right? So that's abstracting away sort of the, again, the programming stuff that, like, you know, it's, it's a, a, a child can tap on an iPad and, and make a program open, right?

[00:43:47] Brian: The problem, it seems to me, right now with how we're interacting with LLMs is it's sort of like, you know, a dumb robot where it's like you poke it and it goes over here, but no, I want it, I want to go over here, so you poke it this way and you can't get it exactly [00:44:00] right. Like, what can we abstract away from the, from the current, what's going on, that, that makes it more fine-tuned and easier to get more precise?

[00:44:12] Brian: You see what I'm saying?

[00:44:13] Simon: Yes. And this is the other trend that I've been following from the last year, which I think is super interesting.
It's the, the prompt-driven UI development thing. Basically, this is the pattern where Claude Artifacts was the first thing to do this really well. You type in a prompt and it goes, oh, I should answer that by writing a custom HTML and JavaScript application for you that does a certain thing.

[00:44:35] Simon: And when you think about that, and since then it turns out, this is easy, right? Every decent LLM can produce HTML and JavaScript that does something useful. So we've actually got this alternative way of interacting where they can respond to your prompt with an interactive custom interface that you can work with.

[00:44:54] Simon: People haven't quite wired those back up again. Like, ideally, I'd want the LLM to ask me a [00:45:00] question where it builds me a custom little UI for that question, and then it gets to see how I interacted with that. I don't know why, but that's like just such a small step from where we are right now. But that feels like such an obvious next step.

[00:45:12] Simon: Like an LLM, why should it, why should you just be communicating with, with text when it can build interfaces on the fly that let you select a point on a map or, or move, like, sliders up and down? It's gonna create knobs and dials. I keep saying knobs and dials, right. We can do that. And the LLMs can build, and Claude Artifacts will build you a knobs and dials interface.

[00:45:34] Simon: But at the moment they haven't closed the loop. When you twiddle those knobs, Claude doesn't see what you were doing. They're going to close that loop. I'm, I'm shocked that they haven't done it yet.
So yeah, I think there's so much scope for innovation and there's so much scope for doing interesting stuff with that model where the LLM, anything you can represent in SVG, which is almost everything, can now be part of that ongoing conversation.

[00:45:59] swyx (2): Yeah, [00:46:00] I would say the best executed version of this I've seen so far is Bolt, where you can literally type in, make a Spotify clone, make an Airbnb clone, and it actually just does that for you, zero-shot, with a nice design.

[00:46:14] Simon: There's a benchmark for that now. The LMArena people now have a benchmark that is zero-shot app, app generation, because all of the models can do it.

[00:46:22] Simon: Like it's, it's, I've started figuring out. I'm building my own version of this for my own project, because I think within six months, I think it'll just be an expected feature. Like if you have a web application, why don't you have a thing where, oh, look, the, you can add a custom, like, so for my Datasette data exploration project, I want you to be able to do things like conjure up a dashboard, just via a prompt.

[00:46:43] Simon: You say, oh, I need a pie chart and a bar chart and put them next to each other, and then have a form where submitting the form inserts a row into my database table. And this is all suddenly feasible. It's, it's, it's not even particularly difficult to do, which is great. Utterly bizarre that these things are now easy.

[00:47:00] swyx (2): I think for a general audience, that is what I would highlight, that software creation is becoming easier and easier. Gemini is now available in Gmail and Google Sheets. I don't write my own Google Sheets formulas anymore, I just tell Gemini to do it.
And so I think those are, I almost wanted to basically somewhat disagree with, with your assertion that LLMs got harder to use.

[00:47:22] swyx (2): Like, yes, we, we expose more capabilities, but they're, they're in minor forms, like using Canvas, like web search in, in ChatGPT, and like Gemini being in, in Excel sheets or in Google Sheets, like, yeah, we're getting...

[00:47:37] Simon: No, no, no, no. Those are the things that make it harder, because the problem is that for each of those features, they're amazing.

[00:47:43] Simon: If you understand the edges of the feature, if you're like, okay, so in Google, Gemini, Excel formulas, I can get it to do a certain amount of things, but I can't get it to go and read a web. You probably can't get it to read a webpage, right? But you know, there are, there are things that it can do and things that it can't do, which are completely undocumented.

[00:47:58] Simon: If you ask it what it [00:48:00] can and can't do, they're terrible at answering questions about that. So, like, my favorite example is Claude Artifacts. You can't build a Claude Artifact that can hit an API somewhere else, because the CORS headers on that iframe prevent accessing anything outside of cdnjs. So, good luck learning CORS headers as an end user in order to understand why. Like, I've seen people saying, oh, this is rubbish.

[00:48:26] Simon: I tried building an artifact that would run a prompt and it couldn't because Claude didn't expose an API with CORS headers. All of this stuff is so weird and complicated. And yeah, like that, that, the more that, with the more tools we add, the more expertise you need to really, to understand the full scope of what you can do.

[00:48:44] Simon: And so it's, it's, I wouldn't say it's, it's, it's, it's like, the question really comes down to what does it take to understand the full extent of what's possible?
And honestly, that, that's just getting more and more involved over time.

[00:48:58] Local LLMs: A Growing Interest

[00:48:58] swyx (2): I have one more topic that I, I [00:49:00] think you, you're kind of a champion of and we've touched on it a little bit, which is local LLMs.

[00:49:05] swyx (2): And running AI applications on your desktop, I feel like you are an early adopter of many, many things.

[00:49:12] Simon: I had an interesting experience with that over the past year. Six months ago, I almost completely lost interest. And the reason is that six months ago, the best local models you could run, there was no point in using them at all, because the best hosted models were so much better.

[00:49:26] Simon: Like, there was no point at which I'd choose to run a model on my laptop if I had API access to Claude 3.5 Sonnet. They just, they weren't even comparable. And that changed, basically, in the past three months, as the local models had this step change in capability, where now I can run some of these local models,

[00:49:45] Simon: and they're not as good as Claude 3.5 Sonnet, but they're not so far away that it's not worth me even using them. The other, the, the, the, the continuing problem is I've only got 64 gigabytes of RAM, and if you run, like, Llama 3 70B, it's not going to work. Most of my RAM is gone. So now I have to shut down my Firefox tabs [00:50:00] and, and my Chrome and my VS Code windows in order to run it.

[00:50:03] Simon: But it's got me interested again. Like, like the, the efficiency improvements are such that now, if you were to, like, stick me on a desert island with my laptop, I'd be very productive using those local models. And that's, that's pretty exciting.
And if those trends continue, and also, like, I think my next laptop, if, when I buy one, is going to have twice the amount of RAM, at which point maybe I can run the, almost the top tier, like, open weights models and still be able to use it as a computer as well.

[00:50:32] Simon: NVIDIA just announced their $3,000, 128 gigabyte monstrosity. That's a pretty good price. You know, that's, that's, if you're going to buy it,

[00:50:42] swyx (2): custom OS and all.

[00:50:46] Simon: If I get a job, if I, if, if, if I have enough of an income that I can justify blowing $3,000 on it, then yes.

[00:50:52] swyx (2): Okay, let's do a GoFundMe to get Simon one.

[00:50:54] swyx (2): Come on. You know, you can get a job anytime you want. Is this, this is just purely discretionary.

[00:50:59] Simon: I want, [00:51:00] I want a job that pays me to do exactly what I'm doing already and doesn't tell me what else to do. That's, that's the challenge.

[00:51:06] swyx (2): I think Ethan Mollick does pretty well. Whatever, whatever it is he's doing.

[00:51:11] swyx (2): But yeah, basically I was trying to bring in also, you know, not just local models, but Apple Intelligence is on every Mac machine. You're, you're, you seem skeptical.

[00:51:21] Simon: It's rubbish. Apple Intelligence is so bad. It's like, it does one thing well.

[00:51:25] swyx (2): Oh yeah, what's that? It summarizes notifications. And sometimes it's humorous.

[00:51:29] Brian: Are you sure it does that well? And also, by the way, the other, again, from a sort of a normie point of view. There's no indication from Apple of when to use it.
Like, everybody upgrades their thing and it's like, okay, now you have Apple Intelligence, and you never know when to use it ever again.

[00:51:47] swyx (2): Oh, yeah, you consult the Apple docs, which is MKBHD.

[00:51:51] Simon: The one thing, the one thing I'll say about Apple Intelligence is, one of the reasons it's so disappointing is that the models are just weak, but now, like, Llama 3B [00:52:00] is such a good model in a 2 gigabyte file. I think give Apple six months and hopefully they'll catch up to the state of the art on the small models, and then maybe it'll start being a lot more interesting.

[00:52:10] swyx (2): Yeah. Anyway, like, this was year one. And, you know, just like our first year of iPhone, maybe, maybe not that much of a hit, and then year three they had the App Store. So hey, I would say give it some time. And, you know, I think Chrome is also shipping Gemini Nano, I think this year, in Chrome, which means that every app, every web app will have, for free, access to a local model that just ships in the browser, which is kind of interesting.

[00:52:38] swyx (2): And then I, I think I also wanted to just open the floor for any, like, you know, any of us, what are the apps that, you know, AI applications that we've adopted that have, that we really recommend, because these are all, you know, apps that are running on our browser that, like, or apps that are running locally that we should be, that, that other people should be trying.

[00:52:55] swyx (2): Right? Like, I, I feel like that's, that's always one thing that is helpful at the start of the [00:53:00] year.

[00:53:00] Simon: Okay. So for running local models. My top picks, firstly, on the iPhone, there's this thing called MLC Chat, which works, and it's easy to install, and it runs Llama 3B, and it's so much fun.
Like, it's not necessarily a capable enough model that I use it for real things, but my party trick right now is I get my phone to write a Netflix Christmas movie plot outline where, like, a jeweller falls in love with the King of Sweden or whatever.

[00:53:25] Simon: And it does a good job and it comes up with pun names for the movies. And that's, that's deeply entertaining. On my laptop, most recently, I've been getting heavy into, into Ollama, because the Ollama team are very, very good at finding the good models and patching them up and making them work well. It gives you an API.

[00:53:42] Simon: My little LLM command line tool has a plugin that talks to Ollama, which works really well. So that's my, my, Ollama is, I think, the easiest on-ramp to running models locally. If you want a nice user interface, LM Studio is, I think, the best user interface [00:54:00] thing at that. It's not open source. It's good.

[00:54:02] Simon: It's worth playing with. The other one that I've been trying with recently, there's a thing called, what's it called? Open WebUI or something. Yeah. The UI is fantastic. It, if you've got Ollama running and you fire this thing up, it spots Ollama and it gives you an interface onto your Ollama models. And t


The Best of 2024 with Sarah Guo and Elad Gil

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Dec 26, 2024 · 27:07

2024 has been a year of transformative technological progress, marked by conversations that have reshaped our understanding of AI's evolution and what lies ahead. Throughout the year, Sarah and Elad have had the privilege of speaking with some of the brightest minds in the field. As we look back on the past months, we're excited to share highlights from some of our favorite No Priors podcast episodes. Featured guests include Jensen Huang (Nvidia), Andrej Karpathy (OpenAI, Tesla), Bret Taylor (Sierra), Aditya Ramesh, Tim Brooks, and Bill Peebles (OpenAI's Sora Team), Dmitri Dolgov (Waymo), Dylan Field (Figma), and Alexandr Wang (Scale).

Want to dive deeper? Listen to the full episodes here:

NVIDIA's Jensen Huang on AI Chip Design, Scaling Data Centers, and his 10-Year Bet
No Priors Ep. 89 | With NVIDIA CEO Jensen Huang

The Road to Autonomous Intelligence, With Andrej Karpathy from OpenAI and Tesla
No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla

Transforming Customer Service through Company Agents, with Sierra's Bret Taylor
No Priors Ep. 82 | With CEO of Sierra Bret Taylor

OpenAI's Sora team thinks we've only seen the "GPT-1 of video models"
No Priors Ep. 61 | OpenAI's Sora Leaders Aditya Ramesh, Tim Brooks and Bill Peebles

Waymo's Journey to Full Autonomy: AI Breakthroughs, Safety, and Scaling
No Priors Ep. 87 | With Co-CEO of Waymo Dmitri Dolgov

Designing the Future: Dylan Field on AI, Collaboration, and Independence
No Priors Ep. 55 | With Figma CEO Dylan Field

The Data Foundry for AI with Alexandr Wang from Scale
No Priors Ep. 65 | With Scale AI CEO Alexandr Wang

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil

Timecodes:
0:00 Introduction
0:15 Jensen Huang on building at data-center scale
4:00 Andrej Karpathy on the AI exo-cortex, model control, and a shift to smaller models
7:14 Bret Taylor on the agentic future of business interactions
11:17 OpenAI's Sora team on visual models and their role in AGI
15:53 Waymo's Dmitri Dolgov on bridging the gap to full autonomy and the challenge of 100% accuracy
19:00 Figma's Dylan Field on the future of interfaces and new modalities
23:29 Scale AI's Alexandr Wang on the journey to AGI
26:29 Outro


The AI Strategy That Doubled His Email Conversion Rate (Step By Step)

Marketing Against The Grain

Dec 10, 2024 · 37:45

Ep. 284 What if AI could double your email conversion rates overnight? Kipp and Kieran dive into the revolutionary AI strategies that are transforming the way we approach email marketing, featuring insights from Dan Wolchonok of Reforge. Learn more on how leveraging proprietary data can create hyper-personalized emails for increased engagement, the importance of seamlessly integrating AI solutions into existing workflows, and the innovative use of AI-generated content to make compelling email campaigns.

Mentions
Dan Wolchonok https://www.linkedin.com/in/danielwolchonok/
Reforge https://www.reforge.com/
Grammarly https://www.grammarly.com/
Andrej Karpathy https://karpathy.ai/
Button Down AI Newsletter https://buttondown.com/ainews

Get our guide to build your own Custom GPT: https://clickhubspot.com/customgpt

We're creating our next round of content and want to ensure it tackles the challenges you're facing at work or in your business. To understand your biggest challenges we've put together a survey and we'd love to hear from you! https://bit.ly/matg-research

Resource [Free] Steal our favorite AI Prompts featured on the show! Grab them here: https://clickhubspot.com/aip

We're on Social Media! Follow us for everyday marketing wisdom straight to your feed
YouTube: https://www.youtube.com/channel/UCGtXqPiNV8YC0GMUzY-EUFg
Twitter: https://twitter.com/matgpod
TikTok: https://www.tiktok.com/@matgpod

Join our community https://landing.connect.com/matg

Thank you for tuning into Marketing Against The Grain! Don't forget to hit subscribe and follow us on Apple Podcasts (so you never miss an episode)! https://podcasts.apple.com/us/podcast/marketing-against-the-grain/id1616700934 If you love this show, please leave us a 5-Star Review https://link.chtbl.com/h9_sjBKH and share your favorite episodes with friends. We really appreciate your support.

Host Links:
Kipp Bodnar, https://twitter.com/kippbodnar
Kieran Flanagan, https://twitter.com/searchbrat

'Marketing Against The Grain' is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Produced by Darren Clarke.


How NotebookLM Was Made

Latent Space: The AI Engineer Podcast - CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Oct 25, 2024 · 73:57

If you've listened to the podcast for a while, you might have heard our ElevenLabs-powered AI co-host Charlie a few times. Text-to-speech has made amazing progress in the last 18 months, with OpenAI's Advanced Voice Mode (aka “Her”) as a sneak peek of the future of AI interactions (see our “Building AGI in Real Time” recap). Still, we had yet to see a real killer app for AI voice (not counting music).

Today's guests, Raiza Martin and Usama Bin Shafqat, are the lead PM and AI engineer behind the NotebookLM feature flag that gave us the first viral AI voice experience, the “Deep Dive” podcast.

The idea behind the “Audio Overviews” feature is simple: take a bunch of documents, websites, YouTube videos, etc, and generate a podcast out of them. This was one of the first demos that people built with voice models + RAG + GPT models, but it was always a glorified text-to-speech. Raiza and Usama took a very different approach:

* Make it conversational: when you listen to a NotebookLM audio there are a ton of micro-interjections (Steven Johnson calls them disfluencies) like “Oh really?” or “Totally”, as well as pauses and “uh…”, like you would expect in a real conversation. These are not generated by the LLM in the transcript, but they are built into the audio model. See ~28:00 in the pod for more details.
* Listeners love tension: if two people are always in agreement on everything, it's not super interesting. They tuned the model to generate flowing conversations that mirror the tone and rhythm of human speech. They did not confirm this, but many suspect the 2 year old SoundStorm paper is related to this model.
* Generating new insights: because the hosts' goal is not to summarize, but to entertain, the model comes up with funny metaphors and comparisons that actually help expand on the content rather than just paraphrasing like most models do.
We have had listeners make podcasts out of our podcasts, like this one.

This is different than your average SOTA-chasing, MMLU-driven model buildooor. Putting product and AI engineering in the same room, having them build evals together, and understanding what the goal is lets you get these unique results.

The 5 rules for AI PMs

We always focus on AI Engineers, but this episode had a ton of AI PM nuggets as well, which we wanted to collect as NotebookLM is one of the most successful products in the AI space:

1. Less is more: the first version of the product had 0 customization options. All you could do is give it source documents, and then press a button to generate. Most users don't know what “temperature” or “top-k” are, so you're often taking the magic away by adding more options in the UI. Since recording they added a few, like a system prompt, but those were features that users were “hacking in”, as Simon Willison highlighted in his blog post.
2. Use Real-Time Feedback: they built a community of 65,000 users on Discord that is constantly reporting issues and giving feedback; sometimes they noticed server downtime even before the Google internal monitoring did. Getting real time pings > aggregating user data when doing initial iterations.
3. Embrace Non-Determinism: AI output variability is a feature, not a bug. Rather than limiting the outputs from the get-go, build toggles that you can turn on/off with feature flags as the feedback starts to roll in.
4. Curate with Taste: if you try your product and it sucks, you don't need more data to confirm it. Just scrap that and iterate again. This is even easier for a product like this; if you start listening to one of the podcasts and turn it off after 10 seconds, it's never a good sign.
5. Stay Hands-On: It's hard to build taste if you don't experiment.
Trying out all your competitors' products as well as unrelated tools really helps you understand what users are seeing in market, and how to improve on it.

Chapters

00:00 Introductions
01:39 From Project Tailwind to NotebookLM
09:25 Learning from 65,000 Discord members
12:15 How NotebookLM works
18:00 Working with Steven Johnson
23:00 How to prioritize features
25:13 Structuring the data pipelines
29:50 How to eval
34:34 Steering the podcast outputs
37:51 Defining speakers personalities
39:04 How do you make audio engaging?
45:47 Humor is AGI
51:38 Designing for non-determinism
53:35 API when?
55:05 Multilingual support and dialect considerations
57:50 Managing system prompts and feature requests
01:00:58 Future of NotebookLM
01:04:59 Podcasts for your codebase
01:07:16 Plans for real-time chat
01:08:27 Wrap up

Show Notes

* Notebook LM
* AI Test Kitchen
* Nicholas Carlini
* Steven Johnson
* Wealth of Nations
* Histories of Mysteries by Andrej Karpathy
* chicken.pdf Threads
* Area 120
* Raiza Martin
* Usama Bin Shafqat

Transcript

NotebookLM [00:00:00]: Hey everyone, we're here today as guests on Latent Space. It's great to be here, I'm a long time listener and fan, they've had some great guests on this show before. Yeah, what an honor to have us, the hosts of another podcast, join as guests. I mean a huge thank you to Swyx and Alessio for the invite, thanks for having us on the show. Yeah really, it seems like they brought us here to talk a little bit about our show, our podcast. Yeah, I mean we've had lots of listeners ourselves, listeners at Deep Dive. Oh yeah, we've made a ton of audio overviews since we launched and we're learning a lot. There's probably a lot we can share around what we're building next, huh? Yeah, we'll share a little bit at least. The short version is we'll keep learning and getting better for you. We're glad you're along for the ride. So yeah, keep listening. Keep listening and stay curious. We promise to keep diving deep and bringing you even better options in the future.
Stay curious.
Alessio [00:00:52]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners. And I'm joined by my co-host, Swyx, founder of Smol.ai.
Swyx [00:01:01]: Hey, and today we're back in the studio with our special guest, Raiza Martin. And Usama, I forgot to get your last name, Shafqat.
Usama [00:01:10]: Yes.
Swyx [00:01:10]: Okay, welcome.
Raiza [00:01:12]: Hello, thank you for having us.
Swyx [00:01:14]: So AI podcasters meet human podcasters, always fun. Congrats on the success of NotebookLM. I mean, how does it feel?
Raiza [00:01:22]: It's been a lot of fun. A lot of it, honestly, was unexpected. But my favorite part is really listening to the audio overviews that people have been making.
Swyx [00:01:29]: Maybe we should do a little bit of intros and tell the story. You know, what is your path into the sort of Google AI org? Or maybe, actually, I don't even know what org you guys are in.
Raiza [00:01:39]: I can start. My name is Raiza. I lead the NotebookLM team inside of Google Labs. So specifically, that's the org that we're in. It's called Google Labs. It's only about two years old. And our whole mandate is really to build AI products. That's it. We work super closely with DeepMind. Our entire thing is just, like, try a bunch of things and see what's landing with users. And the background that I have is, really, I worked in payments before this, and I worked in ads right before, and then startups. I tell people, like, at every time that I changed orgs, I actually almost quit Google. Like, specifically, like, in between ads and payments, I was like, all right, I can't do this. Like, this is, like, super hard. I was like, it's not for me. I'm, like, a very zero-to-one person. But then I was like, okay, I'll try. I'll interview with other teams. And when I interviewed in payments, I was like, oh, these people are really cool.
I don't know if I'm, like, a super good fit with this space, but I'll try it because the people are cool. And then I really enjoyed that, and then I worked on, like, zero-to-one features inside of payments, and I had a lot of fun. But then the time came again where I was like, oh, I don't know. It's like, it's time to leave. It's time to start my own thing. But then I interviewed inside of Google Labs, and I was like, oh, darn. Like, there's definitely, like...
Alessio [00:02:48]: They got you again.
Raiza [00:02:49]: They got me again. And so now I've been here for two years, and I'm happy that I stayed because especially with, you know, the recent success of NotebookLM, I'm like, dang, we did it. I actually got to do it. So that was really cool.
Usama [00:03:02]: Kind of similar, honestly. I was at a big team at Google. We do sort of the data center supply chain planning stuff. Google has, like, the largest sort of footprint. Obviously, there's a lot of management stuff to do there. But then there was this thing called Area 120 at Google, which does not exist anymore. But I sort of wanted to do, like, more zero-to-one building and landed a role there. We were trying to build, like, a creator commerce platform called Kaya. It launched briefly a couple years ago. But then Area 120 sort of transitioned and morphed into Labs. And, like, over the last few years, like, the focus just got a lot clearer. Like, we were trying to build new AI products and do it in the wild and sort of co-create and all of that. So, you know, we've just been trying a bunch of different things. And this one really landed, which has felt pretty phenomenal. Really, really landed.
Swyx [00:03:53]: Let's talk about the brief history of NotebookLM. You had a tweet, which is very helpful for doing research. May 2023, during Google I/O, you announced Project Tailwind.
Raiza [00:04:03]: Yeah.
Swyx [00:04:03]: So today is October 2024.
So you joined October 2022?
Raiza [00:04:09]: Actually, I used to lead AI Test Kitchen. And this was actually, I think, not I/O 2023. I/O 2022 is when we launched AI Test Kitchen, or announced it. And I don't know if you remember it.
Swyx [00:04:23]: That's how you, like, had the basic prototype for Gemini.
Raiza [00:04:26]: Yes, yes, exactly. LaMDA.
Swyx [00:04:28]: Gave beta access to people.
Raiza [00:04:29]: Yeah, yeah, yeah. And I remember, I was like, wow, this is crazy. We're going to launch an LLM into the wild. And that was the first project that I was working on at Google. But at the same time, my manager at the time, Josh, he was like, hey, I want you to really think about, like, what real products would we build that are not just demos of the technology? That was in October of 2022. I was sitting next to an engineer that was working on a project called Talk to Small Corpus. His name was Adam. And the idea of Talk to Small Corpus is basically using an LLM to talk to your data. And at the time, I was like, wait, there's some, like, really practical things that you can build here. And just a little bit of background, like, I was an adult learner. Like, I went to college while I was working a full-time job. And the first thing I thought was, like, this would have really helped me with my studying, right? Like, if I could just, like, talk to a textbook, especially, like, when I was tired after work, that would have been huge. We took a lot of, like, the Talk to Small Corpus prototypes, and I showed it to a lot of, like, college students, particularly, like, adult learners. They were like, yes, like, I get it, right? Like, I didn't even have to explain it to them. And we just continued to iterate the prototype from there to the point where we actually got a slot as part of the I/O demo in '23.
Swyx [00:05:42]: And Corpus, was it a textbook? Oh, my gosh.
Raiza [00:05:45]: Yeah. It's funny.
Actually, when he explained the project to me, he was like, Talk to Small Corpus. It was like, talk to a small corpse?
Swyx [00:05:51]: Yeah, nobody says Corpus.
Raiza [00:06:00]: It was like, a small corpse? This is not AI. Yeah, yeah. And it really was just, like, a way for us to describe the amount of data that we thought, like, it could be good for.
Swyx [00:06:02]: Yeah, but even then, you're still, like, doing RAG stuff. Because, you know, the context length back then was probably, like, 2K, 4K.
Raiza [00:06:08]: Yeah, it was basically RAG.
Raiza [00:06:09]: That was essentially what it was.
Raiza [00:06:10]: And I remember, I was like, we were building the prototypes. And at the same time, I think, like, the rest of the world was. Right? We were seeing all of these, like, chat with PDF stuff come up. And I was like, come on, we gotta go. Like, we have to, like, push this out into the world. I think if there was anything, I wish we would have launched sooner because I wanted to learn faster. But I think, like, we netted out pretty well.
Alessio [00:06:30]: Was the initial product just text-to-speech? Or were you also doing kind of, like, synthesizing of the content, refining it? Or were you just helping people read through it?
Raiza [00:06:40]: Before we did the I/O announcement in '23, we'd already done a lot of studies. And one of the first things that I realized was the first thing anybody ever typed was, summarize the thing. Right?
Raiza [00:06:53]: Summarize the document.
Raiza [00:06:54]: And it was, like, half like a test and half just like, oh, I know the content. I want to see how well it does this. So it was part of the first thing that we launched. It was called Project Tailwind back then. It was just Q&A, so you could chat with the doc just through text, and it would automatically generate a summary as well.
I'm not sure if we had it back then.
Raiza [00:07:12]: I think we did.
Raiza [00:07:12]: It would also generate the key topics in your document, and it could support up to, like, 10 documents. So it wasn't just, like, a single doc.
Alessio [00:07:20]: And then the I/O demo went well, I guess. And then what was the discussion from there to where we are today? Is there any, maybe, intermediate step of the product that people missed between this was launch or?
Raiza [00:07:33]: It was interesting because every step of the way, I think we hit, like, some pretty critical milestones. So I think from the initial demo, I think there was so much excitement of, like, wow, what is this thing that Google is launching? And so we capitalized on that. We built the wait list. That's actually when we also launched the Discord server, which has been huge for us because for us in particular, one of the things that I really wanted to do was to be able to launch features and get feedback ASAP. Like, the moment somebody tries it, like, I want to hear what they think right now, and I want to ask follow-up questions. And the Discord has just been so great for that. But then we basically took the feedback from I/O, we continued to refine the product.
Raiza [00:08:12]: So we added more features.
Raiza [00:08:13]: We added sort of, like, the ability to save notes, write notes. We generate follow-up questions. So there's a bunch of stuff in the product that shows, like, a lot of that research. But it was really the rolling out of things. Like, we removed the wait list, so rolled out to all of the United States. We rolled out to over 200 countries and territories. We started supporting more languages, both in the UI and, like, the actual source stuff. We experienced, like, in terms of milestones, there was, like, an explosion of, like, users in Japan. This was super interesting in terms of just, like, unexpected. Like, people would write to us and they would be like, this is amazing.
I have to read all of these rules in English, but I can chat in Japanese. It's like, oh, wow. That's true, right? Like, with LLMs, you kind of get this natural, it translates the content for you. And you can ask in your sort of preferred mode. And I think that's not just, like, a language thing, too. I think there's, like, I do this test with Wealth of Nations all the time because it's, like, a pretty complicated text to read. The Adam Smith classic.
Swyx [00:09:11]: It's, like, 400 pages or something.
Raiza [00:09:12]: Yeah. But I like this test because I'm, like, asking, like, normie, you know, plain speak. And then it summarizes really well for me. It sort of adapts to my tone.
Swyx [00:09:22]: Very capitalist.
Raiza [00:09:25]: Very on brand.
Swyx [00:09:25]: I just checked in on the NotebookLM Discord. 65,000 people. Yeah.
Raiza [00:09:29]: Crazy.
Swyx [00:09:29]: Just, like, for one project within Google. It's not, like, it's not labs. It's just NotebookLM.
Raiza [00:09:35]: Just NotebookLM.
Swyx [00:09:36]: What do you learn from the community?
Raiza [00:09:39]: I think that the Discord is really great for hearing about a couple of things.
Raiza [00:09:43]: One, when things are going wrong. I think, honestly, like, our fastest way that we've been able to find out if, like, the servers are down or there's just an influx of people being, like, it says system unable to answer. Anybody else getting this?
Raiza [00:09:56]: And I'm, like, all right, let's go.
Raiza [00:09:58]: And it actually catches it a lot faster than, like, our own monitoring does.
Raiza [00:10:01]: It's, like, that's been really cool. So, thank you.
Swyx [00:10:03]: Canceled eat a dog.
Raiza [00:10:05]: So, thank you to everybody. Please keep reporting it.
I think the second thing is really the use cases.
Raiza [00:10:10]: I think when we put it out there, I was, like, hey, I have a hunch of how people will use it, but, like, to actually hear about, you know, not just the context of, like, the use of NotebookLM, but, like, what is this person's life like? Why do they care about using this tool?
Raiza [00:10:23]: Especially people who actually have trouble using it, but they keep pushing.
Raiza [00:10:27]: Like, that's just so critical to understand what was so motivating, right?
Raiza [00:10:31]: Like, what was your problem that was, like, so worth solving? So, that's, like, a second thing.
Raiza [00:10:34]: The third thing is also just hearing sort of, like, when we have wins and when we don't have wins because there's actually a lot of functionality where I'm, like, hmm, I don't know if that landed super well or if that was actually super critical.
Raiza [00:10:45]: As part of having this sort of small project, right, I want to be able to unlaunch things, too. So, it's not just about just, like, rolling things out and testing it and being, like, wow, now we have, like, 99 features. Like, hopefully we get to a place where it's, like, there's just a really strong core feature set and the things that aren't as great, we can just unlaunch.
Swyx [00:11:02]: What have you unlaunched? I have to ask.
Raiza [00:11:04]: I'm in the process of unlaunching some stuff, but, for example, we had this idea that you could highlight the text in your source passage and then you could transform it. And nobody was really using it and it was, like, a very complicated piece of our architecture and it's very hard to continue supporting it in the context of new features.
So, we were, like, okay, let's do a 50-50 sunset of this thing and see if anybody complains.
Raiza [00:11:28]: And so far, nobody has.
Swyx [00:11:29]: Is there, like, a feature flagging paradigm inside of your architecture that lets you feature flag these things easily?
Raiza [00:11:36]: Yes, and actually...
Raiza [00:11:37]: What is it called?
Swyx [00:11:38]: Like, I love feature flagging.
Raiza [00:11:40]: You mean, like, in terms of just, like, being able to expose things to users?
Swyx [00:11:42]: Yeah, as a PM. Like, this is your number one tool, right?
Raiza [00:11:44]: Yeah, yeah.
Swyx [00:11:45]: Let's try this out. All right, if it works, roll it out. If it doesn't, roll it back, you know?
Raiza [00:11:49]: Yeah, I mean, we just run Mendel experiments for the most part. And, actually, I don't know if you saw it, but on Twitter, somebody was able to get around our flags and they enabled all the experiments.
Raiza [00:11:58]: They were, like, check out what the NotebookLM team is cooking.
Raiza [00:12:02]: I was, like, oh!
Raiza [00:12:03]: And I was at lunch with the rest of the team and I was, like, I was eating. I was, like, guys, guys, Magic Draft leaked!
Raiza [00:12:10]: They were, like, oh, no!
Raiza [00:12:12]: I was, like, okay, just finish eating and then let's go figure out what to do.
Raiza [00:12:15]: Yeah.
Alessio [00:12:15]: I think a post-mortem would be fun, but I don't think we need to do it on the podcast now. Can we just talk about what's behind the magic? So, I think everybody has questions, hypotheses about what models power it. I know you might not be able to share everything, but can you just get people very basic? How do you take the data and put it in the model? What text model you use? What's the text-to-speech kind of, like, jump between the two? Sure.
Raiza [00:12:42]: Yeah.
Raiza [00:12:42]: I was going to say, Usama, he manually does all the podcasts.
Raiza [00:12:46]: Oh, thank you.
Usama [00:12:46]: Really fast.
You're very fast, yeah.
Raiza [00:12:48]: Both of the voices at once.
Usama [00:12:51]: Voice actor.
Raiza [00:12:52]: Good, good.
Usama [00:12:52]: Yeah, so, for a bit of background, we were building this thing sort of outside NotebookLM to begin with. Like, just the idea is, like, content transformation, right? Like, we can do different modalities. Like, everyone knows that. Everyone's been poking at it. But, like, how do you make it really useful? And, like, one of the ways we thought was, like, okay, like, you maybe, like, you know, people learn better when they're hearing things. But TTS exists, and you can, like, narrate whatever's on screen. But you want to absorb it the same way. So, like, that's where we sort of started out into the realm of, like, maybe we try, like, you know, two people are having a conversation kind of format. We didn't actually start out thinking this would live in Notebook, right? Like, Notebook was sort of, we built this demo out independently, tried out, like, a few different sort of sources. The main idea was, like, go from some sort of sources and transform it into a listenable, engaging audio format. And then through that process, we, like, unlocked a bunch more sort of learnings. Like, for example, in a sense, like, you're not prompting the model as much because, like, the information density is getting unrolled by the model prompting itself, in a sense. Because there's two speakers, and they're both technically, like, AI personas, right? That have different angles of looking at things. And, like, they'll have a discussion about it. And that sort of, we realized that's kind of what was making it riveting, in a sense. Like, you care about what comes next, even if you've read the material already. Because, like, people say they get new insights on their own journals or books or whatever. Like, anything that they've written themselves.
So, yeah, from a modeling perspective, like, it's, like Raiza said earlier, like, we work with the DeepMind audio folks pretty closely. So, they're always cooking up new techniques to, like, get better, more human-like audio. And then Gemini 1.5 is really, really good at absorbing long context. So, we sort of, like, generally put those things together in a way that we could reliably produce the audio.
Raiza [00:14:52]: I would add, like, there's something really nuanced, I think, about sort of the evolution of, like, the utility of text-to-speech. Where, if it's just reading an actual text response, and I've done this several times. I do it all the time with, like, reading my text messages. Or, like, sometimes I'm trying to read, like, a really dense paper, but I'm trying to do actual work. I'll have it, like, read out the screen. There is something really robotic about it that is not engaging. And it's really hard to consume content in that way. And it's never been really effective. Like, particularly for me, where I'm, like, hey, it's actually just, like, it's fine for, like, short stuff. Like, texting, but even that, it's, like, not that great. So, I think the frontier of experimentation here was really thinking about there is a transform that needs to happen in between whatever.
Raiza [00:15:38]: Here's, like, my resume, right?
Raiza [00:15:39]: Or here's, like, a 100-page slide deck or something. There is a transform that needs to happen that is inherently editorial. And I think this is where, like, that two-person persona, right, dialogue model, they have takes on the material that you've presented. That's where it really sort of, like, brings the content to life in a way that's, like, not robotic.
And I think that's, like, where the magic is, is, like, you don't actually know what's going to happen when you press generate.
Raiza [00:16:08]: You know, for better or for worse.
Raiza [00:16:09]: Like, to the extent that, like, people are, like, no, I actually want it to be more predictable now. Like, I want to be able to tell them. But I think that initial, like, wow was because you didn't know, right? When you upload your resume, what's it about to say about you? And I think I've seen enough of these where I'm, like, oh, it gave you good vibes, right? Like, you knew it was going to say, like, something really cool. As we start to shape this product, I think we want to try to preserve as much of that wow as much as we can. Because I do think, like, exposing, like, all the knobs and, like, the dials, like, we've been thinking about this a lot. It's like, hey, is that, like, the actual thing?
Raiza [00:16:43]: Is that the thing that people really want?
Alessio [00:16:45]: Have you found differences in having one model just generate the conversation and then using text-to-speech to kind of fake two people? Or, like, are you actually using two different kind of system prompts to, like, have a conversation step-by-step? I'm always curious, like, if persona system prompts make a big difference? Or, like, you just put in one prompt and then you just let it run?
Usama [00:17:05]: I guess, like, generally we use a lot of inference, as you can tell with, like, the spinning thing takes a while. So, yeah, there's definitely, like, a bunch of different things happening under the hood. We've tried both approaches and they have their, sort of, drawbacks and benefits. I think that that idea of, like, questioning, like, the two different personas, like, persists throughout, like, whatever approach we try. It's like, there's a bit of, like, imperfection in there.
Like, we had to really lean into the fact that, like, to build something that's engaging, like, it needs to be somewhat human and it needs to be just not a chatbot. Like, that was sort of, like, what we need to diverge from. It's like, you know, most chatbots will just narrate the same kind of answer, like, given the same sources, for the most part, which is ridiculous. So, yeah, there's, like, experimentation there under the hood, like, with the model to, like, make sure that it's spitting out, like, different takes and different personas and different, sort of, prompting each other is, like, a good analogy, I guess.
Swyx [00:18:00]: Yeah, I think Steven Johnson, I think he's on your team. I don't know what his role is. He seems like chief dreamer, writer.
Raiza [00:18:08]: Yeah, I mean, I can comment on Steven. So, Steven joined, actually, in the very early days, I think before it was even a fully funded project. And I remember when he joined, I was like, Steven Johnson's going to be on my team? You know, and for folks who don't know him, Steven is a New York Times bestselling author of, like, 14 books. He has a PBS show. He's, like, incredibly smart, just, like, a true, sort of, celebrity by himself. And then he joined Google, and he was like, I want to come here, and I want to build the thing that I've always dreamed of, which is a tool to help me think. I was like, a what? Like, a tool to help you think? I was like, what do you need help with? Like, you seem to be doing great on your own. And, you know, he would describe this to me, and I would watch his flow. And aside from, like, providing a lot of inspiration, to be honest, like, when I watched Steven work, I was like, oh, nobody works like this, right? Like, this is what makes him special. Like, he is such a dedicated, like, researcher and journalist, and he's so thorough, he's so smart. And then I had this realization of, like, maybe Steven is the product.
Maybe the work is to take Steven's expertise and bring it to, like, everyday people that could really benefit from this. Like, just watching him work, I was like, oh, I could definitely use, like, a mini-Steven, like, doing work for me. Like, that would make me a better PM. And then I thought very quickly about, like, the adjacent roles that could use sort of this, like, research and analysis tool. And so, aside from being, you know, chief dreamer, Steven also represents, like, a super workflow that I think all of us, like, if we had access to a tool like it, would just inherently, like, make us better.
Swyx [00:19:46]: Did you make him express his thoughts while he worked, or you just silently watched him, or how does this work?
Raiza [00:19:52]: Oh, now you're making me admit it. But yes, I did just silently watch him.
Swyx [00:19:57]: This is a part of the PM toolkit, right? They give user interviews and all that.
Raiza [00:20:00]: Yeah, I mean, I did interview him, but I noticed, like, if I interviewed him, it was different than if I just watched him. And I did the same thing with students all the time. Like, I followed a lot of students around. I watched them study. I would ask them, like, oh, how do you feel now, right?
Raiza [00:20:15]: Or why did you do that? Like, what made you do that, actually?
Raiza [00:20:18]: Or why are you upset about, like, this particular thing? Why are you cranky about this particular topic? And it was very similar, I think, for Steven, especially because he was describing, he was in the middle of writing a book. And he would describe, like, oh, you know, here's how I research things, and here's how I keep my notes. Oh, and here's how I do it. And it was really, he was doing this sort of, like, self-questioning, right? Like, now we talk about, like, chain of, you know, reasoning or thought, reflection.
Raiza [00:20:44]: And I was like, oh, he's the OG.
Raiza [00:20:46]: Like, I watched him do it in real time.
I was like, that's, like, LLM right there. And to be able to bring sort of that expertise in a way that was, like, you know, maybe, like, costly inference-wise, but really have, like, that ability inside of a tool that was, like, for starters, free inside of NotebookLM, it was good to learn whether or not people really did find use out of it.
Swyx [00:21:05]: So did he just commit to using NotebookLM for everything, or did you just model his existing workflow?
Raiza [00:21:12]: Both, right?
Raiza [00:21:12]: Like, in the beginning, there was no product for him to use. And so he just kept describing the thing that he wanted. And then eventually, like, we started building the thing. And then I would start watching him use it. One of the things that I love about Steven is he uses the product in ways where it kind of does it, but doesn't quite. Like, he's always using it at, like, the absolute max limit of this thing. But the way that he describes it is so full of promise, where he's like, I can see it going here. And all I have to do is sort of, like, meet him there and sort of pressure test whether or not, you know, everyday people want it. And we just have to build it.
Swyx [00:21:47]: I would say OpenAI has a pretty similar person, Andrew Mason, I think his name is. It's very similar, like, just from the writing world and using it as a tool for thought to shape ChatGPT. I don't think that people who use AI tools to their limit are common. I'm looking at my NotebookLM now. I've got two sources. You have a little, like, source limit thing. And my bar is over here, you know, and it stretches across the whole thing. I'm like, did he fill it up?
Raiza [00:22:09]: Yes, and he has, like, a higher limit than others, I think.
He fills it up.
Raiza [00:22:14]: Oh, yeah.
Raiza [00:22:14]: Like, I don't think Steven even has a limit, actually.
Swyx [00:22:17]: And he has Notes, Google Drive stuff, PDFs, MP3, whatever.
Raiza [00:22:22]: Yes, and one of my favorite demos, he just did this recently, is he has actually PDFs of, like, handwritten Marie Curie notes. I see.
Swyx [00:22:29]: So you're doing image recognition as well. Yeah, it does support it today.
Raiza [00:22:32]: So if you have a PDF that's purely images, it will recognize it.
Raiza [00:22:36]: But his demo is just, like, super powerful.
Raiza [00:22:37]: He's like, okay, here's Marie Curie's notes. And it's like, here's how I'm using it to analyze it. And I'm using it for, like, this thing that I'm writing.
Raiza [00:22:44]: And that's really compelling.
Raiza [00:22:45]: It's like the everyday person doesn't think of these applications. And I think even, like, when I listen to Steven's demo, I see the gap. I see how Steven got there, but I don't see how I could without him. And so there's a lot of work still for us to build of, like, hey, how do I bring that magic down to, like, zero work? Because I look at all the steps that he had to take in order to do it, and I'm like, okay, that's product work for us, right? Like, that's just onboarding.
Alessio [00:23:09]: And so from an engineering perspective, people come to you and it's like, hey, I need to use these handwritten notes from Marie Curie from hundreds of years ago. How do you think about adding support for, like, data sources and then maybe any fun stories and, like, supporting more esoteric types of inputs?
Raiza [00:23:25]: So I think about the product in three ways, right? So there's the sources, the source input. There's, like, the capabilities of, like, what you could do with those sources. And then there's the third space, which is how do you output it into the world? Like, how do you put it back out there?
There's a lot of really basic sources that we don't support still, right? I think there's sort of, like, the handwritten notes stuff is one, but even basic things like DocX or, like, PowerPoint, right? Like, these are the things that people, everyday people are like, hey, my professor actually gave me everything in DocX. Can you support that? And then just, like, basic stuff, like images and PDFs combined with text. Like, there's just a really long roadmap for sources that I think we just have to work on.
Raiza [00:24:04]: So that's, like, a big piece of it.
Raiza [00:24:05]: On the output side, and I think this is, like, one of the most interesting things that we learned really early on, is, sure, there's, like, the Q&A analysis stuff, which is like, hey, when did this thing launch? Okay, you found it in the slide deck. Here's the answer. But most of the time, the reason why people ask those questions is because they're trying to make something new. And so when, actually, when some of those early features leaked, like, a lot of the features we're experimenting with are the output types. And so you can imagine that people care a lot about the resources that they're putting into NotebookLM because they're trying to create something new. So I think equally as important as, like, the source inputs are the outputs that we're helping people to create. And really, like, you know, shortly on the roadmap, we're thinking about how do we help people use NotebookLM to distribute knowledge? And that's, like, one of the most compelling use cases is, like, shared notebooks. It's, like, a way to share knowledge. How do we help people take sources and, like, one-click new documents out of it, right? And I think that's something that people think is, like, oh, yeah, of course, right? Like, one push a document. But what does it mean to do it right?
Like, to do it in your style, in your brand, right?
Raiza [00:25:08]: To follow your guidelines, stuff like that.
Raiza [00:25:09]: So I think there's a lot of work, like, on both sides of that equation.
Raiza [00:25:13]: Interesting.
Swyx [00:25:13]: Any comments on the engineering side of things?
Usama [00:25:16]: So, yeah, like I said, I was mostly working on building the text to audio, which kind of lives as a separate engineering pipeline, almost, that we then put into NotebookLM. But I think there's probably tons of NotebookLM engineering war stories on dealing with sources. And so I don't work too closely with engineers directly. But I think a lot of it does come down to, like, Gemini's native understanding of images really well with the latest generation.
Raiza [00:25:39]: Yeah, I think on the engineering and modeling side, I think we are a really good example of a team that's put a product out there, and we're getting a lot of feedback from the users, and we return the data to the modeling team, right? To the extent that we say, hey, actually, you know what people are uploading, but we can't really support super well?
Raiza [00:25:56]: Text plus image, right?
Raiza [00:25:57]: Especially to the extent that, like, NotebookLM can handle up to 50 sources, 500,000 words each. Like, you're not going to be able to jam all of that into, like, the context window. So how do we do multimodal embeddings with that? There's really, like, a lot of things that we have to solve that are almost there, but not quite there yet.
Alessio [00:26:16]: On then turning it into audio, I think one of the best things is it has so many of the human... Does that happen in the text generation that then becomes audio? Or is that a part of, like, the audio model that transforms the text?
Usama [00:26:27]: It's a bit of both, I would say.
The audio model is definitely trying to mimic, like, certain human intonations and, like, sort of natural, like, breathing and pauses and, like, laughter and things like that. But yeah, in generating, like, the text, we also have to sort of give signals on, like, where those things maybe would make sense.
Alessio [00:26:45]: And on the input side, instead of having a transcript versus having the audio, like, can you take some of the emotions out of it, too? If I'm giving, like, for example, when we did the recaps of our podcast, we can either give audio of the pod or we can give a diarized transcription of it. But, like, the transcription doesn't have some of the, you know, voice kind of, like, things.
Raiza [00:27:05]: Yeah, yeah.
Alessio [00:27:05]: Do you reconstruct that when people upload audio or how does that work?
Raiza [00:27:09]: So when you upload audio today, we just transcribe it. So it is quite lossy in the sense that, like, we don't transcribe, like, the emotion from that as a source. But when you do upload a text file and it has a lot of, like, that annotation, I think that there is some ability for it to be reused in, like, the audio output, right? But I think it will still contextualize it in the deep dive format.
So I think that's something that's, like, particularly important is, like, hey, today we only have one format.Raiza [00:27:37]: It's deep dive.Raiza [00:27:38]: It's meant to be a pretty general overview and it is pretty peppy.Raiza [00:27:42]: It's just very upbeat.Raiza [00:27:43]: It's very enthusiastic, yeah.Raiza [00:27:45]: Yeah, yeah.Raiza [00:27:45]: Even if you had, like, a sad topic, I think they would find a way to be, like, silver lining, though.Raiza [00:27:50]: Really?Raiza [00:27:51]: Yeah.Raiza [00:27:51]: We're having a good chat.Raiza [00:27:54]: Yeah, that's awesome.Swyx [00:27:54]: One of the ways, many, many, many ways that deep dive went viral is people saying, like, if you want to feel good about yourself, just drop in your LinkedIn. Any other, like, favorite use cases that you saw from people discovering things in social media?Raiza [00:28:08]: I mean, there's so many funny ones and I love the funny ones.Raiza [00:28:11]: I think because I'm always relieved when I watch them. I'm like, haha, that was funny and not scary. It's great.Raiza [00:28:17]: There was another one that was interesting, which was a startup founder putting their landing page and being like, all right, let's test whether or not, like, the value prop is coming through. And I was like, wow, that's right.Raiza [00:28:26]: That's smart.Usama [00:28:27]: Yeah.Raiza [00:28:28]: And then I saw a couple of other people following up on that, too.Raiza [00:28:32]: Yeah.Swyx [00:28:32]: I put my about page in there and, like, yeah, if there are things that I'm not comfortable with, I should remove it. You know, so that it can pick it up. Right.Usama [00:28:39]: I think that the personal hype machine was, like, a pretty viral one. I think, like, people uploaded their dreams and, like, some people, like, keep sort of dream journals and it, like, would sort of comment on those and, like, it was therapeutic. I didn't see those.Raiza [00:28:54]: Those are good. 
I hear from Googlers all the time, especially because we launched it internally first. And I think we launched it during the, you know, the Q3 sort of, like, check-in cycle. So all Googlers have to write notes about, like, hey, you know, what'd you do in Q3? And what Googlers were doing is they would write, you know, whatever they accomplished in Q3 and then they would create an audio overview. And these people they didn't know would just ping me and be like, wow, I feel really good, like, going into a meeting with my manager.Raiza [00:29:25]: And I was like, good, good, good, good. You really did that, right?Usama [00:29:29]: I think another cool one is just, like, any Wikipedia article. Yeah. Like, you drop it in and it's just, like, suddenly, like, the best sort of summary overview.Raiza [00:29:38]: I think that's what Karpathy did, right? Like, he has now a Spotify channel called Histories of Mysteries, which is basically, like, he just took, like, interesting stuff from Wikipedia and made audio overviews out of it.Swyx [00:29:50]: Yeah, he became a podcaster overnight.Raiza [00:29:52]: Yeah.Raiza [00:29:53]: I'm here for it. I fully support him.Raiza [00:29:55]: I'm racking up the listens for him.Swyx [00:29:58]: Honestly, it's useful even without the audio. You know, I feel like the audio does add an element to it, but I always want, you know, paired audio and text. And it's just amazing to see what people are organically discovering. I feel like it's because you laid the groundwork with NotebookLM and then you came in and added the sort of TTS portion and made it so good, so human, which is weird. Like, it's this engineering process of humans. Oh, one thing I wanted to ask. Do you have evals?Raiza [00:30:23]: Yeah.Swyx [00:30:23]: Yes.Raiza [00:30:24]: What? Potatoes for chefs.Swyx [00:30:27]: What is that? What do you mean, potatoes?Raiza [00:30:29]: Oh, sorry.Raiza [00:30:29]: Sorry. We were joking with this, like, a couple of weeks ago. 
We were doing, like, side-by-sides. But, like, Raiza sent me the file and it was literally called Potatoes for Chefs. And I was like, you know, my job is really serious, but you have to laugh a little bit. Like, the title of the file is, like, Potatoes for Chefs.Swyx [00:30:47]: Is it like a training document for chefs?Usama [00:30:50]: It's just a side-by-side for, like, two different kind of audio transcripts.Swyx [00:30:54]: The question is really, like, as you iterate, the typical engineering advice is you establish some kind of test or benchmark. You're at, like, 30 percent. You want to get it up to 90, right?Raiza [00:31:05]: Yeah.Swyx [00:31:05]: What does that look like for making something sound human and interesting and voice?Usama [00:31:11]: We have the sort of formal eval process as well. But I think, like, for this particular project, we maybe took a slightly different route to begin with. Like, there was a lot of just within the team listening sessions. A lot of, like, sort of, like... Dogfooding.Raiza [00:31:23]: Yeah.Usama [00:31:23]: Like, I think the bar that we tried to get to before even starting formal evals with raters and everything was much higher than I think other projects would. Like, because that's, as you said, like, the traditional advice, right? Like, get that ASAP. Like, what are you looking to improve on? Whatever benchmark it is. So there was a lot of just, like, critical listening. And I think a lot of making sure that those improvements actually could go into the model. And, like, we're happy with that human element of it. And then eventually we had to obviously distill those down into an eval set. 
But, like, still there's, like, the team is just, like, a very, very, like, avid user of the product at all stages.Raiza [00:32:02]: I think you just have to be really opinionated.Raiza [00:32:05]: I think that sometimes, if you are, your intuition is just sharper and you can move a lot faster on the product.Raiza [00:32:12]: Because it's like, if you hold that bar high, right?Raiza [00:32:15]: Like, if you think about, like, the iterative cycle, it's like, hey, we could take, like, six months to ship this thing. To get it to, like, mid where we were. Or we could just, like, listen to this and be like, yeah, that's not it, right? And I don't need a rater to tell me that. That's my preference, right? And collectively, like, if I have two other people listen to it, they'll probably agree. And it's just kind of this step of, like, just keep improving it to the point where you're like, okay, now I think this is really impressive. And then, like, do evals, right? And then validate that.Swyx [00:32:43]: Was the sound model done and frozen before you started doing all this? Or are you also saying, hey, we need to improve the sound model as well? Both.Usama [00:32:51]: Yeah, we were making improvements on the audio and just, like, generating the transcript as well. I think another weird thing here was, like, we needed to be entertaining. And that's much harder to quantify than some of the other benchmarks that you can make for, like, you know, SWE-bench or get better at this math.Swyx [00:33:10]: Do you just have people rate one to five or, you know, or just thumbs up and down?Usama [00:33:14]: For the formal rater evals, we have sort of like a Likert scale and, like, a bunch of different dimensions there. But we had to sort of break down what makes it entertaining into, like, a bunch of different factors. But I think the team stage of that was more critical. It was like, we need to make sure that, like, what is making it fun and engaging? 
Like, we dialed that as far as it goes. And while we're making other changes that are necessary, like, obviously, they shouldn't make stuff up or, you know, be insensitive.Raiza [00:33:41]: Hallucinations. Safety.Swyx [00:33:42]: Other safety things.Raiza [00:33:43]: Right.Swyx [00:33:43]: Like a bunch of safety stuff.Raiza [00:33:45]: Yeah, exactly.Usama [00:33:45]: So, like, with all of that and, like, also just, you know, following sort of a coherent narrative and structure is really important. But, like, with all of this, we really had to make sure that that central tenet of being entertaining and engaging and something you actually want to listen to. It just doesn't go away, which takes, like, a lot of just active listening time because you're closest to the prompts, the model and everything.Swyx [00:34:07]: I think sometimes the difficulty is because we're dealing with non-deterministic models, sometimes you just got a bad roll of the dice and it's always on the distribution that you could get something bad. Basically, how many do you, like, do ten runs at a time? And then how do you get rid of the non-determinism?Raiza [00:34:23]: Right.Usama [00:34:23]: Yeah, that's bad luck.Raiza [00:34:25]: Yeah.Swyx [00:34:25]: Yeah.Usama [00:34:26]: I mean, there still will be, like, bad audio overviews. There's, like, a bunch of them that happens. Do you mean for, like, the rater? For raters, right?Swyx [00:34:34]: Like, what if that one person just got, like, a really bad rating? You actually had a great prompt, you actually had a great model, great weights, whatever. And you just, you had a bad output.Usama [00:34:42]: Like, and that's okay, right?Raiza [00:34:44]: I actually think, like, the way that these are constructed, if you think about, like, the different types of controls that the user has, right? 
Like, what can the user do today to affect it?Usama [00:34:54]: We push a button.Raiza [00:34:55]: You just push a button.Swyx [00:34:56]: I have tried to prompt engineer by changing the title. Yeah, yeah, yeah.Raiza [00:34:59]: Changing the title, people have found out.Raiza [00:35:02]: Yeah.Raiza [00:35:02]: The title of the notebook, people have found out. You can add show notes, right? You can get them to think, like, the show has changed. Someone changed the language of the output. Changing the language of the output. Like, those are less well-tested because we focused on, like, this one aspect. So it did change the way that we sort of think about quality as well, right? So it's like, quality is on the dimensions of entertainment, of course, like, consistency, groundedness. But in general, does it follow the structure of the deep dive? And I think when we talk about, like, non-determinism, it's like, well, as long as it follows, like, the structure of the deep dive, right? It sort of inherently meets all those other qualities. And so it makes it a little bit easier for us to ship something with confidence to the extent that it's like, I know it's going to make a deep dive. It's going to make a good deep dive. Whether or not the person likes it, I don't know. But as we expand to new formats, as we open up controls, I think that's where it gets really much harder. Even with the show notes, right? Like, people don't know what they're going to get when they do that. And we see that already where it's like, this is going to be a lot harder to validate in terms of quality, where now we'll get a greater distribution. Whereas I don't think we really got, like, varied distribution because of, like, that pre-process that Raiza was talking about. And also because of the way that we'd constrain, like, what were we measuring for? Literally, just like, is it a deep dive?Swyx [00:36:18]: And you determine what a deep dive is. Yeah. Everything needs a PM. 
Yeah, I have, this is very similar to something I've been thinking about for AI products in general. There's always like a chief tastemaker. And for NotebookLM, it seems like it's a combination of you and Steven.Raiza [00:36:31]: Well, okay.Raiza [00:36:32]: I want to take a step back.Swyx [00:36:33]: And Raiza, I mean, presumably for the voice stuff.Raiza [00:36:35]: Raiza's like the head chef, right? Of, like, deep dive, I think. Potatoes.Raiza [00:36:40]: Of potatoes.Raiza [00:36:41]: And I say this because I think even though we are already a very opinionated team, and Steven, for sure, very opinionated, I think of the audio generations, like, Raiza was the most opinionated, right? And we all, like, would say, like, hey, I remember, like, one of the first ones he sent me.Raiza [00:36:57]: I was like, oh, I feel like they should introduce themselves. I feel like they should say a title. But then, like, we would catch things, like, maybe they shouldn't say their names.Raiza [00:37:04]: Yeah, they don't say their names.Usama [00:37:05]: That was a Steven catch, like, not give them names.Raiza [00:37:08]: So stuff like that is, like, we all injected, like, a little bit of just, like, hey, here's, like, my take on, like, how a podcast should be, right? And I think, like, if you're a person who, like, regularly listens to podcasts, there's probably some collective preference there that's generic enough that you can standardize into, like, the deep dive format. But, yeah, it's the new formats where I think, like, oh, that's the next test. Yeah.Swyx [00:37:30]: I've tried to make a clone, by the way. Of course, everyone did. Yeah. Everyone in AI was like, oh, no, this is so easy. I'll just take a TTS model. Obviously, our models are not as good as yours, but I tried to inject a consistent character backstory, like, age, identity, where they work, where they went to school, what their hobbies are. 
Then it just, the models try to bring it in too much.Raiza [00:37:49]: Yeah.Swyx [00:37:49]: I don't know if you tried this.Raiza [00:37:51]: Yeah.Swyx [00:37:51]: So then I'm like, okay, like, how do I define a personality? But it doesn't keep coming up every single time. Yeah.Raiza [00:37:58]: I mean, we have, like, a really, really good, like, character designer on our team.Raiza [00:38:02]: What?Swyx [00:38:03]: Like a D&D person?Raiza [00:38:05]: Just to say, like, we, just like we had to be opinionated about the format, we had to be opinionated about who are those two people talking.Raiza [00:38:11]: Okay.Raiza [00:38:12]: Right.Raiza [00:38:12]: And then to the extent that, like, you can design the format, you should be able to design the people as well.Raiza [00:38:18]: Yeah.Swyx [00:38:18]: I would love, like, a, you know, like when you play Baldur's Gate, like, you roll, you roll like 17 on Charisma and like, it's like what race they are. I don't know.Raiza [00:38:27]: I recently, actually, I was just talking about character select screens.Raiza [00:38:30]: Yeah. I was like, I love that, right.Raiza [00:38:32]: And I was like, maybe there's something to be learned there because, like, people have fallen in love with the deep dive as a, as a format, as a technology, but also as just like those two personas.Raiza [00:38:44]: Now, when you hear a deep dive and you've heard them, you're like, I know those two.Raiza [00:38:48]: Right.Raiza [00:38:48]: And people, it's so funny when I, when people are trying to find out their names, like, it's a, it's a worthy task.Raiza [00:38:54]: It's a worthy goal.Raiza [00:38:55]: I know what you're doing. But the next step here is to sort of introduce, like, is this like what people want?Raiza [00:39:00]: People want to sort of edit the personas or do they just want more of them?Swyx [00:39:04]: I'm sure you're getting a lot of opinions and they all, they all conflict with each other. 
Before we move on, I have to ask, because we're kind of on this topic. How do you make audio engaging? Because it's useful, not just for deep dive, but also for us as podcasters. What is, what does engaging mean? If you could break it down for us, that'd be great.Usama [00:39:22]: I mean, I can try. Like, don't, don't claim to be an expert at all.Swyx [00:39:26]: So I'll give you some, like variation in tone and speed. You know, there's this sort of writing advice where, you know, this sentence is five words. This sentence is three, that kind of advice where you, where you vary things, you have excitement, you have laughter, all that stuff. But I'd be curious how else you break down.Usama [00:39:42]: So there's the basics, like obviously structure that can't be meandering, right? Like there needs to be sort of a, an ultimate goal that the voices are trying to get to, human or artificial. I think one thing we find often is if there's just too much agreement between people, like that's not fun to listen to. So there needs to be some sort of tension and build up, you know, withholding information. For example, like as you listen to a story unfold, like you're going to learn more and more about it. And in audio, that maybe becomes even more important because like you actually don't have the ability to just like skim to the end of something. You're driving or something like you're going to be hooked because like there's, and that's how like, that's how a lot of podcasts work. Like maybe not interviews necessarily, but a lot of true crime, a lot of entertainment in general. There's just like a gradual unrolling of information. And that also like sort of goes back to the content transformation aspect of it. Like maybe you are going from, let's say the Wikipedia article of like one of the Histories of Mysteries, maybe episodes. Like the Wikipedia article is going to state out the information very differently. 
It's like, here's what happened would probably be in the very first paragraph. And one approach we could have done is like maybe a person's just narrating that thing. And maybe that would work for like a certain audience. Or I guess that's how I would picture like a standard history lesson to unfold. But like, because we're trying to put it in this two-person dialogue format, like there, we inject like the fact that, you know, there's, you don't give everything at first. And then you set up like differing opinions of the same topic or the same, like maybe you seize on a topic and go deeper into it and then try to bring yourself back out of it and go back to the main narrative. So that's, that's mostly from like the setting up the script perspective. And then the audio, I was saying earlier, it's trying to be as close to just human speech as possible. I think was the, what we found success with so far.Raiza [00:41:40]: Yeah. Like with interjections, right?Raiza [00:41:41]: Like I think like when you listen to two people talk, there's a lot of like, yeah, yeah, right. And then there's like a lot of like that questioning, like, oh yeah, really?Raiza [00:41:49]: What did you think?Swyx [00:41:50]: I noticed that. That's great.Raiza [00:41:52]: Totally.Usama [00:41:54]: Exactly.Swyx [00:41:55]: My question is, do you pull in speech experts to do this? Or did you just come up with it yourselves? You can be like, okay, talk to a whole bunch of fiction writers to, to make things engaging or comedy writers or whatever, stand up comedy, right? They have to make audio engaging, but audio as well. Like there's professional fields of studying where people do this for a living, but us as AI engineers are just making this up as we go.Raiza [00:42:19]: I mean, it's a great idea, but you definitely didn't.Raiza [00:42:22]: Yeah.Swyx [00:42:24]: My guess is you didn't.Raiza [00:42:25]: Yeah.Swyx [00:42:26]: There's a, there's a certain field of authority that people have. 
They're like, oh, like you can't do this because you don't have any experience like making engaging audio. But that's what you literally did.Raiza [00:42:35]: Right.Usama [00:42:35]: I mean, I was literally chatting with someone at Google earlier today about how some people think that like you need a linguistics person in the room for like making a good chatbot. But that's not actually true because like this person went to school for linguistics. And according to him, he's an engineer now. According to him, like most of his classmates were not actually good at language. Like they knew how to analyze language and like sort of the mathematical patterns and rhythms and language. But that doesn't necessarily mean they were going to be eloquent at like while speaking or writing. So I think, yeah, a lot of we haven't invested in specialists in audio format yet, but maybe that would.Raiza [00:43:13]: I think it's like super interesting because I think there is like a very human question of like what makes something interesting. And there's like a very deep question of like what is it, right? Like what is the quality that we are all looking for? Is it does somebody have to be funny? Does something have to be entertaining? Does something have to be straight to the point? And I think when you try to distill that, this is the interesting thing I think about our experiment, about this particular launch is first, we only launched one format. And so we sort of had to squeeze everything we believed about what an interesting thing is into one package. And as a result of it, I think we learned it's like, hey, interacting with a chatbot is sort of novel at first, but it's not interesting, right? It's like humans are what makes interacting with chatbots interesting.Raiza [00:43:59]: It's like, ha ha ha, I'm going to try to trick it. It's like, that's interesting.Raiza [00:44:02]: Spell strawberry, right?Raiza [00:44:04]: This is like the fun that like people have with it. 
But like that's not the LLM being interesting.Raiza [00:44:08]: That's you just like kind of giving it your own flavor. But it's like, what does it mean to sort of flip it on its head and say, no, you be interesting now, right? Like you give the chatbot the opportunity to do it. And this is not a chatbot per se. It is like just the audio. And it's like the texture, I think, that really brings it to life. And it's like the things that we've described here, which is like, okay, now I have to like lead you down a path of information about like this commercialization deck.Raiza [00:44:36]: It's like, how do you do that?Raiza [00:44:38]: To be able to successfully do it, I do think that you need experts. I think we'll engage with experts like down the road, but I think it will have to be in the context of, well, what's the next thing we're building, right? It's like, what am I trying to change here? What do I fundamentally believe needs to be improved? And I think there's still like a lot more studying that we have to do in terms of like, well, what are people actually using this for? And we're just in such early days. Like it hasn't even been a month. Two, three weeks.Usama [00:45:05]: Three weeks.Raiza [00:45:06]: Yeah, yeah.Usama [00:45:07]: I think one other element to that is the fact that you're bringing your own sources to it. Like it's your stuff. Like, you know this somewhat well, or you care to know about this. So like that, I think, changed the equation on its head as well. It's like your sources and someone's telling you about it. So like you care about how that dynamic is, but you just care for it to be good enough to be entertaining. Because ultimately they're talking about your mortgage deed or whatever.Swyx [00:45:33]: So it's interesting just from the topic itself. Even taking out all the agreements and the hiding of the slow reveal. I mean, there's a baseline, maybe.Usama [00:45:42]: Like if it was like too drab. 
Like if someone was reading it off, like, you know, that's like the absolute worst.Raiza [00:45:46]: But like...Swyx [00:45:47]: Do you prompt for humor? That's a tough one, right?Raiza [00:45:51]: I think it's more of a generic way to bring humor out if possible. I think humor is actually one of the hardest things. Yeah.Raiza [00:46:00]: But I don't know if you saw...Raiza [00:46:00]: That is AGI.Swyx [00:46:01]: Humor is AGI.Raiza [00:46:02]: Yeah, but did you see the chicken one?Raiza [00:46:03]: No.Raiza [00:46:04]: Okay. If you haven't heard it... We'll splice it in here.Swyx [00:46:06]: Okay.Raiza [00:46:07]: Yeah.Raiza [00:46:07]: There is a video on Threads. I think it was by Martino Wong. And it's a PDF.Raiza [00:46:16]: Welcome to your deep dive for today. Oh, yeah. Get ready for a fun one. Buckle up. Because we are diving into... Chicken, chicken, chicken. Chicken, chicken. You got that right. By Doug Zongker. Now. And yes, you heard that title correctly. Titles. Our listener today submitted this paper. Yeah, they're going to need our help. And I can totally see why. Absolutely. It's dense. It's baffling. It's a lot. And it's packed with more chicken than a KFC buffet. What? That's hilarious.Raiza [00:46:48]: That's so funny. So it's like stuff like that, that's like truly delightful, truly surprising.Raiza [00:46:53]: But it's like we didn't tell it to be funny.Usama [00:46:55]: Humor is contextual also. Like super contextual is what we're realizing. So we're not prompting for humor, but we're prompting for maybe a lot of other things that are bringing out that humor.Alessio [00:47:04]: I think the thing about AI-generated content, if we look at YouTube, like we do videos on YouTube and it's like, you know, a lot of people like screaming in the thumbnails to get clicks. There's like everybody, there's kind of like a meta of like what you need to do to get clicks. But I think in your product, there's no actual creator on the other side investing the time. 
So you can actually generate a type of content that is maybe not universally appealing, you know, at a much, yeah, exactly. I think that's the most interesting thing. It's like, well, is there a way for like, take Mr. Beast, right? It's like Mr. Beast optimizes videos to reach the biggest audience and like the most clicks. But what if every video could be kind of like regenerated to be closer to your taste, you know, when you watch it?Raiza [00:47:48]: I think that's kind of the promise of AI that I think we are just like touching on, which is, I think every time I've gotten information from somebody, they have delivered it to me in their preferred method, right?Raiza [00:47:59]: Like if somebody gives me a PDF, it's a PDF.Raiza [00:48:01]: Somebody gives me a hundred slide deck, that is the format in which I'm going to read it. But I think we are now living in the era where transformations are really possible, which is, look, like I don't want to read your hundred slide deck, but I'll listen to a 16 minute audio overview on the drive home. And that, that I think is, is really novel. And that is, is paving the way in a way that like maybe we wanted, but didn't expect.Raiza [00:48:25]: Where I also think you're listening to a lot of content that normally wouldn't have had content made about it. Like I watched this TikTok where this woman uploaded her diary from 2004.Raiza [00:48:36]: For sure, right?Raiza [00:48:36]: Like nobody was goin


Jim Fan on Nvidia's Embodied AI Lab and Jensen Huang's Prediction that All Robots will be Autonomous

Training Data

Play Episode Listen Later Sep 17, 2024 49:13

AI researcher Jim Fan has had a charmed career. He was OpenAI's first intern before he did his PhD at Stanford with “godmother of AI,” Fei-Fei Li. He graduated into a research scientist position at Nvidia and now leads its Embodied AI “GEAR” group. The lab's current work spans foundation models for humanoid robots to agents for virtual worlds. Jim describes a three-pronged data strategy for robotics, combining internet-scale data, simulation data and real world robot data. He believes that in the next few years it will be possible to create a “foundation agent” that can generalize across skills, embodiments and realities—both physical and virtual. He also supports Jensen Huang's idea that “Everything that moves will eventually be autonomous.” Hosted by: Stephanie Zhan and Sonya Huang, Sequoia Capital Mentioned in this episode: World of Bits: Early OpenAI project Jim worked on as an intern with Andrej Karpathy. Part of a bigger initiative called Universe Fei-Fei Li: Jim's PhD advisor at Stanford who founded the ImageNet project in 2010 that revolutionized the field of visual recognition, led the Stanford Vision Lab and just launched her own AI startup, World Labs Project GR00T: Nvidia's “moonshot effort” at a robotic foundation model, premiered at this year's GTC Thinking Fast and Slow: Influential book by Daniel Kahneman that popularized some of his teaching from behavioral economics Jetson Orin chip: The dedicated series of edge computing chips Nvidia is developing to power Project GR00T Eureka: Project by Jim's team that trained a five finger robot hand to do pen spinning MineDojo: A project Jim did when he first got to Nvidia that developed a platform for general purpose agents in the game of Minecraft. 
Won NeurIPS 2022 Outstanding Paper Award ADI: artificial dog intelligence Mamba: Selective State Space Models, an alternative architecture to Transformers that Jim is interested in 00:00 Introduction 01:35 Jim's journey to embodied intelligence 04:53 The GEAR Group 07:32 Three kinds of data for robotics 10:32 A GPT-3 moment for robotics 16:05 Choosing the humanoid robot form factor 19:37 Specialized generalists 21:59 GR00T gets its own chip 23:35 Eureka and Isaac Sim 25:23 Why now for robotics? 28:53 Exploring virtual worlds 36:28 Implications for games 39:13 Is the virtual world in service of the physical world? 42:10 Alternative architectures to Transformers 44:15 Lightning round

The Road to Autonomous Intelligence with Andrej Karpathy

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Play Episode Listen Later Sep 5, 2024 44:16

Andrej Karpathy joins Sarah and Elad in this week of No Priors. Andrej, who was a founding team member of OpenAI and former Senior Director of AI at Tesla, needs no introduction. In this episode, Andrej discusses the evolution of self-driving cars, comparing Tesla and Waymo's approaches, and the technical challenges ahead. They also cover Tesla's Optimus humanoid robot, the bottlenecks of AI development today, and how AI capabilities could be further integrated with human cognition. Andrej shares more about his new company Eureka Labs and his insights into AI-driven education, peer networks, and what young people should study to prepare for the reality ahead. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Karpathy Show Notes: (0:00) Introduction (0:33) Evolution of self-driving cars (2:23) The Tesla vs. Waymo approach to self-driving (6:32) Training Optimus with automotive models (10:26) Reasoning behind the humanoid form factor (13:22) Existing challenges in robotics (16:12) Bottlenecks of AI progress (20:27) Parallels between human cognition and AI models (22:12) Merging human cognition with AI capabilities (27:10) Building high performance small models (30:33) Andrej's current work in AI-enabled education (36:17) How AI-driven education reshapes knowledge networks and status (41:26) Eureka Labs (42:25) What young people study to prepare for the future


#440 – Pieter Levels: Programming, Viral AI Startups, and Digital Nomad Life

Lex Fridman Podcast

Play Episode Listen Later Aug 20, 2024

Pieter Levels (aka levelsio on X) is a self-taught developer and entrepreneur who has designed, programmed, and launched over 40 startups, many of which are highly successful. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep440-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript: https://lexfridman.com/pieter-levels-transcript CONTACT LEX: Feedback - give feedback to Lex: https://lexfridman.com/survey AMA - submit questions, videos or call-in: https://lexfridman.com/ama Hiring - join our team: https://lexfridman.com/hiring Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS: Pieter's X: https://x.com/levelsio Pieter's Techno Optimist Shop: https://levelsio.com/ Indie Maker Handbook: https://readmake.com/ Nomad List: https://nomadlist.com Remote OK: https://remoteok.com Hoodmaps: https://hoodmaps.com SPONSORS: To support this podcast, check out our sponsors & get discounts: Shopify: Sell stuff online. Go to https://shopify.com/lex Motific: Generative AI deployment. Go to https://motific.ai AG1: All-in-one daily nutrition drinks. Go to https://drinkag1.com/lex MasterClass: Online classes from world-class experts. Go to https://masterclass.com/lexpod BetterHelp: Online therapy and counseling. Go to https://betterhelp.com/lex Eight Sleep: Temp-controlled smart mattress.
Go to https://eightsleep.com/lex OUTLINE: (00:00) - Introduction (11:38) - Startup philosophy (19:09) - Low points (22:37) - 12 startups in 12 months (29:29) - Traveling and depression (42:08) - Indie hacking (46:11) - Photo AI (1:22:28) - How to learn AI (1:31:04) - Robots (1:39:21) - Hoodmaps (2:03:26) - Learning new programming languages (2:12:58) - Monetize your website (2:19:34) - Fighting SPAM (2:23:07) - Automation (2:34:33) - When to sell startup (2:37:26) - Coding solo (2:43:28) - Ship fast (2:52:13) - Best IDE for programming (3:01:43) - Andrej Karpathy (3:11:09) - Productivity (3:24:56) - Minimalism (3:33:41) - Emails (3:40:54) - Coffee (3:48:40) - E/acc (3:50:56) - Advice for young people PODCAST LINKS: - Podcast Website: https://lexfridman.com/podcast - Apple Podcasts: https://apple.co/2lwqZIr - Spotify: https://spoti.fi/2nEwCF8 - RSS: https://lexfridman.com/feed/podcast/ - Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 - Clips Channel: https://www.youtube.com/lexclips

#175 - GPT-4o Mini, OpenAI's Strawberry, Mixture of A Million Experts

Let's Talk AI

Play Episode Listen Later Jul 25, 2024 107:29 Transcription Available

Our 175th episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris).

In this episode of Last Week in AI, hosts Andrey Kurenkov and Jeremie Harris explore recent AI advancements including OpenAI's release of GPT-4o mini and Mistral's open-source models, covering their impacts on affordability and performance. They delve into enterprise tools for compliance, text-to-video models like Haiper 1.5, and YouTube Music enhancements. The conversation further addresses AI research topics such as the benefits of numerous small expert models, novel benchmarking techniques, and advanced AI reasoning. Policy issues including U.S. export controls on AI technology to China and internal controversies at OpenAI are also discussed, alongside Elon Musk's supercomputer ambitions and OpenAI's Prover-Verifier Games initiative.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/ If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.
Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Timestamps + links:
(00:00:00) AI Song Intro (00:00:40) Intro / Banter
Tools & Apps: (00:03:57) OpenAI unveils GPT-4o mini, a small AI model powering ChatGPT (00:11:38) Meet Haiper 1.5, the new AI video generation model challenging Sora, Runway (00:16:32) Anthropic releases Claude app for Android (00:18:59) Google Vids is available to test out Gemini AI-created video presentations (00:20:27) YouTube Music sound search rolling out, AI 'conversational radio' in testing
Applications & Business: (00:23:30) OpenAI working on new reasoning technology under code name 'Strawberry' (00:30:45) Inside Elon Musk's Mad Dash To Build A Giant xAI Supercomputer In Memphis (00:37:15) Apple, NVIDIA and Anthropic reportedly used YouTube transcripts without permission to train AI models (00:41:05) After Tesla and OpenAI, Andrej Karpathy's startup aims to apply AI assistants to education (00:43:40) Menlo Ventures and Anthropic team up on a $100M AI fund
Projects & Open Source: (00:46:27) Mistral releases Codestral Mamba for faster, longer code generation (00:50:36) Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model (00:52:51) Hugging Face Releases SmolLM, a Series of Small Language Models, Beats Qwen2 and Phi 1.5 (00:56:11) Stable Diffusion 3 License Revamped Amid Blowback, Promising Better Model
Research & Advancements: (01:01:49) FlashAttention-3 unleashes the power of H100 GPUs for LLMs (01:06:38) Mixture of A Million Experts (01:12:51) AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models (01:18:23) SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Policy & Safety: (01:20:50) Prover-Verifier Games improve legibility of language model outputs (01:28:05) Trump allies draft AI order to launch 'Manhattan Projects' for defense (01:34:40) On scalable oversight with weak LLMs judging strong LLMs (01:36:24) Google, Microsoft offer Nvidia chips to Chinese companies, the Information reports (01:38:26) U.S. planning 'draconian' sanctions against China's semiconductor industry: Report (01:48:47) OpenAI illegally barred staff from airing safety risks, whistleblowers say (01:44:59) Outro + AI Song

Silicon Valley's impact on the election and an acquisition making our HeadSpin

Equity

Play Episode Listen Later Jul 19, 2024 32:52

To kick off this week's news roundup, Kirsten walked us through Elon Musk's recent declaration of his intent to move both SpaceX and X's headquarters out of California to Texas. Whether or not he'll see those plans through remains to be seen, but of course, the Equity crew had thoughts.

We then got into the deals of the week. First up, we talked about Sequoia Capital's emailing LPs in funds raised between 2009 and 2011 with an offer to buy up to $861 million worth of shares in Stripe. The move is notable for two reasons. For one, it's evidence that LPs are increasingly antsy for liquidity in this dry IPO market. (2024 thus far has delivered just four venture-backed tech IPOs, namely Reddit, Astera Labs, Ibotta and Rubrik, in March and April.) The Equity team also discussed how Sequoia's gesture reflects that the firm is confident not only of Stripe's future, but in its ability to eventually exit in a way that will reward investors handsomely.

Next up, Rebecca Bellan led a discussion as to how Andrej Karpathy, former head of AI at Tesla and researcher at OpenAI, is launching Eureka Labs, an “AI native” education platform. We had a lively discussion on Karpathy's new initiative and when and how AI is appropriate in the classroom.

We closed out the deals segment with Mary Ann's scoop on PartnerOne's acquisition of HeadSpin, a company whose founder was sentenced to prison for fraud earlier this year. Employees were upset that they got nothing for their options as part of the buyout, which Marina Temkin this week reported was valued at a mere $28 million.

The group then got into an in-depth conversation about Silicon Valley's involvement in the election this year. Former President Donald Trump this week picked Ohio Senator J.D. Vance as his running mate, as he runs to reclaim the office he lost to President Joe Biden in 2020. Vance, who's best known for his memoir, “Hillbilly Elegy,” spent years as a venture capitalist before leaving the industry when elected to the U.S. Senate in 2022. We also talked about Andreessen Horowitz's controversial vocal support of Trump and the startup-related reasons why its leaders are backing the Republican nominee. We wrapped up Equity with a look at Latin America's startup scene and how it rebounded in funding in the second quarter, boosted by late-stage funding in the fintech sector.

It was a great episode, so give it a listen!

Equity is TechCrunch's flagship podcast, produced by Theresa Loconsolo, and posts every Monday, Wednesday and Friday. Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. For the full episode transcript, for those who prefer reading over listening, check out our full archive of episodes over at Simplecast.

Credits: Equity is produced by Theresa Loconsolo with editing by Kell. Bryce Durbin is our Illustrator. We'd also like to thank the audience development team and Henry Pickavet, who manages TechCrunch audio products.

SPECIAL FOR KEYNOTE: This Is Robotics: Radio News #31

This Is Robotics: Radio News

Play Episode Listen Later Jun 30, 2024 67:55

2024: The Most Important Year in the History of Robotics!

Companion podcast #31 to Keynote address at SuperTechFT 3 July 2024

Happy to be with you one and all. I'm Tom Green, your host and companion on this very special journey for 2024. We are only halfway through the year, and already 2024 has shown us that it is the most important year in the history of robotics. This podcast will show you why that is.

This podcast is a companion to the live keynote address I will give at SuperTechFT in San Francisco on July 3rd 2024. I want to first thank Dr. Albert Hu, president and director of education at SuperTechFT, and the staff and patrons of SuperTechFT for inviting me.

The title of my keynote: 2024: The Most Important Year in the History of Robotics! What other year can possibly compete for top honors other than 2024?

2024 eliminated the barrier to entry for digital programming by eliminating the need to code. As Tesla's former chief of AI, Andrej Karpathy, put it: "Welcome to the hottest new programming language...English"

2024 opened the door of AI prompt engineering to millions of new jobs and careers in millions of SME industries worldwide. So explains Andrew Ng, investor and former head of Google Brain and Baidu.

2024 converged GenAI with robotics, broadened robot/cobot applications, and freed robots from complexity of operation. So announced NVIDIA's CEO and founder Jensen Huang at the company's March meeting.

2024 reinvigorated the liberal arts, creative thinking, expository writing, and language as vital new components in developing robotics applications. So reflects Stephen Wolfram, physicist and creator of Mathematica.

2024 defined the need for the GenAI & the "New Collar" Worker Connection: vitally needed workers for AI/robot-driven industry worldwide, and just maybe, the revitalization of America's middle class…or the middle class of any nation. So explains Sarah Boisvert, technologist, factory owner, and author of the book on the New Collar Workforce.

Suddenly in mid-2024, technology has thrown us into a brand-new world. And it's only early July of 2024...can you believe it?

“Artificial intelligence and robotics could catapult both fields to new heights.”

The 4-Year Plight: SMEs in Search of Robots!
Tech News May Fade, but Its Stories Are Forever!
GenAI & "New Collar" Connection
Did AI Just Free Humanity from Code?

20VC: Why Foundation Model Performance is Not Diminishing But Models Are Commoditising, Why Nvidia Will Enter the Model Space and Models Will Enter the Chip Space & The Right Business Model for AI Software with David Luan, Co-Founder @ Adept

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jun 24, 2024 56:06

David Luan is the CEO and Co-Founder at Adept, a company building AI agents for knowledge workers. To date, David has raised over $400M for the company from Greylock, Andrej Karpathy, Scott Belsky, Nvidia, ServiceNow and Workday. Previously, he was VP of Engineering at OpenAI, overseeing research on language, supercomputing, RL, safety, and policy, and where his teams shipped GPT, CLIP, and DALL-E. He led Google's giant model efforts as a co-lead of Google Brain.

In Today's Episode with David Luan We Discuss:

1. The Biggest Lessons from OpenAI and Google Brain: What did OpenAI realise that no one else did that allowed them to steal the show with ChatGPT? Why did it take 6 years post the introduction of transformers for ChatGPT to be released? What are 1-2 of David's biggest lessons from his time leading teams at OpenAI and Google Brain?

2. Foundation Models: The Hard Truths: Why does David strongly disagree that the performance of foundation models is at a stage of diminishing returns? Why does David believe there will only be 5-7 foundation model providers? What will separate those who win vs those who do not? Does David believe we are seeing the commoditization of foundation models? How and when will we solve core problems of both reasoning and memory for foundation models?

3. Bundling vs Unbundling: Why Chips Are Coming for Models: Why does David believe that Jensen and Nvidia have to move into the model layer to sustain their competitive advantage? Why does David believe that the largest model providers have to make their own chips to make their business model sustainable? What does David believe is the future of the chip and infrastructure layer?

4. The Application Layer: Why Everyone Will Have an Agent: What is the difference between traditional RPA vs agents? Why are agents a 1,000x larger business than RPA? In a world where everyone has an agent, what does the future of work look like? Why does David disagree with the notion of "selling the work" and not the tool? What is the business model for the next generation of application layer AI companies?

A Lot Has Happened in A.I. Let's Catch Up.

The Ezra Klein Show

Play Episode Listen Later Dec 1, 2023 70:21

Thursday marked the one-year anniversary of the release of ChatGPT. A lot has happened since. OpenAI, the makers of ChatGPT, recently dominated headlines again after the nonprofit board of directors fired C.E.O. Sam Altman, only for him to return several days later.

But that drama isn't actually the most important thing going on in the A.I. world, which hasn't slowed down over the past year, even as people are still discovering ChatGPT for the first time and reckoning with all of its implications.

Tech journalists Kevin Roose and Casey Newton are hosts of the weekly podcast “Hard Fork.” Roose is my colleague at The Times, where he writes a tech column called “The Shift.” Newton is the founder and editor of Platformer, a newsletter about the intersection of technology and democracy. They've been closely tracking developments in the field since well before ChatGPT launched. I invited them on the show to catch up on the state of A.I.

We discuss: who is, and isn't, integrating ChatGPT into their daily lives, the ripe market for A.I. social companions, why so many companies are hesitant to dive in, progress in the field of A.I. “interpretability” research, and America's “fecklessness” that cedes major A.I. benefits to the private sector, and much more.

Recommendations:
Electrifying America by David E. Nye
Your Face Belongs to Us by Kashmir Hill
“Intro to Large Language Models” by Andrej Karpathy (video)
Import AI by Jack Clark
AI Snake Oil by Arvind Narayanan and Sayash Kapoor
Pragmatic Engineer by Gergely Orosz

Thoughts? Guest suggestions? Email us at ezrakleinshow@nytimes.com.

You can find transcripts (posted midday) and more episodes of “The Ezra Klein Show” at nytimes.com/ezra-klein-podcast, and you can find Ezra on Twitter @ezraklein. Book recommendations from all our guests are listed at https://www.nytimes.com/article/ezra-klein-show-book-recs.

This episode of “The Ezra Klein Show” was produced by Rollin Hu. Fact checking by Michelle Harris, with Kate Sinclair and Mary Marge Locker. Our senior engineer is Jeff Geld. Our senior editor is Claire Gordon. The show's production team also includes Emefa Agawu and Kristin Lin. Original music by Isaac Jones. Audience strategy by Kristina Samulewski and Shannon Busta. The executive producer of New York Times Opinion Audio is Annie-Rose Strasser. And special thanks to Sonia Herrero.

258 | Solo: AI Thinks Different

Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas

Play Episode Listen Later Nov 27, 2023 80:46

The Artificial Intelligence landscape is changing with remarkable speed these days, and the capability of Large Language Models in particular has led to speculation (and hope, and fear) that we could be on the verge of achieving Artificial General Intelligence. I don't think so. Or at least, while what is being achieved is legitimately impressive, it's not anything like the kind of thinking that is done by human beings. LLMs do not model the world in the same way we do, nor are they driven by the same kinds of feelings and motivations. It is therefore extremely misleading to throw around words like "intelligence" and "values" without thinking carefully about what is meant in this new context.

Blog post with transcript: https://www.preposterousuniverse.com/podcast/2023/11/27/258-solo-ai-thinks-different/

Support Mindscape on Patreon.

Some relevant references:
Introduction to LLMs by Andrej Karpathy (video)
OpenAI's GPT
Melanie Mitchell: Can Large Language Models Reason?
Mitchell et al.: Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks
Kim et al.: FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Butlin et al.: Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Margaret Boden: AI doesn't have feelings

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.


