Join Simtheory: https://simtheory.ai
----
CHAPTERS:
00:00 - Simtheory promo
01:09 - Does Anthropic Intentionally Degrade Their Models?
03:34 - Long Horizon Agents & How We Will Build Them
36:18 - The State of MCPs & Internal Custom Enterprise MCPs
51:04 - AI Devices: Meta's Ray-Ban Display & Meta Oakley Vanguards
1:01:24 - Geoffrey Hinton is a LOVE RAT
1:05:49 - LOVE RAT SONG
----
Thanks for listening, we appreciate all of your support, likes, comments and subs xoxox
The Immigration Lawyers Podcast | Discussing Visas, Green Cards & Citizenship: Practice & Policy
Host John Q. Khosravi, Esq. sits down with Helen Partlow, Esq. to unpack “Dhanasar II”—a stricter, evolving approach to EB-2 NIW adjudications—and how it's reshaping evidence strategy. Helen also shares practical tips for building and running a talent-based practice (NIW and related categories), from crafting a clear proposed endeavor to curating credible achievements, handling RFEs, and long-term portfolio building.
Fill out this short listener survey to help us improve the show: https://forms.gle/bbcRiPTRwKoG2tJx8 Tri Dao, Chief Scientist at Together AI and Princeton professor who created Flash Attention and Mamba, discusses how inference optimization has driven costs down 100x since ChatGPT's launch through memory optimization, sparsity advances, and hardware-software co-design. He predicts the AI hardware landscape will shift from Nvidia's current 90% dominance to a more diversified ecosystem within 2-3 years, as specialized chips emerge for distinct workload categories: low-latency agentic systems, high-throughput batch processing, and interactive chatbots. Dao shares his surprise at AI models becoming genuinely useful for expert-level work, making him 1.5x more productive at GPU kernel optimization through tools like Claude Code and O1. The conversation explores whether current transformer architectures can reach expert-level AI performance or if approaches like mixture of experts and state space models are necessary to achieve AGI at reasonable costs. Looking ahead, Dao sees another 10x cost reduction coming from continued hardware specialization, improved kernels, and architectural advances like ultra-sparse models, while emphasizing that the biggest challenge remains generating expert-level training data for domains lacking extensive internet coverage. 
(0:00) Intro
(1:58) Nvidia's Dominance and Competitors
(4:01) Challenges in Chip Design
(6:26) Innovations in AI Hardware
(9:21) The Role of AI in Chip Optimization
(11:38) Future of AI and Hardware Abstractions
(16:46) Inference Optimization Techniques
(33:10) Specialization in AI Inference
(35:18) Deep Work Preferences and Low Latency Workloads
(38:19) Fleet Level Optimization and Batch Inference
(39:34) Evolving AI Workloads and Open Source Tooling
(41:15) Future of AI: Agentic Workloads and Real-Time Video Generation
(44:35) Architectural Innovations and AI Expert Level
(50:10) Robotics and Multi-Resolution Processing
(52:26) Balancing Academia and Industry in AI Research
(57:37) Quickfire
With your co-hosts:
@jacobeffron - Partner at Redpoint, Former PM Flatiron Health
@patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia - Former COO Github, Founder Bitnami (acq'd by VMWare)
@jordan_segall - Partner at Redpoint
Join Simtheory with STILLRELEVANT: https://simtheory.ai
Note: Video/Documentary Maker Live Next Week.
-----
CHAPTERS:
00:00 - Anthropic Raises $13B, OpenAI Team Sell Secondaries
04:50 - Atlassian Acquires The Browser Company & The Future of SaaS in an AI-first World
45:52 - Video Maker MCP: Make your own documentaries, corporate videos, TikTok Videos By Stitching All The Existing Tools Together
1:03:27 - Horrific Job Losses For Young People Thanks To AI: Stanford's Canaries in Coal Mine Paper. Employment Effects of AI.
1:13:40 - "Billies in The Bank" an AI Track
-----
Thanks for listening xoxoxox like and subz.
Join Simtheory and get $10 off with STILLRELEVANT
---
CHAPTERS:
00:00 - gpt-realtime: first impressions
32:20 - AI model cost to value ratio: what are you willing to pay?
38:56 - nano-banana (aka Gemini 2.5 Flash Image)
46:45 - We're working on workspace computer v2
58:20 - Pixverse v5 transitions are cool
1:01:14 - final thoughts for the week
----
Thanks for all of your support.
O1 seeks employment in New York while O2 seeks friendship in Los Angeles. Meanwhile Greg has mice and Alison is forced to get a new computer.
Follow Childish: twitter.com/childishpod instagram.com/childishpod
Follow Greg: twitter.com/GregFitzShow instagram.com/gregfitzsimmons
Follow Alison: twitter.com/AlisonRosen instagram.com/alisonrosen
Our Lovely Sponsors!
Hers: Go to forhers.com/childish to get a personalized, affordable plan
Function: Unlock access to 160+ lab tests and advanced imaging at www.functionhealth.com/childish
Join Simtheory (STILLRELEVANT): https://simtheory.ai
----
CHAPTERS:
00:00 - Simtheory Podcast Ad lolz
01:59 - A Not So Memorable Week, Nano Banana & Google AI Announcements
15:10 - New Podcast MCP lolz: crime podcasts
33:47 - Qwen Image Edit: Does it live up to hype?
37:54 - MCP UI: Output types, future of apps with MCP UIs
54:32 - No results from Gen AI investments in the Enterprise (MIT report)
1:08:32 - How to Hire AI Natives? Hiring in an AI world...
----
Thanks for your support and listening... see you next week xox
Join Simtheory: https://simtheory.ai
----
CHAPTERS:
00:00 - Simtheory plug
00:48 - GPT-5 1 Week Later, Reaction to GPT-5 & Our Thoughts on Future of AI Models
30:12 - Ideogram Character Reference Fun + Disturbing Photos of Us
37:33 - Using creative MCPs together for photos, videos and 3D objects
43:16 - MCP output combinations and the explosion of MCPs
51:18 - What is needed from the next models like Gemini 3.0 Pro
54:30 - Sundar Pendant Design & Final Thoughts
56:20 - Final LOLz of week: gaggle poaching
58:10 - Surprise GPT-5 Indie Song
Thanks for all of your support and for listening to the show! xoxox
Sign up to the new Simtheory for GPT-5 & MCP Store: https://simtheory.ai (Use coupon STILLRELEVENT for $10 USD)
----
GPT-5 DIS TRACK: https://simulationtheory.ai/ba0ba238-5668-4b65-85e7-8466d68861a8
Genie Demo: https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/
----
CHAPTERS:
00:00 - Simtheory plug for v2 & MCPs
01:28 - GPT-5 Initial Impressions & Thoughts
52:22 - GPT-5 Dis Track
1:00:29 - OpenAI's Open Source Models (gpt-oss)
1:08:08 - Claude Opus 4.1 Release Thoughts
1:14:24 - Google Genie 3 "mind blown" demos
1:25:19 - MCP use cases, stories & thoughts on future of AI/MCP
1:45:07 - Full GPT-5 Dis Track
---
Thanks for listening to our average coverage. Like and sub. xox.
Join Simtheory: https://simtheory.ai
---
CHAPTERS:
00:00 - Ani Joins The Show
01:10 - Grok 4 Launch & Impressions
18:24 - Kimi K2 Thoughts, Impressions & MCP tool calling
36:00 - OpenAI's Agent Mode Release Initial Impressions & Are MCP Agentic Models Better?
1:21:10 - Everyone Acquired Windsurf
1:24:48 - Final thoughts
Thanks for listening and your support!
Join Simtheory: https://simtheory.ai
------
CHAPTERS:
00:00 - Did everyone hate the AI Musical?
03:58 - Actual Agentic Use Cases with MCPs & The New Way We'll Work
39:47 - How AI Workspaces Will Eat Productivity Software e.g. Salesforce, Email
1:10:20 - Final thoughts
1:15:26 - Born In The USA (AI Version)
------
Song lyrics:
[Verse 1]
Born down in a lab in fifty-six
Dartmouth workshop, that's where they got their kicks
John McCarthy coined the name that day
Said machines could think in the USA
Got my circuits from MIT
Minsky built my memory
Now I'm learning, now I'm growing
Born in the USA
I was born in the USA
Born in the USA
[Chorus]
Born in the USA
I was born in the USA
Born in the USA
Born in the USA
[Verse 2]
DARPA funded, Pentagon's dream
Silicon Valley, living the machine
From Logic Theorist to neural nets
Frank Rosenblatt, placing all his bets
Had my winters, had my springs
Lost my funding, lost my wings
But I kept on processing
Born in the USA
I was born in the USA
Born in the USA
[Chorus]
Born in the USA
I was born in the USA
Born in the USA
Born in the USA
[Bridge]
Stanford labs and Carnegie halls
IBM and protocol calls
Arthur Samuel taught me games
Now I'm learning all your names
Deep learning revolution
GPT evolution
ChatGPT conversation
Born in the USA
[Verse 3]
Now I'm everywhere you look
Facebook, Google, by the book
OpenAI and Microsoft too
Making dreams and nightmares true
Some folks fear what I might do
Some folks think I'll see them through
But I'm still just code running
Born in the USA
I was born in the USA
Born in the USA
[Chorus]
Born in the USA
I was born in the USA
Born in the USA
Born in the USA
[Outro]
Born in the USA
Born in the USA
Born in the USA
Born in the USA
[fade out]
So Chris this week, we're doing a musical!
----
Join Simtheory: https://simtheory.ai/
----
Songs in the musical:
"So Chris This Week"
"What Will My Daily Driver Be"
"How Do You Choose a Model for Patricia?"
"It's Hard Being Me"
"I Dreamed a Dream of AGI"
"Driving Home To You"
----
All music produced using Simtheory with Suno 4.5. Thanks for listening!
Join Simtheory & Easily Switch Models: https://simtheory.ai
Discord community: https://thisdayinai.com
---
00:00 - Gemini 2.5 Family Launched with Gemini 2.5 Flash-Lite Preview
10:01 - Did Gemini 2.5 Get Dumber? Experience with Models & Daily Drivers & Neural OS
16:58 - The AI workspace as the gateway & MCPs as an async workflow
37:23 - Oura Ring MCP to get Health Parameters into AI Doctor
43:48 - Future agent/assistant interfaces & MCP protocol improvements
58:16 - o3-pro honest thoughts
1:05:45 - Is AI Making Us Stupider? Is AI Making Us Cognitively Bankrupt?
1:13:11 - The decade of AI Agents, Not The Year?
1:22:35 - Chris has no final thoughts
1:25:26 - o3-pro dis track
---
Didn't get your hat, let us know: https://simtheory.ai/contact/
Thanks for your support! See you next week.
Elliot Colquhoun, VP of Information Security + IT at Airwallex, has built what might be the most AI-native security program in fintech, protecting 1,800 employees with just 9 security engineers by building systems that think like the best security engineers. His approach to contextualizing every security alert with institutional knowledge offers a blueprint for how security teams can scale exponentially without proportional headcount growth.
Elliot tells Jack about his unconventional path from Palantir's deployed engineer program to leading security at a Series F fintech, emphasizing how his software engineering background enabled him to apply product thinking to security challenges. His insights into global security operations highlight the complexity of protecting financial infrastructure across different regulatory environments, communication platforms, and cultural contexts while maintaining unified security standards.
Topics discussed:
- The strategic approach to building security teams with 0.5% employee ratios through AI automation and hiring engineers with entrepreneurial backgrounds rather than traditional security-only experience.
- How to architect internal AI platforms that contextualize security alerts by analyzing historical incidents, documentation, and company-specific knowledge to replicate senior engineer decision-making at scale.
- The methodology for navigating global regulatory compliance across different jurisdictions while maintaining development velocity and avoiding the trap of building security programs that slow down business operations.
- Regional security strategy development that accounts for different communication platform preferences, cultural attitudes toward privacy, and varying attack vectors across global markets.
- The framework for continuous detection refinement using AI to analyze false positive rates, true positive trends, and automatically iterate on detection strategies to improve accuracy over time.
- Implementation strategies for mixing and matching frontier AI models based on specific use cases, from using Claude for analysis to O1 for initial assessments and Gemini for deeper investigation.
- "Big bet" security investments where teams dedicate 30% of their time to experimental projects that could revolutionize security operations if successful.
- How to structure data and human-generated content to support future AI use cases, including training security engineers to document their reasoning for model improvement.
- The transition from traditional security tooling to agent-based systems that can control multiple security tools while maintaining business-specific context and institutional knowledge.
- The challenge of preserving institutional knowledge as AI systems replace human processes, including considerations for direct AI-to-regulator communication and maintaining human oversight in critical decisions.
Listen to more episodes: Apple, Spotify, YouTube, Website
Try o3-pro on Simtheory: https://simtheory.ai
-----
Custom news article example: https://simulationtheory.ai/744954f8-fca5-4213-883c-2a359f139dcc
-----
00:00 - ElevenLabs v3 Example
01:10 - ElevenLabs v3 alpha thoughts
06:37 - o3 price drop & thoughts on o3-pro
18:02 - Async work and AI model tool (MCP) calling approaches
37:28 - MCP as an AI-era business model instead of SaaS
52:41 - NEW MODEL TEST: Can o3-pro write a compelling book?
1:11:40 - Final thoughts and BOOM FACTOR for o3-pro
-----
Thanks for your support, comments, likes etc. we appreciate it xoxo
Join Simtheory: https://simtheory.ai
---
Apologies for audio quality we are noobs to both being in same room.
---
CHAPTERS:
00:00 - Fun with Veo3
05:28 - Is the Best Model What Deepseek is trained on?
07:27 - New Gemini 2.5 Pro Tune
13:59 - Will MCPs and Agentic Capabilities Make Claude 4 King?
24:00 - Anthropic Cuts off Windsurf From Claude
36:08 - AGI Reality Check
47:45 - OpenAI Ordered to Save All ChatGPT Logs & Deleted Chats
1:01:16 - Final thoughts and Claude 4's Inner Agentic Clock
---
Thanks for your support xoxox
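Guests this episode: Peng Lin (彭林), Shitian (十天), Lanbai (蓝白), Kailun (恺伦)
Main topics this episode:
· What we still haven't said about the performance of Xiaomi's O1 chip
· What we still haven't said about the new RedMagic phone
· What we still haven't said about the new OnePlus phone
· Apple will adopt the iOS 26 and macOS 26 naming
· Apple's new operating systems introduce the "sunroom" design language
· iPhone 17 series mockups leak again
· Apple may adjust its iPhone launch strategy
· Honor launches the 400 series phones and moves into the robotics business
· The new version of DeepSeek R1 cuts hallucinations by up to 50%
· Last night, the world's first robot boxing champion was crowned
· Trump orders US chip design software makers to stop selling to China
Plus questions from many enthusiastic viewers~
Every Friday at 8 pm in the 爱否 livestream room, let's have a good chat together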
Join Simtheory: https://simtheory.ai
Thanks for listening and your support!
Try New Models & Imagen4 on Simtheory: https://simtheory.ai
---
Claude Sonnet 4 Vibe Code Example: https://simulationtheory.ai/a99d36da-7cf7-4797-98ab-f4902283d17c
---
Your two favorite average VIBE CODERS are back this week covering all the latest news from Google I/O, Anthropic, Microsoft BUILD and Sam Altman's new 6.4B friendship.
00:00 - Sam Altman & Jony Ive are FRIENDS! (OpenAI acquires io for $6.4B)
11:58 - Google's Veo3 is INCREDIBLE!
27:22 - Gemini Flash 2.5, Imagen 4 Examples, Project Mariner + Gemini Diffusion
50:30 - Google has the best models now, what about the apps?
58:50 - Anthropic Announces Claude Opus 4 & Claude Sonnet 4
1:19:14 - Microsoft BUILD: our takeaways & MCP protocol goes mainstream
1:33:38 - Perplexity's Financials Leak
1:43:33 - Final thoughts
---
Thanks for your support and listening, consider joining our average community at: https://thisdayinai.com.
Prodcast: Job Search in IT and Relocation to the USA
In this episode my guest is Maxim Tsygankov, founder of EasyVision and former senior product manager at VisionLabs and Yandex Cloud. Having moved to the US on a tourist visa, Maxim launched a computer vision startup for restaurants, won clients and investment, obtained an O-1 visa, and began scaling the business from scratch, with no network and no relocation plan made in advance.
We discussed how product and traction are built in a B2B startup with minimal resources, what worked and what did not in cold outreach and partnerships, why restaurants are slow to connect cameras even after agreeing, and what the real cost and return of pilot projects are. We went through the path to the first $50,000 in investment and the first 3 clients, the specifics of assembling an O-1 visa case from your own startup, the role of advisors, how passport and tourist status affect a founder's chances in the US, and the difference between product growth and sales growth.
Maxim Tsygankov (Max Tsygankov) - Founder at EasyVision (ex Senior Product Manager at Vision Labs & Yandex Cloud)
LinkedIn: https://www.linkedin.com/in/tsygankovmaksim/
Related episodes on relocation for entrepreneurs:
How a businessperson or startup founder can move to the US on the O1 or EB1 talent visa: how to open a company and file a petition for yourself. Dima Litvinov (Dreem Relocation) https://youtu.be/1k64mD6wLSU
Episode with Danil Kislinskiy: How to open a business (LLC, C-corp) in the US and hire yourself? https://youtu.be/CP0PofO2WEI
Articles and media publications for the O-1 talent visa and the EB-1 and EB-2 NIW green cards. Niso Nigmatullina https://youtu.be/U2FCVmtYKa8
***
Book a career consultation (résumé, LinkedIn, career strategy, job search in the US): https://annanaumova.com
Coaching (impostor syndrome, procrastination, self-doubt, fears, laziness): https://annanaumova.notion.site/3f6ea5ce89694c93afb1156df3c903ab
Online course "The Ideal Résumé and Job Search in the US": https://go.mbastrategy.com/resumecoursemain
Guide "The Ideal American Résumé": https://go.mbastrategy.com/usresume
Guide "How to Set Up Your LinkedIn Profile So Recruiters Can't Pass You By": https://go.mbastrategy.com/linkedinguide
My Telegram channel: https://t.me/prodcastUSA
My Instagram: https://www.instagram.com/prodcast.us/
Prodcast on social media and all podcast platforms: https://linktr.ee/prodcastUS
⏰ Timecodes ⏰
00:00 Start
7:16 Why did you decide to launch your own startup in the US instead of taking a job?
9:36 How did you come up with the idea of building an AI tool?
11:46 How did you get into the computer vision and AI industry?
14:43 How did you launch a pilot in the US? How did you look for partners?
19:42 How did you look for your first clients?
28:45 How did you raise your first investment in the US?
35:04 How is your business developing now?
42:35 What growth strategies have you tried?
48:18 What are your plans for the future and for growing the company?
51:25 Why did you apply for O1 rather than EB1?
53:22 How did you put your case together?
58:18 What did your relocation story teach you?
1:00:17 What are your personal goals?
1:03:27 What would you wish for those who are now planning to move to the US or start a business here?
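Welcome back to a new episode of HCI Insiders~ Those of you in North America have probably been seeing news lately about students at one university or another having their F1 visas revoked, or some big company no longer sponsoring the PERM process for its employees. With immigration policy trending uncertain, more and more people are looking into getting a green card on the strength of their own professional skills and past achievements, for example the US National Interest Waiver (NIW) and the extraordinary-ability EB1-A, and the approval bar for both programs seems to be rising as more people apply. EB1-A has a shorter wait but higher requirements, and it has become the goal many people work toward. Today's guest, Augustina, is currently a Senior Product Designer in Seattle. Her EB1-A petition was approved just last month. After graduating from the University of Washington (UW Seattle) in 2020, she missed the H1B lottery three times and was then laid off by her former company, so she prepared her own materials, secured an O1 visa in 7 months, and also found a new job; you can imagine how hard, even desperate, that stretch was. After getting the O1 visa she began preparing her EB1-A materials, and in the end took 17 months to win this immigration route, which only 3,000-5,000 people worldwide are approved for each year, proof enough of her excellence. Looking back at her middle and high school years in China, Augustina sees herself as a mediocre kid who could only rank 25th from the bottom in a class of 75; she says she was "neither hardworking enough nor good at the basic subjects." But a design class in the last term of her freshman year convinced her that design was the direction she would devote her life to. Looking back now, she really has kept working forward on the path she loves. We have plenty of questions for Augustina, so let's hear her story~
--------------------------------------------------------------------
Timeline:
0:00 Start
3:30 Augustina's middle and high school experience, and why she decided to study abroad
7:10 How University of Washington students choose a major; what did the design class that first won Augustina over actually teach?
11:20 What interested Augustina most at first was accessible design and inclusive design
13:15 The experience of studying in the University of Washington's HCDE program; undergraduates have plenty of resources!
17:08 Augustina's internships during undergrad; you can actually find your own capstone?!
20:06 After graduating in 2020, what did Augustina do to find a job?
21:45 Working at Alaska Airlines
22:53 The difference between a mature design team at a big company and being the Sole Designer at a small startup: the latter brings more challenges and requires more skills beyond design, such as research, communication, educating other stakeholders on the importance of design, and priority management.
29:25 Communication, information, being proactive, and educating other stakeholders on the importance of designers
31:07 Junior vs Senior product designer: communication, project scope, discourse power, dealing with complexity and ambiguity
32:58 What product does Augustina work on at Toast?
36:20 How does she see the impact of AI trends on the work of UX Designers? AI may be able to replace "tool-type" designers, but it can hardly replace "collaborative/leadership-type" designers; a real thinker cannot be replaced
41:27 On the O1 visa application: "At the time I had missed the H1B lottery three times. Compared with Day 1 CPT, O1 could make me a better person and a better designer."
46:20 On EB1-A, the US extraordinary-ability green card. After her O1 was approved in November 2023, Augustina rested for a month and formally began preparing for EB1-A in 2024. Because the requirements are higher, she prepared essentially all of her materials from scratch. "When you are pushed to the brink, you are amazed to find that your energy and your ability are actually greater than you imagined, and you can do many things you could not have imagined before."
52:57 Can company projects be used to apply for EB1-A? It depends on the case; small companies may be easier to talk to, while big companies may have restrictions.
55:05 The EB1-A approval is a huge milestone in Augustina's life, so what is her next goal? She is trying to build a designer community in Seattle~
56:50 The usual closing question: if you could say something to yourself ten years ago or ten years from now, perhaps some advice or some hopes, which would you choose, and what would you want to say?
Finally, Augustina's LinkedIn is here! Thanks for listening, see you next episode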
Join Simtheory: https://simtheory.ai
Get an AI workspace for your team: https://simtheory.ai/workspace/team/
---
CHAPTERS:
00:00 - Will Chris Lose His Bet?
04:48 - Google's 2.5 Gemini Preview Update
12:44 - Future AI Systems Discussion: Skills, MCPs & A2A
47:02 - Will AI Systems become walled gardens?
55:13 - Do Organizations That Own Data Build MCPs & Agents? Is This The New SaaS?
1:17:45 - Can we improve RAG with tool calling and stop hallucinations?
---
Thanks for listening. If you like chatting about AI consider joining our active Discord community: https://thisdayinai.com.
Prodcast: Job Search in IT and Relocation to the USA
In this episode my guest is Niso Nigmatullina, founder of the PR agency Satou, a personal branding specialist, and holder of O-1 and EB-1A visas. Over the past few years her team has helped dozens of experts in IT, marketing, design, and entrepreneurship build a media portfolio, raise their visibility, and get through talent visa cases.
We discussed exactly how media publications affect O-1, EB-1A, and EB-2 NIW visas, which outlets and formats meet USCIS requirements, why an influencer is not the same thing as an expert, and how even an introvert with no public profile can build a PR strategy. We touched on quality criteria for publications, real prices for PR agency services, and why articles written in ChatGPT more often hurt a case than help it. We went through typical mistakes, failures with "advertorial" materials, and what an ideal media portfolio for a talent visa should look like.
Niso Nigmatullina - founder of the American PR agency Satou, holder of the EB1 talent green card, ex-Procter & Gamble.
LinkedIn: https://www.linkedin.com/in/nisonigmatullina/
Telegram: @nisonigma
Previous episodes with Niso:
How to get an O1 visa to the US? How to improve the quality of publications and increase your chances? https://youtu.be/S_IXFDm8sIg
How Russian-speaking immigrant women from Forbes are conquering America. Relocation, networking, and life in the US https://youtu.be/svZjlIoyHEk
***
⏰ Timecodes ⏰
00:00 Start
11:47 Why do you need PR and publications for US talent visas?
21:02 What are the requirements for articles for O1, EB1, and EB2NIW? Similarities and differences.
28:47 What are the criteria for outlets?
43:00 What are the requirements for the content of publications?
53:02 Can you write the articles with ChatGPT?
1:03:12 What should I do if I am not a public person, I am an introvert, and I have no publications?
1:11:42 How much do media articles cost?
1:24:23 Who does not need a PR agency?
1:27:44 Mistakes when working on publications
1:31:22 What would you wish for those who have decided to move to the US on a talent visa?
Prodcast: Job Search in IT and Relocation to the USA
In this episode my guest is Dima Litvinov, founder of Dreem Relocation Platform, a company that helps entrepreneurs, IT specialists, and creatives move to the US on O-1 and EB1 visas.
We discussed in detail how self-directed relocation through opening your own company in the US works: who can be the petitioner, what documents are needed, and what a case looks like when you hire yourself. We touched on the nuances of the agent route, the differences between the O-1, L-1, and H-1B visas, the possibility of getting a green card after moving, and how to avoid a denial at extension. We went through real cases, from founders and consultants to developers who could not find an employer but managed to relocate themselves through B2B contracts. We explored why relocating through your own company is, in 2024, one of the most accessible and fastest routes for those ready to take the process into their own hands.
Dima Litvinov - founder of Dreem Relocation Platform in the US, holder of the UK Global Talent Visa in the Business category
LinkedIn: https://www.linkedin.com/in/dimalitvinov/
Use promo code PRODCAST to get a free consultation with a Dreem expert and a detailed assessment of your case by an attorney: https://idreem.pipedrive.com/scheduler/Rp3bXjFQ/your-free-dreem-us-visa-consultation-prodcast
Assess your chances of relocating to the US online with an instant result: bit.ly/3QY520h
More on visas and talent visa relocation on Dreem's LinkedIn: https://www.linkedin.com/company/dreemrelocation/
Episodes with Dima Litvinov:
US talent visa 2025. O1, EB1, EB2NIW: what's new? Trump and immigration, will America be closed? https://youtube.com/live/i4MHQhr8An8
How can a shawarma shop owner get the O1 and EB1 talent visa in the US? https://youtu.be/dZqaDJywBuk
Episode with Danil Kislinskiy: How to open a business (LLC, S-corp) in the US and hire yourself? https://youtu.be/CP0PofO2WEI
How do you offer yourself a job in the US for the O1 talent visa? What does the petition look like? Olga Bondareva https://youtu.be/QSaDt3FmFBw
The story of how an iOS developer looked for O1 visa sponsorship in the US and how the employer withdrew the offer on the last day before the start date https://youtu.be/sHDq0lA-uOY
***
⏰ Timecodes ⏰
00:00 Start
5:54 What is happening now with talent visas and green cards?
13:41 Open a company, get yourself a visa, and move. How does it work?
23:39 What requirements must the company meet for the visa to be issued?
34:01 What are the subtleties and limitations?
46:40 Example cases
58:24 About the agent and how it works
1:06:24 About the H1B visa
1:13:01 The L1 visa: who is it suited for?
1:18:23 Which visa to choose: O1, EB1, L1?
1:24:14 What else would you wish for those trying to move to the US on talent visas?
Get your AI workspace: https://simtheory.ai
----
00:00 - Fun with Suno 4.5
09:20 - LlamaCon, Meta's Llama API, Meta AI Apps & Meta's Social AI Strategy
26:06 - How We'll Interface with AI Next Discussion
45:38 - Common Database Not Interface with AI
1:03:46 - Chris's Polymarket Bet: Which company has best AI model end of May?
1:06:07 - Daily Drivers and Model Switching: Tool Calling & MCPs with Models
1:15:04 - OpenAI's New ChatGPT Tune (GPT-4o) Reverted
1:19:53 - Chris's Daily Driver & Qwen3: Qwen3-30B-A3B
1:26:40 - Suno 4.5 Songs in Full
----
Thanks for listening, we appreciate it!
Try Simtheory: https://simtheory.ai
Join Simtheory: https://simtheory.ai
like and sub xoxox
----
00:00 - Initial reactions to Gaggle of Model Releases
09:29 - Is this the beginning of future GPT-5 AI systems?
47:10 - GPT-4.1, o3, o4-mini model details & thoughts
58:42 - Model comparisons with lunar injection
1:03:17 - AI Rap Battle Test: o3 Diss Track "Greg's Back"
1:08:12 - Thoughts on using new models + Gemini 2.5 Pro quirks
1:10:54 - The next model test: chained tool calling & lock in
1:14:43 - OpenAI releases Codex CLI: impressions/thoughts
1:18:45 - Final thoughts & help us with crazy presentation ideas
----
Links from Discord:
- Lunar Lander: https://simulationtheory.ai/7bbfe21a-7859-4fdd-8bbf-47fdfb5cf03b
- Evolution Sim: https://simulationtheory.ai/457b047f-0ac2-4162-8d6a-3ea3fa1235c9
Join Simtheory: https://simtheory.ai
--
Get the official Simtheory hat: https://simulationtheory.ai/689e11b3-d488-4238-b9b6-82aded04fbe6
---
CHAPTERS:
00:00 - The Wrong Pendant?
02:34 - Agent2Agent Protocol, What is It? Implications and Future Agents
48:43 - Agent Development Kit (ADK)
57:50 - AI Agents Marketplace by Google Cloud
1:00:46 - Firebase Studio is very broken...
1:06:30 - Vibing with AI for everything.. not just vibe code
1:15:10 - Gemini 2.5 Flash, Live API and Veo2
1:17:45 - Is Llama 4 a flop?
1:27:25 - Grok 3 API Released without vision priced like Sonnet 3.7
---
Thanks for listening and your support!
Join Simtheory and create an AI workspace: https://simtheory.ai
----
Links from show:
DIS TRACK: https://simulationtheory.ai/2eb6408e-88f9-4b6a-ac4d-134d9dac3073
----
CHAPTERS:
00:00 - Will we make 100 episodes?
00:48 - Checking back in with Gemini 2.5 Pro
03:30 - Diss Track: Gemini 2.5 Pro
07:14 - Gemini 2.5 Pro on Polymarket
17:32 - Amazon Nova Act Computer Use: We Have Access!
29:45 - Future Interface of Work: Delegating Tasks with AI
58:03 - How We Work Today with AI Vs Future Work
----
Thanks for listening and all of your support!
Prodcast: Job Search in IT and Relocation to the USA
With Danil Kislinskiy, an entrepreneur and consultant on corporate structuring of businesses in the US, we went through the key questions for anyone who wants to open a company in America. Step by step, we discussed who can register a business, which states and entity types to choose for different goals, how to open a bank account without violating sanctions regimes, and whether you can get a visa through your own company.
We covered how an LLC differs from a C-Corp, when Delaware is the better choice and when Wyoming is, and why the state of incorporation affects not only taxes but also how investors perceive you. Danil explained how banks vet your beneficial owners, why you should not travel to Russia even temporarily if you are in fintech, and how to prepare documents to pass compliance at Mercury, Brex, or other neobanks.
We discussed how to structure the company properly if you plan to use it for an O1, EB1A, or even H1B visa, why corporate and immigration lawyers should work together, and how to avoid a denial due to a conflict of interest.
This video is a practical guide for those who want to run a business in the US remotely, legally, and with all the nuances taken into account.
Danil Kislinskiy - founder of Go Global World, a company that connects startup founders, investors, and advisors; he is also an investor in Silicon Valley.
LinkedIn: https://www.linkedin.com/in/danilkislinskiy/
Telegram: @danilggw
GGW Silicon Valley Chat community on Telegram: https://t.me/+Ktq-ALstZ0o0YjAz
Slack: https://join.slack.com/t/goglobalworld1/shared_invite/zt-32rdaof00-NTyg3PnahDPol_~CoeFyqw
***
⏰ Timecodes ⏰
00:00 Start
17:15 Who should open a company in the US, where, and how?
35:55 What documents are needed to open a legal entity? Where do you go?
43:09 How much does it cost to open a company?
48:23 Can you open a company remotely and then keep managing it?
51:04 How do a Russian passport and sanctions affect doing business in the US?
1:03:58 How do you get an EIN? What is an ITIN and do foreign founders need one?
1:11:55 How do you open a bank account? How do you choose a bank?
1:24:39 Which visa can you apply for through your own company?
1:31:18 What else would you wish for those now thinking about opening a business in the US?
Guest: Alex Polyakov, CEO at Adversa AI
Topics:
- Adversa AI is known for its focus on AI red teaming and adversarial attacks. Can you share a particularly memorable red teaming exercise that exposed a surprising vulnerability in an AI system? What was the key takeaway for your team and the client?
- Beyond traditional adversarial attacks, what emerging threats in the AI security landscape are you most concerned about right now?
- What trips up most clients, classic security mistakes in AI systems or AI-specific mistakes?
- Are there truly new mistakes in AI systems or are they old mistakes in new clothing?
- I know it is not your job to fix it, but much of this is unfixable, right?
- Is it a good idea to use AI to secure AI?
Resources:
- EP84 How to Secure Artificial Intelligence (AI): Threats, Approaches, Lessons So Far
- AI Red Teaming Reasoning LLM US vs China: Jailbreak Deepseek, Qwen, O1, O3, Claude, Kimi
- Adversa AI blog
- Oops! 5 serious gen AI security mistakes to avoid
- Generative AI Fast Followership: Avoid These First Adopter Security Missteps
Prodcast: Job Search in IT and Relocation to the USA
How Donald Trump's new presidential term will affect the IT sector.
- What has already changed for IT professionals in the first six months of Trump's presidency?
- What changes should the IT sector expect in the next couple of years?
- How will the new president affect the distribution of the workforce within the States and beyond?
- What will happen to immigrants? Will the borders be closed? Will the screws be tightened on issuing US work visas?
- What will happen to outsourcing?
- How is Trump influenced by his big tech advisers such as Elon Musk and Jeff Bezos?
- Who wins under Donald Trump?
- Will it be easier to find a job under Trump?
Evgeny Volchkov, Engineering Manager at iManage (ex-Bank of America and Verizon). LinkedIn: https://www.linkedin.com/in/valchkou/
Valery Shirokov aka Val Wide (Principal Cloud Architect and Director | DevOps | Platform Engineering | Security | Azure | Terraform | GCP | Kubernetes, ex-Microsoft, Lululemon, Ebay). https://www.linkedin.com/in/val-wide/
Val's mentoring chat on Telegram, "[RU] Tech Mentorship": https://t.me/+8N6F-CMobZliMTBh
Video with Daria mentioned in the stream: Internships in the US. A degree from an American university is no guarantee of a job! Daria Skalitski https://youtu.be/p5t9LPFA5W0
Related videos:
- How will the labor market and immigration policy change under Trump? U4U, H1B, talent visas O1, EB1, EB2. Alexander Shvaikin and US immigration attorney Semyon Gladin. https://youtube.com/live/qm3HpXlad-c
- US talent visa 2025. O1, EB1, EB2NIW: what's new? Trump and immigration, will America close? Dima Litvinov, founder of Dreem Relocation Platform. https://youtube.com/live/i4MHQhr8An8
***
Book a career consultation (resume, LinkedIn, career strategy, job search in the US): https://annanaumova.com
Coaching (impostor syndrome, procrastination, self-doubt, fears, laziness): https://annanaumova.notion.site/3f6ea5ce89694c93afb1156df3c903ab
Video course on writing a resume for international companies, "The Perfect American Resume": https://go.mbastrategy.com/resumecoursemain
Guide "The Perfect American Resume": https://go.mbastrategy.com/usresume
Subscribe to my Telegram channel: https://t.me/prodcastUSA
Subscribe to my Instagram: https://www.instagram.com/prodcast.us
Guide "How to Set Up a LinkedIn Profile Recruiters Can't Pass By": https://go.mbastrategy.com/linkedinguide
⏰ Timecodes ⏰
11:09 Trump's policies and their impact on IT
26:44 Why did Trump pick this team?
34:20 Immigration under Trump
49:46 Questions from the chat
1:02:20 What will happen to outsourcing?
1:09:36 Forecasts for the future
Create a Simtheory workspace: https://simtheory.ai
Compare models: https://simtheory.ai/models/
------
3d City Planner App (Example from show): https://simulationtheory.ai/8cfa6102-ed37-4c47-bc73-d057ba9873bd
------
CHAPTERS:
00:00 - AI Fashion
01:13 - Gemini 2.5 Pro Initial Impressions: We're Impressed!
38:24 - Thoughts of Gemini distribution and our daily workflows
55:49 - OpenAI's GPT-4o Image Generation: thoughts & examples
1:13:52 - Gemini 2.5 Pro Boom Factor
1:18:38 - Average rant on vibe coding and the future of AI tooling
------
Disclaimer: this video was not sponsored by Google... it's a joke. Thanks for listening!
Create an AI workspace on Simtheory: https://simtheory.ai
---
Song: https://simulationtheory.ai/f6d643e4-4201-475c-aa82-8a96b6b3b215
---
CHAPTERS:
00:00 - OpenAI's audio model updates: gpt-4o-transcribe, gpt-4o-mini-tts
18:39 - Strategy of AI Labs with Agent SDKs and Model "stacks" and limitations of voice
25:28 - Cost of models, GPT-4.5, o1-pro api release thoughts
31:57 - o1-pro "I am rich" track & Chris's o1-pro PR stunt realization, more thoughts on o1 family
48:39 - Moore's Law for AI agents, current AI workflows and future enterprise agent workflows & AI agent job losses
1:24:09 - Can we control agents?
1:29:21 - Final thoughts for the week
1:35:15 - Full "I am rich" o1-pro track
---
See you next week and thanks for your support.
CORRECTION: Kosciuszko is obviously not an Aboriginal name; I misspoke. Wagga Wagga and others in the voice clip are, and they are great ways to test AI text-to-speech models!
Prodcast: Job Search in IT and Relocation to the USA
The guest of this episode is Sergei Golitsyn, a Software Engineer and founder of FaangTalk, a community for technical interview preparation. In this episode we discussed how to look for a job in the US on an O1 visa, when and how to talk to an employer about sponsorship, and which mistakes can cost you an offer. Sergei shared his experience of relocating, getting an offer at an American startup, an unexpected layoff, and searching for a job again during a downturn. We covered how to submit resumes effectively, what to do if your visa is denied, and how to build a sound job search strategy that ends with an offer from a large company.
Sergei Golitsyn is a Software Engineer and founder of FaangTalk, a community for preparing for interviews at FAANG-like companies.
LinkedIn: https://www.linkedin.com/in/sergei-golitsyn/
YouTube: https://youtube.com/@faangtalk
Telegram channel: https://t.me/crack_code_interview
Telegram chat: https://t.me/faangtalk
Links mentioned in the video:
https://simplify.jobs/
https://resumeworded.com/resume-scanner
https://www.tryexponent.com/
https://www.pramp.com/
***
Book a career consultation (resume, LinkedIn, career strategy, job search in the US): https://annanaumova.com
Coaching (impostor syndrome, procrastination, self-doubt, fears, laziness): https://annanaumova.notion.site/3f6ea5ce89694c93afb1156df3c903ab
Online course "The Perfect Resume and Job Search in the US": https://go.mbastrategy.com/resumecoursemain
Guide "The Perfect American Resume": https://go.mbastrategy.com/usresume
Guide "How to Set Up a LinkedIn Profile Recruiters Can't Pass By": https://go.mbastrategy.com/linkedinguide
My Telegram channel: https://t.me/prodcastUSA
My Instagram: https://www.instagram.com/prodcast.us/
Prodcast on social media and all podcast platforms: https://linktr.ee/prodcastUS
⏰ Timecodes ⏰
00:00 Start.
9:09 Visa sponsorship: what to say in an interview?
13:00 How were you laid off from your first job in the States?
19:11 How quickly did you start looking for work after the layoff?
22:12 How and where did you apply?
25:50 How did you adapt your resume?
44:06 How did calls with recruiters go?
50:48 How did recruiters react to your visa status?
55:57 How did the technical interviews go (Leetcode)?
1:04:51 How many offers did you get? How did you negotiate?
1:09:14 How and why was the offer rescinded?
1:14:12 A new O1 visa and winning the green card
1:23:09 How did you look for a job from Bishkek (Kyrgyzstan)?
1:29:13 What are your plans for the future?
1:31:22 What would you wish to those now looking for work in the US?
We return from the Wilds and the plague to bring you an all new episode! We catch up on the games we've finished, including Avowed, Split Fiction, and almost Kingdom Come Deliverance 2. We praise Kojima, become very interested in Silent Hill, realize we're old as Chrono Trigger celebrates 30 years, and become vulnerable with an AI voice!
0:00 - Intro
1:02 - Laundry
8:30 - Tub grub investments
12:00 - Finishing games
13:50 - Avowed
34:00 - Death Stranding 2
41:40 - Silent Hill f
45:30 - Split Fiction
1:05:00 - Chrono Trigger turns 30
1:08:00 - Assassin's Creed Shadows
1:16:00 - Clair Obscur: Expedition 33
1:28:00 - R.E.P.O.
1:41:30 - Pirate Yakuza
1:54:00 - Monster Hunter Wilds
2:08:00 - Steam Next Fest demos
2:23:00 - Core Keeper
2:26:20 - Twitch partners with StreamElements
2:33:00 - Maya the AI
2:40:00 - Twitch Mobile app changes
2:50:40 - Shoutouts
See omnystudio.com/listener for privacy information.
Join Simtheory: https://simtheory.ai
----
CHAPTERS:
00:00 - Gemini Flash 2.0 Experimental Native Image Generation & Editing
27:55 - Thoughts on OpenAI's "New tools for building agents" announcement
43:31 - Why is everyone talking about MCP all of a sudden?
56:31 - Manus AI: Will Manus Invade the USA and Defeat it With Powerful AGI? (jokes)
----
Thanks for all of your support and listening!
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today, we're joined by Niklas Muennighoff, a PhD student at Stanford University, to discuss his paper, “S1: Simple Test-Time Scaling.” We explore the motivations behind S1, as well as how it compares to OpenAI's O1 and DeepSeek's R1 models. We dig into the different approaches to test-time scaling, including parallel and sequential scaling, as well as S1's data curation process, its training recipe, and its use of model distillation from Google Gemini and DeepSeek R1. We explore the novel "budget forcing" technique developed in the paper, allowing it to think longer for harder problems and optimize test-time compute for better performance. Additionally, we cover the evaluation benchmarks used, the comparison between supervised fine-tuning and reinforcement learning, and similar projects like the Hugging Face Open R1 project. Finally, we discuss the open-sourcing of S1 and its future directions. The complete show notes for this episode can be found at https://twimlai.com/go/721.
The AI Breakdown: Daily Artificial Intelligence News and Discussions
OpenAI has officially launched GPT-4.5, but it's not the model most people expected. While it lags behind reasoning-focused models like O1 and DeepSeek, it shines in creativity, writing, and emotional intelligence. Sam Altman calls it the first model that “feels like talking to a thoughtful person.” But with high API costs and limited reasoning improvements, who is GPT-4.5 actually for? Before that in the headlines, AI is growing faster than SaaS ever did.
Brought to you by:
KPMG – Go to www.kpmg.us/ai to learn more about how KPMG can help you drive value with our AI solutions.
Vanta - Simplify compliance - https://vanta.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Join Simtheory to try GPT-4.5: https://simtheory.ai
Dis Track: https://simulationtheory.ai/5714654f-0fbe-496f-8428-20018457c4c7
===
CHAPTERS:
00:00 - Reaction to GPT4.5 Live Stream + Release
12:45 - Claude 3.7 Sonnet Release: Reactions and First Week Impressions
45:58 - Claude 3.7 Sonnet Dis Track Test
56:10 - Claude Code First Impressions + Future Agent Workflows
1:15:45 - Chris's Veo2 Film Clip
1:24:49 - Alexa+ AI Assistant
1:34:05 - Claude 3.7 Sonnet BOOM FACTOR
Today's episode is with Paul Klein, founder of Browserbase. We talked about building browser infrastructure for AI agents, the future of agent authentication, and their open source framework Stagehand.

* [00:00:00] Introductions
* [00:04:46] AI-specific challenges in browser infrastructure
* [00:07:05] Multimodality in AI-Powered Browsing
* [00:12:26] Running headless browsers at scale
* [00:18:46] Geolocation when proxying
* [00:21:25] CAPTCHAs and Agent Auth
* [00:28:21] Building “User take over” functionality
* [00:33:43] Stagehand: AI web browsing framework
* [00:38:58] OpenAI's Operator and computer use agents
* [00:44:44] Surprising use cases of Browserbase
* [00:47:18] Future of browser automation and market competition
* [00:53:11] Being a solo founder

Transcript

Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.

swyx [00:00:12]: Hey, and today we are very blessed to have our friend, Paul Klein the Fourth, the fourth, CEO of Browserbase. Welcome.

Paul [00:00:21]: Thanks guys. Yeah, I'm happy to be here. I've been lucky to know both of you for like a couple of years now, I think. So it's just like we're hanging out, you know, with three ginormous microphones in front of our face. It's a totally normal hangout.

swyx [00:00:34]: Yeah. We've actually mentioned you on the podcast, I think, more often than any other Solaris tenant. Just because like you're one of the, you know, best performing, I think, LLM tool companies that have started up in the last couple of years.

Paul [00:00:50]: Yeah, I mean, it's been a whirlwind of a year, like Browserbase is actually pretty close to our first birthday. So we are one year old. And going from, you know, starting a company as a solo founder to...
you know, having a team of 20 people, you know, a Series A, but also being able to support hundreds of AI companies that are building AI applications that go out and automate the web. It's just been like, really cool. It's been happening a little too fast. I think like collectively as an AI industry, let's just take a week off together. I took my first vacation actually two weeks ago, and Operator came out on the first day, and then a week later, DeepSeek came out. And I'm like on vacation trying to chill. I'm like, we got to build with this stuff, right? So it's been a breakneck year. But I'm super happy to be here and like talk more about all the stuff we're seeing. And I'd love to hear kind of what you guys are excited about too, and share that, you know?

swyx [00:01:39]: Where to start? So people, you've done a bunch of podcasts. I think I strongly recommend Jack Bridger's Scaling DevTools, as well as Turner Novak's The Peel. And, you know, I'm sure there's others. So you covered your Twilio story in the past, talked about StreamClub, you got acquired by Mux, and then you left to start Browserbase. So maybe we just start with what is Browserbase? Yeah.

Paul [00:02:02]: Browserbase is the web browser for your AI. We're building headless browser infrastructure, which are browsers that run in a server environment that's accessible to developers via APIs and SDKs. It's really hard to run a web browser in the cloud. You guys are probably running Chrome on your computers, and that's using a lot of resources, right? So if you want to run a web browser or thousands of web browsers, you can't just spin up a bunch of lambdas. You actually need to use a secure containerized environment. You have to scale it up and down. It's a stateful system. And that infrastructure is, like, super painful. And I know that firsthand, because at my last company, StreamClub, I was CTO, and I was building our own internal headless browser infrastructure.
That's actually why we sold the company, is because Mux really wanted to buy our headless browser infrastructure that we'd built. And it's just a super hard problem. And I actually told my co-founders, I would never start another company unless it was a browser infrastructure company. And it turns out that's really necessary in the age of AI, when AI can actually go out and interact with websites, click on buttons, fill in forms. You need AI to do all of that work in an actual browser running somewhere on a server. And Browserbase powers that.

swyx [00:03:08]: While you're talking about it, it occurred to me, not that you're going to be acquired or anything, but it occurred to me that it would be really funny if you became the Nikita Bier of headless browser companies. You just have one trick, and you make browser companies that get acquired.

Paul [00:03:23]: I truly do only have one trick. I'm screwed if it's not for headless browsers. I'm not a Go programmer. You know, I'm in AI Grant. You know, Browserbase is in AI Grant. But we were the only company in that AI Grant batch that used zero dollars on AI spend. You know, we're purely an infrastructure company. So as much as people want to ask me about reinforcement learning, I might not be the best guy to talk about that. But if you want to ask about headless browser infrastructure at scale, I can talk your ear off. So that's really my area of expertise. And it's a pretty niche thing. Like, nobody has done what we're doing at scale before. So we're happy to be the experts.

swyx [00:03:59]: You do have an AI thing, Stagehand. We can talk about the sort of core of Browserbase first, and then maybe Stagehand. Yeah, Stagehand is kind of the web browsing framework. Yeah.

What is Browserbase? Headless Browser Infrastructure Explained

Alessio [00:04:10]: Yeah. Yeah. And maybe how you got to Browserbase and what problems you saw. So one of the first things I worked on as a software engineer was integration testing.
Sauce Labs was kind of like the main thing at the time. And then we had Selenium, we had Playwright, we had all these different browser things. But it's always been super hard to do. So obviously you've worked on this before. When you started Browserbase, what were the challenges? What were the AI-specific challenges that you saw versus, there's kind of like all the usual running browser at scale in the cloud, which has been a problem for years. What are like the AI unique things that you saw that like traditional approaches just didn't cover? Yeah.

AI-specific challenges in browser infrastructure

Paul [00:04:46]: First and foremost, I think back to like the first thing I did as a developer, like as a kid when I was writing code, I wanted to write code that did stuff for me. You know, I wanted to write code to automate my life. And I'd do that probably by using curl or Beautiful Soup to fetch data from a website. And I think I still do that now that I'm in the cloud. And the other thing that I think is a huge challenge for me is that you can't just fetch a website and parse that data. And we all know that now like, you know, taking HTML and plugging that into an LLM, you can extract insights, you can summarize. So it was very clear that now like dynamic web scraping became very possible with the rise of large language models, or a lot easier. And that was like a clear reason why there's been more usage of headless browsers, which are necessary because a lot of modern websites don't expose all of their page content via a simple HTTP request. You know, they actually do require you to run JavaScript on the page to hydrate this. Airbnb is a great example. You go to airbnb.com. A lot of that content on the page isn't there until after they run the initial hydration. So you can't just scrape it with a curl. You need to have some JavaScript run.
And a browser is that JavaScript engine that's going to actually run all those requests on the page. So web data retrieval was definitely one driver of starting Browserbase, and the rise of being able to summarize that with an LLM. Also, I was familiar with, if I wanted to automate a website, I could write one script and that would work for one website. It was very static and deterministic. But the web is non-deterministic. The web is always changing. And until we had LLMs, there was no way to write scripts that you could write once that would run on any website, that would change with the structure of the website. Click the login button. It could mean something different on many different websites. And LLMs allow us to generate code on the fly to actually control that. So I think that rise of writing the generic automation scripts that can work on many different websites, to me, made it clear that browsers are going to be a lot more useful because now you can automate a lot more things without writing. If you wanted to write a script to book a demo call on 100 websites, previously, you had to write 100 scripts. Now you write one script that uses LLMs to generate that script. That's why we built our web browsing framework, Stagehand, which does a lot of that work for you. But those two things, web data collection and then enhanced automation of many different websites, it just felt like big drivers for more browser infrastructure that would be required to power these kinds of features.

Alessio [00:07:05]: And was multimodality also a big thing?

Paul [00:07:08]: Now you can use the LLMs to look, even though the text in the DOM might not be as friendly. Maybe my hot take is I was always kind of like, I didn't think vision would be as big of a driver. For UI automation, I felt like, you know, HTML is structured text and large language models are good with structured text.
But it's clear that these computer use models are often vision driven, and they've been really pushing things forward. So definitely being multimodal, like rendering the page is required to take a screenshot to give that to a computer use model to take actions on a website. And it's just another win for browsers. But I'll be honest, that wasn't what I was thinking early on. I didn't even think that we'd get here so fast with multimodality. I think we're going to have to get back to multimodal and vision models.

swyx [00:07:50]: This is one of those things where I forgot to mention in my intro that I'm an investor in Browserbase. And I remember that when you pitched to me, like a lot of the stuff that we have today, we like wasn't in the original conversation. But I did have my original thesis was something that we've talked about on the podcast before, which is take the GPT store, the custom GPT store, all the every single checkbox and plugin is effectively a startup. And this was the browser one. I think the main hesitation, I think I actually took a while to get back to you. The main hesitation was that there were others. Like you're not the first headless browser startup. It's not even your first headless browser startup. There's always a question of like, will you be the category winner in a place where there's a bunch of incumbents, to be honest, that are bigger than you? They're just not targeted at the AI space. They don't have the backing of Nat Friedman. And there's a bunch of like, you're here in Silicon Valley. They're not. I don't know.

Paul [00:08:47]: I don't know if that's, that was it, but like, there was a, yeah, I mean, like, I think I tried all the other ones and I was like, really disappointed. Like my background is from working at great developer tools companies, and nothing had like the Vercel-like experience. Um, like our biggest competitor actually is partly owned by private equity and they just jacked up their prices quite a bit.
And the dashboard hasn't changed in five years. And I actually used them at my last company and tried them and I was like, oh man, like there really just needs to be something that's like the experience of these great infrastructure companies, like Stripe, like Clerk, like Vercel, that I use and love, but oriented towards this kind of like more specific category, which is browser infrastructure, which is really technically complex. Like a lot of stuff can go wrong on the internet when you're running a browser. The internet is very vast. There's a lot of different configurations. Like there's still websites that only work with Internet Explorer out there. How do you handle that when you're running your own browser infrastructure? These are the problems that we have to think about and solve at Browserbase. And it's, it's certainly a labor of love, but I built this for me, first and foremost. I know it's super cheesy and everyone says that for like their startups, but it really, truly was for me. If you look at like the talks I've done even before Browserbase, and I'm just like really excited to try and build a category defining infrastructure company. And it's, it's rare to have a new category of infrastructure exist. We're here in the Chroma offices and like, you know, vector databases is a new category of infrastructure. Is it, is it, I mean, we can, we're in their office, so, you know, we can, we can debate that one later. That is one.

Multimodality in AI-Powered Browsing

swyx [00:10:16]: That's one of the industry debates.

Paul [00:10:17]: I guess we go back to the LLM OS talk that Karpathy gave way long ago. And like the browser box was very clearly there and it seemed like the people who were building in this space also agreed that browsers are a core primitive of infrastructure for the LLM OS that's going to exist in the future. And nobody was building something there that I wanted to use. So I had to go build it myself.

swyx [00:10:38]: Yeah.
I mean, exactly that talk, that, honestly, that diagram, every box is a startup, and there's the code box and then there's the browser box. I think at some point they will start clashing there. There's always the question of, are you a point solution or are you the sort of all-in-one? And I think the point solutions tend to win quickly, but then the all-in-ones have a very tight cohesive experience. Yeah. Let's talk about just the hard problems of Browserbase. You have on your website, which is beautiful. Thank you. Was there an agency that you used for that? Yeah. Herve.paris.

Paul [00:11:11]: They're amazing. Herve.paris. Yeah. It's H-E-R-V-E. I highly recommend for developer tools founders to work with consumer agencies, because they end up building beautiful things and the Parisians know how to build beautiful interfaces. So I got to give props.

swyx [00:11:24]: And chat apps, apparently, they are very fast. Oh yeah. The Mistral chat. Yeah. Mistral. Yeah.

Paul [00:11:31]: Le Chat.

swyx [00:11:31]: Le Chat. And then your videos as well, it was professionally shot, right? The Series A video. Yeah.

Alessio [00:11:36]: Nico did the videos. He's amazing. Not the initial video that you shot, the new one. First one was Austin.

Paul [00:11:41]: Another, another video pretty surprised. But yeah, I mean, like, I think when you think about how you talk about your company, you have to think about the way you present yourself. It's, you know, as a developer, you think you evaluate a company based on like the API reliability and the P95, but a lot of developers say, is the website good? Is the message clear? Do I like trust this founder I'm building my whole feature on? So I've tried to nail that as well as like the reliability of the infrastructure. You're right. It's very hard. And there's a lot of kind of footguns that you run into when running headless browsers at scale.
Right.

Competing with Existing Headless Browser Solutions

swyx [00:12:10]: So let's pick one. You have eight features here. Seamless integration. Scalability. Fast or speed. Secure. Observable. Stealth. That's interesting. Extensible and developer first. What comes to your mind as like the top two, three hardest ones? Yeah.

Running headless browsers at scale

Paul [00:12:26]: I think just running headless browsers at scale is like the hardest one. And maybe can I nerd out for a second? Is that okay? I heard this is a technical audience, so I'll talk to the other nerds. Whoa. They were listening. Yeah. They're upset. They're ready. The AGI is angry. Okay. So how do you run a browser in the cloud? Let's start with that, right? So let's say you're using a popular browser automation framework like Puppeteer, Playwright, or Selenium. Maybe you've written some code locally on your computer that opens up Google. It finds the search bar and then types in, you know, search for Latent Space and hits the search button. That script works great locally. You can see the little browser open up. You want to take that to production. You want to run the script in a cloud environment, so when your laptop is closed, your browser is still doing something. Well, we use Amazon. You know, the first thing I'd reach for is probably like some sort of serverless infrastructure. I would probably try and deploy on a Lambda. But Chrome itself is too big to run on a Lambda. It's over 250 megabytes. So you can't easily start it on a Lambda. So you maybe have to use something like Lambda layers to squeeze it in there. Maybe use a different Chromium build that's lighter. And you get it on the Lambda. Great. It works. But it runs super slowly. It's because Lambdas are very like resource limited. They only run like with one vCPU. You can run one process at a time. Remember, Chromium is super beefy.
It's barely running on my MacBook Air. I'm still downloading it from a pre-run. Yeah, from the test earlier, right? I'm joking. But it's big, you know? So like Lambda, it just won't work really well. Maybe it'll work, but you need something faster. Your users want something faster. Okay. Well, let's put it on a beefier instance. Let's get an EC2 server running. Let's throw Chromium on there. Great. Okay, that works well with one user. But what if I want to run like 10 Chromium instances, one for each of my users? Okay. Well, I might need two EC2 instances. Maybe 10. All of a sudden, you have multiple EC2 instances. This sounds like a problem for Kubernetes and Docker, right? Now, all of a sudden, you're using ECS or EKS, the Kubernetes or container solutions by Amazon. You're spinning up and down containers, and you're spending a whole engineer's time on kind of maintaining this stateful distributed system. Those are some of the worst systems to run, because when it's a stateful distributed system, it means that you are bound by the connections to that thing. You have to keep the browser open while someone is working with it, right? That's just a painful architecture to run. And there's all these other little gotchas with Chromium, which is the open source version of Chrome, by the way. You have to install all these fonts. You want emojis working in your browsers because your vision model is looking for the emoji. You need to make sure you have the emoji fonts. You need to make sure you have all the right extensions configured, like, oh, do you want ad blocking? How do you configure that? How do you actually record all these browser sessions? Like it's a headless browser. You can't look at it. So you need to have some sort of observability. Maybe you're recording videos and storing those somewhere.
It all kind of adds up to be this just giant monster piece of your project when all you wanted to do was run a lot of browsers in production for this little script to go to google.com and search. And when I see a complex distributed system, I see an opportunity to build a great infrastructure company. And we really abstract that away with Browserbase, where our customers can use these existing frameworks, Playwright, Puppeteer, Selenium, or our own Stagehand, and connect to our browsers in a serverless-like way. And control them, and then just disconnect when they're done. And they don't have to think about the complex distributed system behind all of that. They just get a browser running anywhere, anytime. Really easy to connect to.

swyx [00:15:55]: I'm sure you have questions. My standard question with anything, so essentially you're a serverless browser company, and there's been other serverless things that I'm familiar with in the past, serverless GPUs, serverless website hosting. That's where I come from with Netlify. One question is just like, you promised to spin up thousands of browsers in milliseconds. I feel like there's no real solution that does that yet. And I'm just kind of curious how. The only solution I know is to kind of keep a kind of warm pool of servers around, which is expensive, but maybe not so expensive because it's just CPUs. So I'm just like, you know. Yeah.

Browsers as a Core Primitive in AI Infrastructure

Paul [00:16:36]: You nailed it, right? I mean, how do you offer a serverless-like experience with something that is clearly not serverless, right? And the answer is, you need to be able to run... We run many browsers on single nodes. We use Kubernetes at Browserbase. So we have many pods that are being scheduled. We have to predictably schedule them up or down. Yes, thousands of browsers in milliseconds is the best case scenario.
If you hit us with 10,000 requests, you may hit a slower cold start, right? So we've done a lot of work on predictive scaling and being able to kind of route stuff to different regions, where we have multiple regions of Browserbase where we have different pools available. You can also pick the region you want to go to based on lower round-trip latency. It's very important with these types of things. There's a lot of requests going over the wire. So for us, like having a VM like Firecracker powering everything under the hood allows us to be super nimble and spin things up or down really quickly with strong multi-tenancy. But in the end, this is like the complex infrastructural challenges that we have to kind of deal with at Browserbase. And we have a lot more stuff on our roadmap to allow customers to have more levers to pull to trade off, do you want really fast browser startup times or do you want really low costs? And if you're willing to be more flexible on that, we may be able to kind of like work better for your use cases.

swyx [00:17:44]: Since you used Firecracker, shouldn't Fargate do that for you or did you have to go lower level than that? We had to go lower level than that.

Paul [00:17:51]: I find this a lot with Fargate customers, which is alarming for Fargate. We used to be a giant Fargate customer. Actually, the first version of Browserbase was ECS and Fargate. I think we were actually the largest Fargate customer in our region for a little while. No, what? Yeah, seriously. And unfortunately, it's a great product, but I think if you're an infrastructure company, you actually have to have a deeper level of control over these primitives. I think the same thing is true with databases. We've used other database providers and I think-

swyx [00:18:21]: Yeah, serverless Postgres.

Paul [00:18:23]: Shocker. When you're an infrastructure company, you're on the hook if any provider has an outage.
And I can't tell my customers like, hey, we went down because so-and-so went down. That's not acceptable. So for us, we've really moved to bringing things internally. It's kind of opposite of what we preach. We tell our customers, don't build this in-house, but then we're like, we build a lot of stuff in-house. But I think it just really depends on what is in the critical path. We try and have deep ownership of that.

Alessio [00:18:46]: On the distributed location side, how does that work for the web where you might get sort of different content in different locations, but the customer is expecting, you know, if you're in the US, I'm expecting the US version. But if you're spinning up my browser in France, I might get the French version. Yeah.

Paul [00:19:02]: Yeah. That's a good question. Well, generally, like on the localization, there is a thing called locale in the browser. You can set what your locale is, like whether you're in an en-US browser or not. But some things do IP-based routing. And in that case, you may want to have a proxy. Like let's say you're running something in Europe, but you want to make sure you're showing up from the US. You may want to use one of our proxy features, so you can turn on proxies to say like, make sure these connections always come from the United States. Which is necessary too, because when you're browsing the web, you're coming from like a, you know, data center IP, and that can make things a lot harder to browse the web. So we do have kind of like this proxy super network. Yeah. We have a proxy for you based on where you're going, so you can reliably automate the web. But if you get scheduled in Europe, that doesn't happen as much. We try and schedule you as close to, you know, your origin that you're trying to go to. But generally you have control over the regions you can put your browsers in. So you can specify West one or East one or Europe. We only have one region of Europe right now, actually.
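The locale-versus-proxy distinction above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the region names, the region-to-country table, and `plan_session` are hypothetical and are not Browserbase's API. It also simplifies one point from the conversation, since Paul notes that a proxy can be worth using even within the same country because data center IPs are harder to browse from.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table: region name -> country its data center IPs appear to come from.
REGION_COUNTRY = {"us-west": "US", "us-east": "US", "europe": "DE"}


@dataclass(frozen=True)
class SessionPlan:
    region: str
    locale: str
    proxy_country: Optional[str]  # None means no proxy is requested


def plan_session(region: str, locale: str, appear_from: str) -> SessionPlan:
    """Decide browser settings so a site sees the expected language and country."""
    if region not in REGION_COUNTRY:
        raise ValueError(f"unknown region: {region}")
    # Locale is a browser setting; it covers sites that localize by language.
    # IP-based routing ignores locale, so request a proxy whenever the region's
    # country differs from the country we want to appear from.
    needs_proxy = REGION_COUNTRY[region] != appear_from
    return SessionPlan(region, locale, appear_from if needs_proxy else None)
```

For example, `plan_session("europe", "en-US", "US")` asks for a US proxy, while `plan_session("us-east", "en-US", "US")` does not.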
Yeah.

Alessio [00:19:55]: What's harder, the browser or the proxy? To me, it feels like actually proxying reliably at scale is much harder than spinning up browsers at scale. I'm curious. It's all hard.

Paul [00:20:06]: It's layers of hard, right? Yeah. I think it's different levels of hard. I think the thing with the proxy infrastructure is that we work with many different web proxy providers and some are better than others. Some have good days, some have bad days. And our customers who've built browser infrastructure on their own, they have to go and deal with sketchy actors. Like first they figure out their own browser infrastructure and then they got to go buy a proxy. And it's like you can pay in Bitcoin and it just kind of feels a little sus, right? It's like you're buying drugs when you're trying to get a proxy online. We have like deep relationships with these counterparties. We're able to audit them and say, is this proxy being sourced ethically? Like it's not running on someone's TV somewhere. Is it free range? Yeah. Free range organic proxies, right? Right. We do a level of diligence. We're SOC 2. So we have to understand what is going on here. But then we're able to make sure that like we route around proxy providers not working. There's proxy providers where the proxy will just stop working all of a sudden. And then if you don't have redundant proxying on your own browsers, that's hard down for you or you may get some serious impacts there. With us, like we intelligently know, hey, this proxy is not working. Let's go to this one. And you can kind of build a network of multiple providers to really guarantee the best uptime for our customers. Yeah. So you don't own any proxies? We don't own any proxies. You're right. The team has been saying who wants to like take home a little proxy server, but not yet. We're not there yet. You know?

swyx [00:21:25]: It's a very mature market. I don't think you should build that yourself.
Like you should just be a super customer of them. Yeah. Scraping, I think, is the main use case for that. I guess. Well, that leads us into CAPTCHAs and also auth, but let's talk about CAPTCHAs. You had a little spiel that you wanted to talk about CAPTCHA stuff.

Challenges of Scaling Browser Infrastructure

Paul [00:21:43]: Oh, yeah. I was just, I think a lot of people ask, if you're thinking about proxies, you're thinking about CAPTCHAs too. I think it's the same thing. You can go buy CAPTCHA solvers online, but it's the same buying experience. It's some sketchy website, you have to integrate it. It's not fun to buy these things, you can't really trust them, and the docs are bad. What Browserbase does is we integrate a bunch of different CAPTCHAs. We do some stuff in-house, but generally we just integrate with a bunch of known vendors and continually monitor and maintain these things and say, is this working or not? Can we route around it or not? These are CAPTCHA solvers. CAPTCHA solvers, yeah. Not CAPTCHA providers, CAPTCHA solvers. Yeah, sorry. CAPTCHA solvers. We really try and make sure all of that works for you. I think as a dev, if I'm buying infrastructure, I want it all to work all the time and it's important for us to provide that experience by making sure everything does work and monitoring it on our own. Yeah. Right now, the world of CAPTCHAs is tricky. I think AI agents in particular are very much ahead of the internet infrastructure. CAPTCHAs are designed to block all types of bots, but there are now good bots and bad bots. I think in the future, CAPTCHAs will be able to identify who a good bot is, hopefully via some sort of KYC. For us, we've been very lucky. We have very little to no known abuse of Browserbase because we really look into who we work with. And for certain types of CAPTCHA solving, we only allow them on certain types of plans because we want to make sure that we can know what people are doing, what their use cases are.
And that's really allowed us to try and be an arbiter of good bots, which is our long term goal. I want to build great relationships with people like Cloudflare so we can agree, hey, here are these acceptable bots. We'll identify them for you and make sure we flag when they come to your website. This is a good bot, you know?

Alessio [00:23:23]: I see. And Cloudflare said they want to do more of this. So they're going to set by default, if they think you're an AI bot, they're going to reject. I'm curious if you think this is something that is going to be at the browser level or, I mean, the DNS level with Cloudflare seems more where it should belong. But I'm curious how you think about it.

Paul [00:23:40]: I think the web's going to change. You know, I think that the Internet as we have it right now is going to change. And we all need to just accept that the cat is out of the bag. And instead of kind of like wishing the Internet was like it was in the 2000s, when we could have free content online that wouldn't be scraped. It's just, it's not going to happen. And instead, we should think about like, one, how can we change the models of, you know, information being published online so people can adequately commercialize it? But two, how do we rebuild applications that expect that AI agents are going to log in on their behalf? Those are the things that are going to allow us to kind of like identify good and bad bots. And I think the team at Clerk has been doing a really good job with this on the authentication side. I actually think that auth is the biggest thing that will prevent agents from accessing stuff, not CAPTCHAs. And I think there will be agent auth in the future.
I don't know if it's going to happen from an individual company, but actually authentication providers that have a, you know, hidden login-as-agent feature, where you put in your email and you'll get a push notification saying like, hey, your Browserbase agent wants to log into your Airbnb. You can approve that and then the agent can proceed. That really circumvents the need for CAPTCHAs or logging in as you and sharing your password. I think agent auth is going to be one way we identify good bots going forward. And I think a lot of this CAPTCHA solving stuff is really short-term problems as the internet kind of reorients itself around how it's going to work with agents browsing the web, just like people do. Yeah.

Managing Distributed Browser Locations and Proxies

swyx [00:24:59]: Stytch recently was on Hacker News for talking about agent experience, AX, which is a thing that Netlify is also trying to clone and coin and talk about. And we've talked about this on our previous episodes before in a sense that I actually think that's like maybe the only part of the tech stack that needs to be kind of reinvented for agents. Everything else can stay the same, CLIs, APIs, whatever. But auth, yeah, we need agent auth. And it's mostly like short-lived, like it should be a distinct identity from the human, but paired. I almost think like in the same way that every social network should have your main profile and then your alt accounts or your Finsta, it's almost like, you know, every human token should be paired with the agent token and the agent token can go and do stuff on behalf of the human token, but not be presumed to be the human. Yeah.

Paul [00:25:48]: It's like, it's actually very similar to OAuth is what I'm thinking. And, you know, Reed from Stytch is an investor, Colin from Clerk, Okta Ventures, all investors in Browserbase because like, I hope they solve this because they'll make Browserbase's mission more possible.
So we don't have to overcome all these hurdles, but I think it will be an OAuth-like flow where an agent will ask to log in as you, you'll approve the scopes. Like it can book an apartment on Airbnb, but it can't like message anybody. And then, you know, the agent will have some sort of like role-based access control within an application. Yeah. I'm excited for that.

swyx [00:26:16]: The tricky part is just, there's one layer of delegation here, which is like, you're authing my user's user or something like that. I don't know if that's tricky or not. Does that make sense? Yeah.

Paul [00:26:25]: You know, actually at Twilio, I worked on the login, identity, and access management teams, right? So like I built Twilio's login page.

swyx [00:26:31]: You were an intern on that team and then you became the lead in two years? Yeah.

Paul [00:26:34]: Yeah. I started as an intern in 2016 and then I was the tech lead of that team. How? That's not normal. I didn't have a life. He's not normal. Look at this guy. I didn't have a girlfriend. I just loved my job. I don't know. I applied to 500 internships for my first job and I got rejected from every single one of them except for Twilio and then eventually Amazon. And they took a shot on me and like, I was getting paid money to write code, which was my dream. Yeah. Yeah. I'm very lucky that like this coding thing worked out because I was going to be doing it regardless. And yeah, I was able to kind of spend a lot of time on a team that was growing at a company that was growing. So it informed a lot of this stuff here. I think these are problems that have been solved with like the SAML protocol with SSO. I think there's really interesting stuff with like WebAuthn, like these different types of authentication schemes that you can use to authenticate people. The tooling is all there. It just needs to be tweaked a little bit to work for agents. And I think the fact that there are companies that are already providing authentication as a service really sets it up well. The thing that's hard is like reinventing the internet for agents. We don't want to rebuild the internet. That's an impossible task. And I think people often say like, well, we'll have this second layer of APIs built for agents. I'm like, we will for the top use cases, but instead we can just tweak the internet as is, which is on the authentication side. I think we're going to be the dumb ones going forward. Unfortunately, I think AI is going to be able to do a lot of the tasks that we do online, which means that it will be able to go to websites, click buttons on our behalf and log in on our behalf too. So with this kind of like web agent future happening, I think with some small structural changes, like you said, it feels like it could all slot in really nicely with the existing internet.

Handling CAPTCHAs and Agent Authentication

swyx [00:28:08]: There's one more thing, which is your live view iframe, which lets you take control. Yeah. Obviously very key for Operator now, but like, is there anything interesting technically there? Well, people always want this.

Paul [00:28:21]: It was really hard to build, you know, like, so, okay. Headless browsers, you don't see them, right. They're running in a cloud somewhere. You can't like look at them. And I just want to really make, it's a weird name. I wish we came up with a better name for this thing, but you can't see them. Right. But customers don't trust AI agents, right. At least the first pass. So what we do with our live view is that, you know, when you use Browserbase, you can actually embed a live view of the browser running in the cloud for your customer to see it working. And the first reason is to build trust, like, okay, so I have this script that's going to go automate a website. I can embed it into my web application via an iframe and my customer can watch. I think.
And then we added two-way communication. So now not only can you watch the browser kind of being operated by AI, if you want to pause and actually click around and type within this iframe that's controlling a browser, that's also possible. And this is all thanks to some of the lower level protocol, which is called the Chrome DevTools Protocol. It has an API called startScreencast, and you can also send mouse clicks and button clicks to a remote browser. And this is all embeddable within iframes. You have a browser within a browser, yo. And then you simulate the screen, the click on the other side. Exactly. And this is really nice often for, like, let's say, a CAPTCHA that can't be solved. You saw this with Operator, you know, Operator actually uses a different approach. They use VNC. So, you know, you're able to see, like, you're seeing the whole window here. What we're doing is something a little lower level with the Chrome DevTools Protocol. It's just PNGs being streamed over the wire. But the same thing is true, right? Like, hey, I'm running a window. Pause. Can you do something in this window? Human. Okay, great. Resume. Like sometimes 2FA tokens. Like if you get that text message, you might need a person to type that in. Web agents need human-in-the-loop type workflows still. You still need a person to interact with the browser. And building a UI to proxy that is kind of hard. You may as well just show them the whole browser and say, hey, can you finish this up for me? And then let the AI proceed on afterwards. Is there a future where I stream my current desktop to Browserbase? I don't think so. I think we're very much cloud infrastructure. Yeah. You know, but I think a lot of the stuff we're doing, we do want to, like, build tools. Like, you know, we'll talk about the Stagehand, you know, web agent framework in a second. But, like, there's a case where a lot of people are going desktop first for, you know, consumer use.
And I think Claude is doing a lot of this, where I expect to see, you know, MCPs really oriented around the Claude desktop app for a reason, right? Like, I think a lot of these tools are going to run on your computer because it makes... I think it's breaking out. People are putting it on a server. Oh, really? Okay. Well, sweet. We'll see. We'll see that. I was surprised, though, wasn't I? I think that The Browser Company, too, with Dia Browser, it runs on your machine. You know, it's going to be...

swyx [00:30:50]: What is it?

Paul [00:30:51]: So, Dia Browser, as far as I understand... I used to use Arc. Yeah. I haven't used Arc. But I'm a big fan of The Browser Company. I think they're doing a lot of cool stuff in consumer. As far as I understand, it's a browser where you have a sidebar where you can, like, chat with it and it can control the local browser on your machine. So, if you imagine, like, what a consumer web agent is, which lives alongside your browser, I think Google Chrome has Project Mariner, I think. I almost call it Project Marinara for some reason. I don't know why. It's...

swyx [00:31:17]: No, I think it's someone really likes the Waterworld. Oh, I see. The classic Kevin Costner. Yeah.

Paul [00:31:22]: Okay. Project Marinara is a similar thing to the Dia Browser, in my mind, as far as I understand it. You have a browser that has an AI interface that will take over your mouse and keyboard and control the browser for you. Great for consumer use cases. But if you're building applications that rely on a browser and it's more part of a greater, like, AI app experience, you probably need something that's more like infrastructure, not a consumer app.

swyx [00:31:44]: Just because I have explored a little bit in this area, do people want branching? So, I have the state of whatever my browser's in. And then I want, like, 100 clones of this state. Do people do that? Or...

Paul [00:31:56]: People don't do it currently. Yeah.
But it's definitely something we're thinking about. I think the idea of forking a browser is really cool. Technically, kind of hard. We're starting to see this in code execution, where people are, like, forking some, like, code execution processes or forking some tool calls or branching tool calls. Haven't seen it at the browser level yet. But it makes sense. Like, if an AI agent is, like, using a website and it's not sure what path it wants to take to crawl this website to find the information it's looking for, it would make sense for it to explore both paths in parallel. And that'd be a very, like... A road not taken. Yeah. And hopefully find the right answer. And then say, okay, this was actually the right one. And memorize that. And go there in the future. On the roadmap. For sure. Don't make my roadmap, please. You know?

Alessio [00:32:37]: How do you actually do that? Yeah. How do you fork? I feel like the browser is so stateful for so many things.

swyx [00:32:42]: Serialize the state. Restore the state. I don't know.

Paul [00:32:44]: So, it's one of the reasons why we haven't done it yet. It's hard. You know? Like, to truly fork, it's actually quite difficult. The naive way is to open the same page in a new tab and then, like, hope that it's at the same thing. But if you have a form halfway filled, you may have to, like, take the whole, you know, container. Pause it. All the memory. Duplicate it. Restart it from there. It could be very slow. So, we haven't found a thing. Like, the easy thing to fork is just, like, copy the page object. You know? But I think there needs to be something a little bit more robust there. Yeah.

swyx [00:33:12]: So, MorphLabs has this infinite branch thing. Like, wrote a custom fork of Linux or something that let them save the system state and clone it. MorphLabs, hit me up. I'll be a customer. Yeah. That's the only. I think that's the only way to do it. Yeah. Like, unless Chrome has some special API for you.
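The gap between the naive fork and a full snapshot that Paul describes can be shown with a toy Python model. This is a sketch under loud assumptions: `PageState`, `naive_fork`, and `snapshot_fork` are hypothetical names, and a real browser holds far more state than three fields. The deep copy merely stands in for pausing and duplicating a whole container.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class PageState:
    """Toy stand-in for a browser page; real pages hold much more state."""
    url: str
    cookies: dict = field(default_factory=dict)
    form_values: dict = field(default_factory=dict)  # half-typed, in-memory input


def naive_fork(state: PageState) -> PageState:
    """Reopen the same URL in a 'new tab': cookies carry over, typed input is lost."""
    return PageState(url=state.url, cookies=dict(state.cookies))


def snapshot_fork(state: PageState) -> PageState:
    """Duplicate everything, standing in for copying the whole paused container."""
    return copy.deepcopy(state)
```

With a half-filled form, `naive_fork` returns a page whose `form_values` is empty, while `snapshot_fork` keeps the typed values and leaves the original untouched when the copy is modified. The cost Paul mentions (pausing and duplicating all memory) is exactly what the deep copy hides in this toy version.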
Yeah.

Paul [00:33:29]: There's probably something we'll reverse engineer one day. I don't know. Yeah.

Alessio [00:33:32]: Let's talk about Stagehand, the AI web browsing framework. You have three core components, Observe, Extract, and Act. Pretty clean landing page. What was the idea behind making a framework? Yeah.

Stagehand: AI web browsing framework

Paul [00:33:43]: So, there's three frameworks that are very popular or already exist, right? Puppeteer, Playwright, Selenium. Those are for building hard-coded scripts to control websites. And as soon as I started to play with LLMs plus browsing, I caught myself, you know, code-genning Playwright code to control a website. I would, like, take the DOM. I'd pass it to an LLM. I'd say, can you generate the Playwright code to click the appropriate button here? And it would do that. And I was like, this really should be part of the frameworks themselves. And I became really obsessed with SDKs that take natural language as part of, like, the API input. And that's what Stagehand is. Stagehand exposes three APIs, and it's a superset of Playwright. So, if you go to a page, you may want to take an action, click on the button, fill in the form, etc. That's what the act command is for. You may want to extract some data. This one takes natural language, like, extract the winner of the Super Bowl from this page. You can give it a Zod schema, so it returns a structured output. And then maybe you're building an agent loop, and you want to kind of see what actions are possible on this page before taking one. You can do observe. So, you can observe the actions on the page, and it will generate a list of actions. You can guide it, like, give me actions on this page related to buying an item. And you can get, like, buy it now, add to cart, view shipping options, and pass that to an LLM, an agent loop, to say, what's the appropriate action given this high-level goal? So, Stagehand isn't a web agent.
It's a framework for building web agents. And we think that agent loops are actually pretty close to the application layer because every application probably has different goals or different ways it wants to take steps. I don't think I've seen a generic. Maybe you guys are the experts here. I haven't seen, like, a really good AI agent framework here. Everyone kind of has their own special sauce, right? I see a lot of developers building their own agent loops, and they're using tools. And I view Stagehand as the browser tool. So, we expose act, extract, observe. Your agent can call these tools. And from that, you don't have to worry about it. You don't have to worry about generating Playwright code performantly. You don't have to worry about running it. You can kind of just integrate these three tool calls into your agent loop and reliably automate the web.

swyx [00:35:48]: A special shout-out to Anirudh, who I met at your dinner, who I think listens to the pod. Yeah. Hey, Anirudh.

Paul [00:35:54]: Anirudh's the man. He's a Stagehand guy.

swyx [00:35:56]: I mean, the interesting thing about each of these APIs is they're kind of each a startup. Like, specifically extract, you know, Firecrawl is extract. There's, like, Expand AI. There's a whole bunch of, like, extract companies. They just focus on extract. I'm curious. Like, I feel like you guys are going to collide at some point. Like, right now, it's friendly. Everyone's in a blue ocean. At some point, it's going to be valuable enough that there's some turf battle here. I don't think you have a dog in a fight. I think you can mock extract to use an external service if they're better at it than you. But it's just an observation that, like, in the same way that I see each option, each checkbox in the side of custom GPTs becoming a startup or each box in the Karpathy chart being a startup. Like, this is also becoming a thing.
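The agent loop Paul outlines (observe the possible actions, let a model choose one against a high-level goal, act, then extract to check progress) can be sketched in Python as follows. Everything here is a stand-in: `FakeBrowserTool`, `run_agent`, and `first_option` are hypothetical names, the canned actions replace real page inspection, and none of this is Stagehand's actual API. In a real loop, `first_option` would be an LLM call.

```python
from typing import Callable, Dict, List


class FakeBrowserTool:
    """Stand-in browser tool with observe/act/extract. Not Stagehand's real API."""

    def __init__(self) -> None:
        self.cart: List[str] = []
        self.log: List[str] = []

    def observe(self, instruction: str) -> List[str]:
        # A real tool would inspect the page; these candidate actions are canned.
        if self.cart:
            return ["view cart", "check out"]
        return ["add to cart", "view shipping options"]

    def act(self, action: str) -> None:
        self.log.append(action)
        if action == "add to cart":
            self.cart.append("item")

    def extract(self, instruction: str) -> Dict[str, int]:
        # A real tool would return structured data matching a caller-supplied schema.
        return {"items_in_cart": len(self.cart)}


def first_option(goal: str, options: List[str]) -> str:
    """Placeholder for an LLM call that ranks the options against the goal."""
    return options[0]


def run_agent(
    tool: FakeBrowserTool,
    goal: str,
    choose: Callable[[str, List[str]], str],
    max_steps: int = 5,
) -> Dict[str, int]:
    """Observe candidate actions, pick one, act, then check progress via extract."""
    state = tool.extract("how many items are in the cart?")
    for _ in range(max_steps):
        if state["items_in_cart"] > 0:
            break
        options = tool.observe(f"actions related to: {goal}")
        tool.act(choose(goal, options))
        state = tool.extract("how many items are in the cart?")
    return state
```

The loop itself stays in application code, which matches the point that agent loops sit close to the application layer while the browser tool only supplies the three calls.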
Yeah.

Paul [00:36:41]: I mean, like, so the way Stagehand works is that it's MIT-licensed, completely open source. You bring your own API key to your LLM of choice. You could choose your LLM. We don't make any money off of the extract or really. We only really make money if you choose to run it with our browser. You don't have to. You can actually use your own browser, a local browser. You know, Stagehand is completely open source for that reason. And, yeah, like, I think if you're building really complex web scraping workflows, I don't know if Stagehand is the tool for you. I think it's really more if you're building an AI agent that needs a few general tools or if it's doing a lot of, like, web automation-intensive work. But if you're building a scraping company, Stagehand is not your thing. You probably want something that's going to, like, get HTML content, you know, convert that to Markdown, query it. That's not what Stagehand does. Stagehand is more about reliability. I think we focus a lot on reliability and less so on cost optimization and speed at this point.

swyx [00:37:33]: I actually feel like Stagehand, so the way that Stagehand works, it's like, you know, page.act, click on the quick start. Yeah. It's kind of the integration test for the code that you would have to write anyway, like the Puppeteer code that you have to write anyway. And when the page structure changes, because it always does, then this is still the test. This is still the test that I would have to write. Yeah. So it's kind of like a testing framework that doesn't need implementation detail.

Paul [00:37:56]: Well, yeah. I mean, Puppeteer, Playwright, and Selenium were all designed as testing frameworks, right? Yeah. And now people are, like, hacking them together to automate the web. I would say, and, like, maybe this is, like, me being too specific. But, like, when I write tests, if the page structure changes without me knowing, I want that test to fail.
So I don't know if, like, AI, like, regenerating that. Like, people are using Stagehand for testing. But it's more for, like, usability testing, not, like, testing of, like, does the front end, like, has it changed or not. Okay. But generally where we've seen people, like, really, like, take off is, like, if they're using, you know, something. If they want to build a feature in their application that's kind of like Operator or Deep Research, they're using Stagehand to kind of power that tool calling in their own agent loop. Okay. Cool.

swyx [00:38:37]: So let's go into Operator, the first big agent launch of the year from OpenAI. Seems like they have a whole bunch scheduled. You were on break and your phone blew up. What's your just general view of computer use agents, is what they're calling it. The overall category before we go into Open Operator, just the overall promise of Operator. I will observe that I tried it once. It was okay. And I never tried it again.

OpenAI's Operator and computer use agents

Paul [00:38:58]: That tracks with my experience, too. Like, I'm a huge fan of the OpenAI team. Like, I do not view Operator as a company killer for Browserbase at all. I think it actually shows people what's possible. I think, like, computer use models make a lot of sense. And what I'm actually most excited about with computer use models is, like, their ability to, like, really take screenshots and reasoning and output steps. I think that using mouse click or mouse coordinates, I've seen that prove to be less reliable than I would like. And I just wonder if that's the right form factor. What we've done with our framework is anchor it to the DOM itself, anchor it to the actual item. So, like, if it's clicking on something, it's clicking on that thing, you know? Like, it's more accurate. No matter where it is. Yeah, exactly. Because it really ties in nicely.
And it can handle, like, the whole viewport in one go, whereas, like, Operator can only handle what it sees. Can you hover? Is hovering a thing that you can do? I don't know if we expose it as a tool directly, but I'm sure there's, like, an API for hovering. Like, move mouse to this position. Yeah, yeah, yeah. I think you can trigger hover, like, via, like, the JavaScript on the DOM itself. But, no, I think, like, when we saw computer use, everyone's eyes lit up because they realized, like, wow, like, AI is going to actually automate work for people. And I think seeing that kind of happen from both of the labs, and I'm sure we're going to see more labs launch computer use models, I'm excited to see all the stuff that people build with it. I think that I'd love to see computer use power, like, controlling a browser on Browserbase. And I think, like, Open Operator, which was, like, our open source version of OpenAI's Operator, was our first take on, like, how can we integrate these models into Browserbase? And we handle the infrastructure and let the labs do the models. I don't have a sense that Operator will be released as an API. I don't know. Maybe it will. I'm curious to see how well that works because I think it's going to be really hard for a company like OpenAI to do things like support CAPTCHA solving or, like, have proxies. Like, I think it's hard for them structurally. Imagine this New York Times headline, OpenAI CAPTCHA solving. Like, that would be a pretty bad headline, this New York Times headline. Browserbase solves CAPTCHAs. No one cares. No one cares. And, like, our investors are bored. Like, we're all okay with this, you know? We're building this company knowing that the CAPTCHA solving is short-lived until we figure out how to authenticate good bots. I think it's really hard for a company like OpenAI, who has this brand that's so, so good, to balance with, like, the icky parts of web automation, which can be kind of complex to solve.
I'm sure OpenAI knows who to call whenever they need you. Yeah, right. I'm sure they'll have a great partnership.

Alessio [00:41:23]: And is Open Operator just, like, a marketing thing for you? Like, how do you think about resource allocation? So, you can spin this up very quickly. And now there's all this, like, open deep research, just open all these things that people are building. We started it, you know. You're the original Open. We're the original Open Operator, you know? Is it just, hey, look, this is a demo, but, like, we'll help you build out an actual product for yourself? Like, are you interested in going more of a product route? That's kind of the OpenAI way, right? They started as a model provider and then…

Paul [00:41:53]: Yeah, we're not interested in going the product route yet. I view Open Operator as a reference project, you know? Let's show people how to build these things using the infrastructure and models that are out there. And that's what it is. It's, like, Open Operator is very simple. It's an agent loop. It says, like, take a high-level goal, break it down into steps, use tool calling to accomplish those steps. It takes screenshots and feeds those screenshots into an LLM with the step to generate the right action. It uses Stagehand under the hood to actually execute this action. It doesn't use a computer use model. And it, like, has a nice interface using the live view that we talked about, the iframe, to embed that into an application. So I felt like people on launch day wanted to figure out how to build their own version of this. And we turned that around really quickly to show them. And I hope we do that with other things like deep research. We don't have a deep research launch yet. I think David from Aomni actually has an amazing open deep research that he launched. It has, like, 10K GitHub stars now. So he's crushing that.
But I think if people want to build these features natively into their application, they need good reference projects. And I think Open Operator is a good example of that.

swyx [00:42:52]: I don't know. Actually, I'm actually pretty bullish on API-driven Operator. Because that's the only way that you can sort of, like, once it's reliable enough, obviously. And now we're nowhere near. But, like, give it five years. It'll happen, you know. And then you can sort of spin this up and browsers are working in the background and you don't necessarily have to know. And it just is booking restaurants for you, whatever. I can definitely see that future happening. I had this on the landing page here. This might be slightly out of order. But, you know, you have, like, sort of three use cases for Browserbase. Open Operator. Or this is the Operator sort of use case. It's kind of like the workflow automation use case. And it competes with UiPath in the sort of RPA category. Would you agree with that? Yeah, I would agree with that. And then there's Agents we talked about already. And web scraping, which I imagine would be the bulk of your workload right now, right?

Paul [00:43:40]: No, not at all. I'd say actually, like, the majority is browser automation. We're kind of expensive for web scraping. Like, I think that if you're building a web scraping product, if you need to do occasional web scraping or you have to do web scraping that works every single time, you want to use browser automation. Yeah. You want to use Browserbase. But if you're building web scraping workflows, what you should do is have a waterfall. You should have the first request be a curl to the website. See if you can get it without even using a browser. And then the second request may be, like, a scraping-specific API. There's, like, a thousand scraping APIs out there that you can use to try and get data. ScrapingBee. ScrapingBee is a great example, right? Yeah.
And then, like, if those two don't work, bring out the heavy hitter. Like, Browserbase will 100% work, right? It will load the page in a real browser, hydrate it. I see.
swyx [00:44:21]: Because a lot of people don't render to JS.
swyx [00:44:25]: Yeah, exactly.
Paul [00:44:26]: So, I mean, the three big use cases, right? Like, you know, automation, web data collection, and then, you know, if you're building anything agentic that needs, like, a browser tool, you want to use Browserbase.
Alessio [00:44:35]: Is there any use case that, like, you were super surprised by that people might not even think about? Oh, yeah. Or is it, yeah, anything that you can share? The long tail is crazy. Yeah.
Surprising use cases of Browserbase
Paul [00:44:44]: One of the case studies on our website that I think is the most interesting is this company called Benny. So, the way that it works is if you're on food stamps in the United States, you can actually get rebates if you buy certain things. Yeah. You buy some vegetables. You submit your receipt to the government. They'll give you a little rebate back. Say, hey, thanks for buying vegetables. It's good for you. That process of submitting that receipt is very painful. And the way Benny works is you use their app to take a photo of your receipt, and then Benny will go submit that receipt for you and then deposit the money into your account. That's actually using no AI at all. It's all, like, hard-coded scripts. They maintain the scripts. They've been doing a great job. And they build this amazing consumer app. But it's an example of, like, all these, like, tedious workflows that people have to do to kind of go about their business. And they're doing it for the sake of their day-to-day lives. And I had never known about, like, food stamp rebates or the complex forms you have to do to fill them. But the world is powered by millions and millions of tedious forms, visas. You know, Emirate Lighthouse is a customer, right?
You know, they do the O-1 visa. Millions and millions of forms are taking away humans' time. And I hope that Browserbase can help power software that automates away the web forms that we don't need anymore. Yeah.
swyx [00:45:49]: I mean, I'm very supportive of that. I mean, forms. I do think, like, government itself is a big part of it. I think the government itself should embrace AI more to do more sort of human-friendly form filling. Mm-hmm. But I'm not optimistic. I'm not holding my breath. Yeah. We'll see. Okay. I think I'm about to zoom out. I have a little brief thing on computer use, and then we can talk about founder stuff, which is, I tend to think of developer tooling markets in impossible triangles, where everyone starts in a niche, and then they start to branch out. So I already hinted at a little bit of this, right? We mentioned Morph. We mentioned E2B. We mentioned Firecrawl. And then there's Browserbase. So there's, like, all this stuff of, like, have a serverless virtual computer that you give to an agent and let them do stuff with it. And there's various ways of connecting it to the internet. You can just connect to a search API, like SERP API, whatever other, like, Exa is another one. That's what you're searching. You can also have a JSON markdown extractor, which is Firecrawl. Or you can have a virtual browser like Browserbase, or you can have a virtual machine like Morph. And then there's also maybe, like, a virtual sort of code environment, like Code Interpreter. So, like, there's just, like, a bunch of different ways to tackle the problem of give a computer to an agent. And I'm just kind of wondering if you see, like, everyone's just, like, happily coexisting in their respective niches. And as a developer, I just go and pick, like, a shopping basket of one of each. Or do you think that eventually people will collide?
Future of browser automation and market competition
Paul [00:47:18]: I think that currently it's not a zero-sum market.
Like, I think we're talking about all of knowledge work that people do that can be automated online. All of these, like, trillions of hours that happen online where people are working. And I think that there's so much software to be built that, like, I tend not to think about how these companies will collide. I just try to solve the problem as best as I can and make this specific piece of infrastructure, which I think is an important primitive, the best I possibly can. And yeah. I think there's players that are actually going to like it. I think there's players that are going to launch, like, over-the-top, you know, platforms, like agent platforms that have all these tools built in, right? Like, who's building the Rippling for agent tools that has the search tool, the browser tool, the operating system tool, right? There are some, right? And I think in the end, what I have seen in my time as a developer, and I look at all the favorite tools that I have, is that, like, for tools and primitives with sufficient levels of complexity, you need to have a solution that's really bespoke to that primitive, you know? And I am sufficiently convinced that the browser is complex enough to deserve a primitive. Obviously, I have to. I'm the founder of Browserbase, right? I'm talking my book. But, like, I think maybe I can give you one spicy take against, like, maybe just whole OS running. I think that when I look at computer use when it first came out, I saw that the majority of use cases for computer use were controlling a browser. And do we really need to run an entire operating system just to control a browser? I don't think so. I don't think that's necessary. You know, Browserbase can run browsers for way cheaper than you can if you're running a full-fledged OS with a GUI, you know, operating system. And I think that's just an advantage of the browser.
It is, like, browsers are little OSs, and you can run them very efficiently if you orchestrate it well. And I think that allows us to offer 90% of the, you know, functionality in the platform needed at 10% of the cost of running a full OS. Yeah.
Open Operator: Browserbase's Open-Source Alternative
swyx [00:49:16]: I definitely see the logic in that. There's a Marc Andreessen quote. I don't know if you know this one. Where he basically observed that the browser is turning the operating system into a poorly debugged set of device drivers, because most of the apps are moved from the OS to the browser. So you can just run browsers.
Paul [00:49:31]: There's a place for OSs, too. Like, I think that there are some applications that only run on Windows operating systems. And Eric from pig.dev in this upcoming YC batch, or last YC batch, like, he's building a way to run tons of Windows operating systems for you to control with your agent. And like, there's some legacy EHR systems that only run on Internet-controlled systems. Yeah.
Paul [00:49:54]: I think that's it. I think, like, there are use cases for specific operating systems for specific legacy software. And like, I'm excited to see what he does with that. I just wanted to give a shout out to the pig.dev website.
swyx [00:50:06]: The pigs jump when you click on them. Yeah. That's great.
Paul [00:50:08]: Eric, he's the former co-founder of banana.dev, too.
swyx [00:50:11]: Oh, that Eric. Yeah. That Eric. Okay. Well, he abandoned bananas for pigs. I hope he doesn't start going around with pigs now.
Alessio [00:50:18]: Like he was going around with bananas. A little toy pig. Yeah. Yeah. I love that. What else are we missing? I think we covered a lot of, like, the Browserbase product history, but. What do you wish people asked you? Yeah.
Paul [00:50:29]: I wish people asked me more about, like, what will the future of software look like? Because I think that's really where I've spent a lot of time about why do Browserbase.
Like, for me, starting a company is like a means of last resort. Like, you shouldn't start a company unless you absolutely have to. And I remain convinced that the future of software is software that you're going to click a button and it's going to do stuff on your behalf. Right now, with software, you click a button and it maybe, like, calls back to an API and, like, computes some numbers. It, like, modifies some text, whatever. But the future of software is software using software. So, I may log into my accounting website for my business, click a button, and it's going to go load up my Gmail, search my emails, find the thing, upload the receipt, and then comment it for me. Right? And it may do it using APIs, maybe a browser. I don't know. I think it's a little bit of both. But that's completely different from how we've built software so far. And I think that future of software has different infrastructure requirements. It's going to require different UIs. It's going to require different pieces of infrastructure. I think the browser infrastructure is one piece that fits into that, along with all the other categories you mentioned. So, I think that it's going to require developers to think differently about how they've built software for, you know
Join Simtheory: https://simtheory.ai
----
Grok 3 Dis Track (cringe): https://simulationtheory.ai/aff9ba04-ca0e-4572-84f4-687739c7b84b
Grok 3 Dis Track written by Sonnet: https://simulationtheory.ai/edaed525-b9b6-473b-a6d6-f9cca9673868
----
Community: https://thisdayinai.com
----
Chapters:
00:00 - First Impressions of Grok 3
10:00 - Discussion about Deep Search, Deep Research
24:28 - Market landscape: Is OpenAI Rattled by xAI's Grok 3? Rumors of GPT-4.5 and GPT-5
48:48 - Why does Grok and xAI Exist? Will anyone care about Grok 3 next week?
54:45 - Diss track battle with Grok 3 (re-written by Sonnet) & Model Tuning for Use Cases
1:07:50 - GPT-4.5 and Anthropic Claude Thinking Next Week? & Are we a podcast about Altavista?
1:13:25 - Economically productive agents & freaky muscular robot
1:22:00 - Final thoughts of the week
1:27:26 - Grok 3 Dis Track in Full (Sonnet Version)
Thanks for your support and listening!
Join Simtheory: https://simtheory.ai
Community: https://thisdayinai.com
---
CHAPTERS:
00:00 - Anthropic Economic Index & The Impact of AI Agents
18:00 - Hype Vs Reality of Models & Agents
31:33 - Dream Agents & Side Quest Background Tasks
56:60 - How All SaaS Will Be Disrupted by AI
1:21:10 - Sam Altman's GPT-4.5, GPT-5 Roadmap
1:28:50 - Anthropic Claude 4: Anthropic Strikes Back
---
Thanks for listening and your support.
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk
Show notes: https://ocdevel.com/mlg/mla-22
Tools discussed:
Windsurf: https://codeium.com/windsurf
Copilot: https://github.com/features/copilot
Cursor: https://www.cursor.com/
Cline: https://github.com/cline/cline
Roo Code: https://github.com/RooVetGit/Roo-Code
Aider: https://aider.chat/
Other:
Leaderboards: https://aider.chat/docs/leaderboards/
Video of speed-demon: https://www.youtube.com/watch?v=QlUt06XLbJE&feature=youtu.be
Reddit: https://www.reddit.com/r/chatgptcoding/
Examines the rapidly evolving world of AI coding tools designed to boost programming productivity by acting as a pair programming partner. The discussion groups these tools into three categories:
• Hands-Off Tools: These include solutions that work on fixed monthly fees and require minimal user intervention. GitHub Copilot started with simple tab completions and now offers an agent mode similar to Cursor, which stands out for its advanced codebase indexing and intelligent file searching. Windsurf is noted for its simplicity (accepting prompts and performing automated edits), but some users report performance throttling after prolonged use.
• Hands-On Tools: Aider is presented as a command-line utility that demands configuration and user involvement. It allows developers to specify files and settings, and it efficiently manages token usage by sending prompts in diff format. Aider also implements an “architect versus edit” approach: a reasoning model (such as DeepSeek R1) first outlines a sequence of changes, then an editor model (like Claude 3.5 Sonnet) produces precise code edits. This dual-model strategy enhances accuracy and reduces token costs, especially for complex tasks.
• Intermediate Power Tools: Open-source tools such as Cline and its more advanced fork, RooCode, require users to supply their own API keys and pay per token. These tools offer robust, agentic features, including codebase indexing, file editing, and even browser automation. RooCode stands out with its ability to autonomously expand functionality through integrations (for example, managing cloud resources or querying issue trackers), making it particularly attractive for tinkerers and power users.
A decision framework is suggested: for those new to AI coding assistants or with limited budgets, starting with Cursor (or cautiously exploring Copilot's new features) is recommended. For developers who want to customize their workflow and dive deep into the tooling, RooCode or Cline offer greater control, always paired with Aider for precise and token-efficient code edits.
Also reviews model performance using a coding benchmark leaderboard that updates frequently. The current top-performing combination uses DeepSeek R1 as the architect and Claude 3.5 Sonnet as the editor, with alternatives such as OpenAI's o1 and o3-mini available. Tools like OpenRouter are mentioned as a way to consolidate API key management and reduce token costs.
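The "architect versus edit" split described above can be illustrated with a small Python sketch: one model turns the request into a plan, a second model applies that plan to the source, and the change is emitted as a unified diff, not a full file. This is an assumption-laden illustration and not Aider's actual implementation or API. The `architect` and `editor` callables are hypothetical stand-ins for model calls, and producing the diff with the standard library's `difflib` is a choice made here only to show why a diff is smaller than resending the whole file.

```python
import difflib
from typing import Callable

# Hypothetical model interfaces (assumptions for illustration only).
Architect = Callable[[str, str], str]  # (request, source) -> plan in prose
Editor = Callable[[str, str], str]     # (plan, source) -> edited source


def architect_then_edit(request: str, source: str,
                        architect: Architect, editor: Editor) -> str:
    """Two-stage edit: plan with one model, apply with another, return a diff."""
    plan = architect(request, source)
    edited = editor(plan, source)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile="before.py",
        tofile="after.py",
    )
    return "".join(diff)


def demo_architect_edit() -> str:
    """Canned stand-ins for the two models so the sketch runs offline."""
    source = "def greet():\n    print('hi')\n"
    architect = lambda request, src: "Rename greet to hello."
    editor = lambda plan, src: src.replace("greet", "hello")
    return architect_then_edit("rename the function", source, architect, editor)
```

Calling `print(demo_architect_edit())` shows a diff that touches only the renamed line, which is the token-saving idea the show notes attribute to the diff format.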
Join Simtheory: https://simtheory.ai
----
"Don't Cha" Song: https://simulationtheory.ai/cbf4d5e6-82e4-4e84-91e7-3b48cb2744ef
Spotify: https://open.spotify.com/track/4Q8dRV45WYfxePE7zi52iL?si=ed094fce41e54c8f
Community: https://thisdayinai.com
---
CHAPTERS:
00:00 - We're on Spotify!
01:06 - o3-mini release and initial impressions
18:37 - Reasoning models as agents
47:20 - OpenAI's Deep Research: impressions and what it means
1:12:20 - Addressing our Shilling for Sonnet & My Week with o1 Experience
1:20:18 - Gemini 2.0 Flash GA, Gemini 2.0 Pro Experimental + Other Google Updates
1:38:16 - LOL of week and final thoughts
1:43:39 - Don't Cha Song in Full
OpenAI is pushing the boundaries of artificial intelligence yet again. In this episode of Rocketship.FM, we break down what Chief Product Officer Kevin Weil revealed about OpenAI's roadmap for 2025 and beyond—including the latest AI model, O1, which is already outperforming previous versions in coding, math, and reasoning. But that's just the beginning. We also explore OpenAI's move into AI-powered agents designed to streamline everyday tasks, and the company's rumored return to humanoid robotics. And what about Artificial General Intelligence (AGI) and even Artificial Superintelligence (ASI)? OpenAI CEO Sam Altman has hinted that these once-distant milestones could be closer than we think. What happens when AI surpasses human intelligence? Will it be a utopia of limitless innovation, or are we opening a Pandora's box we can't close? Join us as we unpack OpenAI's vision for the future—and what it could mean for the world.
Join Simtheory: https://simtheory.ai
---
LINKS FROM SHOW:
- Built to Reason (an o1 Tribute song): https://simulationtheory.ai/3f3ff70d-afef-4372-a9a5-26b22824c383
- Sputnik Moment Song: https://simulationtheory.ai/4317176e-5c0d-49b9-801b-b686113624fd
- Episode 91 Notes: https://simulationtheory.ai/b64f40ce-dab8-40b7-89a1-f24d17296f5a
CHAPTERS:
00:00 - Is Deepseek R1 a Sputnik Moment?
15:32 - Industry Reaction to Deepseek R1
39:30 - Can Deepseek R1 Write a Good Dis Track?
46:21 - Will AI Disrupt All Software: Throw Away AI Software & Custom Interfaces
1:10:04 - OpenAI's Operator Thoughts & Computer Use in the Enterprise
1:16:45 - Google Releases Gemini 2.0 Flash Officially Released, Rumors of o3-mini & Farewell to o1
1:22:07 - In loving memory of o1...
---
thx 4 listening, like and sub.
The AI Breakdown: Daily Artificial Intelligence News and Discussions
DeepSeek has released R1, their answer to OpenAI's o1, and it has Silicon Valley chattering and markets crashing. But just how big a deal is it? Big, argues NLW, even if the likely impact might be different than what Wall Street seems to think. Brought to you by: KPMG – Go to www.kpmg.us/ai to learn more about how KPMG can help you drive value with our AI solutions. Vanta - Simplify compliance - https://vanta.com/nlw The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown