At Google Cloud Next '25, the company introduced Ironwood, its most advanced custom Tensor Processing Unit (TPU) to date. With 9,216 chips per pod delivering 42.5 exaflops of compute power, Ironwood doubles the performance per watt compared to its predecessor. Senior product manager Chelsie Czop explained that designing TPUs involves balancing power, thermal constraints, and interconnectivity. Google's long-term investment in liquid cooling, now in its fourth generation, plays a key role in managing the heat generated by these powerful chips. Czop highlighted the incremental design improvements made visible through changes in the data center setup, such as liquid cooling pipe placements. Customers often ask whether to use TPUs or GPUs, but the answer depends on their specific workloads and infrastructure. Some, like Moloco, have seen a 10x performance boost by moving directly from CPUs to TPUs. However, many still use both TPUs and GPUs. As models evolve faster than hardware, Google relies on collaborations with teams like DeepMind to anticipate future needs. Learn more from The New Stack about the latest AI infrastructure insights from Google Cloud: "Google Cloud Therapist on Bringing AI to Cloud Native Infrastructure" and "A2A, MCP, Kafka and Flink: The New Stack for AI Agents."
No matter how we define the edge, the special requirements for use in harsh environments drive unique product decisions. This episode of Utilizing Tech, brought to you by Solidigm, features Alistair Bradbrook, founder of Antillion, discussing edge servers with Jeniece Wnorowski and Stephen Foskett. It pays to start with the intended outcome, defining the solution based on customer needs rather than the technology at hand. This is especially true at the edge, where unique requirements for mobility, power, ruggedness, and manageability drive novel configurations. In defense applications, AI is driving greater collection of data at the edge, yet connectivity is often inconsistent, which in turn drives demand for more local processing power. Still, current CPUs can often handle inferencing in edge use cases, especially when the rest of the server, including storage, can sustain high data transfer rates. Edge computers have always needed more storage capacity, and the latest SSDs can bring incredible amounts in a small form factor. Antillion is also a leader in conduction cooling, bringing liquid- and immersion-cooled devices to market for demanding applications, and is working to bring disaggregated servers to market using CXL technology, a topic covered in detail in season 4 of this podcast. The edge is all about constraints, and this limitation drives incredible innovation.
Guest: Alistair Bradbrook, Founder and COO of Antillion. You can connect with Alistair on LinkedIn and learn more about Antillion on their website.
Hosts: Stephen Foskett, President of the Tech Field Day Business Unit and Organizer of the Tech Field Day Event Series; Jeniece Wnorowski, Head of Influencer Marketing at Solidigm; Scott Shadley, Leadership Narrative Director and Evangelist at Solidigm.
Follow Tech Field Day on LinkedIn, on X/Twitter, on Bluesky, and on Mastodon. Visit the Tech Field Day website for more information on upcoming events. For more episodes of Utilizing Tech, head to the dedicated website and follow the show on X/Twitter, on Bluesky, and on Mastodon.
We're back! After a long break, The Due Diligence Show kicks off Season 2 with a deep dive into the world of artificial intelligence. Adam Jaques and Luke Silcock unpack what's hype, what's real, and what matters for businesses, investors, and tech leaders. In this episode: What is AI — really — and why does it matter? Who are the major players (OpenAI, Google, Anthropic, DeepSeek, and more)? How is AI already transforming industries, from tax authorities to creative tools? Why GPUs (not CPUs) power today's AI breakthroughs What's coming next: agentic AI, open-source models, and the competitive edge Whether you're just starting to explore AI or knee-deep in the latest models, this episode will get you up to speed — with practical insights and sharp analysis.
In this episode of Book Overflow, Carter and Nathan discuss the first half of Grokking Concurrency by Kirill Bobrov! Join them as they discuss the basic building blocks of concurrency, how concurrency has evolved over time, and how building concurrent applications can increase performance!
Go Proverbs: https://go-proverbs.github.io/
-- Books Mentioned in this Episode --
Note: As an Amazon Associate, we earn from qualifying purchases.
Grokking Concurrency by Kirill Bobrov: https://amzn.to/3GRbnby (paid link)
Web Scalability for Startup Engineers by Artur Ejsmont: https://amzn.to/3F1VWwF (paid link)
00:00 Intro
02:07 About the Book and Author
03:35 Initial Thoughts on the Book
09:12 What is Concurrency vs Parallelism
12:35 CPUs and Moore's Law
22:19 IO Performance, Embarrassingly Parallel and Conway's Law
28:25 Building Blocks of Concurrency: Processes and Threads
33:05 Memory Sharing vs Communicating
39:13 Multitasking and Context Switching
45:24 Task Decomposition and Data Pipelines
52:35 Final Thoughts
Spotify: https://open.spotify.com/show/5kj6DLCEWR5nHShlSYJI5L
Apple Podcasts: https://podcasts.apple.com/us/podcast/book-overflow/id1745257325
X: https://x.com/bookoverflowpod
Carter on X: https://x.com/cartermorgan
Nathan's Functionally Imperative: www.functionallyimperative.com
Book Overflow is a podcast for software engineers, by software engineers dedicated to improving our craft by reading the best technical books in the world. Join Carter Morgan and Nathan Toups as they read and discuss a new technical book each week! The full book schedule and links to every major podcast player can be found at https://www.bookoverflow.io
When adopting Artificial Intelligence systems for one or more areas of the business, companies should start each project in a structured, well-planned way, thinking smaller if need be, and then, guided by the data, scale their use of AI more assertively, avoiding the trap of investing resources and time in such an innovative technology without much planning, just to ride the wave of the moment. To discuss this topic, along with initiatives to democratize AI for large, medium, and small businesses through processing on CPUs, which are cheaper and more energy-efficient than the large GPU-based systems gaining ground in the market, and the marriage of Artificial Intelligence with Open Source development, which encourages collaboration and the integration of ecosystems with different partners, Start Eldorado welcomes Sandra Vaz, country manager of Red Hat for Brazil, who discussed these and other topics with host Daniel Gonzales. The program airs every Wednesday at 9 pm on FM 107.3 across Greater São Paulo, as well as on the site, app, digital channels, and voice assistants. See omnystudio.com/listener for privacy information.
At Arm, open source is the default approach, with proprietary software requiring justification, says Andrew Wafaa, fellow and senior director of software communities. Speaking at KubeCon + CloudNativeCon Europe, Wafaa emphasized Arm's decade-long commitment to open source, highlighting its investment in key projects like the Linux kernel, GCC, and LLVM. This investment is strategic, ensuring strong support for Arm's architecture through vital tools and system software. Wafaa also challenged the hype around GPUs in AI, asserting that CPUs, especially those enhanced with Arm's Scalable Matrix Extension (SME2) and Scalable Vector Extension (SVE2), are often more suitable for inference workloads. CPUs offer greater flexibility, and Arm's innovations aim to reduce dependency on expensive GPU fleets. On the AI framework front, Wafaa pointed to PyTorch as the emerging hub, likening its ecosystem-building potential to Kubernetes. As a PyTorch Foundation board member, he sees PyTorch becoming the central open source platform in AI development, with broad community and industry backing. Learn more from The New Stack about the latest insights about Arm: "Edge Wars Heat Up as Arm Aims to Outflank Intel, Qualcomm" and "Arm: See a Demo About Migrating a x86-Based App to ARM64."
Yik Ban, a research analyst at Phillip Securities Research, dives into the latest pulse from the semiconductor world—GPUs, CPUs, memory, and all the gear powering tomorrow's tech. Listen to this podcast to stay updated on the latest corporate news. Additionally, you can visit www.poems.com.sg/stock-research to access the full report and gain more insights.
Follow PYTCH Media: YouTube | Facebook | Instagram | LinkedIn | Podcast | Website
The tariffs announced by the US are disrupting the entire tech industry supply chain. The upshot: everything is getting more expensive. Should you rush to replace your devices now? The United States' recent announcement of substantial punitive tariffs on electronics from Asia and Europe is causing great unease worldwide. Particularly affected are countries such as China with a surcharge of almost 60 percent, Vietnam with 46 percent, and Taiwan with 32 percent. This decision has massive consequences for numerous products that have become indispensable in our everyday lives: smartphones, tablets, computers, and key components such as CPUs, GPUs, and displays.
In this episode of the Azizi Podcast, Samir Azizi sits down with Dr. Mehdi Kargar, co-founder of Heisenberg Network, to dive deep into the most overlooked problem in AI today: AI-ready data. You'll learn:
* Why 80% of AI projects are failing—and what Heisenberg is doing about it
* How Data Agents transform raw, messy data into structured, contextual intelligence
* Why CPUs, not GPUs, are the real heroes in AI data processing
* How you can earn crypto just by running a lightweight node on your laptop
* The future of decentralized compute and how it empowers a more open, censorship-resistant AI ecosystem
* How Heisenberg cuts AI data costs by up to 70%
Dr. Kargar explains the origins of Heisenberg, how they went from Web3 predictive analytics to building the missing layer of AI, and why the network's infrastructure is powered by a decentralized network of idle CPUs from everyday users—just like you.
What distinguishes CPUs from GPUs in architecture, and how does this impact their performance in computing tasks? Why are GPUs considered better at handling tasks like graphics rendering compared to CPUs? How do different rendering techniques in games versus offline programs affect the processing demands on CPUs and GPUs? ... we explain like I'm five Thank you to the r/explainlikeimfive community and in particular the following users whose questions and comments formed the basis of this discussion: insane_eraser, popejustice, warlocktx, pourliver, dmartis, and arentol. To the community that has supported us so far, thanks for all your feedback and comments. Join us on Twitter: https://www.twitter.com/eli5ThePodcast/ or send us an e-mail: ELI5ThePodcast@gmail.com
Heimir Thor Sverrisson joins Robby to discuss the importance of software architecture in long-term maintainability. With over four decades in the industry, Heimir has witnessed firsthand how poor architectural decisions can set teams up for failure. He shares his experiences mentoring engineers, tackling technical debt, and solving large-scale performance problems—including one bank's misguided attempt to fix system slowness by simply adding more CPUs. Heimir also discusses his work at MojoTech, the value of code reviews in consulting, and his volunteer efforts designing radiation-tolerant software for satellites.
Episode Highlights
[00:01:12] Why architecture is the foundation of maintainability – Heimir explains why starting with the wrong architecture dooms software projects.
[00:02:20] Upfront design vs. agile methodologies – The tension between planning and iterative development.
[00:03:33] When architecture becomes the problem – How business pivots can render initial designs obsolete.
[00:05:06] The rising demand for rapid software delivery – Why modern projects have less time for deep architectural planning.
[00:06:15] Defining technical debt in practical terms – How to clean up code without waiting for permission.
[00:09:56] The rewrite that never launched – What happens when a company cancels a multi-million-dollar software project.
[00:12:43] How a major bank tackled system slowness the wrong way – Adding CPUs didn't solve their performance problems.
[00:15:00] Performance tuning as an ongoing process – Why fixing one bottleneck only reveals the next.
[00:22:34] How MojoTech mentors instead of manages – Heimir explains how their consultancy approaches team development.
[00:27:54] Building software for space – How AMSAT develops radiation-resistant software for satellites.
[00:32:52] Staying relevant after four decades in tech – The power of curiosity in a constantly changing industry.
[00:34:26] How AI might (or might not) help maintainable software – Heimir shares his cautious optimism.
[00:37:14] Non-technical book recommendation – The Man Who Broke Capitalism and its relevance to the tech industry.
Resources & Links
Heimir Thor Sverrisson on LinkedIn
Heimir's GitHub
MojoTech
AMSAT – Amateur Radio Satellite Organization
The Man Who Broke Capitalism
How to Make Things Faster
Dr. Arkaprava Basu is an Associate Professor at the Indian Institute of Science, where he mentors students in the Computer Systems Lab. Arka's research focuses on pushing the boundaries of memory management and software reliability for both CPUs and GPUs. His work spans diverse areas, from optimizing memory systems for chiplet-based GPUs to developing innovative techniques to eliminate synchronization bottlenecks in GPU programs. He is also a recipient of the Intel Rising Star Faculty Award, ACM India Early Career Award, and multiple other accolades, recognizing his innovative approaches to enhancing GPU performance, programmability, and reliability.
Better FPS with new CPUs, improved server performance, little feature improvements, ship balances, and new content. The playability of Star Citizen has proven to be moving in a positive direction in Q1 of 2025. YouTuber and Star Citizen benchmarker TenPoundFortyTwo joins me to discuss this gradual improvement to Star Citizen this year, and how CIG can continue the momentum moving forward.
Today's Guest: TenPoundFortyTwo
YouTube: https://www.youtube.com/@tenpoundfortytwo
ToC:
00:00 Introductions
03:20 Performance Optimization in Gaming
06:00 How The Game Engines Work
20:00 Star Citizen Minimum & Recommended Specs
31:00 Is Star Citizen Performance Really Improving?
51:30 Are The Graphics Falling Behind?
58:30 Star Citizen 4.1 Outlook
Video Podcast: https://www.youtube.com/playlist?list=PLvpiPXCO7OVJOlBIclW9tbpb2g29gur3I
Support This Podcast: Patreon | Paypal | Ko-Fi
Follow Space Tomato on social media: Website | Youtube | My Other Youtube
The PC hardware market has finally settled down: AMD's new Radeon 9000 series is out, and no more major CPU or GPU product launches are expected this year. So we assess the state of the PC union a bit this week, with a focus on the new AMD cards and their dramatically improved upscaling, ray-tracing, video encoding, and perhaps most of all, price. Plus, some updates on Intel's low-end Battlemage, Nvidia's mounting 50-series woes, the possible delay of Intel's next-gen Panther Lake CPU to 2026, new rumored low-power CPUs for Brad to get excited about running a Linux router on, and more. Support the Pod! Contribute to the Tech Pod Patreon and get access to our booming Discord, a monthly bonus episode, your name in the credits, and other great benefits! You can support the show at: https://patreon.com/techpod
In this episode, Conor and Ben chat with Tristan Brindle about recent updates to Flux, internal iteration vs external iteration and more.
Link to Episode 224 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Ben Deane: Twitter | BlueSky
About the Guest
Tristan Brindle is a freelance programmer and trainer based in London, mostly focussing on C++. He is a member of the UK national body (BSI) and ISO WG21. Occasionally he can be found at C++ conferences. He is also a director of C++ London Uni, a not-for-profit organisation offering free beginner programming classes in London and online. He has a few fun projects on GitHub that you can find out about here.
Show Notes
Date Generated: 2025-02-17
Date Released: 2025-03-07
Flux
Lightning Talk: Faster Filtering with Flux - Tristan Brindle - CppNorth 2023
Arrays, Fusion & CPUs vs GPUs.pdf
Iteration Revisited: A Safer Iteration Model for C++ - Tristan Brindle - CppNorth 2023
ADSP Episode 126: Flux (and Flow) with Tristan Brindle
Iterators and Ranges: Comparing C++ to D to Rust - Barry Revzin - [CppNow 2021]
Keynote: Iterators and Ranges: Comparing C++ to D, Rust, and Others - Barry Revzin - CPPP 2021
Iteration Inside and Out - Bob Nystrom Blog
Expanding the internal iteration API #99
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Lords: * Andi * Casey Topics: * Lifehacks as communion with the divine * I decided to fire my computer * Winston is starting to forget things Microtopics: * A Star Trek watchalong podcast that doesn't exist yet. * Positing that what you said is no longer an NDA violation by the time this episode comes out. * Plugging a fake game that you worked on. * Astrobot. * Horror movie clinky noises that you can't hear over the PS4 fan noises. * Caffeine-infused mints with Tux the Penguin branding on Think Geek dot com. * The pre-eminent source for Life Hacks. * Using a hotel shower cap to bake bread. * Anime girls that are happy to see you. * That one time Film Crit Hulk broke character. * The joy of moving efficiently through the world. * More efficient ways to set the microwave timer. * Hotel rooms that you can bake bread in. * Whether bread should contain hair. * Tricking yourself into not being bored while doing something you have to do. * Reading 50 life hacks and applying none of them because. * Viral Life Hack that's killed 33 people. * A life hack that already had a body count in the double digits before someone made a TikTok about it. * Getting really fed up with computers. * Cryptographic signing processes that you can't participate in. * The HDCP certification board taking steps to ensure nobody can take a screen shot of their Crunchy Roll anime. * The analog hole. * Open source web browsers that can't see DRM content. * Microsoft-authenticated Linux installations. * Designing a circuit that solves a math problem. * Stamping a circuit onto your circuit clay. * An independent circuit re-implementation of video game hardware. * Should you use FPGA to do a thing? * Ridiculous multi-level memory caching systems. * Bootstrapping an FPGA design tool that runs on an FPGA device. * Every single circuit doing something on every single cycle. * Voltages going high and/or low. * Making a bunch of CPUs and testing them afterwards to see how many GHz they have. * Why the PS3 Cell processor had 7 SPUs * The industrial uses of the Cell processor. * A GLSL compiler that outputs FPGA circuits. * Mr. MiSTer. * Open-hardware laptops. * Inventing an open-source GPU. * Multics or Minix. * Writing a Breakout clone in Rust targeting the weird CPU your friend just invented. * Making a terrible first effort that is the right kind of good enough. * A laptop that has a FPGA where the CPU/GPU usually goes. * 1970s-era TV games. * The Epoch Cassette Vision. * A game console with interchangeable cartridges where the CPU is on the cartridge. * The Glasgow Interface Explorer. * Describing your FPGA circuit in Python. * Manufacturing homebrew Cassette Vision Homebrew cartridges for the audience of zero Cassette Vision owners. * Making art just for you, in the most overly elaborate and overly complicated way possible. * The programmer equivalent of going to swim with the dolphins. * Diagonal pixels. * Childhood amnesia. * Remembering your memories. * Using 10% of your brain. (And also the other 90%.) * Knowing things about stuff. * When one brother dies, the other brother gets their memories. * Memories that are formed before vs. after you learn to talk. * Being persecuted for being friends with a girl. * Rules of heteronormativity being enforced by three year olds. * Getting off of Wordpress.
An airhacks.fm conversation with Francesco Nigro (@forked_franz) about: Netty committer and performance engineer at Red Hat, discussion of Netty's history, focus on low-level core components like buffers and allocators in Netty, relationship between Vert.x and Netty where Vert.x provides a more opinionated and user-friendly abstraction over Netty, explanation of reactive back pressure implementation in Vert.x, performance advantages of Vert.x over Netty due to batching and reactive design, detailed explanation of io_uring as a Linux-specific asynchronous I/O mechanism, comparison between event loop architecture and Project Loom for scalability, limitations of Loom when working with io_uring due to design incompatibilities, discovery of a major Java type system scalability issue related to instance-of checks against interfaces, explanation of how this issue affected Hibernate performance, deep investigation using assembly-level analysis to identify the root cause, collaboration with Andrew Haley to fix the 20-year-old JDK issue, performance improvements of 2-3x after fixing the issue, discussion of CPU cache coherency problems in NUMA architectures, explanation of how container environments like Kubernetes can worsen performance issues due to CPU scheduling, insights into how modern CPUs handle branch prediction and speculation, impact of branch misprediction on performance especially with memory access patterns, discussion of memory bandwidth limitations in AI/ML workloads, advantages of unified memory architectures like Apple M-series chips for AI inference
Francesco Nigro on twitter: @forked_franz
Today's episode is with Paul Klein, founder of Browserbase. We talked about building browser infrastructure for AI agents, the future of agent authentication, and their open source framework Stagehand.
* [00:00:00] Introductions
* [00:04:46] AI-specific challenges in browser infrastructure
* [00:07:05] Multimodality in AI-Powered Browsing
* [00:12:26] Running headless browsers at scale
* [00:18:46] Geolocation when proxying
* [00:21:25] CAPTCHAs and Agent Auth
* [00:28:21] Building "User take over" functionality
* [00:33:43] Stagehand: AI web browsing framework
* [00:38:58] OpenAI's Operator and computer use agents
* [00:44:44] Surprising use cases of Browserbase
* [00:47:18] Future of browser automation and market competition
* [00:53:11] Being a solo founder
Transcript
Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.
swyx [00:00:12]: Hey, and today we are very blessed to have our friend, Paul Klein the Fourth, CEO of Browserbase. Welcome.
Paul [00:00:21]: Thanks guys. Yeah, I'm happy to be here. I've been lucky to know both of you for like a couple of years now, I think. So it's just like we're hanging out, you know, with three ginormous microphones in front of our face. It's totally normal hangout.
swyx [00:00:34]: Yeah. We've actually mentioned you on the podcast, I think, more often than any other Solaris tenant. Just because like you're one of the, you know, best performing, I think, LLM tool companies that have started up in the last couple of years.
Paul [00:00:50]: Yeah, I mean, it's been a whirlwind of a year, like Browserbase is actually pretty close to our first birthday. So we are one years old. And going from, you know, starting a company as a solo founder to... To, you know, having a team of 20 people, you know, a series A, but also being able to support hundreds of AI companies that are building AI applications that go out and automate the web. It's just been like, really cool. It's been happening a little too fast. I think like collectively as an AI industry, let's just take a week off together. I took my first vacation actually two weeks ago, and Operator came out on the first day, and then a week later, DeepSeek came out. And I'm like on vacation trying to chill. I'm like, we got to build with this stuff, right? So it's been a breakneck year. But I'm super happy to be here and like talk more about all the stuff we're seeing. And I'd love to hear kind of what you guys are excited about too, and share with it, you know?
swyx [00:01:39]: Where to start? So people, you've done a bunch of podcasts. I think I strongly recommend Jack Bridger's Scaling DevTools, as well as Turner Novak's The Peel. And, you know, I'm sure there's others. So you covered your Twilio story in the past, talked about StreamClub, you got acquired to Mux, and then you left to start Browserbase. So maybe we just start with what is Browserbase? Yeah.
Paul [00:02:02]: Browserbase is the web browser for your AI. We're building headless browser infrastructure, which are browsers that run in a server environment that's accessible to developers via APIs and SDKs. It's really hard to run a web browser in the cloud. You guys are probably running Chrome on your computers, and that's using a lot of resources, right? So if you want to run a web browser or thousands of web browsers, you can't just spin up a bunch of lambdas. You actually need to use a secure containerized environment.
You have to scale it up and down. It's a stateful system. And that infrastructure is, like, super painful. And I know that firsthand, because at my last company, StreamClub, I was CTO, and I was building our own internal headless browser infrastructure. That's actually why we sold the company, is because Mux really wanted to buy our headless browser infrastructure that we'd built. And it's just a super hard problem. And I actually told my co-founders, I would never start another company unless it was a browser infrastructure company. And it turns out that's really necessary in the age of AI, when AI can actually go out and interact with websites, click on buttons, fill in forms. You need AI to do all of that work in an actual browser running somewhere on a server. And Browserbase powers that.
swyx [00:03:08]: While you're talking about it, it occurred to me, not that you're going to be acquired or anything, but it occurred to me that it would be really funny if you became the Nikita Beer of headless browser companies. You just have one trick, and you make browser companies that get acquired.
Paul [00:03:23]: I truly do only have one trick. I'm screwed if it's not for headless browsers. I'm not a Go programmer. You know, I'm in AI Grant. You know, browsers is an AI grant. But we were the only company in that AI Grant batch that used zero dollars on AI spend. You know, we're purely an infrastructure company. So as much as people want to ask me about reinforcement learning, I might not be the best guy to talk about that. But if you want to ask about headless browser infrastructure at scale, I can talk your ear off. So that's really my area of expertise. And it's a pretty niche thing. Like, nobody has done what we're doing at scale before. So we're happy to be the experts.
swyx [00:03:59]: You do have an AI thing, Stagehand. We can talk about the sort of core of Browserbase first, and then maybe Stagehand. Yeah, Stagehand is kind of the web browsing framework. Yeah.
What is Browserbase? Headless Browser Infrastructure Explained
Alessio [00:04:10]: Yeah. Yeah. And maybe how you got to Browserbase and what problems you saw. So one of the first things I worked on as a software engineer was integration testing. Sauce Labs was kind of like the main thing at the time. And then we had Selenium, we had Playwright, we had all these different browser things. But it's always been super hard to do. So obviously you've worked on this before. When you started Browserbase, what were the challenges? What were the AI-specific challenges that you saw versus, there's kind of like all the usual running browser at scale in the cloud, which has been a problem for years. What are like the AI unique things that you saw that like traditional purchase just didn't cover? Yeah.
AI-specific challenges in browser infrastructure
Paul [00:04:46]: First and foremost, I think back to like the first thing I did as a developer, like as a kid when I was writing code, I wanted to write code that did stuff for me. You know, I wanted to write code to automate my life. And I do that probably by using curl or beautiful soup to fetch data from a web browser. And I think I still do that now that I'm in the cloud. And the other thing that I think is a huge challenge for me is that you can't just curl a website and parse that data. And we all know that now like, you know, taking HTML and plugging that into an LLM, you can extract insights, you can summarize.
So it was very clear that now like dynamic web scraping became very possible with the rise of large language models or a lot easier. And that was like a clear reason why there's been more usage of headless browsers, which are necessary because a lot of modern websites don't expose all of their page content via a simple HTTP request. You know, they actually do require you to run JavaScript on the page to hydrate this. Airbnb is a great example. You go to airbnb.com. A lot of that content on the page isn't there until after they run the initial hydration. So you can't just scrape it with a curl. You need to have some JavaScript run. And a browser is that JavaScript engine that's going to actually run all those requests on the page. So web data retrieval was definitely one driver of starting Browserbase and the rise of being able to summarize that within LLM. Also, I was familiar with if I wanted to automate a website, I could write one script and that would work for one website. It was very static and deterministic. But the web is non-deterministic. The web is always changing. And until we had LLMs, there was no way to write scripts that you could write once that would run on any website. That would change with the structure of the website. Click the login button. It could mean something different on many different websites. And LLMs allow us to generate code on the fly to actually control that. So I think that rise of writing the generic automation scripts that can work on many different websites, to me, made it clear that browsers are going to be a lot more useful because now you can automate a lot more things without writing. If you wanted to write a script to book a demo call on 100 websites, previously, you had to write 100 scripts. Now you write one script that uses LLMs to generate that script. That's why we built our web browsing framework, Stagehand, which does a lot of that work for you. But those two things, web data collection and then enhanced automation of many different websites, it just felt like big drivers for more browser infrastructure that would be required to power these kinds of features.
Alessio [00:07:05]: And was multimodality also a big thing?
Paul [00:07:08]: Now you can use the LLMs to look, even though the text in the DOM might not be as friendly. Maybe my hot take is I was always kind of like, I didn't think vision would be as big of a driver. For UI automation, I felt like, you know, HTML is structured text and large language models are good with structured text. But it's clear that these computer use models are often vision driven, and they've been really pushing things forward. So definitely being multimodal, like rendering the page is required to take a screenshot to give that to a computer use model to take actions on a website. And it's just another win for browser. But I'll be honest, that wasn't what I was thinking early on. I didn't even think that we'd get here so fast with multimodality. I think we're going to have to get back to multimodal and vision models.
swyx [00:07:50]: This is one of those things where I forgot to mention in my intro that I'm an investor in Browserbase. And I remember that when you pitched to me, like a lot of the stuff that we have today, we like wasn't on the original conversation.
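The hydration problem Paul describes above is easy to see in code. A minimal sketch, assuming Node 18+ (for the global fetch) and Playwright installed; the URL and the length comparison are illustrative only:

```typescript
import { chromium } from "playwright";

async function compare() {
  const url = "https://www.airbnb.com/";

  // 1. Plain HTTP request: returns only the initial HTML shell.
  //    Content rendered client-side after hydration is missing.
  const res = await fetch(url);
  const rawHtml = await res.text();
  console.log("raw HTML length:", rawHtml.length);

  // 2. Real browser: executes the page's JavaScript, so the DOM we
  //    read back includes the hydrated content an LLM could summarize.
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle" });
  const hydratedHtml = await page.content();
  console.log("hydrated HTML length:", hydratedHtml.length);

  await browser.close();
}

compare().catch(console.error);
```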
But I did have my original thesis was something that we've talked about on the podcast before, which is take the GPT store, the custom GPT store, all the every single checkbox and plugin is effectively a startup. And this was the browser one. I think the main hesitation, I think I actually took a while to get back to you. The main hesitation was that there were others. Like you're not the first headless browser startup. It's not even your first headless browser startup. There's always a question of like, will you be the category winner in a place where there's a bunch of incumbents, to be honest, that are bigger than you? They're just not targeted at the AI space. They don't have the backing of Nat Friedman. And there's a bunch of like, you're here in Silicon Valley. They're not. I don't know.
Paul [00:08:47]: I don't know if that's, that was it, but like, there was a, yeah, I mean, like, I think I tried all the other ones and I was like, really disappointed. Like my background is from working at great developer tools companies, and nothing had like the Vercel like experience. Um, like our biggest competitor actually is partly owned by private equity and they just jacked up their prices quite a bit. And the dashboard hasn't changed in five years. And I actually used them at my last company and tried them and I was like, oh man, like there really just needs to be something that's like the experience of these great infrastructure companies, like Stripe, like Clerk, like Vercel that I use and love, but oriented towards this kind of like more specific category, which is browser infrastructure, which is really technically complex. Like a lot of stuff can go wrong on the internet when you're running a browser. The internet is very vast. There's a lot of different configurations. Like there's still websites that only work with Internet Explorer out there. How do you handle that when you're running your own browser infrastructure? These are the problems that we have to think about and solve at Browserbase. And it's, it's certainly a labor of love, but I built this for me, first and foremost. I know it's super cheesy and everyone says that for like their startups, but it really, truly was for me. If you look at like the talks I've done even before Browserbase, I'm just like really excited to try and build a category defining infrastructure company. And it's, it's rare to have a new category of infrastructure exist. We're here in the Chroma offices and like, you know, vector databases is a new category of infrastructure. Is it, is it, I mean, we can, we're in their office, so, you know, we can, we can debate that one later. That is one.
Multimodality in AI-Powered Browsing
swyx [00:10:16]: That's one of the industry debates.
Paul [00:10:17]: I guess we go back to the LLMOS talk that Karpathy gave way long ago. And like the browser box was very clearly there and it seemed like the people who were building in this space also agreed that browsers are a core primitive of infrastructure for the LLMOS that's going to exist in the future. And nobody was building something there that I wanted to use. So I had to go build it myself.
swyx [00:10:38]: Yeah. I mean, exactly that talk that, that honestly, that diagram, every box is a startup and there's the code box and then there's the browser box. I think at some point they will start clashing there. There's always the question of the, are you a point solution or are you the sort of all in one?
And I think the point solutions tend to win quickly, but then the all-in-ones have a very tight cohesive experience. Yeah. Let's talk about just the hard problems of Browserbase you have on your website, which is beautiful. Thank you. Was there an agency that you used for that? Yeah. Herve.paris.
Paul [00:11:11]: They're amazing. Herve.paris. Yeah. It's H-E-R-V-E. I highly recommend for developers, developer tools founders, to work with consumer agencies because they end up building beautiful things and the Parisians know how to build beautiful interfaces. So I got to give props.
swyx [00:11:24]: And chat apps, apparently are, they are very fast. Oh yeah. The Mistral chat. Yeah. Mistral. Yeah.
Paul [00:11:31]: Le Chat.
swyx [00:11:31]: Le Chat. And then your videos as well, it was professionally shot, right? The series A video. Yeah.
Alessio [00:11:36]: Nico did the videos. He's amazing. Not the initial video that you shot at the new one. First one was Austin.
Paul [00:11:41]: Another, another video pretty surprised. But yeah, I mean, like, I think when you think about how you talk about your company, you have to think about the way you present yourself. It's, you know, as a developer, you think you evaluate a company based on like the API reliability and the P95, but a lot of developers say, is the website good? Is the message clear? Do I like trust this founder? I'm building my whole feature on. So I've tried to nail that as well as like the reliability of the infrastructure. You're right. It's very hard. And there's a lot of kind of foot guns that you run into when running headless browsers at scale. Right.
Competing with Existing Headless Browser Solutions
swyx [00:12:10]: So let's pick one. You have eight features here. Seamless integration. Scalability. Fast or speed. Secure. Observable. Stealth. That's interesting. Extensible and developer first. What comes to your mind as like the top two, three hardest ones? Yeah.
Running headless browsers at scale
Paul [00:12:26]: I think just running headless browsers at scale is like the hardest one. And maybe can I nerd out for a second? Is that okay? I heard this is a technical audience, so I'll talk to the other nerds. Whoa. They were listening. Yeah. They're upset. They're ready. The AGI is angry. Okay. So. So how do you run a browser in the cloud? Let's start with that, right? So let's say you're using a popular browser automation framework like Puppeteer, Playwright, and Selenium. Maybe you've written some code locally on your computer that opens up Google. It finds the search bar and then types in, you know, search for Latent Space and hits the search button. That script works great locally. You can see the little browser open up. You want to take that to production. You want to run the script in a cloud environment. So when your laptop is closed, your browser is doing something. The browser is doing something. Well, I, we use Amazon. You can see the little browser open up. You know, the first thing I'd reach for is probably like some sort of serverless infrastructure. I would probably try and deploy on a Lambda. But Chrome itself is too big to run on a Lambda. It's over 250 megabytes. So you can't easily start it on a Lambda. So you maybe have to use something like Lambda layers to squeeze it in there. Maybe use a different Chromium build that's lighter. And you get it on the Lambda. Great. It works. But it runs super slowly. It's because Lambdas are very like resource limited. They only run like with one vCPU.
You can run one process at a time. Remember, Chromium is super beefy. It's barely running on my MacBook Air. I'm still downloading it from a pre-run. Yeah, from the test earlier, right? I'm joking. But it's big, you know? So like Lambda, it just won't work really well. Maybe it'll work, but you need something faster. Your users want something faster. Okay. Well, let's put it on a beefier instance. Let's get an EC2 server running. Let's throw Chromium on there. Great. Okay. I can, that works well with one user. But what if I want to run like 10 Chromium instances, one for each of my users? Okay. Well, I might need two EC2 instances. Maybe 10. All of a sudden, you have multiple EC2 instances. This sounds like a problem for Kubernetes and Docker, right? Now, all of a sudden, you're using ECS or EKS, the Kubernetes or container solutions by Amazon. You're spinning up and down containers, and you're spending a whole engineer's time on kind of maintaining this stateful distributed system. Those are some of the worst systems to run because when it's a stateful distributed system, it means that you are bound by the connections to that thing. You have to keep the browser open while someone is working with it, right? That's just a painful architecture to run. And there's all this other little gotchas with Chromium, like Chromium, which is the open source version of Chrome, by the way. You have to install all these fonts. You want emojis working in your browsers because your vision model is looking for the emoji. You need to make sure you have the emoji fonts. You need to make sure you have all the right extensions configured, like, oh, do you want ad blocking? How do you configure that? How do you actually record all these browser sessions? Like it's a headless browser. You can't look at it. So you need to have some sort of observability. Maybe you're recording videos and storing those somewhere. It all kind of adds up to be this just giant monster piece of your project when all you wanted to do was run a lot of browsers in production for this little script to go to google.com and search. And when I see a complex distributed system, I see an opportunity to build a great infrastructure company. And we really abstract that away with Browserbase where our customers can use these existing frameworks, Playwright, Puppeteer, Selenium, or our own Stagehand and connect to our browsers in a serverless-like way. And control them, and then just disconnect when they're done. And they don't have to think about the complex distributed system behind all of that. They just get a browser running anywhere, anytime. Really easy to connect to.
swyx [00:15:55]: I'm sure you have questions. My standard question with anything, so essentially you're a serverless browser company, and there's been other serverless things that I'm familiar with in the past, serverless GPUs, serverless website hosting. That's where I come from with Netlify. One question is just like, you promised to spin up thousands of servers. You promised to spin up thousands of browsers in milliseconds. I feel like there's no real solution that does that yet. And I'm just kind of curious how. The only solution I know, which is to kind of keep a kind of warm pool of servers around, which is expensive, but maybe not so expensive because it's just CPUs. So I'm just like, you know. Yeah.
Browsers as a Core Primitive in AI Infrastructure
Paul [00:16:36]: You nailed it, right?
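The "serverless-like" connection model Paul describes maps onto existing tooling: instead of launching Chromium locally, you attach to a browser already running remotely over the Chrome DevTools Protocol. A hedged sketch using Playwright's connectOverCDP; the WebSocket endpoint below is a placeholder, not Browserbase's actual API, and the Google selector may vary:

```typescript
import { chromium } from "playwright";

async function run() {
  // A provider hands you a WebSocket endpoint for a cloud browser (placeholder).
  const wsEndpoint = "wss://browser-provider.example.com/connect?apiKey=YOUR_KEY";

  // Attach over the Chrome DevTools Protocol instead of launching locally.
  const browser = await chromium.connectOverCDP(wsEndpoint);
  const page = await browser.newPage();

  // The same little script that worked on your laptop, unchanged.
  await page.goto("https://www.google.com");
  await page.fill('textarea[name="q"]', "Latent Space"); // selector is an assumption
  await page.keyboard.press("Enter");

  // Disconnect when done; the provider tears the browser down.
  await browser.close();
}

run().catch(console.error);
```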
I mean, how do you offer a serverless-like experience with something that is clearly not serverless, right? And the answer is, you need to be able to run... We run many browsers on single nodes. We use Kubernetes at Browserbase. So we have many pods that are being scheduled. We have to predictably schedule them up or down. Yes, thousands of browsers in milliseconds is the best case scenario. If you hit us with 10,000 requests, you may hit a slower cold start, right? So we've done a lot of work on predictive scaling and being able to kind of route stuff to different regions where we have multiple regions of Browserbase where we have different pools available. You can also pick the region you want to go to based on like lower latency, round-trip time latency. It's very important with these types of things. There's a lot of requests going over the wire. So for us, like having a VM like Firecracker powering everything under the hood allows us to be super nimble and spin things up or down really quickly with strong multi-tenancy. But in the end, this is like the complex infrastructural challenges that we have to kind of deal with at Browserbase. And we have a lot more stuff on our roadmap to allow customers to have more levers to pull to exchange, do you want really fast browser startup times or do you want really low costs? And if you're willing to be more flexible on that, we may be able to kind of like work better for your use cases.
swyx [00:17:44]: Since you used Firecracker, shouldn't Fargate do that for you or did you have to go lower level than that? We had to go lower level than that.
Paul [00:17:51]: I find this a lot with Fargate customers, which is alarming for Fargate. We used to be a giant Fargate customer. Actually, the first version of Browserbase was ECS and Fargate. And unfortunately, it's a great product. I think we were actually the largest Fargate customer in our region for a little while. No, what? Yeah, seriously. And unfortunately, it's a great product, but I think if you're an infrastructure company, you actually have to have a deeper level of control over these primitives. I think the same thing is true with databases. We've used other database providers and I think-
swyx [00:18:21]: Yeah, serverless Postgres.
Paul [00:18:23]: Shocker. When you're an infrastructure company, you're on the hook if any provider has an outage. And I can't tell my customers like, hey, we went down because so-and-so went down. That's not acceptable. So for us, we've really moved to bringing things internally. It's kind of opposite of what we preach. We tell our customers, don't build this in-house, but then we're like, we build a lot of stuff in-house. But I think it just really depends on what is in the critical path. We try and have deep ownership of that.
Alessio [00:18:46]: On the distributed location side, how does that work for the web where you might get sort of different content in different locations, but the customer is expecting, you know, if you're in the US, I'm expecting the US version. But if you're spinning up my browser in France, I might get the French version. Yeah.
Paul [00:19:02]: Yeah. That's a good question. Well, generally, like on the localization, there is a thing called locale in the browser. You can set like what your locale is. If you're like in the en-US browser or not, but some things do IP, IP based routing. And in that case, you may want to have a proxy.
Like let's say you're running something in the, in Europe, but you want to make sure you're showing up from the US. You may want to use one of our proxy features so you can turn on proxies to say like, make sure these connections always come from the United States, which is necessary too, because when you're browsing the web, you're coming from like a, you know, data center IP, and that can make things a lot harder to browse the web. So we do have kind of like this proxy super network. Yeah. We have a proxy for you based on where you're going, so you can reliably automate the web. But if you get scheduled in Europe, that doesn't happen as much. We try and schedule you as close to, you know, your origin that you're trying to go to. But generally you have control over the regions you can put your browsers in. So you can specify West one or East one or Europe. We only have one region of Europe right now, actually. Yeah.
Alessio [00:19:55]: What's harder, the browser or the proxy? I feel like to me, it feels like actually proxying reliably at scale. It's much harder than spinning up browsers at scale. I'm curious. It's all hard.
Paul [00:20:06]: It's layers of hard, right? Yeah. I think it's different levels of hard. I think the thing with the proxy infrastructure is that we work with many different web proxy providers and some are better than others. Some have good days, some have bad days. And our customers who've built browser infrastructure on their own, they have to go and deal with sketchy actors. Like first they figure out their own browser infrastructure and then they got to go buy a proxy. And it's like you can pay in Bitcoin and it just kind of feels a little sus, right? It's like you're buying drugs when you're trying to get a proxy online. We have like deep relationships with these counterparties. We're able to audit them and say, is this proxy being sourced ethically? Like it's not running on someone's TV somewhere. Is it free range? Yeah. Free range organic proxies, right? Right. We do a level of diligence. We're SOC 2. So we have to understand what is going on here. But then we're able to make sure that like we route around proxy providers not working. There's proxy providers who will just, the proxy will stop working all of a sudden. And then if you don't have redundant proxying on your own browsers, that's hard down for you or you may get some serious impacts there. With us, like we intelligently know, hey, this proxy is not working. Let's go to this one. And you can kind of build a network of multiple providers to really guarantee the best uptime for our customers. Yeah. So you don't own any proxies? We don't own any proxies. You're right. The team has been saying who wants to like take home a little proxy server, but not yet. We're not there yet. You know?
swyx [00:21:25]: It's a very mature market. I don't think you should build that yourself. Like you should just be a super customer of them. Yeah. Scraping, I think, is the main use case for that. I guess. Well, that leads us into CAPTCHAs and also auth, but let's talk about CAPTCHAs. You had a little spiel that you wanted to talk about CAPTCHA stuff.
Challenges of Scaling Browser Infrastructure
Paul [00:21:43]: Oh, yeah. I was just, I think a lot of people ask, if you're thinking about proxies, you're thinking about CAPTCHAs too. I think it's the same thing. You can go buy CAPTCHA solvers online, but it's the same buying experience. It's some sketchy website, you have to integrate it.
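The locale and proxy routing discussed above correspond to ordinary browser-automation options. A hedged sketch in Playwright, with a made-up proxy address: the browser reports en-US and egresses through a US proxy regardless of where it is scheduled.

```typescript
import { chromium } from "playwright";

async function run() {
  // Route traffic through a US exit so sites doing IP-based routing
  // see a US visitor even if the browser runs in Europe (placeholder server).
  const browser = await chromium.launch({
    proxy: {
      server: "http://us-proxy.example.com:8080",
      username: "user",
      password: "secret",
    },
  });

  // Locale controls navigator.language and the Accept-Language header.
  const context = await browser.newContext({
    locale: "en-US",
    timezoneId: "America/New_York",
  });

  const page = await context.newPage();
  await page.goto("https://example.com/");
  await browser.close();
}

run().catch(console.error);
```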
It's not fun to buy these things, and you can't really trust them; the docs are bad. What Browserbase does is we integrate a bunch of different CAPTCHAs. We do some stuff in-house, but generally we just integrate with a bunch of known vendors and continually monitor and maintain these things and say, is this working or not? Can we route around it or not? These are CAPTCHA solvers. CAPTCHA solvers, yeah. Not CAPTCHA providers, CAPTCHA solvers. Yeah, sorry. CAPTCHA solvers. We really try and make sure all of that works for you. I think as a dev, if I'm buying infrastructure, I want it all to work all the time and it's important for us to provide that experience by making sure everything does work and monitoring it on our own. Yeah. Right now, the world of CAPTCHAs is tricky. I think AI agents in particular are very much ahead of the internet infrastructure. CAPTCHAs are designed to block all types of bots, but there are now good bots and bad bots. I think in the future, CAPTCHAs will be able to identify who a good bot is, hopefully via some sort of KYC. For us, we've been very lucky. We have very little to no known abuse of Browserbase because we really look into who we work with. And for certain types of CAPTCHA solving, we only allow them on certain types of plans because we want to make sure that we can know what people are doing, what their use cases are. And that's really allowed us to try and be an arbiter of good bots, which is our long term goal. I want to build great relationships with people like Cloudflare so we can agree, hey, here are these acceptable bots. We'll identify them for you and make sure we flag when they come to your website. This is a good bot, you know?
Alessio [00:23:23]: I see. And Cloudflare said they want to do more of this. So they're going to set by default, if they think you're an AI bot, they're going to reject. I'm curious if you think this is something that is going to be at the browser level or I mean, the DNS level with Cloudflare seems more where it should belong. But I'm curious how you think about it.
Paul [00:23:40]: I think the web's going to change. You know, I think that the Internet as we have it right now is going to change. And we all need to just accept that the cat is out of the bag. And instead of kind of like wishing the Internet was like it was in the 2000s, we can have free content online that wouldn't be scraped. It's just it's not going to happen. And instead, we should think about like, one, how can we change? How can we change the models of, you know, information being published online so people can adequately commercialize it? But two, how do we rebuild applications that expect that AI agents are going to log in on their behalf? Those are the things that are going to allow us to kind of like identify good and bad bots. And I think the team at Clerk has been doing a really good job with this on the authentication side. I actually think that auth is the biggest thing that will prevent agents from accessing stuff, not CAPTCHAs. And I think there will be agent auth in the future. I don't know if it's going to happen from an individual company, but actually authentication providers that have a, you know, hidden login as agent feature, which will then you put in your email, you'll get a push notification, say like, hey, your Browserbase agent wants to log into your Airbnb. You can approve that and then the agent can proceed. That really circumvents the need for CAPTCHAs or logging in as you and sharing your password.
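The integrate-many-vendors-and-route-around-failures pattern Paul described for CAPTCHA solvers earlier in this exchange can be sketched generically. The Solver interface, vendor adapters, and timeout are invented for illustration, not Browserbase's implementation:

```typescript
// A provider-agnostic interface; each third-party solver gets an adapter.
interface Solver {
  name: string;
  solve(siteKey: string, pageUrl: string): Promise<string>; // returns a token
}

async function solveWithFailover(
  solvers: Solver[],
  siteKey: string,
  pageUrl: string,
  timeoutMs = 30_000,
): Promise<string> {
  for (const solver of solvers) {
    try {
      // Race each vendor against a timeout; on failure or timeout,
      // fall through to the next vendor instead of failing the session.
      return await Promise.race([
        solver.solve(siteKey, pageUrl),
        new Promise<string>((_, reject) =>
          setTimeout(() => reject(new Error(`${solver.name} timed out`)), timeoutMs),
        ),
      ]);
    } catch (err) {
      console.warn(`solver ${solver.name} failed, trying next:`, err);
    }
  }
  throw new Error("all CAPTCHA solvers failed");
}
```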
I think agent auth is going to be one way we identify good bots going forward. And I think a lot of this CAPTCHA solving stuff is really short-term problems as the internet kind of reorients itself around how it's going to work with agents browsing the web, just like people do. Yeah.
Managing Distributed Browser Locations and Proxies
swyx [00:24:59]: Stitch recently was on Hacker News for talking about agent experience, AX, which is a thing that Netlify is also trying to clone and coin and talk about. And we've talked about this on our previous episodes before in a sense that I actually think that's like maybe the only part of the tech stack that needs to be kind of reinvented for agents. Everything else can stay the same, CLIs, APIs, whatever. But auth, yeah, we need agent auth. And it's mostly like short-lived, like it should not, it should be a distinct identity from the human, but paired. I almost think like in the same way that every social network should have your main profile and then your alt accounts or your Finsta, it's almost like, you know, every, every human token should be paired with the agent token and the agent token can go and do stuff on behalf of the human token, but not be presumed to be the human. Yeah.
Paul [00:25:48]: It's like, it's, it's actually very similar to OAuth is what I'm thinking. And, you know, Thread from Stitch is an investor, Colin from Clerk, Octaventures, all investors in Browserbase because like, I hope they solve this because they'll make Browserbase's mission more possible. So we don't have to overcome all these hurdles, but I think it will be an OAuth-like flow where an agent will ask to log in as you, you'll approve the scopes. Like it can book an apartment on Airbnb, but it can't like message anybody. And then, you know, the agent will have some sort of like role-based access control within an application. Yeah. I'm excited for that.
swyx [00:26:16]: The tricky part is just, there's one, one layer of delegation here, which is like, you're authing as my user's user or something like that. I don't know if that's tricky or not. Does that make sense? Yeah.
Paul [00:26:25]: You know, actually at Twilio, I worked on the login identity and access management teams, right? So like I built Twilio's login page.
swyx [00:26:31]: You were an intern on that team and then you became the lead in two years? Yeah.
Paul [00:26:34]: Yeah. I started as an intern in 2016 and then I was the tech lead of that team. How? That's not normal. I didn't have a life. He's not normal. Look at this guy. I didn't have a girlfriend. I just loved my job. I don't know. I applied to 500 internships for my first job and I got rejected from every single one of them except for Twilio and then eventually Amazon. And they took a shot on me and like, I was getting paid money to write code, which was my dream. Yeah. Yeah. I'm very lucky that like this coding thing worked out because I was going to be doing it regardless. And yeah, I was able to kind of spend a lot of time on a team that was growing at a company that was growing. So it informed a lot of this stuff here. I think these are problems that have been solved with like the SAML protocol with SSO. I think it's a really interesting stuff with like WebAuthn, like these different types of authentication, like schemes that you can use to authenticate people. The tooling is all there. It just needs to be tweaked a little bit to work for agents. And I think the fact that there are companies that are already
providing authentication as a service really sets it up well. The thing that's hard is, like, reinventing the internet for agents. We don't want to rebuild the internet. That's an impossible task. And I think people often say, well, we'll have this second layer of APIs built for agents. I'm like, we will for the top use cases, but for everything else we can just tweak the internet as-is, starting with the authentication side. I think we're going to be the dumb ones going forward. Unfortunately, I think AI is going to be able to do a lot of the tasks that we do online, which means that it will be able to go to websites, click buttons on our behalf, and log in on our behalf too. So with this kind of web agent future happening, I think with some small structural changes, like you said, it feels like it could all slot in really nicely with the existing internet.

Handling CAPTCHAs and Agent Authentication

swyx [00:28:08]: There's one more thing, which is your live view iframe, which lets you take control. Yeah. Obviously very key for Operator now, but, like, is there anything interesting technically there? Or... well, people always want this.

Paul [00:28:21]: It was really hard to build, you know? Like, so, okay. Headless browsers, you don't see them, right? They're running in a cloud somewhere. You can't look at them. And, I mean, it's a weird name. I wish we came up with a better name for this thing, but you can't see them, right? But customers don't trust AI agents, right? At least on the first pass. So what we do with our live view is that, you know, when you use Browserbase, you can actually embed a live view of the browser running in the cloud for your customer to see it working. That's the first reason: to build trust. Like, okay, so I have this script that's going to go automate a website. I can embed it into my web application via an iframe and my customer can watch. And then we added two-way communication. So now not only can you watch the browser being operated by AI, if you want to pause and actually click around and type within this iframe that's controlling a browser, that's also possible. And this is all thanks to some of the lower-level protocol, which is called the Chrome DevTools Protocol. It has an API called startScreencast, and you can also send mouse clicks and button clicks to a remote browser. And this is all embeddable within iframes. You have a browser within a browser, yo. And then you simulate the screen, the click on the other side. Exactly. And this is really nice often for, let's say, a CAPTCHA that can't be solved. You saw this with Operator, you know. Operator actually uses a different approach. They use VNC. So, you know, you're seeing the whole window there. What we're doing is something a little lower level with the Chrome DevTools Protocol. It's just PNGs being streamed over the wire. But the same thing is true, right? Like, hey, I'm running a window. Pause. Can you do something in this window? Human. Okay, great. Resume. Sometimes it's 2FA tokens. Like, if you get that text message, you might need a person to type that in. Web agents need human-in-the-loop type workflows still. You still need a person to interact with the browser. And building a UI to proxy that is kind of hard. You may as well just show them the whole browser and say, hey, can you finish this up for me? And then let the AI proceed afterwards.
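To make the mechanics Paul describes concrete, here is a minimal sketch of a live view over the Chrome DevTools Protocol, assuming Playwright with Chromium. The CDP methods (Page.startScreencast, Page.screencastFrameAck, Input.dispatchMouseEvent) are real protocol calls, but this is not Browserbase's implementation; the relay of frames to an end user's iframe and all error handling are elided.

```typescript
// Minimal sketch: stream a headless browser's viewport as PNG frames over
// CDP, and inject a "human takes control" click by forwarding input events.
import { chromium } from "playwright";

async function main() {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com");

  // CDP sessions are Chromium-only in Playwright.
  const cdp = await page.context().newCDPSession(page);

  // Each screencast frame must be acknowledged or the stream stalls.
  cdp.on("Page.screencastFrame", async ({ data, sessionId }) => {
    // `data` is a base64-encoded PNG you would relay to the client iframe.
    console.log(`frame received: ${data.length} base64 chars`);
    await cdp.send("Page.screencastFrameAck", { sessionId });
  });
  await cdp.send("Page.startScreencast", { format: "png", everyNthFrame: 2 });

  // A human clicking inside the embedded view boils down to forwarding
  // their mouse events to the remote browser:
  await cdp.send("Input.dispatchMouseEvent", {
    type: "mousePressed", x: 100, y: 200, button: "left", clickCount: 1,
  });
  await cdp.send("Input.dispatchMouseEvent", {
    type: "mouseReleased", x: 100, y: 200, button: "left", clickCount: 1,
  });

  await browser.close();
}

main();
```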
Is there a future where I stream my current desktop to Browserbase? I don't think so. I think we're very much cloud infrastructure. Yeah. You know, but I think a lot of the stuff we're doing, we do want to, like, build tools. Like, you know, we'll talk about Stagehand, our web agent framework, in a second. But, like, there's a case where a lot of people are going desktop-first for, you know, consumer use. And I think Claude is doing a lot of this, where I expect to see, you know, MCPs really oriented around the Claude Desktop app for a reason, right? Like, I think a lot of these tools are going to run on your computer because it makes... I think it's breaking out. People are putting it on a server. Oh, really? Okay. Well, sweet. We'll see. I was surprised by that, though. I think that The Browser Company, too, with Dia Browser, it runs on your machine. You know, it's going to be...

swyx [00:30:50]: What is it?

Paul [00:30:51]: So, Dia Browser, as far as I understand... I used to use Arc. Yeah. I haven't used Arc. But I'm a big fan of The Browser Company. I think they're doing a lot of cool stuff in consumer. As far as I understand, it's a browser where you have a sidebar where you can, like, chat with it and it can control the local browser on your machine. So, if you imagine what a consumer web agent is, it lives alongside your browser. I think Google Chrome has Project Mariner, I think. I almost call it Project Marinara for some reason. I don't know why. It's...

swyx [00:31:17]: No, I think it's that someone really likes Waterworld. Oh, I see. The classic Kevin Costner. Yeah.

Paul [00:31:22]: Okay. Project Mariner is a similar thing to the Dia Browser, in my mind, as far as I understand it. You have a browser that has an AI interface that will take over your mouse and keyboard and control the browser for you. Great for consumer use cases. But if you're building applications that rely on a browser and it's more part of a greater, like, AI app experience, you probably need something that's more like infrastructure, not a consumer app.

swyx [00:31:44]: Just because I have explored a little bit in this area, do people want branching? So, I have the state of whatever my browser's in. And then I want, like, 100 clones of this state. Do people do that? Or...

Paul [00:31:56]: People don't do it currently. Yeah. But it's definitely something we're thinking about. I think the idea of forking a browser is really cool. Technically, kind of hard. We're starting to see this in code execution, where people are forking some code execution processes, or forking some tool calls, or branching tool calls. Haven't seen it at the browser level yet. But it makes sense. Like, if an AI agent is using a website and it's not sure what path it wants to take to crawl this website to find the information it's looking for, it would make sense for it to explore both paths in parallel. And that'd be a very, like... A road not taken. Yeah. And hopefully find the right answer. And then say, okay, this was actually the right one. And memorize that. And go there in the future. On the roadmap. For sure. Don't make my roadmap, please. You know?

Alessio [00:32:37]: How do you actually do that? Yeah. How do you fork? I feel like the browser is so stateful for so many things.

swyx [00:32:42]: Serialize the state. Restore the state. I don't know.

Paul [00:32:44]: So, it's one of the reasons why we haven't done it yet. It's hard. You know?
Like, to truly fork, it's actually quite difficult. The naive way is to open the same page in a new tab and then, like, hope that it's in the same state. But if you have a form halfway filled, you may have to take the whole container, pause it, all the memory, duplicate it, restart it from there. It could be very slow. So, we haven't built that yet. Like, the easy thing to fork is just copying the page object, you know? But I think there needs to be something a little bit more robust there. Yeah.

swyx [00:33:12]: So, Morph Labs has this Infinibranch thing. Like, they wrote a custom fork of Linux or something that lets them save the system state and clone it. Morph Labs, hit me up. I'll be a customer. Yeah. I think that's the only way to do it. Yeah. Like, unless Chrome has some special API for you. Yeah.

Paul [00:33:29]: There's probably something we'll reverse engineer one day. I don't know. Yeah.

Alessio [00:33:32]: Let's talk about Stagehand, the AI web browsing framework. You have three core components: Observe, Extract, and Act. Pretty clean landing page. What was the idea behind making a framework? Yeah.

Stagehand: AI web browsing framework

Paul [00:33:43]: So, there are three frameworks that are very popular or already exist, right? Puppeteer, Playwright, Selenium. Those are for building hard-coded scripts to control websites. And as soon as I started to play with LLMs plus browsing, I caught myself, you know, code-genning Playwright code to control a website. I would take the DOM, I'd pass it to an LLM, I'd say, can you generate the Playwright code to click the appropriate button here? And it would do that. And I was like, this really should be part of the frameworks themselves. And I became really obsessed with SDKs that take natural language as part of, like, the API input. And that's what Stagehand is. Stagehand exposes three APIs, and it's a superset of Playwright. So, if you go to a page, you may want to take an action: click on the button, fill in the form, etc. That's what the act command is for. You may want to extract some data. This one takes natural language, like, extract the winner of the Super Bowl from this page. You can give it a Zod schema, so it returns a structured output. And then maybe you're building an agent. You can do an agent loop, and you want to kind of see what actions are possible on this page before taking one. You can do observe. So, you can observe the actions on the page, and it will generate a list of actions. You can guide it, like, give me actions on this page related to buying an item. And it might return, like, buy it now, add to cart, view shipping options, and you can pass that to an LLM, an agent loop, to say, what's the appropriate action given this high-level goal? So, Stagehand isn't a web agent. It's a framework for building web agents. And we think that agent loops are actually pretty close to the application layer, because every application probably has different goals or different ways it wants to take steps. I don't think I've seen a generic one. Maybe you guys are the experts here. I haven't seen, like, a really good AI agent framework here. Everyone kind of has their own special sauce, right? I see a lot of developers building their own agent loops, and they're using tools. And I view Stagehand as the browser tool. So, we expose act, extract, observe. Your agent can call these tools. And from that, you don't have to worry about it. You don't have to worry about generating Playwright code performantly.
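For readers who want to see the three primitives Paul lists, here is a rough sketch of what the act / extract / observe surface looks like, based on Stagehand's public docs; treat the exact signatures as approximate rather than authoritative, and note that the URL and schema below are invented for illustration.

```typescript
// Rough sketch of Stagehand's three primitives: act, extract, observe.
// Signatures follow the public README at the time of writing and may drift.
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

async function main() {
  const stagehand = new Stagehand({ env: "LOCAL" }); // or "BROWSERBASE"
  await stagehand.init();
  const page = stagehand.page; // a superset of a Playwright Page

  await page.goto("https://www.nfl.com/");

  // act: a natural-language action on the page.
  await page.act("click the link to the latest Super Bowl recap");

  // extract: natural language plus a Zod schema in, structured output back.
  const result = await page.extract({
    instruction: "extract the winner of the Super Bowl from this page",
    schema: z.object({ winner: z.string() }),
  });
  console.log(result.winner);

  // observe: list candidate actions for an agent loop to choose from.
  const actions = await page.observe("actions related to buying an item");
  console.log(actions);

  await stagehand.close();
}

main();
```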
You don't have to worry about running it, either. You can kind of just integrate these three tool calls into your agent loop and reliably automate the web.

swyx [00:35:48]: A special shout-out to Anirudh, who I met at your dinner, who I think listens to the pod. Yeah. Hey, Anirudh.

Paul [00:35:54]: Anirudh's the man. He's a Stagehand guy.

swyx [00:35:56]: I mean, the interesting thing about each of these APIs is they're kind of each a startup. Like, specifically extract, you know, Firecrawl is extract. There's, like, Expand AI. There's a whole bunch of extract companies that just focus on extract. I'm curious. Like, I feel like you guys are going to collide at some point. Like, right now, it's friendly. Everyone's in a blue ocean. At some point, it's going to be valuable enough that there's some turf battle here. I don't think you have a dog in this fight. I think you can mock extract to use an external service if they're better at it than you. But it's just an observation that, in the same way that I see each checkbox in the side panel of custom GPTs becoming a startup, or each box in the Karpathy chart being a startup, this is also becoming a thing. Yeah.

Paul [00:36:41]: I mean, so the way Stagehand works is that it's MIT-licensed, completely open source. You bring your own API key to your LLM of choice. You can choose your LLM. We don't make any money off of extract, really. We only really make money if you choose to run it with our browser. You don't have to. You can actually use your own browser, a local browser. You know, Stagehand is completely open source for that reason. And, yeah, like, I think if you're building really complex web scraping workflows, I don't know if Stagehand is the tool for you. I think it's really more if you're building an AI agent that needs a few general tools, or if it's doing a lot of web-automation-intensive work. But if you're building a scraping company, Stagehand is not your thing. You probably want something that's going to get HTML content, you know, convert that to Markdown, query it. That's not what Stagehand does. Stagehand is more about reliability. I think we focus a lot on reliability and less so on cost optimization and speed at this point.

swyx [00:37:33]: I actually feel like Stagehand, so the way that Stagehand works, it's like, you know, page.act, click on the quick start. Yeah. It's kind of the integration test for the code that you would have to write anyway, like the Puppeteer code that you have to write anyway. And when the page structure changes, because it always does, then this is still the test. This is still the test that I would have to write. Yeah. So it's kind of like a testing framework that doesn't need implementation detail.

Paul [00:37:56]: Well, yeah. I mean, Puppeteer, Playwright, and Selenium were all designed as testing frameworks, right? Yeah. And now people are, like, hacking them together to automate the web. I would say, and maybe this is me being too specific, but when I write tests, if the page structure changes without me knowing, I want that test to fail. So I don't know about AI regenerating that. People are using Stagehand for testing, but it's more for usability testing, not testing of, like, whether the front end has changed or not. Okay. But generally where we've seen people really take off is
if they want to build a feature in their application that's kind of like Operator or Deep Research, they're using Stagehand to power that tool calling in their own agent loop. Okay. Cool.

swyx [00:38:37]: So let's go into Operator, the first big agent launch of the year from OpenAI. Seems like they have a whole bunch scheduled. You were on break and your phone blew up. What's your general view of computer use agents, which is what they're calling it? The overall category, before we go into Open Operator, just the overall promise of Operator. I will observe that I tried it once. It was okay. And I never tried it again.

OpenAI's Operator and computer use agents

Paul [00:38:58]: That tracks with my experience, too. Like, I'm a huge fan of the OpenAI team. I do not view Operator as a company killer for Browserbase at all. I think it actually shows people what's possible. I think computer use models make a lot of sense. And what I'm actually most excited about with computer use models is their ability to really take screenshots, reason, and output steps. I think that using mouse coordinates, I've seen that prove to be less reliable than I would like. And I just wonder if that's the right form factor. What we've done with our framework is anchor it to the DOM itself, anchor it to the actual item. So, like, if it's clicking on something, it's clicking on that thing, you know? It's more accurate. No matter where it is. Yeah, exactly. Because it really ties in nicely. And it can handle the whole viewport in one go, whereas Operator can only handle what it sees. Can you hover? Is hovering a thing that you can do? I don't know if we expose it as a tool directly, but I'm sure there's an API for hovering. Like, move mouse to this position. Yeah, yeah, yeah. I think you can trigger hover via the JavaScript on the DOM itself. But, no, I think when we saw computer use, everyone's eyes lit up because they realized, wow, AI is going to actually automate work for people. And I think seeing that kind of happen from both of the labs, and I'm sure we're going to see more labs launch computer use models, I'm excited to see all the stuff that people build with it. I'd love to see computer use power, like, controlling a browser on Browserbase. And I think Open Operator, which was our open source version of OpenAI's Operator, was our first take on how we can integrate these models into Browserbase. We handle the infrastructure and let the labs do the models. I don't have a sense that Operator will be released as an API. I don't know. Maybe it will. I'm curious to see how well that works, because I think it's going to be really hard for a company like OpenAI to do things like support CAPTCHA solving or have proxies. I think it's hard for them structurally. Imagine this New York Times headline: OpenAI solves CAPTCHAs. That would be a pretty bad headline. Now imagine this headline: Browserbase solves CAPTCHAs. No one cares. No one cares. And our investors are bored. Like, we're all okay with this, you know? We're building this company knowing that the CAPTCHA solving is short-lived until we figure out how to authenticate good bots. I think it's really hard for a company like OpenAI, who has this brand that's so, so good, to balance that with the icky parts of web automation, which can be kind of complex to solve.
I'm sure OpenAI knows who to call whenever they need you. Yeah, right. I'm sure they'll have a great partnership.

Alessio [00:41:23]: And is Open Operator just, like, a marketing thing for you? Like, how do you think about resource allocation? So, you can spin this up very quickly. And now there's all this, like, open deep research, just open all these things that people are building. We started it, you know. You're the original Open. We're the original Open Operator, you know? Is it just, hey, look, this is a demo, but we'll help you build out an actual product for yourself? Like, are you interested in going more of a product route? That's kind of the OpenAI way, right? They started as a model provider and then…

Paul [00:41:53]: Yeah, we're not interested in going the product route yet. I view Open Operator as a reference project, you know? Let's show people how to build these things using the infrastructure and models that are out there. And that's what it is. Open Operator is very simple. It's an agent loop. It says, take a high-level goal, break it down into steps, use tool calling to accomplish those steps. It takes screenshots and feeds those screenshots into an LLM with the step to generate the right action. It uses Stagehand under the hood to actually execute this action. It doesn't use a computer use model. And it has a nice interface using the live view that we talked about, the iframe, to embed that into an application. So I felt like people on launch day wanted to figure out how to build their own version of this. And we turned that around really quickly to show them. And I hope we do that with other things like deep research. We don't have a deep research launch yet. I think David from Aomni actually has an amazing open deep research that he launched. It has, like, 10K GitHub stars now. So he's crushing that. But I think if people want to build these features natively into their application, they need good reference projects. And I think Open Operator is a good example of that.

swyx [00:42:52]: I don't know. Actually, I'm pretty bullish on API-driven Operator. Because that's the only way that you can sort of spin this up, once it's reliable enough, obviously. And right now we're nowhere near. But, like, give it five years. It'll happen, you know. And then you can sort of spin this up, and browsers are working in the background, and you don't necessarily have to know. And it just is booking restaurants for you, whatever. I can definitely see that future happening. I had this on the landing page here. This might be slightly out of order. But, you know, you have sort of three use cases for Browserbase. Open Operator, or this is the Operator sort of use case. It's kind of like the workflow automation use case. And it competes with UiPath in the sort of RPA category. Would you agree with that? Yeah, I would agree with that. And then there's agents, which we talked about already. And web scraping, which I imagine would be the bulk of your workload right now, right?

Paul [00:43:40]: No, not at all. I'd say actually, like, the majority is browser automation. We're kind of expensive for web scraping. Like, I think that if you need to do occasional web scraping, or you have to do web scraping that works every single time, you want to use browser automation. Yeah. You want to use Browserbase. But if you're building web scraping workflows, what you should do is have a waterfall.
The first request should be a curl to the website. See if you can get it without even using a browser. And then the second request may be, like, a scraping-specific API. There are, like, a thousand scraping APIs out there that you can use to try and get data. ScrapingBee. ScrapingBee is a great example, right? Yeah. And then, if those two don't work, bring out the heavy hitter. Like, Browserbase will 100% work, right? It will load the page in a real browser, hydrate it. I see.

swyx [00:44:21]: Because a lot of pages don't render without JS.

swyx [00:44:25]: Yeah, exactly.

Paul [00:44:26]: So, I mean, the three big use cases, right? Like, you know, automation, web data collection, and then, you know, if you're building anything agentic that needs, like, a browser tool, you want to use Browserbase.

Alessio [00:44:35]: Is there any use case that you were super surprised by, that people might not even think about? Oh, yeah. Or is it, yeah, anything that you can share? The long tail is crazy. Yeah.

Surprising use cases of Browserbase

Paul [00:44:44]: One of the case studies on our website that I think is the most interesting is this company called Benny. So, the way that it works is if you're on food stamps in the United States, you can actually get rebates if you buy certain things. Yeah. You buy some vegetables. You submit your receipt to the government. They'll give you a little rebate back. Say, hey, thanks for buying vegetables. It's good for you. That process of submitting that receipt is very painful. And the way Benny works is you use their app to take a photo of your receipt, and then Benny will go submit that receipt for you and then deposit the money into your account. That's actually using no AI at all. It's all, like, hard-coded scripts. They maintain the scripts. They've been doing a great job. And they've built this amazing consumer app. But it's an example of all these tedious workflows that people have to do to kind of go about their business. And they're doing it for the sake of their day-to-day lives. And I had never known about, like, food stamp rebates or the complex forms you have to fill out to get them. But the world is powered by millions and millions of tedious forms, visas. You know, Lighthouse is a customer, right? You know, they do the O-1 visa. Millions and millions of forms are taking away humans' time. And I hope that Browserbase can help power software that automates away the web forms that we don't need anymore. Yeah.

swyx [00:45:49]: I mean, I'm very supportive of that. I mean, forms. I do think, like, government itself is a big part of it. I think the government itself should embrace AI more to do more sort of human-friendly form filling. Mm-hmm. But I'm not optimistic. I'm not holding my breath. Yeah. We'll see. Okay. I think I'm about to zoom out. I have a little brief thing on computer use, and then we can talk about founder stuff, which is, I tend to think of developer tooling markets in impossible triangles, where everyone starts in a niche, and then they start to branch out. So I already hinted at a little bit of this, right? We mentioned Morph. We mentioned E2B. We mentioned Firecrawl. And then there's Browserbase. So there's all this stuff of, like, have a serverless virtual computer that you give to an agent and let them do stuff with it. And there are various ways of connecting it to the internet. You can just connect it to a search API, like SerpAPI or whatever; Exa is another one. That's for searching.
You can also have a JSON/Markdown extractor, which is Firecrawl. Or you can have a virtual browser like Browserbase, or you can have a virtual machine like Morph. And then there's also maybe a virtual sort of code environment, like Code Interpreter. So there's just, like, a bunch of different ways to tackle the problem of giving a computer to an agent. And I'm just kind of wondering if you see everyone just happily coexisting in their respective niches, and as a developer, I just go and pick, like, a shopping basket of one of each. Or do you think that eventually people will collide?

Future of browser automation and market competition

Paul [00:47:18]: I think that currently it's not a zero-sum market. Like, I think we're talking about all of knowledge work that people do that can be automated online. All of these, like, trillions of hours that happen online where people are working. And I think that there's so much software to be built that I tend not to think about how these companies will collide. I just try to solve the problem as best as I can and make this specific piece of infrastructure, which I think is an important primitive, the best I possibly can. And yeah, I think there are players that are going to launch, like, over-the-top platforms, you know, agent platforms that have all these tools built in, right? Like, who's building the Rippling for agent tools that has the search tool, the browser tool, the operating system tool, right? There are some. There are some, right? And I think in the end, what I have seen in my time as a developer, and I look at all the favorite tools that I have, is that for tools and primitives with sufficient levels of complexity, you need to have a solution that's really bespoke to that primitive, you know? And I am sufficiently convinced that the browser is complex enough to deserve a primitive. Obviously, I have to say that. I'm the founder of Browserbase, right? I'm talking my book. But, like, maybe I can give you one spicy take against, like, just running a whole OS. When I look at computer use when it first came out, I saw that the majority of use cases for computer use were controlling a browser. And do we really need to run an entire operating system just to control a browser? I don't think so. I don't think that's necessary. You know, Browserbase can run browsers way cheaper than you can if you're running a full-fledged operating system with a GUI. And I think that's just an advantage of the browser. Browsers are little OSs, and you can run them very efficiently if you orchestrate them well. And I think that allows us to offer 90% of the functionality needed in the platform at 10% of the cost of running a full OS. Yeah.

Open Operator: Browserbase's Open-Source Alternative

swyx [00:49:16]: I definitely see the logic in that. There's a Marc Andreessen quote. I don't know if you know this one. Where he basically observed that the browser is turning the operating system into a poorly debugged set of device drivers, because most of the apps have moved from the OS to the browser. So you can just run browsers.

Paul [00:49:31]: There's a place for OSs, too. Like, I think that there are some applications that only run on Windows operating systems.
And Eric from pig.dev, in this upcoming YC batch, or last YC batch, he's building a way to run tons of Windows operating systems for you to control with your agent. And, like, there are some legacy EHR systems that only run on Internet Explorer. Yeah.

Paul [00:49:54]: I think that's it. I think, like, there are use cases for specific operating systems, for specific legacy software. And, like, I'm excited to see what he does with that. I just wanted to give a shout out to the pig.dev website.

swyx [00:50:06]: The pigs jump when you click on them. Yeah. That's great.

Paul [00:50:08]: Eric, he's the former co-founder of banana.dev, too.

swyx [00:50:11]: Oh, that Eric. Yeah. That Eric. Okay. Well, he abandoned bananas for pigs. I hope he doesn't start going around with pigs now.

Alessio [00:50:18]: Like he was going around with bananas. A little toy pig. Yeah. Yeah. I love that. What else are we missing? I think we covered a lot of the Browserbase product history, but what do you wish people asked you? Yeah.

Paul [00:50:29]: I wish people asked me more about, like, what will the future of software look like? Because I think that's really where I've spent a lot of time thinking about why to do Browserbase. Like, for me, starting a company is a means of last resort. You shouldn't start a company unless you absolutely have to. And I remain convinced that the future of software is software that you're going to click a button and it's going to do stuff on your behalf. Right now, with software, you click a button and it maybe calls back an API and computes some numbers. It modifies some text, whatever. But the future of software is software using software. So, I may log into my accounting website for my business, click a button, and it's going to go load up my Gmail, search my emails, find the thing, upload the receipt, and then comment it for me. Right? And it may do it using APIs, maybe a browser. I don't know. I think it's a little bit of both. But that's completely different from how we've built software so far. And that future of software has different infrastructure requirements. It's going to require different UIs. It's going to require different pieces of infrastructure. I think the browser infrastructure is one piece that fits into that, along with all the other categories you mentioned. So, I think it's going to require developers to think differently about how they've built software for, you know
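As a coda to the conversation, here is a minimal sketch of the scraping "waterfall" Paul describes earlier: try a plain HTTP fetch first, then a scraping-specific API, and only fall back to a real (expensive) browser when the cheaper tiers fail. The scraping-API endpoint and SCRAPER_KEY variable below are hypothetical placeholders, not a real vendor's API; substitute whichever service you use.

```typescript
// Minimal sketch of a three-tier scraping waterfall. Requires Node 18+
// (global fetch) and Playwright; the tier-2 endpoint is a placeholder.
import { chromium } from "playwright";

async function fetchHtml(url: string): Promise<string> {
  // Tier 1: plain HTTP request, the cheapest option (the "curl").
  try {
    const res = await fetch(url);
    const html = await res.text();
    if (res.ok && html.length > 0) return html;
  } catch { /* fall through to the next tier */ }

  // Tier 2: a scraping-specific API (hypothetical endpoint and key).
  try {
    const res = await fetch(
      `https://api.example-scraper.com/v1?url=${encodeURIComponent(url)}`,
      { headers: { "x-api-key": process.env.SCRAPER_KEY ?? "" } },
    );
    if (res.ok) return await res.text();
  } catch { /* fall through to the next tier */ }

  // Tier 3: the heavy hitter. A real browser loads and hydrates the page,
  // so JavaScript-rendered content is actually present in the result.
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle" });
    return await page.content();
  } finally {
    await browser.close();
  }
}

fetchHtml("https://example.com").then((html) => console.log(html.length));
```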
TikTok is back on the App Store and the Play Store in the U.S.
Elon Musk's DOGE Website Is Already Getting Hacked
IRS Acquiring Nvidia Supercomputer
Elon's bid for OpenAI is about making the for-profit transition as painful as possible for Altman
Intel has spoken with the Trump administration and TSMC over the past few months about a deal for TSMC to take control of Intel's foundry business
Broadcom Joins TSMC In Considering Deals For Parts of Intel
Arm to start making server CPUs in-house
Thomson Reuters wins the first major US AI copyright ruling against fair use, in a case filed in May 2020 against legal research AI startup Ross Intelligence
Perplexity just made AI research crazy cheap—what that means for the industry
YouTube Surprise: CEO Says TV Overtakes Mobile as "Primary Device" for Viewing
Google Maps now shows the 'Gulf of America'
Scarlett Johansson Urges Government to Limit A.I. After Faked Video of Her Opposing Kanye West Goes Viral
Google CEO Sees 'Useful' Quantum Computers 5 to 10 Years Away
Trump says he has directed US Treasury to stop minting new pennies, citing rising cost
Nearly 10 years after Data and Goliath, Bruce Schneier says: Privacy's still screwed
Amazon's revamped Alexa might launch over a month after its announcement event
Meta's Brain-to-Text AI
Host: Leo Laporte
Guests: Wesley Faulkner, Iain Thomson, and Brian McCullough
Download or subscribe to This Week in Tech at https://twit.tv/shows/this-week-in-tech
Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit
Sponsors: shopify.com/twit oracle.com/twit zscaler.com/security ziprecruiter.com/twit joindeleteme.com/twit promo code TWIT
As Windows 10 approaches its end of life on October 14, 2025, businesses here in Ireland face a crucial transition to Windows 11. This shift is more than a software update; it's a strategic opportunity to leverage advanced AI-enabled devices that future-proof IT operations. Windows 11 offers more than compliance with current software - it's a chance to embrace high-performing, secure systems that set the stage for long-term competitiveness. Moreover, with the Government aiming to ensure that 75% of Irish businesses embrace AI by 2030, employees will increasingly rely on high-performance hardware and advanced processing power in their devices to embrace the AI opportunity that lies ahead. By embracing AI PCs, organisations can stay competitive and achieve key business goals, all while enhancing operational efficiency, security, and employee productivity. Why act now? October is fast approaching, and transitioning to a new operating system across an enterprise is complex. Businesses that remain tethered to legacy systems face critical risks, including increased cybersecurity vulnerabilities, non-compliance, compatibility issues, and reduced operational efficiency. For example, as older platforms lose access to security updates, sensitive company data becomes more vulnerable to breaches, phishing attacks, and ransomware. Here are key considerations for businesses embarking on a PC refresh as part of their transition to Windows 11:

Assess Current Hardware and Software Compatibility
Before making any changes, businesses should start with an assessment of their existing PC fleet. One crucial aspect to note is the updated hardware requirement for Windows 11: many older devices do not meet the requirements to upgrade. The first step in the transition process is therefore to understand your client hardware estate; planning ahead for upgrades or replacements will help avoid downtime and ensure a seamless transition. Businesses can stay on Windows 10, but Microsoft will start charging for extended security updates from October 2025. All Windows 10 devices are eligible, and a licence for the Extended Security Updates (ESU) program is sold as a subscription per device. That is an expensive outlay for an older device that gains no new features or functionality. Planning now will help determine which devices or systems can be upgraded and where new investments are required.

Embrace AI-Optimised Hardware
A critical component of future-proofing your business lies in adopting AI-optimised hardware. AI PCs powered by neural processing units (NPUs) are built with AI accelerators that work in tandem with existing CPUs and GPUs. AI PCs can empower users to accomplish more, faster: from streamlining workflows to enhancing video conferencing and integrating advanced AI features into creative processes. For example, Dell's latest line of AI-powered PCs is designed to leverage the features of Windows 11, like Copilot, your AI assistant, and Dell's latest Copilot+ PCs offer unique on-device AI experiences, like real-time adaptability. Dell AI PCs offer intelligent performance, longer battery life, and cooler, lighter laptops. Coupled with Microsoft's AI and cloud tools, this collaboration aims to redefine how businesses view workforce empowerment and operational ingenuity.

Prioritise Security with Modern PCs
Findings from Dell's Innovation Catalyst Study reveal that 83% of organisations have been impacted by security attacks in the past 12 months, mainly from malware, phishing, and data breaches.
Built-in PC security can de-risk organisations, yet just 4 in 10 organisations surveyed strongly agree that they emphasise buying technologies or applications with security built into them. Therein lies a major opportunity. The security features in Windows 11 are best utilised on newer hardware. With Dell commercial devices, your workloads and data are protected wherever your employees work. Secure design principles and robust supply chain...
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Reminder: 7-Zip and MoW. The Mark-of-the-Web (MoW) must be added to any files extracted from ZIP or other compound file formats; 7-Zip does not do so by default unless you alter the default configuration. https://isc.sans.edu/diary/Reminder%3A%207-Zip%20%26%20MoW/31668
Apple Fixes 0-Day. Apple released updates to iOS and iPadOS fixing a bypass for USB Restricted Mode. The vulnerability is already being exploited. https://support.apple.com/en-us/122174
AMD Zen CPU Microcode Update. An attacker is able to replace microcode on some AMD CPUs. This may alter how the CPUs function, and Google released a PoC showing how it can be used to manipulate the random number generator. https://github.com/google/security-research/security/advisories/GHSA-4xq7-4mgh-gp6w
Trimble Cityworks Exploited. CISA added a recent Trimble Cityworks vulnerability to its list of exploited vulnerabilities. https://learn.assetlifecycle.trimble.com/i/1532182-cityworks-customer-communication-2025-02-06-docx/0?
Google Tag Manager Skimmer Steals Credit Card Info. Sucuri released a blog post with updates on the Magecart campaign. The latest version injects malicious code as part of the Google Tag Manager / Analytics code. https://blog.sucuri.net/2025/02/google-tag-manager-skimmer-steals-credit-card-info-from-magento-site.html
Opinion Leader and Influencer (business inquiries: spartandrak@gmail.com). Digital creator dedicated for more than a decade to selling the best computer equipment in Latin America. I am a tester, researcher, critic, and recommender of high-performance hardware. My mission is to push new technology to its limits so that you don't have to, and so you can invest your precious money well. If it's hardware or software that pushes human limits... I have already tested, analyzed, and challenged it. My vision is to foresee what is coming, and also to understand how the nature of human beings and their existence is continually evolving, and how we can positively influence that evolution through technology and innovation. Futurist technologist, speaker, and technology communicator. Certified by Harvard University in social and workplace adaptations of Artificial Intelligence. Join the community and get your questions answered by experts: Official Social Media: https://linktr.ee/DrakSpartanOficial Video date: [10-02-2025] #flush #intel #amd #bandainamco
Immich provides a consistent photo experience across devices, and across family members. This week Noah and Steve give you a deep dive into Immich. We answer your questions, and give you a brief look at news from this week.

-- During The Show --
00:52 Steve's review process: Two different review styles; It's like a BBQ grill
02:30 OpenShift In Labs - Ole: Fine for home use; Learn how to rebuild the cluster; Treat it like an enterprise product; OpenShift can be heavy
10:00 Continuing Computerized Car Convo - James: Remote car takeovers are a problem; Auto makers are more aware now; Cars are drive-by-wire now
13:56 Episode 414 - Max: Nano is for gentlemen, Vim is for ninjas; Automation shortcuts on iOS; Things can be hit or miss on iOS
18:30 News Wire: GParted Live 1.7.0 - gparted.org (https://gparted.org/news.php?item=257) Binutils 2.44 - lists.gnu.org (https://lists.gnu.org/archive/html/info-gnu/2025-02/msg00001.html) Thunderbird 134.0 - thunderbird.net (https://www.thunderbird.net/en-US/thunderbird/134.0/releasenotes/) ArchBang 2701 - sourceforge.net (https://sourceforge.net/projects/archbang/files/ArchBang/archbang-2701-x86_64.iso/) KaOS 2025.01 - kaosx.us (https://kaosx.us/news/2025/kaos01/) Parrot 6.3 - parrotsec.org (https://www.parrotsec.org/blog/2025-01-31-parrot-6.3-release-notes/) Nitrux 3.9 - nxos.org (https://nxos.org/psa/psa-nitrux-3-9-0/) Ai2 Tulu 3 405B LLM - venturebeat.com (https://venturebeat.com/ai/ai2-releases-tulu-3-a-fully-open-source-model-that-bests-deepseek-v3-gpt-4o-with-novel-post-training-approach/) EU Investing 56 Million - bestofai.com (https://bestofai.com/article/the-eu-is-betting-56-million-on-open-source-ai) Small 3 - zdnet.com (https://www.zdnet.com/article/mistral-ai-says-its-small-3-model-is-a-local-open-source-alternative-to-gpt-4o-mini/) Oumi - venturebeat.com (https://venturebeat.com/ai/ex-google-apple-engineers-launch-unconditionally-open-source-oumi-ai-platform-that-could-help-to-build-the-next-deepseek/)
20:18 Immich Review: Story behind Immich; Why Steve deployed Immich; Been in use for 3 months; Does what the tin says; Min of 2 CPUs and 4GB RAM; Can be resource intensive; Using containers for everything; Critical path/knowing how things work; Official install is via docker compose env file; Documentation is pretty good; Impressed by Immich's development and progress; Interface is great; Great for finding pictures; Global hotspot map; Tagging; Facial and object recognition; How Immich puts files on the file system; Metadata isn't shared between users; De-duplication and stacking; Partner sharing; Highlights old photos; Feature rich compared to Google and Apple photos; Strong emphasis on privacy; Good mobile app; No E2EE; Wouldn't expose this to the Internet; Hasn't overcome muscle memory; Developers don't consider it "production ready"; Backing up; Exit strategy

-- The Extra Credit Section --
For links to the articles and material referenced in this week's episode check out this week's page from our podcast dashboard! This Episode's Podcast Dashboard (http://podcast.asknoahshow.com/426) Phone Systems for Ask Noah provided by Voxtelesys (http://www.voxtelesys.com/asknoah) Join us in our dedicated chatroom #GeekLab:linuxdelta.com on Matrix (https://element.linuxdelta.com/#/room/#geeklab:linuxdelta.com)

-- Stay In Touch --
Find all the resources for this show on the Ask Noah Dashboard Ask Noah Dashboard (http://www.asknoahshow.com) Need more help than a radio show can offer?
Altispeed provides commercial IT services and they're excited to offer you a great deal for listening to the Ask Noah Show. Call today and ask about the discount for listeners of the Ask Noah Show! Altispeed Technologies (http://www.altispeed.com/) Contact Noah live [at] asknoahshow.com -- Twitter -- Noah - Kernellinux (https://twitter.com/kernellinux) Ask Noah Show (https://twitter.com/asknoahshow) Altispeed Technologies (https://twitter.com/altispeed)
On this episode of Chit Chat Stocks, we are joined by Krzysztof and Luke from the Wall Street Wildlife Podcast to discuss Nvidia. We discuss: (04:21) The AI Paradigm Shift and Nvidia's Role (07:28) Understanding Nvidia's Market Dominance (10:20) Competitive Landscape: Who's Challenging Nvidia? (13:27) The Importance of Nvidia's Culture and Innovation (16:39) The Role of TSMC in Nvidia's Success (19:24) Future Demand and Market Potential for AI (22:22) Conclusion: Nvidia's Position in the AI Landscape (35:03) The Accelerating Power of AI (37:04) Nvidia's Blackwell System and Market Demand (40:40) Long-Term Vision vs. Short-Term Goals (47:16) The Shift from CPUs to GPUs (50:05) Understanding Nvidia's Moat (54:19) Nvidia's Unique Culture and Work Ethic (58:15) Succession Planning and Future Risks (01:01:43) Potential Risks and Market Cyclicality Listen to Wall Street Wildlife Podcast on YouTube, Spotify, and Apple Podcasts: https://open.spotify.com/show/27AqyYq2a8KOwse0KyO8iz ***************************************************** JOIN OUR FREE CHAT COMMUNITY: https://chitchatstocks.substack.com/ ********************************************************************* Sign-up for a bond account at Public.com/chitchatstocks A Bond Account is a self-directed brokerage account with Public Investing, member FINRA/SIPC. Deposits into this account are used to purchase 10 investment-grade and high-yield bonds. As of 9/26/24, the average, annualized yield to worst (YTW) across the Bond Account is greater than 6%. A bond's yield is a function of its market price, which can fluctuate; therefore, a bond's YTW is not “locked in” until the bond is purchased, and your yield at time of purchase may be different from the yield shown here. The “locked in” YTW is not guaranteed; you may receive less than the YTW of the bonds in the Bond Account if you sell any of the bonds before maturity or if the issuer defaults on the bond. Public Investing charges a markup on each bond trade. See our Fee Schedule. Bond Accounts are not recommendations of individual bonds or default allocations. The bonds in the Bond Account have not been selected based on your needs or risk profile. See https://public.com/disclosures/bond-account to learn more. ********************************************************************* FinChat.io is The Complete Stock Research Platform for fundamental investors. With its beautiful design and institutional-quality data, FinChat is incredibly powerful and easy to use. Use our LINK and get 15% off any premium plan: finchat.io/chitchat ********************************************************************* Sign up for YellowBrick Investing to track the best investing pitches across the internet: joinyellowbrick.com/chitchat ********************************************************************* Bluechippers Club is a tight-knit community of stock focused investors. Members share ideas, participate in weekly calls, and compete in portfolio competitions. To join, go to bluechippersclub.com and hit apply! ********************************************************************* Disclosure: Chit Chat Stocks hosts and guests are not financial advisors, and nothing they say on this show is formal advice or a recommendation.
Dylan Patel is the founder of SemiAnalysis, a research & analysis company specializing in semiconductors, GPUs, CPUs, and AI hardware. Nathan Lambert is a research scientist at the Allen Institute for AI (Ai2) and the author of a blog on AI called Interconnects. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep459-sc See below for timestamps, and to give feedback, submit questions, contact Lex, etc. CONTACT LEX: Feedback - give feedback to Lex: https://lexfridman.com/survey AMA - submit questions, videos or call-in: https://lexfridman.com/ama Hiring - join our team: https://lexfridman.com/hiring Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS: Dylan's X: https://x.com/dylan522p SemiAnalysis: https://semianalysis.com/ Nathan's X: https://x.com/natolambert Nathan's Blog: https://www.interconnects.ai/ Nathan's Podcast: https://www.interconnects.ai/podcast Nathan's Website: https://www.natolambert.com/ Nathan's YouTube: https://youtube.com/@natolambert Nathan's Book: https://rlhfbook.com/ SPONSORS: To support this podcast, check out our sponsors & get discounts: Invideo AI: AI video generator. Go to https://invideo.io/i/lexpod GitHub: Developer platform and AI code editor. Go to https://gh.io/copilot Shopify: Sell stuff online. Go to https://shopify.com/lex NetSuite: Business management software. Go to http://netsuite.com/lex AG1: All-in-one daily nutrition drinks. Go to https://drinkag1.com/lex OUTLINE: (00:00) - Introduction (13:28) - DeepSeek-R1 and DeepSeek-V3 (35:02) - Low cost of training (1:01:19) - DeepSeek compute cluster (1:08:52) - Export controls on GPUs to China (1:19:10) - AGI timeline (1:28:35) - China's manufacturing capacity (1:36:30) - Cold war with China (1:41:00) - TSMC and Taiwan (2:04:38) - Best GPUs for AI (2:19:30) - Why DeepSeek is so cheap (2:32:49) - Espionage (2:41:52) - Censorship (2:54:46) - Andrej Karpathy and magic of RL (3:05:17) - OpenAI o3-mini vs DeepSeek r1 (3:24:25) - NVIDIA (3:28:53) - GPU smuggling (3:35:30) - DeepSeek training on OpenAI data (3:45:59) - AI megaclusters (4:21:21) - Who wins the race to AGI? (4:31:34) - AI agents (4:40:16) - Programming and AI (4:47:43) - Open source (4:56:55) - Stargate (5:04:24) - Future of AI PODCAST LINKS: - Podcast Website: https://lexfridman.com/podcast - Apple Podcasts: https://apple.co/2lwqZIr - Spotify: https://spoti.fi/2nEwCF8 - RSS: https://lexfridman.com/feed/podcast/ - Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 - Clips Channel: https://www.youtube.com/lexclips
Questions! The time to answer them is here again, and this month we do our best with such topics as the relative scarcity of nuclear energy, nested comment systems, USB thumb drives versus portable SSDs, browser RAM usage, why CPUs get faster from one model to the next, the difficulty of naming operating systems, phones without camera bumps, learning to read an analog clock (and a lot of other things), and when we'll finally get around to reviewing that high-tech toilet. Submit ideas about secret information encoding in the world around us for an upcoming episode: https://forms.gle/VYgL9gLeSBKkNtfy9 Support the Pod! Contribute to the Tech Pod Patreon and get access to our booming Discord, a monthly bonus episode, your name in the credits, and other great benefits! You can support the show at: https://patreon.com/techpod
A day after falling to a new 52-week low, AMD is downgraded to hold from buy at Melius Research. Jenny Horne shares why the analyst thinks the stock could be in trouble regardless of the DeepSeek news as Nvidia looks to take market share in CPUs for accelerated PCs. ======== Schwab Network ======== Empowering every investor and trader, every market day. Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185 Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7 Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore Watch on DistroTV - https://www.distro.tv/live/schwab-network/ Follow us on X – https://twitter.com/schwabnetwork Follow us on Facebook – https://www.facebook.com/schwabnetwork Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/ About Schwab Network - https://schwabnetwork.com/about
Scaling and distributed computation: are more CPUs really always faster? Imagine you're a software developer and everyone is talking about scaling and distributed systems. But how efficient are these systems really? Does more compute power automatically mean faster results? In this episode we take a look at a scientific paper that claims to critically question the true performance of distributed systems. We discuss at what point it pays off to throw more resources at a problem, and what the mysterious metric COST (spelled out: Configuration that Outperforms a Single Thread) is all about. Tune in if you want to know whether single-threaded algorithms are often the better choice. Bonus: possibly not all scientists do such scientific work. You can find our current advertising partners at https://engineeringkiosk.dev/partners Quick feedback on the episode:
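To get a feel for what COST measures, here is a minimal Python sketch; the workload and every timing in it are hypothetical. The idea: COST is the first configuration (core count) at which the scaled-out system actually beats a competent single-threaded baseline.

```python
# Minimal sketch of the COST idea: find the first core count at which a
# parallel/distributed run beats a tuned single-threaded baseline.
# All timings below are made up for illustration.

def cost(single_thread_secs: float, secs_by_cores: dict[int, float]) -> int | None:
    """Return the smallest core count that outperforms a single thread,
    or None if the system never does (its COST is then 'unbounded')."""
    for cores in sorted(secs_by_cores):
        if secs_by_cores[cores] < single_thread_secs:
            return cores
    return None

single_thread = 300.0  # seconds for a competent single-threaded implementation
cluster_runs = {8: 1500.0, 64: 480.0, 128: 260.0, 512: 95.0}

print(cost(single_thread, cluster_runs))  # -> 128 cores before scaling pays off
```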
Shahin Khan, co-founder of OrionX, joins Yuval Boger to explore the intersection of quantum computing and high-performance computing (HPC). Shahin discusses why HPC is a natural early adopter of quantum technologies, the role of QPUs alongside GPUs and CPUs, and how quantum computing aligns with global megatrends. They delve into scaling challenges, the potential for quantum to revolutionize tensor-based problems, and the broader implications of quantum on energy efficiency and scientific discovery. Shahin also reflects on the industry's progress, the importance of rational exuberance, and the need to set realistic expectations while maintaining excitement about quantum's transformative potential.
Generative AI is a data-driven story with significant infrastructure and operational implications, particularly around the rising demand for GPUs, which are better suited for AI workloads than CPUs. In an episode of The New Stack Makers recorded at KubeCon + CloudNativeCon North America, Sudha Raghavan, SVP for Developer Platform at Oracle Cloud Infrastructure, discussed how AI's rapid adoption has reshaped infrastructure needs. The release of ChatGPT triggered a surge in GPU demand, with organizations requiring GPUs for tasks ranging from testing workloads to training large language models across massive GPU clusters. These workloads run continuously at peak power, posing challenges such as high hardware failure rates and energy consumption. Oracle is addressing these issues by building GPU superclusters and enhancing Kubernetes functionality. Tools like Oracle's Node Manager simplify interactions between Kubernetes and GPUs, providing tailored observability while maintaining Kubernetes' user-friendly experience. Raghavan emphasized the importance of stateful job management and infrastructure innovations to meet the demands of modern AI workloads. Learn more from The New Stack about how Oracle is addressing the GPU demand for AI workloads with its GPU superclusters and enhancing Kubernetes functionality: Oracle Code Assist, Java-Optimized, Now in Beta; Oracle's Code Assist: Fashionably Late to the GenAI Party; Oracle Unveils Java 23: Simplicity Meets Enterprise Power. Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
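As a concrete aside on how GPU demand shows up at the Kubernetes layer (this is the stock Kubernetes mechanism, not Oracle's Node Manager, and every name below is a made-up placeholder): pods request GPUs as an extended resource advertised by each node's device plugin.

```python
# Sketch: asking Kubernetes for GPUs with the official Python client.
# Pod name, namespace, and image are hypothetical placeholders.
from kubernetes import client, config

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-train-worker"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="example.com/trainer:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # GPUs are an "extended resource": the NVIDIA device
                    # plugin on each node advertises nvidia.com/gpu capacity.
                    limits={"nvidia.com/gpu": "8"},
                ),
            )
        ],
    ),
)

config.load_kube_config()  # or load_incluster_config() inside a cluster
client.CoreV1Api().create_namespaced_pod(namespace="ml", body=pod)
```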
Today we are talking all about STORAGE! Well - at least some of it - we are talking about HDDs: Hard Disk Drive storage. We also talk a little bit about the hierarchy of storage and volatile and non-volatile storage! Make sure to check out the previous part of this episode, about CPUs and GPUs, if you missed it, because we will be putting all of this together at the end in an ultimate comparison of Mac v/s Windows =) Here's the hierarchy I mentioned in the podcast: Stage A - Cache Storage [THE FASTEST - TINIEST STORAGE] Stage B - Cache Storage [VERY VERY FAST - TINY STORAGE] Stage C - Cache Storage [VERY FAST - SMALL STORAGE] Stage D - RAM [FAST - BIGGER STORAGE] Stage E - HDD/SSD [STILL FAST, BUT NOT AS MUCH - TONS OF STORAGE] All of these are fast enough that you may not notice whether a data packet is stored in Stage A or Stage D, but each stage is still faster than the one below it! (The differences range from milliseconds to seconds) Sources: https://www.youtube.com/watch?v=wteUW2sL7bc&t=3s https://www.youtube.com/watch?v=wtdnatmVdIg&t=3s These sources also overlap with the other episodes' sources.
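If you want to feel the hierarchy for yourself, here is a rough Python sketch comparing a pass over data already sitting in RAM with a read from disk. Absolute numbers vary wildly by machine (and the OS page cache can hide the disk entirely); only the ordering is the point.

```python
# Rough illustration of the storage hierarchy: RAM access vs. disk read.
# Numbers differ on every machine; the ordering is what matters.
import os
import time

payload = os.urandom(100 * 1024 * 1024)      # 100 MB held in RAM

with open("probe.bin", "wb") as f:           # put a copy on disk
    f.write(payload)

start = time.perf_counter()
checksum = sum(payload[::4096])              # touch pages already in RAM
ram_ms = (time.perf_counter() - start) * 1000

start = time.perf_counter()
with open("probe.bin", "rb") as f:           # note: may hit the OS page cache
    data = f.read()
disk_ms = (time.perf_counter() - start) * 1000

print(f"RAM pass: {ram_ms:.1f} ms, disk read: {disk_ms:.1f} ms")
os.remove("probe.bin")
```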
Applications for the NYC AI Engineer Summit, focused on Agents at Work, are open! When we first started Latent Space, in the lightning round we'd always ask guests: “What's your favorite AI product?”. The majority would say Midjourney. The simple UI of prompt → very aesthetic image turned it into a $300M+ ARR bootstrapped business as it rode the first wave of AI image generation. In open source land, StableDiffusion was congregating around AUTOMATIC1111 as the de-facto web UI. Unlike Midjourney, which offered some flags but was mostly prompt-driven, A1111 let users play with a lot more parameters, supported additional modalities like img2img, and allowed users to load in custom models. If you're interested in some of the SD history, you can look at our episodes with Lexica, Replicate, and Playground. One of the people involved with that community was comfyanonymous, who was also part of the Stability team in 2023 and decided to build an alternative called ComfyUI, now one of the fastest growing open source projects in generative images and the preferred partner for folks like Black Forest Labs's Flux Tools on Day 1. The idea behind it was simple: “Everyone is trying to make easy to use interfaces. Let me try to make a powerful interface that's not easy to use.” Unlike its predecessors, ComfyUI does not have an input text box. Everything is based around the idea of a node: there's a text input node, a CLIP node, a checkpoint loader node, a KSampler node, a VAE node, etc. While daunting for simple image generation, the tool is amazing for more complex workflows since you can break down every step of the process, and then chain many of them together rather than manually switching between tools. You can also re-start execution halfway instead of from the beginning, which can save a lot of time when using larger models. To give you an idea of some of the new use cases that this type of UI enables: * Sketch something → Generate an image with SD from sketch → feed it into SD Video to animate * Generate an image of an object → Turn into a 3D asset → Feed into interactive experiences * Input audio → Generate audio-reactive videos Their Examples page also includes some of the more common use cases like AnimateDiff, etc. They recently launched the Comfy Registry, an online library of different nodes that users can pull from rather than having to build everything from scratch. The project has >60,000 Github stars, and as the community grows, some of the projects that people build have gotten quite complex: The most interesting thing about Comfy is that it's not a UI, it's a runtime. You can build full applications on top of image models simply by using Comfy. You can expose Comfy workflows as an endpoint and chain them together just like you chain a single node. We're seeing the rise of AI Engineering applied to art. Major Tom's ComfyUI Resources from the Latent Space Discord: Major shoutouts to Major Tom on the LS Discord, who is an image generation expert and offered these pointers: * “best thing about comfy is the fact it supports almost immediately every new thing that comes out - unlike A1111 or forge, which still don't support flux cnet for instance. 
It will be perfect tool when conflicting nodes will be resolved” * AP Workflows from Alessandro Perili are a nice example of an all-in-one train-evaluate-generate system built atop Comfy * ComfyUI YouTubers to learn from: * @sebastiankamph * @NerdyRodent * @OlivioSarikas * @sedetweiler * @pixaroma * ComfyUI Nodes to check out: * https://github.com/kijai/ComfyUI-IC-Light * https://github.com/MrForExample/ComfyUI-3D-Pack * https://github.com/PowerHouseMan/ComfyUI-AdvancedLivePortrait * https://github.com/pydn/ComfyUI-to-Python-Extension * https://github.com/THtianhao/ComfyUI-Portrait-Maker * https://github.com/ssitu/ComfyUI_NestedNodeBuilder * https://github.com/longgui0318/comfyui-magic-clothing * https://github.com/atmaranto/ComfyUI-SaveAsScript * https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID * https://github.com/AIFSH/ComfyUI-FishSpeech * https://github.com/coolzilj/ComfyUI-Photopea * https://github.com/lks-ai/anynode * Sarav: https://www.youtube.com/@mickmumpitz/videos (applied stuff) * Sarav: https://www.youtube.com/@latentvision (technical, but infrequent) * look for comfyui node for https://github.com/magic-quill/MagicQuill * “Comfy for Video” resources * Kijai (https://github.com/kijai) pushing out support for Mochi, CogVideoX, AnimateDiff, LivePortrait etc * Comfyui node support like LTX https://github.com/Lightricks/ComfyUI-LTXVideo , and HunyuanVideo * FloraFauna AI * Communities: https://www.reddit.com/r/StableDiffusion/, https://www.reddit.com/r/comfyui/ Full YouTube Episode As usual, you can find the full video episode on our YouTube (and don't forget to like and subscribe!) Timestamps * 00:00:04 Introduction of hosts and anonymous guest * 00:00:35 Origins of Comfy UI and early Stable Diffusion landscape * 00:02:58 Comfy's background and development of high-res fix * 00:05:37 Area conditioning and compositing in image generation * 00:07:20 Discussion on different AI image models (SD, Flux, etc.) * 00:11:10 Closed source model APIs and community discussions on SD versions * 00:14:41 LoRAs and textual inversion in image generation * 00:18:43 Evaluation methods in the Comfy community * 00:20:05 CLIP models and text encoders in image generation * 00:23:05 Prompt weighting and negative prompting * 00:26:22 Comfy UI's unique features and design choices * 00:31:00 Memory management in Comfy UI * 00:33:50 GPU market share and compatibility issues * 00:35:40 Node design and parameter settings in Comfy UI * 00:38:44 Custom nodes and community contributions * 00:41:40 Video generation models and capabilities * 00:44:47 Comfy UI's development timeline and rise to popularity * 00:48:13 Current state of Comfy UI team and future plans * 00:50:11 Discussion on other Comfy startups and potential text generation support Transcript Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hey everyone, we are in the Chroma Studio again, but with our first ever anonymous guest, Comfy Anonymous, welcome.Comfy [00:00:19]: Hello.swyx [00:00:21]: I feel like that's your full name, you just go by Comfy, right?Comfy [00:00:24]: Yeah, well, a lot of people just call me Comfy, even when they know my real name. Hey, Comfy.Alessio [00:00:32]: Swyx is the same. You know, not a lot of people call you Shawn.swyx [00:00:35]: Yeah, you have a professional name, right, that people know you by, and then you have a legal name. Yeah, it's fine. How do I phrase this? 
I think people who are in the know, know that Comfy is like the tool for image generation and now other multimodality stuff. I would say that when I first got started with Stable Diffusion, the star of the show was Automatic 111, right? And I actually looked back at my notes from 2022-ish, like Comfy was already getting started back then, but it was kind of like the up and comer, and your main feature was the flowchart. Can you just kind of rewind to that moment, that year and like, you know, how you looked at the landscape there and decided to start Comfy?Comfy [00:01:10]: Yeah, I discovered Stable Diffusion in 2022, in October 2022. And, well, I kind of started playing around with it. Yes, I, and back then I was using Automatic, which was what everyone was using back then. And so I started with that because I had, it was when I started, I had no idea like how Diffusion works. I didn't know how Diffusion models work, how any of this works, so.swyx [00:01:36]: Oh, yeah. What was your prior background as an engineer?Comfy [00:01:39]: Just a software engineer. Yeah. Boring software engineer.swyx [00:01:44]: But like any, any image stuff, any orchestration, distributed systems, GPUs?Comfy [00:01:49]: No, I was doing basically nothing interesting. CRUD, web development? Yeah, a lot of web development, just, yeah, some basic, maybe some basic like automation stuff. Okay. Just. Yeah, no, like, no big companies or anything.swyx [00:02:08]: Yeah, but like already some interest in automations, probably a lot of Python.Comfy [00:02:12]: Yeah, yeah, of course, Python. But I wasn't actually used to like the Node graph interface before I started Comfy UI. It was just, I just thought it was like, oh, like, what's the best way to represent the Diffusion process in the user interface? And then like, oh, well. Well, like, naturally, oh, this is the best way I've found. And this was like with the Node interface. So how I got started was, yeah, so, basically, October 2022, just like I hadn't written a line of PyTorch before that. So it's completely new. What happened was I kind of got addicted to generating images.Alessio [00:02:58]: As we all did. Yeah.Comfy [00:03:00]: And then I started. I started experimenting with like the high-res fix in auto, which was, for those that don't know, the high-res fix is just since the Diffusion models back then could only generate that low-resolution. So what you would do, you would generate low-resolution image, then upscale, then refine it again. And that was kind of the hack to generate high-resolution images. I really liked generating. Like higher resolution images. So I was experimenting with that. And so I modified the code a bit. Okay. What happens if I, if I use different samplers on the second pass, I edited the code of auto. So what happens if I use a different sampler? What happens if I use a different, like a different settings, different number of steps? And because back then the. The high-res fix was very basic, just, so. Yeah.swyx [00:04:05]: Now there's a whole library of just, uh, the upsamplers.Comfy [00:04:08]: I think, I think they added a bunch of, uh, of options to the high-res fix since, uh, since, since then. But before that was just so basic. So I wanted to go further. I wanted to try it. What happens if I use a different model for the second, the second pass? And then, well, then the auto code base was, wasn't good enough for. Like, it would have been, uh, harder to implement that in the auto interface than to create my own interface.
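An editorial aside for readers following along: the two-pass high-res fix Comfy is describing can be sketched with Hugging Face diffusers. The model ID, sizes, and strength below are illustrative, and this is not what auto (A1111) did internally; it is just the shape of the trick: sample small, upscale, refine with img2img.

```python
# Sketch of the "high-res fix": low-res sample -> upscale -> img2img refine.
# Model ID and settings are illustrative, not A1111's or ComfyUI's code.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"
prompt = "a mountain landscape at dawn, highly detailed"

txt2img = StableDiffusionPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16).to("cuda")
low_res = txt2img(prompt, height=512, width=512,
                  num_inference_steps=25).images[0]

upscaled = low_res.resize((1024, 1024))  # real pipelines often use a latent
                                         # or ESRGAN upscaler instead

img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16).to("cuda")
# The second pass is where you can swap in a different sampler or even a
# different model -- exactly the experiment described above.
final = img2img(prompt, image=upscaled, strength=0.4,
                num_inference_steps=25).images[0]
final.save("highres_fix.png")
```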
So that's when I decided to create my own. And you were doing that mostly on your own when you started, or did you already have kind of like a subgroup of people? No, I was, uh, on my own because, because it was just me experimenting with stuff. So yeah, that was it. Then, so I started writing the code January 1st, 2023, and then I released the first version on GitHub, January 16th, 2023. That's how things got started.Alessio [00:05:11]: And what's, what's the name? Comfy UI right away or? Yeah.Comfy [00:05:14]: Comfy UI. The reason the name, my name is Comfy is people thought my pictures were comfy, so I just, uh, just named it, uh, uh, it's my Comfy UI. So yeah, that's, uh,swyx [00:05:27]: Is there a particular segment of the community that you targeted as users? Like more intensive workflow artists, you know, compared to the automatic crowd or, you know,Comfy [00:05:37]: This was my way of like experimenting with, uh, with new things, like the high-res fix thing I mentioned, which was like in Comfy, the first thing you could easily do was just chain different models together. And then one of the first things, I think the first times it got a bit of popularity was when I started experimenting with the different, like applying prompts to different areas of the image. Yeah. I called it area conditioning, posted it on Reddit and it got a bunch of upvotes. So I think that's when, like, when people first learned of Comfy UI.swyx [00:06:17]: Is that mostly like fixing hands?Comfy [00:06:19]: Uh, no, no, no. That was just, uh, like, let's say, well, it was very, well, it still is kind of difficult to like, let's say you want a mountain, you have an image and then, okay. I'm like, okay. I want the mountain here and I want the, like a, a fox here.swyx [00:06:37]: Yeah. So compositing the image. Yeah.Comfy [00:06:40]: My way was very easy. It was just like, oh, when you run the diffusion process, you kind of generate, okay. You do pass one pass through the diffusion, every step you do one pass. Okay. This place of the image with this prompt, this other place of the image with the other prompt. And then the entire image with another prompt, and then just average everything together, every step, and that was, uh, area composition, which I call it. And then, then a month later, there was a paper that came out called multi diffusion, which was the same thing, but yeah, that's, uh,Alessio [00:07:20]: could you do area composition with different models or because you're averaging out, you kind of need the same model.Comfy [00:07:26]: Could do it with, but yeah, I hadn't implemented it for different models, but, uh, you, you can do it with, uh, with different models if you want, as long as the models share the same latent space, like we, we're supposed to ring a bell every time someone says, yeah, like, for example, you couldn't use like SDXL and SD 1.5, because those have a different latent space, but like, uh, yeah, like SD 1.5 models, different ones. You could, you could do that.swyx [00:07:59]: There's some models that try to work in pixel space, right?Comfy [00:08:03]: Yeah. They're very slow. Of course. That's the problem. That that's the, the reason why stable diffusion actually became like popular, like, cause was because of the latent space.swyx [00:08:14]: Small and yeah. Because it used to be latent diffusion models and then they trained it up.Comfy [00:08:19]: Yeah. Cause pixel diffusion models are just too slow. So.
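To ground the area composition trick just described: one noise prediction per (prompt, region) each sampling step, blended together. `model` below stands in for a real diffusion model's noise-prediction call, so this is a sketch of the averaging idea, not ComfyUI's actual implementation.

```python
# Pseudo-PyTorch sketch of area composition: every sampling step, predict
# noise once per (prompt embedding, region mask) and average the results.
import torch

def area_composition_step(model, latent, timestep, regions):
    """regions: list of (prompt_embedding, mask) pairs. Masks are 0/1
    tensors broadcastable to `latent` and should cover the whole canvas."""
    blended = torch.zeros_like(latent)
    coverage = torch.zeros_like(latent)
    for prompt_emb, mask in regions:
        noise_pred = model(latent, timestep, encoder_hidden_states=prompt_emb)
        blended += noise_pred * mask
        coverage += mask
    # Where regions overlap, this averages the predictions -- the same idea
    # the MultiDiffusion paper later formalized.
    return blended / coverage.clamp(min=1.0)
```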
Yeah.swyx [00:08:25]: Have you ever tried to talk to like, like stability, the latent diffusion guys, like, you know, Robin Rombach, that, that crew. Yeah.Comfy [00:08:32]: Well, I used to work at stability.swyx [00:08:34]: Oh, I actually didn't know. Yeah.Comfy [00:08:35]: I used to work at stability. I got, uh, I got hired, uh, in June, 2023.swyx [00:08:42]: Ah, that's the part of the story I didn't know about. Okay. Yeah.Comfy [00:08:46]: So the, the reason I was hired is because they were doing, uh, SDXL at the time and they were basically SDXL. I don't know if you remember it was a base model and then a refiner model. Basically they wanted to experiment, like chaining them together. And then, uh, they saw, oh, right. Oh, this, we can use this to do that. Well, let's hire that guy.swyx [00:09:10]: But they didn't, they didn't pursue it for like SD3. What do you mean? Like the SDXL approach. Yeah.Comfy [00:09:16]: The reason for that approach was because basically they had two models and then they wanted to publish both of them. So they, they trained one on lower time steps, which was the refiner model. And then they, the first one was trained normally. And then they went during their test, they realized, oh, like if we string these models together, the quality increases. So let's publish that. It worked. Yeah. But like right now, I don't think many people actually use the refiner anymore, even though it is actually a full diffusion model. Like you can use it on its own. And it's going to generate images. I don't think anyone, people have mostly forgotten about it. But, uh.Alessio [00:10:05]: Can we talk about models a little bit? So stable diffusion, obviously is the most known. I know flux has gotten a lot of traction. Are there any underrated models that people should use more or what's the state of the union?Comfy [00:10:17]: Well, the, the latest, uh, state of the art, at least, yeah, for images there's, uh, yeah, there's flux. There's also SD3.5. SD3.5 is two models. There's a, there's a small one, 2.5B and there's the bigger one, 8B. So it's, it's smaller than flux. So, and it's more, uh, creative in a way, but flux, yeah, flux is the best. People should give SD3.5 a try cause it's, uh, it's different. I won't say it's better. Well, it's better for some like specific use cases. Right. If you want some to make something more like creative, maybe SD3.5. If you want to make something more consistent and flux is probably better.swyx [00:11:06]: Do you ever consider supporting the closed source model APIs?Comfy [00:11:10]: Uh, well, they, we do support them as custom nodes. We actually have some, uh, official custom nodes from, uh, different. Ideogram.swyx [00:11:20]: Yeah. I guess DALL-E would have one. Yeah.Comfy [00:11:23]: That's, uh, it's just not, I'm not the person that handles that. Sure.swyx [00:11:28]: Sure. Quick question on, on SD. There's a lot of community discussion about the transition from SD1.5 to SD2 and then SD2 to SD3. People still like, you know, very loyal to the previous generations of SDs?Comfy [00:11:41]: Uh, yeah. SD1.5 then still has a lot of, a lot of users.swyx [00:11:46]: The last based model.Comfy [00:11:49]: Yeah. Then SD2 was mostly ignored. It wasn't, uh, it wasn't a big enough improvement over the previous one. Okay.swyx [00:11:58]: So SD1.5, SD3, flux and whatever else. SDXL. SDXL.Comfy [00:12:03]: That's the main one. Stable cascade. Stable cascade. That was a good model.
But, uh, that's, uh, the problem with that one is, uh, it got, uh, like SD3 was announced one week after. Yeah.swyx [00:12:16]: It was like a weird release. Uh, what was it like inside of stability actually? I mean, statute of limitations. Yeah. The statute of limitations expired. You know, management has moved. So it's easier to talk about now. Yeah.Comfy [00:12:27]: And inside stability, actually that model was ready, uh, like three months before, but it got, uh, stuck in, uh, red teaming. So basically the product, if that model had released or was supposed to be released by the authors, then it would probably have gotten very popular since it's a, it's a step up from SDXL. But it got all of its momentum stolen. It got stolen by the SD3 announcement. So people kind of didn't develop anything on top of it, even though it's, uh, yeah. It was a good model, at least, uh, completely mostly ignored for some reason. Likeswyx [00:13:07]: I think the naming as well matters. It seemed like a branch off of the main, main tree of development. Yeah.Comfy [00:13:15]: Well, it was different researchers that did it. Yeah. Yeah. Very like, uh, good model. Like it's the Würstchen authors. I don't know if I'm pronouncing it correctly. Yeah. Yeah. Yeah.swyx [00:13:28]: I actually met them in Vienna. Yeah.Comfy [00:13:30]: They worked at stability for a bit and they left right after the Cascade release.swyx [00:13:35]: This is Dustin, right? No. Uh, Dustin's SD3. Yeah.Comfy [00:13:38]: Dustin is a SD3 SDXL. That's, uh, Pablo and Dome. I think I'm pronouncing his name correctly. Yeah. Yeah. Yeah. Yeah. That's very good.swyx [00:13:51]: It seems like the community is very, they move very quickly. Yeah. Like when there's a new model out, they just drop whatever the current one is. And they just all move wholesale over. Like they don't really stay to explore the full capabilities. Like if, if the stable cascade was that good, they would have AB tested a bit more. Instead they're like, okay, SD3 is out. Let's go. You know?Comfy [00:14:11]: Well, I find the opposite actually. The community doesn't like, they only jump on a new model when there's a significant improvement. Like if there's a, only like a incremental improvement, which is what, uh, most of these models are going to have, especially if you, cause, uh, stay the same parameter count. Yeah. Like you're not going to get a massive improvement, uh, into like, unless there's something big that, that changes. So, uh. Yeah.swyx [00:14:41]: And how are they evaluating these improvements? Like, um, because there's, it's a whole chain of, you know, comfy workflows. Yeah. How does, how does one part of the chain actually affect the whole process?Comfy [00:14:52]: Are you talking on the model side specific?swyx [00:14:54]: Model specific, right? But like once you have your whole workflow based on a model, it's very hard to move.Comfy [00:15:01]: Uh, not, well, not really. Well, it depends on your, uh, depends on their specific kind of the workflow. Yeah.swyx [00:15:09]: So I do a lot of like text and image. Yeah.Comfy [00:15:12]: When you do change, like most workflows are kind of going to be complete. Yeah. It's just like, you might have to completely change your prompt completely change. Okay.swyx [00:15:24]: Well, I mean, then maybe the question is really about evals. Like what does the comfy community do for evals? Just, you know,Comfy [00:15:31]: Well, that they don't really do that. It's more like, oh, I think this image is nice.
So that's, uh,swyx [00:15:38]: They just subscribe to Fofr AI and just see like, you know, what Fofr is doing. Yeah.Comfy [00:15:43]: Well, they just, they just generate like it. Like, I don't see anyone really doing it. Like, uh, at least on the comfy side, comfy users, they, it's more like, oh, generate images and see, oh, this one's nice. It's like, yeah, it's not, uh, like the, the more, uh, like, uh, scientific, uh, like, uh, like checking that's more on specifically on like model side. If, uh, yeah, but there is a lot of, uh, vibes also, cause it is a like, uh, artistic, uh, you can create a very good model that doesn't generate nice images. Cause most images on the internet are ugly. So if you, if that's like, if you just, oh, I have the best model, it's gigantic, it's super smart. I created on all the, like I've trained on just all the images on the internet. The images are not going to look good. So yeah.Alessio [00:16:42]: Yeah.Comfy [00:16:43]: They're going to be very consistent. But yeah. People like, it's not going to be like the, the look that people are going to be expecting from, uh, from a model. So. Yeah.swyx [00:16:54]: Can we talk about LoRa's? Cause we thought we talked about models then like the next step is probably LoRa's. Before, I actually, I'm kind of curious how LoRa's entered the tool set of the image community because the LoRa paper was 2021. And then like, there was like other methods like textual inversion that was popular at the early SD stage. Yeah.Comfy [00:17:13]: I can't even explain the difference between that. Yeah. Textual inversions. That's basically what you're doing is you're, you're training a, cause well, yeah. Stable diffusion. You have the diffusion model, you have text encoder. So basically what you're doing is training a vector that you're going to pass to the text encoder. It's basically you're training a new word. Yeah.swyx [00:17:37]: It's a little bit like representation engineering now. Yeah.Comfy [00:17:40]: Yeah. Basically. Yeah. You're just, so yeah, if you know how like the text encoder works, basically you have, you take your, the words of your prompt, you convert those into tokens with the tokenizer and those are converted into vectors. Basically. Yeah. Each token represents a different vector. So each word represents a vector. And those, depending on your words, that's the list of vectors that get passed to the text encoder, which is just. Yeah. Yeah. It's just a stack of, of attention. Like basically it's a very close to LLM architecture. Yeah. Yeah. So basically what you're doing is just training a new vector. We're saying, well, I have all these images and I want to know which word does that represent? And it's going to get like, you train this vector and then, and then when you use this vector, it hopefully generates. Like something similar to your images. Yeah.swyx [00:18:43]: I would say it's like surprisingly sample efficient in picking up the concept that you're trying to train it on. Yeah.Comfy [00:18:48]: Well, people have kind of stopped doing that even though back as like when I was at Stability, we, we actually did train internally some like textual inversions on like T5 XXL, which actually worked pretty well. But for some reason, yeah, people don't use them.
And also they might also work like, like, yeah, this is something and probably have to test, but maybe if you train a textual inversion, like on T5 XXL, it might also work with all the other models that use T5 XXL because same thing with like, like the textual inversions that, that were trained for SD 1.5, they also kind of work on SDXL because SDXL has the, has two text encoders. And one of them is the same as the, as the SD 1.5 CLIP-L. So those, they actually would, they don't work as strongly because they're only applied to one of the text encoders. But, and the same thing for SD3. SD3 has three text encoders. So it works. It's still, you can still use your textual inversion from SD 1.5 on SD3, but it's just a lot weaker because now there's three text encoders. So it gets even more diluted. Yeah.swyx [00:20:05]: Do people experiment a lot on, just on the CLIP side, there's like Siglip, there's Blip, like do people experiment a lot on those?Comfy [00:20:12]: You can't really replace. Yeah.swyx [00:20:14]: Because they're trained together, right? Yeah.Comfy [00:20:15]: They're trained together. So you can't like, well, what I've seen people experimenting with is a long CLIP. So basically someone fine tuned the CLIP model to accept longer prompts.swyx [00:20:27]: Oh, it's kind of like long context fine tuning. Yeah.Comfy [00:20:31]: So, so like it's, it's actually supported in Core Comfy.swyx [00:20:35]: How long is long?Comfy [00:20:36]: Regular CLIP is 77 tokens. Yeah. Long CLIP is 256. Okay. So, but the hack that like you've, if you use stable diffusion 1.5, you've probably noticed, oh, it still works if I, if I use long prompts, prompts longer than 77 words. Well, that's because the hack is to just, well, you split, you split it up in chunks of 77, your whole big prompt. Let's say you, you give it like the massive text, like the Bible or something, and it would split it up in chunks of 77 and then just pass each one through the CLIP and then just concat everything together at the end. It's not ideal, but it actually works.swyx [00:21:26]: Like the positioning of the words really, really matters then, right? Like this is why order matters in prompts. Yeah.Comfy [00:21:33]: Yeah. Like it, it works, but it's, it's not ideal, but it's what people expect. Like if, if someone gives a huge prompt, they expect at least some of the concepts at the end to be like present in the image. But usually when they give long prompts, they, they don't, they like, they don't expect like detail, I think. So that's why it works very well.swyx [00:21:58]: And while we're on this topic, prompt weighting, negative prompts. Negative prompting, all, all sort of similar part of this layer of the stack. Yeah.Comfy [00:22:05]: The, the hack for that, which works on CLIP, like it, basically it's just for SD 1.5, well, for SD 1.5, the prompt weighting works well because CLIP L is a, is not a very deep model. So you have a very high correlation between, you have the input token, the index of the input token vector. And the output token, they're very, the concepts are very close, closely linked. So that means if you interpolate the vector from what, well, the, the way Comfy UI does it is it has, okay, you have the vector, you have an empty prompt. So you have a, a chunk, like a CLIP output for the empty prompt, and then you have the one for your prompt. And then it interpolates from that, depending on your prompt. Yeah.Comfy [00:23:07]: So that's how it, how it does prompt weighting. But this stops working the deeper your text encoder is.
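A sketch of the interpolation Comfy just described, with stand-in tensors for the CLIP outputs; the `(word:1.3)`-style weight is how UIs commonly surface this, and nothing here is ComfyUI's literal code.

```python
# Prompt weighting as interpolation between the CLIP output for the empty
# prompt and the CLIP output for your prompt. Works on shallow encoders
# like CLIP-L; as noted above, it breaks down on deep ones like T5-XXL.
import torch

def weight_tokens(prompt_out, empty_out, weights):
    # prompt_out, empty_out: (seq, dim) encoder outputs; weights: (seq,)
    return empty_out + weights.unsqueeze(-1) * (prompt_out - empty_out)

seq_len, dim = 77, 768
prompt_out = torch.randn(seq_len, dim)   # stand-in for CLIP(your prompt)
empty_out = torch.randn(seq_len, dim)    # stand-in for CLIP("")

weights = torch.ones(seq_len)            # weight 1.0 leaves tokens untouched
weights[5] = 1.3                         # emphasize one token, "(word:1.3)" style
weighted = weight_tokens(prompt_out, empty_out, weights)
```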
So on T5 XXL itself, it doesn't work at all. So. Wow.swyx [00:23:20]: Is that a problem for people? I mean, cause I'm used to just move, moving up numbers. Probably not. Yeah.Comfy [00:23:25]: Well.swyx [00:23:26]: So you just use words to describe, right? Cause it's a bigger language model. Yeah.Comfy [00:23:30]: Yeah. So. Yeah. So honestly it might be good, but I haven't seen many complaints on Flux that it's not working. So, cause I guess people can sort of get around it with, with language. So. Yeah.swyx [00:23:46]: Yeah. And then coming back to LoRa's, now the, the popular way to, to customize models is LoRa's. And I saw you also support Locon and LoHa, which I've never heard of before.Comfy [00:23:56]: There's a bunch of, cause what, what the LoRa is essentially is. Instead of like, okay, you have your, your model and then you want to fine tune it. So instead of like, what you could do is you could fine tune the entire thing, but that's a bit heavy. So to speed things up and make things less heavy, what you can do is just fine tune some smaller weights, like basically two, two matrices that when you multiply like two low rank matrices and when you multiply them together, gives a, represents a difference between trained weights and your base weights. So by training those two smaller matrices, that's a lot less heavy. Yeah.Alessio [00:24:45]: And they're portable. So you're going to share them. Yeah. It's like easier. And also smaller.Comfy [00:24:49]: Yeah. That's the, how LoRa's work. So basically, so when, when inferencing you, you get an inference with them pretty efficiently, like how ComfyUI does it. It just, when you use a LoRa, it just applies it straight on the weights so that there's only a small delay at the base, like before the sampling to when it applies the weights and then it just same speed as, as before. So for, for inference, it's, it's not that bad, but, and then you have, so basically all the LoRa types like LoHa, LoCon, everything, that's just different ways of representing that like. Basically, you can call it kind of like compression, even though it's not really compression, it's just different ways of represented, like just, okay, I want to train the difference on the weights. What's the best way to represent that difference? There's the basic LoRa, which is just, oh, let's multiply these two matrices together. And then there's all the other ones, which are all different algorithms. So. Yeah.Alessio [00:25:57]: So let's talk about LoRa. Let's talk about what comfy UI actually is. I think most people have heard of it. Some people might've seen screenshots. I think fewer people have built very complex workflows. So when you started, automatic was like the super simple way. What were some of the choices that you made? So the node workflow, is there anything else that stands out as like, this was like a unique take on how to do image generation workflows?Comfy [00:26:22]: Well, I feel like, yeah, back then everyone was trying to make like easy to use interface. Yeah. So I'm like, well, everyone's trying to make an easy to use interface.swyx [00:26:32]: Let's make a hard to use interface.Comfy [00:26:37]: Like, so like, I like, I don't need to do that, everyone else doing it. So let me try something like, let me try to make a powerful interface that's not easy to use. So.swyx [00:26:52]: So like, yeah, there's a sort of node execution engine. Yeah. Yeah. And it actually lists, it has this really good list of features of things you prioritize, right?
Like let me see, like sort of re-executing from, from any parts of the workflow that was changed, asynchronous queue system, smart memory management, like all this seems like a lot of engineering that. Yeah.Comfy [00:27:12]: There's a lot of engineering in the back end to make things, cause I was always focused on making things work locally very well. Cause that's cause I was using it locally. So everything. So there's a lot of, a lot of thought and work in getting everything to run as well as possible. So yeah. ComfyUI is actually more of a back end, at least, well, now the front end's getting a lot more development, but, but before, before it was, I was pretty much only focused on the backend. Yeah.swyx [00:27:50]: So v0.1 was only August this year. Yeah.Comfy [00:27:54]: With the new front end. Before there was no versioning. So yeah. Yeah. Yeah.swyx [00:27:57]: And so what was the big rewrite for the 0.1 and then the 1.0?Comfy [00:28:02]: Well, that's more on the front end side. That's cause before that it was just like the UI, what, cause when I first wrote it, I just, I said, okay, how can I make, like, I can do web development, but I don't like doing it. Like what's the easiest way I can slap a node interface on this. And then I found this library. Yeah. Like JavaScript library.swyx [00:28:26]: Litegraph?Comfy [00:28:27]: Litegraph.swyx [00:28:28]: Usually people will go for like react flow for like a flow builder. Yeah.Comfy [00:28:31]: But that seems like too complicated. So I didn't really want to spend time like developing the front end. So I'm like, well, oh, Litegraph. This has the whole node interface. So, okay. Let me just plug that into, to my backend.swyx [00:28:49]: I feel like if Streamlit or Gradio offered something that you would have used Streamlit or Gradio cause it's Python. Yeah.Comfy [00:28:54]: Yeah. Yeah. Yeah.Comfy [00:29:00]: Yeah.Comfy [00:29:14]: Yeah. It mixes your front end logic and your backend logic and just sticks them together.swyx [00:29:20]: It's supposed to be easy for you guys. If you're a Python main, you know, I'm a JS main, right? Okay. If you're a Python main, it's supposed to be easy.Comfy [00:29:26]: Yeah, it's easy, but it makes your whole software a huge mess.swyx [00:29:30]: I see, I see. So you're mixing concerns instead of separating concerns?Comfy [00:29:34]: Well, it's because... Like frontend and backend. Frontend and backend should be well separated with a defined API. Like that's how you're supposed to do it. Smart people disagree. It just sticks everything together. It makes it easy to, like, make a huge mess. And also it's, there's a lot of issues with Gradio. Like it's very good if all you want to do is just get like slap a quick interface on your, like to show off your ML project. Like that's what it's made for. Yeah. Like there's no problem using it. Like, oh, I have my, I have my code. I just wanted a quick interface on it. That's perfect. Like use Gradio. But if you want to make something that's like a real, like real software that will last a long time and will be easy to maintain, then I would avoid it. Yeah.swyx [00:30:32]: So your criticism is Streamlit and Gradio are the same. I mean, those are the same criticisms.Comfy [00:30:37]: Yeah, Streamlit I haven't used as much. Yeah, I just looked a bit.swyx [00:30:43]: Similar philosophy.Comfy [00:30:44]: Yeah, it's similar. It's just, it just seems to me like, okay, for quick, like AI demos, it's perfect.swyx [00:30:51]: Yeah.
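Stepping back to the LoRA mechanics from a few exchanges earlier: the fine-tune is stored as two low-rank matrices whose product is the weight delta, merged straight into the base weights before sampling (which is why inference speed is unchanged afterwards). A minimal sketch; shapes and the scale factor are illustrative, not ComfyUI's exact merge code.

```python
# LoRA in one line: W' = W + alpha * (up @ down), where up and down are
# low-rank. Training only these two small matrices is the whole trick.
import torch

def merge_lora(base_weight, lora_up, lora_down, alpha=1.0):
    # base_weight: (out, in); lora_up: (out, rank); lora_down: (rank, in)
    return base_weight + alpha * (lora_up @ lora_down)

W = torch.randn(1024, 1024)                            # ~1M base parameters
up, down = torch.randn(1024, 4), torch.randn(4, 1024)  # rank 4: ~8k parameters
W_patched = merge_lora(W, up, down, alpha=0.8)
# The tiny parameter count is why LoRA files are easy to share, and the
# merge-then-sample approach is why there is only a one-time loading delay.
```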
Going back to like the core tech, like asynchronous queues, partial re-execution, smart memory management, you know, anything that you were very proud of or was very hard to figure out?Comfy [00:31:00]: Yeah. The thing that's the biggest pain in the ass is probably the memory management. Yeah.swyx [00:31:05]: Were you just paging models in and out or? Yeah.Comfy [00:31:08]: Before it was just, okay, load the model, completely unload it. Then, okay, that, that works well when your models are small, but if your models are big and it takes sort of like, let's say someone has a, like a, a 4090, and the model size is 10 gigabytes, that can take a few seconds to like load and unload, load and unload, so you want to try to keep things like in memory, in the GPU memory as much as possible. What Comfy UI does right now is it tries to like estimate, okay, like, okay, you're going to sample this model, it's going to take probably this amount of memory, let's remove the models, like this amount of memory that's been loaded on the GPU and then just execute it. But so there's a fine line, because you try to remove the least amount of models that are already loaded. Because, as for, like, Windows drivers, one other problem is the NVIDIA driver on Windows by default, because there's a way to, there's an option to disable that feature, but by default it, like, if you start loading, you can overflow your GPU memory and then it's, the driver's going to automatically start paging to RAM. But the problem with that is it's, it makes everything extremely slow. So when you see people complaining, oh, this model, it works, but oh, s**t, it starts slowing down a lot, that's probably what's happening. So it's basically you have to just try to get, use as much memory as possible, but not too much, or else things start slowing down, or people get out of memory, and then just find, try to find that line where, oh, like the driver on Windows starts paging and stuff. Yeah. And the problem with PyTorch is it's, it's high level, you don't have that much fine-grained control over, like, specific memory stuff, so kind of have to leave, like, the memory freeing to, to Python and PyTorch, which is, can be annoying sometimes.swyx [00:33:32]: So, you know, I think one thing is, as a maintainer of this project, like, you're designing for a very wide surface area of compute, like, you even support CPUs.Comfy [00:33:42]: Yeah, well, that's... That's just, for PyTorch, PyTorch supports CPUs, so, yeah, it's just, that's not, that's not hard to support.swyx [00:33:50]: First of all, is there a market share estimate, like, is it, like, 70% NVIDIA, like, 30% AMD, and then, like, miscellaneous on Apple Silicon, or whatever?Comfy [00:33:59]: For Comfy? Yeah. Yeah, and, yeah, I don't know the market share.swyx [00:34:03]: Can you guess?Comfy [00:34:04]: I think it's mostly NVIDIA. Right. Because, because AMD, the problem, like, AMD works horribly on Windows. Like, on Linux, it works fine. It's, it's slower than the price-equivalent NVIDIA GPU, but it works, like, you can use it, you generate images, everything works. On Windows, you might have a hard time, so, that's the problem, and most people, I think most people who bought AMD probably use Windows. They probably aren't going to switch to Linux, so... Yeah.
So, until AMD actually, like, ports their, like, ROCm to, to Windows properly, and then there's actually PyTorch, I think they're, they're doing that, they're in the process of doing that, but, until they get it, they get a good, like, PyTorch ROCm build that works on Windows, it's, like, they're going to have a hard time. Yeah.Alessio [00:35:06]: We got to get George on it. Yeah. Well, he's trying to get Lisa Su to do it, but... Let's talk a bit about, like, the node design. So, unlike all the other text-to-image, you have a very, like, deep, so you have, like, a separate node for, like, CLIP encode, you have a separate node for, like, the KSampler, you have, like, all these nodes. Going back to, like, the making it easy versus making it hard, but, like, how much do people actually play with all the settings, you know? Kind of, like, how do you guide people to, like, hey, this is actually going to be very impactful versus this is maybe, like, less impactful, but we still want to expose it to you?Comfy [00:35:40]: Well, I try to... I try to expose, like, I try to expose everything or, but, yeah, at least for the, but for things, like, for example, for the samplers, like, there's, like, yeah, four different sampler nodes, which go in easiest to most advanced. So, yeah, if you go, like, the easy node, the regular sampler node, that's, you have just the basic settings. But if you use, like, the sampler advanced... If you use, like, the custom advanced node, that, that one you can actually, you'll see you have, like, different nodes.Alessio [00:36:19]: I'm looking it up now. Yeah. What are, like, the most impactful parameters that you use? So, it's, like, you know, you can have more, but, like, which ones, like, really make a difference?Comfy [00:36:30]: Yeah, they all do. They all have their own, like, they all, like, for example, yeah, steps. Usually you want steps, you want them to be as low as possible. But you want, if you're optimizing your workflow, you want to, you lower the steps until, like, the images start deteriorating too much. Because that, yeah, that's the number of steps you're running the diffusion process. So, if you want things to be faster, lower is better. But, yeah, CFG, that's more, you can kind of see that as the contrast of the image. Like, if your image looks too burnt. Then you can lower the CFG. So, yeah, CFG, that's how, yeah, that's how strongly the, like, the negative versus positive prompt. Because when you sample a diffusion model, it's basically a negative prompt. It's just, yeah, positive prediction minus negative prediction.swyx [00:37:32]: Contrastive loss. Yeah.Comfy [00:37:34]: It's positive minus negative, and the CFG does the multiplier. Yeah. Yeah. Yeah, so.Alessio [00:37:41]: What are, like, good resources to understand what the parameters do? I think most people start with automatic, and then they move over, and it's, like, steps, CFG, sampler name, scheduler, denoise. Read it.Comfy [00:37:53]: But, honestly, well, it's more, it's something you should, like, try out yourself. I don't know, you don't necessarily need to know how it works to, like, what it does. Because even if you know, like, CFG, oh, it's, like, positive minus negative prompt. Yeah. So the only thing you know at CFG is if it's 1.0, then that means the negative prompt isn't applied. It also means sampling is two times faster. But, yeah.
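The CFG arithmetic just spelled out fits in one line; here is a sketch with stand-in tensors (an illustration of the formula as described, not any particular codebase's implementation).

```python
# Classifier-free guidance as described: negative prediction plus CFG times
# the (positive - negative) difference. Stand-in tensors for brevity.
import torch

def apply_cfg(pos_pred, neg_pred, cfg):
    return neg_pred + cfg * (pos_pred - neg_pred)

pos, neg = torch.randn(4, 64, 64), torch.randn(4, 64, 64)

# cfg == 1.0 returns the positive prediction untouched: the negative prompt
# has no effect, and you can skip computing it (hence "two times faster").
assert torch.allclose(apply_cfg(pos, neg, 1.0), pos)

out = apply_cfg(pos, neg, 7.5)  # higher cfg pulls harder toward the prompt
```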
But other than that, it's more, like, you should really just see what it does to the images yourself, and you'll probably get a more intuitive understanding of what these things do.Alessio [00:38:34]: Any other nodes or things you want to shout out? Like, I know AnimateDiff, IP-Adapter. Those are, like, some of the most popular ones. Yeah. What else comes to mind?Comfy [00:38:44]: Not nodes, but there's, like, what I like is when some people, sometimes they make things that use ComfyUI as their backend. Like, there's a plugin for Krita that uses ComfyUI as its backend. So you can use, like, all the models that work in Comfy in Krita. And I think I've tried it once. But I know a lot of people use it, and it's probably really nice, so.Alessio [00:39:15]: What's the craziest node that people have built, like, the most complicated?Comfy [00:39:21]: Craziest node? Like, yeah. I know some people have made, like, video games in Comfy with, like, stuff like that. So, like, someone, like, I remember, like, yeah, last, I think it was last year, someone made, like, a, like, Wolfenstein 3D in Comfy. Of course. And then one of the inputs was, oh, you can generate a texture, and then it changes the texture in the game. So you can plug it to, like, the workflow. And there's a lot of, if you look there, there's a lot of crazy things people do, so. Yeah.Alessio [00:39:59]: And now there's, like, a node registry that people can use to, like, download nodes. Yeah.Comfy [00:40:04]: Like, well, there's always been the, like, the ComfyUI manager. Yeah. But we're trying to make this more, like, I don't know, official, like, with, yeah, with the node registry. Because before the node registry, the, like, okay, how did your custom node get into ComfyUI manager? That's the guy running it who, like, every day he searched GitHub for new custom nodes and added them manually to his custom node manager. So we're trying to make it less effort. So we're trying to make it less effort for him, basically. Yeah.Alessio [00:40:40]: Yeah. But I was looking, I mean, there's, like, a YouTube download node. There's, like, this is almost like, you know, a data pipeline more than, like, an image generation thing at this point. It's, like, you can get data in, you can, like, apply filters to it, you can generate data out.Comfy [00:40:54]: Yeah. You can do a lot of different things. Yeah. So I'm thinking, I think what I did is I made it easy to make custom nodes. So I think that helped a lot. I think that helped a lot for, like, the ecosystem because it is very easy to just make a node. So, yeah, a bit too easy sometimes. Then we have the issue where there's a lot of custom node packs which share similar nodes. But, well, that's, yeah, something we're trying to solve by maybe bringing some of the functionality into the core. Yeah. Yeah. Yeah.Alessio [00:41:36]: And then there's, like, video. People can do video generation. Yeah.Comfy [00:41:40]: Video, that's, well, the first video model was, like, stable video diffusion, which was last, yeah, exactly last year, I think. Like, one year ago. But that wasn't a true video model. So it was...swyx [00:41:55]: It was, like, moving images? Yeah.Comfy [00:41:57]: It generated video. What I mean by that is it's, like, it's still 2D Latents. It's basically what I'm trying to do. So what they did is they took SD2, and then they added some temporal attention to it, and then trained it on videos and all. So it's kind of, like, animated, like, same idea, basically.
Why I say it's not a true video model is that you still have, like, the 2D Latents. Like, a true video model, like Mochi, for example, would have 3D Latents. Mm-hmm.Alessio [00:42:32]: Which means you can, like, move through the space, basically. It's the difference. You're not just kind of, like, reorienting. Yeah.Comfy [00:42:39]: And it's also, well, it's also because you have a temporal VAE. Mm-hmm. Also, like, Mochi has a temporal VAE that compresses on, like, the temporal direction, also. So that's something you don't have with, like, yeah, AnimateDiff and stable video diffusion. They only, like, compress spatially, not temporally. Mm-hmm. Right. So, yeah. That's why I call that, like, true video models. There's, yeah, there's actually a few of them, but the one I've implemented in comfy is Mochi, because that seems to be the best one so far. Yeah.swyx [00:43:15]: We had AJ come and speak at the stable diffusion meetup. The other open one I think I've seen is COG video. Yeah.Comfy [00:43:21]: COG video. Yeah. That one's, yeah, it also seems decent, but, yeah. Chinese, so we don't use it. No, it's fine. It's just, yeah, I could. Yeah. It's just that there's a, it's not the only one. There's also a few others, which I.swyx [00:43:36]: The rest are, like, closed source, right? Like, Kling. Yeah.Comfy [00:43:39]: Closed source, there's a bunch of them. But I mean, open. I've seen a few of them. Like, I can't remember their names, but there's COG videos, the big, the big one. Then there's also a few of them that released at the same time. There's one that released at the same time as SD 3.5, same day, which is why I don't remember the name.swyx [00:44:02]: We should have a release schedule so we don't conflict on each of these things. Yeah.Comfy [00:44:06]: I think SD 3.5 and Mochi released on the same day. So everything else was kind of drowned, completely drowned out. So for some reason, lots of people picked that day to release their stuff.Comfy [00:44:21]: Yeah. Which is, well, shame for those. And I think OmniGen also released the same day, which also seems interesting. Yeah. Yeah.Alessio [00:44:30]: What's Comfy? So you are Comfy. And then there's like, comfy.org. I know we do a lot of things for, like, Nous Research and those guys also have kind of like a more open source thing going on. How do you work? Like you mentioned, you mostly work on like, the core piece of it. And then what...Comfy [00:44:47]: Maybe I should fill it in because I, yeah, I feel like maybe, yeah, I only explain part of the story. Right. Yeah. Maybe I should explain the rest. So yeah. So yeah. Basically, January, that's when the first January 2023, January 16, 2023, that's when Comfy was first released to the public. Then, yeah, did a Reddit post about the area composition thing somewhere in, I don't remember exactly, maybe end of January, beginning of February. And then someone, a YouTuber, made a video about it, like Olivio, he made a video about Comfy in March 2023. I think that's when it was a real burst of attention. And by that time, I was continuing to develop it and it was getting, people were starting to use it more, which unfortunately meant that I had first written it to do like experiments, but then my time to do experiments went down. It started going down, because people were actually starting to use it then. Like, I had to, and I said, well, yeah, time to add all these features and stuff. Yeah, and then I got hired by Stability June, 2023.
Then I made, basically, yeah, they hired me because they wanted the SD-XL. So I got the SD-XL working very well with the UI, because they were experimenting with Comfy in-house. Actually, how the SDXL release worked is they released, for some reason, like they released the code first, but they didn't release the model checkpoint. So they released the code. And then, well, since the research was related to code, I released the code in ComfyUI. And then the checkpoints were basically early access. People had to sign up and they only allowed people from edu emails. Like if you had an edu email, like they gave you access basically to the SDXL 0.9. And, well, that leaked. Right. Of course, because of course it's going to leak if you do that. Well, the only way people could easily use it was with Comfy. So, yeah, people started using. And then I fixed a few of the issues people had. So then the big 1.0 release happened. And, well, Comfy UI was the only way a lot of people could actually run it on their computers. Because it just like automatic was so like inefficient and bad that most people couldn't actually, like it just wouldn't work. Like because he did a quick implementation. So people were forced. To use Comfy UI, and that's how it became popular because people had no choice.swyx [00:47:55]: The growth hack.Comfy [00:47:56]: Yeah.swyx [00:47:56]: Yeah.Comfy [00:47:57]: Like everywhere, like people who didn't have the 4090, they had like, who had just regular GPUs, they didn't have a choice.Alessio [00:48:05]: So yeah, I got a 4070. So think of me. And so today, what's, is there like a core Comfy team or?Comfy [00:48:13]: Uh, yeah, well, right now, um, yeah, we are hiring. Okay. Actually, so right now core, like, um, the core core itself, it's, it's me. Uh, but because, uh, the reason where folks like all the focus has been mostly on the front end right now, because that's the thing that's been neglected for a long time. So, uh, so most of the focus right now is, uh, all on the front end, but we are, uh, yeah, we will soon get, uh, more people to like help me with the actual backend stuff. Yeah. So, no, I'm not going to say a hundred percent because that's why once the, once we have our V one release, which is because it'd be the packaged Comfy, with the nice interface and easy to install on Windows and hopefully Mac. Uh, yeah. Yeah. Once we have that, uh, we're going to have to, lots of stuff to do on the backend side and also the front end side, but, uh.
So as long as they use Comfy, I think it helps the ecosystem, because even if those people don't contribute directly, the fact that they are using Comfy means others are more likely to join the ecosystem.

swyx [00:50:57]: And then, would you ever do text?

Comfy [00:50:59]: Yeah, well, you can already do text with some custom nodes. It's something I've wanted to eventually add to core, but it's not a very high priority, even though a lot of people use text for prompt enhancement and other things like that. My focus has always been on diffusion models. Unless some text diffusion model comes out.

swyx [00:51:30]: Yeah, David Holz is investing a lot in text diffusion.

Comfy [00:51:34]: Yeah, well, if a good one comes out, then we'll probably implement it, since it fits with the whole...

swyx [00:51:39]: Yeah, I imagine it's going to be closed source at Midjourney.

Comfy [00:51:43]: Well, if an open one comes out, then I'll probably implement it.

Alessio [00:51:54]: Cool, Comfy. Thanks so much for coming on. This was fun. Bye.

Get full access to Latent Space at www.latent.space/subscribe
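A note for readers on the 2D-versus-3D-latents point in the transcript above: the difference is easiest to see in tensor shapes. Below is a minimal PyTorch sketch with made-up channel counts and compression factors (real models like Mochi or Stable Video Diffusion use different architectures and ratios); it only illustrates which axes get compressed.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 16, 64, 64)  # (batch, RGB, frames, height, width)

# "2D latents" (AnimateDiff / Stable Video Diffusion style): an image VAE
# is applied per frame, so only H and W shrink; the time axis is untouched.
image_encoder = nn.Conv2d(3, 4, kernel_size=8, stride=8)
per_frame = torch.stack(
    [image_encoder(x[:, :, t]) for t in range(x.shape[2])], dim=2
)
print(per_frame.shape)  # torch.Size([1, 4, 16, 8, 8]) -- all 16 frames survive

# "3D latents" (Mochi-style temporal VAE): the encoder convolves across
# time as well, so the latent grid is compressed along T too. The 4x
# temporal factor here is an illustrative assumption.
video_encoder = nn.Conv3d(3, 4, kernel_size=(4, 8, 8), stride=(4, 8, 8))
print(video_encoder(x).shape)  # torch.Size([1, 4, 4, 8, 8]) -- time compressed
```

The practical upshot is the one discussed above: a temporal VAE hands the diffusion model a latent grid that is smaller along time as well as space, so motion lives in the compressed representation instead of being stitched across independent per-frame latents.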
The question of which processor to pick in a notebook matters more this year than before, because the selection is larger and the processors differ in more respects. But aspects such as battery life, weight, ports, and display remain important too. In this podcast, c't editor Florian Müssig untangles how it all fits together. The striking thing about the current CPUs: anyone expecting high compute performance, high AI performance, long battery life, and a low price in one package will come up empty. Four processor classes are on the market, and each is weak in at least one of those categories: AMD Ryzen 9, Apple M4, Qualcomm Snapdragon X, and Intel Core Ultra 200. Notebook manufacturers likewise have to decide which platforms to support. We also explain which AI features really require these NPU processors and which also run on CPUs without an NPU; we look at upgradability – especially on the Framework notebook – at display brightness and sizes, and at many other details. Featuring: Florian Müssig. Host: Jörg Wirtgen. Production: Michael Wieczorek. ► The c't article on the topic (paywall): https://www.heise.de/select/ct/2025/1/2427415183769428827 ► c't Magazin: https://ct.de ► c't on Mastodon: https://social.heise.de/@ct_Magazin ► c't on Bluesky: https://bsky.app/profile/ct.de ► c't on Instagram: https://www.instagram.com/ct_magazin ► c't on Facebook: https://www.facebook.com/ctmagazin ► c't in print: wherever magazines are sold!
We discuss Sony x AMD Project Amethyst, RDNA 4, RTX 5000, and next-gen CPUs! [SPON: Use "brokensilicon" at CDKeyOffer's Christmas Sale to get Win 11 Pro for $23: https://www.cdkeyoffer.com/cko/Moore11 ] [SPON: Save BIG on the MinisForum BD795i & BD790i SE Motherboards: https://shrsl.com/4sgrf ] ***RECORDED 12/20/2024***
0:00 PS5 Pro Sales Update, Sony buying 10% of Kadokawa
10:08 Project Amethyst for PS6 & RADEON
15:45 DRAM Companies are starting to bet on AMD over Intel
18:25 Intel terminates x86S
19:49 What we're looking forward to in 2025
23:47 Nintendo Switch 2
29:42 PC Handhelds in 2025, ChromeOS vs Windows
33:32 What will happen to XBOX & Game Pass in 2025?
35:24 Nvidia entering Laptop CPUs
40:38 RTX 5000 Series
46:06 AMD RDNA 4
54:45 DLSS vs FSR in 2025
58:15 AMD Krackan, R9 9950X3D, R5 9600
1:02:58 AMD Zen 6
1:05:30 Zen 5 Threadripper, ARL mobile, Panther Lake
1:10:54 Bartlett Lake - Should gamers be excited for ADL+++?
1:14:47 Final Thoughts on 2025
https://x.com/CultureCrave/status/1869667605268816256
https://www.techspot.com/news/106024-initial-playstation-5-pro-sales-promising-despite-console.html
https://www.tomshardware.com/pc-components/cpus/intel-doesnt-plan-to-bring-3d-v-cache-like-tech-to-consumer-cpus-for-now-next-gen-clearwater-forest-xeon-cpus-will-feature-local-cache-in-the-base-tile-akin-to-amds-3d-v-cache
https://www.tomshardware.com/pc-components/cpus/intel-terminates-x86s-initiative-unilateral-quest-to-de-bloat-x86-instruction-set-comes-to-an-end
https://videocardz.com/newz/lexar-unveils-ultra-low-latency-ddr5-6000-cl26-expo-memory-ideal-for-9800x3d
https://videocardz.com/newz/amd-ryzen-5-9600-non-x-sku-reportedly-launches-late-january
https://wccftech.com/nintendo-switch-2-dock-4k-30-fps-only/
https://www.theverge.com/2024/12/18/24324317/amd-playstation-ai-work-better-graphics-project-amethyst
https://www.hardwareluxx.de/index.php/news/hardware/grafikkarten/65128-geforce-rtx-50-serie-inno3d-spricht-von-neural-rendering-f%C3%A4higkeiten.html
https://wccftech.com/amd-krackan-point-8-core-cpu-benchmarked-on-passmark/
Dawid joins to discuss the highlights (and lowlights) of 2024, and what we're excited for in 2025. [SPON: Use "brokensilicon" at CDKeyOffer's Christmas Sale to get Win 11 Pro for $23: https://www.cdkeyoffer.com/cko/Moore11 ] [SPON: Check out the MinisForum EliteMini AI370 powered by Zen 5 Strix: https://shareasale.com/r.cfm?b=1620564&u=3766867&m=101288&urllink=store%2Eminisforum%2Ecom%2Fproducts%2Felitemini%2Dai370%3Fgad%5Fsource%3D1%26gclid%3DCjwKCAiAmfq6BhAsEiwAX1jsZ0HqY2876HbsMw9zx%2DymLrGg0Tyw1%2D2hgF6vDC7ny976hTHVh4HPMBoCRpAQAvD%5FBwE&afftrack= ] ***RECORDED 12/15/2024***
0:00 RTX 4010, 3070M, Vertical Gaming Mice (Intro)
11:05 Was Nvidia SUPER a success? What is a "midrange" GPU?
23:39 Was PlayStation 5 Pro a success?
33:51 Should the PS Handheld be a mini PS6 or PS5?
42:09 Nintendo Switch 2
47:29 Nvidia RTX 5060 8/12GB, AMD Zen 6 Medusa
54:44 Nvidia making CPUs!
1:02:01 Our Favorite Lovelace & RDNA 3 GPUs
1:07:32 AMD RDNA 4 vs RTX 5000
1:11:35 Thoughts on Intel's Situation
1:16:41 AMD Zen 5 - From Rags to Riches
1:22:09 Best Products of 2024
1:24:22 Valve's upcoming Steam Machine Consoles
1:39:53 Why we're optimistic for 2025
Last Time Dawid was on: https://youtu.be/SulVF2k-V9g?si=N0ZSdRDQlY_vsJGr
RTX SUPER Pricing Leak: https://youtu.be/27mT_Am9NUo?si=7yCJoq8J-7vEQ6_n
RTX 4070 SUPER Sales Analysis: https://youtu.be/Oz7G2RjIjj4?si=rZYUYRp25OTeiS1q
BMG Laptop Cancelled Leak: https://youtu.be/hJJJLQZD_vA?si=No8uCjSa5ipEVTo3
PlayStation Handheld Leak: https://youtu.be/A0s0Usfg__g?si=7HCpAHCFEHMUcPCv
Switch 2 Leak: https://youtu.be/SibxVnw3LwY?si=bFM79xc5aOW3TGl1
Zen 7 Prometheus: https://youtu.be/cv3k_LP1cr4?si=GAillp1_GUjEwxwS
Strix Halo Low-Power Leak: https://youtu.be/5tah4ALPgOk?si=tbWWWxfEw_cXYpZE
RTX 5080 / 5090 Laptop Leak: https://youtu.be/I_2I5K5r5jE?si=NTy6TiocF9tTRnVd
RTX 5090 24GB Laptop Leak: https://youtu.be/Gy4HAJdjeRA?si=OvrfLnpH8We86vxX
Pictures of RTX 5080 Die: https://youtu.be/V_UHVWcfeTg?si=oNdvDLg1Yuk9yQQ7
Blackwell Pricing Leak: https://youtu.be/EbEPwJvtA5M?si=C89U2iaoPVSf0Co5
AMD's ARM Sound Wave APU Leak: https://youtu.be/u19FZQ1ZBYc?si=gjwsH94qvjCkd9WN
Third (fourth?!) Zen 5 IPC Leak: https://youtu.be/SeEBPbehttY?si=xadfctMJVhZ7MVrg
Zen 5 Pricing Leak: https://youtu.be/ytH3MzK0on8?si=6_4UZMgeqsCzum49
Early Zen 6 Medusa Leak: https://youtu.be/oUv4US_FAjM?si=VcCfA0uFGpnrxU_n
Recent Zen 6 Medusa Point Leak: https://youtu.be/h5dDn8nzvAw?si=l5saDjjik0JUXMKr
Strix Halo Picture Leak: https://youtu.be/pZjqzQVc-So?si=pVVF84VRBnu1RC2p
RPL Ring Bus Flaw: https://youtu.be/ZFE4q35buKs?si=Jv1idvxD9zMRd2cG
Dev Interview Regarding RPL Flaws: https://youtu.be/rkVSgix0L38?si=P0YFHUJm0eJhI8un
Zen 5 Launch Analysis: https://youtu.be/5Z1-FNBMw0w?si=-D-m_BjycWVwbm5t
Arrow Lake Analysis: https://youtu.be/PKUXU1gewco?si=vAHcvgfVZ-afexhn
40-Core Arrow Lake Cancelled: https://youtu.be/yzimFlRJbAM?si=7TpgYJGeJgDYG39D
Beast Lake Cancelled: https://youtu.be/yzimFlRJbAM?si=Lt7xi_Wq_l14v9x9
Intel Griffin Cove Leaked: https://youtu.be/G0bDB2AkHvE?si=TEMV8eDbPszB8iWC
PS5 Pro Review Podcast: https://youtu.be/JyTRoMXngRk?si=Oz3wS5BW8dh1YAS4
Edwin Olson, Co-Founder & CEO, May Mobility, joined Grayson Brulte on The Road to Autonomy podcast to discuss May Mobility's strategic expansion into Japan through partnerships with Toyota and NTT and their plans to launch with Lyft in Atlanta in 2025. May Mobility is focused on creating scalable, low-power autonomous vehicles that are profitable. One of the keys to May Mobility's success so far has been their multi-policy decision making (MPDM) system that runs on the edge, primarily with CPUs instead of GPUs, requiring less compute power. Recorded on Tuesday, December 10, 2024.
Episode Chapters
0:00 May Mobility in Japan
3:11 May Mobility's Approach to Autonomous Driving
7:23 Autonomous Driving Models (LLMs)
10:56 On-board Compute
14:23 Multi-Policy Decision Making (MPDM)
20:46 Japan Deployments
24:25 Simulation
26:16 MPDM on the Edge
29:12 Scaling May Mobility
34:57 Passenger Experience
42:24 Atlanta Expansion / Lyft Partnership
48:50 Freeways
52:33 Future of May Mobility
About The Road to Autonomy: The Road to Autonomy® is a leading source of data, insight and commentary on autonomous vehicles/trucks and the emerging autonomy economy™. Sign up for This Week in The Autonomy Economy newsletter: https://www.roadtoautonomy.com/autonomy-economy/
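To make the MPDM idea above concrete: in the published literature, multi-policy decision making forward-simulates a small, discrete set of candidate policies and executes the best-scoring one, which is part of why modest CPU compute suffices. The sketch below is a toy illustration under those assumptions, not May Mobility's code; the policies, dynamics, and cost terms are invented.

```python
import random

# Toy MPDM sketch: a handful of closed-loop policies are each rolled out
# in a cheap forward simulation and scored; the vehicle executes the best
# one. Few short rollouts means this style of planner fits on CPUs.

POLICIES = ["keep_lane", "slow_down", "nudge_left", "stop"]

def simulate(policy: str, horizon: int = 30) -> float:
    """Roll the world forward under `policy`; return a cost (lower is better).
    Real systems also simulate other agents' likely behavior; faked here."""
    cost = 0.0
    speed = 10.0
    for _ in range(horizon):
        if policy == "slow_down":
            speed = max(2.0, speed - 0.5)
        elif policy == "stop":
            speed = max(0.0, speed - 1.0)
        # Invented cost terms: discomfort for crawling, risk for speeding.
        cost += 0.1 * max(0.0, 8.0 - speed) + 0.05 * max(0.0, speed - 12.0)
        cost += random.uniform(0.0, 0.01)  # stand-in for interaction risk
    return cost

def choose_policy() -> str:
    # Evaluate every candidate and keep the cheapest rollout.
    return min(POLICIES, key=simulate)

print(choose_policy())
```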
We have a full house with Adrian Cockcroft, Stephen Perrenod, Chris Kruell, and Shahin Khan with a "postview" of the SC24 conference, the latest CryptoSuper500 list, a snapshot of quantum computing, and the RISC-V Summit. They also discuss Bitcoin, AI vs. HPC, HPC in the cloud, liquid cooling, InfraTech, interconnects, optical computing, OpenMP, PCIe, CXL, GPUs, CPUs, and energy efficiency and ESG. Audio: https://orionx.net/wp-content/uploads/2024/12/OXD025_SC24-Postview_CryptoSuper500_Quantum_RISCV-Summit_20241216.mp3
We discuss upcoming RTX 5000 launches, the future of ARC, and Zen 5 / 6 CPUs! [SPON: Check out ALL Microsoft Surfaces: https://micro.center/s5c8 ] [SPON: Santa Clara Location coming SOON: https://micro.center/sdq0 ] [SPON: Shop Racing and Flight Sim Bundles: https://micro.center/rbqy ] [SPON: Shop Micro Center Holiday Savings: https://micro.center/t7iv ]
0:00 Slow News Holiday, New Jersey Drones, Console Laptops (Intro Banter)
3:56 Virtual Link, Intel & CHIPS (Corrections)
10:58 Nvidia RTX 5070 Ti 16GB, Blackwell Pricing
19:29 Pat Gelsinger "Retired" from Intel – Was he to blame?
26:29 Could Intel bounce back in a few years?
31:42 B580 12GB Reviewed & "Launched"
50:45 Intel Celestial, ARC Chiplets, B770
57:47 NEW Arrow Lake firmware offers little to no Performance Gains
1:02:45 AMD Zen 5 Krackan Performance & Specs Leaked
1:08:19 Will Krackan have a Zen 6 successor?
1:11:04 Valve is working on Steam Machines again!
1:14:42 Nintendo Switch 2 Pictured
1:18:09 Strix Latency, Intel Falcon Shores, Qualcomm Returns, GTA 6 (Wrap-up)
1:23:16 8K 144Hz Monitors, The Real Ultra Enthusiasts (Final RM)
https://www.techpowerup.com/329873/nvidia-geforce-rtx-5070-ti-leak-tips-more-vram-cores-and-power-draw
https://www.reuters.com/business/intel-ceo-pat-gelsinger-retire-2024-12-02/
https://www.techspot.com/review/2935-intel-arc-b580/
https://www.techpowerup.com/review/intel-arc-b580/
https://www.pcworld.com/article/2553897/intel-arc-b580-review.html
https://www.youtube.com/watch?v=599O7Q4BfaI&ab_channel=KitGuruTech
https://x.com/GPUsAreMagic/status/1867311371848650856
https://youtu.be/1PDiFr0UhtQ?si=PY9qhjNxnKta77GO (HWU Tim talking with Intel's Tom Peterson)
https://www.tomshardware.com/pc-components/cpus/intels-latest-arrow-lake-cpu-firmware-reportedly-offers-little-to-no-performance-gains-users-test-the-microcode-ahead-of-launch-on-the-asrock-z890-taichi-ocf
https://www.tomshardware.com/pc-components/cpus/cyberpunk-2077-update-2-2-reportedly-improves-arrow-lake-performance-by-up-to-33-percent-theoretically-matching-the-ryzen-7-7800x3d
https://www.pcgamesn.com/intel/cyberpunk-2077-arrow-lake-benchmarks
https://videocardz.com/newz/cyberpunk-2077-2-0-patch-to-increase-8-core-cpu-utilization-to-90-mods-for-ryzen-smt-no-longer-needed
https://videocardz.com/newz/acer-swift-laptop-to-feature-8-core-ryzen-ai-7-350-amd-krackan-processor-faster-than-ryzen-7-8845hs
https://browser.geekbench.com/v6/cpu/9384066
https://www.techspot.com/news/105892-valve-may-ready-revive-steam-machines-project-fremont.html
https://x.com/SadlyItsBradley/status/1864960198760100115
https://x.com/SadlyItsBradley/status/1864960200924450934
https://www.techspot.com/news/105849-powered-steamos-valve-linux-gaming-os-prepares-expand.html
https://www.theverge.com/2024/12/13/24320477/lenovo-legion-go-s-steamos-handheld-gaming-pc-rumors
https://wccftech.com/nintendo-switch-2-joy-con-pictures/
https://www.bilibili.com/video/BV19GzdYSEhe/?spm_id_from=333.999.0.0
https://videocardz.com/newz/nintendo-switch-2-design-revealed-by-case-manufacturers
https://x.com/deckwizardyt/status/1868098994963910865
https://wccftech.com/amd-finally-finds-fixes-for-improving-inter-core-latency-on-strix-point-apus/
https://www.techspot.com/news/105914-time-names-amd-boss-lisa-su-ceo-year.html
https://time.com/7200909/ceo-of-the-year-2024-lisa-su/
https://www.techpowerup.com/329828/intel-co-ceo-dampens-expectations-for-first-gen-falcon-shores-gpu
https://www.techspot.com/news/105962-intel-claims-retailers-facing-high-return-rates-snapdragon.html
In this episode of the Virtually Speaking Podcast, we sit down with Dave Morera, Technical Marketing Architect, and Arvind Jagannath, Product Management Lead from Broadcom, to explore how VMware by Broadcom is addressing customer challenges like rising costs, underutilized CPUs, and unbalanced memory-CPU ratios. We dive into the exciting tech preview of NVMe-based memory tiering, uncovering its potential to transform performance for workloads like databases and virtual desktop infrastructure (VDI). Drawing on VMware's rich history of memory management expertise, including technologies like vSphere vMotion, Dave and Arvind share insights into the customer value proposition, real-world performance data, and how memory tiering can alleviate critical pain points for IT environments. Don't miss this in-depth look at a groundbreaking approach to optimizing memory and CPU utilization!
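For readers curious how memory tiering works mechanically: the general pattern (not necessarily VMware's implementation, whose internals this episode only previews) is to keep recently touched pages in DRAM and demote cold pages to a slower, cheaper NVMe tier. Here is a minimal sketch of that placement policy, with an invented tier size and a simple LRU recency heuristic.

```python
from collections import OrderedDict

# Generic two-tier page placement sketch (hypothetical; not vSphere code).
# Hot pages live in a small, fast "DRAM" tier managed LRU-style; on
# overflow, the least recently used page is demoted to a large "NVMe" tier.

DRAM_CAPACITY = 4  # pages; tiny on purpose so demotions are visible

dram: "OrderedDict[int, str]" = OrderedDict()  # page id -> payload, LRU order
nvme: dict = {}

def touch(page: int, payload: str = "") -> str:
    """Access a page, promoting it to DRAM and demoting a cold page if full."""
    if page in dram:
        dram.move_to_end(page)          # mark as most recently used
        return "dram hit"
    if page in nvme:
        payload = nvme.pop(page)        # promote from the slow tier
    dram[page] = payload
    dram.move_to_end(page)
    if len(dram) > DRAM_CAPACITY:
        cold, data = dram.popitem(last=False)  # evict the LRU page
        nvme[cold] = data                      # demote to the NVMe tier
        return f"promoted {page}, demoted {cold}"
    return "filled dram"

for p in [1, 2, 3, 4, 1, 5, 6, 2]:
    print(p, "->", touch(p))
```

The appeal described in the episode follows from exactly this split: workloads with a modest hot set keep DRAM-speed access for the pages that matter, while total addressable memory grows at NVMe cost.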
Surprise! There's an episode today! We are continuing our series on The Computer in this episode. We dive into CPUs and GPUs, each of their individual parts, and the differences between them. Please DO watch part one of the video (S2 E14) for more context on what we are talking about! (Unless you already have a general knowledge of RAM, SSDs, HDDs, and display technology.) Here's the video I was talking about (CPU vs. GPU): https://www.youtube.com/watch?v=-P28LKWTzrI
Sources:
https://handwovenmagazine.com/green-dye-death/
https://www.techtarget.com/searchstorage/definition/cache-memory#:~:text=There%20are%20three%20general%20cache,often%20more%20capacious%20than%20L1.
https://ms.codes/blogs/computer-hardware/how-many-alu-in-cpu#:~:text=In%20a%20single%20ALU%20architecture,all%20arithmetic%20and%20logical%20operations.
https://www.redhat.com/en/blog/cpu-components-functionality
https://aws.amazon.com/what-is/cpu/
https://edu.gcfglobal.org/en/computerbasics/inside-a-computer/1/
https://www.lenovo.com/us/en/glossary/arithmetic-logic-unit/
https://www.digitaltrends.com/computing/what-is-a-cpu/
https://www.youtube.com/watch?v=Axd50ew4pco
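As a companion to the CPU parts covered in the episode above, here is a toy model of what an ALU does: take two operands and an operation selector, return the result plus a couple of status flags. It's a teaching illustration only, not how any real CPU is wired.

```python
# Toy ALU: an illustrative model of the arithmetic logic unit discussed
# in the episode, not a description of real circuitry.

def alu(op: str, a: int, b: int, bits: int = 8):
    """Perform one ALU operation on `bits`-wide unsigned operands.
    Returns (result, zero_flag, carry_flag)."""
    mask = (1 << bits) - 1
    raw = {
        "ADD": a + b,
        "SUB": a - b,
        "AND": a & b,
        "OR":  a | b,
        "XOR": a ^ b,
    }[op]
    result = raw & mask    # wrap the result to the register width
    carry = raw != result  # overflow/borrow escaped the register
    return result, result == 0, carry

print(alu("ADD", 200, 100))  # (44, False, True) -- 300 wraps in 8 bits
print(alu("SUB", 5, 5))      # (0, True, False)
```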
It's been a wild few months in CPUs, with next-generation releases from both AMD and Intel in their respective Zen 5 and Arrow Lake lines. Now that almost all the big parts are out, we break down what's what, including why everyone is finally going disaggregated (and what that means), what's going on with OS updates that make your processor run faster, which one to get if you just want to play games, what the new CU-DIMM standard means for RAM, and more. Support the Pod! Contribute to the Tech Pod Patreon and get access to our booming Discord, a monthly bonus episode, your name in the credits, and other great benefits! You can support the show at: https://patreon.com/techpod
- Intel's latest CPUs are a real mixed bag
- Just let it go: https://wccftech.com/woman-fell-upside-down-trying-to-retrieve-dropped-iphone/
- CIA goodies: https://en.wikipedia.org/wiki/ANT_catalog
- Space power: https://arstechnica.com/space/2024/10/solar-power-from-space-actually-it-might-happen-in-a-couple-of-years/
- mRNA is very cool!! https://gizmodo.com/covid-vaccine-tech-is-being-used-to-fight-a-nasty-diarrhea-causing-bacteria-2000514541
- Bucket list adventure, now done with robots: https://nypost.com/2024/10/14/lifestyle/aquariums-star-whale-shark-shocks-visitors-because-it-turned-out-to-be-a-robot/ and https://youtu.be/xoePM7uj-B4?si=3fLs_jdzyB2HXNcb (just a great whale shark video)
- Well crud, I'm out of a job: https://www.mainebiz.biz/article/jackson-lab-partners-with-animal-housing-firm-on-ai-cage-monitoring-system
Qualcomm has unveiled its big, next-gen Snapdragon SoC, which will someday be in all the phones. Except iPhones, of course. Is Netflix pulling back on its gaming strategy? Why a new marketplace from Epic might actually point the way to the metaverse. And how are various people trying to get AI to have a better personality?
Links:
Snapdragon 8 Elite deep dive: A return to custom CPUs and much more (Android Authority)
Scoop: Netflix shuts down 'AAA' Team Blue gaming studio, amid gaming shake-up (Game File)
Netflix has closed its AAA gaming studio (Engadget)
Hulu and Disney+ No Longer Support Signups and Payment Using App Store (MacRumors)
Epic's ambitious digital asset shop is now open (The Verge)
Biden administration proposes new rules governing data transfers to adversarial nations (The Record)
How AI groups are infusing their chatbots with personality (Financial Times)
AI is sucking up energy at an alarming rate. Gartner predicts that AI could consume up to 3.5% of global electricity by 2030. But what if quantum computing could change that? Peter Chapman of IonQ will break down how quantum tech could reduce the power needed to fuel AI's explosive growth and why it's the next big thing in computing.
Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Jordan and Peter questions on AI and energy
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn
Topics Covered in This Episode:
1. Quantum Computing in AI
2. Barriers to Adopting Quantum Computing
3. Mechanics of Quantum Computing
4. Quantum Computing's Role in Energy Efficiency
5. Quantum Computing's Future Role
Timestamps:
01:45 Daily AI news
05:00 About Peter and IonQ
06:24 Quantum computers needed for complex problem solving
09:11 Quantum qubits: electrons exist as probabilities everywhere
12:53 Quantum computing at cusp, future applications unknown
15:42 Quantum can address generative AI's energy demands
18:48 Quantum power surpasses universal atoms; AI potential
21:38 Exploring quantum processors for LLM efficiency improvement
27:08 Reduce energy demand to address climate change
29:20 Quantum excels in chemistry, optimization, AI tasks
31:26 Is human intelligence inherently quantum and efficient?
Keywords: Peter Chapman, quantum computing, classical systems, transistors, quantum processor, AI, large language models, Prime Prompt Polished ChatGPT, efficient prompting, Quantum Processing Units, linear algebra, barriers to adoption, theoretical perceptions, cloud services, energy savings, environmental impact, nuclear power, data centers, energy demands, power plants, optimization problems, CPUs, GPUs, QPUs, drug discovery, artificial intelligence, qubits, parallelization, classical bits, 64-qubit chip.
Get more out of ChatGPT by learning our PPP method in this live, interactive and free training! Sign up now: https://youreverydayai.com/ppp-registration/
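A footnote on the "quantum power surpasses universal atoms" point above: the standard back-of-envelope argument is that a classical description of an n-qubit state needs 2^n complex amplitudes, so the 64-qubit chip mentioned in the keywords is already beyond classical simulation. A tiny plain-Python illustration, assuming 16 bytes per complex amplitude:

```python
# Why qubit counts scale so brutally: storing the full state of n qubits
# classically takes 2**n complex amplitudes. The 16-bytes-per-amplitude
# figure is an illustrative assumption (one complex128 value).

BYTES_PER_AMPLITUDE = 16

for n in [1, 2, 10, 30, 64]:
    amplitudes = 2 ** n
    gib = amplitudes * BYTES_PER_AMPLITUDE / 2**30
    print(f"{n:>2} qubits -> {amplitudes:.3e} amplitudes, ~{gib:.3e} GiB")

# 30 qubits already needs ~16 GiB of state; 64 qubits needs on the order
# of hundreds of exabytes, which is the usual intuition for why quantum
# hardware can attack problems that drown classical machines in memory.
```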