Podcasts about WebGPU

  • 36 PODCASTS
  • 47 EPISODES
  • 1h AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Feb 26, 2025 LATEST

POPULARITY (chart: 2017–2024)


Best podcasts about WebGPU

Latest podcast episodes about WebGPU

Voices of VR Podcast – Designing for Virtual Reality
#1524: HTC Viverse Leverages Open Web Tech in Aim to Become ‘YouTube of 3D Content’

Feb 26, 2025 • 64:42


HTC has launched Viverse, a platform that aims to become the YouTube of 3D content. It hosts 3D content and worlds built on PlayCanvas as well as many other open technologies like WebXR, WebGPU, and VRM. It is sort of like a mix of VRChat-style social worlds and 3D web crypto-based metaverse worlds, but with some enterprise embedding features as well. It's exciting to see this shift toward building out the open and interoperable metaverse, albeit in the context of a hybrid walled garden built on open web technology stacks. It is also a hybrid in another way: it is mostly consumer-facing but also has enterprise use cases like private embedding of 3D content. I interviewed HTC's Andranik Aslanyan about the new VIVERSE platform, how they recruited over a hundred XR and WebXR developers to seed the content, and how VIVERSE fits into their overall strategy. I get some clarifications on the Android XR non-exclusive IP acquisition, and on where HTC is heading now that we're coming up on 10 years since the HTC Vive was announced on March 1, 2015. This is a listener-supported podcast through the Voices of VR Patreon. Music: Fatality

programmier.bar – the podcast for app and web development
News 06/25: Apple's New App // JavaScript Temporal // Web AI Acceleration Fund // Angular Documentary // Ross Ulbricht // Bitcoin in El Salvador

Feb 5, 2025 • 26:42


Completely unplanned, Dennis makes it to our recording after all, bringing Apple's newest (and unexpected) app along. What's hiding behind the codename "Confetti"? You'll find out from us, of course!

We also report on a whole collection of news from the JavaScript world. Temporal promises a better implementation of dates and time spans, and with the "Web AI Acceleration Fund" Google wants to improve support for WebGPU and in-browser LLMs and expand the associated toolchains and framework integrations.

If you'd rather sit back and watch, check out the latest documentary from Honeypot: "Angular: The Documentary | An Origin Story" offers interesting impressions and interviews with founders and companions of the JavaScript framework.

Dave also reports on the pardon of Silk Road founder Ross Ulbricht, and Garrelt explains why El Salvador had to abandon its project of a digital national currency.

Don't forget: once we reach 100 participants, there is something to win for filling out our listener survey! So join in and encourage your fellow listeners. You'll find the details of the giveaway at https://www.programmier.bar/gewinnspiel

Write to us! Send us your topic requests and feedback: podcast@programmier.bar

Follow us! Stay up to date on future episodes and virtual meetups and join the community discussions: Bluesky, Instagram, LinkedIn, Meetup, YouTube
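The "Web AI Acceleration Fund" item above is about running LLMs directly in the browser on top of WebGPU. As a minimal feature-detection sketch in TypeScript (`navigator.gpu` and `requestAdapter()` are the standard WebGPU entry points; the fallback message is illustrative):

```typescript
// Check whether this browser can run WebGPU-backed in-browser inference.
async function hasWebGPU(): Promise<boolean> {
  // navigator.gpu exists only in WebGPU-capable browsers (typed loosely
  // here to avoid depending on the @webgpu/types package).
  const gpu = (navigator as any).gpu;
  if (!gpu) return false;
  // requestAdapter() resolves to null when no suitable GPU is available.
  const adapter = await gpu.requestAdapter();
  return adapter !== null;
}

hasWebGPU().then((ok) =>
  console.log(ok ? "WebGPU available" : "fall back to WASM or a server")
);
```

In-browser LLM toolchains typically run a check like this before choosing between a WebGPU backend and a WASM fallback.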

Podcast proConf
#156 Web AI - DeepSeek Rides the Hype | Will AI Destroy the Frontend? | WebGPU Ruins Your Hair | Google Brings AI to Everyone

Jan 29, 2025 • 115:06


Valera: https://www.linkedin.com/in/wavecut/
Plotva: https://t.me/PlotvoBot https://t.me/plotquot
AI community: https://itbeard.com/projects/evocoders

18:07 Web AI Summit 2024: State of client side machine learning (https://youtu.be/tF70o1Q8VkM)
23:32 Web AI on next generation AI PCs (https://youtu.be/5BjB7AIed3A)
23:32 The future of AI is now: Real-life case studies for on client-side AI adoption in web apps (https://youtu.be/LFveSvTJh5U)
40:17 MediaPipe Web: Bringing cross-platform AI tech to the browser (https://youtu.be/tVvKlx-oVqc)
46:03 State isn't all you need, but It helps: building better LLM apps in the browser (https://youtu.be/87un2cGrn-0)
55:27 Transformers.js: State-of-the-art Machine Learning for the web (https://youtu.be/n18Lrbo8VU8)
01:03:16 Beyond the banner: The power of Web AI to personalize paid rich media ads (https://youtu.be/vizYvB3-Z8o)
01:09:20 Web AI in industry: How TensorFlow.js has driven what you see on the supermarket shelves (https://youtu.be/u9MCtWrgEUs)
00:16:24 Lessons learned from being customer zero of Chrome's built-in APIs (https://youtu.be/surh_D8CU9A)
01:23:27 WebLLM: A high-performance in-browser LLM Inference engine (https://youtu.be/MhTCzq7iTy0)
01:28:00 Overview of Chrome built-in AI (https://youtu.be/1TAhv4vqkTw)
01:36:07 Transforming access to healthcare through Web AI (https://youtu.be/rR08A-X11Ys)
01:39:43 ml5.js - Friendly machine learning for the web (https://youtu.be/LHhSxtgyuUw)
01:42:30 The Web Neural Network (WebNN) API: Where we are and what's Next (https://youtu.be/FoYBWzXCsmM)
01:46:36 Exploring alternative interactions in JavaScript (https://youtu.be/i-znmbB-SGA)

Find us:
1. Telegram: https://t.me/proConf
2. YouTube: https://www.youtube.com/c/proconf
3. SoundCloud: https://soundcloud.com/proconf
4. iTunes: https://podcasts.apple.com/by/podcast/podcast-proconf/id1455023466
5. Spotify: https://open.spotify.com/show/77BSWwGavfnMKGIg5TDnLz
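Several of the talks linked above center on Transformers.js and in-browser inference. As a rough usage sketch, assuming the @xenova/transformers package and its pipeline API (the example task and text are illustrative; the library's default model is downloaded and cached on first use):

```typescript
// Rough sketch: client-side inference with Transformers.js.
import { pipeline } from "@xenova/transformers";

async function classify(text: string) {
  // Create a sentiment-analysis pipeline with the library's default model.
  const classifier = await pipeline("sentiment-analysis");
  const [result] = await classifier(text);
  console.log(result); // e.g. { label: "POSITIVE", score: 0.99 }
  return result;
}

classify("WebGPU makes in-browser inference surprisingly fast.");
```

Everything runs in the browser (or in Node), which is the "Web AI" theme these talks share.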

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Bolt.new, Flow Engineering for Code Agents, and >$8m ARR in 2 months as a Claude Wrapper

Dec 2, 2024 • 98:39


The full schedule for Latent Space LIVE! at NeurIPS has been announced, featuring Best of 2024 overview talks for the AI Startup Landscape, Computer Vision, Open Models, Transformers Killers, Synthetic Data, Agents, and Scaling, and speakers from Sarah Guo of Conviction, Roboflow, AI2/Meta, Recursal/Together, HuggingFace, OpenHands and SemiAnalysis. Join us for the IRL event/livestream! Alessio will also be holding a meetup at AWS re:Invent in Las Vegas this Wednesday. See our new Events page for dates of AI Engineer Summit, Singapore, and World's Fair in 2025.

LAST CALL for questions for our big 2024 recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!

When we first observed that GPT Wrappers are Good, Actually, we did not even have Bolt on our radar. Since we recorded our Anthropic episode discussing building Agents with the new Claude 3.5 Sonnet, Bolt.new (by StackBlitz) has easily cleared the $8m ARR bar, repeating and accelerating its initial $4m feat. There are very many AI code generators and VS Code forks out there, but Bolt probably broke through initially because of its incredible zero-shot, low-effort app generation.

But as we explain in the pod, Bolt also emphasized deploy (Netlify) / backend (Supabase) / fullstack capabilities on top of StackBlitz's existing WebContainer full-WASM-powered developer-environment-in-the-browser tech. Since then, the team has been shipping like mad (with weekly office hours), with bugfixing, full screen, multi-device, long context, and diff-based edits (using speculative decoding like we covered in Inference, Fast and Slow).

All of this has captured the imagination of low/no-code builders like Greg Isenberg and many others on YouTube/TikTok/Reddit/X/LinkedIn etc.

Just as with Fireworks, our relationship with Bolt/StackBlitz goes a bit deeper than normal - swyx advised the launch and got a front-row seat to this epic journey, as well as demoed it with Realtime Voice at the recent OpenAI Dev Day. So we are very proud to be the first/closest to tell the full open story of Bolt/StackBlitz!

Flow Engineering + Qodo/AlphaCodium Update

In year 2 of the pod we have been on a roll getting former guests to return as guest cohosts (Harrison Chase, Aman Sanger, Jon Frankle), and it was a pleasure to catch Itamar Friedman back on the pod, giving us an update on all things Qodo and Testing Agents since our last catchup a year and a half ago. Qodo (they renamed in September) went viral in early January this year with AlphaCodium (paper here, code here) beating DeepMind's AlphaCode with high efficiency, with a simple problem-solving code agent (a sketch of this loop follows the chapter list below):

* The first step is to have the model reason about the problem. They describe it using bullet points and focus on the goal, inputs, outputs, rules, constraints, and any other relevant details.
* Then, they make the model reason about the public tests and come up with an explanation of why the input leads to that particular output.
* The model generates two to three potential solutions in text and ranks them in terms of correctness, simplicity, and robustness.
* Then, it generates more diverse tests for the problem, covering cases not part of the original public tests.
* Iteratively, pick a solution, generate the code, and run it on a few test cases.
* If the tests fail, improve the code and repeat the process until the code passes every test.

swyx has previously written similar thoughts on types vs tests for putting bounds on program behavior, but AlphaCodium extends this to AI-generated tests and code. More recently, Itamar has also shown that AlphaCodium's techniques extend well to the o1 models, making Flow Engineering a useful technique to improve code model performance on every model. This is something we see AI Engineers uniquely well positioned to do compared to ML Engineers/Researchers.

Full Video Podcast

Like and subscribe!

Show Notes
* Itamar
* Qodo
* First episode
* Eric
* Bolt
* StackBlitz
* Thinkster
* AlphaCodium
* WebContainers

Chapters
* 00:00:00 Introductions & Updates
* 00:06:01 Generic vs. Specific AI Agents
* 00:07:40 Maintaining vs Creating with AI
* 00:17:46 Human vs Agent Computer Interfaces
* 00:20:15 Why Docker doesn't work for Bolt
* 00:24:23 Creating Testing and Code Review Loops
* 00:28:07 Bolt's Task Breakdown Flow
* 00:31:04 AI in Complex Enterprise Environments
* 00:41:43 AlphaCodium
* 00:44:39 Strategies for Breaking Down Complex Tasks
* 00:45:22 Building in Open Source
* 00:50:35 Choosing a product as a founder
* 00:59:03 Reflections on Bolt Success
* 01:06:07 Building a B2C GTM
* 01:18:11 AI Capabilities and Pricing Tiers
* 01:20:28 What makes Bolt unique
* 01:23:07 Future Growth and Product Development
* 01:29:06 Competitive Landscape in AI Engineering
* 01:30:01 Advice to Founders and Embracing AI
* 01:32:20 Having a baby and completing an Iron Man
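As promised above, here is a compact sketch of that AlphaCodium-style loop, written as an illustration of the bullets rather than Qodo's actual implementation; `llm` and `runTests` are hypothetical stand-ins for a model call and a test harness:

```typescript
// Sketch of the AlphaCodium-style flow: reason, generate, test, repair.
// `llm` and `runTests` are hypothetical placeholders, not a real API.
type TestCase = { input: string; expected: string };
declare function llm(prompt: string): Promise<string>;
declare function runTests(
  code: string,
  tests: TestCase[]
): Promise<{ passed: boolean; failures: string }>;

async function solve(problem: string, publicTests: TestCase[]): Promise<string> {
  // 1. Reason about the problem: goal, inputs, outputs, rules, constraints.
  const analysis = await llm(`Describe in bullet points: ${problem}`);
  // 2. Explain why each public test's input leads to its output.
  const testNotes = await llm(
    `Explain these tests:\n${JSON.stringify(publicTests)}\n${analysis}`
  );
  // 3. Generate two to three candidate solutions in text and rank them.
  const ranked = await llm(
    `Propose 3 solutions, ranked by correctness, simplicity, robustness:\n${analysis}`
  );
  // 4. Generate more diverse tests beyond the original public ones.
  const extraTests: TestCase[] = JSON.parse(
    await llm(`Return JSON test cases not covered by:\n${JSON.stringify(publicTests)}`)
  );
  // 5. Pick a solution, generate code, and iterate until every test passes.
  let code = await llm(`Write code for the top-ranked solution:\n${ranked}\n${testNotes}`);
  for (let attempt = 0; attempt < 5; attempt++) {
    const result = await runTests(code, [...publicTests, ...extraTests]);
    if (result.passed) return code;
    code = await llm(`Improve this code given failing tests:\n${code}\n${result.failures}`);
  }
  throw new Error("no candidate passed all tests within the attempt budget");
}
```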
Transcript

Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:12]: Hey, and today we're still in our sort of makeshift in-between studio, but we're very delighted to have a former returning guest host, Itamar. Welcome back.Itamar [00:00:21]: Great to be here after a year or more. Yeah, a year and a half.Swyx [00:00:24]: You're one of our earliest guests on Agents. Now you're CEO co-founder of Qodo. Right. Which has just been renamed. You also raised a $40 million Series A, and we can get caught up on everything, but we're also delighted to have our new guest, Eric. Welcome.Eric [00:00:42]: Thank you. Excited to be here. Should I say Bolt or StackBlitz?Swyx [00:00:45]: Like, is it like its own company now or?Eric [00:00:47]: Yeah. Bolt's definitely bolt.new. That's the thing that we're probably the most known for, I imagine, at this point.Swyx [00:00:54]: Which is ridiculous to say because you were working at StackBlitz for so long.Eric [00:00:57]: Yeah. I mean, within a week, we were doing like double the amount of traffic. And StackBlitz had been online for seven years, and we were like, what? But anyways, yeah. So we're StackBlitz, the company behind bolt.new. If you've heard of bolt.new, that's our stuff. Yeah.Swyx [00:01:12]: Yeah.Itamar [00:01:13]: Excellent. I see, by the way, that the founder mode, you need to know to capture opportunities. So kudos on doing that, right? You're working on some technology, and then suddenly you can exploit that to a new world. Yeah.Eric [00:01:24]: Totally. And I think, well, not to jump, but 100%, I mean, a couple of months ago, we had the idea for Bolt earlier this year, but we haven't really shared this too much publicly. But we actually had tried to build it with some of those state-of-the-art models back in January, February, you can kind of imagine which, and they just weren't good enough to actually do the code generation where the code was accurate and it was fast and whatever have you without a ton of like RAG, but then there was like issues with that. So we put it on the shelf and then we got kind of a sneak peek of some of the new models that have come out in the past couple of months now. And so once we saw that, once we actually saw the code gen from it, we were like, oh my God, like, okay, we can build a product around this. And so that was really the impetus of us building the thing. But with that, it was StackBlitz, the core StackBlitz product the past seven years has been an IDE for developers. So the entire user experience flow we've built up just didn't make sense. And so when we kind of went out to build Bolt, we just thought, you know, if we were inventing our product today, what would the interface look like given what is now possible with the AI code gen? And so there's definitely a lot of conversations we had internally, but you know, just kind of when we logically laid it out, we were like, yeah, I think it makes sense to just greenfield a new thing and let's see what happens. If it works great, then we'll figure it out. If it doesn't work great, then it'll get deleted at some point. So that's kind of how it actually came to be.Swyx [00:02:49]: I'll mention your background a little bit. You were also founder of Thinkster before you started StackBlitz. So both of you are second time founders. Both of you have sort of re-founded your company recently. Yours was more of a rename. I think a slightly different direction as well. And then we can talk about both. Maybe just chronologically, should we get caught up on where Qodo is first and then you know, just like what people should know since the last pod? Sure.Itamar [00:03:12]: The last pod was two months after we launched and we basically had the vision that we talked about. The idea that software development is about specification, test and code, etc. We are more on the testing part as in essence, we think that if you solve testing, you solve software development. The beautiful chart that we'll put up on screen. And testing is a really big field, like there are many dimensions, unit testing, the level of the component, how big it is, how large it is. And then there is like different type of testing, is it regression or smoke or whatever. So back then we only had like one IDE extension with unit tests as the focus. One and a half years later, the first IDE extension supports more types of testing and is context-aware. We index local repos, but also 10,000s of repos for Fortune 500 companies. We have another agent, another tool: the PR-Agent is the open source one and the commercial one is Qodo Merge. And then we have another open source called CoverAgent, which is not yet a commercial product coming very soon. It's very impressive. It could be that already people are approving automated pull requests that they're not even aware of in really big open source projects. So once we have enough of these, we will also launch another agent. So for the first one and a half years, what we did is grew our offering and mostly on the side of, does this code actually work: testing, code review, et cetera. And we believe that's the critical milestone that needs to be achieved to actually have the AI engineer for enterprise software.
And then like for the first year was everything bottom up, getting to 1 million installations. 2024, that was 2023, 2024 was starting to monetize, to feel like how it is to make the first buck. So we did the teams offering, it went well with a thousand teams, et cetera. And then we started like just a few months ago to do enterprise with everything you need, which is a lot of things that we discussed in the last post that was just released by Codium. So that's how we call it at Codium. Just opening the brackets, our company name was CodiumAI, and we renamed to Qodo and we call our models Codium. So back to my point, so we started the enterprise motion and already have multiple Fortune 100 companies. And then with that, we raised a Series A of $40 million. And what's exciting about it is that enables us to develop more agents. That's our focus. I think it's very different. We're not coming very soon with an IDE or something like that.Swyx [00:06:01]: You don't want to fork this code?Itamar [00:06:03]: Maybe we'll fork JetBrains or something just to be different.Swyx [00:06:08]: I noticed that, you know, I think the promise of general purpose agents has kind of died. Like everyone is doing kind of what you're doing. There's Qodo Gen, Qodo Merge, and then there's a third one. What's the name of it?Itamar [00:06:17]: Yeah. Qodo Cover. Cover. Which is like a commercial version of CoverAgent. It's coming soon.Swyx [00:06:23]: Yeah. It's very similar with Factory AI, also doing like Droids. They all have special purpose doing things, but people don't really want general purpose agents. Right. The last time you were here, we talked about AutoGPT, the biggest thing of 2023. This year, not really relevant anymore. And I think it's mostly just because when you give me a general purpose agent, I don't know what to do with it.Eric [00:06:42]: Yeah.Itamar [00:06:43]: I totally agree with that. We're seeing it for a while and I think it will stay like that despite the computer use, et cetera, that supposedly can just replace us. You can just like prompt it to be, hey, now be a QA or be a QA person or a developer. I still think that there's a few reasons why you see like a dedicated agent. Again, I'm a bit more focused, like my head is more on complex software for big teams and enterprise, et cetera. And even think about permissions and what are the data sources and just the same way you manage permissions for users. Developers, you probably want to have dedicated guardrails and dedicated approvals for agents. I intentionally like touched a point not many people think about. And of course, then what you can think of, like maybe there's different tools, tool use, et cetera. But just the first point by itself is a good reason why you want to have different agents.Alessio [00:07:40]: Just to compare that with Bolt.new, you're almost focused on like the application is very complex and now you need better tools to kind of manage it and build on top of it. On Bolt.new, it's almost like I was using it the other day. There's basically like, hey, look, I'm just trying to get started. You know, I'm not very opinionated on like how you're going to implement this. Like this is what I want to do. And you build a beautiful app with it.
What people ask as the next step, you know, going back to like the general versus like specific, have you had people say, hey, you know, this is great to start, but then I want a specific Bolt.new dot whatever else to do a more vertical integration and kind of like development or what's the, what do people say?Eric [00:08:18]: Yeah. I think, I think you kind of hit the, hit it head on, which is, you know, kind of the way that we've, we've kind of talked about internally is it's like people are using Bolt to go from like 0.0 to 1.0, like that's like kind of the biggest unlock that Bolt has versus most other things out there. I mean, I think that's kind of what's, what's very unique about Bolt. I think the, you know, the working on like existing enterprise applications is, I mean, it's crazy important because, you know, there's a, you look, when you look at the Fortune 500, I mean, these code bases, some of these have been around for 20, 30 plus years. And so it's important to be going from, you know, 101.3 to 101.4, et cetera. I think for us, so what's been actually pretty interesting is we see there's kind of two different users for us that are coming in and it's very distinct. It's like people that are developers already. And then there's people that have never really written software, or if they have, it's been very, very minimal. And so in the first camp, what these developers are doing, like to go from zero to one, they're coming to Bolt and then they're ejecting the thing to GitHub or just downloading it and, you know, opening Cursor, like whatever to, to, you know, keep iterating on the thing. And sometimes they'll bring it back to Bolt to like add in a huge piece of functionality or something. Right. But for the people that don't know how to code, they're actually just, they, they live in this thing. And that was one of the weird things when we launched is, you know, within a day of us being online, one of the most popular YouTube videos, and there's been a ton since, which was, you know, there's like, oh, Bolt is the Cursor killer. And I originally saw the headlines and I was like, thanks for the views. I mean, I don't know. This doesn't make sense to me. That's not, that's not what we kind of thought.Swyx [00:09:44]: It's how YouTubers talk to each other. Well, everything kills everything else.Eric [00:09:47]: Totally. But what blew my mind was that there was any comparison because it's like Cursor is a, is a local IDE product. But when, when we actually kind of dug into it and we, and we have people that are using our product saying this, I'm not using Cursor. And I was like, what? And it turns out there are hundreds of thousands of people that we have seen that were using Cursor and were trying to build apps with that where they're not traditional software devs, but were heavily leaning on the AI. And as you can imagine, it is very complicated, right? To do that with Cursor. So when Bolt came out, they're like, wow, this thing's amazing because it kind of inverts the complexity where it's like, you know, it's not an IDE, it's, it's a, it's a chat-based sort of interface that we have. So that's kind of the split, which is rather interesting. We've had like the first startups now launch off of Bolt entirely where this, you know, tomorrow I'm doing a live stream with this guy named Paul, who he's built an entire CRM using this thing and you know, with backend, et cetera.
And people have made their first money on the internet period, you know, launching this with Stripe or whatever have you. So that's, that's kind of the two main, the two main categories of folks that we see using Bolt though.Itamar [00:10:51]: I agree that I don't understand the comparison. It doesn't make sense to me. I think like we have like two types of families of tools. One is like we re-imagine the software development. I think Bolt is there and I think like a Cursor is more like an evolution of what we already have. It's like taking the IDE and it's, it's amazing and it's okay, let's, let's adapt the IDE to an era where LLMs can do a lot for us. And Bolt is more like, okay, let's rethink everything totally. And I think we see a few tools there, like maybe Vercel's v0 and maybe Replit in that area. And then in the area of let's expedite, let's change, let's, let's progress with what we already have. You can see Cursor and Qodo, but we're different between ourselves, Cursor and Qodo, but definitely I think that comparison doesn't make sense.Alessio [00:11:42]: And just to set the context, this is not a Twitter demo. You've made $4 million of revenue in four weeks. So this is, this is actually working, you know, it's not a, what, what do you think that is? Like, there's been so many people demoing coding agents on Twitter and then it doesn't really work. And then you guys were just like, here you go, it's live, go use it, pay us for it. You know, is there anything in the development that was like interesting and maybe how that compares to building your own agents?Eric [00:12:08]: We had no idea, honestly, like we, we, we've been pretty blown away and, and things have just kind of continued to grow faster since then. We're like, oh, today is week six. So I, I kind of came back to the point you just made, right, where it's, you, you kind of outlined, it's like, there's kind of this new market of like kind of rethinking the software development and then there's heavily augmenting existing developers. I think that, you know, both of which are, you know, AI code gen being extremely good, it's allowing existing developers to crank out software far faster than they could have ever before, right? It's like the ultimate power tool for an existing developer. But this code gen stuff is now so good. And then, and we saw this over the past, you know, from the beginning of the year when we tried to first build, it's actually lowered the barrier to people that, that aren't traditionally software engineers. But the kind of the key thing is if you kind of think about it from, imagine you've never written software before, right? My co-founder and I, he and I grew up down the street from each other in Chicago. We learned how to code when we were 13 together and we've been building stuff ever since. And this is back in like the mid 2000s or whatever, you know, there was nothing free online on the internet to learn from about how to code. For our 13th birthdays, we asked our parents for, you know, O'Reilly books cause you couldn't get this at the library, right? And so instead of like an Xbox, we got, you know, programming books. But the hardest part for everyone learning to code is getting an environment set up locally, you know?
And so when we built StackBlitz, like kind of the key thesis, like seven years ago, the insight we had was that, hey, it seems like the browser has a lot of new APIs like WebAssembly and service workers, et cetera, where you could actually write an operating system that ran inside the browser that could boot in milliseconds. And you, you know, basically there's this missing capability of the web. Like the web should be able to build apps for the web, right? You should be able to build the web on the web. Every other platform has that, Visual Studio for Windows, Xcode for Mac. The web has no built in primitive for this. And so just like our built in kind of like nerd instinct on this was like, that seems like a huge hole and it's, you know, it will be very valuable or like, you know, a very valuable problem to solve. So if you want to set up those environments, you know, this is what we spent the past seven years doing. And the reality is existing developers have it running locally. They already know how to set up that environment. So the problem isn't as acute for them. When we put Bolt online, we took that technology called WebContainer and married it with these, you know, state of the art frontier models. And the people that have the most pain with getting stuff set up locally is people that don't code. I think that's been, you know, really the big explosive reason is no one else has been trying to make dev environments work inside of a browser tab, you know, for basically ever, other than basically our company, largely because there wasn't an immediate demand or need. So I think we kind of find ourselves at the right place at the right time. And again, for this market of people that don't know how to write software, you would kind of expect that you should be able to do this without downloading something to your computer in the same way that, hey, I don't have to download Photoshop now to make designs because there's Figma. I don't have to download Word because there's, you know, Google Docs. They're kind of looking at this as that sort of thing, right? Which was kind of the, you know, our impetus and kind of vision from the get-go. But you know, the code gen, the AI code gen stuff that's come out has just been, you know, an order of magnitude multiplier on how magic that is, right? So that's kind of my best distillation of like, what is going on here, you know?Alessio [00:15:21]: And you can deploy too, right?Eric [00:15:22]: Yeah.Alessio [00:15:23]: Yeah.Eric [00:15:24]: And so that's, what's really cool is it's, you know, we have deployment built in with Netlify and this is actually, I think, Sean, you actually built this at Netlify when you were there. Yeah. It's one of the most brilliant integrations actually, because, you know, effectively the API that Sean built, maybe you can speak to it, but like as a provider, we can just effectively give files to Netlify without the user even logging in and they have a live website. And if they want to keep, hold onto it, they can click a link and claim it to their Netlify account. But it basically is just this really magic experience because when you come to Bolt, you say, I want a website. Like my mom, 70, 71 years old, made her first website, you know, on the internet two weeks ago, right? It was about her nursing days.Swyx [00:16:03]: Oh, that's fantastic though. It wouldn't have been made.Eric [00:16:06]: A hundred percent.
Cause even in, you know, when we've had a lot of people building personal, like deeply personal stuff, like in the first week we launched this, the sales guy from the East Coast, you know, replied to a tweet of mine and he said, thank you so much for building this to your team. His daughter has a medical condition and so for her to travel, she has to like line up donors or something, you know, so ahead of time. And so he actually used Bolt to make a website to do that, to actually go and send it to folks in the region she was going to travel to ahead of time. I was really touched by it, but I also thought like, why, you know, why didn't he use like Wix or Squarespace? Right? I mean, this is, this is a solved problem, quote unquote, right? And then when I thought, I actually used Squarespace for my, for my, uh, the wedding website for my wife and I, like back in 2021, so I'm familiar, you know, it was, it was faster. I know how to code. I was like, this is faster. Right. And I thought back and I was like, there's a whole interface you have to learn how to use. And it's actually not that simple. There's like a million things you can configure in that thing. When you come to Bolt, there's a, there's a text box. You just say, I need a, I need a wedding website. Here's the date. Here's where it is. And here's a photo of me and my wife, put it somewhere relevant. It's actually the simplest way. And that's what my, when my mom came, she said, uh, I'm Pat Simons. I was a nurse in the seventies, you know, and like, here's the things I did and a website came out. So coming back to why is this such a, I think, why are we seeing this sort of growth? It's, this is the simplest interface I think maybe ever created to actually build and deploy a website. And then that website, my mom made, she's like, okay, this looks great. And there's, there's one button, you just click it, deploy, and it's live and you can buy a domain name, attach it to it. And you know, it's as simple as it gets, it's getting even simpler with some of the stuff we're working on. But anyways, so that's, it's, it's, uh, it's been really interesting to see some of the usage like that.Swyx [00:17:46]: I can offer my perspective. So I, you know, I probably should have disclosed a little bit that, uh, I'm a, uh, StackBlitz investor.Alessio [00:17:53]: Cancel the episode. I know, I know. Don't play it now. Pause. Eric actually reached out to show me Bolt before the launch. And we, you know, we talked a lot about, like, the framing of, of what we're going to talk about, how we marketed the thing, but also, like, what we're... So that's what Bolt was going to need, like a whole sort of infrastructure.swyx: Netlify, I was a maintainer but I won't take credit for the anonymous upload. That's actually the origin story of Netlify. We can have Matt Biilmann talk about it, but that was [00:18:00] how Netlify started. You could drag and drop your zip file or folder from your desktop onto a website, it would have a live URL with no sign in.swyx: And so that was the origin story of Netlify. And it just persists to today. And it's just like it's really nice, interesting that both Bolt and Cognition's Devin and a bunch of other sort of agent-type startups, they all use Netlify to deploy because of this one feature. They don't really care about the other features.swyx: But, but just because it's easy for computers to use and talk to it, like if you build an interface for computers specifically, that it's easy for them to navigate, then they will be used in agents. And I think that's a learning that a lot of developer tools companies are having.
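The anonymous-upload deploy that swyx describes is a good example of an interface that is easy for computers to use. A hypothetical client-side sketch of the pattern (the endpoint and response shape here are invented for illustration and are not Netlify's actual API):

```typescript
// Hypothetical sketch of an anonymous deploy: POST a zip of site files with
// no credentials and get back a live URL plus a link for a human to claim
// the site later. Endpoint and response shape are illustrative only.
async function anonymousDeploy(
  siteZip: Uint8Array
): Promise<{ url: string; claimUrl: string }> {
  const res = await fetch("https://deploys.example.com/v1/anonymous", {
    method: "POST",
    headers: { "Content-Type": "application/zip" }, // note: no auth header
    body: siteZip,
  });
  if (!res.ok) throw new Error(`deploy failed: ${res.status}`);
  // The url is live immediately; claimUrl lets a person attach the site to
  // an account afterwards, which is what makes the flow agent-friendly.
  return res.json();
}
```

The design point is that an agent never has to negotiate a login screen: one unauthenticated request yields a working deployment.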
That's my Bolt launch story and now if I say all that stuff.swyx: And I just wanted to come back to, like, the WebContainers thing, right? Like, I think you put a lot of weight on the technical moats. I think you also are just like, very good at product. So you've, you've like, built a better agent than a lot of people, the rest of us, including myself, who have tried to build these things, and we didn't get as far as you did.swyx: Don't shortchange yourself on products. But I think specifically [00:19:00] on, on infra, on like the sandboxing, like this is a thing that people really want. Alessio has backed E2B, which we'll have on at some point, talking about like the sort of the serverful side. But yours is, you know, inside of the browser, serverless.swyx: It doesn't cost you anything to serve one person versus a million people. It doesn't, doesn't cost you anything. I think that's interesting. I think in theory, we should be able to like run tests because you can run the full backend. Like, you can run Git, you can run Node, you can run maybe Python someday.swyx: We talked about this. But ideally, you should be able to have a fully agentic loop, running code, seeing the errors, correcting code, and just kind of self healing, right? Like, I mean, isn't that the dream?Eric: Totally.swyx: Yeah,Eric: totally. At least in Bolt, we've got, we've got a good amount of that today. I mean, there's a lot more for us to do, but one of the nice things, because like in WebContainer, you know, there's a lot of kind of stuff you go Google like, you know, turn Docker container into WASM.Eric: You'll find a lot of stuff out there that will do that. The problem is it's very big, it's slow, and that ruins the experience. And so what we ended up doing is just writing an operating system from [00:20:00] scratch that was just purpose built to, you know, run in a browser tab. And the reason being is, you know, Docker-to-WASM things will give you an image that's like 60 to 100 megabytes, you know, maybe more, you know, and our, our OS, you know, kind of clocks in, I think, I think we're in like a, maybe, maybe a megabyte or less or something like that. I mean, it's, it's, you know, really, really, you know, stripped down.swyx: This is basically, the task involved is, I understand, mapping every single, single Linux call to some kind of WebAssembly implementation,Eric: but more or less, and, and then there's a lot of things actually, like when you're looking at a dev environment, there's a lot of things that you don't need that a traditional OS is gonna have, right?Eric: Like, you know, audio drivers or, you know, like, there's just like, there's just tons of things. Oh, yeah. Right. Yeah. That goes. Yeah. You can just kind, you can, you can kind of toss them. Or alternatively, what you can do is you can actually be the nice thing.
And this is, this kind of comes back to the origins of browsers, which is, you know, they're, they're at the beginning of the web and, you know, the late nineties, there were two very different kind of visions for the web, where Alan Kay vehemently [00:21:00] disagreed with the idea that it should be document based, which is, you know, Tim Berners-Lee's, you know, that, and that's kind of what ended up winning, winning was this document based kind of browsing documents on the web thing.Eric: Alan Kay, he's got this like very famous quote where he said, you know, you want web browsers to be mini operating systems. They should download little mini binaries and execute with like a little mini virtualized operating system in there. And what's kind of interesting about the history, not to geek out on this aspect, what's kind of interesting about the history is both of those folks ended up being right.Eric: Documents were actually the pragmatic way that the web worked. Was, you know, became the most ubiquitous platform in the world to the degree now that this is why WebAssembly has been invented is that we're doing, we need to do more low level things in a browser, same thing with WebGPU, et cetera. And so all these APIs, you know, to build an operating system came to the browser.Eric: And that was actually the realization we had in 2017 was, holy heck, like you can actually, you know, service workers, which were designed for allowing your app to work offline. That was the kind of the key one where it was like, wait a second, you can actually now run web servers within a [00:22:00] browser, like you can run a server that you open up.Eric: That's wild. Like full Node.js. Full Node.js. Like that capability. Like, I can have a URL that's programmatically controlled by a web application itself, boom. Like the web can build the web. The primitive is there. Everyone at the time, like we talked to people that like worked on, you know, Chrome and V8 and they were like, uhhhh.Eric: You know, like I don't know. But it's one of those things you just kind of have to go do it to find out. So we spent a couple of years, you know, working on it and yeah. And, and, and back in 2021 is when we kind of put the first, like, beta of WebContainer online. But...swyx: in partnership with Google, right?swyx: Like Google actually had to help you get over the finish line with stuff.Eric: A hundred percent, because well, you know, over the years of when we were doing the R&D on the thing, kind of the biggest challenge, the two ways that you can kind of test how powerful and capable a platform are, the two types of applications are one, video games, right, because they're just very compute intensive, a lot of calculations that have to happen, right?Eric: The second one are IDEs, because you're talking about actually virtualizing the actual [00:23:00] runtime environment you are in to actually build apps on top of it, which requires sophisticated capabilities, a lot of access to data. You know, a good amount of compute power, right, to effectively, you know, build an app-in-app sort of thing.Eric: So those, those are the stress tests. So if your platform is missing stuff, those are the things where you find out. Those are, those are the people building games and IDEs. They're the ones filing bugs on operating system level stuff. And for us, browser level stuff.
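As a minimal sketch of the "web server inside a browser tab" capability Eric describes, the real Service Worker fetch event lets page-origin code answer requests to a URL programmatically (the virtual path and HTML below are illustrative):

```typescript
// sw.ts - a tiny "server" running inside the browser. Once registered via
// navigator.serviceWorker.register("/sw.js"), requests under the virtual
// path below are answered by this code, with no network server involved.
self.addEventListener("fetch", (event: any) => {
  const url = new URL(event.request.url);
  if (url.pathname.startsWith("/virtual/")) {
    // Construct the response programmatically, like a server route handler.
    event.respondWith(
      new Response(`<h1>Served from inside this tab: ${url.pathname}</h1>`, {
        headers: { "Content-Type": "text/html" },
      })
    );
  }
  // Any other request falls through to the network as usual.
});
```

WebContainer goes much further, mapping a full Node.js runtime onto browser APIs, but this fetch-interception primitive is what makes "a URL that's programmatically controlled by a web application itself" possible.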
Eric [00:23:47]: Yeah, what ended up happening is we were just hammering, you know, the Chromium bug tracker, and they're like, who are these guys? Yeah. And, and they were amazing because I mean, just making Chrome DevTools be able to debug, I mean, it's, it's not, it wasn't originally built right for debugging an operating system, right? They've been phenomenal working with us and just kind of really pushing the limits, but that it's a rising tide that's kind of lifted all boats because now there's a lot of different types of applications that you can debug with Chrome DevTools that are running a browser that runs more reliably because just the stress testing that, that we and, you know, games that are coming to the web are kind of pushing as well, but.Itamar [00:24:23]: That's awesome. About the testing, I think like most, let's say coding assistants of different kinds will need this loop of testing. And even I would add code review to some, to some extent that you mentioned. How is testing different from code review? Code review could be, for example, PR review, like a code review that is done at the point of when you want to merge branches. But I would say that code review, for example, checks best practices, maintainability, and so on. It's not just like CI, but more than CI. And testing is like more like checking functionality, et cetera. So it's different. We call, by the way, all of these together code integrity, but that's a different story. Just to go back to the, to the testing and specifically. Yeah. It's, it's, it's been since the first slide. Yeah. We're consistent. So if we go back to the testing, I think like, it's not surprising that for us testing is important and for Bolt testing is important, but I want to shed some light on a different perspective of it. Like let's think about autonomous driving. Those startups that are doing autonomous driving for highway and autonomous driving for the city. And I think like we saw the autonomous driving on the highway reaching a level, I don't know, four or so, much faster than those in the city. Now, in both cases, you need testing and quote unquote testing, you know, verification and validation that you're doing the right thing on the road and you're reading it, et cetera. But it's probably like so different in the city that it could be like actually different technology. And I claim that we're seeing something similar here. So when you're building the next Wix, and if I was them, I would be like looking at you and being a bit scared. That's what you're disrupting, what you just said. Then basically, I would say that, for example, the UX/UI is freaking important. And because you're more aiming for the end user. In this case, maybe it's an end user that doesn't know how to develop. For developers, it's also important. But let alone those that do not know how to develop, they need a slick UI/UX. And I think like that's one reason, for example, I think Cursor has like really good technology. I don't know the underlying what's under the hood, but at least what they're saying. But I think also their UX/UI is great. It's a lot because they did their own IDE. While if you're aiming for the city AI, suddenly like there's a lot of testing and code review technology that it's not necessarily like that important. For example, let's talk about integration tests. Probably like a lot of what you're building involved at the moment is isolated applications. Maybe the vision or the end game is maybe like having one solution for everything. It could be that eventually the highway companies will go into the city and the other way around. But at the beginning, there is a difference.
And integration tests are a good example. I guess they're a bit less important. And when you think about enterprise software, they're really important. So to recap, like I think like the idea of looping and verifying your test and verifying your code in different ways, testing or code review, et cetera, seems to be important in the highway AI and the city AI, but in different ways and different like critical for the city, even more and more variety. Actually, I was looking to ask you like what kind of loops you guys are doing. For example, when I'm using Bolt and I'm enjoying it a lot, then I do see like sometimes you're trying to catch the errors and fix them. And also, I noticed that you're breaking down tasks into smaller ones and then et cetera, which has already been a common notion for a year now. But it seems like you're doing it really well. So if you're willing to share anything about it.Eric [00:28:07]: Yeah, yeah. I realized I never actually hit the punchline of what I was saying before. I mentioned the point about us kind of writing an operating system from scratch because what ended up being important about that is that to your point, it's actually a very, like compared to like a, you know, if you're like running Cursor on anyone's machine, you kind of don't know what you're dealing with, with the OS you're running on. There could be an error happens. It could be like a million different things, right? There could be some config. There could be, it could be God knows what, right? The thing with WebContainer is because we wrote the entire thing from scratch. It's actually a unified image basically. And we can instrument it at any level that we think is going to be useful, which is exactly what we did when we started building Bolt is we instrumented stuff at like the process level, at the runtime level, you know, et cetera, et cetera, et cetera. Stuff that would just be not impossible to do on local, but to do that in a way that works across any operating system, whatever is, I mean, would just be insanely, you know, insanely difficult to do right and reliably. And that's what you saw when you've used Bolt is that when an error actually will occur, whether it's in the build process or the actual web application itself is failing or anything kind of in between, you can actually capture those errors. And today it's a very primitive way of how we've implemented it largely because the product just didn't exist 90 days ago. So we're like, we got some work ahead of us and we got to hire some more a little bit, but basically we present and we say, hey, this is, here's kind of the things that went wrong. There's a fix it button and then an ignore button, and then you can just hit fix it. And then we take all that telemetry through our agent, you run it through our agent and say, kind of, here's the state of the application. Here's kind of the errors that we got from Node.js or the browser or whatever, and like dah, dah, dah, dah. And it can take a crack at actually solving it. And it's actually pretty darn good at being able to do that. That's kind of been a, you know, closing the loop and having it be a reliable kind of base has seemed to be a pretty big upgrade over doing stuff locally, just because I think that's a pretty key ingredient of it. And yeah, I think breaking things down into smaller tasks, like that's, that's kind of a key part of our agent. I think like Claude did a really good job with artifacts.
I think, you know, us and kind of everyone else has, has kind of taken their approach of like actually breaking out certain tasks in a certain order into, you know, kind of a concrete way. And, and so actually the core of Bolt, you know, we actually made open source. So you can actually go and check out like the system prompts and et cetera, and you can run it locally and whatever have you. So anyone that's interested in this stuff, I'd highly recommend taking a look at it. There's not a lot of like stuff that's like open source in this realm. It's, that was one of the fun things that we've we thought would be cool to do. And people, people seem to like it. I mean, there's a lot of forks and people adding different models and stuff. So it's been cool to see.Swyx [00:30:41]: Yeah. I'm happy to add, I added real-time voice for my OpenAI Dev Day demo and it was really fun to hack with. So thank you for doing that. Yeah. Thank you. I'm going to steal your code.Eric [00:30:52]: Because I want that.Swyx [00:30:52]: It's funny because I built on top of the fork of Bolt.new that already has the multi-LLM thing. And so you just told me you're going to merge that in. So then you're going to merge two layers of forks down into this thing. So it'll be fun.Eric [00:31:03]: Heck yeah.Alessio [00:31:04]: Just to touch on like the environment, Itamar, you maybe go into the most complicated environments that even the people that work there don't know how to run. How much of an impact does that have on your performance? Like, you know, it's most of the work you're doing actually figuring out environment and like the libraries, because I'm sure they're using outdated versions of languages, they're using outdated libraries, they're using forks that have not been on the public internet before. How much of the work that you're doing is like there versus like at the LLM level?Itamar [00:31:32]: One of the reasons I was asking about, you know, what are the steps to break things down, because it really matters. Like, what's the tech stack? How complicated the software is? It's hard to figure it out when you're dealing with the real world, any environment of enterprise as a city, while maybe sometimes like, I think you do enable like in Bolt, like to install stuff, but it's quite a like controlled environment. And that's a good thing to do, because then you narrow down and it's easier to make things work. So definitely, there are two dimensions, I think, actually spaces. One is the fact just like installing our software without yet like doing anything, making it work, just installing it because we work with enterprise and Fortune 500, etc. Many of them want an on-prem solution.Swyx [00:32:22]: So you have how many deployment options?Itamar [00:32:24]: Basically, we had, we did a matrix, say 96 options, because, you know, there are different dimensions. Like, for example, one dimension, we connect to your code management system, to your Git. So are you having like GitHub, GitLab? Subversion? Is it like on cloud or deployed on prem? Just an example. Which models do you agree to use, their APIs or ours? Like we have our... Is it TestGPT? Yeah, when we started with TestGPT, it was a huge mistake of a name. It was cool back then, but I don't think it's a good idea to name a model after someone else's model. Anyway, that's my opinion.
So we got...Swyx [00:33:02]: I'm interested in these learnings, like things that you changed your mind on.Itamar [00:33:06]: Eventually, when you're building a company, you're building a brand and you want to create your own brand. By the way, when I thought about Bolt.new, I also thought about if it's not a problem, because when I think about Bolt, I do think about like a couple of companies that are already called this way.Swyx [00:33:19]: Curse companies. You could call it Codium just to...Itamar [00:33:24]: Okay, thank you. Touche. Touche.Eric [00:33:27]: Yeah, you got to imagine the board meeting before we launched Bolt, one of our investors, you can imagine they're like, are you sure? Because from the investment side, it's kind of a famous, very notorious Bolt. And they're like, are you sure you want to go with that name? Oh, yeah. Yeah, absolutely.Itamar [00:33:43]: At this point, we have actually four models. There is a model for autocomplete. There's a model for the chat. There is a model dedicated more for code review. And there is a model that is for code embedding. Actually, you might notice that there isn't a good code embedding model out there. Can you name one? Like dedicated for code?Swyx [00:34:04]: There's code indexing, and then you can do sort of like HyDE for code. And then you can embed the descriptions of the code.Itamar [00:34:12]: Yeah, but you do see a lot of types of models that are dedicated for embedding and for different spaces, different fields, etc. And I'm not aware. And I know that if you go to Bedrock, try to find, like, there are a few code embedding models, but none of them are specialized for code.Swyx [00:34:31]: Is there a benchmark that you would tell us to pay attention to?Itamar [00:34:34]: Yeah, so it's coming. Wait for that. Anyway, we have our models. And just to go back to the 96 options of deployment. So I'm closing the brackets for us. So one is like dimensional, like what Git deployment you have, like what models do you agree to use? Another could be like if it's air-gapped completely, or you want VPC, and then you have Azure, GCP, and AWS, which is different. Do you use Kubernetes or do not? Because we want to exploit that. There are companies that do not do that, etc. I guess you know what I mean. So that's one thing. And considering that we are dealing with, you know, Fortune 500 enterprises, we needed to deal with that. So you asked me about how complicated it is to solve that complex code. I said, it's just the deployment part. And then now to the software, we see a lot of different challenges. For example, some companies, they did actually a good job to build a lot of microservices. Let's not get to if it's good or not, but let's first assume that it is a good thing. A lot of microservices, each one of them has their own repo. And now you have tens of thousands of repos. And you as a developer want to develop something. And I remember me coming to a corporate for the first time. I don't know where to look at, like where to find things. So just doing a good indexing for that is like a challenge. And moreover, the regular indexing, the one that you can find, we wrote a few blogs on that. By the way, we also have some open source, different than yours, but actually three and growing. Then it doesn't work. You need to let the tech leads and the companies influence your indexing. For example, mark different repos with different colors. This is a high quality repo. This is a lower quality repo. This is a repo that we want to deprecate.
This is a repo we want to grow, etc. And let that be part of your indexing. And only then things actually work for enterprise and they don't get to a fatigue of, oh, this is awesome. Oh, but I'm starting, it's annoying me. I think Copilot is an amazing tool, but I'm quoting others, meaning GitHub Copilot, that they see not so good retention of GitHub Copilot in enterprise. Ooh, spicy. Yeah. I saw snapshots of people and we have customers that are Copilot users as well. And also I saw research, some of it is public by the way, between 38 to 50% retention for users using Copilot in enterprise. So it's not so good. By the way, I don't think it's that bad, but it's not so good. So I think that's a reason because, yeah, it helps you auto-complete, but then, and especially if you're working on your repo alone, but if it needs that context of remote repos of your code base, that's hard. So to make things work, there's a lot of work on that, like giving the controllability for the tech leads, for the developer platform or developer experience department in the organization to influence how things are working. A short example, because if you have like really old legacy code, probably some of it is not so good anymore. If you just fine-tune on this code base, then there is a bias to repeat those mistakes or old practices, etc. So you need, for example, as I mentioned, to influence that. For example, in Qodo, you can have a markdown of best practices by the tech leads and Qodo will include that and relate to that and will not offer suggestions that are not according to the best practices, just as an example. So that's just a short list of things that you need to do in order to deal with, like you mentioned, the 100.1 to 100.2 version of software. I just want to say what you're doing is extremelyEric [00:38:32]: impressive because it's very difficult. I mean, the business of StackBlitz, kind of before Bolt came online, we sold a version of our IDE that went on-prem. So I understand what you're saying about the difficulty of getting stuff just working on-prem. Holy heck. I mean, that is extremely hard. I guess the question I have for you is, I mean, we were just doing that with kind of Kubernetes-based stuff, but the spread of Fortune 500 companies that you're working with, how are they doing the inference for this? Are you kind of plugging into Azure's OpenAI stuff and AWS's Bedrock, you know, cloud stuff? Or are they just like running stuff on GPUs? Like, what is that? How are these folks approaching that? Because, man, what we saw on the enterprise side, I mean, I got to imagine that that's a huge challenge. Everything you said and more, like,Itamar [00:39:15]: for example, like someone could be, and I don't think any of these is bad. Like, they made their decision. Like, for example, some people, they're, I want only AWS and VPC on AWS, no matter what. And then they, some of them, like there is a subset, I will say, I'm willing to take models only from Bedrock and not ours. And we have a problem because there is no good code embedding model on Bedrock. And that's part of what we're doing now with AWS to solve that. We solve it in a different way. But if you are willing to run on AWS VPC, but run your models on GPUs or Inferentia, like the new version that's coming out, then our models can run on that. But everything you said is right. Like, we see like on-prem deployment where they have their own GPUs. We see Azure where you're using OpenAI Azure.
We see cases where you're running on GCP and they want OpenAI. Like this cross-cloud case, although there is Gemini, and even Sonnet, I think, is available on GCP, just an example. So all the options, that's part of the challenge. I admit that we thought about it, but it was even more complicated. And it took us a few months to actually, that matrix that I mentioned, to start clicking each one of the blocks there. A few months is impressive.
Eric [00:40:35]: I mean, honestly, just that's okay. Every one of these enterprises is, their networking is different. Just everything's different. Every single one is different. I see you understand. Yeah. So that just cannot be understated. That is, that's extremely impressive. Hats off.
Itamar [00:40:50]: It could be, by the way, like, for example, oh, we're only AWS, but our GitHub Enterprise is on-prem. Oh, we forgot. So we need like a private link or whatever, like every time like that. And you do need to think about it if you want to work with an enterprise. And it's important. Like I understand, I respect their point of view.
Swyx [00:41:10]: And this primarily impacts your architecture, your tech choices. Like you have to, you can't choose some vendors because...
Itamar [00:41:15]: Yeah, definitely. To be frank, it makes it hard for a startup because it means that we want everyone to enjoy all the variety of models. By the way, it was hard for us with our technology. I want to open a bracket, like a window. I guess you're familiar with our AlphaCodium, which is open source.
Eric [00:41:33]: We got to go over that. Yeah. So I'll do that quickly.
Itamar [00:41:36]: Yeah. A pin in that. Yeah. Actually, we didn't have it in the last episode. So, so, okay.
Swyx [00:41:41]: Okay. We'll come back to that later, but let's talk about...
Itamar [00:41:43]: Yeah. So just like shortly, and then we can double click on AlphaCodium. But AlphaCodium is an open source tool. You can go and try it and it lets you compete on CodeForces. This is a website and a competition, and actually reach a master level, like the 95th percentile, with a click of a button. You don't need to do anything. And part of what we did there is taking a problem and breaking it into different, like smaller blocks. And then the models are doing a much better job. Like we all know it by now that taking small tasks and solving them works. By the way, even O1, which is supposed to be able to do system two thinking, like Greg from OpenAI hinted, is doing better on these kinds of problems. But still, it's very useful to break it down for O1, despite O1 being able to think by itself. And that's what we presented just a month ago. OpenAI released that now they are doing 93rd percentile with O1 on the IOI, the International Olympiad in Informatics. Sorry, I forgot. Exactly. I told you I forgot. And we took their O1 preview with AlphaCodium and did better. Like it just shows, and there is a big difference between the preview and the IOI version, it shows that these models are still not system two thinkers, and there is a big difference. So maybe they're not complete system two. Yeah, they need some guidance. I call them system 1.5. We can have it. I thought about it. Like, you know, I care about this philosophy stuff. And I think like we didn't see it even close to system two thinking. I can elaborate later. But closing the brackets, we take AlphaCodium as our principle of thinking: we take tasks and break them down to smaller tasks.
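To make the flow-engineering idea concrete, here is a minimal hypothetical sketch of an AlphaCodium-style pipeline: one big problem becomes a chain of small, self-contained tasks, so each individual prompt stays simple and largely model-agnostic. The `call_model` stub and the specific stages are illustrative assumptions, not AlphaCodium's actual prompts.

```python
# Hypothetical sketch of flow engineering; call_model() is a stub for any
# provider (OpenAI, Anthropic, a self-hosted model), and the stages below
# are illustrative, not AlphaCodium's actual prompts.
def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in any LLM provider here")

def solve(problem: str) -> str:
    # Stage 1: reflect on the problem instead of solving it in one shot.
    spec = call_model(f"Restate this problem and list its edge cases:\n{problem}")
    # Stage 2: generate tests before any solution code exists.
    tests = call_model(f"Write input/output test cases for:\n{spec}")
    # Stage 3: draft a solution against the reflection, not the raw problem.
    code = call_model(f"Write a program that satisfies:\n{spec}")
    # Stage 4: iterate, repairing the code against the generated tests.
    for _ in range(3):
        report = call_model(
            f"Check the code against the tests and report PASS or the "
            f"failures:\nCODE:\n{code}\nTESTS:\n{tests}")
        if report.strip().startswith("PASS"):
            break
        code = call_model(f"Fix the code given these failures:\n{code}\n{report}")
    return code
```

Because every stage is small, swapping O1 for Sonnet or a self-hosted model mostly means swapping `call_model`, which is the point Itamar makes next about the prompt mattering less.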
And then we want to exploit the best model to solve them. So I want to enable anyone to enjoy O1 and Sonnet and Gemini 1.5, etc. But at the same time, I need to develop my own models as well, because some of the Fortune 500 want to have it all air-gapped or whatever. So that's a challenge. Now you need to support so many models. And to some extent, I would say that the flow engineering, the breaking down into different blocks, is a necessity for us. Why? Because when you take a big block, a big problem, you need a very different prompt for each one of the models to actually work. But when you take a big problem and break it into small tasks, we can talk about how we do that, then the prompt matters less. What I want to say is, all of this, like as a startup trying to do different deployments, getting all the juice that you can get from models, etc., is a big problem. And one needs to think about it. And one of our mitigations is that process of taking tasks and breaking them down. That's why I'm really interested to know how you guys are doing it. And part of what we do is also open source. So you can see.
Swyx [00:44:39]: There's a lot in there. But yeah, flow over prompt. I do believe that that does make sense. I feel like there's a lot that both of you can sort of exchange notes on breaking down problems. And I just want you guys to just go for it. This is fun to watch.
Eric [00:44:55]: Yeah. I mean, what's super interesting is the context you're working in, because for us too with Bolt, we've started thinking, because our kind of existing business line was going behind the firewall, right? We were like, how do we do this? Adding the inference aspect on, we're like, okay, how does... Because I mean, there's not a lot of prior art, right? I mean, this is all new. This is all new. So I definitely am going to have a lot of questions for you.
Itamar [00:45:17]: I'm here. We're very open, by the way. We have a paper, a blog, or like whatever.
Swyx [00:45:22]: The AlphaCodium GitHub, and we'll put all this in the show notes.
Itamar [00:45:25]: Yeah. And even the new results with O1, we published them.
Eric [00:45:29]: I love that. And I also just, I think spiritually, I like your approach of being transparent. Because I think there's a lot of hype-ium around AI stuff. And a lot of it is, it's just like, you have these companies that just kind of keep their stuff closed source and then just max hype it, but then it's kind of nothing. And I think it kind of gives a bad rep to the incredible stuff that's actually happening here. And so I think it's stuff like what you're doing where, I mean, true merit and you're cracking open actual code for others to learn from and use. That strikes me as the right approach. And it's great to hear that you're making such incredible progress.
Itamar [00:46:02]: I have something to share about the open source. Most of our tools, we have an open source version and then a premium pro version. But it's not an easy decision to do that. I actually wanted to ask you about your strategy, but I think in your case, there is, in my opinion, relatively a good strategy where a lot of the parts are open source, but then you have the deployment and the environment, which is not, if I get it correctly. And then there's a clear, almost Hugging Face model. Yeah, you can do that, but why should you try to deploy it yourself, deploy it with us? But in our case, I'm not sure you're not going to hit also some competitors, and I guess you are.
I wanted to ask you, for example, about some of them. In our case, one day we looked at one of our competitors that is doing code review. We're a platform. We have the code review, the testing, et cetera, spread over the IDE to Git. And in each agent, we have a few startups or big incumbents that are doing only that. So we noticed one of our competitors having not only a very similar UI to our open source, but actually even our typo. And you sit there and you're kind of like, yeah, we're not that good. We don't use enough Grammarly or whatever. And we had a couple of these and we saw it there. And then it's a challenge. And I want to ask you, Bolt is doing so well, and then you open source it. So I think I know what my answer was. I gave it before, but still interesting
Eric [00:47:29]: to hear what you think. GeoHot said back, I don't know what he was up to at this exact moment, but I think on Comma AI, all that stuff's open source. And someone had asked him, why is this open source? And he's like, if you're not actually confident that you can go and crush it and build the best thing, then yeah, you should probably keep your stuff closed source. He said something akin to that. I'm probably kind of butchering it, but I thought it was kind of a really good point. And that's not to say that you should just open source everything, because for obvious reasons, there's kind of strategic things you have to keep in mind. But I actually think a pretty liberal approach, as liberal as you kind of can be, can really make a lot of sense. Because that is so validating, that one of your competitors is taking your stuff and they're like, yeah, let's just kind of tweak the styles. I mean, clearly, right? I think it's kind of healthy because, I'm sure back at HQ that day when you saw that, you're like, oh, all right, well, we have to grind even harder to make sure we stay ahead. And so I think it's actually a very useful, motivating thing for the teams. Because you might feel this period of comfort. I think a lot of companies will have this period of comfort where they're not feeling the competition and one day they get disrupted. So kind of putting stuff out there and letting people push it forces you to face reality sooner, right? And actually feel that incrementally so you can kind of adjust course. And for us, the open source version of Bolt has had a lot of features people have been begging us for, like persisting chat messages and checkpoints and stuff. Within the first week, that stuff was landed in the open source versions. And they're like, why can't you ship this? It's in the open, so people have forked it. And we're like, we're trying to keep our servers and GPUs online. But it's been great because the folks in the community did a great job, kept us on our toes. And we've gotten to know most of these folks too at this point that have been building these things. And so it actually was very instructive. Like, okay, well, if we're going to go kind of land this, there's some UX patterns we can kind of look at, and the code is open source for this stuff. What's great about these, what's not. So anyways, net-net, I think it's awesome. I think from a competitive point of view for us, I think in particular, what's interesting is the core technology of WebContainer going. And I think that right now, there's really nothing that's kind of on par with that. And we also, we have a business of, because WebContainer runs in your browser, but to make it work, you have to install stuff from NPM.
You have to make CORS bypass requests, like connecting to databases, which all require server-side proxying or acceleration. And so we actually sell WebContainer as a service. One of the core reasons we open-sourced kind of the core components of Bolt when we launched was that we think there's going to be a lot more of these in-your-browser AI codegen experiences, kind of like what Anthropic did with Artifacts in Claude. By the way, Artifacts uses WebContainers. Not yet. No, yeah. Should I strike that? I think that they've got their own thing at the moment, but there's been a lot of interest in WebContainers from folks doing things in that sort of realm and in the AI labs and startups and everything in between. So I think there'll be, I imagine, over the coming months, there'll be lots of things being announced as folks kind of adopt it. But yeah, I think effectively...
Swyx [00:50:35]: Okay, I'll say this. If you're a large model lab and you want to build sandbox environments inside of your chat app, you should call Eric.
Itamar [00:50:43]: But wait, wait, wait, wait, wait, wait. I have a question about that. I think OpenAI, they felt that people are not using their model as they would want to. So they built ChatGPT. But I would say that ChatGPT now defines OpenAI. I know they're doing a lot of business from their APIs, but still, is this how you think? Isn't Bolt.new your business now? Why don't you focus on that instead of the...
Swyx [00:51:16]: What's your advice as a founder?
Eric [00:51:18]: You're right. And so going into it, we, candidly, we were like, Bolt.new, this thing is super cool. We think people are stoked. We think people will be stoked. But we were like, maybe that's a lot. Best case scenario, after month one, we'd be mind-blown if we added a couple hundred K of ARR or something. And we were like, but we think there's probably going to be an immediate huge business. Because there was some early pull from folks wanting to put WebContainer into their product offerings, kind of similar to what Bolt is doing or whatever. We were actually prepared for the inverse outcome here. But I mean, well, I guess we've seen pull on both. But I mean, what's happened with Bolt, and you're right, it's actually the same strategy as like OpenAI or Anthropic, where what ChatGPT is to OpenAI's APIs, Bolt is to WebContainer. And so we've kind of taken that same approach. And we're seeing, I guess, some of the similar results, except right now, the revenue side is extremely lopsided to Bolt.
Itamar [00:52:16]: I think if you ask me what's my advice, I think you have three options. One is to focus on Bolt. The other is to focus on the WebContainer. The third is to raise one billion dollars and do them both. I'm serious. I think otherwise, you need to choose. And if you raise enough money, and I think it's big bucks, because you're going to be chased by competitors. And I think it will be challenging to do both. And maybe you can. I don't know. We do see these numbers right now, raising above $100 million, even without having a product. You can see these.
Eric [00:52:49]: It's excellent advice. And I think what's been amazing, but also kind of challenging, is we're trying to forecast, okay, well, where are these things going? I mean, in the initial weeks, I think us and all the investors in the company that we're sharing this with, it was like, this is cool. Okay, we added 500k. Wow, that's crazy. Wow, we're at a million now.
Most things, you have this kind of the TechCrunch of initiation and then the trough of sorrow. And if there's going to be a downtrend, it's just not coming yet. Now that we're kind of looking ahead, we're six weeks in. So now we're getting enough confidence in our convictions to go, okay, this se

Generative Now | AI Builders on Creating the Future
PART 1: Matthew Hartman | How Factorial Invests in the Future

Generative Now | AI Builders on Creating the Future

Play Episode Listen Later Nov 14, 2024 42:57


This week, we are sharing PART ONE with Matthew Hartman, a managing partner at the venture firm Factorial Capital. Factorial partners with angel operators, with a focus on startups in the AI space. Host and Lightspeed Partner Michael Mignano sits down with Matt to discuss how Factorial partners with technical founders to explore opportunities. They discuss OpenAI, Apple Intelligence, WebGPU, Suno, and much more. Join us next week for PART TWO of this conversation.

Episode Chapters
(00:00) Introduction
(01:42) Factorial and Technical Founders
(06:24) WebGPU
(10:47) Future of AI and LLMs
(15:47) AI and Calendars
(18:16) Understanding Scout Funds
(23:07) The Role of Angels in Investment
(24:12) Focus on AI and New Tech Products
(25:42) Transformers and Predictive Models
(27:26) Making Music with Suno
(33:27) The Future of Creative Tools
(42:30) The Economics of Content Creation

Stay in touch:
www.lsvp.com
X: https://twitter.com/lightspeedvp
LinkedIn: https://www.linkedin.com/company/lightspeed-venture-partners/
Instagram: https://www.instagram.com/lightspeedventurepartners/
Subscribe on your favorite podcast app: generativenow.co
Email: generativenow@lsvp.com

The content here does not constitute tax, legal, business or investment advice or an offer to provide such advice, should not be construed as advocating the purchase or sale of any security or investment or a recommendation of any company, and is not an offer, or solicitation of an offer, for the purchase or sale of any security or investment product. For more details please see lsvp.com/legal.

The top AI news from the past week, every ThursdAI

Hey all, Alex here, coming to you from the (surprisingly) sunny Seattle, with just a mind-boggling week of releases. Really, just on Tuesday there was so much news already! I had to post a recap thread, something I do usually after I finish ThursdAI! From Anthropic reclaiming close-second sometimes-first AI lab position + giving Claude the wheel in the form of computer use powers, to more than 3 AI video generation updates with open source ones, to Apple updating Apple Intelligence beta, it's honestly been very hard to keep up, and again, this is literally part of my job! But once again I'm glad that we were able to cover this in ~2hrs, including multiple interviews with returning co-hosts ( Simon Willison came back, Killian came back) so definitely if you're only a reader at this point, listen to the show! Ok as always (recently) the TL;DR and show notes at the bottom (I'm trying to get you to scroll through ha, is it working?) so grab a bucket of popcorn, let's dive in

The React Native Show Podcast
How to Get Started With 3D in React | React Universe On Air: Coffee Talk #21

The React Native Show Podcast

Play Episode Listen Later Sep 24, 2024 65:33


Why settle for flat graphics when you can see the React Universe in 3D? In this Coffee Talk, Jakub invited three very special guests to dig deep into the topic of 3D rendering in React and React Native:

Hacker News Recap
September 5th, 2024 | UE5 Nanite in WebGPU

Hacker News Recap

Play Episode Listen Later Sep 7, 2024 13:11


This is a recap of the top 10 posts on Hacker News on September 5th, 2024. This podcast was generated by wondercraft.ai.
(00:38) UE5 Nanite in WebGPU. Original post: https://news.ycombinator.com/item?id=41458987&utm_source=wondercraft_ai
(01:56) AlphaProteo generates novel proteins for biology and health research. Original post: https://news.ycombinator.com/item?id=41457331&utm_source=wondercraft_ai
(03:09) Porting systemd to musl Libc-powered Linux. Original post: https://news.ycombinator.com/item?id=41454779&utm_source=wondercraft_ai
(04:22) Clojure 1.12.0 is now available. Original post: https://news.ycombinator.com/item?id=41460037&utm_source=wondercraft_ai
(05:28) Building a WoW (World of Warcraft) Server in Elixir. Original post: https://news.ycombinator.com/item?id=41454741&utm_source=wondercraft_ai
(06:46) Common food dye found to make skin and muscle temporarily transparent. Original post: https://news.ycombinator.com/item?id=41459865&utm_source=wondercraft_ai
(07:53) The Early Days of Valve from a Woman Inside. Original post: https://news.ycombinator.com/item?id=41460276&utm_source=wondercraft_ai
(08:59) Desed: Demystify and debug your sed scripts. Original post: https://news.ycombinator.com/item?id=41453557&utm_source=wondercraft_ai
(10:11) Deploying Rust in Existing Firmware Codebases. Original post: https://news.ycombinator.com/item?id=41458508&utm_source=wondercraft_ai
(11:26) serverless-registry: A Docker registry backed by Workers and R2. Original post: https://news.ycombinator.com/item?id=41458240&utm_source=wondercraft_ai
This is a third-party project, independent from HN and YC. Text and audio generated using AI, by wondercraft.ai. Create your own studio quality podcast with text as the only input in seconds at app.wondercraft.ai. Issues or feedback? We'd love to hear from you: team@wondercraft.ai

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Aug 16, 2024 58:56


Disclaimer: We recorded this episode ~1.5 months ago, timed for the FastHTML release. It then got bottlenecked by the Llama 3.1, Winds of AI Winter, and SAM2 episodes, so we're a little late. Since then FastHTML was released, swyx is building an app in it for AINews, and Anthropic has also released their prompt caching API. Remember when Dylan Patel of SemiAnalysis coined the GPU Rich vs GPU Poor war? (if not, see our pod with him). The idea was that if you're GPU poor you shouldn't waste your time trying to solve GPU rich problems (i.e. pre-training large models) and are better off working on fine-tuning, optimized inference, etc. Jeremy Howard (see our “End of Finetuning” episode to catch up on his background) and Eric Ries founded Answer.AI to do exactly that: “Practical AI R&D”, which is very in line with the GPU poor needs. For example, one of their first releases was a system based on FSDP + QLoRA that let anyone train a 70B model on two NVIDIA 4090s. Since then, they have come out with a long list of super useful projects (in no particular order, and non-exhaustive):
* FSDP QDoRA: this is just as memory efficient and scalable as FSDP/QLoRA, and critically is also as accurate for continued pre-training as full weight training.
* Cold Compress: a KV cache compression toolkit that lets you scale sequence length without impacting speed.
* colbert-small: a state-of-the-art retriever at only 33M params.
* JaColBERTv2.5: a new state-of-the-art retriever on all Japanese benchmarks.
* gpu.cpp: portable GPU compute for C++ with WebGPU.
* Claudette: a better Anthropic API SDK.
They also recently released FastHTML, a new way to create modern interactive web apps. Jeremy recently released a 1-hour “Getting started” tutorial on YouTube; while this isn't AI related per se, it's close to home for any AI Engineer who is looking to iterate quickly on new products. In this episode we broke down 1) how they recruit, 2) how they organize what to research, and 3) how the community comes together. At the end, Jeremy gave us a sneak peek at something new that he's working on that he calls dialogue engineering: So I've created a new approach. It's not called prompt engineering. I'm creating a system for doing dialogue engineering. It's currently called AI magic.
I'm doing most of my work in this system and it's making me much more productive than I was before I used it. He explains it a bit more ~44:53 into the pod, but we'll just have to wait for the public release to figure out exactly what he means.

Timestamps
* [00:00:00] Intro by Suno AI
* [00:03:02] Continuous Pre-Training is Here
* [00:06:07] Schedule-Free Optimizers and Learning Rate Schedules
* [00:07:08] Governance and Structural Issues within OpenAI and Other AI Labs
* [00:13:01] How Answer.ai works
* [00:23:40] How to Recruit Productive Researchers
* [00:27:45] Building a new BERT
* [00:31:57] FSDP, QLoRA, and QDoRA: Innovations in Fine-Tuning Large Models
* [00:36:36] Research and Development on Model Inference Optimization
* [00:39:49] FastHTML for Web Application Development
* [00:46:53] AI Magic & Dialogue Engineering
* [00:52:19] AI wishlist & predictions

Show Notes
* Jeremy Howard
* Previously on Latent Space: The End of Finetuning, NeurIPS Startups
* Answer.ai
* Fast.ai
* FastHTML
* answerai-colbert-small-v1
* gpu.cpp
* Eric Ries
* Aaron DeFazio
* Yi Tay
* Less Wright
* Benjamin Warner
* Benjamin Clavié
* Jono Whitaker
* Austin Huang
* Eric Gilliam
* Tim Dettmers
* Colin Raffel
* Sebastian Raschka
* Carson Gross
* Simon Willison
* Sepp Hochreiter
* Llama3.1 episode
* Snowflake Arctic
* Ranger Optimizer
* Gemma.cpp
* HTMX
* UL2
* BERT
* DeBERTa
* Efficient finetuning of Llama 3 with FSDP QDoRA
* xLSTM

Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:14]: And today we're back with Jeremy Howard, I think your third appearance on Latent Space. Welcome.
Jeremy [00:00:19]: Wait, third? Second?
Swyx [00:00:21]: Well, I grabbed you at NeurIPS.
Jeremy [00:00:23]: I see.
Swyx [00:00:24]: Very fun, standing outside street episode.
Jeremy [00:00:27]: I never heard that, by the way. You've got to send me a link. I've got to hear what it sounded like.
Swyx [00:00:30]: Yeah. Yeah, it's a NeurIPS podcast.
Alessio [00:00:32]: I think the two episodes are six hours, so there's plenty to listen to, we'll make sure to send it over.
Swyx [00:00:37]: Yeah, we're trying this thing where at the major ML conferences, we, you know, do a little audio tour, give people a sense of what it's like. But the last time you were on, you declared the end of fine tuning. I'll admit I sort of editorialized the title a little bit, and I know you were slightly uncomfortable with it, but you just owned it anyway. I think you're very good at the hot takes. And we were just discussing in our pre-show that it's really happening, that the continued pre-training is really happening.
Jeremy [00:01:02]: Yeah, absolutely. I think people are starting to understand that treating the three ULMFiT steps of, like, pre-training, you know, and then the kind of like what people now call instruction tuning, and then, I don't know if we've got a general term for this, the DPO/RLHF step, you know, or the task training, they're not actually as separate as we originally suggested they were in our paper, and when you treat it more as a continuum, and that you make sure that you have, you know, more of kind of the original data set incorporated into the later stages, and that, you know, we've also seen with Llama 3, this idea that those later stages can be done for a lot longer. These are all of the things I was kind of trying to describe there.
It wasn't the end of fine tuning, but more that we should treat it as a continuum, and we should have much higher expectations of how much you can do with an already trained model. You can really add a lot of behavior to it, you can change its behavior, you can do a lot. So a lot of our research has been around trying to figure out how to modify the model by a larger amount rather than starting from random weights, because I get very offended at the idea of starting from random weights.
Swyx [00:02:14]: Yeah, I saw that at ICLR in Vienna, there was an outstanding paper about starting transformers from data-driven priors. I don't know if you saw that one, they called it sort of never trained from scratch, and I think it was kind of rebelling against the sort of random initialization.
Jeremy [00:02:28]: Yeah, you know, that's been our kind of continuous message since we started Fast.ai: if you're training from random weights, you better have a really good reason, you know, because it seems so unlikely to me that nobody has ever trained on data that has any similarity whatsoever to the general class of data you're working with, and that's the only situation in which I think starting from random weights makes sense.
Swyx [00:02:51]: The other trend since our last pod that I would point people to is I'm seeing a rise in multi-phase pre-training. So Snowflake released a large model called Snowflake Arctic, where they detailed three phases of training with a different mixture each time: there was like 75% web in the first instance, and then they reduced the percentage of the web text by 10% each time and increased the amount of code in each phase. And I feel like multi-phase is being called out in papers more. I feel like it's always been a thing, like changing data mix is not something new, but calling it a distinct phase is new, and I wonder if there's something that you're seeing on your end.
Jeremy [00:03:32]: Well, so they're getting there, right? So the point at which they're doing proper continued pre-training is the point at which that becomes a continuum rather than a phase. So the only difference with what I was describing last time is to say like, oh, there's a function or whatever, which is happening every batch. It's not a huge difference. You know, I always used to get offended when people had learning rates that, like, jumped. And so one of the things I started doing early on in Fast.ai was to say to people like, no, your learning rate schedule should actually be a function, not a list of numbers. So now I'm trying to give the same idea about training mix.
Swyx [00:04:07]: There's been pretty public work from Meta on schedule-free optimizers. I don't know if you've been following Aaron DeFazio and what he's doing, just because you mentioned learning rate schedules, you know, what if you didn't have a schedule?
Jeremy [00:04:18]: I don't care very much, honestly. I don't think that schedule-free optimizer is that exciting. It's fine. We've had non-scheduled optimizers for ages, like Less Wright, who's now at Meta, who was part of the Fast.ai community there, created something called the Ranger optimizer. I actually like having more hyperparameters. You know, as soon as you say schedule-free, then like, well, now I don't get to choose. And there isn't really a mathematically correct way of, like, I actually try to schedule more parameters rather than less. So like, I like scheduling my epsilon in my Adam, for example. I schedule all the things.
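A small sketch of the "schedules are functions" idea, with invented numbers: both the learning rate and the data mix are evaluated as smooth functions of training progress, rather than stored as a list of values or a set of discrete phases.

```python
# Sketch only: both knobs are functions of progress t in [0, 1], evaluated
# every batch. The warmup fraction, peak LR, and mix ratios are made up.
import math

def lr(t: float, peak: float = 3e-4) -> float:
    # linear warmup for the first 5% of training, cosine decay afterwards
    if t < 0.05:
        return peak * t / 0.05
    return peak * 0.5 * (1 + math.cos(math.pi * (t - 0.05) / 0.95))

def data_mix(t: float) -> dict:
    # web text fades from 75% to 45% as code ramps up; always sums to 1.0
    web = 0.75 - 0.30 * t
    return {"web": web, "code": 0.95 - web, "math": 0.05}
```

The Snowflake Arctic recipe Swyx describes is this same curve sampled at three points; treating it as a function per batch is what turns the phases into a continuum.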
But then the other thing we always did with the Fast.ai library was make it so you don't have to set any schedules. So Fast.ai always supported, like, you didn't even have to pass a learning rate. Like, it would always just try to have good defaults and do the right thing. But to me, I like to have more parameters I can play with if I want to, but you don't have to.
Alessio [00:05:08]: And then on the less technical side, I guess, your issue with the market was some of the large research labs taking all this innovation kind of behind closed doors and whether or not that's good, which it isn't. And now we could maybe make it more available to people. And then a month after we released the episode, there was the whole Sam Altman drama and like all the OpenAI governance issues. And maybe people started to think more, okay, what happens if some of these kind of labs, you know, start to break from within, so to speak? And the alignment of the humans is probably going to fail before the alignment of the models. So I'm curious, like, if you have any new thoughts, and maybe we can also tie in some of the way that we've been building Answer.AI as like a public benefit corp and some of those aspects.
Jeremy [00:05:51]: Sure. So, yeah, I mean, it was kind of uncomfortable because two days before Altman got fired, I did a small public video interview in which I said, I'm quite sure that OpenAI's current governance structure can't continue and that it was definitely going to fall apart. And then it fell apart two days later and a bunch of people were like, what did you know, Jeremy?
Alessio [00:06:13]: What did Jeremy see?
Jeremy [00:06:15]: I didn't see anything. It's just obviously true. Yeah. So my friend Eric Ries and I spoke a lot before that about, you know, Eric is, I think probably most people would agree, the top expert in the world on startup and AI governance. And you know, we could both clearly see that it didn't make sense to have like a so-called non-profit where then there are people working at a company, a commercial company that's owned by or controlled nominally by the non-profit, where the people in the company are being given the equivalent of stock options, like everybody there was working there expecting to make money largely from their equity. So the idea that then a board could exercise control by saying like, oh, we're worried about safety issues and so we're going to do something that decreases the profit of the company, when every stakeholder in the company, their remuneration pretty much is tied to that profit, it obviously couldn't work. So I mean, that was a huge oversight there by someone. I guess part of the problem is that the kind of people who work at non-profits, and in this case the board, you know, are kind of academics and, you know, people who are kind of true believers. I think it's hard for them to realize that 99.999% of the world is driven very heavily by money, especially huge amounts of money. So yeah, Eric and I had been talking for a long time before that about what could be done differently, because also companies are sociopathic by design and so the alignment problem as it relates to companies has not been solved. Like, companies become huge, they devour their founders, they devour their communities and they do things where even the CEOs, you know, often of big companies tell me like, I wish our company didn't do that thing.
You know, I know that if I didn't do it, then I would just get fired and the board would put in somebody else, and the board knows if they don't do it, then their shareholders can sue them because they're not maximizing profitability or whatever. So what Eric's spent a lot of time doing is trying to think about how do we make companies less sociopathic, you know, or more, you know, maybe a better way to think of it is like, how do we make it so that the founders of companies can ensure that their companies continue to actually do the things they want them to do? You know, when we started a company, hey, we very explicitly decided we got to start a company, not an academic lab, not a nonprofit, you know, we created a Delaware C corp, you know, the most company kind of company. But when we did so, we told everybody, you know, including our first investors, which was you, Alessio. They sound great. We are going to run this company on the basis of maximizing long-term value. And in fact, when we did our second round, which was an angel round, we had everybody invest through a long-term SPV, which we set up where everybody had to agree to vote in line with long-term value principles. So it's never enough just to say to people, okay, we're trying to create long-term value here for society as well as for ourselves, and everybody's like, oh, yeah, yeah, I totally agree with that. But when it comes to like, okay, well, here's a specific decision we have to make, which will not maximize short-term value, people suddenly change their mind. So you know, it has to be written into the legal documents of everybody so that there's no question that that's the way the company has to be managed. So then you mentioned the PBC aspect, Public Benefit Corporation, which I never quite understood previously. And it turns out it's incredibly simple, like it took, you know, like one paragraph added to our corporate documents to become a PBC. It was cheap, it was easy, but it's got this huge benefit, which is if you're not a public benefit corporation, then somebody can come along and offer to buy you with a stated description of like turning your company into the thing you most hate, right? And if they offer you more than the market value of your company and you don't accept it, then you are not necessarily meeting your fiduciary responsibilities. So the way like Eric always described it to me is like, if Philip Morris came along and said that you've got great technology for marketing cigarettes to children, so we're going to pivot your company to do that entirely, and we're going to pay you 50% more than the market value, you're going to have to say yes. If you have a PBC, then you are more than welcome to say no, if that offer is not in line with your stated public benefit. So our stated public benefit is to maximize the benefit to society through using AI. So given that more children smoking doesn't do that, then we can say like, no, we're not selling to you.
Alessio [00:11:01]: I was looking back at some of our emails. You sent me an email on November 13th about talking, and then on the 14th, I sent you an email, "working together to free AI" was the subject line. And then that was kind of the start of the seed round. And then two days later, someone got fired. So you know, you were very ahead of the curve, so to speak.
You know, people can read your awesome introduction blog on Answer.AI and the idea of having an R&D lab versus an R lab and then a D lab somewhere else. I think to me, the most interesting thing has been hiring and some of the awesome people that you've been bringing on that maybe don't fit the central casting of Silicon Valley, so to speak. Like sometimes it's like playing baseball cards, you know, people are like, oh, what teams was this person on, where did they work, versus focusing on ability. So I would love for you to give a shout out to some of the awesome folks that you have on the team.
Jeremy [00:11:58]: So, you know, there's like a graphic going around describing like the people at xAI, you know, Elon Musk's thing. And like they are all connected to like multiple of Stanford, Meta, DeepMind, OpenAI, Berkeley, Oxford. Look, these are all great institutions and they have good people. And I'm definitely not at all against that, but damn, there's so many other people. And one of the things I found really interesting is almost any time I see something which I think like this is really high quality work, and it's something I don't think would have been built if that person hadn't built the thing right now, I nearly always reach out to them and ask to chat. And I tend to dig in to find out like, okay, you know, why did you do that thing? Everybody else has done this other thing, your thing's much better, but it's not what other people are working on. And like 80% of the time, I find out the person has a really unusual background. So like often they'll have like, either they like came from poverty and didn't get an opportunity to go to a good school, or had dyslexia and, you know, got kicked out of school in year 11, or they had a health issue that meant they couldn't go to university, or something happened in their past and they ended up out of the mainstream. And then they kind of succeeded anyway. Those are the people that throughout my career, I've tended to kind of accidentally hire more of, but it's not exactly accidentally. It's like when I see two people who have done extremely well, one of them did extremely well in exactly the normal way from the background entirely pointing in that direction, and they achieved all the hurdles to get there. And like, okay, that's quite impressive, you know. But another person who did just as well, despite lots of constraints, and doing things in really unusual ways, and came up with different approaches. That's normally the person I'm likely to find useful to work with, because they're often like risk-takers, they're often creative, they're often extremely tenacious, they're often very open-minded. So that's the kind of folks I tend to find myself hiring. So now at Answer.ai, it's a group of people that are strong enough that nearly every one of them has independently come to me in the past few weeks and told me that they have imposter syndrome and they're not convinced that they're good enough to be here. And I kind of heard it at the point where I was like, okay, I don't think it's possible that all of you are so far behind your peers that you shouldn't get to be here. But I think part of the problem is, as an R&D lab, the great developers look at the great researchers and they're like, wow, these big-brained, crazy research people with all their math and s**t, they're too cool for me, oh my God.
And then the researchers look at the developers and they're like, oh, they're killing it, making all this stuff with all these people using it and talking on Twitter about how great it is. I think they're both a bit intimidated by each other, you know. And so I have to kind of remind them like, okay, there are lots of things in this world where you suck compared to lots of other people in this company, but also vice versa, you know, for all things. And the reason you came here is because you wanted to learn about those other things from those other people and have an opportunity to like bring them all together into a single unit. You know, it's not reasonable to expect you're going to be better at everything than everybody else. I guess the other part of it is, for nearly all of the people in the company, to be honest, they have nearly always been better than everybody else at nearly everything they're doing nearly everywhere they've been. So it's kind of weird to be in this situation now where it's like, gee, I can clearly see that I suck at this thing that I'm meant to be able to do compared to these other people, where I'm like the worst in the company at this thing for some things. So I think that's a healthy place to be, you know, as long as you keep reminding each other that that's actually why we're here. And like, it's all a bit of an experiment, like we don't have any managers. We don't have any hierarchy from that point of view. So for example, I'm not a manager, which means I don't get to tell people what to do or how to do it or when to do it. Yeah, it's been a bit of an experiment to see how that would work out. And it's been great. So for instance, Ben Clavier, who you might have come across, he's the author of RAGatouille, he's the author of rerankers, super strong information retrieval guy. And a few weeks ago, you know, this additional channel appeared on Discord, on our private Discord, called BERT24. As in our collab sections, we have a collab section for like collaborating with outsiders. And these people started appearing in BERT24, all these names that I recognize, and they're all talking about like the next generation of BERT. And I start following along, and it's like, okay, Ben decided, I think quite rightly, that we need a new BERT. Because so many people are still using BERT, and it's still the best at so many things, but it actually doesn't take advantage of lots of best practices. And so he just went out and found basically everybody who's created better BERTs in the last four or five years, brought them all together, and suddenly there's this huge collaboration going on. So yeah, I didn't tell him to do that. He didn't ask my permission to do that. And then, like, Benjamin Warner dived in, and he's like, oh, I created a whole transformers from scratch implementation designed to be maximally hackable. He originally did it largely as a teaching exercise to show other people, but he was like, I could, you know, use that to create a really hackable BERT implementation. In fact, he didn't say that. He said, I just did do that, you know, and I created a repo, and then everybody starts using it. They're like, oh my god, this is amazing. I can now implement all these other BERT things. And it's not just Answer.AI guys there, you know, there's lots of folks, you know, who have like contributed new data set mixes and blah, blah, blah. So, I mean, I can help in the same way that other people can help.
So like, then Ben Clavier reached out to me at one point and said, can you help me, like, what have you learned over time about how to manage intimidatingly capable and large groups of people who you're nominally meant to be leading? And so, you know, I like to try to help, but I don't direct. Another great example was Kerem, who, after our FSDP QLoRA work, decided quite correctly that it didn't really make sense to use LoRA in today's world. You want to use the normalized version, which is called DoRA. Like two or three weeks after we did FSDP QLoRA, he just popped up and said, okay, I've just converted the whole thing to DoRA, and I've also created these vLLM extensions, and I've got all these benchmarks, and, you know, now I've got training of quantized models with adapters that are as fast as LoRA, and actually, weirdly, better than fine tuning. Just like, okay, that's great, you know. And yeah, so the things we've done to try to help make these things happen as well is we don't have any required meetings, you know, but we do have a meeting for each pair of major time zones that everybody's invited to, and, you know, people see their colleagues doing stuff that looks really cool and say, like, oh, how can I help, you know, or how can I learn or whatever. So another example is Austin, who, you know, amazing background. He ran AI at Fidelity, he ran AI at Pfizer, he ran browsing and retrieval for Google's DeepMind stuff, created Gemma.cpp, and he's been working on a new system to make it easier to do WebGPU programming, because, again, he quite correctly identified, yeah, so I said to him, like, okay, I want to learn about that. Not an area that I have much expertise in, so, you know, he's going to show me what he's working on and teach me a bit about it, and hopefully I can help contribute. I think one of the key things that's happened in all of these is everybody understands what Eric Gilliam, who wrote the second blog post in our series, the R&D historian, describes as a large yard with narrow fences. Everybody has total flexibility to do what they want. We all understand kind of roughly why we're here, you know, we agree with the premises around, like, everything's too expensive, everything's too complicated, people are building too many vanity foundation models rather than taking better advantage of fine-tuning, like, there's this kind of general, like, sense of we're all on the same wavelength about, you know, all the ways in which current research is fucked up, and, you know, all the ways in which we're worried about centralization. We all care a lot about not just research for the point of citations, but research that actually wouldn't have happened otherwise, and actually is going to lead to real-world outcomes. And so, yeah, with this kind of, like, shared vision, people understand, like, you know, so when I say, like, oh, well, you know, tell me, Ben, about BERT24, what's that about? And he's like, you know, like, oh, well, you know, you can see from an accessibility point of view, or you can see from a kind of an actual practical impact point of view, there's far too much focus on decoder-only models, and, you know, like, BERT's used in all of these different places in industry, and so I can see, like, in terms of our basic principles, what we're trying to achieve, this seems like something important. And so I think that's, like, really helpful, that we have that kind of shared perspective, you know?
Alessio [00:21:14]: Yeah.
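Picking up Kerem's DoRA point from above: for reference, here is roughly what the LoRA-to-DoRA switch looks like with the Hugging Face PEFT library, assuming a recent PEFT version. The model name and hyperparameters are placeholders, not Answer.AI's actual setup.

```python
# Sketch only: DoRA (weight-decomposed LoRA) is a one-flag change on a LoRA
# config in PEFT. Model name, rank, and target modules are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    use_dora=True,          # decompose weights into magnitude and direction
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)   # only the adapter parameters train
model.print_trainable_parameters()
```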
And before we maybe talk about some of the specific research, when you're, like, reaching out to people, interviewing them, what are some of the traits, like, how do these things come out, you know, usually? Is it working on side projects that you, you know, you're already familiar with? Is there anything, like, in the interview process that, like, helps you screen for people that are less pragmatic and more research-driven versus some of these folks that are just gonna do it, you know? They're not waiting for, like, the perfect process.
Jeremy [00:21:40]: Everybody who comes through the recruiting is interviewed by everybody in the company. You know, our goal is 12 people, so it's not an unreasonable amount. So the other thing to say is everybody so far who's come into the recruiting pipeline, everybody bar one, has been hired. So which is to say our original curation has been good. And that's actually pretty easy, because nearly everybody who's come in through the recruiting pipeline are people I know pretty well. So Jono Whitaker and I, you know, he worked on the stable diffusion course we did. He's outrageously creative and talented, and he's a super, like, enthusiastic tinkerer, just likes making things. Benjamin was one of the strongest parts of the fast.ai community, which is now the alumni. It's, like, hundreds of thousands of people. And you know, again, like, they're not people who a normal interview process would pick up, right? So Benjamin doesn't have any qualifications in math or computer science. Jono was living in Zimbabwe, you know, he was working on, like, helping some African startups, you know, but not FAANG kind of credentials. But yeah, I mean, when you actually see people doing real work and they stand out above, you know, we've got lots of Stanford graduates and OpenAI people and whatever in our alumni community as well. You know, when you stand out above all of those people anyway, obviously you've got something going for you. You know, Austin, him and I worked together on the masks study we did in the Proceedings of the National Academy of Sciences. You know, we had worked together, and again, that was a group of, like, basically the 18 or 19 top experts in the world on public health and epidemiology and research design and so forth. And Austin, you know, was one of the strongest people in that collaboration. So yeah, you know, like, I've been lucky enough to have had opportunities to work with some people who are great and, you know, I'm a very open-minded person, so I kind of am always happy to try working with pretty much anybody, and some people stand out. You know, there have been some exceptions, people I haven't previously known, like Ben Clavier, actually, I didn't know before. But you know, with him, you just read his code, and I'm like, oh, that's really well-written code. And like, it's not written exactly the same way as everybody else's code, and it's not written to do exactly the same thing as everybody else's code. So yeah, and then when I chatted to him, it's just like, I don't know, I felt like we'd known each other for years, like we just were on the same wavelength, but I could pretty much tell that was going to happen just by reading his code. I think you express a lot in the code you choose to write and how you choose to write it, I guess. You know, or another example, a guy named Vic, who was previously the CEO of DataQuest, and like, in that case, you know, he's created a really successful startup.
He won the first, basically, Kaggle NLP competition, which was automatic essay grading. He's got the current state-of-the-art OCR system, Surya. Again, he's just a guy who obviously just builds stuff, you know, he doesn't ask for permission, he doesn't need any, like, external resources. Actually, Kerem's another great example of this. I mean, I already knew Kerem very well because he was my best ever master's student, but it wasn't a surprise to me then when he went off to create the world's state-of-the-art language model in Turkish on his own, in his spare time, with no budget, from scratch. This is not fine-tuning or whatever, he, like, went back to Common Crawl and did everything. Yeah, it's kind of, I don't know what I'd describe that process as, but it's not at all based on credentials.
Swyx [00:25:17]: Assembled based on talent, yeah. We wanted to dive in a little bit more on, you know, turning from the people side of things into the technical bets that you're making. Just a little bit more on BERT. We just did an interview with Yi Tay from Reka, I don't know if you're familiar with his work, but also another encoder-decoder bet, and one of his arguments was actually people kind of over-index on the decoder-only GPT-3 type paradigm. I wonder if you have thoughts there that are maybe non-consensus as well. Yeah, no, absolutely.
Jeremy [00:25:45]: So I think it's a great example. So one of the people we're collaborating with a little bit on BERT24 is Colin Raffel, who is the guy behind, yeah, most of that stuff, you know, between that and UL2, there's a lot of really interesting work. And so one of the things I've been encouraging the BERT group to do, Colin has as well, is to consider using a T5 pre-trained encoder backbone as a thing you fine-tune, which I think would be really cool. You know, Colin was also saying actually just use encoder-decoder as your BERT, you know, why don't you use that as a baseline, which I also think is a good idea. Yeah, look.
Swyx [00:26:25]: What technical arguments are people under-weighting?
Jeremy [00:26:27]: I mean, Colin would be able to describe this much better than I can, but I'll give my slightly non-expert attempt. Look, I mean, think about like diffusion models, right? Like in stable diffusion, like we use things like UNet. You have this kind of downward path and then in the upward path you have the cross connections, which, it's not attention, but it's like a similar idea, right? You're inputting the original encoding path into your decoding path. It's critical to make it work, right? Because otherwise in the decoding part, the model has to do so much kind of from scratch. So like if you're doing translation, like that's a classic kind of encoder-decoder example. If it's decoder only, you never get the opportunity to find the right, you know, feature engineering, the right feature encoding for the original sentence. And it kind of means then on every token that you generate, you have to recreate the whole thing, you know? So if you have an encoder, it's basically saying like, okay, this is your opportunity, model, to create a really useful feature representation for your input information. So I think there's really strong arguments for encoder-decoder models anywhere that there is this kind of like context or source thing. And then why encoder only? Well, because so much of the time what we actually care about is a classification, you know? It's like an output. It's not like generating an arbitrary length sequence of tokens.
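As a sketch of Colin Raffel's suggestion above, fine-tuning a pre-trained T5 encoder backbone for classification might look roughly like this with Hugging Face transformers; the checkpoint name, pooling choice, and head size here are placeholder assumptions, not the BERT24 group's actual setup.

```python
# Sketch only: reuse a T5 encoder as a classification backbone. The
# checkpoint, mean-pooling, and two-label head are illustrative choices.
import torch
from transformers import AutoTokenizer, T5EncoderModel

class T5Classifier(torch.nn.Module):
    def __init__(self, name: str = "t5-small", num_labels: int = 2):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(name)
        self.head = torch.nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # mean-pool over non-padding tokens, then classify
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(1) / mask.sum(1)
        return self.head(pooled)

tok = AutoTokenizer.from_pretrained("t5-small")
model = T5Classifier()
batch = tok(["the build is broken"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
```

The point of the sketch: the encoder produces one feature representation of the whole input once, and a tiny head maps it to an output, rather than recreating that representation token by token the way a decoder-only model must.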
So anytime you're not generating an arbitrary length sequence of tokens, decoder models don't seem to make much sense. Now the interesting thing is, you see on like Kaggle competitions, that decoder models still are at least competitive with things like DeBERTa v3. They have to be way bigger to be competitive with things like DeBERTa v3. And the only reason they are competitive is because people have put a lot more time and money and effort into training the decoder-only ones, you know? There isn't a recent DeBERTa. There isn't a recent BERT. Yeah, it's a whole part of the world that people have slept on a little bit. And this is just what happens. This is how trends happen, rather than like, to me, everybody should be like, oh, let's look at the thing that has shown signs of being useful in the past, but nobody really followed up with properly. That's the more interesting path, you know, where people tend to be like, oh, I need to get citations. So what's everybody else doing? Can I make it 0.1% better, you know, or 0.1% faster? That's what everybody tends to do. Yeah. So I think it's like, Yi Tay's work commercially now is interesting because here's a whole model that's been trained in a different way. So there's probably a whole lot of tasks it's probably better at than GPT and Gemini and Claude. So that should be a good commercial opportunity for them if they can figure out what those tasks are.
Swyx [00:29:07]: Well, if rumors are to be believed, and he didn't comment on this, but, you know, Snowflake may figure out the commercialization for them. So we'll see.
Jeremy [00:29:14]: Good.
Alessio [00:29:16]: Let's talk about FSDP, QLoRA, QDoRA, and all of that awesome stuff. One of the things we talked about last time, some of these models are meant to run on systems that nobody can really own, no single person. And then you were like, well, what if you could fine-tune a 70B model on like a 4090? And I was like, no, that sounds great, Jeremy, but like, can we actually do it? And then obviously you all figured it out. Can you maybe tell us some of the war stories behind that, like the idea behind FSDP, which is kind of sharded data parallel computation, and then QLoRA, which is do not touch all the weights, just go quantize some of the model, and then within the quantized model only do certain layers instead of doing everything.
Jeremy [00:29:57]: Well, do the adapters. Yeah.
Alessio [00:29:59]: Yeah. Yeah. Do the adapters. Yeah. I will leave the floor to you. I think before you published it, nobody thought this was like a short-term thing that we're just going to have. And now it's like, oh, obviously you can do it, but it's not that easy.
Jeremy [00:30:12]: Yeah. I mean, to be honest, it was extremely unpleasant work to do. It's like not at all enjoyable. I kind of did version 0.1 of it myself before we had launched the company, or at least the kind of like the pieces. They're all pieces that are difficult to work with, right? So for the quantization, you know, I chatted to Tim Dettmers quite a bit and, you know, he very much encouraged me by saying like, yeah, it's possible. He actually thought it'd be easy. It probably would be easy for him, but I'm not Tim Dettmers. And, you know, so he wrote bitsandbytes, which is his quantization library. You know, he wrote that for a paper. He didn't write that to be production-like code. It's now like everybody's using it, at least the CUDA bits. So like, it's not particularly well structured.
There's lots of code paths that never get used. There's multiple versions of the same thing. You have to try to figure it out. So trying to get my head around that was hard. And you know, because the interesting bits are all written in CUDA, it's hard to, like, step through it and see what's happening. And then, you know, FSDP is this very complicated library in PyTorch, which is not particularly well documented. So really the only way to understand it properly is again, just read the code and step through the code. And then like bits and bytes doesn't really work in practice unless it's used with PEFT, the HuggingFace library, and PEFT doesn't really work in practice unless you use it with other things. And there's a lot of coupling in the HuggingFace ecosystem where like none of it works separately. You have to use it all together, which I don't love. So yeah, trying to just get a minimal example that I can play with was really hard. And so I ended up having to rewrite a lot of it myself to kind of create this like minimal script. One thing that helped a lot was Meta had this llama-recipes repo that came out just a little bit before I started working on that. And like they had a kind of role model example of like, here's how to train FSDP LoRA, it didn't work with QLoRA, on Llama. A lot of the stuff I discovered, the interesting stuff would be put together by Les Wright, who's, he was actually the guy in the Fast.ai community I mentioned who created the Ranger Optimizer. So he's doing a lot of great stuff at Meta now. So yeah, I kind of, that helped get some minimum stuff going and then it was great once Benjamin and Jono joined full time. And so we basically hacked at that together and then Karim joined like a month later or something. And it was like, gee, it was just a lot of like fiddly detailed engineering on like barely documented bits of obscure internals. So my focus was to see if it kind of could work and I kind of got a bit of a proof of concept working and then the rest of the guys actually did all the work to make it work properly. And, you know, every time we thought we had something, you know, we needed to have good benchmarks, right? So we'd like, it's very easy to convince yourself you've done the work when you haven't, you know, so then we'd actually try lots of things and be like, oh, and these like really important cases, the memory use is higher, you know, or it's actually slower. And we'd go in and we just find like all these things that were nothing to do with our library that just didn't work properly. And nobody had noticed they hadn't worked properly because nobody had really benchmarked it properly. So we ended up, you know, trying to fix a whole lot of different things. And even as we did so, new regressions were appearing in like transformers and stuff that Benjamin then had to go away and figure out like, oh, how come flash attention doesn't work in this version of transformers anymore with this set of models and like, oh, it turns out they accidentally changed this thing, so it doesn't work. You know, there's just, there's not a lot of really good performance type evals going on in the open source ecosystem. So there's an extraordinary amount of like things where people say like, oh, we built this thing and it has this result. And when you actually check it, it doesn't hold up. So yeah, there's a shitload of war stories from getting that thing to work.
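
For readers who want the shape of the recipe being described, here is a rough sketch of the QLoRA side using standard Hugging Face APIs: 4-bit base weights via bitsandbytes, trainable LoRA adapters via PEFT. The model id and hyperparameters are illustrative, not Answer.AI's actual configuration, and wrapping this in FSDP so the quantized weights shard correctly is exactly the fiddly engineering the team describes.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NF4 quantization from the QLoRA paper
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # illustrative base model
    quantization_config=bnb,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative hyperparameters
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()           # only the small adapters train
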
And it did require a particularly like tenacious group of people and a group of people who don't mind doing a whole lot of kind of like really janitorial work, to be honest, to get the details right, to check them. Yeah.Alessio [00:34:09]: We had Tri Dao on the podcast and we talked about how a lot of it is like systems work to make some of these things work. It's not just like beautiful, pure math that you do on a blackboard. It's like, how do you get into the nitty gritty?Jeremy [00:34:22]: I mean, flash attention is a great example of that. Like it's, it basically is just like, oh, let's just take the attention and just do the tiled version of it, which sounds simple enough, you know, but then implementing that is challenging at lots of levels.Alessio [00:34:36]: Yeah. What about inference? You know, obviously you've done all this amazing work on fine tuning. Do you have any research you've been doing on the inference side, how to make local inference really fast on these models too?Jeremy [00:34:47]: We're doing quite a bit on that at the moment. We haven't released too much there yet. But one of the things I've been trying to do is also just to help other people. And one of the nice things that's happened is that a couple of folks at Meta, including Mark Saroufim, have done a nice job of creating this CUDA mode community of people working on like CUDA kernels or learning about that. And I tried to help get that going well as well and did some lessons to help people get into it. So there's a lot going on in both inference and fine tuning performance. And a lot of it's actually happening kind of related to that. So the PyTorch team have created this Torch AO project on quantization. And so there's a big overlap now between kind of the FastAI and AnswerAI and CUDA mode communities of people working on stuff for both inference and fine tuning. But we're getting close now. You know, our goal is that nobody should be merging models, nobody should be downloading merged models, everybody should be using basically quantized plus adapters for almost everything and just downloading the adapters. And that should be much faster. So that's kind of the place we're trying to get to. It's difficult, you know, because like Karim's been doing a lot of work with vLLM, for example. These inference engines are pretty complex bits of code. They have a whole lot of custom kernel stuff going on as well, as do the quantization libraries. So we've been working on, we're also collaborating quite a bit with the folks who do HQQ, which is a really great quantization library and works super well. So yeah, there's a lot of other people outside AnswerAI that we're working with a lot who are really helping on all this performance optimization stuff, open source.Swyx [00:36:27]: Just to follow up on merging models, I picked up there that you said nobody should be merging models. That's interesting because obviously a lot of people are experimenting with this and finding interesting results. I would say in defense of merging models, you can do it without data. That's probably the only thing that's going for it.Jeremy [00:36:45]: To explain, it's not that you shouldn't merge models. You shouldn't be distributing a merged model. You should distribute a merged adapter 99% of the time. And actually often one of the best things happening in the model merging world is actually that often merging adapters works better anyway.
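
The "ship the adapter, not the merged model" pattern Jeremy goes on to explain looks roughly like this in PEFT terms; the adapter repo id below is a placeholder, not a real release.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb, device_map="auto"
)
# The adapter download is typically tens of MB; the merged FP16 model would be ~14 GB.
model = PeftModel.from_pretrained(base, "your-org/your-lora-adapter")  # placeholder id
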
The point is, Sean, that once you've got your new model, if you distribute it as an adapter that sits on top of a quantized model that somebody's already downloaded, then it's a much smaller download for them. And also the inference should be much faster because you're not having to transfer FP16 weights from HBM at all or ever load them off disk. You know, all the main weights are quantized and the only floating point weights are in the adapters. So that should make both inference and fine tuning faster. Okay, perfect.Swyx [00:37:33]: We're moving on a little bit to the rest of the fast universe. I would have thought that, you know, once you started Answer.ai, that the sort of fast universe would be kind of on hold. And then today you just dropped fastlite and it looks like, you know, there's more activity going on in sort of Fastland.Jeremy [00:37:49]: Yeah. So Fastland and Answerland are not really distinct things. Answerland is kind of like the Fastland grown up and funded. They both have the same mission, which is to maximize the societal benefit of AI broadly. We want to create thousands of commercially successful products at Answer.ai. And we want to do that with like 12 people. So that means we need a pretty efficient stack, you know, like quite a few orders of magnitude more efficient, not just for creation, but for deployment and maintenance than anything that currently exists. People often forget about the D part of our R&D firm. So we've got to be extremely good at creating, deploying and maintaining applications, not just models. Much to my horror, the story around creating web applications is much worse now than it was 10 or 15 years ago in terms of, if I say to a data scientist, here's how to create and deploy a web application, you know, either you have to learn JavaScript or TypeScript and about all the complex libraries like React and stuff, and all the complex like details around security and web protocol stuff around how you then talk to a backend and then all the details about creating the backend. You know, if that's your job and, you know, you have specialists who work in just one of those areas, it is possible for that to all work. But compared to like, oh, write a PHP script and put it in the home directory that you get when you sign up to this shell provider, which is what it was like in the nineties, you know, here are those 25 lines of code and you're done and now you can pass that URL around to all your friends, or put this, you know, .pl file inside the CGI bin directory that you got when you signed up to this web host. So yeah, the thing I've been mainly working on the last few weeks is fixing all that. And I think I fixed it. I don't know if this is an announcement, but I tell you guys, so yeah, there's this thing called FastHTML, which basically lets you create a complete web application in a single Python file. Unlike excellent projects like Streamlit and Gradio, you're not working on top of a highly abstracted thing that's got nothing to do with web foundations. You're working with web foundations directly, but you're able to do it by using pure Python. There's no template, there's no Jinja, there's no separate like CSS and JavaScript files. It looks and behaves like a modern SPA web application. And you can create components for like DaisyUI, or Bootstrap, or Shoelace, or whatever fancy JavaScript and or CSS Tailwind etc. library you like, but you can write it all in Python.
You can pip install somebody else's set of components and use them entirely from Python. You can develop and prototype it all in a Jupyter notebook if you want to. It all displays correctly, so you can like interactively do that. And then you mentioned fastlite, so specifically now if you're using SQLite in particular, it's like ridiculously easy to have that persistence, and all of your handlers will be passed database ready objects automatically, that you can just call dot delete dot update dot insert on. Yeah, you get session, you get security, you get all that. So again, like with most everything I do, it's very little code. It's mainly tying together really cool stuff that other people have written. You don't have to use it, but a lot of the best stuff comes from its incorporation of HTMX, which to me is basically the thing that changes your browser to make it work the way it always should have. So it just does four small things, but those four small things are the things that are basically unnecessary constraints that HTML should never have had, so it removes the constraints. It sits on top of Starlette, which is a very nice kind of lower level platform for building these kind of web applications. The actual interface matches as closely as possible to FastAPI, which is a really nice system for creating the kind of classic JavaScript type applications. And Sebastian, who wrote FastAPI, has been kind enough to help me think through some of these design decisions, and so forth. I mean, everybody involved has been super helpful. Actually, I chatted to Carson, who created HTMX, you know, about it. Some of the folks involved in Django, like everybody in the community I've spoken to definitely realizes there's a big gap to be filled around, like, highly scalable, web foundation-based, pure Python framework with a minimum of fuss. So yeah, I'm getting a lot of support and trying to make sure that FastHTML works well for people.Swyx [00:42:38]: I would say, when I heard about this, I texted Alessio. I think this is going to be pretty huge. People consider Streamlit and Gradio to be the state of the art, but I think there's so much to improve, and having what you call web foundations and web fundamentals at the core of it, I think, would be really helpful.Jeremy [00:42:54]: I mean, it's based on 25 years of thinking and work for me. So like, FastMail was built on a system much like this one, but that was in Perl. And so I spent, you know, 10 years working on that. We had millions of people using that every day, really pushing it hard. And I really always enjoyed working in that. Yeah. So, you know, and obviously lots of other people have done like great stuff, and particularly HTMX. So I've been thinking about like, yeah, how do I pull together the best of the web framework I created for FastMail with HTMX? There's also things like PicoCSS, which is the CSS system, which by default, FastHTML comes with. Although, as I say, you can pip install anything you want to, but it makes it like super easy to, you know, so we try to make it so that just out of the box, you don't have any choices to make. Yeah. You can make choices, but for most people, you just, you know, it's like the PHP in your home directory thing. You just start typing and just by default, you'll get something which looks and feels, you know, pretty okay. And if you want to then write a version of Gradio or Streamlit on top of that, you totally can.
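
As a concrete illustration of the single-file idea before the conversation moves on, here is a minimal FastHTML app based on its public examples; treat the details as a sketch rather than canonical usage.

from fasthtml.common import *

app, rt = fast_app()

@rt("/")
def get():
    # Components are plain Python callables that render to HTML
    return Titled("Hello, FastHTML",
                  P("A complete web app in one Python file."),
                  Button("Click me", hx_get="/clicked", hx_swap="outerHTML"))

@rt("/clicked")
def get():
    # HTMX swaps this fragment into the page: no full reload, no handwritten JS
    return P("Clicked!")

serve()  # starts a uvicorn dev server

The HTMX attributes (hx_get, hx_swap) are exactly the "four small things" style of browser extension mentioned above, written as Python keyword arguments.
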
And then the nice thing is if you then write it in kind of the Gradio equivalent, which will be, you know, I imagine we'll create some kind of pip installable thing for that. Once you've outgrown, or if you outgrow that, it's not like, okay, throw that all away and start again and learn this whole separate language. It's like this kind of smooth, gentle path that you can take step-by-step because it's all just standard web foundations all the way, you know.Swyx [00:44:29]: Just to wrap up the sort of open source work that you're doing, you're aiming to create thousands of projects with a very, very small team. I haven't heard you mention once AI agents or AI developer tooling or AI code maintenance. I know you're very productive, but you know, what is the role of AI in your own work?Jeremy [00:44:47]: So I'm making something. I'm not sure how much I want to say just yet.Swyx [00:44:52]: Give us a nibble.Jeremy [00:44:53]: All right. I'll give you the key thing. So I've created a new approach. It's not called prompt engineering. It's called dialogue engineering. But I'm creating a system for doing dialogue engineering. It's currently called AI Magic. I'm doing most of my work in this system and it's making me much more productive than I was before I used it. So I always just build stuff for myself and hope that it'll be useful for somebody else. Think about ChatGPT with code interpreter, right? The basic UX is the same as a 1970s teletype, right? So if you wrote APL on a teletype in the 1970s, you typed onto a thing, your words appeared at the bottom of a sheet of paper and you'd like hit enter and it would scroll up. And then the answer from APL would be printed out, scroll up, and then you would type the next thing. And like, which is also the way, for example, a shell works like bash or ZSH or whatever. It's not terrible, you know, like we all get a lot done in these like very, very basic teletype style REPL environments, but I've never felt like it's optimal and everybody else has just copied ChatGPT. So it's also the way Bard and Gemini work. It's also the way the Claude web app works. And then you add code interpreter. And the most you can do is to like plead with ChatGPT to write the kind of code I want. It's pretty good for very, very, very beginner users who like can't code at all, like by default now the code's even hidden away, so you never even have to see it ever happened. But for somebody who's like wanting to learn to code or who already knows a bit of code or whatever, it's, it seems really not ideal. So okay, that's one end of the spectrum. The other end of the spectrum, which is where Sean's work comes in, is, oh, you want to do more than ChatGPT? No worries. Here is Visual Studio Code. I run it. There's an empty screen with a flashing cursor. Okay, start coding, you know, and it's like, okay, you can use systems like Sean's or like Cursor or whatever to be like, okay, Cmd-K in Cursor is like a little prompt form that blah, blah, blah. But in the end, it's like a convenience over the top of this incredibly complicated system that full-time sophisticated software engineers have designed over the past few decades in a totally different environment as a way to build software, you know. And so we're trying to like shoehorn in AI into that. And it's not easy to do. And I think there are like much better ways of thinking about the craft of software development in a language model world to be much more interactive, you know.
So the thing that I'm building is neither of those things. It's something between the two. And it's built around this idea of crafting a dialogue, you know, where the outcome of the dialogue is the artifacts that you want, whether it be a piece of analysis or whether it be a Python library or whether it be a technical blog post or whatever. So as part of building that, I've created something called Claudette, which is a library for Claude. I've created something called Cosette, which is a library for OpenAI. They're libraries which are designed to make those APIs much more usable, much easier to use, much more concise. And then I've written AI Magic on top of those. And that's been an interesting exercise because I did Claudette first, and I was looking at what Simon Willison did with his fantastic LLM library. And his library is designed around like, let's make something that supports all the LLM inference engines and commercial providers. I thought, okay, what if I did something different, which is like make something that's as Claude friendly as possible and forget everything else. So that's what Claudette was. So for example, one of the really nice things in Claude is prefill. So by telling the assistant that this is what your response started with, there's a lot of powerful things you can take advantage of. So yeah, I created Claudette to be as Claude friendly as possible. And then after I did that, and then particularly with GPT-4o coming out, I kind of thought, okay, now let's create something that's as OpenAI friendly as possible. And then I tried to look to see, well, where are the similarities and where are the differences? And now can I make them compatible in places where it makes sense for them to be compatible without losing out on the things that make each one special for what they are. So yeah, those are some of the things I've been working on in that space. And I'm thinking we might launch AI Magic via a course called "How to Solve It With Code". The name is based on the classic Pólya book, How to Solve It, which is, you know, one of the classic math books of all time, where we're basically going to try to show people how to solve challenging problems that they didn't think they could solve without doing a full computer science course, by taking advantage of a bit of AI and a bit of like practical skills, particularly for this like whole generation of people who are learning to code with and because of ChatGPT. Like I love it, I know a lot of people who didn't really know how to code, but they've created things because they use ChatGPT, but they don't really know how to maintain them or fix them or add things to them that ChatGPT can't do, because they don't really know how to code. And so this course will be designed to show you how you can like either become a developer who can like supercharge their capabilities by using language models, or become a language model first developer who can supercharge their capabilities by understanding a bit about process and fundamentals.Alessio [00:50:19]: Nice. That's a great spoiler. You know, I guess the fourth time you're going to be on Latent Space, we're going to talk about AI Magic. Jeremy, before we wrap, this was just a great run through everything. What are the things that when you next come on the podcast in nine, 12 months, we're going to be like, man, Jeremy was like really ahead of it. Like, is there anything that you see in the space that maybe people are not talking about enough?
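
An aside on the prefill trick mentioned a moment ago: it can be sketched with the raw Anthropic SDK, where you seed the start of the assistant's reply and Claude must continue from it (Claudette wraps this more concisely). The model id and prompt below are illustrative.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
resp = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative model id
    max_tokens=200,
    messages=[
        {"role": "user", "content": "Give me three uses of encoder-only models, as JSON."},
        # Prefill: the reply is forced to continue from this opening brace,
        # so the model has to answer with a JSON object.
        {"role": "assistant", "content": "{"},
    ],
)
print("{" + resp.content[0].text)
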
You know, what's the next company that's going to fall, like have drama internally, anything in your mind?Jeremy [00:50:47]: You know, hopefully we'll be talking a lot about FastHTML and hopefully the international community that at that point has come up around that. And also about AI Magic and about dialogue engineering. Hopefully dialogue engineering catches on because I think it's the right way to think about a lot of this stuff. What else? Just trying to think about it all on the research side. Yeah. I think, you know, I mean, we've talked about a lot of it. Like I think encoder decoder architectures, encoder only architectures, hopefully we'll be talking about like the whole re-interest in BERT that BERT24 stimulated.Swyx [00:51:17]: There's a state space model that came out today that might be interesting for this general discussion. One thing that stood out to me with Cartesia's blog post was that they were talking about real time ingestion, billions and trillions of tokens, and keeping that context, obviously in the state space that they have.Jeremy [00:51:34]: Yeah.Swyx [00:51:35]: I'm wondering what your thoughts are because you've been entirely transformers the whole time.Jeremy [00:51:38]: Yeah. No. So obviously my background is RNNs and LSTMs. Of course. And I'm still a believer in the idea that state is something you can update, you know? So obviously Sepp Hochreiter came out with xLSTM recently. Oh my God. Okay. Another whole thing we haven't talked about, just somewhat related. I've been going crazy for like a long time about like, why can I not pay anybody to save my KV cache? I just ingested the Great Gatsby or the documentation for Starlette or whatever, you know, I'm sending it as my prompt context. Why are you redoing it every time? So Gemini is about to finally come out with KV caching, and this is something that Austin actually in Gemma.cpp had had on his roadmap for years, well not years, months, long time. The idea that the KV cache is like a thing that, it's a third thing, right? So there's RAG, you know, there's in-context learning, you know, and prompt engineering, and there's KV cache creation. I think it creates like a whole new class almost of applications, or techniques, where, you know, for me, for example, I very often work with really new libraries or I've created my own library that I'm now writing with rather than on. So I want all the docs in my new library to be there all the time. So I want to upload them once, and then we have a whole discussion about building this application using FastHTML. Well nobody's got FastHTML in their language model yet, I don't want to send all the FastHTML docs across every time. So one of the things I'm looking at doing in AI Magic actually is taking advantage of some of these ideas so that you can have the documentation of the libraries you're working on be kind of always available. Something over the next 12 months people will be spending time thinking about is how to like, where to use RAG, where to use fine-tuning, where to use KV cache storage, you know. And how to use state, because in state space models and xLSTM, again, state is something you update. So how do we combine the best of all of these worlds?Alessio [00:53:46]: And Jeremy, I know before you talked about how some of the autoregressive models are not maybe a great fit for agents.
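
The KV cache reuse Jeremy is asking for can be sketched with the standard transformers past_key_values mechanism: encode the long document once, keep the cache, and answer follow-up questions against it. The model and prompts are illustrative, and a real system would also need cache persistence and eviction, which is exactly the product gap being described.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # illustrative small model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Pay the ingestion cost once: run the long context through the model and keep the KV cache.
doc_ids = tok("Pretend this is the entire FastHTML documentation...", return_tensors="pt").input_ids
with torch.no_grad():
    past = model(doc_ids, use_cache=True).past_key_values

# Later questions reuse the cache instead of re-encoding the document.
input_ids = tok(" Q: what does it do? A:", return_tensors="pt").input_ids
answer = []
with torch.no_grad():
    for _ in range(20):  # greedy decoding, for brevity
        out = model(input_ids, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(-1, keepdim=True)
        answer.append(next_id)
        input_ids = next_id
print(tok.decode(torch.cat(answer, dim=-1)[0]))
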
Any other thoughts on like JEPA, diffusion for text, any interesting thing that you've seen pop up?Jeremy [00:53:58]: In the same way that we probably ought to have state that you can update, i.e. xLSTM and state space models, in the same way that a lot of things probably should have an encoder, JEPA and diffusion both seem like the right conceptual mapping for a lot of things we probably want to do. So the idea of like, there should be a piece of the generative pipeline, which is like thinking about the answer and coming up with a sketch of what the answer looks like before you start outputting tokens. That's where it kind of feels like diffusion ought to fit, you know. And diffusion is, because it's not autoregressive, it's like, let's try to like gradually de-blur the picture of how to solve this. So this is also where dialogue engineering fits in, by the way. So with dialogue engineering, one of the reasons it's working so well for me is I use it to kind of like craft the thought process before I generate the code, you know. So yeah, there's a lot of different pieces here and I don't know how they'll all kind of exactly fit together. I don't know if JEPA is going to actually end up working in the text world. I don't know if diffusion will end up working in the text world, but they seem to be like trying to solve a class of problem which is currently unsolved.Alessio [00:55:13]: Awesome, Jeremy. This was great, as usual. Thanks again for coming back on the pod and thank you all for listening. Yeah, that was fantastic. Get full access to Latent Space at www.latent.space/subscribe

The top AI news from the past week, every ThursdAI

Starting Monday, Apple released iOS 18.1 with Apple Intelligence, then Meta dropped SAM-2 (Segment Anything Model) and then Google first open sourced Gemma 2B and now (just literally 2 hours ago, during the live show) released Gemini 1.5 0801 experimental that takes #1 on LMsys arena across multiple categories. To top it all off we also got a new SOTA image diffusion model called FLUX.1 from ex-Stability folks and their new Black Forest Labs. This week on the show, we had Joseph & Piotr Skalski from Roboflow talk in depth about Segment Anything, and as the absolute experts on this topic (Skalski is our returning vision expert), it was an incredible deep dive into the importance of dedicated vision models (not VLMs). We also had Lukas Atkins & Fernando Neto from Arcee AI talk to us about their new DistillKit and explain model distillation in detail, & finally we had Cristiano Giardina, who is one of the lucky few that got access to OpenAI advanced voice mode + his new friend GPT-4o came on the show as well! Honestly, how can one keep up with all this? By reading ThursdAI, of course, that's how. But ⚠️ buckle up, this is going to be a BIG one (I think over 4.5K words, which will mark this as the longest newsletter I've penned, I'm sorry, maybe read this one on 2x?)

Building the Open Metaverse
Revolutionizing Game Development: W/ Will Eastcott of PlayCanvas

Building the Open Metaverse

Play Episode Listen Later Jun 11, 2024 38:48


In this episode of "Building the Open Metaverse," hosts Marc Petit and Patrick Cozzi welcome Will Eastcott, the co-founder and CEO of PlayCanvas, a pioneering open-source game engine. Eastcott's journey in the gaming industry began in the 1980s when he was captivated by the 3D space trading game "Elite," which fueled his passion for computers and video games. He pursued computing at Imperial College London and gained early industry experience at a VR company, working with advanced silicon graphics workstations. His career took off at Criterion Software, where he contributed to developing RenderWare and worked on notable games like Grand Theft Auto and Call of Duty.Eastcott's transition to web-based game development was sparked by the release of the WebGL specification in 2010, which he saw as a significant opportunity for interactive graphics. He founded PlayCanvas in 2011, focusing on creating a web-native game engine that leveraged WebGL. Despite initial challenges, such as limited WebGL support on iOS, PlayCanvas flourished, becoming open-source in 2014. This move fostered a global community of developers and solidified PlayCanvas's role in democratizing game development.The conversation delves into the strategic acquisition of PlayCanvas by Snap Inc. in 2017, which allowed Eastcott and his team to work on Snap Games, a platform serving millions of users. Eastcott shares insights into the unique aspects of PlayCanvas, including its lightweight runtime, collaborative browser-based platform, and commitment to open standards like glTF and WebXR. He emphasizes the importance of WebGPU in achieving significant performance improvements and explores the potential of AI and machine learning in revolutionizing content creation.Eastcott highlights the development of Super Splat, a tool for optimizing 3D Gaussian splat scenes, demonstrating how AI can streamline the creation of photorealistic content without extensive coding. He also discusses the future of web gaming, pointing out the need for improved payment systems, discovery mechanisms, and better support from browser vendors to enhance the web gaming experience.The episode concludes with Eastcott offering advice to aspiring game developers, encouraging them to leverage the vast audience and creative freedom provided by the web. He also gives a shout-out to Ken Russell and the National Center for Computing History in Cambridge, UK, acknowledging their contributions to the industry. Eastcott's journey and insights provide a compelling narrative on the evolution of web-based game development and the transformative potential of emerging technologies. Have any comments or questions? Email the show Feedback@Buildingtheopenmetaverse.org Want more information? Visit our website www.buildingtheopenmetaverse.org And make sure you follow us on Linkedin for all of our show updates https://www.linkedin.com/company/buildingtheopenmetaverse Building the Open Metaverse is a podcast hosted by Patrick Cozzi (Cesium) and Marc Petit that invites a broad range of technical experts to share their insights on how the community is building the metaverse together. #BuildingTheOpenMetaversePodcast #MetaversePodcast #Metaverse

AI DAILY: Breaking News in AI
GOOGLE I/O: ALL ABOUT AI

AI DAILY: Breaking News in AI

Play Episode Listen Later May 15, 2024 4:26


Plus AI Daily Turns 1 (subscribe in the links below) Get a free 20-page AI explainer: AI FROM ZERO plus these stories and more, delivered to your inbox, every weekday. Subscribe to our newsletter at https://aidaily.us  Like this? Get AIDAILY, delivered to your inbox, every weekday. Subscribe to our newsletter at https://aidaily.us Google I/O Highlights AI Upgrades Across Gemini, Chrome, and More At Google I/O 2024, AI dominated the announcements. The Gemini model, now more conversational and versatile, underpins Project Astra for natural, real-time assistance. Google also unveiled Gemini Nano in Chrome for AI-powered features. Updates include enhanced AI in search, Android 15, and new tools like Veo for video creation and NotebookLM's virtual assistant. AI Daily Turns 1 A year ago, our founder Chris Kalaboukis had a vision to bring a quick summary (less than a 5 minutes read or listen) of the latest news in AI to readers and listeners. These news items would not be curated by AI, but by humans, and would try to feature a balanced view on AI developments, including both the positive and negative aspects of where we are taking AI.Have we succeeded? Let us know! Project Astra: Google's Future AI Assistant Google's Project Astra, unveiled at Google I/O, is a real-time, multimodal AI assistant capable of seeing, recognizing objects, and answering questions. Astra is part of the Gemini suite, which includes models like Veo for video generation. Google aims to create versatile AI assistants to enhance daily tasks and interactions. Google to Introduce Personalized AI Chatbots Google's new Gemini feature, “Gems,” allows users to create custom chatbots with distinct personalities and abilities. Users can configure Gemini to serve various roles, such as a gym buddy, sous-chef, or coding partner, by specifying tasks and response styles. Gems will soon be available to Gemini Advanced subscribers. Google Introduces AI Teammate at Google I/O 2024 At Google I/O 2024, Google unveiled the AI Teammate as part of its Gemini for Workspace platform. The AI Teammate integrates into group chats, emails, and documents, acting like a co-worker with a specific role. It can answer questions based on previous interactions and share information with the whole team. Google Integrates Gemini Nano AI Model into Chrome Desktop At Google I/O 2024, Google announced the integration of its Gemini Nano AI model into the Chrome desktop client, starting with Chrome 126. This allows developers to use the on-device model for AI features, such as Workspace Lab's "help me write" tool. Chrome's WebGPU and WASM support enable this capability, making the web AI-ready. Google Unveils AI Video Generator Veo at I/O 2024 Google introduced Veo, an AI model for generating 1080p videos from text prompts, at I/O 2024. Veo can create minute-long videos with advanced editing features and supports realistic visual effects. Initially available to select creators, Veo will be released gradually for broader use. OpenAI Co-Founder and Chief Scientist Ilya Sutskever Departs Ilya Sutskever, OpenAI's co-founder and chief scientist, announced his departure to pursue a "personally meaningful" project. Jakub Pachocki, the firm's director of research, will succeed him as chief scientist. Sutskever previously helped briefly oust CEO Sam Altman last year but expressed regret for his actions. --- Send in a voice message: https://podcasters.spotify.com/pod/show/aidaily/message

Paul's Security Weekly
Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Phishing, Josh Marpet, and More - SWN #370

Paul's Security Weekly

Play Episode Listen Later Mar 19, 2024 32:29


Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Conversation Overflow, Phishing, Josh Marpet, and more on this Edition of the Security Weekly News. Visit https://www.securityweekly.com/swn for all the latest episodes! Show Notes: https://securityweekly.com/swn-370

Paul's Security Weekly TV
Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Phishing, Josh Marpet, and More - SWN #370

Paul's Security Weekly TV

Play Episode Listen Later Mar 19, 2024 32:39


Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Conversation Overflow, Phishing, Josh Marpet, and more on this Edition of the Security Weekly News. Show Notes: https://securityweekly.com/swn-370

Hack Naked News (Audio)
Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Phishing, Josh Marpet, and More - SWN #370

Hack Naked News (Audio)

Play Episode Listen Later Mar 19, 2024 32:29


Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Conversation Overflow, Phishing, Josh Marpet, and more on this Edition of the Security Weekly News. Visit https://www.securityweekly.com/swn for all the latest episodes! Show Notes: https://securityweekly.com/swn-370

Hack Naked News (Video)
Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Phishing, Josh Marpet, and More - SWN #370

Hack Naked News (Video)

Play Episode Listen Later Mar 19, 2024 32:39


Sick Jokes, WEBGPU, Fortra, Azorult, Fujitsu, Conversation Overflow, Phishing, Josh Marpet, and more on this Edition of the Security Weekly News. Show Notes: https://securityweekly.com/swn-370

The top AI news from the past week, every ThursdAI

"...Happy birthday dear ThursdAIiiiiiiii, happy birthday to youuuuuu

Hacker News Recap
December 22nd, 2023 | How big is YouTube?

Hacker News Recap

Play Episode Listen Later Dec 23, 2023 18:41


This is a recap of the top 10 posts on Hacker News on December 22nd, 2023. This podcast was generated by wondercraft.ai
(00:43): From Nand to Tetris (2017). Original post: https://news.ycombinator.com/item?id=38735066&utm_source=wondercraft_ai
(02:23): How big is YouTube? Original post: https://news.ycombinator.com/item?id=38739563&utm_source=wondercraft_ai
(04:10): Granting pardon for the offense of simple possession of or use of marijuana. Original post: https://news.ycombinator.com/item?id=38736919&utm_source=wondercraft_ai
(06:05): Google is apparently struggling to contain an ongoing spam attack. Original post: https://news.ycombinator.com/item?id=38738619&utm_source=wondercraft_ai
(07:55): Posts, profiles, and user search are now available without login. Original post: https://news.ycombinator.com/item?id=38739130&utm_source=wondercraft_ai
(09:37): LED Industrial Piercing. Original post: https://news.ycombinator.com/item?id=38734164&utm_source=wondercraft_ai
(11:19): WebGPU now available for testing in Safari Technology Preview. Original post: https://news.ycombinator.com/item?id=38737028&utm_source=wondercraft_ai
(13:08): They want you to forget what a film looks like. Original post: https://news.ycombinator.com/item?id=38741536&utm_source=wondercraft_ai
(14:50): SymbOS Z80 multitasking operating system. Original post: https://news.ycombinator.com/item?id=38736054&utm_source=wondercraft_ai
(16:50): Schrödinger equation emerges mathematically from classical mechanics (2012). Original post: https://news.ycombinator.com/item?id=38735725&utm_source=wondercraft_ai
This is a third-party project, independent from HN and YC. Text and audio generated using AI, by wondercraft.ai. Create your own studio quality podcast with text as the only input in seconds at app.wondercraft.ai. Issues or feedback? We'd love to hear from you: team@wondercraft.ai

Compilado do Código Fonte TV
JetBrains supercharges IDEs with AI; Google Duet in VS Code; New SvelteKit; Deno updates to KV and WebGPU; Jira gets AI; Wi-Fi 7 is 5x faster [Compilado #129]

Compilado do Código Fonte TV

Play Episode Listen Later Dec 16, 2023 57:17


JS Party
From WebGL to WebGPU

JS Party

Play Episode Listen Later Dec 7, 2023 58:53


Gregg Tavares (author of WebGL/WebGPU Fundamentals) joins Jerod & Amal to give us a tour of these low-level technologies that are pushing the web forward into the world of video games, machine learning & other exciting rich applications.

Changelog Master Feed
From WebGL to WebGPU (JS Party #304)

Changelog Master Feed

Play Episode Listen Later Dec 7, 2023 58:53


Gregg Tavares (author of WebGL/WebGPU Fundamentals) joins Jerod & Amal to give us a tour of these low-level technologies that are pushing the web forward into the world of video games, machine learning & other exciting rich applications.

Off The Main Thread
WebGPU and Browser Ideologies

Off The Main Thread

Play Episode Listen Later Nov 29, 2023 78:23


In this episode, Surma talks about the "GPU" in "WebGPU" and how this new web standard makes programming for the GPU more accessible. Jake talks about how different browsers approach standards and their perceived ideologies around what they prioritize.
Resources:
Surma's blog post on WebGPU
A 13-part blog post series on the architecture of GPUs
The OpenGL internal state object explained
Dawn, a C++ library that brings the WebGPU API to C++
wgpu, a Rust crate that brings the WebGPU API to Rust
The extensible web manifesto
Edge 'injecting' content into the Chrome download page
-webkit-box-reflect
Is Safari the new IE?
Stadia controller flash

Building the Open Metaverse
Web Standards for the Win W/ Ken Russell & Corentin Wallez

Building the Open Metaverse

Play Episode Listen Later Nov 7, 2023 36:52


In this episode of the Building the Open Metaverse podcast, Ken Russell and Corentin Wallez from the Google Chrome graphics team discuss using web browsers and technologies like WebGPU, WebGL, and WebAssembly to build an open and accessible metaverse.   They explain how new browser capabilities like WebGPU's compute shaders and multi-threading support can enable complex 3D experiences on par with console and mobile games. Russell and Wallez examine performance considerations like streaming assets and reducing security overhead. An open question is supporting multi-user experiences across origins while maintaining security.   The guests are optimistic that an open metaverse can be built using web principles like transparency and permissionless innovation. They see opportunities for blending languages like Rust, C++, JavaScript, and TypeScript. A key benefit of web tech is portability across devices. Russell and Wallez encourage industry collaboration on ethical guidelines and standards. ==== Have any comments or questions? Email the show Feedback@Buildingtheopenmetaverse.org   Want more information? Visit our website www.buildingtheopenmetaverse.org   And make sure you follow us on Linkedin for all of our show updates   https://www.linkedin.com/company/buildingtheopenmetaverse/ Building the Open Metaverse is a podcast hosted by Patrick Cozzi (Cesium) and Marc Petit (Epic Games) that invites a broad range of technical experts to share their insights on how the community is building the metaverse together. #BuildingTheOpenMetaversePodcast #MetaversePodcast #Metaverse

The top AI news from the past week, every ThursdAI

Boy am I glad that not all AI weeks are like last week, where we had so much news and so many things happening that I was barely able to take a breath for the week! I am very excited to bring you this newsletter from San Francisco this week, the AI mecca, the arena, the place where there are so many AI events and hack-a-thons that I don't actually know how people get any work done! On that topic, I'm in SF to participate in the AI.engineer (by and ) next week, to host spaces and interviews with the top AI folks in here, and to discuss with the audience what an AI engineer is. If you have any questions you'd like me to ask, please comment with them and I'll make sure I try to answer. ThursdAI - subscribe eh? ↴
Here's a table of contents of everything we chatted about:
[00:00:00] Intro and welcome
[00:04:53] Alex in San Francisco - AI Engineer
[00:07:32] Reka AI - Announcing a new multimodal Foundational model called Yasa-1
[00:12:42] Google adding Bard to Google Assistant
[00:18:56] Where is Gemini?
[00:23:06] Arc browser adding Arc Max with 5 new AI features
[00:24:56] 5 seconds link AI generated previews
[00:31:54] Ability to run LLMs on client side with WebGPU
[00:39:28] Mistral is getting love from Open Source
[00:48:04] Mistral Open Orca 7B
[00:58:28] Acknowledging the experts of ThursdAI
[01:01:14] Voice based always on AI assistants
[01:09:00] Airchat adds voice cloning based translation tech
[01:14:23] Effects of AI voice cloning on society
[01:21:32] SDXL IKEA LORA
[01:23:17] Brief Recap
Show notes:
Big Co
* Google - adding Bard to Google Assistant (Announcement) - Come on Google, just give us Gemini already!
* Reka AI - Multimodal Yasa-1 from Yi Tay and team (Announcement) - With Yi Tay from Flan/Bard fame as chief scientist! But I wasn't able to test myself!
* Arc - first browser AI features (My thread, Brief video review, Arc Invite) - I love Arc, I recommend it to everyone I meet, and now with AI preview features it's even more of a no-brainer; strongly recommend if you like productivity
Open Source LLMs
* Mistral vs LLaMa 2 boxing match (link) - A fun little battle arena to select which responses you personally find better to see the difference between Mistral 7B and LLaMa 13B
* Mistral-7B-OpenOrca (announcement) - The folks from Alignment Lab do it again! Great finetune that comes very close (98%) to LLaMa 70B on benchmarks!
* SynthIA-7B-v1.3 (Huggingface) - An uncensored finetune on top of Mistral that Reddit claims is a great model, especially since a chain of thought is somehow built in apparently
Vision
* Radiologists thread about GPT-4V taking over radiology (or maybe not?) (Thread)
Voice
* AirChat added voice clone + translation features (Room, Demo) - I've been an avid AirChat user (it's Naval's social media platform that's voice based) for a while, and am very excited they are destroying language barriers with this feature!
* Tab was revealed in a great demo by Avi Schiffman (Demo) - Go Avi! Rooting for you brother, competition makes folks stronger!
* Rewind announced the Rewind Pendant (Announcement) - I ordered one, but Rewind didn't announce a date of when this hits the market; going to be interesting to see how well they do!
AI Art and Diffusion
* IKEA LoRA - generate IKEA-style tutorials for everything with SDXL (Announcement, HuggingFace)
* DALL-E 3 seems to be available to all Plus members now
This week's pod was generated by talking to ChatGPT, it's so fun, you gotta try it! No long breakdown this week, but we covered a bunch of it in the show, and I highly recommend listening to it! Don't forget to follow me on X to be aware of the spaces live from the ai.engineer event in SF; the conference will be live-streamed as well on YouTube! See you next week

Now in Android
89 - Android 14 Beta 5, Compose for Wear OS, WebGPU, and more!

Now in Android

Play Episode Listen Later Aug 23, 2023 3:43


Welcome to Now in Android, your ongoing guide to what's new and notable in the world of Android development. Today, we're covering updates on Android 14 Beta 5, Compose for Wear OS, the latest on WebGPU, articles, videos, and more! For links to these items, check out Now in Android #89 on Medium → https://goo.gle/3Z2xSzD Now in Android podcast → https://goo.gle/2BDIo9y           Now in Android articles → https://goo.gle/2xtWmsu         Now in Android playlist → https://goo.gle/now-in-android           Subscribe to Android Developers → https://goo.gle/AndroidDevs

Android Developers Backstage
Episode 200: WebGPU

Android Developers Backstage

Play Episode Listen Later Aug 15, 2023 50:26


In this episode, Chet and Romain speak with Ken Russell and Corentin Wallez from the WebGPU team. WebGPU is a new API that brings modern GPU rendering and compute functionality to web and other platforms (including Android!). We talk about the genesis and capabilities of WebGPU, WGSL (WebGPU's new shading language), the state of WebGL (the predecessor API for web GPU rendering), and lots of other fun related graphics topics.
Ken, Romain, and Chet (not pictured: Corentin, who is on the monitor behind the photographer)
Links:
Samples (and its github repo)
Google I/O Codelab
Google I/O presentation
Introducing WebGPU (and associated blog post)
Series of articles teaching WebGPU and WGSL
Series of articles of WebGPU Best Practices
Draft specs for WebGPU and WGSL
Dawn from Google/Chromium
wgpu from Firefox
Romain: @romainguy, romainguy@threads, romainguy@androiddev.social
Tor: tor.norbye@threads and tornorbye@androiddev.social
Chet: @chethaase, chet.haase@threads, and chethaase@androiddev.social
Ken: @gfxprogrammer
Corentin: @DaKangz and @DaKangz@mastodon.gamedev.place
Catch more from ADB → https://goo.gle/adb-podcast
Subscribe to Android Developers YouTube → https://goo.gle/AndroidDevs

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Aug 10, 2023 52:10


We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you'd like to support. We are facing a massive GPU crunch. As both startups and VCs hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There's just one weird trick: compilation. And there's one person uniquely qualified to do it. We had the pleasure to sit down with Tianqi Chen, who's an Assistant Professor at CMU, where he both teaches the MLC course and runs the MLC group. You might also know him as the creator of XGBoost, Apache TVM, and MXNet, as well as the co-founder of OctoML. The MLC (short for Machine Learning Compilation) group has released a lot of interesting projects:
* MLC Chat: an iPhone app that lets you run models like RedPajama-3B and Vicuna-7B on-device. It gets up to 30 tok/s!
* Web LLM: Run models like LLaMA-70B in your browser (!!) to offer local inference in your product.
* MLC LLM: a framework that allows any language model to be deployed natively on different hardware and software stacks.
The MLC group has just announced new support for AMD cards; we previously talked about the shortcomings of ROCm, but using MLC you can get performance very close to their NVIDIA counterparts. This is great news for founders and builders, as AMD cards are more readily available. Here are their latest results on AMD's 7900s vs some of the top NVIDIA consumer cards. If you just can't get a GPU at all, MLC LLM also supports ARM and x86 CPU architectures as targets by leveraging LLVM. While speed performance isn't comparable, it allows for non-time-sensitive inference to be run on commodity hardware. We also enjoyed getting a peek into TQ's process, which involves a lot of sketching. With all the other work going on in this space with projects like ggml and Ollama, we're excited to see GPUs becoming less and less of an issue to get models in the hands of more people, and innovative software solutions to hardware problems!
Show Notes
* TQ's Projects:
* XGBoost
* Apache TVM
* MXNet
* MLC
* OctoML
* CMU Catalyst
* ONNX
* GGML
* Mojo
* WebLLM
* RWKV
* HiPPO
* Tri Dao's Episode
* George Hotz Episode
People:
* Carlos Guestrin
* Albert Gu
Timestamps
* [00:00:00] Intros
* [00:03:41] The creation of XGBoost and its surprising popularity
* [00:06:01] Comparing tree-based models vs deep learning
* [00:10:33] Overview of TVM and how it works with ONNX
* [00:17:18] MLC deep dive
* [00:28:10] Using int4 quantization for inference of language models
* [00:30:32] Comparison of MLC to other model optimization projects
* [00:35:02] Running large language models in the browser with WebLLM
* [00:37:47] Integrating browser models into applications
* [00:41:15] OctoAI and self-optimizing compute
* [00:45:45] Lightning Round
Transcript
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, writer and editor of Latent Space. [00:00:20]Swyx: Okay, and we are here with Tianqi Chen, or TQ as people call him, who is assistant professor in ML computer science at CMU, Carnegie Mellon University, also helping to run Catalyst Group, also chief technologist of OctoML. You wear many hats. Are those, you know, your primary identities these days? Of course, of course. [00:00:42]Tianqi: I'm also, you know, very enthusiastic open source.
So I'm also a VP and PMC member of the Apache TVM project and so on. But yeah, these are the things I've been up to so far. [00:00:53]Swyx: Yeah. So you did Apache TVM, XGBoost, and MXNet, and we can cover any of those in any amount of detail. But maybe what's one thing about you that people might not learn from your official bio or LinkedIn, you know, on the personal side? [00:01:08]Tianqi: Let me say, yeah, so normally when I do, I really love coding, even though like I'm trying to run all those things. So one thing that I keep a habit on is I try to do sketchbooks. I have a book, like real sketchbooks to draw down the design diagrams and the sketchbooks I keep sketching over the years, and now I have like three or four of them. And it's kind of usually a fun experience of thinking the design through and also seeing how the open source project evolves and also looking back at the sketches that we had in the past to say, you know, all these ideas really turn into code nowadays. [00:01:43]Alessio: How many sketchbooks did you get through to build all this stuff? I mean, if one person alone built one of those projects, he'll be a very accomplished engineer. Like you built like three of these. What's that process like for you? Like it's the sketchbook, like the start, and then you think about the code or like. [00:01:59]Swyx: Yeah. [00:02:00]Tianqi: So, so usually I start sketching on high level architectures and also in a project that works for over years, we also start to think about, you know, new directions, like of course generative AI language model comes in, how it's going to evolve. So normally I would say it takes like one book a year, roughly at that rate. It's usually fun to, I find it's much easier to sketch things out and then gives a more like a high level architectural guide for some of the future items. Yeah. [00:02:28]Swyx: Have you ever published these sketchbooks? Cause I think people would be very interested in, at least on a historical basis. Like this is the time where XGBoost was born, you know? Yeah, not really. [00:02:37]Tianqi: I started sketching like after XGBoost. So that's a kind of missing piece, but a lot of design details in TVM are actually part of the books that I try to keep a record of. [00:02:48]Swyx: Yeah, we'll try to publish them and publish something in the journals. Maybe you can grab a little snapshot for visual aid. Sounds good. [00:02:57]Alessio: Yeah. And yeah, talking about XGBoost, so a lot of people in the audience might know it's a gradient boosting library, probably the most popular out there. And it became super popular because many people started using it in, like, machine learning competitions. And I think there's like a whole Wikipedia page of like all state-of-the-art models. They use XGBoost and like, it's a really long list. When you were working on it, so we just had Tri Dao, who's the creator of FlashAttention on the podcast. And I asked him this question, it's like, when you were building FlashAttention, did you know that like almost any transformer-based model will use it? And so I asked the same question to you when you were coming up with XGBoost, like, could you predict it would be so popular or like, what was the creation process? And when you published it, what did you expect? We have no idea. [00:03:41]Tianqi: Like, actually, the original reason that we built that library is that at that time, deep learning just came out. Like that was the time where AlexNet just came out.
And one of the ambitious missions that myself and my advisor, Carlos Guestrin, had then is we wanted to think about, you know, try to test the hypothesis. Can we find alternatives to deep learning models? Because then, you know, there are other alternatives like, you know, support vector machines, linear models, and of course, tree-based models. And our question was, if you build those models and feed them with big enough data, because usually like one of the key characteristics of deep learning is that it's taking a lot [00:04:22]Swyx: of data, right? [00:04:23]Tianqi: So we will be able to get the same amount of performance. That's a hypothesis we're setting out to test. Of course, if you look at now, right, that's a wrong hypothesis, but as a byproduct, what we find out is that, you know, most of the gradient boosting library out there is not efficient enough for us to test that hypothesis. So I happen to have quite a bit of experience in the past of building gradient boosting trees and their variants. So XGBoost was kind of like a byproduct of that hypothesis testing. At that time, I'm also competing a bit in data science challenges, like I worked on KDDCup and then Kaggle kind of became bigger, right? So I kind of think maybe it's becoming useful to others. One of my friends convinced me to try to do a Python binding of it. That turned out to be like a very good decision, right. Usually when I build it, we feel like maybe a command line interface is okay. And now we have a Python binding, we have R bindings. And then I realized, you know, it started getting interesting. People started contributing different perspectives, like visualization and so on. So we started to push a bit more on building distributed support to make sure it works on any platform and so on. And even at that time point, when I talked to Carlos, my advisor, later, he said he never anticipated that we'll get to that level of success. And actually, why I pushed for gradient boosting trees, interestingly, at that time, he also disagreed. He thinks that maybe we should go for kernel machines then. And it turns out, you know, actually, we are both wrong in some sense, and deep neural networks were the king of the hill. But at least the gradient boosting direction got into something fruitful. [00:06:01]Swyx: Interesting. [00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of like coming up with it? And how much of it is collaborative with like other people that you're working with versus like trying to be, you know, obviously, in academia, it's like very paper-driven kind of research driven. [00:06:19]Tianqi: I would say the XGBoost improvement at that time point was more on like, you know, I'm trying to figure out, right. But it's combining lessons. Before that, I did work on some of the other libraries on matrix factorization. That was like my first open source experience. Nobody knew about it, because you'll find, likely, if you go and try to search for the package SVDFeature, you'll find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I'm trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM are much, much more collaborative in a sense that... But, of course, XGBoost has become bigger, right? So when we started that project myself, and then we have, it's really amazing to see people come in.
[00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of coming up with it? And how much of it is collaborative with other people that you're working with, versus, you know, obviously, in academia, it's very paper-driven, research-driven. [00:06:19]Tianqi: I would say the XGBoost improvement at that time point was more like, you know, me trying to figure things out, combining lessons. Before that, I did work on some other libraries on matrix factorization. That was like my first open source experience. Nobody knew about it, because if you go and try to search for the package SVDFeature, you'll likely find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I was trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM were much, much more collaborative, in a sense. But, of course, XGBoost has become bigger, right? I started that project myself, and then it's really amazing to see people come in. Michael, who was a lawyer and now works in the AI space as well, contributing visualizations. Now we have people from our community contributing different things. So XGBoost even today, right, it's a community of committers driving the project. So it's definitely something collaborative, moving forward on getting some of the things continuously improved for our community. [00:07:37]Alessio: Let's talk a bit about TVM too, because we got a lot of things to run through in this episode. [00:07:42]Swyx: I would say that at some point, I'd love to talk about this comparison between XGBoost, or tree-based type AI or machine learning, compared to deep learning, because I think there is a lot of interest around, I guess, merging the two disciplines, right? And we can talk more about that. I don't know where to insert that, by the way, so we can come back to it later. Yeah. [00:08:04]Tianqi: Actually, what I said, when we tested the hypothesis, the hypothesis is kind of, I would say, partially wrong, because the hypothesis we would want to test now is, can you run tree-based models on image classification tasks, where deep learning is certainly a no-brainer right [00:08:17]Swyx: now today, right? [00:08:18]Tianqi: But if you try to run it on tabular data, still, you'll find that most people opt for tree-based models. And there's a reason for that, in the sense that when you are looking at tree-based models, the decision boundaries are naturally rules that you're looking at, right? And they also have nice properties, like being agnostic to the scale of the input and being able to automatically compose features together. And I know there are attempts at building neural network models that work for tabular data, and I also sometimes follow them. I do feel like it's good to have a bit of diversity in the modeling space. Actually, when we were building TVM, we built cost models for the programs, and actually we are using XGBoost for that as well. I still think tree-based models are going to be quite relevant, because first of all, it's really easy to get them to work out of the box. And also, you will be able to get a bit of interpretability, and control, like monotonicity [00:09:18]Swyx: and so on. [00:09:19]Tianqi: So yes, it's still going to be relevant. I also sometimes keep coming back to think about, are there possible improvements that we can build on top of these models? And definitely, I feel like it's a space that can have some potential in the future. [00:09:34]Swyx: Are there any current projects that you would call out as promising in terms of merging the two directions? [00:09:41]Tianqi: I think there are projects that try to bring a transformer-type model to tabular data. I don't remember the specifics, but I think even nowadays, if you look at what people are using, tree-based models are still one of their toolkits. So I think maybe eventually it's not even a replacement, it will be just an ensemble of models that you can call. Perfect. [00:10:07]Alessio: Next up, about three years after XGBoost, you built this thing called TVM, which is now a very popular compiler framework for models. Let's talk about, so this came out about at the same time as ONNX. So I think it would be great if you could maybe give a little bit of an overview of how the two things work together. Because it's kind of like the model goes to ONNX, then goes to TVM. But I think a lot of people don't understand the nuances. Can we get a bit of a backstory on that?
[00:10:33]Tianqi: So actually, that's kind of ancient history. Before XGBoost, I worked on deep learning for two or three years. I got a master's before I started my PhD. And during my master's, my thesis focused on applying convolutional restricted Boltzmann machines to ImageNet classification. That is the thing I was working on. And that was before the AlexNet moment. So effectively, I had to handcraft NVIDIA CUDA kernels on, I think, a GTX 2070 card. It took me about six months to get one model working. And eventually, that model was not so good, and we should have picked a better model. But that was like ancient history that really got me into this deep learning field. And of course, eventually, we found it didn't work out. So in my master's, I ended up working on recommender systems, which got me a paper, and I applied and got a PhD. But I always wanted to come back to work on the deep learning field. So after XGBoost, I think I started to work with some folks on this particular MXNet. At that time, the frameworks like Caffe, Theano, PyTorch hadn't yet come out. And we were really working hard to optimize for performance on GPUs. At that time, I found it was really hard, even for NVIDIA GPUs. It took me six months. And then it's amazing to see, on different hardware, how hard it is to go and optimize code for the platforms that are interesting. So that got me thinking, can we build something more generic and automatic? So that I don't need an entire team of so many people to go and build those frameworks. So that's the motivation of starting to work on TVM. There was really too much machine learning engineering needed to support deep learning models on the platforms that we were interested in. I think it started a bit earlier than ONNX, but once ONNX got announced, it was in a similar time period. So overall, how it works is that with TVM, you will be able to take a subset of machine learning programs that are represented in what we call a computational graph. Nowadays, we can also represent loop-level programs ingested from your machine learning models. Usually, you have model formats like ONNX, or in PyTorch, they have the FX Tracer that allows you to trace the FX graph. And then it goes through TVM. We also realized that, well, yes, it needs to be more customizable, so it will be able to perform some of the compilation optimizations like fusing operators together, doing smart memory planning, and more importantly, generating low-level code. So that works for NVIDIA and is also portable to other GPU backends, even non-GPU backends [00:13:36]Swyx: out there. [00:13:37]Tianqi: So that's a project that has been my primary focus over the past few years. And it's great to see how it started from where I think we were the very early initiators of machine learning compilation. I remember one day during a visit, one of the students asked me, are you still working on deep learning frameworks? I told them that I'm working on ML compilation. And they said, okay, compilation, that sounds very ancient. It sounds like a very old field. And why are you working on this? And now it's starting to get more traction, like if you look at Torch Compile and other things. I'm really glad to see this field starting to pick up. And also we have to continue innovating here. [00:14:17]Alessio: I think the other thing that I noticed is, it's kind of like a big jump in terms of area of focus to go from XGBoost to TVM, it's kind of like a different part of the stack.
Why did you decide to do that? And I think the other thing about compiling to different GPUs and eventually CPUs too, did you already see some of the strain that models could have just being focused on one runtime, only being on CUDA, and how much of that went into it? [00:14:50]Tianqi: I think it's less about trying to get impact, more about wanting to have fun. I like to hack code, I had great fun hacking CUDA code. Of course, being able to generate CUDA code is cool, right? But now, after being able to generate CUDA code, okay, by the way, you can do it on other platforms, isn't that amazing? So it's more of that attitude that got me started on this. And also, I think when we look at different researchers, myself is more like a problem solver type. So I like to look at a problem and say, okay, what kind of tools do we need to solve that problem? So regardless, it could be building better models. For example, while we built XGBoost, we built certain regularizations into it so that it's more robust. It also means building system optimizations, writing low-level code, maybe trying to write assembly and build compilers and so on. So as long as they solve the problem, definitely go and try to do them together. And I also see it's a common trend right now. Like if you want to be able to solve machine learning problems, it's no longer at the algorithm layer, right? You kind of need to solve it from both the algorithm, data, and systems angles. And this entire field of machine learning systems, I think it's kind of emerging. And there's now a conference around it. And it's really good to see a lot more people starting to look into this. [00:16:10]Swyx: Yeah. Are you talking about ICML or something else? [00:16:13]Tianqi: So machine learning and systems, right? So not only machine learning, but machine learning and systems. So there's a conference called MLSys. It's definitely a smaller community than ICML, but I think it's also an emerging and growing community where people are talking about what are the implications of building systems for machine learning, right? And how do you go and optimize things around that and co-design models and systems together? [00:16:37]Swyx: Yeah. And you were area chair for ICML and NeurIPS as well. So you've just had a lot of conference and community organization experience. Is that also an important part of your work? Well, it's kind of expected for academics. [00:16:48]Tianqi: If I hold an academic job, I need to do services for the community. Okay, great. [00:16:53]Swyx: Your most recent venture in MLSys is going to the phone with MLC LLM. You announced this in April. I have it on my phone. It's great. I'm running Llama 2, Vicuna. I don't know what other models you offer. But maybe just kind of describe your journey into MLC. And I don't know how this coincides with your work at CMU. Is that some kind of outgrowth? [00:17:18]Tianqi: I think it's more like a focused effort that we want in the area of machine learning compilation. So it's kind of related to what we built in TVM. So we built TVM five years ago, right? And a lot of things happened. We built the end-to-end machine learning compiler that works, the first one that works. But then we learned a lot of lessons there. So then we are building a second iteration called TVM Unity. That allows ML engineers to quickly capture new models and on-demand build optimizations for them. And MLC LLM is kind of like an MLC project.
It's more like a vertically driven effort where we go and build tutorials and projects, like LLM solutions, to really show, like, okay, you can take machine learning compilation technology and apply it and bring something fun forward. Yeah. So yes, it runs on phones, which is really cool. But the goal here is not only making it run on phones, right? The goal is making it deploy universally. So we do run on Apple M2 Macs, the 70 billion models. Actually, on single-batch inference, more recently on CUDA, we get, I think, the best performance you can get out there already on 4-bit inference. Actually, as I alluded to earlier before the podcast, we just had a result on AMD. And on a single batch, actually, on the latest AMD GPU, a consumer card, we can get to about 80% of the 4090, NVIDIA's best consumer card out there. So it's not yet on par, but thinking about the diversity of what you can enable, compared to what you could previously get on that card, it's really amazing what you can do with this kind of technology. [00:19:10]Swyx: So one thing I'm a little bit confused by is that most of these models are in PyTorch, but you're running this inside TVM. I don't know. Was there any fundamental change that you needed to do, or was this basically the fundamental design of TVM? [00:19:25]Tianqi: So the idea is that, of course, it comes back to program representation, right? So effectively, TVM has this program representation called TVMScript that contains both computational graph and operator-level representations. So yes, initially, we do need to take a bit of effort to bring those models onto the program representation that TVM supports. Usually, there are a mix of ways, depending on the kind of model you're looking at. For example, for vision models and stable diffusion models, usually we can just do tracing that takes a PyTorch model onto TVM. That part is still being robustified so that we can bring more models in. On language model tasks, actually what we do is we directly build some of the model constructors and try to directly map from Hugging Face models. The goal is, if you have a Hugging Face configuration, we will be able to bring that in and apply optimizations on it. So one fun thing about model compilation is that your optimization doesn't happen only at the source language level, right? For example, if you're writing PyTorch code, you just go and try to use a better fused operator at a source code level. Torch compile might help you do a bit of things in there. In most model compilations, it not only happens at the beginning stage, but we also apply generic transformations in between, also through a Python API. So you can tweak some of that. So that part of optimization helps a lot in uplifting both performance and portability across environments. And another thing that we do have is what we call universal deployment. So if you get the ML program into this TVMScript format, where there are functions that take in tensors and output tensors, we will be able to have a way to compile it so that you can load the function in any of the language runtimes that TVM supports. So you could load it in JavaScript, and that's a JavaScript function that takes in tensors and outputs tensors. Likewise in Python, of course, and C++ and Java. So the goal there is really to bring the ML model to the language that people care about and be able to run it on a platform they like.
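As a rough illustration of that compile-once, load-anywhere flow, here is a sketch using TVM's classic tensor expression API in Python; this is a toy vector add rather than a full model, and exact APIs vary across TVM versions:

```python
import numpy as np
import tvm
from tvm import te

# Declare a vector-add computation in TVM's tensor expression language.
n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
B = te.placeholder((n,), name="B", dtype="float32")
C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")

# Compile for the local CPU; swapping the target string (e.g. "cuda",
# "metal", "vulkan", "webgpu") retargets the same computation.
s = te.create_schedule(C.op)
fadd = tvm.build(s, [A, B, C], target="llvm")

# Invoke through the TVM runtime; an exported library can similarly be
# loaded from the C++, Java, or JavaScript runtimes mentioned above.
dev = tvm.cpu()
a = tvm.nd.array(np.arange(8, dtype="float32"), dev)
b = tvm.nd.array(np.ones(8, dtype="float32"), dev)
c = tvm.nd.array(np.zeros(8, dtype="float32"), dev)
fadd(a, b, c)
print(c.numpy())  # [1. 2. 3. 4. 5. 6. 7. 8.]
```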
[00:21:37]Swyx: It strikes me that I've talked to a lot of compiler people, but you don't have a traditional compiler background. You're inventing your own discipline called machine learning compilation, or MLC. Do you think that this will be a bigger field going forward? [00:21:52]Tianqi: First of all, I do work with people working on compilation as well. So we're also taking inspiration from a lot of early innovations in the field. Like for example, for TVM initially, we took a lot of inspiration from Halide, which is an image processing compiler. And of course, since then, we have evolved quite a bit to focus on machine learning related compilation. If you look at some of our conference publications, you'll find that machine learning compilation is already kind of a subfield. So if you look at papers in both machine learning venues, the MLSys conference, of course, and also systems venues, every year there will be papers around machine learning compilation. And in the compiler conference called CGO, there's a C4ML workshop that is also kind of trying to focus on this area. So definitely it's already starting to gain traction and become a field. I wouldn't claim that I invented this field, but definitely I helped to work with a lot of folks there. And I try to bring a perspective, of course, trying to learn a lot from compiler optimizations, as well as trying to bring in knowledge of machine learning and systems together. [00:23:07]Alessio: So we had George Hotz on the podcast a few episodes ago, and he had a lot to say about AMD and their software. So when you think about TVM, are you still restricted in a way by the performance of the underlying kernel, so to speak? So if your target is like a CUDA runtime, you still get better performance, no matter like TVM kind of helps you get there, but then that level you don't take care of, right? [00:23:34]Swyx: There are two parts in here, right? [00:23:35]Tianqi: So first of all, there is the lower-level runtime, like the CUDA runtime. And then actually for NVIDIA, a lot of the moat comes from their libraries, like CUTLASS, cuDNN, right? Those library optimizations. And also for specialized workloads, actually you can specialize them. Because in a lot of cases you'll find that if you go and do benchmarks, it's very interesting. Like two years ago, if you tried to benchmark ResNet, for example, usually the NVIDIA library [00:24:04]Swyx: gives you the best performance. [00:24:06]Tianqi: It's really hard to beat them. But as soon as you start to change the model to something, maybe a bit of a variation of ResNet, not for the traditional ImageNet detections, but for latent detection and so on, there will be some room for optimization, because people sometimes overfit to benchmarks. People go and optimize things, right? So they overfit the benchmarks. So that's the largest barrier, like being able to get low-level kernel libraries, right? In that sense, the goal of TVM is actually that we try to have a generic layer to both, of course, leverage libraries when available, but also be able to automatically generate [00:24:45]Swyx: libraries when possible. [00:24:46]Tianqi: So in that sense, we are not restricted by the libraries that they have to offer. That's why we will be able to run on Apple M2 or WebGPU where there's no library available, because we are kind of automatically generating libraries. That makes it easier to support less well-supported hardware, right? For example, WebGPU is one example.
From a runtime perspective, AMD, I think, before, their Vulkan driver was not very well supported. Recently, they are getting good. But even before that, we were able to support AMD through this GPU graphics backend called Vulkan, which is not as performant, but it gives you decent portability across those [00:25:29]Swyx: hardware. [00:25:29]Alessio: And I know we got other MLC stuff to talk about, like WebLLM, but I want to wrap up on the optimizations that you're doing. So there are kind of four core things, right? Kernel fusion, which we talked a bit about in the FlashAttention episode and the tinygrad one, memory planning, and loop optimization. I think those are like pretty, you know, self-explanatory. But can you quickly explain [00:25:53]Swyx: those? [00:25:54]Tianqi: So there are kind of different things, right? Kernel fusion means that, you know, if you have an operator like a convolution, or in the case of a transformer, an MLP, you have other operators that follow it, right? You don't want to launch two GPU kernels. You want to be able to put them together in a smart way, right? And as for memory planning, it's more about, you know, hey, if you run Python code, every time you generate a new array, you are effectively allocating a new piece of memory, right? Of course, PyTorch and other frameworks try to optimize that for you, so there is a smart memory allocator behind the scenes. But actually, in a lot of cases, it's much better to statically allocate and plan everything ahead of time. And that's where a compiler can come in. First of all, actually for language models, it's much harder because of dynamic shapes. So you need to be able to do what we call symbolic shape tracing. So we have a symbolic variable that tells you, the shape of the first tensor is n by 12. And the shape of the third tensor is also n by 12. Or maybe it's n times 2 by 12. Although you don't know what n is, right, you will be able to know that relation and be able to use that to reason about fusion and other decisions. So besides this, I think loop transformation is quite important. And it's actually non-traditional. Originally, if you simply write code and you want to get performance, it's very hard. For example, you know, if you write a matrix multiply, the simplest thing you can do is the triple loop: for i, j, k: C[i][j] += A[i][k] * B[k][j]. But that code is 100 times slower than the best available code that you can get. So we do a lot of transformations, like being able to take the original code, trying to put things into shared memory, and making use of tensor cores, making use of memory copies, and all this. Actually, all these things, we also realize that, you know, we cannot do all of them. So we also make the ML compilation framework available as a Python package, so that people will be able to continuously improve that part of the engineering in a more transparent way. So we find that's very useful, actually, for us to be able to get good performance very quickly on some of the new models. Like when Llama 2 came out, we were able to go and look at the whole thing, here's the bottleneck, and we can go and optimize those.
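Spelled out, the naive loop he is contrasting against looks like this in Python; the point is not this exact code, but the gap between it and a tuned kernel (here approximated by the BLAS call behind NumPy):

```python
import numpy as np

def matmul_naive(A, B):
    # The textbook triple loop: C[i][j] += A[i][k] * B[k][j].
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

A = np.random.rand(64, 64).astype("float32")
B = np.random.rand(64, 64).astype("float32")

# Same answer, wildly different speed: a tuned kernel tiles the loops,
# stages data through caches or shared memory, and vectorizes, which is
# exactly the class of loop transformations a compiler can automate.
assert np.allclose(matmul_naive(A, B), A @ B, atol=1e-3)
```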
[00:28:10]Alessio: And then the fourth one being weight quantization. So everybody wants to know about that. And just to give people an idea of the memory saving, if you're doing FP32, it's like four bytes per parameter. Int8 is like one byte per parameter. So you can really shrink down the memory footprint. What are some of the trade-offs there? How do you figure out what the right target is? And what are the precision trade-offs, too? [00:28:37]Tianqi: Right now, a lot of people mostly use int4 for language models. So that really shrinks things down a lot. And more recently, actually, we started to think that, at least in MLC, we don't want to have a strong opinion on what kind of quantization we want to bring, because there are so many researchers in the field. So what we can do is we can allow developers to customize the quantization they want, but we still bring the optimal code for them. So we are working on this item called bring-your-own-quantization. In fact, hopefully MLC will be able to support more quantization formats. And definitely, I think it's an open field that's being explored. Can you bring more sparsity? Can you quantize activations as much as possible, and so on? And it's going to be something that's going to be relevant for quite a while. [00:29:27]Swyx: You mentioned something I wanted to double back on, which is most people use int4 for language models. This is actually not obvious to me. Are you talking about the GGML-type people, or even the researchers who are training the models also using int4? [00:29:40]Tianqi: Sorry, so I'm mainly talking about inference, not training, right? So when you're doing training, of course, int4 is harder, right? Maybe you could do some form of mixed precision for inference. I think int4 is kind of like, in a lot of cases, you will be able to get away with int4. And actually, that does bring a lot of savings in terms of the memory overhead, and so on. [00:30:09]Alessio: Yeah, that's great. Let's talk a bit about maybe GGML, then there's Mojo. How should people think about MLC? How do all these things play together? I think GGML is focused on model-level re-implementation and improvements. Mojo is a language, a superset of Python. You're more at the compiler level. Do you all work together? Do people choose between them? [00:30:32]Tianqi: So I think in this case, I think it's great to say the ecosystem becomes so rich with so many different ways. So in our case, GGML is more like you're implementing something from scratch in C, right? So that gives you the ability to go and customize each particular hardware backend. But then you will need to write your own CUDA kernels, and write them optimally for AMD, and so on. So the engineering effort is a bit more broadened in that sense. Mojo, I have not looked at the specific details yet. It's good to say, it's a language, right? I believe there will also be machine learning compilation technologies behind it. So it occupies an interesting place in there. In the case of MLC, our case is that we do not want to have an opinion on how, where, in which language people want to develop, deploy, and so on. And we also realize that actually there are two phases. We want to be able to develop and optimize your model. By optimization, I mean, really bring in the best CUDA kernels and do some of the machine learning engineering in there. And then there's a phase where you want to deploy it as a part of the app. So if you look at the space, you'll find that GGML is more like, I'm going to develop and optimize in the C language, right, and the other low-level languages they have. And Mojo is that you want to develop and optimize in Mojo, right? And you deploy in Mojo.
In fact, that's the philosophy they want to push for. In the MLC case, we find that if you want to develop models, the machine learning community likes Python. Python is a language that you should focus on. So in the case of MLC, we really want to enable not only defining your model in Python, that's very common, right? But also doing ML optimization, like engineering optimization, CUDA kernel optimization, memory planning, all those things in Python, so that it's customizable and so on. But when you do deployment, we realize that people want a bit of a universal flavor. If you are a web developer, you want JavaScript, right? If you're maybe an embedded systems person, maybe you'd prefer C++ or C or Rust. And people sometimes do like Python in a lot of cases. So in the case of MLC, we really want to have this vision of: you build a generic optimization in Python, then you deploy that universally onto the environments that people like. [00:32:54]Swyx: That's a great perspective and comparison, I guess. One thing I wanted to make sure that we cover is that I think you are one of this emerging set of academics that also very much focus on your artifacts of delivery. Of course. Something we've talked about with past guests, you know, being very focused on GitHub. And obviously you treated XGBoost like a product, you know? And then now you're publishing an iPhone app. Okay. Yeah. Yeah. What is your thinking about academics getting involved in shipping products? [00:33:24]Tianqi: I think there are different ways of making impact, right? Definitely, you know, there are academics that are writing papers and building insights for people, so that people can build products on top of them. In my case, I think in the particular field I'm working on, machine learning systems, I feel like really we need to be able to get it into the hands of people, so that really we see the problem, right? And we show that we can solve a problem. And it's a different way of making impact. And there are academics doing similar things. Like, you know, if you look at some of the people from Berkeley, right? Every few years, they will come up with big open source projects. Certainly, I think it's just a healthy ecosystem to have different ways of making impact. And I feel like really being able to do open source and work with the open source community is really rewarding, because we have a real problem to work on when we build our research. Actually, that research comes together and people are able to make use of it. And we also start to see interesting research challenges that we wouldn't otherwise see, right, if you're just trying to do a prototype and so on. So I feel like it's one interesting way of making impact, making contributions. [00:34:40]Swyx: Yeah, you definitely have a lot of impact there. And having experience publishing Mac stuff before, the Apple App Store is no joke. It is the hardest compilation, human compilation effort. So one thing that we definitely wanted to cover is running in the browser. You have a 70 billion parameter model running in the browser. That's right. Can you just talk about how? Yeah, of course. [00:35:02]Tianqi: So I think there are a few elements that need to come in, right? First of all, you know, we do need a MacBook, the latest one, like an M2 Max, because you need the memory to be big enough to cover that. So for a 70 billion model, it takes you about, I think, 50 gigabytes of RAM.
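As a quick sanity check on that figure, the back-of-the-envelope arithmetic is just parameter count times bytes per parameter; KV cache and runtime overhead come on top:

```python
params = 70e9  # a 70-billion-parameter model such as Llama-2-70B

for fmt, bytes_per_param in {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}.items():
    print(f"{fmt}: ~{params * bytes_per_param / 1e9:.0f} GB of weights")
# fp32: ~280 GB, fp16: ~140 GB, int8: ~70 GB, int4: ~35 GB
# With 4-bit weights plus KV cache and runtime overhead, a 70B model lands
# in the tens of gigabytes, consistent with the ~50 GB mentioned here.
```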
So the M2 Max, the upper version, will be able to run it, right? And it also leverages machine learning compilation. Again, what we are doing is the same, whether it's running on an iPhone, on server cloud GPUs, on AMD, or on a MacBook, we all go through that same MLC pipeline. Of course, in certain cases, maybe we'll do a bit of customization iteration for either one. And then it runs on the browser runtime, this package WebLLM. So effectively, what we do is we will take that original model and compile it to what we call WebGPU. And then WebLLM will pick it up. And WebGPU is this latest GPU technology that major browsers are shipping right now. So you can get it in Chrome already. It allows you to access your native GPUs from a browser. And then effectively, that language model is just invoking the WebGPU kernels through there. So actually, when Llama 2 came out, initially, we asked the question about, can you run 70 billion on a MacBook? That was the question we were asking. So first, we actually... Jin Lu, who is the engineer pushing this, he got 70 billion on a MacBook. We had a CLI version. So in MLC, you will be able to... That runs through a Metal accelerator. So effectively, you use the Metal programming language to get the GPU acceleration. So we found, okay, it works for the MacBook. Then we asked, we had a WebGPU backend. Why not try it there? So we just tried it out. And it's really amazing to see everything up and running. And actually, it runs smoothly in that case. So I do think there are some kind of interesting use cases already in this, because everybody has a browser. You don't need to install anything. I think it doesn't make sense yet to really run a 70 billion model in a browser, because you kind of need to be able to download the weights and so on. But I think we're getting there. Effectively, the most powerful models, you will be able to run on a consumer device. It's kind of really amazing. And also, in a lot of cases, there might be use cases. For example, if I'm going to build a chatbot that I talk to and it answers questions, maybe some of the components, like the voice-to-text, could run on the client side. And so there are a lot of possibilities of being able to have something hybrid that contains the edge component or something that runs on a server. [00:37:47]Alessio: Do these browser models have a way for applications to hook into them? So if I'm using, say, you can use OpenAI or you can use the local model. Of course. [00:37:56]Tianqi: Right now, actually, we are building... So there's an NPM package called WebLLM, right? So that you will be able to, if you want to embed it onto your web app, you will be able to directly depend on WebLLM and you will be able to use it. We are also having a REST API that's OpenAI compatible. So that REST API, I think, right now, actually runs on a native backend, so with a CUDA server it's faster to run on the native backend. But we also have a WebGPU version of it that you can go and run. So yeah, we do want to be able to have easier integrations with existing applications. And the OpenAI API is certainly one way to do that. Yeah, this is great.
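Because the API is OpenAI-compatible, the standard OpenAI Python client can be pointed at such a local server. A sketch, where the port and model name are placeholders to be replaced with whatever your MLC LLM server actually exposes:

```python
from openai import OpenAI  # pip install openai

# Point the client at the local server instead of api.openai.com.
# The base URL and model id below are hypothetical examples.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="Llama-2-7b-chat",  # placeholder model id
    messages=[{"role": "user", "content": "Hello from a locally served model!"}],
)
print(resp.choices[0].message.content)
```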
[00:38:37]Swyx: I actually did not know there's an NPM package that makes it very, very easy to try out and use. I want to actually... One thing I'm unclear about is the chronology. Because as far as I know, Chrome shipped WebGPU the same time that you shipped WebLLM. Okay, yeah. So did you have some kind of secret chat with Chrome? [00:38:57]Tianqi: The good news is that Chrome is doing a very good job of trying to have early releases. So although the official shipment of Chrome WebGPU is the same time as WebLLM, actually, you were able to try out WebGPU technology in Chrome earlier. There is an unstable version called Canary. I think as early as two years ago, there was a WebGPU version. Of course, it's getting better. So we had a TVM-based WebGPU backend two years ago. Of course, at that time, there were no language models. It was running on less interesting, well, still quite interesting models. And then this year, we really started to see it getting mature and the performance keeping up. So we had a more serious push of bringing the language model compatible runtime onto WebGPU. [00:39:45]Swyx: I think you agree that the hardest part is the model download. Has there been conversations about a one-time model download and sharing between all the apps that might use this API? That is a great point. [00:39:58]Tianqi: I think it's already supported in some sense. When we download the model, WebLLM will cache it in a special Chrome cache. So if a different web app uses the same WebLLM JavaScript package, you don't need to redownload the model again. So there is already something there. But of course, you have to download the model at least once to be able to use it. [00:40:19]Swyx: Okay. One more thing just in general before we're about to zoom out to OctoAI. Just the last question is, you're not the only project working on, I guess, local models. That's right. Alternative models. There's GPT4All, there's Ollama that just recently came out, and there's a bunch of these. What would be your advice to them on what's a valuable problem to work on? And what is just thin wrappers around ggml? Like, what are the interesting problems in this space, basically? [00:40:45]Tianqi: I think making the API better is certainly something useful, right? In general, one thing that we do try to push very hard on is this idea of easier universal deployment. So we are also looking forward to actually having more integration with MLC. That's why we're trying to build APIs like WebLLM and other things. So we're also looking forward to collaborating with all those ecosystems and working on support to bring in models more universally and be able to also keep up the best performance when possible in a more push-button way. [00:41:15]Alessio: So as we mentioned in the beginning, you're also the co-founder of OctoML. Recently, OctoML released OctoAI, which is a compute service, basically focused on optimizing model runtimes and acceleration and compilation. What has been the evolution there? So Octo started as kind of like a traditional MLOps tool, where people were building their own models and you helped them on that side. And then it seems like now most of the market is shifting to starting from pre-trained generative models. Yeah, what has been that experience for you, and how have you seen the market evolve? And how did you decide to release OctoAI? [00:41:52]Tianqi: One thing that we found out is that on one hand, it's really easy to go and get something up and running, right? But when you start to consider that there are so many availability and scalability issues, and even integration issues, it becomes kind of interesting and complicated. So we really want to make sure to help people to get that part easy, right?
And now a lot of things, if we look at the customers we talk to and the market, certainly generative AI is something that is very interesting. So that is something that we really hope to help elevate. And also building on top of technology we built to enable things like portability across hardware. And you will be able to not worry about the specific details, right? Just focus on getting the model out. We'll try to work on the infrastructure and other things that help on the other end. [00:42:45]Alessio: And when it comes to getting optimization on the runtime, I see, we run an early adopters community, and most enterprises' issue is how to actually run these models. Do you see that as one of the big bottlenecks now? I think a few years ago it was like, well, we don't have a lot of machine learning talent. We cannot develop our own models. Versus now it's like, there are these great models you can use, but I don't know how to run them efficiently. [00:43:12]Tianqi: That depends on how you define running, right? On one hand, it's easy to download from MLC, like you download it, you run it on a laptop. But then there are also different decisions, right? What if you are trying to serve a larger user request? What if that request changes? What if the availability of hardware changes? Right now it's really hard to get the latest hardware from NVIDIA, unfortunately, because everybody's trying to work on things using the hardware that's out there. So I think when the definition of run changes, there are a lot more questions around things. And also in a lot of cases, it's not only about running models, it's also about being able to solve problems around them. How do you manage your model locations, and how do you make sure that you get your model close to your execution environment more efficiently? So definitely a lot of engineering challenges out there. That we hope to elevate, yeah. And also, if you think about our future, definitely I feel like right now, given the technology and the kind of hardware availability we have today, we will need to make use of all the possible hardware available out there. That will include mechanisms for cutting down costs, bringing something to the edge and cloud in a more natural way. So I feel like we're still at a very early stage, but it's already good to see a lot of interesting progress. [00:44:35]Alessio: Yeah, that's awesome. I would love, I don't know how much we're going to go in depth into it, but what does it take to actually abstract all of this from the end user? You know, like they don't need to know what GPUs you run, what cloud you're running them on. You take all of that away. What was that like as an engineering challenge? [00:44:51]Tianqi: So I think there are engineering challenges there. In fact, first of all, you will need to be able to support all the kinds of hardware backends you have, right? On one hand, if you look at the NVIDIA libraries, you'll find, very surprisingly, not too surprisingly, most of the latest libraries work well on the latest GPU. But there are other GPUs out there in the cloud as well. So certainly being able to have the know-how and being able to do model optimization is one thing, right? Also infrastructure for being able to scale things up, locate models. And in a lot of cases, we do find that on typical models, it also requires kind of vertical iteration. So it's not about, you know, building a silver bullet and that silver bullet is going to solve all the problems.
It's more about, you know, we're building a product, we work with the users, and we find there are interesting opportunities at a certain point. Then our engineers will go and solve that, and it will automatically be reflected in the service. [00:45:45]Swyx: Awesome. [00:45:46]Alessio: We can jump into the lightning round until, I don't know, Sean, if you have more questions, or TQ, if you have more stuff you wanted to talk about that we didn't get a chance to [00:45:54]Swyx: touch on. [00:45:54]Alessio: Yeah, we have talked a lot. [00:45:55]Swyx: So, yeah. We always would like to ask, you know, do you have a commentary on other parts of AI and ML that is interesting to you? [00:46:03]Tianqi: So right now, I think one thing that we are really pushing hard for is this question about how far can we bring open source, right? I'm kind of like a hacker and I really like to put things together. So I think it's unclear what the future of AI looks like. On one hand, it could be possible that, you know, you just have a few big players, you just try to talk to those bigger language models and they can do everything, right? On the other hand, one of the things that we in academia are really excited about and pushing for, and that's one reason why I'm pushing for MLC, is: can we build something where you have different models? You have personal models that know the best movies you like, but you also have bigger models that maybe know more, and you get those models to interact with each other, right? And be able to have a wide ecosystem of AI agents that helps each person, while still being able to do things like personalization. Some of them can run locally, some of them, of course, run on a cloud, and how do they interact with each other? So I think that is a very exciting time where the future is yet undecided, but I feel like there is something we can do to shape that future as well. [00:47:18]Swyx: One more thing, which is something I'm also pursuing, which is, and this kind of goes back into predictions, but also back in your history, do you have any idea, or are you looking out for anything post-transformers as far as architecture is concerned? [00:47:32]Tianqi: I think, you know, in a lot of these cases, you can find there are already promising models for long contexts, right? There are state space models, where, like, you know, some of our colleagues, like Albert, who worked on the HiPPO models, right? And then there is an open source version called RWKV. It's like a recurrent model that allows you to summarize things. Actually, we are bringing RWKV to MLC as well, so maybe you will be able to see one of those models. [00:48:00]Swyx: We actually recorded an episode with one of the RWKV core members. It's unclear because there's no academic backing. It's just open source people. Oh, I see. So you like the merging of recurrent networks and transformers? [00:48:13]Tianqi: I do love to see this model space continue growing, right? And I feel like in a lot of cases, it's just that the attention mechanism is getting changed in some sense. So I feel like definitely there are still a lot of things to be explored here. And that is also one reason why we want to keep pushing machine learning compilation, because one of the things we are trying to push on is productivity for machine learning engineering, so that as soon as some of the models come out, we will be able to, you know, empower them onto those environments that are out there.
[00:48:43]Swyx: Yeah, it's a really good mission. Okay. Very excited to see that RWKV and state space model stuff. I'm hearing increasing chatter about that stuff. Okay. Lightning round, as always fun. I'll take the first one. Acceleration. What has already happened in AI that you thought would take much longer? [00:48:59]Tianqi: The emergence of this conversational chatbot ability is something that kind of surprised me before it came out. This is one piece that I feel originally I thought would take much longer, but yeah, [00:49:11]Swyx: it happens. And it's funny because like the original, like ELIZA chatbot was something that goes all the way back in time. Right. And then we just suddenly came back again. Yeah. [00:49:21]Tianqi: It's always interesting to think about, but with a kind of different technology [00:49:25]Swyx: in some sense. [00:49:25]Alessio: What about the most interesting unsolved question in AI? [00:49:31]Swyx: That's a hard one, right? [00:49:32]Tianqi: So I can tell you what I'm excited about. So I think that I have always been excited about this idea of continuous learning and lifelong learning in some sense. So how AI continues to evolve with the knowledge that's out there. It seems that we're getting much closer with all those recent technologies. So being able to develop systems, support, and be able to think about how AI continues to evolve is something that I'm really excited about. [00:50:01]Swyx: So specifically, just to double click on this, are you talking about continuous training? That's like a training. [00:50:06]Tianqi: I feel like, you know, training, adaptation, it's all similar things, right? You want to think about the entire life cycle, right? The life cycle of collecting data, training, fine-tuning, and maybe having your local context get continuously curated and fed into models. So I think all these things are interesting and relevant here. [00:50:29]Swyx: Yeah. I think this is something that people are really asking about, you know, right now we have moved a lot into the sort of pre-training phase and off-the-shelf, you know, model downloads and stuff like that, which seems very counterintuitive compared to the continuous training paradigm that people want. So I guess the last question would be for takeaways. What's basically one message that you want every listener, every person to remember today? [00:50:54]Tianqi: I think it's getting more obvious now, but one of the things that I always want to mention in my talks is that, you know, when you're thinking about AI applications, originally people think about algorithms a lot more, right? Algorithms and models are still very important. But usually when you build AI applications, it takes, you know, both the algorithm side, the system optimizations, and the data curation, right? So it takes a connection of so many facets to be able to bring together an AI system, and being able to look at it from that holistic perspective is really useful when we start to build modern applications. I think it's going to continue to be more important in the future. [00:51:35]Swyx: Yeah. Thank you for showing the way on this. And honestly, just making things possible that I thought would take a lot longer. So thanks for everything you've done. [00:51:46]Tianqi: Thank you for having me. [00:51:47]Swyx: Yeah. [00:51:47]Alessio: Thanks for coming on TQ. [00:51:49]Swyx: Have a good one. [00:51:49] Get full access to Latent Space at www.latent.space/subscribe

Les Cast Codeurs Podcast
LCC 298 - De l'IA à toutes les sauces

Les Cast Codeurs Podcast

Play Episode Listen Later Jul 24, 2023 103:52


In this summer episode, Guillaume, Emmanuel and Arnaud go through the early-summer news. Java, Rust and Go on the language side, Micronaut and Quarkus for frameworks, but also WebGPU, agility, DDD, surveys, lots of tools, and above all artificial intelligence in every flavor (in databases, in cars…). Recorded July 21, 2023. Episode download: LesCastCodeurs-Episode-298.mp3

News

Languages

The Go 1.21 release candidate supports WASM and WASI natively https://go.dev/blog/go1.21rc

StringBuilder or String concatenation? https://reneschwietzke.de/java/the-stringbuilder-advise-is-dead-or-isnt-it.html
- StringBuilder used to be the recommendation, notably because it created fewer objects. But the JVM has evolved, and the compiler or JIT replaces concatenation with efficient code.
- A few small exceptions: cold code (e.g. at startup time) that is still interpreted can still benefit from StringBuilder; another case is concatenation in loops, which the JIT may not be able to optimize; the "fluid" (chained) StringBuilder style is more efficient (inlined?).
- These rules do not change when objects are stringified in order to be concatenated.

GPT-4: not a revolution https://thealgorithmicbridge.substack.com/p/gpt-4s-secret-has-been-revealed
- Rumor: lots of secrecy; not one model with 1 trillion parameters, but 8 models of 220 billion parameters combined cleverly
- Researchers were expecting a breakthrough, but it is an evolution and not particularly new: the method was already implemented by researchers at Google (now at OpenAI)
- They held back the competition with these breakthrough rumors, but 8 LLaMAs may be able to rival GPT-4

Google's Open Source blog has an article on 5 myths (or not) about learning and using Rust https://opensource.googleblog.com/2023/06/rust-fact-vs-fiction-5-insights-from-googles-rust-journey-2022.html
- It takes more than 6 months to learn Rust: mostly false; a few weeks to 3-4 months max
- The Rust compiler is not as fast as we would like: true!
- Unsafe code and interop are the biggest challenges: false; it's rather macros, ownership/borrowing, and asynchronous programming
- Rust provides great compile-time error messages: true
- Rust code is high quality: true

InfoQ releases a new guide on Pattern Matching for Java's switch https://www.infoq.com/articles/pattern-matching-for-switch/
- Pattern matching supports all reference types
- The article discusses the null value case
- Using "guarded" patterns with the when keyword
- The importance of case order
- Pattern matching can also be used with a switch's default
- The scope of pattern variables
- One pattern per case label
- A single match-all case per switch block
- Exhaustiveness of type coverage
- Use of generics
- Error handling with MatchException

Libraries

Micronaut 4 released https://micronaut.io/2023/07/14/micronaut-framework-4-0-0-released/
- Minimum language versions: Java 17, Groovy 4 and Kotlin 1.8
- Support for the latest GraalVM
- Use of the GraalVM Reachability Metadata Repository to make Native Image easier to use
- Gradle 8
- New Expression Language, evaluated at compile time, not at runtime (for security and pre-compilation support reasons)
- Virtual Threads support
- New HTTP layer, eliminating reactive stack frames when the reactive mode isn't used
- Experimental support for io_uring and HTTP/3
- Annotation-based filters
- The HTTP Client now uses the Java HTTP Client
- Micronaut client and server generation from OpenAPI files
- YAML support no longer uses the SnakeYAML dependency (which had security issues)
- Transition to Jakarta completed
- And many other module updates
- InfoQ coverage https://www.infoq.com/news/2023/07/micronaut-brings-virtual-thread/

Quarkus 3.2 and LTS https://quarkus.io/blog/quarkus-3-2-0-final-released/ https://quarkus.io/blog/quarkus-3-1-0-final-released/ https://quarkus.io/blog/lts-releases/

Infrastructure

Red Hat now shares the sources of its distribution through its Customer Portal, impacting the community that builds on top of it https://almalinux.org/blog/impact-of-rhel-changes/
- Red Hat announced another massive change that affects all rebuilds and forks of Red Hat Enterprise Linux. Going forward, Red Hat will publish the source code for RHEL RPMs only behind its customer portal. Since all RHEL clones depend on the published sources, this once again disrupts the entire Red Hat ecosystem.
An analysis of Red Hat's choice regarding RHEL source code distribution https://dissociatedpress.net/2023/06/24/red-hat-and-the-clone-wars/

A response from Red Hat to the fires started by the announcement that RHEL sources would no longer be distributed publicly https://www.redhat.com/en/blog/red-hats-commitment-open-source-response-gitcentosorg-changes and a link to one of those fires, from a prominent member of the Ansible community https://www.jeffgeerling.com/blog/2023/im-done-red-hat-enterprise-linux

Oracle asks to keep Linux open and free https://www.oracle.com/news/announcement/blog/keep-linux-open-and-free-2023-07-10/
- Following the IBM/Red Hat announcement, Oracle asks to keep Linux open and free
- IBM doesn't want to publish the RHEL code because it has to pay its engineers, whereas Red Hat managed to sustain its business model for years
- The article comes back to CentOS, which IBM "killed" in 2020
- Oracle continues its efforts to keep Linux open and free
- Oracle Linux will remain compatible with RHEL up to version 9.2; after that, maintaining compatibility will be complicated
- Oracle is hiring Linux developers
- Oracle asks IBM to take Oracle's downstream and distribute it

SUSE forks RHEL https://www.suse.com/news/SUSE-Preserves-Choice-in-Enterprise-Linux/
- SUSE is the company behind Rancher, NeuVector, and SUSE Linux Enterprise (SLE)
- It announces a fork of RHEL
- A $10M investment in the project over the coming years
- Compatibility with RHEL and CentOS is assured

Web

Google sells its domain name service to Squarespace https://www.reddit.com/r/webdev/comments/14agag3/squarespace_acquires_google_domains/
- And it wasn't free, so we weren't supposed to be the product ;)
- Squarespace is an American company specializing in website building
- Squarespace has long been a Google Workspace reseller
- The sale should close in Q3 2023

A short introduction to WebGPU, in French https://blog.octo.com/connaissez-vous-webgpu/

Data

With the Large Language Model craze, there is more and more talk of vector databases, for storing "embeddings" (floating-point vectors that semantically represent text, or even images). An article explains that vectors are the new JSON in relational databases like PostgreSQL https://jkatz05.com/post/postgres/vectors-json-postgresql/
- The article talks in particular about pgvector, a PostgreSQL extension that adds support for vectors as a column type https://github.com/pgvector/pgvector
- Google Cloud announces precisely the integration of this vector extension into CloudSQL for PostgreSQL and AlloyDB for PostgreSQL https://cloud.google.com/blog/products/databases/announcing-vector-support-in-postgresql-services-to-power-ai-enabled-applications
- There is also a video, a Colab notebook, and a more technically detailed article using LangChain https://cloud.google.com/blog/products/databases/using-pgvector-llms-and-langchain-with-google-cloud-databases
- Elastic is also improving Lucene to use SIMD instructions to speed up vector computations (dot product, Euclidean distance, cosine similarity) https://www.elastic.co/fr/blog/accelerating-vector-search-simd-instructions
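To make the pgvector idea concrete, a minimal sketch of using it from Python, assuming a PostgreSQL instance with the extension available and the pgvector and psycopg packages installed; the table, vector size, and database name are illustrative:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect(dbname="demo", autocommit=True)
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)  # teaches psycopg to send/receive vector values

conn.execute(
    "CREATE TABLE IF NOT EXISTS items (id bigserial PRIMARY KEY, embedding vector(3))"
)
conn.execute("INSERT INTO items (embedding) VALUES (%s)", (np.array([1.0, 2.0, 3.0]),))

# '<->' is Euclidean distance; pgvector also provides '<=>' (cosine
# distance) and '<#>' (negative inner product) for nearest-neighbor search.
rows = conn.execute(
    "SELECT id FROM items ORDER BY embedding <-> %s LIMIT 5",
    (np.array([1.0, 1.0, 1.0]),),
).fetchall()
print(rows)
```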
Tooling

The StackOverflow 2023 survey https://survey.stackoverflow.co/2023/
- The survey was conducted among 90,000 developers in 185 countries
- More developers (+2%) work on site than last year (16% on site, 41% remote, 42% hybrid)
- Developers are also increasingly using artificial intelligence tools, with 70% of them saying they use them (44%) or plan to use them (25%) in their work
- The most popular programming languages are still JavaScript, Python and HTML/CSS
- The most popular web frameworks are Node, React, jQuery
- The most popular databases are PostgreSQL, MySQL, and SQLite
- The most popular operating systems are Windows, then macOS and Linux
- The most popular IDEs are Visual Studio Code, Visual Studio and IntelliJ IDEA

The different kinds of motions in Vim https://www.barbarianmeetscoding.com/boost-your-coding-fu-with-vscode-and-vim/moving-blazingly-fast-with-the-core-vim-motions/

JetBrains also joins the trend of AI assistants in the IDE https://blog.jetbrains.com/idea/2023/06/ai-assistant-in-jetbrains-ides/
- An integration with OpenAI, but also smaller LLMs specific to JetBrains
- An integrated chat to talk with the assistant, plus the ability to insert code snippets where the cursor is
- You can select code and ask the assistant to explain what that piece of code does, but also to suggest a refactoring, or to fix potential problems
- You can ask it to generate the JavaDoc of a method, a class, etc., or to suggest a method name (based on its content)
- Commit message generation
- You need a JetBrains AI account to get access

Some more or less well-known macOS commands https://saurabhs.org/advanced-macos-commands
- caffeinate: keep the Mac awake
- pbcopy / pbpaste: interact with the clipboard
- networkQuality: measure the speed of your internet access
- sips: manipulate / resize images
- textutil: convert Word, text, HTML files
- screencapture: take a screenshot
- say: give a voice to your commands

The ArgoCD community survey https://blog.argoproj.io/cncf-argo-cd-rollouts-2023-user-survey-results-514aa21c21df

An open-source, cross-platform API client for GraphQL, REST, WebSockets, Server-sent events and gRPC https://github.com/Kong/insomnia

Architecture

Modernizing an architecture with discovery, via Domain-Driven Discovery https://www.infoq.com/articles/architecture-modernization-domain-driven-discovery/?utm_source=twitter&utm_medium=link&utm_campaign=calendar
A very detailed article on modernizing your architecture using a Domain-Driven Discovery approach done in 5 steps:
- Frame the problem: clarify the problem you are solving, the people affected, the desired outcomes, and the solution constraints
- Analyze the current state: explore existing business processes and system architecture to establish a baseline for improvement
- Explore the future state: design a modernized architecture based on bounded contexts, set strategic priorities, evaluate options, and create future-state solutions
- Create a roadmap: create a plan to modernize the architecture over time, based on workstreams or desired outcomes
Recently, Sfeir launched its development blog at https://www.sfeir.dev/
plenty of technical articles on many topics: front-end, back-end, cloud, data, AI/ML, mobile
also trends and success stories
among the latest articles: Alan Turing, Local Storage in JavaScript, preparing for React certifications, and the impact of cybersecurity on the cloud

Demis Hassabis says he is working on an AI named Gemini that will surpass ChatGPT https://www.wired.com/story/google-deepmind-demis-hassabis-chatgpt/
Demis Hassabis, CEO of Google DeepMind, creator of AlphaGo and AlphaFold
Working on an AI named Gemini that would surpass OpenAI's ChatGPT
Similar to GPT-4, but with techniques borrowed from AlphaGo
Still in development; it will take several more months
A replacement for Bard?

Methodologies
Approaching agility through individuals' past (development) traumas https://www.infoq.com/articles/trauma-informed-agile/?utm_campaign=infoq_content&utm_source=twitter&utm_medium=feed&utm_term=culture-methods
We all carry development trauma that makes it hard to collaborate with others - a crucial part of working in agile software development.
Leading in a trauma-informed way is not practicing unsolicited psychotherapy, and it does not excuse destructive behaviors without addressing them.
Being more trauma-sensitive in your leadership can help everyone act more maturely and be more cognitively available, especially in emotionally difficult situations.
In trauma-informed workplaces, people pay more attention to their physical and emotional state. They also rely more on the power of intention, set goals in a less manipulative way, and manage to be empathetic without taking ownership of other people's problems.

Law, society and organization
Mercedes is adding artificial intelligence to its cars https://azure.microsoft.com/en-us/blog/mercedes-benz-enhances-drivers-experience-with-azure-openai-service/
A three-month beta test program for now
"Hey Mercedes" voice assistance
Lets you chat with the car to find your way, put together a recipe, or simply have a conversation
They are working on plugins to book a restaurant or buy movie tickets

Free software vs open source in the context of artificial intelligence, by Sacha Labourey https://medium.com/@sachalabourey/ai-free-software-is-essential-to-save-humanity-86b08c3d4777
there is a lot of talk about AI and open source, but the dimension of end-user control is missing
Stallman created the FSF out of fear of the notion of humans augmented by software controlled by others (brain implants, etc.)
hence the GPL and its virality, which propagates the ability to see and modify the code you run
in the AI debate, it is not only open source (breaking the oligopoly) but also free software that is at stake

The madness of the European Cyber Resilience Act (CRA) https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act
Within the EU, the Cyber Resilience Act (CRA) is now making its way through the legislative processes (with a key vote due on July 19, 2023).
This law will apply to a wide range of software (and hardware with embedded software) in the EU.
The intention of this regulation is good (and arguably long overdue): making software much more secure.
The CRA takes a binary yes/no approach and treats everyone the same way.
The CRA would regulate open-source projects unless they have "a fully decentralised development model". But real OSS models are complex mixes of pure open source and software vendors.
Commercial companies and open-source projects will have to be much more careful about which contributors can work on the code, what funding they take, and which patches they can accept.
Some of the obligations are practically impossible to meet, for example the obligation to "deliver a product without known exploitable vulnerabilities".
The CRA requires disclosing serious unpatched and exploited vulnerabilities to ENISA (an EU institution) within a window measured in hours, before they are fixed (completely at odds with security best practices).
Once again, a good idea in principle, but implemented so badly it risks doing a lot of damage.

Octave Klaba, with his brother Miro and the Caisse des Dépôts, are finalizing the creation of Synfonium, which will now acquire 100% of Qwant and 100% of Shadow. Synfonium is 75% owned by Jezby Venture & Deep Code and 25% by the CDC. https://twitter.com/i/web/status/1673555414938427392
One of Synfonium's roles is to build the critical mass of B2C & B2B users and customers who will be able to use all these free and paid services
It will include the search engine, the free services, the collaborative suite, and social login, but also the services of their tech partners
The goal is to build an EU SaaS cloud platform that respects our European values and laws

Yann LeCun: "Artificial intelligence will amplify human intelligence" https://www.europe1.fr/emissions/linterview-politique-dimitri-pavlenko/yann-lecun-li[…]gence-artificielle-va-amplifier-lintelligence-humaine-4189120

Conferences
The list of conferences comes from Developers Conferences Agenda/List by Aurélie Vache and contributors:
September 2-3, 2023: SRE France SummerCamp - Chambéry (France)
September 6, 2023: Cloud Alpes - Lyon (France)
September 8, 2023: JUG Summer Camp - La Rochelle (France)
September 14, 2023: Cloud Sud - Remote / Toulouse (France)
September 18, 2023: Agile Tour Montpellier - Montpellier (France)
September 19-20, 2023: Agile en Seine - Paris (France)
September 19, 2023: Salon de la Data Nantes - Nantes (France) & Online
September 21-22, 2023: API Platform Conference - Lille (France) & Online
September 22, 2023: Agile Tour Sophia Antipolis - Valbonne (France)
September 25-26, 2023: BIG DATA & AI PARIS 2023 - Paris (France)
September 28-30, 2023: Paris Web - Paris (France)
October 2-6, 2023: Devoxx Belgium - Antwerp (Belgium)
October 6, 2023: DevFest Perros-Guirec - Perros-Guirec (France)
October 10, 2023: ParisTestConf - Paris (France)
October 11-13, 2023: Devoxx Morocco - Agadir (Morocco)
October 12, 2023: Cloud Nord - Lille (France)
October 12-13, 2023: Volcamp 2023 - Clermont-Ferrand (France)
October 12-13, 2023: Forum PHP 2023 - Marne-la-Vallée (France)
October 19-20, 2023: DevFest Nantes - Nantes (France)
October 19-20, 2023: Agile Tour Rennes - Rennes (France)
October 26, 2023: Codeurs en Seine - Rouen (France)
October 25-27, 2023: ScalaIO - Paris (France)
October 26-27, 2023: Agile Tour Bordeaux - Bordeaux (France)
October 26-29, 2023: SoCraTes-FR - Orange (France)
November 10, 2023: BDX I/O - Bordeaux (France)
November 15, 2023: DevFest Strasbourg - Strasbourg (France)
November 16, 2023: DevFest Toulouse - Toulouse (France)
November 23, 2023: DevOps D-Day #8 - Marseille (France)
November 30, 2023: PrestaShop Developer Conference - Paris (France)
November 30, 2023: WHO run the Tech - Rennes (France)
December 6-7, 2023: Open Source Experience - Paris (France)
December 7, 2023: Agile Tour Aix-Marseille - Gardanne (France)
December 8, 2023: DevFest Dijon - Dijon (France)
December 7-8, 2023: TechRocks Summit - Paris (France)

Contact us
To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs
Contact us via Twitter https://twitter.com/lescastcodeurs
Record a crowdcast or a crowdquestion
Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs
All episodes and all info at https://lescastcodeurs.com/

No me da la vida
2.3 - Talking about Big Tech with Adriana Carvajal (@adri.zip)

No me da la vida

Play Episode Listen Later Jun 30, 2023 101:51


In episode 2.3 we talk about the Big Tech companies, React's 10th anniversary, the StackOverflow survey results, Zig, Mojo (picón), the would-be successor to Python, a reflection on Apple's Vision Pro, the State of CSS, Notion Projects (the new JIRA), Figma's dev mode, the Baseline project, Angular v16, WebGPU, oLaunchers, the minimalist phones, Pixi.js and ReactPy, among many other things

Les Cast Codeurs Podcast
LCC 296 - Interview Google IA IA I/O 2023

Les Cast Codeurs Podcast

Play Episode Listen Later May 25, 2023 104:45


In this episode, Antonio, Emmanuel and Guillaume go over the news and announcements from Google I/O 2023: new Pixel phones that fold (or don't), and above all artificial intelligence from floor to ceiling! Whether in Android, Google Workspace or Google Cloud, a ton of products are getting an AI boost. Guillaume, Antonio and Emmanuel also discuss the impact they see on AI, how Large Language Models are refined and why we make them hallucinate, and the subtleties of sign language.

Recorded on May 23, 2023
Episode download: LesCastCodeurs-Episode-296.mp3

Google I/O 2023
Website: https://io.google/2023/
Main keynote: https://io.google/2023/program/396cd2d5-9fe1-4725-a3dc-c01bb2e2f38a/
Developer keynote: https://io.google/2023/program/9fe491dd-cadc-4e03-b084-f75e695993ea/
10-minute video summarizing all the announcements: https://www.youtube.com/watch?v=QpBTM0GO6xI&list=TLGGCy91ScdjTPYxNjA1MjAyMw
Video of all the technical sessions: https://io.google/2023/program/?q=technical-session
Google I/O took place 10 days ago in California, in the Shoreline amphitheater near the Google campus. Only 2,000 people on site, with a chat and an online game to follow remotely.
The online game I/O Flip was built with Flutter, Dart, Firebase and Cloud Run, with all graphic assets generated by generative AI https://blog.google/technology/ai/google-card-game-io-flip-ai/

Pixels galore!
Details on the design of the new devices: https://blog.google/products/pixel/google-pixel-fold-tablet-7a-design/
Pixel Fold
Article: https://blog.google/products/pixel/google-pixel-fold/
Google's first foldable phone (after Samsung and Oppo)
A screen on the outside, and a large foldable screen inside
Handy for translation, where a conversation can be shown translated into one language on one screen and into the other language on the other
Creative uses of the fold: "laptop" mode, selfies, resting the device flat for night shots
However... not available in France, and still almost €1,900!
Pixel Tablet
Article: https://blog.google/products/pixel/google-pixel-tablet/
A nice 11-inch tablet, with a charging dock that has a built-in speaker
Tensor G2 processor, built-in Chromecast
A bit like the Google Nest Hub Max, but with a detachable screen
A practical case with a built-in kickstand that doesn't prevent charging the tablet on the dock
Docked, it works like the Google Home App screen; as soon as you undock it, it switches to multi-user mode, each person with their own profile

Pixel 7a
Article: https://blog.google/products/pixel/pixel-7a-io-2023/
6-inch screen
Triple camera (wide angle, main, and a front camera for selfies)
€509
Magic Eraser to remove unwanted things from photos, Magic Unblur to sharpen blurry photos, Real Tone to render darker skin tones more naturally

Android
What's new in Android: https://blog.google/products/android/android-updates-io-2023/
In Messages, Magic Compose: in conversations, AI helps us compose messages in different styles (more professional, more fun, in the style of Shakespeare)
Android 14 should arrive later this year, with more customization options (generative-AI wallpapers, emoji wallpapers, matching colors, 3D wallpapers from your photos) https://blog.google/products/android/new-android-features-generative-ai/
StudioBot: a chatbot integrated into Android Studio to help develop Android apps https://io.google/2023/program/d94e89c5-1efa-4ab2-a13a-d61c5eb4e49c/
800 million users have switched to RCS for messaging
50 Android apps adapted for foldables https://blog.google/products/android/android-app-redesign-tablet-foldable/
Wear OS 4 will add backup and restore when changing watches, among other new features https://blog.google/products/wear-os/wear-os-update-google-io-2023/
800 free TV channels in Google TV on Android and in the car
Android Auto will be available in 200 million cars https://blog.google/products/android/android-auto-new-features-google-io-2023/
Waze available globally on the Play Store in all cars with Android Auto

Google Maps
Article: https://blog.google/products/maps/google-maps-updates-io-2023/
Maps serves 20 billion km of directions every day
Immersive View for Routes in 15 cities: Amsterdam, Berlin, Dublin, Florence, Las Vegas, London, Los Angeles, Miami, New York, Paris, San Francisco, San Jose, Seattle, Tokyo and Venice
Developers can integrate with it and add 3D augmentations and markers

Google Photos
Magic Editor article: https://blog.google/products/photos/google-photos-magic-editor-pixel-io-2023/
An AI-boosted Magic Editor to improve photos, by moving people, filling in cropped-out parts, or making the sky prettier
Possibly limited to Pixel phones at first

Experimental projects
Project Starline (a screen with a 3D camera that renders your interlocutor in 3D as if they were sitting across from you) has been improved to take up less space https://blog.google/technology/research/project-starline-prototype/
Universal Translator: a new experiment in automatic dubbing and translation, with lip-sync
Project Tailwind: a kind of notebook into which you can add all your documents from Drive, ask questions about their content, get summaries, and brainstorm on those topics https://thoughtful.sandbox.google.com/about
MusicLM: a large language model that generates music from a text prompt (waitlist sign-up) https://blog.google/technology/ai/musiclm-google-ai-test-kitchen/
Project Gameface: using facial expressions to drive a mouse and computer, for people who have lost mobility https://blog.google/technology/ai/google-project-gameface/
VisualBlocks: a drag-and-drop interface for experimenting with model development for TensorFlow Lite and TensorFlow.js https://visualblocks.withgoogle.com/
MakerSuite: for tinkerers and developers https://makersuite.google.com/ https://developers.googleblog.com/2023/05/palm-api-and-makersuite-moving-into-public-preview.html

Search Labs
Article: https://blog.google/products/search/generative-ai-search/
Experiments to bring generative AI into Google Search
Searching with more complex phrasing, with Bard-like answers integrating links and suggestions for related searches
But also serving better-targeted ads
You can sign up for Search Labs to try this new experience, initially only in English and only in the US
Integrations with Google Shopping to suggest and filter products matching the query
Image search with Google Lens: 12 billion visual searches per month

PaLM and Bard
Announcement of the PaLM 2 LLM used in Bard and in Google Cloud https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
PaLM 2 is being integrated into 25 Google products
It will support 100 languages (for now only English, Japanese and Korean, with the 40 most spoken languages coming by the end of the year)
Now available in 180 countries... except Europe!!!
Improved reasoning capabilities
Can code in about twenty programming languages, including Groovy
Different model sizes: Gecko, Otter, Bison and Unicorn, though the parameter counts are not disclosed, as with OpenAI's GPT-4
Usable for queries and for chat
Fine-tuned derivative models: Med-PaLM 2, trained on medical knowledge and the visual analysis of X-rays, and Sec-PaLM, trained on cybersecurity use cases, to help detect malicious scripts and attack vectors
Sundar Pichai also announced that Google is already working on the next evolution of its LLMs, a model called Gemini. Few details so far, except that it will be multimodal (in particular combined image-and-text search).
Partnership with and integration of Adobe Firefly into Bard to generate images https://blog.adobe.com/en/publish/2023/05/10/adobe-firefly-adobe-express-google-bard

Duet AI for Google Workspace
Article: https://workspace.google.com/blog/product-announcements/duet-ai
In Gmail and Docs, it offers to help you draft your emails and documents
An extension of Smart Compose that can generate entire emails, improve style, fix grammar, and avoid text repetition
In Docs, new smart chips for adding variables and templates
In Slides, adding AI-generated images
Prompts in Sheets to generate a draft table
In Google Meet, the ability to create a custom background image with generative AI
These improvements are part of Workspace Labs, which has a waitlist you can join https://workspace.google.com/labs-sign-up/

Google Cloud
Generative AI integration everywhere https://cloud.google.com/blog/products/ai-machine-learning/google-cloud-launches-new-ai-models-opens-generative-ai-studio
New A3 VMs with Nvidia H100 GPUs, ideal for training machine-learning models, with 26 exaFlops of performance https://cloud.google.com/blog/products/compute/introducing-a3-supercomputers-with-nvidia-h100-gpus
Three new LLMs in Vertex AI: Imagen (private preview) for image generation, Codey for code generation, and Chirp for speech generation, supporting 100 languages with 2 billion voice parameters
Model Garden: machine-learning models, including external and open-source ones
Embeddings added for text and images
RLHF, Reinforcement Learning from Human Feedback, soon to be integrated to extend Vertex AI tuning and prompt design with a human feedback loop
Generative AI Studio to test zero-shot, one-shot and multi-shot prompts

Duet AI for Google Cloud https://cloud.google.com/blog/products/application-modernization/introducing-duet-ai-for-google-cloud
Code assistance in VS Code, and soon in the JetBrains IDEs, via the Cloud Code plugin, and in Cloud Workstations.
A chat integrated into the IDEs, like a companion for discussing architecture or finding the commands to run for your project
Codey's code model works across about twenty programming languages, but a fine-tuned model was trained on all of the Google Cloud documentation, so it can particularly help with the Google Cloud APIs or the gcloud command line
Duet AI is also in AppSheet, the low/no-code platform, where you can chat with a chatbot to generate an AppSheet application
What's new in Firebase https://firebase.blog/posts/2023/05/whats-new-at-google-io

Web
Article: https://developers.googleblog.com/2023/05/io23-developer-keynote-recap.html
Flutter 3.10 and Dart 3 https://io.google/2023/program/7a253260-3941-470b-8a4d-4253af000119/
WebAssembly https://io.google/2023/program/1d176349-7cf8-4b51-b816-a90fc9d7d479/
WebGPU https://io.google/2023/program/0da196f5-5169-43ff-91db-8762e2c424a2/
Baseline https://io.google/2023/program/528a223c-a3d6-46c5-84e4-88af2cf62670/ https://web.dev/baseline/

Contact us
To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs
Contact us via Twitter https://twitter.com/lescastcodeurs
Record a crowdcast or a crowdquestion
Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs
All episodes and all info at https://lescastcodeurs.com/

Voices of VR Podcast – Designing for Virtual Reality
#1213: Primer on WebGPU & Bringing High-Performance 3D Graphics and Parallel Compute to the Web

Voices of VR Podcast – Designing for Virtual Reality

Play Episode Listen Later May 19, 2023 74:16


WebGPU shipped in Chrome 113, which brings high-performance 3D graphics and parallel compute capabilities to the web. I was able to chat with Google Chrome software engineer Brandon Jones, who is a W3C specification editor for both WebGPU and the WebXR Device API. We talk about the history of WebGPU, some of his speculations as to how Apple may be actively working on supporting both WebGPU and WebXR (spec editor Ada Rose Cannon works at Apple), the future of WebXR, the new WebGPU Shading Language (WGSL), nascent ecosystem support for WebGPU from Babylon.js, three.js, and PlayCanvas, and some of the AI and machine learning capabilities that will become available on the web, which Jones refers to as the "compatibility layer for the world's computing devices." It'll be open standards like WebGPU, WebXR, glTF, and WebAssembly that start to define what an open and interoperable metaverse might look like, and WebGPU will start to close the gap on bringing the web closer to native performance, though Jones believes the web will always lag somewhat behind, trading off performance for more interoperability and cross-compatibility on a broader spectrum of devices. A minimal compute-shader sketch follows the links below.

Here are some links with more information on WebGPU:
Chrome Ships WebGPU
Introducing WebGPU: Unlocking modern GPU access for JavaScript, from Google I/O, featuring Jones
WebGPU Samples
Andy McClure's Co-Host post on WebGPU history: "I want to talk about WebGPU"
Compute.toys - WebGPU and WGSL examples in the vein of Shadertoy
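To make the "parallel compute on the web" idea concrete, here is a minimal sketch (ours, not from the episode) that doubles an array of floats in a WGSL compute shader. It assumes a browser with WebGPU enabled (Chrome 113+) and, for TypeScript compilation, the @webgpu/types declarations; the buffer sizes and workgroup count are arbitrary illustration choices.

```typescript
// Minimal WebGPU compute sketch: double every element of a Float32Array on
// the GPU, then read the result back to JavaScript.
async function doubleOnGpu(input: Float32Array): Promise<Float32Array> {
  const adapter = await navigator.gpu?.requestAdapter();
  if (!adapter) throw new Error("WebGPU is not available in this browser");
  const device = await adapter.requestDevice();

  // WGSL compute shader: one invocation per array element, 64 per workgroup.
  const module = device.createShaderModule({
    code: `
      @group(0) @binding(0) var<storage, read_write> data: array<f32>;
      @compute @workgroup_size(64)
      fn main(@builtin(global_invocation_id) id: vec3<u32>) {
        if (id.x < arrayLength(&data)) {
          data[id.x] = data[id.x] * 2.0;
        }
      }`,
  });

  // Storage buffer holding the data, plus a mappable buffer for readback.
  const storage = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
    mappedAtCreation: true,
  });
  new Float32Array(storage.getMappedRange()).set(input);
  storage.unmap();

  const readback = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  });

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: { module, entryPoint: "main" },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  // Record the compute pass and the copy to the readback buffer, then submit.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(Math.ceil(input.length / 64));
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
  device.queue.submit([encoder.finish()]);

  await readback.mapAsync(GPUMapMode.READ);
  return new Float32Array(readback.getMappedRange().slice(0));
}
```

Calling doubleOnGpu(new Float32Array([1, 2, 3])) resolves to [2, 4, 6]; the same pipeline / bind group / command encoder shape carries over to WebGPU render pipelines.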

Continuous Delivery
Game programming in the era of JavaScript and WebGPU

Continuous Delivery

Play Episode Listen Later May 10, 2023 64:56


In this episode we explore the differences between developing a video game and a web app. We focus, for example, on how video games are distributed as native binaries while web apps are deployed, and how that affects their architecture and performance. We also discuss the most useful tools for game development, including creation tools and "nocode" and "lowcode" solutions. We then analyze the technical aspects crucial to game development, such as update loops, frame rate, and resource management. And we don't miss a taste of WebGPU and WebGL, and how these technologies are changing, or could change, the development and distribution of video games.
With: Edoardo Dusi, Paolo Mainardi and Paolo Pustorino
/* Newsletter & Telegram */
https://landing.sparkfabrik.com/continuous-delivery-newsletter
https://t.me/continuous_delivery
/* Links */
https://godotengine.org/
https://unity.com/
https://gamemaker.io/en
https://illiteratecodegames.itch.io/powers-in-the-basement

Kurz informiert – die IT-News des Tages von heise online
Kurz informiert from May 3, 2023 by heise online

Kurz informiert – die IT-News des Tages von heise online

Play Episode Listen Later May 3, 2023


Today's stories: press freedom, Monopoly Market, WebGPU, and a Nokia drone

Science Faction Podcast
Episode 449: Pigeons, Podcasts, Putting

Science Faction Podcast

Play Episode Listen Later Apr 19, 2023 79:16


This episode contains: Welcome to the Ben Lawless podcast. Lawless: The Ben Story. Y'all seen the movie Lawless? Ben hasn't yet. Was it good? Devon's not here tonight, because of... dinner? With his wife? We still would love a tour of the Mighty Coconut offices. They're doing great work. Steven pitches a series of Walkabout Minigolf courses to Mighty Coconut. Wouldn't you love to play minigolf through the backlots of scifi movies? We pitch minigolf courses based on alien invasions and Buck Rogers. Did you know that Mighty Coconut made Pigeon: Impossible? Which Spies in Disguise is based on? What if there was a minigolf course based on Pigeon: Impossible? Steven is pivoting from podcasts to putting, and pigeons. Ben needs a vacation from his vacation, amirite? Ben's son wanted to watch Speed Racer 2. Ben's heart broke to tell him the truth. Emile Hirsch wants to do another Speed Racer movie! Get the Wachowskis onboard. http://screenrant.com/emile-hirsch-speed-racer-return-hopes-response/ Ben is SO STOKED about the Apple TV+ Speed Racer show helmed by J.J. Abrams. Let's make the Speed Racer Extended Universe a thing.

Today in the Weird Wide Web: Chrome will support WebGPU! What the hell does that mean? WebGL is cool and all, but WebGPU is going to revolutionize graphics on the web. Remember how Apple told Adobe "No Thank You" about putting Flash on iPhone? Ben does. Steven wonders why the name Mozilla sounds so familiar... Hint: he uses Firefox. Sometime in May, WebGPU is coming to Chrome! Ben is so excited. Ben is telling you: Arc is the best web browser ever. Steven and Ben argue whether Bing is a browser. Ben has worked in web development for the last 14 years. Feels like 40. Ben called it years ago: Microsoft finally stopped making their own browser engine. https://arstechnica.com/gadgets/2023/04/chrome-113-will-enable-webgpu-a-modern-low-overhead-graphics-api-for-the-web/

Symphony of Scent: Making sense of scents: deciphering our sense of smell. Remember the five senses? They're cool, right? What is smell? When do things start and stop smelling? Steven wonders. Scientists have created the first 3D picture of how an odor molecule activates an odorant receptor. "We need to see it so we can science it!" - Steven. Smells are somehow like hitting keys on a piano to produce a chord. Devon could explain it. https://www.sciencedaily.com/releases/2023/03/230315132416.htm

Science Fiction: The Big Door Prize on Apple TV+ is really good. Is it scifi? Ben thinks so. If you were in The Big Door Prize, would you want to hear your "life potential?" Getting big Tales from the Loop vibes from The Big Door Prize. Are there similarities between The Big Door Prize and Machine of Death? Steven really enjoyed the ending of Shrinking on Apple TV+. The Bad Batch's third season will be its final. We finally talk about Star Wars Celebration this episode. THREE NEW STAR WARS MOVIES?!?!?! WHAT?!?!?! The Filoni film is going to wrap up the Mandoverse? Hate that term btw. So... Grogu trained with Luke for 2 years?! We are digging the trailer for Star Wars Visions Season 2. The claymation looks incredible. Ben finally watched the first Hotel Transylvania. Cool flick! We spoil Star Trek: Picard's 3x09 Vōx because you have to spoil it to talk about it. Now we know why Ro Laren didn't use the transporters!

Pre-pod Patreon-only: Ben monologued about sleep for 49 minutes.

Minified: Web Dev News
S2E27: WebGPU: The Future of Web Graphics Coming Soon to a Chrome Browser Near You!

Minified: Web Dev News

Play Episode Listen Later Apr 17, 2023 10:16


In this episode, we talk Next.js, WebGPU, updates to Rome and Tailwind CSS. Stay tuned!

Links to the sources:
Next.js 13.3: https://nextjs.org/blog/next-13-3
Chrome and WebGPU: https://developer.chrome.com/blog/webgpu-release/
Rome v12: https://rome.tools/blog/2023/03/28/rome12/
Tailwind CSS v3.3: https://tailwindcss.com/blog/tailwindcss-v3-3

Feel free to reach out to me on Twitter @Nuallian.
Edited by Michal Fecko
Powered by Sudolabs: https://sudolabs.com/

Zavtracast (Завтракаст)
Zavtracast No. 280 - Three Toxic Bacteria

Zavtracast (Завтракаст)

Play Episode Listen Later Apr 15, 2023 315:56


Zavtracast returns with another "short" episode: a five-hour adventure, in and out, this will be quick! We discuss lots of interesting new releases, trailers, news, rumors, and all of that. Why do we need WebGPU? What will the new Star Wars be about? What is it like to play Destiny 2 in 2023? Your three permanent Zavtracast hosts, Dima, Timur and Maxim, answer all these questions. Subscribe and like, and don't forget to hit the bell on our YouTube: https://youtube.com/zavtracast If you want to support us from Russia, subscribe to us on Boosty: https://boosty.to/zavtracast. If you are abroad, you can also subscribe to us on Patreon: https://patreon.com/zavtracast. Subscribe to the hosts' Telegram channels: Radio Timur - https://t.me/radiotimur Fotodushnila - https://t.me/dushovato Uncle Zombak's Tales - https://t.me/zombaktales Show notes Zavtracast news: we released a special with David Perry and a new episode of the "Magnetic Field" podcast. Rumors of the week: Sony will release a new handheld (with a caveat), Sega is preparing remakes of Jet Set Radio and Persona 3, and Redfall is in trouble. Ubisoft introduced the Ubisoft+ subscription for Xbox. We discuss the new Star Wars and the series and films accompanying the franchise. Suicide Squad has been delayed by a year, and Elon Musk plans to monetize content on Twitter. How different countries adopt technology, with India and Turkey as examples […] The post Zavtracast No. 280 - Three Toxic Bacteria first appeared on Zavtracast.

Linux Action News
Linux Action News 288

Linux Action News

Play Episode Listen Later Apr 13, 2023 14:24


A classic gadget gets a Linux-powered new lease on life, the next project getting Rusty, great news for Btrfs users, and more.

Google Workspace Recap
RIP 3rd Party Google Assistant Displays, Meet Hardware ChromeOS M103, Project Starline Update...

Google Workspace Recap

Play Episode Listen Later Apr 12, 2023 38:30


Only a handful of updates this week, and as usual a bunch of AI news updates. What do you think: should I start a show just for AI coverage and remove those stories from here, or do you like hearing about them the way they are now? Click here to learn all about the Google ChromeOS Administrator Certification and how to pass it: https://youtu.be/KHPy_n0qVk8

Silent Releases
Google Meet Hardware Chrome OS M103
Expanding language support for grammar suggestions

Published Releases
Use speaker separation for a more dynamic meeting experience on Pixel 7 devices

Other Topics
Google ends updates, killing 3rd-party Assistant Smart Displays
Google Cloud IT Heroes Summit
How Project Starline improves remote communication
Google details why exactly Project Starline is more natural
Google Chrome ships WebGPU
Google is rolling out WebGPU tech for next-gen gaming in your browser
Google's looming AI integration into Search should scare the hell out of Microsoft
In AI Race, Microsoft and Google Choose Speed over Caution
Anthropic's $5B, 4-year plan to take on OpenAI
tabGeeks Resources

Filipe Deschamps News
@583 - "Robin Hood" hacker arrested / Chrome launches WebGPU / Samsung and ChatGPT

Filipe Deschamps News

Play Episode Listen Later Apr 6, 2023 3:25


News that caught our attention this Thursday, April 6, 2023! An audio rendition of the email delivered daily by the newsletter (newsletter@filipedeschamps.com). Free newsletter about technology and programming: https://filipedeschamps.com.br/newsletter #news #noticias #fdnews #robsonamendonca

Kodsnack
Kodsnack 512 - Enrich the graphics, with Denis Radin

Kodsnack

Play Episode Listen Later Feb 14, 2023 40:19


Recorded at the Øredev 2022 developer conference, Fredrik chats with Denis Radin about React, Webgpu, standards development, coding standards, and a lot more. We start way back, with early React development - while React was still in beta, on amazingly bad hardware. A project where the focus was actually on optimization and education instead of throwing hardware at the performance problem. We discuss AI art generation a bit, and how it affects our world. Denis then gets into how Webgpu is different from Webgl: mostly a lot better for a lot more use cases. What's holding back really cool graphical things in the browser now? Getting paid! Denis tells us about the development of the Webgpu standard, a unique standard which filled a gap all the major players wanted filled. What if we applied NASA coding guidelines to Javascript? Denis did it to show that Javascript can be taken as seriously as C or other low-level languages, if we just want to. Do we web developers have more to internalize when it comes to pride in craftsmanship? But examples are out there if we just know to look for them. What does Denis think of React's evolution? Finally, fullstack frameworks are coming, and they are exciting - a revolution for Denis' side projects already!

Thank you Cloudnet for sponsoring our VPS! Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.

Links
Øredev
Denis
Denis helps organize React conferences in Amsterdam
Denis' presentation at Øredev 2022
Denis' blog post on WebGPU
Thick clients
Webgpu
Webgl
Canvas
Opengl
Metal
Directx
Vulkan
NASA coding standards (for C)
Denis' talk about applying the NASA coding standards
High-performance Javascript
Angular
Solid.js
Alpine.js
Svelte
React native
React-three-fiber - React renderer for three.js
Next.js
Blitz.js
Ruby on rails

Titles
Amazingly shitty hardware
The performance and scalability wasn't there
Let's use this pipeline
Enrich the graphics
How do you monetize?
A standard that fills a gap
Javascript developer: no
Change the perception
This is engineering
Innovate by simplicity
A fullstack developer with a couple of commands

Kodsnack in English
Kodsnack 512 - Enrich the graphics, with Denis Radin

Kodsnack in English

Play Episode Listen Later Feb 14, 2023 40:18


Recorded at the Øredev 2022 developer conference, Fredrik chats with Denis Radin about React, Webgpu, standards development, coding standards, and a lot more. We start way back, with early React development - while React was still in beta, on amazingly bad hardware. A project where the focus was actually on optimization and education instead of throwing hardware at the performance problem. We discuss AI art generation a bit, and how it affects our world. Denis then gets into how Webgpu is different from Webgl: mostly a lot better for a lot more use cases. What's holding back really cool graphical things in the browser now? Getting paid! Denis tells us about the development of the Webgpu standard, a unique standard which filled a gap all the major players wanted filled. What if we applied NASA coding guidelines to Javascript? Denis did it to show that Javascript can be taken as seriously as C or other low-level languages, if we just want to (a toy sketch follows below). Do we web developers have more to internalize when it comes to pride in craftsmanship? But examples are out there if we just know to look for them. What does Denis think of React's evolution? Finally, fullstack frameworks are coming, and they are exciting - a revolution for Denis' side projects already!

Thank you Cloudnet for sponsoring our VPS! Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.

Links
Øredev
Denis
Denis helps organize React conferences in Amsterdam
Denis' presentation at Øredev 2022
Denis' blog post on WebGPU
Thick clients
Webgpu
Webgl
Canvas
Opengl
Metal
Directx
Vulkan
NASA coding standards (for C)
Denis' talk about applying the NASA coding standards
High-performance Javascript
Angular
Solid.js
Alpine.js
Svelte
React native
React-three-fiber - React renderer for three.js
Next.js
Blitz.js
Ruby on rails

Titles
Amazingly shitty hardware
The performance and scalability wasn't there
Let's use this pipeline
Enrich the graphics
How do you monetize?
A standard that fills a gap
Javascript developer: no
Change the perception
This is engineering
Innovate by simplicity
A fullstack developer with a couple of commands
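For a flavor of what "NASA coding guidelines applied to Javascript" can look like in practice, here is a toy sketch of our own (not Denis's actual rule set) carrying two of NASA's Power of Ten C rules over to TypeScript: every loop gets a fixed upper bound, and invariants are checked with assertions.

```typescript
// Toy illustration of two Power of Ten rules in TypeScript:
// 1) all loops must have a statically known upper bound;
// 2) use assertions liberally to check invariants at runtime.
function assert(condition: boolean, message: string): asserts condition {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

// Hard cap that makes non-termination impossible by construction.
const MAX_ITERATIONS = 10_000;

// Newton's method for square roots, with a bounded `for` loop instead of an
// open-ended `while (true)`.
function sqrtNewton(x: number, tolerance = 1e-12): number {
  assert(x >= 0, "sqrtNewton requires a non-negative input");
  if (x === 0) return 0;

  let guess = x;
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const next = 0.5 * (guess + x / guess);
    if (Math.abs(next - guess) < tolerance) {
      assert(Number.isFinite(next), "result must be finite");
      return next;
    }
    guess = next;
  }
  // Hitting the bound is an explicit failure, never silent non-termination.
  throw new Error("sqrtNewton did not converge within the iteration bound");
}

console.log(sqrtNewton(2)); // ~1.4142135623730951
```

The point is not the arithmetic but the discipline: loops that cannot run away silently, and inputs validated at the boundary, the kind of rigor usually reserved for C.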

Rustacean Station
Purdy with Marty Jones

Rustacean Station

Play Episode Listen Later Apr 8, 2022 47:10


Allen Wyma talks with Marty Jones, creator of Purdy, an experimental PDF renderer built on top of WebGPU.

Contributing to Rustacean Station
Rustacean Station is a community project; get in touch with us if you'd like to suggest an idea for an episode or offer your services as a host or audio editor!
Twitter: @rustaceanfm
Discord: Rustacean Station
Github: @rustacean-station
Email: hello@rustacean-station.org

Timestamps
[@0:55] - Marty's background
[@4:06] - What sparked Marty's interest in PDFs
[@6:21] - What kind of primitives are built into PDF?
[@8:56] - How to solve edge cases in PDFs?
[@11:54] - Property-based testing
[@16:54] - The deciding factor that got Marty into creating his library
[@19:59] - What is WebGPU
[@22:13] - Marty's goal with PDF.js
[@24:08] - Why use PDF.js?
[@29:02] - Why Marty used Rust instead of JavaScript
[@30:15] - What's next with PDF.js?
[@36:51] - Legalities of PDFs
[@41:42] - How to reach Marty

Other Resources
Marty's Github
What is unique about PDF rendering?

Credits
Intro Theme: Aerocity
Audio Editing: Plangora
Hosting Infrastructure: Jon Gjengset
Show Notes: Plangora
Hosts: Allen Wyma

Building the Open Metaverse
WebGPU and Graphics on the Web

Building the Open Metaverse

Play Episode Listen Later Mar 22, 2022 51:37


Senior Software Engineers at Google, Brandon Jones and Kai Ninomiya, join Patrick Cozzi (Cesium) to discuss the origin of WebGPU and the state of its ecosystem.

SDCast
SDCast #138: guest Denis Radin, WebGL specialist

SDCast

Play Episode Listen Later Nov 16, 2021 81:16


What do you know about 3D graphics on the web? And about the W3C standards for working with 3D graphics? My guest is Denis Radin, also known online as PixelCommander, one of the first React developers in the Netherlands and a WebGL specialist. In this episode we talk about 3D graphics on the web. Denis told us about his experience working on an operating system for TV set-top boxes whose user interface was built on the web stack using 3D. We talked about frameworks and libraries providing a high-level API for working with graphics: their pros and cons, when it is worth using libraries, and when it is better to write low-level code by hand. We discussed what tasks WebGL is used for, where and how it should be used, and how the whole stack works, from the API call on the JavaScript side in the browser down to execution on the GPU. We talked about the WebGL API: how it evolved and aged, what questions and problems accumulated in it, and how the new WebGPU standard was born. Denis is one of the organizers of the large European conference React Summit. We talked about conferences and talks, online and offline. Denis also told us about his participation in the W3C WebGPU Working Group.

Links on the topics of this episode:
* Slides of the talk "What does WebGPU mean for the Web platform?" (https://docs.google.com/presentation/d/1ydAPZbMvp6iJHs4k--AIop6wT2HOQWOieBIffKrLKpI/edit#slide=id.gf18191bc23_0_265)
* WebGPU W3C Working Draft (https://www.w3.org/TR/webgpu/)
* Denis's talk "Pixel shaders for web developers" (https://www.youtube.com/watch?v=SszWYsEio7E) from Highload++ 2017
* Denis's talk "Interactive projections and 3D mapping with web technologies" (https://www.youtube.com/watch?v=EKHP2y2BGf0) from RIT++ 2019

Enjoyed the episode? Support the podcast on patreon.com/KSDaemon (https://www.patreon.com/KSDaemon), with stars on iTunes (https://podcasts.apple.com/ru/podcast/software-development-podcast/id890468606?l=en), or with a retweet or a post! Join the SDCast Telegram chat (https://t.me/SDCast), where you can discuss episodes, suggest guests, and share your comments and wishes!

programmier.bar – der Podcast für App- und Webentwicklung
News 10/21: Deno 1.8 // Flutter Engage // OpenHaystack // Blender 2.92

programmier.bar – der Podcast für App- und Webentwicklung

Play Episode Listen Later Mar 10, 2021 25:23


There's news in web and app development! We talk about a Deno update that could be especially interesting for machine-learning enthusiasts: with support for WebGPU, Python may no longer be the first stop for machine-learning programming in the future.
Last week, Flutter Engage took place: an online event by Google presenting the new version 2.0 of Flutter. Besides the now-official stable release of web support, the focus was also on the additional platforms that are now supported.
OpenHaystack is a reverse-engineering project from TU Darmstadt that lets anyone build their own Bluetooth hardware that can be located through Apple's "Find My" network, even when you are not nearby.
Sebi also brings us up to date on 3D: the free modeling program Blender now supports Geometry Nodes in its latest version, 2.92.
Write to us! Send us your topic requests and feedback.
podcast@programmier.bar
Follow us! Stay up to date on future episodes and join community discussions.
Twitter
Instagram
Facebook
Meetup