Podcasts about LLVM

Compiler infrastructure used as a backend for multiple programming languages

  • 121 PODCASTS
  • 299 EPISODES
  • 59m AVG DURATION
  • 1 NEW EPISODE MONTHLY
  • LATEST: May 1, 2025

POPULARITY

Popularity chart: 2017–2024


Best podcasts about LLVM

Latest podcast episodes about LLVM

The New Stack Podcast
Arm's Open Source Leader on Meeting the AI Challenge

The New Stack Podcast

May 1, 2025 · 18:21


At Arm, open source is the default approach, with proprietary software requiring justification, says Andrew Wafaa, fellow and senior director of software communities. Speaking at KubeCon + CloudNativeCon Europe, Wafaa emphasized Arm's decade-long commitment to open source, highlighting its investment in key projects like the Linux kernel, GCC, and LLVM. This investment is strategic, ensuring strong support for Arm's architecture through vital tools and system software. Wafaa also challenged the hype around GPUs in AI, asserting that CPUs—especially those enhanced with Arm's Scalable Matrix Extension (SME2) and Scalable Vector Extension (SVE2)—are often more suitable for inference workloads. CPUs offer greater flexibility, and Arm's innovations aim to reduce dependency on expensive GPU fleets. On the AI framework front, Wafaa pointed to PyTorch as the emerging hub, likening its ecosystem-building potential to Kubernetes. As a PyTorch Foundation board member, he sees PyTorch becoming the central open source platform in AI development, with broad community and industry backing. Learn more from The New Stack about Arm: Edge Wars Heat Up as Arm Aims to Outflank Intel, Qualcomm · Arm: See a Demo About Migrating a x86-Based App to ARM64. Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Female TechTalk
Julia: Die Highspeed-Sprache, die die Wissenschaft erobert

Female TechTalk

Feb 13, 2025 · 44:47


Things are kicking off again. While Sara opens her yoga studio, Eli has surprising news! But don't worry, it stays nerdy: Julia comes racing in and brings plenty of speed with it. Why do scientists and even NASA love this programming language? Does Julia really help explore the universe? We explain what LLVM is, why Julia is compiled rather than interpreted, and why that makes a decisive difference. There's also a new idea for a startup built on brilliant Julia code. And then it gets serious: the big programming-language battle. Who wins the race? Python, R, MATLAB, C++, or Julia? Amazing how many languages we've already gotten to know! Off we go into the Julia universe!

Igalia
RISCy Business

Igalia

Jan 30, 2025 · 52:33


Igalia's Brian Kardell and Eric Meyer learn about RISC-V and LLVM from their compilers colleague Alex Bradbury.

Kompilator
101 - LLVM med Tobias Hieta och Patrik Svensson

Kompilator

Jan 29, 2025 · 44:45


Bartek eavesdrops as Tobias and Patrik talk about LLVM, building your own languages, and what it's like to work at a game studio. 2023 EuroLLVM - Tutorial: A whirlwind tour of the LLVM optimizer · Compiler Explorer · Introduction | Mew · Kompilator | 046 - Ett eget språk med Patrik Svensson

mnemonic security podcast
Reverse Engineering

mnemonic security podcast

Jan 6, 2025 · 52:04 · Transcription available


To kick off 2025, Robby chats with Duncan Ogilvie, a renowned expert in Reverse Engineering (RE), the creator of x64dbg (a popular open-source x64/x32 debugger for Windows), and the mind behind 100+ other cool projects. Their conversation covers the evolving field of RE, discussing common challenges, practical techniques, and how professionals navigate the landscape. Duncan also shares his insights on the current tools shaping the field, explores the role of "AI" in RE, and speculates on what the future might hold for this industry niche. Listeners will also get a sneak peek into Duncan's upcoming course, scheduled for February 20-21 in Oslo. The course will focus on using LLVM for binary analysis and is designed to help intermediate reverse engineers sharpen their skills. If you're interested, sign up here! https://www.mnemonic.io/resources/events-webinars/exclusive-training-with-duncan-ogilvie-LLVM-IR-and-binary-lifting/

Rustacean Station
Rust in Google with Lars Bergstrom

Rustacean Station

Dec 27, 2024 · 42:12


Allen Wyma talks with Lars Bergstrom, Director of Engineering at Google, about Google's use of Rust within Android. Android is Google's main mobile operating system deployed to over 3 billion devices around the world. Contributing to Rustacean Station: Rustacean Station is a community project; get in touch with us if you'd like to suggest an idea for an episode or offer your services as a host or audio editor! Twitter: @rustaceanfm · Discord: Rustacean Station · Github: @rustacean-station · Email: hello@rustacean-station.org · Timestamps: [@00:00] - Meet Lars Bergstrom · [@03:06] - Updates on Android devices · [@06:49] - Rust usage at Google and in Android development · [@10:26] - Zig as a security-focused alternative · [@22:52] - Native code development on Android · [@24:56] - Comparing Rust and Go · [@27:26] - Rust as an app development language · [@32:12] - LLVM vs GCC · [@40:15] - Concluding discussion · Other links: RUSTAsia Conf 2025 · Credits: Intro Theme: Aerocity · Audio Editing: Plangora · Hosting Infrastructure: Jon Gjengset · Show Notes: Plangora · Hosts: Allen Wyma

Rustacean Station
What's New in Rust 1.76, 1.77, and 1.78

Rustacean Station

Oct 26, 2024 · 105:34


Jon and Ben discuss the highlights of the 1.76, 1.77, and 1.78 releases of Rust. This episode was recorded as part of a YouTube live stream on 2024-05-18, which you can still watch. Contributing to Rustacean Station Rustacean Station is a community project; get in touch with us if you'd like to suggest an idea for an episode or offer your services as a host or audio editor! Twitter: @rustaceanfm Discord: Rustacean Station Github: @rustacean-station Email: hello@rustacean-station.org Timestamps & referenced resources [@00:34] - Rust 1.76 [@01:18] - ABI compatibility updates The updated ABI section An interesting article on ABIs in Swift vs Rust [@08:53] - Type names from references type_name type_name_of_val [@10:35] - Stabilized APIs [@10:56] - Result::inspect [@13:53] - Arc::unwrap_or_clone [@15:25] - std::hash::DefaultHasher [@18:01] - ptr::addr_eq [@21:30] - Changelog deep-dive [@21:33] - Resize/hide rustdoc bars [@22:40] - Rust 1.77 [@22:51] - C-string literals std::ffi::CStr [@28:20] - Support for recursion in async fn [@31:43] - offset_of! [@36:32] - Enable strip in release profiles by default [@39:35] - Stabilized APIs [@39:36] - core::net [@40:59] - f64::round_ties_even [@42:05] - Mutex::clear_poison [@43:43] - File::create_new OpenOptions [@46:15] - Changelog deep-dive [@46:46] - Lint on references to static mut SyncUnsafeCell [@50:05] - Undeprecate unstable_features lint [@51:37] - Deny braced macro invocation in let-else Details from dtolnay comment [@55:45] - cargo:: in build scripts [@56:20] - Standardized package ID spec in Cargo [@57:36] - slice::first_chunk [@59:55] - Rust 1.77.1 Stripping debug info in release builds broke Windows. [@1:00:58] - Rust 1.77.2 Fixes CVE-2024-24576. Detailed advisory, fix, and current logic. [@1:04:54] - Rust 1.78 [@1:07:55] - Diagnostic attributes #[diagnostic] documentation [@1:13:13] - Asserting unsafe preconditions Implementation PR [@1:19:56] - Deterministic realignment [@1:23:24] - Stabilized APIs [@1:23:33] - impl Read for &Stdin [@1:24:03] - Relax bounds on Error trait implementations [@1:25:40] - Compatibility notes [@1:25:40] - Windows requirement bump Replace pthread RwLock Slim reader/writer locks [@1:29:25] - LLVM 18 brings *128 ABI change [@1:32:04] - Changelog deep-dive [@1:32:04] - Make non-PartialEq-typed consts as patterns a hard error [@1:34:59] - Suggest moving definition if non-found macro_rules! is defined later [@1:36:08] - Stabilize v4 of Cargo lockfile [@1:37:36] - cargo update highlights stale dependencies [@1:38:23] - Deprecate non-extension .cargo/config files [@1:39:19] - Clippy lint assigning_clones [@1:40:49] - Clippy lint incompatible_msrv [@1:42:22] - cargo new stopped commenting in Cargo.toml Credits Intro Theme: Aerocity Audio Editing: Aerocity Hosting Infrastructure: Jon Gjengset Show Notes: Jon Gjengset Hosts: Jon Gjengset and Ben Striegel

Pi Tech
News: Про новинки світу ШІ, як гонка за увагою псує усе, пробіли в знаннях Євгена

Pi Tech

Oct 17, 2024 · 53:02


In this episode, our hosts Pavlo, Yevhen, and Mykhailo discuss the latest news from the world of technology:

Algorithms + Data Structures = Programs
Episode 184: Safety in Swift 6, Protocols & More with Doug Gregor

Algorithms + Data Structures = Programs

May 31, 2024 · 40:26


In this episode, Conor and Bryce chat with Doug Gregor from Apple about the Swift programming language! Link to Episode 184 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub) · Twitter · ADSP: The Podcast · Conor Hoekstra · Bryce Adelstein Lelbach · About the Guest: Douglas Gregor is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes · Date Recorded: 2024-04-29 · Date Released: 2024-05-31 · Swift Programming Language · Swift Actors · D Programming Language · Rust Programming Language · Fearless Concurrency? Understanding Concurrent Programming Safety in Real-World Rust Software · Swift Protocols · 2022 LLVM Dev Mtg: Implementing Language Support for ABI-Stable Software Evolution in Swift and LLVM · Oxide Episode - Discovering the XZ Backdoor with Andres Freund · Swift Algorithms Library · Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Algorithms + Data Structures = Programs

In this episode, Conor and Bryce chat with Doug Gregor from Apple about the Swift programming language! Link to Episode 183 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub) · Twitter · ADSP: The Podcast · Conor Hoekstra · Bryce Adelstein Lelbach · About the Guest: Douglas Gregor is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes · Date Recorded: 2024-04-29 · Date Released: 2024-05-24 · Swift Programming Language · WWDC 2014 Swift Announcement · Swift on Languish · Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Algorithms + Data Structures = Programs
Episode 182: C++ Variadic Templates, Swift and More with Doug Gregor

Algorithms + Data Structures = Programs

May 17, 2024 · 38:10


In this episode, Conor and Bryce chat with Doug Gregor from Apple about C++11 Variadic Templates, C++11 std::tuple, C++17 std::variant, Swift and more! Link to Episode 182 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub) · Twitter · ADSP: The Podcast · Conor Hoekstra · Bryce Adelstein Lelbach · About the Guest: Douglas Gregor is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes · Date Recorded: 2024-04-29 · Date Released: 2024-05-17 · C++11 Variadic Templates / Parameter Packs / Expansion · C++26 Pack Indexing · C++11 std::tuple · C++17 std::variant · C++11 Digit Separators · Swift Programming Language · HPX (High Performance ParalleX) · Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

airhacks.fm podcast with adam bien
LLama2.java: LLM integration with A 100% Pure Java file

airhacks.fm podcast with adam bien

May 12, 2024 · 61:28


An airhacks.fm conversation with Alfonso Peterssen (@TheMukel) about: discussion about Alfonso's early programming experience and participation in the IOI competition, studying computer science and functional programming with Martin Odersky, internships at Google and Oracle Labs working on compilers and the Espresso project implementing a JVM in Java, espresso mentioned in "#208 GraalVM: Meta Circularity on Different Levels", "#194 GraalVM, Apple Silicon (M1) and Clouds", "#167 GraalVM and Java 17, Truffle, Espresso and Native Image" and "#157 The Ingredients of GraalVM", porting LLVM to pure Java in one class, integrating Large Language Models (LLMs) in Java by porting the LLAMA model from C to Java, GPU acceleration with tornadovm, TornadoVM appeared at "#282 TornadoVM, Paravox.ai: Java, AI, LLMs and Hardware Acceleration", performance of the Java port being within 10% of the C versions, potential huge opportunities for integrating AI and LLMs with enterprise Java systems for use cases like fraud detection, the Java port being a 1,000 line self-contained implementation with no external dependencies, the need for more resources and support to further develop the Java LLM integration, the llama2.java project Alfonso Peterssen on twitter: @TheMukel

Algorithms + Data Structures = Programs
Episode 181: The C++0x Concepts Story with Doug Gregor (Part 2)

Algorithms + Data Structures = Programs

May 10, 2024 · 33:29


In this episode, Conor and Bryce chat with Doug Gregor from Apple about the history of C++0x Concepts (part 2). Link to Episode 181 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub) · Twitter · ADSP: The Podcast · Conor Hoekstra · Bryce Adelstein Lelbach · About the Guest: Douglas Gregor is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes · Date Recorded: 2024-04-29 · Date Released: 2024-05-10 · C++20 Concepts · Swift Programming Language · Elements of Programming · Tecton: A Language for Manipulating Generic Objects · Generic Programming by David Musser and Alexander Stepanov · Original paper on concepts for C++0x (Stroustrup and Dos Reis) · C++ Concepts vs Rust Traits vs Haskell Typeclasses vs Swift Protocols - Conor Hoekstra - ACCU 2021 · Paper on the implementation of concepts in ConceptGCC (Gregor, Siek) · C++0x Concepts proposal that explains the model (Gregor, Stroustrup) · Language wording for concepts that went into C++0x · Doug's last-ditch effort to bring back a simpler C++0x Concepts model using archetypes for type checking · Jeremy Siek's extensive C++0x Concepts writeup · Type-Soundness and Optimization in the Concepts Proposal · Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Algorithms + Data Structures = Programs
Episode 180: The C++0x Concepts Story with Doug Gregor (Part 1)

Algorithms + Data Structures = Programs

May 3, 2024 · 48:58


In this episode, Conor and Bryce chat with Doug Gregor from Apple about the history of C++0x Concepts. Link to Episode 180 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub) · Twitter · ADSP: The Podcast · Conor Hoekstra · Bryce Adelstein Lelbach · About the Guest: Douglas Gregor is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes · Date Recorded: 2024-04-29 · Date Released: 2024-05-03 · C++20 Concepts · Swift Programming Language · Elements of Programming · Tecton: A Language for Manipulating Generic Objects · Generic Programming by David Musser and Alexander Stepanov · Original paper on concepts for C++0x (Stroustrup and Dos Reis) · C++ Concepts vs Rust Traits vs Haskell Typeclasses vs Swift Protocols - Conor Hoekstra - ACCU 2021 · Paper on the implementation of concepts in ConceptGCC (Gregor, Siek) · C++0x Concepts proposal that explains the model (Gregor, Stroustrup) · Language wording for concepts that went into C++0x · Doug's last-ditch effort to bring back a simpler C++0x Concepts model using archetypes for type checking · Jeremy Siek's extensive C++0x Concepts writeup · Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Developer Voices
Mojo Lang - Tomorrow's High Performance Python? (with Chris Lattner)

Developer Voices

May 1, 2024 · 84:38


Mojo is the latest language from the creator of Swift and LLVM. It's an attempt to take some of the best techniques from CPU/GPU-level programming and package them up in a Python-compatible syntax. In this episode we explore why Mojo was created, and what it offers to Python programmers and non-Python programmers alike. How is it built for performance, and which performance features matter? What's its take on functional programming and type systems? And can it marry the high-level programming of Python with the low-level programming of LLVM/MLIR? If you're a Python programmer who needs better performance, a C programmer who expects more from a 'scripting language', or just someone who'd be happier if Python had a first-class type system, Mojo might well be for you… Mojo: https://www.modular.com/max/mojo · Mojo's Roadmap: https://docs.modular.com/mojo/roadmap.html · The Mojo Discord: https://discord.com/invite/modular · MLIR: https://mlir.llvm.org/ · Chris's Talks: https://nondot.org/sabre/Resume.html#talks · Chris on Twitter: https://twitter.com/clattner_llvm · Kris on Mastodon: http://mastodon.social/@krisajenkins · Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/ · Kris on Twitter: https://twitter.com/krisajenkins · #software #podcast #mojolang #ml #pythonml

Python Bytes
#377 A Dramatic Episode

Python Bytes

Apr 2, 2024 · 32:55


Topics covered in this episode: justpath xz back door LPython dramatic Extras Joke Watch on YouTube About the show Sponsored by ScoutAPM: pythonbytes.fm/scout Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list, we'll never share it. Michael #1: justpath Inspect and refine PATH environment variable on both Windows and Linux. Raw, count, duplicates, invalids, corrections, excellent stuff. Check out the video Brian #2: xz back door In case you kinda heard about this, but not really. Very short version: A Microsoft engineer noticed a performance problem with ssh and tracked it to a particular version update of xz. Further investigations found a multi-year installation of a fairly complex back door into the xz by a new-ish contributor. But still contributing over several years. First commit in early 2022. The problem is caught. But if it had succeeded, it would have been bad. Part of the issue of how this happened is due to having one primary maintainer on a very widely used tool included in tons-o-Linux distributions. Some useful articles Everything I Know About the XZ Backdoor - Evan Boehs - recommended read Don't think your affected? Think again if you use homebrew, for example: Update and upgrade Homebrew and xz versions Notes Open source maintenance burnout is real Lots of open source projects are maintained by unpaid individuals for long periods of time. Multi-year sneakiness and social bullying is pretty hard to defend against. Handing off projects to another primary maintainer has to be doable. But now I think we need better tools to vet contributors. Maybe? Or would that just suppress contributions? One option to help with burnout: JGMM, Just Give Maintainers Money: Software Needs To Be More Expensive - Glyph Michael #3: LPython LPython aggressively optimizes type-annotated Python code. It has several backends, including LLVM, C, C++, and WASM. LPython's primary tenet is speed. Play with the wasm version here: dev.lpython.org Still in alpha, so keep that in mind. Brian #4: dramatic Trey Hunner More drama in the software world. This time in the Python. Actually, this is just a fun utility to make your Python output more dramatic. More fun output with terminaltexteffects suggested by Allan Extras Brian: Textual how has a new inline feature in the new release. Michael: My keynote talk is out: The State of Python in 2024 Have you browsed your github feed lately? 3.10, 3.9, 3.8 security updates Joke: Definition of terms

Adafruit Industries
Deep Dive w/Scott: CircuitPython Bugs & Builds

Adafruit Industries

Mar 9, 2024 · 120:13


Join Scott as he discusses the last few CircuitPython 9.0.0 bug fixes he did, experiments with a new build system and answers questions. Visit the Adafruit shop online - http://www.adafruit.com Thanks to dcd for the time codes: 0:00 getting started 1:04 hello 10:22 bugs and builds 10:40 issues closed in CP on github 12:14 issue 8994 web workflow 13:33 tlsf Two-Level Segregated Fit memory allocator / split heaps 19:33 adafruit learn guide issue 2746 dvi 23:07 how to choose a microcontroller learn guide(s) 24:00 espressif tlsf pull request 25:15 find top and bottom bits in 32 bit word 29:52 tlsf mapping_search() 31:10 debugging the tlsf allocator 34:55 fragmentation issues 39:44 another bsd tlsf implementation on github 42:18 circuitpythgon supervisor shared memory allocation in CP 44:30 CP allocation has other constraints 45:50 issue 9008 improve RGBMatrix reliabilty 48:05 cache disabled race condition - mp_hal_delay moved to IRAM 49:10 tweak watchdog #9012 50:30 esp-idf releases CP using v5.1.3 52:00 esp C6 feather 54:00 licensing GPL / MIT / BSD etc 55:00 build systems github aapleby / hancho written in python 56:20 picolibc on github 59:40 moving in the direction of sharing and not recompiling common code 1:01:25 writing python code to drive cmake ! 1:01:50 back to Hancho - and cmake gripes :-) 1:11:00 continuing the hancho tutorial 1:14:35 picolib meson build / turing complete build systems 1:16:10 hancho and asyncio! 1:18:30 board.hancho experiment 1:19:25 rp2040.hancho 1:22:10 sharing artifacts 1:22:45 build systems and upstream changes ( micro python ) 1:24:15 bringing in the 3 libraries libc, libm, .... ( shared/libc vs. picolibc ) 1:25:30 libgcc ( libm ) / llvm compiler runtime 1:26:00 LLVM-embedded-toolchain-for-Arm 1:28:11 Q-string generation in hancho? 1:30:38 hacker news hancho article 1:34:50 "what is a q-string" 1:49:47 debugging hancho syntax errors 1:57:45 wrap up - Tim deep diving next week ----------------------------------------- LIVE CHAT IS HERE! http://adafru.it/discord Subscribe to Adafruit on YouTube: http://adafru.it/subscribe New tutorials on the Adafruit Learning System: http://learn.adafruit.com/ -----------------------------------------

BSD Now
535: Untitled Episode

BSD Now

Nov 30, 2023 · 56:38


FreeBSD 14 has been released, Reading your RSS feed on FreeBSD, Manipulate PDF files easily with pdftk, clang(1)/llvm updated to version 16 in OpenBSD, NetBSD Security Advisory: multiple vulnerabilities in ftpd(8), and more NOTES This episode of BSDNow is brought to you by Tarsnap (https://www.tarsnap.com/bsdnow) and the BSDNow Patreon (https://www.patreon.com/bsdnow) Headlines FreeBSD 14 (https://www.freebsd.org/releases/14.0R/relnotes/) • [Quick update](https://www.daemonology.net/blog/2023-11-21-late-breaking-FreeBSD-14-breakage.html) • [Vermaden's FreeBSD 14 valuable news] (https://vermaden.wordpress.com/2023/11/17/valuable-freebsd-14-0-release-updates) News Roundup Reading your RSS feed on FreeBSD (https://www.ncartron.org/reading-your-rss-feed-on-freebsd.html) Manipulate PDF files easily with pdftk (https://dataswamp.org/~solene/2023-08-19-pdftk-guide.html) clang(1)/llvm updated to version 16 (https://www.undeadly.org/cgi?action=article;sid=20231113160314&utm_source=bsdweekly) NetBSD Security Advisory 2023-007: multiple vulnerabilities in ftpd(8) (https://bsdsec.net/articles/netbsd-security-advisory-2023-007-multiple-vulnerabilities-in-ftpd-8) Tarsnap This weeks episode of BSDNow was sponsored by our friends at Tarsnap, the only secure online backup you can trust your data to. Even paranoids need backups. Feedback/Questions Brad - zpool disk allocation questions (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/535/feedback/Brad%20-%20zpool%20disk%20allocation%20questions.md) Kevin - shell question (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/535/feedback/Kevin%20-%20shell%20question.md) Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv) Join us and other BSD Fans in our BSD Now Telegram channel (https://t.me/bsdnow)

Backend Banter
#027 - 2023 vs 2001 Tech Recessions and Distributed Systems with Russ Ross

Backend Banter

Nov 13, 2023 · 75:56


Lane chats with his distributed systems professor from when he was a computer science undergraduate, Dr. Russ Ross. They talk about the state of the hiring market in 2023, LLVM, and of course, distributed systems! Learn back-end development - https://boot.dev · Listen on your favorite podcast player: https://www.backendbanter.com · Russ Ross's Twitter: https://twitter.com/_russross?lang=en · Like & subscribe for the algo if you enjoyed the video!

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Thanks to the over 17,000 people who have joined the first AI Engineer Summit! A full recap is coming. Last call to fill out the State of AI Engineering survey! See our Community page for upcoming meetups in SF, Paris and NYC.This episode had good interest on Twitter.Fast.ai's “Practical Deep Learning” courses been watched by over >6,000,000 people, and the fastai library has over 25,000 stars on Github. Jeremy Howard, one of the creators of Fast, is now one of the most prominent and respected voices in the machine learning industry; but that wasn't always the case. Being non-consensus and right In 2018, Jeremy and Sebastian Ruder published a paper on ULMFiT (Universal Language Model Fine-tuning), a 3-step transfer learning technique for NLP tasks: The paper demonstrated that pre-trained language models could be fine-tuned on a specific task with a relatively small amount of data to achieve state-of-the-art results. They trained a 24M parameters model on WikiText-103 which was beat most benchmarks.While the paper had great results, the methods behind weren't taken seriously by the community: “Everybody hated fine tuning. Everybody hated transfer learning. I literally did tours trying to get people to start doing transfer learning and nobody was interested, particularly after GPT showed such good results with zero shot and few shot learning […] which I was convinced was not the right direction, but who's going to listen to me, cause as you said, I don't have a PhD, not at a university… I don't have a big set of computers to fine tune huge transformer models.”Five years later, fine-tuning is at the center of most major discussion topics in AI (we covered some like fine tuning vs RAG and small models fine tuning), and we might have gotten here earlier if Jeremy had OpenAI-level access to compute and distribution. At heart, Jeremy has always been “GPU poor”:“I've always been somebody who does not want to build stuff on lots of big computers because most people don't have lots of big computers and I hate creating stuff that most people can't use.”This story is a good reminder of how some of the best ideas are hiding in plain sight; we recently covered RWKV and will continue to highlight the most interesting research that isn't being done in the large labs. Replacing fine-tuning with continued pre-trainingEven though fine-tuning is now mainstream, we still have a lot to learn. The issue of “catastrophic forgetting” and potential solutions have been brought up in many papers: at the fine-tuning stage, the model can forget tasks it previously knew how to solve in favor of new ones. The other issue is apparent memorization of the dataset even after a single epoch, which Jeremy covered Can LLMs learn from a single example? but we still don't have the answer to. Despite being the creator of ULMFiT, Jeremy still professes that there are a lot of open questions on finetuning:“So I still don't know how to fine tune language models properly and I haven't found anybody who feels like they do.”He now advocates for "continued pre-training" - maintaining a diversity of data throughout the training process rather than separate pre-training and fine-tuning stages. 
Mixing instructional data, exercises, code, and other modalities while gradually curating higher quality data can avoid catastrophic forgetting and lead to more robust capabilities (something we covered in Datasets 101).“Even though I originally created three-step approach that everybody now does, my view is it's actually wrong and we shouldn't use it… the right way to do this is to fine-tune language models, is to actually throw away the idea of fine-tuning. There's no such thing. There's only continued pre-training. And pre-training is something where from the very start, you try to include all the kinds of data that you care about, all the kinds of problems that you care about, instructions, exercises, code, general purpose document completion, whatever. And then as you train, you gradually curate that, you know, you gradually make that higher and higher quality and more and more specific to the kinds of tasks you want it to do. But you never throw away any data….So yeah, that's now my view, is I think ULMFiT is the wrong approach. And that's why we're seeing a lot of these so-called alignment tax… I think it's actually because people are training them wrong.An example of this phenomena is CodeLlama, a LLaMA2 model finetuned on 500B tokens of code: while the model is much better at code, it's worse on generic tasks that LLaMA2 knew how to solve well before the fine-tuning. In the episode we also dive into all the places where open source model development and research is happening (academia vs Discords - tracked on our Communities list and on our survey), and how Jeremy recommends getting the most out of these diffuse, pseudonymous communities (similar to the Eleuther AI Mafia).Show Notes* Jeremy's Background* FastMail* Optimal Decisions* Kaggle* Enlitic* fast.ai* Rachel Thomas* Practical Deep Learning* fastai for PyTorch* nbdev* fastec2 (the underrated library we describe)* Can LLMs learn from a single example?* the Kaggle LLM Science Exam competition, which “challenges participants to answer difficult science-based questions written by a Large Language Model”.* Sebastian Ruder* Alec Radford* Sylvain Gugger* Stephen Merity* Chris Lattner* Modular.ai / Mojo* Jono Whittaker* Zeiler and Fergus paper* ULM Fit* DAWNBench* Phi-1* Code Llama* AlexNetTimestamps* [00:00:00] Intros and Jeremy's background* [00:05:28] Creating ULM Fit - a breakthrough in NLP using transfer learning* [00:06:32] The rise of GPT and the appeal of few-shot learning over fine-tuning* [00:10:00] Starting Fast.ai to distribute AI capabilities beyond elite academics* [00:14:30] How modern LMs like ChatGPT still follow the ULM Fit 3-step approach* [00:17:23] Meeting with Chris Lattner on Swift for TensorFlow at Google* [00:20:00] Continued pre-training as a fine-tuning alternative* [00:22:16] Fast.ai and looking for impact vs profit maximization* [00:26:39] Using Fast.ai to create an "army" of AI experts to improve their domains* [00:29:32] Fast.ai's 3 focus areas - research, software, and courses* [00:38:42] Fine-tuning memorization and training curve "clunks" before each epoch* [00:46:47] Poor training and fine-tuning practices may be causing alignment failures* [00:48:38] Academia vs Discords* [00:53:41] Jeremy's high hopes for Chris Lattner's Mojo and its potential* [01:05:00] Adding capabilities like SQL generation through quick fine-tuning* [01:10:12] Rethinking Fast.ai courses for the AI-assisted coding era* [01:14:53] Rapid model development has created major technical debt* [01:17:08] Lightning RoundAI Summary 
(beta)This is the first episode we're trying this. Here's an overview of the main topics before you dive in the transcript. * Jeremy's background and philosophies on AI* Studied philosophy and cognitive science in college* Focused on ethics and thinking about AI even 30 years ago* Believes AI should be accessible to more people, not just elite academics/programmers* Created fast.ai to make deep learning more accessible* Development of transfer learning and ULMFit* Idea of transfer learning critical for making deep learning accessible* ULMFit pioneered transfer learning for NLP* Proposed training general language models on large corpora then fine-tuning - this became standard practice* Faced skepticism that this approach would work from NLP community* Showed state-of-the-art results on text classification soon after trying it* Current open questions around fine-tuning LLMs* Models appear to memorize training data extremely quickly (after 1 epoch)* This may hurt training dynamics and cause catastrophic forgetting* Unclear how best to fine-tune models to incorporate new information/capabilities* Need more research on model training dynamics and ideal data mixing* Exciting new developments* Mojo and new programming languages like Swift could enable faster model innovation* Still lots of room for improvements in computer vision-like innovations in transformers* Small models with fine-tuning may be surprisingly capable for many real-world tasks* Prompting strategies enable models like GPT-3 to achieve new skills like playing chess at superhuman levels* LLMs are like computer vision in 2013 - on the cusp of huge new breakthroughs in capabilities* Access to AI research* Many key convos happen in private Discord channels and forums* Becoming part of these communities can provide great learning opportunities* Being willing to do real work, not just talk about ideas, is key to gaining access* The future of practical AI* Coding becoming more accessible to non-programmers through AI assistance* Pre-requisite programming experience for learning AI may no longer be needed* Huge open questions remain about how to best train, fine-tune, and prompt LLMsTranscriptAlessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI. [00:00:21]Swyx: Hey, and today we have in the remote studio, Jeremy Howard all the way from Australia. Good morning. [00:00:27]Jeremy: The remote studio, also known as my house. Good morning. Nice to see you. [00:00:32]Swyx: Nice to see you too. I'm actually very used to seeing you in your mask as a message to people, but today we're mostly audio. But thank you for doing the very important public service of COVID awareness. It was a pleasure. [00:00:46]Jeremy: It was all very annoying and frustrating and tedious, but somebody had to do it. [00:00:52]Swyx: Somebody had to do it, especially somebody with your profile. I think it really drives home the message. So we tend to introduce people for them and then ask people to fill in the blanks on the personal side. Something I did not know about you was that you graduated with a BA in philosophy from the University of Melbourne. I assumed you had a PhD. [00:01:14]Jeremy: No, I mean, I barely got through my BA because I was working 80 to 100 hour weeks at McKinsey and Company from 19 years old onwards. So I actually didn't attend any lectures in second and third year university. 
[00:01:35]Swyx: Well, I guess you didn't need it or you're very sort of self-driven and self-motivated. [00:01:39]Jeremy: I took two weeks off before each exam period when I was working at McKinsey. And then, I mean, I can't believe I got away with this in hindsight, I would go to all my professors and say, oh, I was meant to be in your class this semester and I didn't quite turn up. Were there any assignments I was meant to have done, whatever. I can't believe all of them let me basically have it. They basically always would say like, okay, well, if you can have this written by tomorrow, I'll accept it. So yeah, stressful way to get through university, but. [00:02:12]Swyx: Well, it shows that, I guess, you min-maxed the opportunities. That definitely was a precursor. [00:02:18]Jeremy: I mean, funnily, like in as much as I, you know, in philosophy, the things I found interesting and focused on in the little bit of time I did spend on it was ethics and cognitive science. And it's kind of really amazing that it's now come back around and those are actually genuinely useful things to know about, which I never thought would happen. [00:02:38]Swyx: A lot of, yeah, a lot of relevant conversations there. So you were a consultant for a while and then in the magical month of June 1989, you founded both Optimal Decisions and Fastmeal, which I also briefly used. So thank you for that. [00:02:53]Jeremy: Oh, good for you. Yeah. Cause I had read the statistics, which is that like 90% or something of small businesses fail. So I thought if I start two businesses, I have a higher chance. In hindsight, I was thinking of it as some kind of stochastic thing I didn't have control over, but it's a bit odd, but anyway. [00:03:10]Swyx: And then you were president and chief scientist at Kaggle, which obviously is the sort of composition platform of machine learning. And then Enlitic, where you were working on using deep learning to improve medical diagnostics and clinical decisions. Yeah. [00:03:28]Jeremy: I was actually the first company to use deep learning in medicine, so I kind of founded the field. [00:03:33]Swyx: And even now that's still like a pretty early phase. And I actually heard you on your new podcast with Tanish, where you went very, very deep into the stuff, the kind of work that he's doing, such a young prodigy at his age. [00:03:47]Jeremy: Maybe he's too old to be called a prodigy now, ex-prodigy. No, no. [00:03:51]Swyx: I think he still counts. And anyway, just to round out the bio, you have a lot more other credentials, obviously, but most recently you started Fast.ai, which is still, I guess, your primary identity with Rachel Thomas. So welcome. [00:04:05]Jeremy: Yep. [00:04:06]Swyx: Thanks to my wife. Thank you. Yeah. Doing a lot of public service there with getting people involved in AI, and I can't imagine a better way to describe it than fast, fast.ai. You teach people from nothing to stable diffusion in seven weeks or something, and that's amazing. Yeah, yeah. [00:04:22]Jeremy: I mean, it's funny, you know, when we started that, what was that, like 2016 or something, the idea that deep learning was something that you could make more accessible was generally considered stupid. Everybody knew that deep learning was a thing that you got a math or a computer science PhD, you know, there was one of five labs that could give you the appropriate skills and that you would join, yeah, basically from one of those labs, you might be able to write some papers. 
So yeah, the idea that normal people could use that technology to do good work was considered kind of ridiculous when we started it. And we weren't sure if it was possible either, but we kind of felt like we had to give it a go because the alternative was we were pretty sure that deep learning was on its way to becoming, you know, the most or one of the most, you know, important technologies in human history. And if the only people that could use it were a handful of computer science PhDs, that seemed like A, a big waste and B, kind of dangerous. [00:05:28]Swyx: Yeah. [00:05:29]Alessio: And, you know, well, I just wanted to know one thing on your bio that at Kaggle, you were also the top rank participant in both 2010 and 2011. So sometimes you see a lot of founders running companies that are not really in touch with the problem, but you were clearly building something that you knew a lot about, which is awesome. Talking about deep learning, you created, published a paper on ULM fit, which was kind of the predecessor to multitask learning and a lot of the groundwork that then went to into Transformers. I've read back on the paper and you turned this model, AWD LSTM, which I did the math and it was like 24 to 33 million parameters, depending on what training data set you use today. That's kind of like not even small, it's like super small. What were some of the kind of like contrarian takes that you had at the time and maybe set the stage a little bit for the rest of the audience on what was kind of like the state of the art, so to speak, at the time and what people were working towards? [00:06:32]Jeremy: Yeah, the whole thing was a contrarian take, you know. So okay, so we started Fast.ai, my wife and I, and we thought, yeah, so we're trying to think, okay, how do we make it more accessible? So when we started thinking about it, it was probably 2015 and then 2016, we started doing something about it. Why is it inaccessible? Okay, well, A, no one knows how to do it other than a few number of people. And then when we asked those few number of people, well, how do you actually get good results? They would say like, oh, it's like, you know, a box of tricks that aren't published. So you have to join one of the labs and learn the tricks. So a bunch of unpublished tricks, not much software around, but thankfully there was Theano and rappers and particularly Lasagna, the rapper, but yeah, not much software around, not much in the way of data sets, you know, very hard to get started in terms of the compute. Like how do you get that set up? So yeah, no, everything was kind of inaccessible. And you know, as we started looking into it, we had a key insight, which was like, you know what, most of the compute and data for image recognition, for example, we don't need to do it. You know, there's this thing which nobody knows about, nobody talks about called transfer learning, where you take somebody else's model, where they already figured out like how to detect edges and gradients and corners and text and whatever else, and then you can fine tune it to do the thing you want to do. And we thought that's the key. That's the key to becoming more accessible in terms of compute and data requirements. So when we started Fast.ai, we focused from day one on transfer learning. 
Lesson one, in fact, was transfer learning, literally lesson one, something not normally even mentioned in, I mean, there wasn't much in the way of courses, you know, the courses out there were PhD programs that had happened to have recorded their lessons and they would rarely mention it at all. We wanted to show how to do four things that seemed really useful. You know, work with vision, work with tables of data, work with kind of recommendation systems and collaborative filtering and work with text, because we felt like those four kind of modalities covered a lot of the stuff that, you know, are useful in real life. And no one was doing anything much useful with text. Everybody was talking about word2vec, you know, like king plus queen minus woman and blah, blah, blah. It was like cool experiments, but nobody's doing anything like useful with it. NLP was all like lemmatization and stop words and topic models and bigrams and SPMs. And it was really academic and not practical. But I mean, to be honest, I've been thinking about this crazy idea for nearly 30 years since I had done cognitive science at university, where we talked a lot about the CELS Chinese room experiment. This idea of like, what if there was somebody that could kind of like, knew all of the symbolic manipulations required to answer questions in Chinese, but they didn't speak Chinese and they were kind of inside a room with no other way to talk to the outside world other than taking in slips of paper with Chinese written on them and then they do all their rules and then they pass back a piece of paper with Chinese back. And this room with a person in is actually fantastically good at answering any question you give them written in Chinese. You know, do they understand Chinese? And is this, you know, something that's intelligently working with Chinese? Ever since that time, I'd say the most thought, to me, the most thoughtful and compelling philosophical response is yes. You know, intuitively it feels like no, because that's just because we can't imagine such a large kind of system. But you know, if it looks like a duck and acts like a duck, it's a duck, you know, or to all intents and purposes. And so I always kind of thought, you know, so this is basically a kind of analysis of the limits of text. And I kind of felt like, yeah, if something could ingest enough text and could use the patterns it saw to then generate text in response to text, it could appear to be intelligent, you know. And whether that means it is intelligent or not is a different discussion and not one I find very interesting. Yeah. And then when I came across neural nets when I was about 20, you know, what I learned about the universal approximation theorem and stuff, and I started thinking like, oh, I wonder if like a neural net could ever get big enough and take in enough data to be a Chinese room experiment. You know, with that background and this kind of like interest in transfer learning, you know, I'd been thinking about this thing for kind of 30 years and I thought like, oh, I wonder if we're there yet, you know, because we have a lot of text. Like I can literally download Wikipedia, which is a lot of text. And I thought, you know, how would something learn to kind of answer questions or, you know, respond to text? And I thought, well, what if we used a language model? So language models are already a thing, you know, they were not a popular or well-known thing, but they were a thing. 
But language models exist to this idea that you could train a model to fill in the gaps. Or actually in those days it wasn't fill in the gaps, it was finish a string. And in fact, Andrej Karpathy did his fantastic RNN demonstration from this at a similar time where he showed like you can have it ingest Shakespeare and it will generate something that looks a bit like Shakespeare. I thought, okay, so if I do this at a much bigger scale, using all of Wikipedia, what would it need to be able to do to finish a sentence in Wikipedia effectively, to do it quite accurately quite often? I thought, geez, it would actually have to know a lot about the world, you know, it'd have to know that there is a world and that there are objects and that objects relate to each other through time and cause each other to react in ways and that causes proceed effects and that, you know, when there are animals and there are people and that people can be in certain positions during certain timeframes and then you could, you know, all that together, you can then finish a sentence like this was signed into law in 2016 by US President X and it would fill in the gap, you know. So that's why I tried to create what in those days was considered a big language model trained on the entirety on Wikipedia, which is that was, you know, a bit unheard of. And my interest was not in, you know, just having a language model. My interest was in like, what latent capabilities would such a system have that would allow it to finish those kind of sentences? Because I was pretty sure, based on our work with transfer learning and vision, that I could then suck out those latent capabilities by transfer learning, you know, by fine-tuning it on a task data set or whatever. So we generated this three-step system. So step one was train a language model on a big corpus. Step two was fine-tune a language model on a more curated corpus. And step three was further fine-tune that model on a task. And of course, that's what everybody still does today, right? That's what ChatGPT is. And so the first time I tried it within hours, I had a new state-of-the-art academic result on IMDB. And I was like, holy s**t, it does work. And so you asked, to what degree was this kind of like pushing against the established wisdom? You know, every way. Like the reason it took me so long to try it was because I asked all my friends in NLP if this could work. And everybody said, no, it definitely won't work. It wasn't like, oh, maybe. Everybody was like, it definitely won't work. NLP is much more complicated than vision. Language is a much more vastly complicated domain. You know, and you've got problems like the grounding problem. We know from like philosophy and theory of mind that it's actually impossible for it to work. So yeah, so don't waste your time. [00:15:10]Alessio: Jeremy, had people not tried because it was like too complicated to actually get the data and like set up the training? Or like, were people just lazy and kind of like, hey, this is just not going to work? [00:15:20]Jeremy: No, everybody wasn't lazy. So like, so the person I thought at that time who, you know, there were two people I thought at that time, actually, who were the strongest at language models were Stephen Merity and Alec Radford. And at the time I didn't know Alec, but I, after we had both, after I'd released ULM Fit and he had released GPT, I organized a chat for both of us with Kate Metz in the New York Times. And Kate Metz answered, sorry, and Alec answered this question for Kate. 
And Kate was like, so how did, you know, GPT come about? And he said, well, I was pretty sure that pre-training on a general large corpus wouldn't work. So I hadn't tried it. And then I read ULM Fit and turns out it did work. And so I did it, you know, bigger and it worked even better. And similar with, with Stephen, you know, I asked Stephen Merity, like, why don't we just find, you know, take your AWD-ASTLM and like train it on all of Wikipedia and fine tune it? And he's kind of like, well, I don't think that's going to really lie. Like two years before I did a very popular talk at KDD, the conference where everybody in NLP was in the audience. I recognized half the faces, you know, and I told them all this, I'm sure transfer learning is the key. I'm sure ImageNet, you know, is going to be an NLP thing as well. And, you know, everybody was interested and people asked me questions afterwards and, but not just, yeah, nobody followed up because everybody knew that it didn't work. I mean, even like, so we were scooped a little bit by Dai and Lee, Kwok Lee at Google. They had, they had, I already, I didn't even realize this, which is a bit embarrassing. They had already done a large language model and fine tuned it. But again, they didn't create a general purpose, large language model on a general purpose corpus. They only ever tested a domain specific corpus. And I haven't spoken to Kwok actually about that, but I assume that the reason was the same. It probably just didn't occur to them that the general approach could work. So maybe it was that kind of 30 years of mulling over the, the cell Chinese room experiment that had convinced me that it probably would work. I don't know. Yeah. [00:17:48]Alessio: Interesting. I just dug up Alec announcement tweet from 2018. He said, inspired by Cobe, Elmo, and Yola, I'm fit. We should have a single transformer language model can be fine tuned to a wide variety. It's interesting because, you know, today people think of AI as the leader, kind of kind of like the research lab pushing forward the field. What was that at the time? You know, like kind of like going back five years, people think of it as an overnight success, but obviously it took a while. [00:18:16]Swyx: Yeah. Yeah. [00:18:17]Jeremy: No, I mean, absolutely. And I'll say like, you know, it's interesting that it mentioned Elmo because in some ways that was kind of diametrically opposed to, to ULM fit. You know, there was these kind of like, so there was a lot of, there was a lot of activity at the same time as ULM fits released. So there was, um, so before it, as Brian McCann, I think at Salesforce had come out with this neat model that did a kind of multitask learning, but again, they didn't create a general fine tune language model first. There was Elmo, um, which I think was a lip, you know, actually quite a few months after the first ULM fit example, I think. Um, but yeah, there was a bit of this stuff going on. And the problem was everybody was doing, and particularly after GPT came out, then everybody wanted to focus on zero shot and few shot learning. You know, everybody hated fine tuning. Everybody hated transfer learning. And like, I literally did tours trying to get people to start doing transfer learning and people, you know, nobody was interested, particularly after GPT showed such good results with zero shot and few shot learning. 
And so I actually feel like we kind of went backwards for years and, and not to be honest, I mean, I'm a bit sad about this now, but I kind of got so disappointed and dissuaded by like, it felt like these bigger lab, much bigger labs, you know, like fast AI had only ever been just me and Rachel were getting all of this attention for an approach I thought was the wrong way to do it. You know, I was convinced was the wrong way to do it. And so, yeah, for years people were really focused on getting better at zero shot and few shots and it wasn't until, you know, this key idea of like, well, let's take the ULM fit approach, but for step two, rather than fine tuning on a kind of a domain corpus, let's fine tune on an instruction corpus. And then in step three, rather than fine tuning on a reasonably specific task classification, let's fine tune on a, on a RLHF task classification. And so that was really, that was really key, you know, so I was kind of like out of the NLP field for a few years there because yeah, it just felt like, I don't know, pushing uphill against this vast tide, which I was convinced was not the right direction, but who's going to listen to me, you know, cause I, as you said, I don't have a PhD, not at a university, or at least I wasn't then. I don't have a big set of computers to fine tune huge transformer models. So yeah, it was definitely difficult. It's always been hard. You know, it's always been hard. Like I've always been somebody who does not want to build stuff on lots of big computers because most people don't have lots of big computers and I hate creating stuff that most people can't use, you know, and also stuff that's created on lots of big computers has always been like much more media friendly. So like, it might seem like a recent thing, but actually throughout my 30 years in data science, the attention's always been on, you know, the big iron results. So when I first started, everybody was talking about data warehouses and it was all about Teradata and it'd be like, oh, this big bank has this huge room full of computers and they have like terabytes of data available, you know, at the press of a button. And yeah, that's always what people want to talk about, what people want to write about. And then of course, students coming out of their PhDs and stuff, that's where they want to go work because that's where they read about. And to me, it's a huge distraction, you know, because like I say, most people don't have unlimited compute and I want to help most people, not the small subset of the most well-off people. [00:22:16]Alessio: That's awesome. And it's great to hear, you do such a great job educating that a lot of times you're not telling your own story, you know? So I love this conversation. And the other thing before we jump into Fast.AI, actually, a lot of people that I know, they run across a new architecture and whatnot, they're like, I got to start a company and raise a bunch of money and do all of this stuff. And say, you were like, I want everybody to have access to this. Why was that the case for you? Was it because you already had a successful venture in like FastMail and you were more interested in that? What was the reasoning? [00:22:52]Jeremy: It's a really good question. So I guess the answer is yes, that's the reason why. So when I was a teenager, I thought it would be really cool to like have my own company. You know, I didn't know the word startup. I didn't know the word entrepreneur. I didn't know the word VC. 
And I didn't really know what any of those things were really until after we started Kaggle, to be honest. Even though the things I started were what we now call startups, I just thought they were just small businesses. You know, they were just companies. So yeah, so those two companies were FastMail and Optimal Decisions. FastMail was the first kind of synchronized email provider for non-businesses. So something you can get your same email at home, on your laptop, at work, on your phone, whatever. And then Optimal Decisions invented a new approach to insurance pricing. Something called profit-optimized insurance pricing. So I sold both of those companies, you know, after 10 years. And at that point, I had achieved the thing that as a teenager I had wanted to do. You know, it took a lot longer than it should have because I spent way longer in management consulting than I should have because I got caught up in that stupid rat race. But, you know, eventually I got there and I remember my mom saying to me, you must be so proud. You know, because she remembered my dream. She's like, you've done it. And I kind of reflected and I was like, I'm not proud at all. You know, like people quite liked FastMail. You know, it's quite nice to have synchronized email. It probably would have happened anyway. Yeah, I'm certainly not proud that I've helped some insurance companies suck more money out of their customers. Yeah, no, I'm not proud. You know, it's actually, I haven't really helped the world very much. You know, maybe in the insurance case I've made it a little bit worse. I don't know. So, yeah, I was determined to not waste more years of my life doing things, working hard to do things which I could not be reasonably sure would have a lot of value. So, you know, I took some time off. I wasn't sure if I'd ever work again, actually. I didn't particularly want to, because it felt like, yeah, it felt like such a disappointment. And, but, you know, and I didn't need to. I had enough money. Like, I wasn't super rich, but I had enough money. I didn't need to work. And I certainly recognized that amongst the other people I knew who had enough money that they didn't need to work, they all worked ridiculously hard, you know, and constantly put themselves in extremely stressful situations. And I thought, I don't want to be one of those idiots who's tied to, you know, buying a bigger plane than the next guy or whatever. You know, Kaggle came along and I mainly kind of did that just because it was fun and interesting to hang out with interesting people. But, you know, with Fast.ai in particular, you know, Rachel and I had a very explicit, you know, long series of conversations over a long period of time about like, well, how can we be the most helpful to society as a whole, and particularly to those people who maybe need more help, you know? And so we definitely saw the world going in a potentially pretty dystopian direction if the world's most powerful technology was controlled by a small group of elites. So we thought, yeah, we should focus on trying to help that not happen. You know, sadly, it looks like it still is likely to happen. But I mean, I feel like we've helped make it a little bit less likely. So we've done our bit. [00:26:39]Swyx: You've shown that it's possible. And I think your constant advocacy, your courses, your research that you publish, you know, just the other day you published a finding on, you know, learning that I think is still something that people are still talking about quite a lot. 
I think that that is the origin story of a lot of people who are going to be, you know, little Jeremy Howards, furthering your mission with, you know, you don't have to do everything by yourself is what I'm saying. No, definitely. Definitely. [00:27:10]Jeremy: You know, that was a big takeaway from, like, Enlitic. At Enlitic it definitely felt like we had to do everything ourselves. And I kind of, I wanted to solve medicine. I'll say, yeah, okay, solving medicine is actually quite difficult. And I can't do it on my own. And there's a lot of other things I'd like to solve, and I can't do those either. So that was definitely the other piece was like, yeah, you know, can we create an army of passionate domain experts who can change their little part of the world? And that's definitely happened. Like I find nowadays, at least half the time, probably quite a bit more that I get in contact with somebody who's done really interesting work in some domain. Most of the time I'd say, they say, yeah, I got my start with fast.ai. So it's definitely, I can see that. And I also know from talking to folks at places like Amazon and Adobe and stuff, which, you know, there's lots of alumni there. And they say, oh my God, I got here. And like half of the people are fast.ai alumni. So it's fantastic. [00:28:13]Swyx: Yeah. [00:28:14]Jeremy: Actually, Andrej Karpathy grabbed me when I saw him at NeurIPS a few years ago. And he was like, I have to tell you, thanks for the fast.ai courses. When people come to Tesla and they need to know more about deep learning, we always send them to your course. And the OpenAI Scholars Program was doing the same thing. So it's kind of like, yeah, it's had a surprising impact, you know, that's just one of like three things we do is the course, you know. [00:28:40]Swyx: Yes. [00:28:40]Jeremy: And it's only ever been at most two people, either me and Rachel or me and Sylvain; nowadays, it's just me. So yeah, I think it shows you don't necessarily need a huge amount of money and a huge team of people to make an impact. [00:28:56]Swyx: Yeah. So just to reintroduce fast.ai for people who may not have dived into it much, there is the courses that you do. There is the library that is very well loved. And I kind of think of it as a nicer layer on top of PyTorch that people should start with by default and use it as the basis for a lot of your courses. And then you have like NBDev, which I don't know, is that the third one? [00:29:27]Jeremy: Oh, so the three areas were research, software, and courses. [00:29:32]Swyx: Oh, sorry. [00:29:32]Jeremy: So then in software, you know, fast.ai is the main thing, but NBDev is not far behind. But then there's also things like FastCore, GHAPI, I mean, dozens of open source projects that I've created and some of them have been pretty popular and some of them are still a little bit hidden, actually. Some of them I should try to do a better job of telling people about. [00:30:01]Swyx: What are you thinking about? Yeah, what's on the way? Oh, I don't know, just like little things. [00:30:04]Jeremy: Like, for example, for working with EC2 and AWS, I created a FastEC2 library, which I think is like way more convenient and nice to use than anything else out there. And it's literally got a whole autocomplete, dynamic autocomplete that works both on the command line and in notebooks that'll like auto-complete your instance names and everything like that. You know, just little things like that. 
I try to make like, when I work with some domain, I try to make it like, I want to make it as enjoyable as possible for me to do that. So I always try to kind of like, like with GHAPI, for example, I think that GitHub API is incredibly powerful, but I didn't find it good to work with because I didn't particularly like the libraries that are out there. So like GHAPI, like FastEC2, it like autocompletes both at the command line or in a notebook or whatever, like literally the entire GitHub API. The entire thing is like, I think it's like less than 100K of code because it actually, as far as I know, the only one that grabs it directly from the official open API spec that GitHub produces. And like if you're in GitHub and you just type an API, you know, autocomplete API method and hit enter, it prints out the docs with brief docs and then gives you a link to the actual documentation page. You know, GitHub Actions, I can write now in Python, which is just so much easier than writing them in TypeScript and stuff. So, you know, just little things like that. [00:31:40]Swyx: I think that's an approach which more developers took to publish some of their work along the way. You described the third arm of FastAI as research. It's not something I see often. Obviously, you do do some research. And how do you run your research? What are your research interests? [00:31:59]Jeremy: Yeah, so research is what I spend the vast majority of my time on. And the artifacts that come out of that are largely software and courses. You know, so to me, the main artifact shouldn't be papers because papers are things read by a small exclusive group of people. You know, to me, the main artifacts should be like something teaching people, here's how to use this insight and here's software you can use that builds it in. So I think I've only ever done three first-person papers in my life, you know, and none of those are ones I wanted to do. You know, they were all ones that, like, so one was ULM Fit, where Sebastian Ruder reached out to me after seeing the course and said, like, you have to publish this as a paper, you know. And he said, I'll write it. He said, I want to write it because if I do, I can put it on my PhD and that would be great. And it's like, okay, well, I want to help you with your PhD. And that sounds great. So like, you know, one was the masks paper, which just had to exist and nobody else was writing it. And then the third was the Fast.ai library paper, which again, somebody reached out and said, please, please write this. We will waive the fee for the journal and everything and actually help you get it through publishing and stuff. So yeah, so I don't, other than that, I've never written a first author paper. So the research is like, well, so for example, you know, Dawn Bench was a competition, which Stanford ran a few years ago. It was kind of the first big competition of like, who can train neural nets the fastest rather than the most accurate. And specifically it was who can train ImageNet the fastest. And again, this was like one of these things where it was created by necessity. So Google had just released their TPUs. And so I heard from my friends at Google that they had put together this big team to smash Dawn Bench so that they could prove to people that they had to use Google Cloud and use their TPUs and show how good their TPUs were. And we kind of thought, oh s**t, this would be a disaster if they do that, because then everybody's going to be like, oh, deep learning is not accessible. 
[00:34:20]Swyx: You know, to actually be good at it, [00:34:21]Jeremy: you have to be Google and you have to use special silicon. And so, you know, we only found out about this 10 days before the competition finished. But, you know, we basically got together an emergency bunch of our students and Rachel and I and sat for the next 10 days and just tried to crunch through and try to use all of our best ideas that had come from our research. And so particularly progressive resizing, just basically train mainly on small things, train on non-square things, you know, stuff like that. And so, yeah, we ended up winning, thank God. And so, you know, we turned it around from being like, like, oh s**t, you know, this is going to show that you have to be Google and have TPUs to being like, oh my God, even the little guy can do deep learning. So that's an example of the kind of like research artifacts we do. And yeah, so all of my research is always, how do we do more with less, you know? So how do we get better results with less data, with less compute, with less complexity, with less education, you know, stuff like that. So ULMFiT's obviously a good example of that. [00:35:37]Swyx: And most recently you published, can LLMs learn from a single example? Maybe could you tell the story a little bit behind that? And maybe that goes a little bit too far into the learning of very low resource, the literature. [00:35:52]Jeremy: Yeah, yeah. So me and my friend, Jono Whitaker, basically had been playing around with this fun Kaggle competition, which is actually still running as we speak, which is, can you create a model which can answer multiple choice questions about anything that's in Wikipedia? And the thing that makes it interesting is that your model has to run on Kaggle within nine hours. And Kaggle's very, very limited. So you've only got 14 gig RAM, only two CPUs, and a small, very old GPU. So this is cool, you know, if you can do well at this, then this is a good example of like, oh, you can do more with less. So yeah, Jono and I were playing around with fine tuning, of course, transfer learning, pre-trained language models. And we saw this, like, so we always, you know, plot our losses as we go. So here's another thing we created. Actually, Sylvain Gugger, when he worked with us, created something called fastprogress, which is kind of like tqdm, but we think a lot better. So we look at our fastprogress curves, and they kind of go down, down, down, down, down, down, down, a little bit, little bit, little bit. And then suddenly go clunk, and they drop. And then down, down, down, down, down a little bit, and then suddenly clunk, they drop. We're like, what the hell? These clunks are occurring at the end of each epoch. So normally in deep learning, this would be, this is, you know, I've seen this before. It's always been a bug. It's always turned out that like, oh, we accidentally forgot to turn on eval mode during the validation set. So it was actually learning then, or, oh, we accidentally were calculating moving average statistics throughout the epoch. So, you know, so it's a recency-weighted moving average or whatever. And so we were using Hugging Face Trainer. So, you know, I did not give my friends at Hugging Face the benefit of the doubt. I thought, oh, they've fucked up Hugging Face Trainer, you know, idiots. Well, we'll use the fast.ai Learner instead. So we switched over to Learner. 
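As a rough illustration of what Jeremy is describing here, the per-epoch "clunks" are easy to see if you log the training loss at every step and mark where the epoch boundaries fall. This is a generic PyTorch sketch on a toy model and synthetic data, not the fastai/fastprogress code itself; the model, data, and hyperparameters are all stand-ins.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: in the real setting this would be a pre-trained language
# model and a fine-tuning corpus.
torch.manual_seed(0)
X = torch.randn(2048, 32)
y = torch.randint(0, 10, (2048,))
dl = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

step_losses, epoch_ends = [], []
for epoch in range(3):
    for xb, yb in dl:
        loss = loss_fn(model(xb), yb)
        opt.zero_grad()
        loss.backward()
        opt.step()
        step_losses.append(loss.item())
    epoch_ends.append(len(step_losses))  # mark where each epoch boundary falls

# Plot step_losses (e.g. with matplotlib) and draw vertical lines at the
# positions in epoch_ends: a sudden drop right at a boundary is the "clunk".
```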
We still saw the clunks and, you know, that's, yeah, it shouldn't really happen because semantically speaking in the epoch, isn't like, it's not a thing, you know, like nothing happens. Well, nothing's meant to happen when you go from ending one epoch to starting the next one. So there shouldn't be a clunk, you know. So I kind of asked around on the open source discords. That's like, what's going on here? And everybody was just like, oh, that's just what, that's just what these training curves look like. Those all look like that. Don't worry about it. And I was like, oh, are you all using Trainer? Yes. Oh, well, there must be some bug with Trainer. And I was like, well, we also saw it in Learner [00:38:42]Swyx: and somebody else is like, [00:38:42]Jeremy: no, we've got our own Trainer. We get it as well. They're just like, don't worry about it. It's just something we see. It's just normal. [00:38:48]Swyx: I can't do that. [00:38:49]Jeremy: I can't just be like, here's something that's like in the previous 30 years of neural networks, nobody ever saw it. And now suddenly we see it. [00:38:57]Swyx: So don't worry about it. [00:38:59]Jeremy: I just, I have to know why. [00:39:01]Swyx: Can I clarify? This is, was everyone that you're talking to, were they all seeing it for the same dataset or in different datasets? [00:39:08]Jeremy: Different datasets, different Trainers. They're just like, no, this is just, this is just what it looks like when you fine tune language models. Don't worry about it. You know, I hadn't seen it before, but I'd been kind of like, as I say, I, you know, I kept working on them for a couple of years after ULM fit. And then I kind of moved on to other things, partly out of frustration. So I hadn't been fine tuning, you know, I mean, Lama's only been out for a few months, right? But I wasn't one of those people who jumped straight into it, you know? So I was relatively new to the kind of Lama fine tuning world, where else these guys had been, you know, doing it since day one. [00:39:49]Swyx: It was only a few months ago, [00:39:51]Jeremy: but it's still quite a bit of time. So, so yeah, they're just like, no, this is all what we see. [00:39:56]Swyx: Don't worry about it. [00:39:56]Jeremy: So yeah, I, I've got a very kind of like, I don't know, I've just got this brain where I have to know why things are. And so I kind of, I ask people like, well, why, why do you think it's happening? And they'd be like, oh, it would pretty obviously, cause it's like memorize the data set. It's just like, that can't be right. It's only seen it once. Like, look at this, the loss has dropped by 0.3, 0.3, which is like, basically it knows the answer. And like, no, no, it's just, it is, it's just memorize the data set. So yeah. So look, Jono and I did not discover this and Jono and I did not come up with a hypothesis. You know, I guess we were just the ones, I guess, who had been around for long enough to recognize that like, this, this isn't how it's meant to work. And so we, we, you know, and so we went back and like, okay, let's just run some experiments, you know, cause nobody seems to have actually published anything about this. [00:40:51]Well, not quite true.Some people had published things, but nobody ever actually stepped back and said like, what the hell, you know, how can this be possible? Is it possible? Is this what's happening? And so, yeah, we created a bunch of experiments where we basically predicted ahead of time. 
It's like, okay, if this hypothesis is correct, that it's memorized in the training set, then we ought to see blah, under conditions, blah, but not under these conditions. And so we ran a bunch of experiments and all of them supported the hypothesis that it was memorizing the data set in a single thing at once. And it's a pretty big data set, you know, which in hindsight, it's not totally surprising because the theory, remember, of the ULMFiT theory was like, well, it's kind of creating all these latent capabilities to make it easier for it to predict the next token. So if it's got all this kind of latent capability, it ought to also be really good at compressing new tokens because it can immediately recognize it as like, oh, that's just a version of this. So it's not so crazy, you know, but it is, it requires us to rethink everything because like, and nobody knows like, okay, so how do we fine tune these things? Because like, it doesn't even matter. Like maybe it's fine. Like maybe it's fine that it's memorized the data set after one go and you do a second go and okay, the validation loss is terrible because it's now really overconfident. [00:42:20]Swyx: That's fine. [00:42:22]Jeremy: Don't, you know, don't, I keep telling people, don't track validation loss, track validation accuracy because at least that will still be useful. Just another thing that's got lost since ULMFiT, nobody tracks accuracy of language models anymore. But you know, it'll still keep learning and it does, it does keep improving. But is it worse? You know, like, is it like, now that it's kind of memorized it, it's probably getting a less strong signal, you know, I don't know. So I still don't know how to fine tune language models properly and I haven't found anybody who feels like they do, like nobody really knows whether this memorization thing is, it's probably a feature in some ways. It's probably some things that you can do usefully with it. It's probably, yeah, I have a feeling it's messing up training dynamics as well. [00:43:13]Swyx: And does it come at the cost of catastrophic forgetting as well, right? Like, which is the other side of the coin. [00:43:18]Jeremy: It does to some extent, like we know it does, like look at Code Llama, for example. So Code Llama was a, I think it was like a 500 billion token fine tuning of Llama 2 using code. And also pros about code that Meta did. And honestly, they kind of blew it because Code Llama is good at coding, but it's bad at everything else, you know, and it used to be good. Yeah, I was pretty sure it was like, before they released it, me and lots of people in the open source discords were like, oh my God, you know, we know this is coming, Jan Lukinsk saying it's coming. I hope they kept at least like 50% non-code data because otherwise it's going to forget everything else. And they didn't, only like 0.3% of their epochs were non-code data. So it did, it forgot everything else. So now it's good at code and it's bad at everything else. So we definitely have catastrophic forgetting. It's fixable, just somebody has to do, you know, somebody has to spend their time training a model on a good mix of data. Like, so, okay, so here's the thing. Even though I originally created three-step approach that everybody now does, my view is it's actually wrong and we shouldn't use it. [00:44:36]Jeremy: And that's because people are using it in a way different to why I created it. You know, I created it thinking the task-specific models would be more specific. 
You know, it's like, oh, this is like a sentiment classifier as an example of a task, you know, but the tasks now are like a, you know, RLHF, which is basically like answer questions that make people feel happy about your answer. So that's a much more general task and it's a really cool approach. And so we see, for example, RLHF also breaks models, like, you know, like GPT-4 after RLHF. We know from kind of the work that Microsoft did, you know, that the pre-RLHF, the earlier, less aligned version was better. And these are all kind of examples of catastrophic forgetting. And so to me, the right way to do this is to fine-tune language models, is to actually throw away the idea of fine-tuning. There's no such thing. There's only continued pre-training. And pre-training is something where from the very start, you try to include all the kinds of data that you care about, all the kinds of problems that you care about, instructions, exercises, code, general purpose document completion, whatever. And then as you train, you gradually curate that, you know, you gradually make that higher and higher quality and more and more specific to the kinds of tasks you want it to do. But you never throw away any data. You always keep all of the data types there in reasonably high quantities. You know, maybe the quality filter, you stop training on low quality data, because that's probably fine to forget how to write badly, maybe. So yeah, that's now my view, is I think ULMFiT is the wrong approach. And that's why we're seeing a lot of these, you know, so-called alignment taxes and this view of like, oh, a model can't both code and do other things. And, you know, I think it's actually because people are training them wrong. [00:46:47]Swyx: Yeah, well, I think you have a clear [00:46:51]Alessio: anti-laziness approach. I think other people are not as good hearted, you know, they're like, [00:46:57]Swyx: hey, they told me this thing works. [00:46:59]Alessio: And if I release a model this way, people will appreciate it, I'll get promoted and I'll kind of make more money. [00:47:06]Jeremy: Yeah, and it's not just money. It's like, this is how citations work most badly, you know, so if you want to get cited, you need to write a paper that people in your field recognize as an advancement on things that we know are good. And so we've seen this happen again and again. So like I say, like zero shot and few shot learning, everybody was writing about that. Or, you know, with image generation, everybody just was writing about GANs, you know, and I was trying to say like, no, GANs are not the right approach. You know, and I showed again through research that we demonstrated in our videos that you can do better than GANs, much faster and with much less data. And nobody cared because again, like if you want to get published, you write a GAN paper that slightly improves this part of GANs and this tiny field, you'll get published, you know. So it's, yeah, it's not set up for real innovation. It's, you know, again, it's really helpful for me, you know, I have my own research lab with nobody telling me what to do and I don't even publish. So it doesn't matter if I get citations. And so I just write what I think actually matters. I wish there was, and, you know, and actually places like OpenAI, you know, the researchers there can do that as well. It's a shame, you know, I wish there was more academic, open venues in which people can focus on like genuine innovation. 
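To make Jeremy's "continued pre-training" point a little more concrete: the idea is that every data type stays in the mix for the whole run, and only the sampling weights shift as training progresses. Below is a minimal sketch of what such a mixture could look like; the source names, weights, and schedule are invented placeholders, not anyone's actual training recipe.

```python
import random
from torch.utils.data import Dataset

class MixedCorpus(Dataset):
    """Draws examples from several sources with adjustable mixture weights."""

    def __init__(self, sources, weights, length):
        self.sources = sources          # dict: name -> list of examples
        self.weights = dict(weights)    # dict: name -> sampling weight
        self.length = length

    def set_weights(self, weights):
        # Re-weight the mix between phases; nothing is ever dropped entirely.
        self.weights = dict(weights)

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        name = random.choices(list(self.weights), weights=list(self.weights.values()))[0]
        examples = self.sources[name]
        return examples[idx % len(examples)]

# Placeholder corpora; in practice these would be tokenized documents.
sources = {
    "web_text":     ["<web doc>"] * 1000,
    "code":         ["<code file>"] * 1000,
    "instructions": ["<instruction example>"] * 1000,
}

corpus = MixedCorpus(sources, {"web_text": 0.7, "code": 0.2, "instructions": 0.1}, length=10_000)
# A later, more curated phase keeps every type present, just re-weighted.
corpus.set_weights({"web_text": 0.3, "code": 0.3, "instructions": 0.4})
```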
[00:48:38]Swyx: Twitter, which is unironically has become a little bit of that forum. I wanted to follow up on one thing that you mentioned, which is that you checked around the open source discords. I don't know if it's too pushy to ask like what discords are lively or useful right now. I think that something I definitely felt like I missed out on was the early days of EleutherAI, which is a very hard bit. And, you know, like what is the new Eleuther? And you actually shouted out the Alignment Lab AI discord in your blog post. And that was the first time I even knew, like I saw them on Twitter, never knew they had a discord, never knew that there was actually substantive discussions going on in there and that you were an active member of it. Okay, yeah. [00:49:23]Jeremy: And then even then, if you do know about that and you go there, it'll look like it's totally dead. And that's because unfortunately, nearly all the discords, nearly all of the conversation happens in private channels. You know, and that's, I guess. [00:49:35]Swyx: How does someone get into that world? Because it's obviously very, very instructive, right? [00:49:42]Jeremy: You could just come to the fast.ai discord, which I'll be honest with you, it's less bustling than some of the others, but it's not terrible. And so like, at least, to be fair, one of our more bustling channels is private. [00:49:57]Swyx: I guess. [00:49:59]Jeremy: So I'm just thinking. [00:50:01]Swyx: It's just the nature of quality discussion, right? Yeah, I guess when I think about it, [00:50:05]Jeremy: I didn't have any private discussions on our discord for years, but there was a lot of people who came in with like, oh, I just had this amazing idea for AGI. If you just thought about like, if you imagine that AI is a brain, then we, you know, this just, I don't want to talk about it. You know, I don't want to like, you don't want to be dismissive or whatever. And it's like, oh, well, that's an interesting comment, but maybe you should like, try training some models first to see if that aligns with your intuition. Like, oh, but how could I possibly learn? It's like, well, we have a course, just actually spend time learning. Like, you know, anyway. And there's like, okay, I know the people who always have good answers there. And so I created a private channel and put them all in it. And I got to admit, that's where I post more often because there's much less, you know, flight of fancy views about how we could solve AGI, blah, blah, blah. So there is a bit of that. But having said that, like, I think the bar is pretty low. Like if you join a Discord and you can hit the like participants or community or whatever button, you can see who's in it. And then you'll see at the top, who the admins or moderators or people in the dev role are. And just DM one of them and say like, oh, here's my GitHub. Well, here's some blog posts I wrote. You know, I'm interested in talking about this, you know, can I join the private channels? And I've never heard of anybody saying no. I will say, you know, Eleuther's all pretty open. So you can do the Eleuther Discord still. You know, one problem with the Eleuther Discord is it's been going on for so long that it's like, it's very inside baseball. It's quite hard to get started. Yeah. Carper AI, I think it's all open. That's adjacent to Stability. That's more accessible. [00:52:03]Swyx: Yeah. 
[00:52:04]Jeremy: There's also, just recently, Nous Research, that does like the Hermes models and datasets, just opened. They've got some private channels, but it's pretty open, I think. You mentioned Alignment Lab, that one it's all the interesting stuff is on private channels. So just ask. If you know me, ask me, cause I've got admin on that one. There's also, yeah, OS Skunkworks, OS Skunkworks AI is a good Discord, which I think it's open. So yeah, they're all pretty good. [00:52:40]Swyx: I don't want you to leak any, you know, Discords that don't want any publicity, but this is all helpful. [00:52:46]Jeremy: We all want people, like we all want people. [00:52:49]Swyx: We just want people who like, [00:52:51]Jeremy: want to build stuff, rather than people who, and like, it's fine to not know anything as well, but if you don't know anything, but you want to tell everybody else what to do and how to do it, that's annoying. If you don't know anything and want to be told like, here's a really small kind of task that as somebody who doesn't know anything is going to take you a really long time to do, but it would still be helpful. Then, and then you go and do it. That would be great. The truth is, yeah, [00:53:19]Swyx: like, I don't know, [00:53:20]Jeremy: maybe 5% of people who come in with great enthusiasm and saying that they want to learn and they'll do anything. [00:53:25]Swyx: And then somebody says like, [00:53:25]Jeremy: okay, here's some work you can do. Almost nobody does that work. So if you're somebody who actually does the work and follows up, you will massively stand out. That's an extreme rarity. And everybody will then want to help you do more work. [00:53:41]Swyx: So yeah. [00:53:41]Jeremy: So just, yeah, just do work and people will want to support you. [00:53:47]Alessio: Our Discord used to be referral only for a long time. We didn't have a public invite and then we opened it and they're kind of like channel gating. Yeah. A lot of people just want to do, I remember it used to be like, you know, a forum moderator. [00:54:00]Swyx: It's like people just want to do [00:54:01]Alessio: like drive-by posting, [00:54:03]Swyx: you know, and like, [00:54:03]Alessio: they don't want to help the community. They just want to get their question answered. [00:54:07]Jeremy: I mean, the funny thing is our forum community does not have any of that garbage. You know, there's something specific about the low latency thing where people like expect an instant answer. And yeah, we're all somehow in a forum thread where they know it's like there forever. People are a bit more thoughtful, but then the forums are less active than they used to be because Discord has got more popular, you know? So it's all a bit of a compromise, you know, running a healthy community is, yeah, it's always a bit of a challenge. All right, we got so many more things [00:54:47]Alessio: we want to dive in, but I don't want to keep you here for hours. [00:54:50]Swyx: This is not the Lex Fridman podcast [00:54:52]Alessio: we always like to say. One topic I would love to maybe chat a bit about is Mojo, Modular, you know, Chris Lattner, who we had on the podcast. So we want to spend a little time there. You recently did a hacker's guide to language models and you ran through everything from quantized models to like smaller models, larger models, and all of that. But obviously Modular is taking its own approach. Yeah, what got you excited? 
I know you and Chris have been talking about this for like years and a lot of the ideas you had, so. [00:55:23]Jeremy: Yeah, yeah, yeah, yeah, no, absolutely. So I met Chris, I think it was at the first TensorFlow Dev Summit. And I don't think he had even like, I'm not sure if he'd even officially started his employment with Google at that point. So I don't know, you know, certainly nothing had been mentioned. So I, you know, I admired him from afar with LLVM and Swift and whatever. And so I saw him walk into the courtyard at Google. It's just like, oh s**t, man, that's Chris Lattner. I wonder if he would lower his standards enough to talk to me. Well, worth a try. So I plucked up my courage because like nobody was talking to him. He looked a bit lost and I wandered over and it's like, oh, you're Chris Lattner, right? It's like, what are you doing here? What are you doing here? And I was like, yeah, yeah, yeah. It's like, oh, I'm Jeremy Howard. It's like, oh, do you do some of this AI stuff? And I was like, yeah, yeah, I like this AI stuff. Are you doing AI stuff? It's like, well, I'm thinking about starting to do some AI stuff. Yeah, I think it's going to be cool. And it's like, wow. So like, I spent the next half hour just basically brain dumping all the ways in which AI was stupid to him. And he listened patiently. And I thought he probably wouldn't even remember or care or whatever. But yeah, then I kind of like, I guess I re-caught up with him a few months later. And it's like, I've been thinking about everything you said in that conversation. And he like narrated back his response to every part of it, projects he was planning to do. And it's just like, oh, this dude follows up. Holy s**t. And I was like, wow, okay. And he was like, yeah, so we're going to create this new thing called Swift for TensorFlow. And it's going to be like, it's going to be a compiler with auto differentiation built in. And blah, blah, blah. And I was like, why would that help? [00:57:10]Swyx: You know, why would you? [00:57:10]Jeremy: And he was like, okay, with a compiler during the forward pass, you don't have to worry about saving context, you know, because a lot will be optimized in the backward. But I was like, oh my God. Because I didn't really know much about compilers. You know, I knew enough to kind of like, understand the ideas, but it hadn't occurred to me that a compiler basically solves a lot of the problems we have as end users. I was like, wow, that's amazing. Okay, you do know, right, that nobody's going to use this unless it's like usable. It's like, yeah, I know, right. So I was thinking you should create like a fast AI for this. So, okay, but I don't even know Swift. And he was like, well, why don't you start learning it? And if you have any questions, ask me. It's just like, holy s**t. Like, not only has Chris Lattner lowered his standards enough to talk to me, but he's offering me personal tutoring on the programming language that he made. So I was just like, I'm not g

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We're collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)!If AI is so important, why is its software so bad?This was the motivating question for Chris Lattner as he reconnected with his product counterpart on Tensorflow, Tim Davis, and started working on a modular solution to the problem of sprawling, monolithic, fragmented platforms in AI development. They announced a $30m seed in 2022 and, following their successful double launch of Modular/Mojo

Solfate Podcast - Interviews with blockchain founders/builders on Solana
Write Solidity on Solana with Solang (feat. Sean Young, Solana Labs)

Solfate Podcast - Interviews with blockchain founders/builders on Solana

Play Episode Listen Later Aug 23, 2023 55:00


Follow the @SolfatePod show on Twitter for updates. Thanks for listening frens :)

Notes from the show

The creator and lead developer of Solang, Sean Young, a compiler that allows developers to write Solana programs (aka smart contracts) in the Solidity programming language. This has been a multi year effort to allow existing Solidity developers, like all those existing in the Ethereum ecosystem, to use their existing language knowledge to write Solidity smart contracts on the Solana blockchain.

Sean describes how he started his developer journey in the blockchain space, starting as writing his own compiler for the Solidity programming language for a EVM compatible blockchain for the purpose of processing traditional documents.

Sean began hitting roadblocks when he was trying to add new features into the Solidity language, which is effectively only used for Ethereum and EVM compatible blockchains and maintained by the Ethereum community.

As a general overview, Sean describes how a compiler actually works. Including how compilers like Solang and even native Solana use the LLVM toolkit (Low Level Virtual Machine) to maximize compatibility for multiple programming languages.

Words and acronyms used throughout the episode

solidity - A statically-typed curly-braces programming language designed for developing smart contracts that run on Ethereum and most EVM compatible blockchains.
EVM - the Ethereum Virtual Machine - essentially the portion of any Ethereum based blockchain that actually runs/executes smart contracts written in the Solidity programming language
EIP - Ethereum Improvement Proposals - standards specifying potential new features or processes for Ethereum
WASM - Web Assembly - is a binary instruction format for a stack-based virtual machine
LLVM - Low Level Virtual Machine - a set of compiler and toolchain technologies that can be used to develop a frontend for any programming language and a backend for any instruction set architecture.

Solana specific terms (or at least common in the Solana ecosystem):

BPF - Berkeley Packet Filter - a technology used in certain computer operating systems for programs that need to, among other things, analyze network traffic.
SBF (aka SBPF) - Solana Berkeley Packet Filter - this is a custom implementation of BPF with tweaks for the Solana runtime and SVM
SVM - Solana Virtual Machine - the portion of the Solana runtime that actually runs/executes code on the Solana blockchain
IDL - Interface Definition Language - generic term for a language that lets a program or object written in one language communicate with another program written in an unknown language

Find Sean and Solang online

Follow Sean on twitter
Solang's documentation
Solang getting started guide

Follow us around

Nick
twitter: @nickfrosty
github: github.com/nickfrosty
website: https://nick.af

James
twitter: @jamesrp13
github: github.com/jamesrp13

Solfate Podcast
twitter: @SolfatePod
more podcast episodes: solfate.com/podcast

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Aug 10, 2023 52:10


We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you'd like to support.

We are facing a massive GPU crunch. As both startups and VC's hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There's just one weird trick: compilation. And there's one person uniquely qualified to do it.

We had the pleasure to sit down with Tianqi Chen, who's an Assistant Professor at CMU, where he both teaches the MLC course and runs the MLC group. You might also know him as the creator of XGBoost, Apache TVM, and MXNet, as well as the co-founder of OctoML. The MLC (short for Machine Learning Compilation) group has released a lot of interesting projects:

* MLC Chat: an iPhone app that lets you run models like RedPajama-3B and Vicuna-7B on-device. It gets up to 30 tok/s!
* Web LLM: Run models like LLaMA-70B in your browser (!!) to offer local inference in your product.
* MLC LLM: a framework that allows any language models to be deployed natively on different hardware and software stacks.

The MLC group has just announced new support for AMD cards; we previously talked about the shortcomings of ROCm, but using MLC you can get performance very close to the NVIDIA's counterparts. This is great news for founders and builders, as AMD cards are more readily available. Here are their latest results on AMD's 7900s vs some of top NVIDIA consumer cards.

If you just can't get a GPU at all, MLC LLM also supports ARM and x86 CPU architectures as targets by leveraging LLVM. While speed performance isn't comparable, it allows for non-time-sensitive inference to be run on commodity hardware.

We also enjoyed getting a peek into TQ's process, which involves a lot of sketching:

With all the other work going on in this space with projects like ggml and Ollama, we're excited to see GPUs becoming less and less of an issue to get models in the hands of more people, and innovative software solutions to hardware problems!

Show Notes
* TQ's Projects:
* XGBoost
* Apache TVM
* MXNet
* MLC
* OctoML
* CMU Catalyst
* ONNX
* GGML
* Mojo
* WebLLM
* RWKV
* HiPPO
* Tri Dao's Episode
* George Hotz Episode

People:
* Carlos Guestrin
* Albert Gu

Timestamps
* [00:00:00] Intros
* [00:03:41] The creation of XGBoost and its surprising popularity
* [00:06:01] Comparing tree-based models vs deep learning
* [00:10:33] Overview of TVM and how it works with ONNX
* [00:17:18] MLC deep dive
* [00:28:10] Using int4 quantization for inference of language models
* [00:30:32] Comparison of MLC to other model optimization projects
* [00:35:02] Running large language models in the browser with WebLLM
* [00:37:47] Integrating browser models into applications
* [00:41:15] OctoAI and self-optimizing compute
* [00:45:45] Lightning Round

Transcript

Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, writer and editor of Latent Space. [00:00:20]Swyx: Okay, and we are here with Tianqi Chen, or TQ as people call him, who is assistant professor in ML computer science at CMU, Carnegie Mellon University, also helping to run Catalyst Group, also chief technologist of OctoML. You wear many hats. Are those, you know, your primary identities these days? Of course, of course. [00:00:42]Tianqi: I'm also, you know, very enthusiastic open source. 
So I'm also a VP and PRC member of the Apache TVM project and so on. But yeah, these are the things I've been up to so far. [00:00:53]Swyx: Yeah. So you did Apache TVM, XGBoost, and MXNet, and we can cover any of those in any amount of detail. But maybe what's one thing about you that people might not learn from your official bio or LinkedIn, you know, on the personal side? [00:01:08]Tianqi: Let me say, yeah, so normally when I do, I really love coding, even though like I'm trying to run all those things. So one thing that I keep a habit on is I try to do sketchbooks. I have a book, like real sketchbooks to draw down the design diagrams and the sketchbooks I keep sketching over the years, and now I have like three or four of them. And it's kind of a usually a fun experience of thinking the design through and also seeing how open source project evolves and also looking back at the sketches that we had in the past to say, you know, all these ideas really turn into code nowadays. [00:01:43]Alessio: How many sketchbooks did you get through to build all this stuff? I mean, if one person alone built one of those projects, he'll be a very accomplished engineer. Like you built like three of these. What's that process like for you? Like it's the sketchbook, like the start, and then you think about the code or like. [00:01:59]Swyx: Yeah. [00:02:00]Tianqi: So, so usually I start sketching on high level architectures and also in a project that works for over years, we also start to think about, you know, new directions, like of course generative AI language model comes in, how it's going to evolve. So normally I would say it takes like one book a year, roughly at that rate. It's usually fun to, I find it's much easier to sketch things out and then gives a more like a high level architectural guide for some of the future items. Yeah. [00:02:28]Swyx: Have you ever published this sketchbooks? Cause I think people would be very interested on, at least on a historical basis. Like this is the time where XGBoost was born, you know? Yeah, not really. [00:02:37]Tianqi: I started sketching like after XGBoost. So that's a kind of missing piece, but a lot of design details in TVM are actually part of the books that I try to keep a record of. [00:02:48]Swyx: Yeah, we'll try to publish them and publish something in the journals. Maybe you can grab a little snapshot for visual aid. Sounds good. [00:02:57]Alessio: Yeah. And yeah, talking about XGBoost, so a lot of people in the audience might know it's a gradient boosting library, probably the most popular out there. And it became super popular because many people started using them in like a machine learning competitions. And I think there's like a whole Wikipedia page of like all state-of-the-art models. They use XGBoost and like, it's a really long list. When you were working on it, so we just had Tri Dao, who's the creator of FlashAttention on the podcast. And I asked him this question, it's like, when you were building FlashAttention, did you know that like almost any transform race model will use it? And so I asked the same question to you when you were coming up with XGBoost, like, could you predict it would be so popular or like, what was the creation process? And when you published it, what did you expect? We have no idea. [00:03:41]Tianqi: Like, actually, the original reason that we built that library is that at that time, deep learning just came out. Like that was the time where AlexNet just came out. 
And one of the ambitious missions that myself and my advisor, Carlos Guestrin, had then is we want to think about, you know, try to test the hypothesis. Can we find alternatives to deep learning models? Because then, you know, there are other alternatives like, you know, support vector machines, linear models, and of course, tree-based models. And our question was, if you build those models and feed them with big enough data, because usually like one of the key characteristics of deep learning is that it's taking a lot [00:04:22]Swyx: of data, right? [00:04:23]Tianqi: So we will be able to get the same amount of performance. That's a hypothesis we're setting out to test. Of course, if you look at now, right, that's a wrong hypothesis, but as a byproduct, what we find out is that, you know, most of the gradient boosting library out there is not efficient enough for us to test that hypothesis. So I happen to have quite a bit of experience in the past of building gradient boosting trees and their variants. So effectively, XGBoost was kind of like a byproduct of that hypothesis testing. At that time, I'm also competing a bit in data science challenges, like I worked on KDDCup and then Kaggle kind of became bigger, right? So I kind of think maybe it's becoming useful to others. One of my friends convinced me to try to do a Python binding of it. That turned out to be like a very good decision, right? Usually when I build it, we feel like maybe a command line interface is okay. And now we have a Python binding, we have R bindings. And then we realized, you know, it started getting interesting. People started contributing different perspectives, like visualization and so on. So we started to push a bit more on to building distributive support to make sure it works on any platform and so on. And even at that time point, when I talked to Carlos, my advisor, later, he said he never anticipated that we'll get to that level of success. And actually, why I pushed for gradient boosting trees, interestingly, at that time, he also disagreed. He thinks that maybe we should go for kernel machines then. And it turns out, you know, actually, we are both wrong in some sense, and Deep Neural Network was the king of the hill. But at least the gradient boosting direction got into something fruitful. [00:06:01]Swyx: Interesting. [00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of like coming up with it? And how much of it is a collaborative with like other people that you're working with versus like trying to be, you know, obviously, in academia, it's like very paper-driven kind of research driven. [00:06:19]Tianqi: I would say the XGBoost improvement at that time point was more on like, you know, I'm trying to figure out, right. But it's combining lessons. Before that, I did work on some of the other libraries on matrix factorization. That was like my first open source experience. Nobody knew about it, because you'll find, likely, if you go and try to search for the package SVDFeature, you'll find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I'm trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM is much, much more collaborative in a sense that... But, of course, XGBoost has become bigger, right? So when we started that project myself, and then we have, it's really amazing to see people come in. 
Michael, who was a lawyer, and now he works on the AI space as well, on contributing visualizations. Now we have people from our community contributing different things. So XGBoost even today, right, it's a community of committers driving the project. So it's definitely something collaborative and moving forward on getting some of the things continuously improved for our community. [00:07:37]Alessio: Let's talk a bit about TVM too, because we got a lot of things to run through in this episode. [00:07:42]Swyx: I would say that at some point, I'd love to talk about this comparison between XGBoost or tree-based type AI or machine learning compared to deep learning, because I think there is a lot of interest around, I guess, merging the two disciplines, right? And we can talk more about that. I don't know where to insert that, by the way, so we can come back to it later. Yeah. [00:08:04]Tianqi: Actually, what I said, when we test the hypothesis, the hypothesis is kind of, I would say it's partially wrong, because the hypothesis we want to test now is, can you run tree-based models on image classification tasks, where deep learning is certainly a no-brainer right [00:08:17]Swyx: now today, right? [00:08:18]Tianqi: But if you try to run it on tabular data, still, you'll find that most people opt for tree-based models. And there's a reason for that, in the sense that when you are looking at tree-based models, the decision boundaries are naturally rules that you're looking at, right? And they also have nice properties, like being able to be agnostic to scale of input and be able to automatically compose features together. And I know there are attempts on building neural network models that work for tabular data, and I also sometimes follow them. I do feel like it's good to have a bit of diversity in the modeling space. Actually, when we're building TVM, we build cost models for the programs, and actually we are using XGBoost for that as well. I still think tree-based models are going to be quite relevant, because first of all, it's really easy to get it to work out of the box. And also, you will be able to get a bit of interpretability and control monotonicity [00:09:18]Swyx: and so on. [00:09:19]Tianqi: So yes, it's still going to be relevant. I also sometimes keep coming back to think about, are there possible improvements that we can build on top of these models? And definitely, I feel like it's a space that can have some potential in the future. [00:09:34]Swyx: Are there any current projects that you would call out as promising in terms of merging the two directions? [00:09:41]Tianqi: I think there are projects that try to bring a transformer-type model for tabular data. I don't remember specifics of them, but I think even nowadays, if you look at what people are using, tree-based models are still one of their toolkits. So I think maybe eventually it's not even a replacement, it will be just an ensemble of models that you can call. Perfect. [00:10:07]Alessio: Next up, about three years after XGBoost, you built this thing called TVM, which is now a very popular compiler framework for models. Let's talk about, so this came out about at the same time as ONNX. So I think it would be great if you could maybe give a little bit of an overview of how the two things work together. Because it's kind of like the model, then goes to ONNX, then goes to the TVM. But I think a lot of people don't understand the nuances. Can we get a bit of a backstory on that? 
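For readers who want to see the tabular-data point above in code, here is a minimal XGBoost sketch on synthetic data, including the monotonic-constraint knob Tianqi alludes to (forcing predictions to be non-decreasing in a chosen feature). The data, features, and parameters are made up purely for illustration.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Synthetic tabular data: the target mostly goes up with feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))
y = (2 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=5000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.1,
    # Force predictions to be monotonically non-decreasing in feature 0 and
    # unconstrained in the rest, one of the control knobs tree models offer.
    monotone_constraints=(1, 0, 0, 0),
)
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```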
[00:10:33]Tianqi: So actually, that's kind of an ancient history. Before XGBoost, I worked on deep learning for two years or three years. I got a master's before I started my PhD. And during my master's, my thesis focused on applying convolutional restricted Boltzmann machine for ImageNet classification. That is the thing I'm working on. And that was before AlexNet moment. So effectively, I had to handcraft NVIDIA CUDA kernels on, I think, a GTX 2070 card. It took me about six months to get one model working. And eventually, that model is not so good, and we should have picked a better model. But that was like an ancient history that really got me into this deep learning field. And of course, eventually, we find it didn't work out. So in my master's, I ended up working on recommender system, which got me a paper, and I applied and got a PhD. But I always want to come back to work on the deep learning field. So after XGBoost, I think I started to work with some folks on this particular MXNet. At that time, it was like the frameworks like Caffe, Theano, PyTorch hadn't yet come out. And we're really working hard to optimize for performance on GPUs. At that time, I found it's really hard, even for NVIDIA GPU. It took me six months. And then it's amazing to see on different hardwares how hard it is to go and optimize code for the platforms that are interesting. So that gets me thinking, can we build something more generic and automatic? So that I don't need an entire team of so many people to go and build those frameworks. So that's the motivation of starting working on TVM. There was really too much machine learning engineering needed to support deep learning models on the platforms that we're interested in. I think it started a bit earlier than ONNX, but once it got announced, I think it's in a similar time period at that time. So overall, how it works is that TVM, you will be able to take a subset of machine learning programs that are represented in what we call a computational graph. Nowadays, we can also represent a loop-level program ingest from your machine learning models. Usually, you have model formats ONNX, or in PyTorch, they have FX Tracer that allows you to trace the FX graph. And then it goes through TVM. We also realized that, well, yes, it needs to be more customizable, so it will be able to perform some of the compilation optimizations like fusing operators together, doing smart memory planning, and more importantly, generate low-level code. So that works for NVIDIA and also is portable to other GPU backends, even non-GPU backends [00:13:36]Swyx: out there. [00:13:37]Tianqi: So that's a project that actually has been my primary focus over the past few years. And it's great to see how it started from where I think we are the very early initiator of machine learning compilation. I remember there was a visit one day, one of the students asked me, are you still working on deep learning frameworks? I tell them that I'm working on ML compilation. And they said, okay, compilation, that sounds very ancient. It sounds like a very old field. And why are you working on this? And now it's starting to get more traction, like if you say Torch Compile and other things. I'm really glad to see this field starting to pick up. And also we have to continue innovating here. 
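Here is a small sketch of the flow Tianqi just described, using the classic Relay front end from TVM's tutorials: import an ONNX graph, let the compiler do operator fusion, memory planning, and code generation for a chosen target, then run the result through TVM's runtime. The model file and the input name/shape are placeholders, and newer TVM Unity work uses the Relax IR instead, so treat this as illustrative rather than the one recommended path.

```python
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Placeholder ONNX model and input signature (assumed, not from the episode).
onnx_model = onnx.load("resnet18.onnx")
shape_dict = {"input": (1, 3, 224, 224)}

# Import into TVM's graph-level IR.
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Compile: fusion, memory planning, and low-level codegen happen inside build().
# Swap the target for "cuda", "metal", "webgpu", etc. to retarget the same model.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Run the compiled module through the runtime.
dev = tvm.cpu()
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
out = module.get_output(0).numpy()
```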
Why did you decide to do that? And I think the other thing about compiling to different GPUs and eventually CPUs too, did you already see some of the strain that models could have just being focused on one runtime, only being on CUDA and that, and how much of that went into it? [00:14:50]Tianqi: I think it's less about trying to get impact, more about wanting to have fun. I like to hack code, I had great fun hacking CUDA code. Of course, being able to generate CUDA code is cool, right? But now, after being able to generate CUDA code, okay, by the way, you can do it on other platforms, isn't that amazing? So it's more of that attitude to get me started on this. And also, I think when we look at different researchers, myself is more like a problem solver type. So I like to look at a problem and say, okay, what kind of tools we need to solve that problem? So regardless, it could be building better models. For example, while we built XGBoost, we built certain regularizations into it so that it's more robust. It also means building system optimizations, writing low-level code, maybe trying to write assembly and build compilers and so on. So as long as they solve the problem, definitely go and try to do them together. And I also see it's a common trend right now. Like if you want to be able to solve machine learning problems, it's no longer at the algorithm layer, right? You kind of need to solve it from both the algorithm, data and systems angle. And this entire field of machine learning system, I think it's kind of emerging. And there's now a conference around it. And it's really good to see a lot more people are starting to look into this. [00:16:10]Swyx: Yeah. Are you talking about ICML or something else? [00:16:13]Tianqi: So machine learning and systems, right? So not only machine learning, but machine learning and system. So there's a conference called MLsys. It's definitely a smaller community than ICML, but I think it's also an emerging and growing community where people are talking about what are the implications of building systems for machine learning, right? And how do you go and optimize things around that and co-design models and systems together? [00:16:37]Swyx: Yeah. And you were area chair for ICML and NeurIPS as well. So you've just had a lot of conference and community organization experience. Is that also an important part of your work? Well, it's kind of expected for academics. [00:16:48]Tianqi: If I hold an academic job, I need to do services for the community. Okay, great. [00:16:53]Swyx: Your most recent venture in MLsys is going to the phone with MLC LLM. You announced this in April. I have it on my phone. It's great. I'm running Llama 2, Vicuna. I don't know what other models that you offer. But maybe just kind of describe your journey into MLC. And I don't know how this coincides with your work at CMU. Is that some kind of outgrowth? [00:17:18]Tianqi: I think it's more like a focused effort that we want in the area of machine learning compilation. So it's kind of related to what we built in TVM. So when we built TVM was five years ago, right? And a lot of things happened. We built the end-to-end machine learning compiler that works, the first one that works. But then we captured a lot of lessons there. So then we are building a second iteration called TVM Unity. That allows us to be able to allow ML engineers to be able to quickly capture the new model and how we demand building optimizations for them. And MLC LLM is kind of like an MLC. 
It's more like a vertically driven effort where we go and build tutorials and build projects, like LLM solutions, to really show that, okay, you can take machine learning compilation technology, apply it, and bring something fun forward. Yeah. So yes, it runs on phones, which is really cool. But the goal here is not only making it run on phones, right? The goal is making it deploy universally. So we do run on Apple M2 Macs, the 70 billion models. Actually, for single batch inference, more recently on CUDA, we get, I think, the best performance you can get out there already for 4-bit inference. Actually, as I alluded to earlier before the podcast, we just had a result on AMD. And for a single batch, actually, on the latest AMD GPU, which is a consumer card, we can get to about 80% of the 4090, so NVIDIA's best consumer card out there. So it's not yet on par, but thinking about the diversity it enables and what you could previously get on that card, it's really amazing what you can do with this kind of technology. [00:19:10]Swyx: So one thing I'm a little bit confused by is that most of these models are in PyTorch, but you're running this inside TVM. I don't know. Was there any fundamental change that you needed to do, or was this basically the fundamental design of TVM? [00:19:25]Tianqi: So the idea is that, of course, it comes back to program representation, right? So effectively, TVM has this program representation called TVMScript that contains both a computational graph and operator-level representations. So yes, initially, we do need to take a bit of effort to bring those models onto the program representation that TVM supports. Usually, there is a mix of ways, depending on the kind of model you're looking at. For example, for vision models and stable diffusion models, usually we can just do tracing that takes the PyTorch model onto TVM. That part is still being robustified so that we can bring more models in. On language model tasks, actually what we do is we directly build some of the model constructors and try to directly map from Hugging Face models. The goal is that if you have a Hugging Face configuration, we will be able to bring that in and apply optimizations to it. So one fun thing about model compilation is that your optimization doesn't happen only at the source language level, right? For example, if you're writing PyTorch code, you just go and try to use a better fused operator at the source code level. Torch Compile might help you do a bit of that. In most model compilation, it not only happens at the beginning stage, but we also apply generic transformations in between, also through a Python API. So you can tweak some of that. That part of optimization helps a lot in uplifting both performance and portability across environments. And another thing that we do have is what we call universal deployment. So if you get the ML program into this TVMScript format, where there are functions that take in tensors and output tensors, we will be able to have a way to compile it. So you will be able to load the function in any of the language runtimes that TVM supports. So you could load it in JavaScript, and that's a JavaScript function that takes in tensors and outputs tensors. You can load it in Python, of course, and C++ and Java. So the goal there is really to bring the ML model to the language that people care about and be able to run it on a platform they like.
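To make the compilation flow and the universal deployment idea above a bit more concrete, here is a minimal, hypothetical sketch of the classic TVM/Relay-style workflow being described: ingest a model (ONNX in this sketch), compile it for a target, export the compiled artifact, and load it back through the runtime. The file names, input name, and shapes are placeholder assumptions, and the newer TVM Unity / MLC pipelines differ in detail; this is only an illustration, not the exact code used in MLC LLM.

```python
# Illustrative sketch of a classic TVM (Relay-style) flow; file names and shapes are made up.
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# 1. Ingest a model into TVM's program representation (computational graph).
onnx_model = onnx.load("model.onnx")  # hypothetical model file
mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 224, 224)})

# 2. Compile: fuse operators, plan memory, and generate low-level code for a target.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)  # "cuda", "metal", "vulkan", ... also work

# 3. Export the compiled artifact so any TVM runtime (Python, C++, Java, JS) can load it.
lib.export_library("compiled_model.so")

# 4. Load and run it from the Python runtime.
loaded = tvm.runtime.load_module("compiled_model.so")
dev = tvm.cpu(0)
module = graph_executor.GraphModule(loaded["default"](dev))
module.set_input("input", tvm.nd.array(np.random.rand(1, 3, 224, 224).astype("float32")))
module.run()
out = module.get_output(0).numpy()
```

Step 3 is where the universal deployment point comes in: the exported artifact exposes plain functions over tensors, so the same compiled module can in principle be loaded from the C++, Java, or JavaScript runtimes rather than Python.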
[00:21:37]Swyx: It strikes me that I've talked to a lot of compiler people, but you don't have a traditional compiler background. You're inventing your own discipline called machine learning compilation, or MLC. Do you think that this will be a bigger field going forward? [00:21:52]Tianqi: First of all, I do work with people working on compilation as well. So we're also taking inspiration from a lot of early innovations in the field. Like for example, with TVM initially, we took a lot of inspiration from Halide, which is an image processing compiler. And of course, since then, we have evolved quite a bit to focus on machine learning related compilation. If you look at some of our conference publications, you'll find that machine learning compilation is already kind of a subfield. So if you look at papers in both machine learning venues, the ML conferences, of course, and also systems venues, every year there will be papers around machine learning compilation. And in the compiler conference called CGO, there's a C4ML workshop that is also kind of trying to focus on this area. So definitely it's already starting to gain traction and become a field. I wouldn't claim that I invented this field, but definitely I helped to work with a lot of folks there. And I try to bring a perspective, of course, trying to learn a lot from compiler optimizations as well as trying to bring knowledge in machine learning and systems together. [00:23:07]Alessio: So we had George Hotz on the podcast a few episodes ago, and he had a lot to say about AMD and their software. So when you think about TVM, are you still restricted in a way by the performance of the underlying kernel, so to speak? So if your target is like a CUDA runtime, you still get better performance, no matter like TVM kind of helps you get there, but then that level you don't take care of, right? [00:23:34]Swyx: There are two parts in here, right? [00:23:35]Tianqi: So first of all, there is the lower level runtime, like the CUDA runtime. And then actually for NVIDIA, a lot of the moat came from their libraries, like CUTLASS, cuDNN, right? Those library optimizations. And also for specialized workloads, actually you can specialize them. Because in a lot of cases you'll find that if you go and do benchmarks, it's very interesting. Like two years ago, if you tried to benchmark ResNet, for example, usually the NVIDIA library [00:24:04]Swyx: gives you the best performance. [00:24:06]Tianqi: It's really hard to beat them. But as soon as you start to change the model to something, maybe a bit of a variation of ResNet, not for the traditional ImageNet detections, but for latent detection and so on, there will be some room for optimization because people sometimes overfit to benchmarks. These are people who go and optimize things, right? So people overfit the benchmarks. So that's the largest barrier: being able to get low-level kernel libraries, right? In that sense, the goal of TVM is actually that we try to have a generic layer to both, of course, leverage libraries when available, but also be able to automatically generate [00:24:45]Swyx: libraries when possible. [00:24:46]Tianqi: So in that sense, we are not restricted by the libraries that they have to offer. That's why we will be able to run on Apple M2 or WebGPU where there's no library available, because we are kind of automatically generating libraries. That makes it easier to support less well-supported hardware, right? For example, WebGPU is one example.
From a runtime perspective, for AMD, I think their Vulkan driver was not very well supported before. Recently, they are getting good. But even before that, we were able to support AMD through this GPU graphics backend called Vulkan, which is not as performant, but it gives you decent portability across those [00:25:29]Swyx: hardware. [00:25:29]Alessio: And I know we got other MLC stuff to talk about, like WebLLM, but I want to wrap up on the optimization that you're doing. So there's kind of four core things, right? Kernel fusion, which we talked a bit about in the flash attention episode and the tinygrad one, memory planning, and loop optimization. I think those are like pretty, you know, self-explanatory. I think the ones that people have the most questions about, can you quickly explain [00:25:53]Swyx: those? [00:25:54]Tianqi: So there are a few different things, right? Kernel fusion means that, you know, if you have an operator like a convolution, or in the case of a transformer an MLP, you have other operators that follow it, right? You don't want to launch two GPU kernels. You want to be able to put them together in a smart way, right? And as for memory planning, it's more about, you know, hey, if you run like Python code, every time you generate a new array, you are effectively allocating a new piece of memory, right? Of course, PyTorch and other frameworks try to optimize that for you. So there is a smart memory allocator behind the scenes. But actually, in a lot of cases, it's much better to statically allocate and plan everything ahead of time. And that's where a compiler can come in. First of all, actually for language models, it's much harder because of dynamic shapes. So you need to be able to do what we call symbolic shape tracing. So we have a symbolic variable that tells you that the shape of the first tensor is n by 12. And the shape of the third tensor is also n by 12. Or maybe it's n times 2 by 12. Although you don't know what n is, right? But you will be able to know that relation and be able to use that to reason about fusion and other decisions. So besides this, I think loop transformation is quite important. And it's actually non-traditional. Originally, if you simply write code and you want to get good performance, it's very hard. For example, you know, if you write a matrix multiplier, the simplest thing you can do is: for i, j, k: C[i][j] += A[i][k] * B[k][j]. But that code is 100 times slower than the best available code that you can get. So we do a lot of transformation, like being able to take the original code, trying to put things into shared memory, and making use of tensor cores, making use of memory copies, and all this. Actually, all these things, we also realize that, you know, we cannot do all of them ourselves. So we also expose the ML compilation framework as a Python package, so that people will be able to continuously improve that part of the engineering in a more transparent way. So we find that's very useful, actually, for us to be able to get good performance very quickly on some of the new models. Like when Llama 2 came out, we were able to go and look at the whole thing, see where the bottleneck is, and go and optimize those. [00:28:10]Alessio: And then the fourth one being weight quantization. So everybody wants to know about that. And just to give people an idea of the memory saving, if you're doing FP32, it's like four bytes per parameter. Int8 is like one byte per parameter.
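As a rough sense of what those per-parameter sizes mean for the 70-billion-parameter models discussed in this episode, the arithmetic works out as follows. This is a hypothetical back-of-the-envelope sketch covering weights only; it ignores activations, the KV cache, and runtime overhead.

```python
# Back-of-the-envelope weight-memory arithmetic (weights only, no activations or KV cache).
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    return num_params * bits_per_param / 8 / 1e9

for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"70B parameters at {name}: ~{weight_footprint_gb(70e9, bits):.0f} GB")
# fp32 ~280 GB, fp16 ~140 GB, int8 ~70 GB, int4 ~35 GB
```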
So you can really shrink down the memory footprint. What are some of the trade-offs there? How do you figure out what the right target is? And what are the precision trade-offs, too? [00:28:37]Tianqi: Right now, a lot of people mostly use int4 for language models. So that really shrinks things down a lot. And more recently, actually, we started to think that, at least in MLC, we don't want to have a strong opinion on what kind of quantization we want to bring, because there are so many researchers in the field. So what we can do is we can allow developers to customize the quantization they want, but we still bring the optimized code for them. So we are working on this item called bring your own quantization. In fact, hopefully MLC will be able to support more quantization formats. And definitely, I think it's an open field that's being explored. Can you bring in more sparsity? Can you quantize activations as much as possible, and so on? And it's going to be something that's going to be relevant for quite a while. [00:29:27]Swyx: You mentioned something I wanted to double back on, which is most people use int4 for language models. This is actually not obvious to me. Are you talking about the GGML type people, or even the researchers who are training the models also using int4? [00:29:40]Tianqi: Sorry, so I'm mainly talking about inference, not training, right? So when you're doing training, of course, int4 is harder, right? Maybe you could do some form of mixed-type precision for inference. I think int4 is kind of like, in a lot of cases, you will be able to get away with int4. And actually, that does bring a lot of savings in terms of the memory overhead, and so on. [00:30:09]Alessio: Yeah, that's great. Let's talk a bit about maybe GGML, then there's Mojo. How should people think about MLC? How do all these things play together? I think GGML is focused on model level re-implementation and improvements. Mojo is a language, a superset of Python. You're more at the compiler level. Do you all work together? Do people choose between them? [00:30:32]Tianqi: So I think in this case, I think it's great to see the ecosystem become so rich with so many different ways. So in our case, GGML is more like you're implementing something from scratch in C, right? So that gives you the ability to go and customize for each particular hardware backend. But then you will need to write your own CUDA kernels, and write them again for AMD, and so on. So the kind of engineering effort is a bit more broadened in that sense. Mojo, I have not looked at the specific details yet. I think it's good to start by saying it's a language, right? I believe there will also be machine learning compilation technologies behind it. So it sits in an interesting place in there. In the case of MLC, our take is that we do not want to have an opinion on how, where, or in which language people want to develop and deploy, and so on. And we also realize that actually there are two phases. You want to be able to develop and optimize your model. By optimization, I mean really bringing in the best CUDA kernels and doing some of the machine learning engineering in there. And then there's a phase where you want to deploy it as a part of the app. So if you look at the space, you'll find that GGML is more like, I'm going to develop and optimize in the C language, right, and the other low-level languages they have. And Mojo is that you want to develop and optimize in Mojo, right? And you deploy in Mojo.
In fact, that's the philosophy they want to push for. In the MLC case, we find that actually if you want to develop models, the machine learning community likes Python. Python is the language that you should focus on. So in the case of MLC, we really want to enable people to not only define your model in Python, that's very common, right? But also do ML optimization, like engineering optimization, CUDA kernel optimization, memory planning, all those things in Python, which makes it customizable and so on. But when you do deployment, we realize that people want a bit of a universal flavor. If you are a web developer, you want JavaScript, right? If you're maybe an embedded systems person, maybe you would prefer C++ or C or Rust. And people sometimes do like Python in a lot of cases. So in the case of MLC, we really want to have this vision of, you optimize, you build a generic optimization in Python, then you deploy that universally onto the environments that people like. [00:32:54]Swyx: That's a great perspective and comparison, I guess. One thing I wanted to make sure that we cover is that I think you are one of this emerging set of academics that also very much focus on your artifacts of delivery. Of course. Something we talked about for three years, that he was very focused on his GitHub. And obviously you treated XGBoost like a product, you know? And then now you're publishing an iPhone app. Okay. Yeah. Yeah. What is your thinking about academics getting involved in shipping products? [00:33:24]Tianqi: I think there are different ways of making impact, right? Definitely, you know, there are academics that are writing papers and building insights for people so that people can build products on top of them. In my case, I think in the particular field I'm working on, machine learning systems, I feel like really we need to be able to get it into the hands of people so that really we see the problem, right? And we show that we can solve the problem. And it's a different way of making impact. And there are academics that are doing similar things. Like, you know, if you look at some of the people from Berkeley, right? Every few years, they will come up with big open source projects. Certainly, I think it's just a healthy ecosystem to have different ways of making impact. And I feel like really being able to do open source and work with the open source community is really rewarding, because we have a real problem to work on when we build our research. Actually, that research comes together and people are able to make use of it. And we also start to see interesting research challenges that we wouldn't otherwise see, right, if you're just trying to do a prototype and so on. So I feel like it's something that is one interesting way of making impact, making contributions. [00:34:40]Swyx: Yeah, you definitely have a lot of impact there. And having experience publishing Mac stuff before, the Apple App Store is no joke. It is the hardest compilation, human compilation effort. So one thing that we definitely wanted to cover is running in the browser. You have a 70 billion parameter model running in the browser. That's right. Can you just talk about how? Yeah, of course. [00:35:02]Tianqi: So I think that there are a few elements that need to come in, right? First of all, you know, we do need a MacBook, the latest one, like an M2 Max, because you need the memory to be big enough to cover that. So for a 70 billion model, it takes you about, I think, 50 gigabytes of RAM.
So the M2 Max, the upper version, will be able to run it, right? And it also leverages machine learning compilation. Again, what we are doing is the same, whether it's running on an iPhone, on server cloud GPUs, on AMD, or on a MacBook, we all go through that same MLC pipeline. Of course, in certain cases, maybe we'll do a bit of customization iteration for each of them. And then it runs on the browser runtime, this package called WebLLM. So that will effectively... So what we do is we will take that original model and compile it to what we call WebGPU. And then WebLLM will be able to pick it up. And WebGPU is this latest GPU technology that major browsers are shipping right now. So you can get it in Chrome already. It allows you to be able to access your native GPUs from a browser. And then effectively, that language model is just invoking the WebGPU kernels through there. So actually, when Llama 2 came out, initially, we asked the question about, can you run 70 billion on a MacBook? That was the question we were asking. So first, we actually... Jin Lu, who is the engineer pushing this, he got 70 billion running on a MacBook. We had a CLI version. So in MLC, you will be able to... That runs through the Metal accelerator. So effectively, you use the Metal programming language to get the GPU acceleration. So we found, okay, it works for the MacBook. Then we asked, we had a WebGPU backend. Why not try it there? So we just tried it out. And it's really amazing to see everything up and running. And actually, it runs smoothly in that case. So I do think there are some kind of interesting use cases already in this, because everybody has a browser. You don't need to install anything. I think it doesn't make sense yet to really run a 70 billion model in a browser, because you kind of need to be able to download the weights and so on. But I think we're getting there. Effectively, the most powerful models you will be able to run on a consumer device. It's kind of really amazing. And also, in a lot of cases, there might be use cases. For example, if I'm going to build a chatbot that I talk to and that answers questions, maybe some of the components, like the voice to text, could run on the client side. And so there are a lot of possibilities of being able to have something hybrid that contains the edge component or something that runs on a server. [00:37:47]Alessio: Do these browser models have a way for applications to hook into them? So if I'm using, say, you can use OpenAI or you can use the local model. Of course. [00:37:56]Tianqi: Right now, actually, we are building... So there's an NPM package called WebLLM, right? So that you will be able to, if you want to embed it onto your web app, you will be able to directly depend on WebLLM and you will be able to use it. We also have a REST API that's OpenAI compatible. So that REST API, I think, right now, it's actually running on a native backend. So if you have a CUDA server, it's faster to run on the native backend. But also we have a WebGPU version of it that you can go and run. So yeah, we do want to be able to have easier integrations with existing applications. And the OpenAI API is certainly one way to do that. Yeah, this is great. [00:38:37]Swyx: I actually did not know there's an NPM package that makes it very, very easy to try out and use. I want to actually... One thing I'm unclear about is the chronology. Because as far as I know, Chrome shipped WebGPU the same time that you shipped WebLLM. Okay, yeah. So did you have some kind of secret chat with Chrome?
[00:38:57]Tianqi: The good news is that Chrome is doing a very good job of trying to have early releases. So although the official shipment of Chrome WebGPU is the same time as WebLLM, actually, you were able to try out WebGPU technology in Chrome before that. There is an unstable version called Canary. I think as early as two years ago, there was a WebGPU version. Of course, it's getting better. So we had a TVM-based WebGPU backend two years ago. Of course, at that time, there were no language models. It was running on less interesting, well, still quite interesting models. And then this year, we really started to see it getting mature and the performance keeping up. So we have a more serious push of bringing the language model compatible runtime onto WebGPU. [00:39:45]Swyx: I think you agree that the hardest part is the model download. Have there been conversations about a one-time model download and sharing between all the apps that might use this API? That is a great point. [00:39:58]Tianqi: I think it's already supported in some sense. When we download the model, WebLLM will cache it onto a special Chrome cache. So if a different web app uses the same WebLLM JavaScript package, you don't need to redownload the model again. So there is already something there. But of course, you have to download the model at least once to be able to use it. [00:40:19]Swyx: Okay. One more thing just in general before we're about to zoom out to OctoAI. Just the last question is, you're not the only project working on, I guess, local models. That's right. Alternative models. There's gpt4all, there's Ollama that just recently came out, and there's a bunch of these. What would be your advice to them on what's a valuable problem to work on? And what is just thin wrappers around ggml? Like, what are the interesting problems in this space, basically? [00:40:45]Tianqi: I think making the API better is certainly something useful, right? In general, one thing that we do try to push very hard on is this idea of easier universal deployment. So we are also looking forward to actually having more integration with MLC. That's why we're trying to build APIs like WebLLM and other things. So we're also looking forward to collaborating with all those ecosystems and working on support to bring in models more universally and be able to also keep up the best performance when possible in a more push-button way. [00:41:15]Alessio: So as we mentioned in the beginning, you're also the co-founder of OctoML. Recently, OctoML released OctoAI, which is a compute service, basically focused on optimizing model runtimes and acceleration and compilation. What has been the evolution there? So Octo started as kind of like a traditional MLOps tool, where people were building their own models and you helped them on that side. And then it seems like now most of the market is shifting to starting from pre-trained generative models. Yeah, what has that experience been for you and how have you seen the market evolve? And how did you decide to release OctoAI? [00:41:52]Tianqi: One thing that we found out is that on one hand, it's really easy to go and get something up and running, right? But when you start to consider all the possible availability and scalability issues, and even integration issues, it becomes kind of interesting and complicated. So we really want to make sure to help people get that part easy, right?
And now, if we look at the customers we talk to and the market, certainly generative AI is something that is very interesting. So that is something that we really hope to help elevate. And also, we are building on top of the technology we built to enable things like portability across hardware. And you will be able to not worry about the specific details, right? Just focus on getting the model out. We'll try to work on the infrastructure and other things that help on the other end. [00:42:45]Alessio: And when it comes to getting optimization on the runtime, I see, when we run our early adopters community, most enterprises' issue is how to actually run these models. Do you see that as one of the big bottlenecks now? I think a few years ago it was like, well, we don't have a lot of machine learning talent. We cannot develop our own models. Versus now it's like, there are these great models you can use, but I don't know how to run them efficiently. [00:43:12]Tianqi: That depends on how you define running, right? On one hand, it's easy to download something like MLC, like you download it, you run it on a laptop, but then there are also different decisions, right? What if you are trying to serve larger user requests? What if those requests change? What if the availability of hardware changes? Right now it's really hard to get the latest NVIDIA hardware, unfortunately, because everybody's trying to work on things using the hardware that's out there. So I think when the definition of run changes, there are a lot more questions around things. And also in a lot of cases, it's not only about running models, it's also about being able to solve problems around them. How do you manage your model locations and how do you make sure that you get your model close to your execution environment more efficiently? So definitely a lot of engineering challenges out there. That we hope to elevate, yeah. And also, if you think about the future, definitely I feel like right now, given the technology and the kind of hardware availability we have today, we will need to make use of all the possible hardware available out there. That will include mechanisms for cutting down costs, bringing something to the edge and cloud in a more natural way. So I feel like we are still at a very early stage of where this is going, but it's already good to see a lot of interesting progress. [00:44:35]Alessio: Yeah, that's awesome. I would love, I don't know how much we're going to go in depth into it, but what does it take to actually abstract all of this from the end user? You know, like they don't need to know what GPUs you run, what cloud you're running them on. You take all of that away. What was that like as an engineering challenge? [00:44:51]Tianqi: So I think that there are engineering challenges there. In fact, first of all, you will need to be able to support all the kinds of hardware backends you have, right? On one hand, if you look at the NVIDIA libraries, you'll find, very surprisingly, not too surprisingly, most of the latest libraries work well on the latest GPU. But there are other GPUs out there in the cloud as well. So certainly being able to have the know-how and being able to do model optimization is one thing, right? Also infrastructure for being able to scale things up and locate models. And in a lot of cases, we do find that for typical models, it also requires kind of vertical iterations. So it's not about, you know, building a silver bullet and that silver bullet is going to solve all the problems.
It's more about, you know, we're building a product, we'll work with the users, and we find out there are interesting opportunities at a certain point. And then our engineers will go and solve that, and it will automatically be reflected in the service. [00:45:45]Swyx: Awesome. [00:45:46]Alessio: We can jump into the lightning round until, I don't know, Sean, if you have more questions or TQ, if you have more stuff you wanted to talk about that we didn't get a chance to [00:45:54]Swyx: touch on. [00:45:54]Alessio: Yeah, we have talked a lot. [00:45:55]Swyx: So, yeah. We always would like to ask, you know, do you have a commentary on other parts of AI and ML that is interesting to you? [00:46:03]Tianqi: So right now, I think one thing that we are really pushing hard for is this question about how far can we bring open source, right? I'm kind of like a hacker and I really like to put things together. So I think it's unclear what the future of AI looks like. On one hand, it could be possible that, you know, you just have a few big players, you just try to talk to those bigger language models and they can do everything, right? On the other hand, one of the things that we in academia are really excited about and pushing for, and that's one reason why I'm pushing for MLC, is: can we build something where you have different models? You have personal models that know the best movie you like, but you also have bigger models that maybe know more, and you get those models to interact with each other, right? And be able to have a wide ecosystem of AI agents that help each person while still being able to do things like personalization. Some of them can run locally, some of them, of course, run on a cloud, and how do they interact with each other? So I think that is a very exciting time where the future is yet undecided, but I feel like there is something we can do to shape that future as well. [00:47:18]Swyx: One more thing, which is something I'm also pursuing, which is, and this kind of goes back into predictions, but also back in your history, do you have any idea, or are you looking out for anything post-transformers as far as architecture is concerned? [00:47:32]Tianqi: I think, you know, in a lot of these cases, you can find there are already promising models for long contexts, right? There are state-space models, like, you know, from some of our colleagues, Albert, who worked on the HiPPO models, right? And then there is an open source version called RWKV. It's like a recurrent model that allows you to summarize things. Actually, we are bringing RWKV to MLC as well, so maybe you will be able to see one of those models. [00:48:00]Swyx: We actually recorded an episode with one of the RWKV core members. It's unclear because there's no academic backing. It's just open source people. Oh, I see. So you like the merging of recurrent networks and transformers? [00:48:13]Tianqi: I do love to see this model space continue growing, right? And I feel like in a lot of cases, it's just that the attention mechanism is getting changed in some sense. So I feel like definitely there are still a lot of things to be explored here. And that is also one reason why we want to keep pushing machine learning compilation, because one of the things we are trying to push on is productivity for machine learning engineering, so that as soon as some of these models come out, we will be able to, you know, bring them onto those environments that are out there.
[00:48:43]Swyx: Yeah, it's a really good mission. Okay. Very excited to see that RWKV and state space model stuff. I'm hearing increasing chatter about that stuff. Okay. Lightning round, as always fun. I'll take the first one. Acceleration. What has already happened in AI that you thought would take much longer? [00:48:59]Tianqi: Emergence of more like a conversation chatbot ability is something that kind of surprised me before it came out. This is like one piece that I feel originally I thought would take much longer, but yeah, [00:49:11]Swyx: it happens. And it's funny because like the original, like Eliza chatbot was something that goes all the way back in time. Right. And then we just suddenly came back again. Yeah. [00:49:21]Tianqi: It's always too interesting to think about, but with a kind of a different technology [00:49:25]Swyx: in some sense. [00:49:25]Alessio: What about the most interesting unsolved question in AI? [00:49:31]Swyx: That's a hard one, right? [00:49:32]Tianqi: So I can tell you like what kind of I'm excited about. So, so I think that I have always been excited about this idea of continuous learning and lifelong learning in some sense. So how AI continues to evolve with the knowledges that have been there. It seems that we're getting much closer with all those recent technologies. So being able to develop systems, support, and be able to think about how AI continues to evolve is something that I'm really excited about. [00:50:01]Swyx: So specifically, just to double click on this, are you talking about continuous training? That's like a training. [00:50:06]Tianqi: I feel like, you know, training adaptation and it's all similar things, right? You want to think about entire life cycle, right? The life cycle of collecting data, training, fine tuning, and maybe have your local context that getting continuously curated and feed onto models. So I think all these things are interesting and relevant in here. [00:50:29]Swyx: Yeah. I think this is something that people are really asking, you know, right now we have moved a lot into the sort of pre-training phase and off the shelf, you know, the model downloads and stuff like that, which seems very counterintuitive compared to the continuous training paradigm that people want. So I guess the last question would be for takeaways. What's basically one message that you want every listener, every person to remember today? [00:50:54]Tianqi: I think it's getting more obvious now, but I think one of the things that I always want to mention in my talks is that, you know, when you're thinking about AI applications, originally people think about algorithms a lot more, right? Our algorithm models, they are still very important. But usually when you build AI applications, it takes, you know, both algorithm side, the system optimizations, and the data curations, right? So it takes a connection of so many facades to be able to bring together an AI system and be able to look at it from that holistic perspective is really useful when we start to build modern applications. I think it's going to continue going to be more important in the future. [00:51:35]Swyx: Yeah. Thank you for showing the way on this. And honestly, just making things possible that I thought would take a lot longer. So thanks for everything you've done. [00:51:46]Tianqi: Thank you for having me. [00:51:47]Swyx: Yeah. [00:51:47]Alessio: Thanks for coming on TQ. [00:51:49]Swyx: Have a good one. [00:51:49] Get full access to Latent Space at www.latent.space/subscribe

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We are now launching our dedicated new YouTube and Twitter! Any help in amplifying our podcast would be greatly appreciated, and of course, tell your friends! Notable followon discussions collected on Twitter, Reddit, Reddit, Reddit, HN, and HN. Please don't obsess too much over the GPT4 discussion as it is mostly rumor; we spent much more time on tinybox/tinygrad on which George is the foremost authority!We are excited to share the world's first interview with George Hotz on the tiny corp!If you don't know George, he was the first person to unlock the iPhone, jailbreak the PS3, went on to start Comma.ai, and briefly “interned” at the Elon Musk-run Twitter. Tinycorp is the company behind the deep learning framework tinygrad, as well as the recently announced tinybox, a new $15,000 “luxury AI computer” aimed at local model training and inference, aka your “personal compute cluster”:* 738 FP16 TFLOPS* 144 GB GPU RAM* 5.76 TB/s RAM bandwidth* 30 GB/s model load bandwidth (big llama loads in around 4 seconds)* AMD EPYC CPU* 1600W (one 120V outlet)* Runs 65B FP16 LLaMA out of the box (using tinygrad, subject to software development risks)(In the episode, we also talked about the future of the tinybox as the intelligence center of every home that will help run models, at-home robots, and more. Make sure to check the timestamps
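As a rough sanity check on those specs, simple arithmetic on the listed numbers (assuming 2 bytes per FP16 parameter, which is an assumption about how the 65B figure is stored) lines up with the quoted load time:

```python
# Back-of-the-envelope check of the tinybox numbers above; assumes 2 bytes per FP16 weight.
params = 65e9                      # 65B-parameter LLaMA
weight_gb = params * 2 / 1e9       # ~130 GB of FP16 weights, which fits in the 144 GB of GPU RAM
load_seconds = weight_gb / 30      # ~4.3 s at the quoted 30 GB/s model load bandwidth
print(weight_gb, load_seconds)
```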

Topic Lords
189. If You Don't Know Who Your Wario Is, You're The Wario

Topic Lords

Play Episode Listen Later Jun 5, 2023 55:44


Lords: * Alexander * Yaros Topics: * Somewhat Dim Mirror * Unique and weird self-bootstrapping computer language - Forth * I've been getting emails from an Online Casino Guide offering analysis of the relative popularity of characters from the Mario Bros. movie. How did they get my email, and how did they know that this is the kind of thing I want to gamble on? * The Kraken, by Alfred Tennyson * https://en.m.wikipedia.org/wiki/TheKraken(poem) * NES dev scene and new games still being released * https://www.youtube.com/watch?v=h0Yg0GAX5vw * You have to heat a black hole to cool it down * Winston is suddenly really into Power Rangers which I'm not super thrilled about, but it does make me happy that the appeal of cheesy MIDI rock won't be lost on future generations * Esper says: "The tradition of taking Japanese action stuff and reworking it into an entirely different show is pretty wild, and pretty common. The original idea behind the western release of Sailor Moon was actually going to be a live action cast of young girls who transform into "cartoon scouts" or something, and the legendary anime Macross (known for animating lots of missles with cool smoke trails) was brought over here and entirely rewritten to be Robotech, an already existing western property. Power Rangers specifically comes from the Super Sentai tokusatsu series, of which there's actually two or three dozen seasons, each with more or less individual continuity. They're fun and goofy to watch if you get a chance to see the originals; I was mostly surprised by how self-aware they are." Microtopics: * Just playing games you already know whenever you find the time for games. * Dystopian fiction about all the little annoying things. * Dystopian fiction about all the terrible TV shows that are on now. * A guy who thought his idea would work but it didn't. * A black mirror but a little less black. * How really shiny black things work. * Logging in to watch people make themselves miserable. * Reverse polish notation. * Giving up on operating systems and deciding to live inside a Forth interpreter. * Going back to the Cambrian period and being like "what is this shell thing and what is it trying to accomplish?" * How Forth is like Eurovision. * Borrowing someone's RPN calculator and being very confused for a moment. * Your Dymaxion map of the globe. * The next emulations of Hewlett-Packard reverse polish notation calculators. * Online Casino Guides and the kinds of email they send. * A gaming and entertainment experience. * Naming your movie @ and getting incredible engagement on Twitter. * How recently Nethack has been patched. * Carpetology and the study of rugs and carpets even though they're not in the same phylum. * The Dungeons and Dragons Chick Tract. * A kid named Wario. * The Abysmal Sea. * Unnumbered and enormous polypi. * Interpreting a poem as a political statement when it's clearly about how giant squids are super cool. * Lauding this poet's skill with language even though he didn't know the difference between abyssal and abysmal. * Calling a poem a sonnet when it doesn't meet the criteria of a sonnet just because Tennyson wrote it. * Wanting to be huge and eat sponges, like the kraken. * Dendy. * Buying NES games made this year. * Sokoban with a Twist. * MOON 8. * Releasing chiptunes on vinyl shaped like a square. * Russian Roulette for the NES making good use of the Zapper. * Two people who are really bad at archery. * Pointing your Rambo exploding arrow at the exploding barrel sitting right next to you. 
* A turn-based thing where you can kill zombies. * Forklift simulators in VR. * NES Maker and GB Studio. * LLVM's NES back-end. * Making a NES game in C and never using local variables. * Finding the free time to do all your hobbies. * The bigger I am the colder I am, and if you heat me up I get bigger and colder. What am I? * A tear in geometry that just leaks shit. * Care and feeding of your pet black hole. * Pascal's Breakfast. * Whether it's in your best interest to believe in waffles. * The International Cult Registry. * Trying to make a portmanteau of waffle and apocalypse. * Violence against putty monsters. * The Horsemen of the Apocalypse Power Rangers spinoff. * A Power Rangers spinoff made in the last three years that has the exact same production values of the original. * Writing a new TV show around the action scenes from a different TV show. * Taking the most expensive special effects shots from every movie and putting them all in one uber-movie. * Tricking Harrison Ford into being in your movie because he's so old now.

Lex Fridman Podcast
#381 – Chris Lattner: Future of Programming and AI

Lex Fridman Podcast

Play Episode Listen Later Jun 2, 2023 218:29


Chris Lattner is a legendary software and hardware engineer, leading projects at Apple, Tesla, Google, SiFive, and Modular AI, including the development of Swift, LLVM, Clang, MLIR, CIRCT, TPUs, and Mojo. Please support this podcast by checking out our sponsors: - iHerb: https://lexfridman.com/iherb and use code LEX to get 22% off your order - Numerai: https://numer.ai/lex - InsideTracker: https://insidetracker.com/lex to get 20% off EPISODE LINKS: Chris's Twitter: https://twitter.com/clattner_llvm Chris's Website: http://nondot.org/sabre/ Mojo programming language: https://www.modular.com/mojo Modular AI: https://modular.com/ PODCAST INFO: Podcast website: https://lexfridman.com/podcast Apple Podcasts: https://apple.co/2lwqZIr Spotify: https://spoti.fi/2nEwCF8 RSS: https://lexfridman.com/feed/podcast/ YouTube Full Episodes: https://youtube.com/lexfridman YouTube Clips: https://youtube.com/lexclips SUPPORT & CONNECT: - Check out the sponsors above, it's the best way to support this podcast - Support on Patreon: https://www.patreon.com/lexfridman - Twitter: https://twitter.com/lexfridman - Instagram: https://www.instagram.com/lexfridman - LinkedIn: https://www.linkedin.com/in/lexfridman - Facebook: https://www.facebook.com/lexfridman - Medium: https://medium.com/@lexfridman OUTLINE: Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time. (00:00) - Introduction (06:38) - Mojo programming language (16:55) - Code indentation (25:22) - The power of autotuning (35:12) - Typed programming languages (51:56) - Immutability (1:04:14) - Distributed deployment (1:38:41) - Mojo vs CPython (1:54:30) - Guido van Rossum (2:01:31) - Mojo vs PyTorch vs TensorFlow (2:04:55) - Swift programming language (2:10:27) - Julia programming language (2:15:32) - Switching programming languages (2:24:58) - Mojo playground (2:29:48) - Jeremy Howard (2:40:34) - Function overloading (2:48:59) - Error vs Exception (2:56:39) - Mojo roadmap (3:09:41) - Building a company (3:21:27) - ChatGPT (3:27:50) - Danger of AI (3:31:44) - Future of programming (3:35:01) - Advice for young people

Kodsnack
Kodsnack 527 - Optimera registerhanteringen

Kodsnack

Play Episode Listen Later May 30, 2023 84:17


Fredrik, Tobias, and Kristoffer gather in the same episode! Tobias talks about the recently concluded EuroLLVM 2023 conference and everything he saw there. To start with, Tobias himself gave nothing less than the opening keynote. He talks about his presentation, his preparations, and how he discussed and thought about getting buy-in for it all at work. Then we go through the other presentations Tobias saw at the conference, with plenty of side tracks about optimizer anecdotes, how compilers and processors work, and much more. To wrap up, some thoughts on the company Modular and their language Mojo, and why it is marketed specifically as being good for AI. A big thank you to Cloudnet, who sponsor our VPS! Do you have comments, questions, or tips? We are @kodsnack, @tobiashieta, @oferlund, and @bjoreman on Twitter, have a page on Facebook, and can be emailed at info@kodsnack.se if you want to write at more length. We read everything that is sent in. If you like Kodsnack, we would love for you to review us in iTunes! You can also support the podcast by buying us a coffee (or two!) on Ko-fi, or buying something in our shop. Links EuroLLVM 2023 The full conference program LLVM Reveal.js Hugo Miro Order out of chaos - the LLVM release process - Tobias's keynote LLVM's YouTube channel A whirlwind tour of the LLVM optimizer Nikita Popov from Red Hat LLVM IR Memristor Practical Global Merge Function with ThinLTO LTO - link-time optimization Kyungwoo Lee from Meta Fast and Vectorized Pivot Function for MLIR Presburger Library, by Qi Zhou - making floating-point operations faster than integer operations Using the Clang data-flow framework for null-pointer analysis - Viktor Cseh talked about eliminating null pointers with data-flow analysis Register Cost Modelling for Register Allocation and Beyond - Aiden Grossman optimized registers Mojo Modular Anders Waldenborg Keynote day two - "-fbounds-safety": Enforcing bounds safety for production C code - Yeoul Na, Apple Bounds checking ABI - application binary interface MachineScheduler - fine grain resource allocation using resource intervals - Francesco Petrogalli, from Apple What would it take to remove debug intrinsics? Jeremy Morse, from Sony GlobalISel by example, by Alex Bradbury SelectionDAG CISC RISC Duke Nukem Forever The llvm-debuginfo-analyzer presentation, with Carlos Alberto Enciso from Sony Trainspotting DWARF and ELF How do you do fellow kids? Pytorch Tensorflow Global interpreter lock Titles Squeeze day Making such a boring subject interesting To get a rainy vacation Then I got the keynote Use the registers as much as possible The nice idea without the ugly reality Optimize for size A jump to another function Trying to understand registers Optimize the register handling All the world's programs on all the world's processors A naive allocator in your head Light is too slow At the same time in one cycle Choosing instructions Looking at the whole program at once Debugging the debug info The guys at Sony and me The instructions are delivered by carrier pigeon

Environment Variables
The Week in Green Software: AWS & Scope 3 Emissions Data

Environment Variables

Play Episode Listen Later May 17, 2023 39:45


Host Chris Adams is joined by the GSF's Asim Hussain on this episode of The Week in Green Software. They discuss some interesting news about Amazon, AWS and their scope 3 GHG protocol emission data. We also find out how Python has got its Mojo back and we have a very exciting tool from Catchpoint WebpageTest for measuring site's carbon footprint. Finally, some great green software events that you can be part of!

Python Bytes
#335 Should you get your mojo on?

Python Bytes

Play Episode Listen Later May 11, 2023 25:37


Watch on YouTube About the show Sponsored by InfluxDB from Influxdata. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: Introducing 'Trusted Publishers' PyPI package maintainers can adopt a new, more secure publishing method that does not require long-lived passwords or API tokens to be shared with external systems. Our term for using the OpenID Connect (OIDC) standard to exchange short-lived identity tokens between a trusted third-party service and PyPI. Instead, PyPI maintainers can configure PyPI to trust an identity provided by a given OpenID Connect Identity Provider (IdP). These API tokens never need to be stored or shared rotate automatically by expiring quickly provide a verifiable link between a published package and its source Additional security hardening is available Brian #2: Mojo : a new programming language for all AI developers. Mojo may be the biggest programming language advance in decades - fast.ai blog Suggested by many listeners “Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.” A programming language compatible with Python, with performance similar to C++/Rust. “Mojo is designed to become a superset of Python over time by preserving Python's dynamic features while adding new primitives for systems programming.” - emphasis from Brian It's not there yet, but still super cool Built on a MLIR, not LLVM “How compatible is Mojo with Python really? Mojo already supports many core features of Python including async/await, error handling, variadics, etc, but… it is still very early and missing many features - so today it isn't very compatible. Mojo doesn't even support classes yet!” Michael #3: django-prose Wonderful rich-text editing for your Django project. Rendering rich-text in templates Small rich-text content (as model fields) Django Prose is using Bleach to only allow certain tags and attributes See the website for a screenshot of it in action Brian #4: pylyzer is a static code analyzer / language server for Python, written in Rust. Shunsuke Shibayama Suggested by Owen Features fast detailed analysis type checking plus things like out-of-bounds accesses to lists, and non-existent key references to dicts more readable reports and a VS Code extension pylyzer vs ruff “Ruff, like pylyzer, is a static code analysis tool for Python written in Rust, but Ruff is a linter and pylyzer is a type checker & language server. pylyzer does not perform linting, and Ruff does not perform type checking.” Some limitations and incomplete “todo list”. See README for more details. Joke: Escape Room

CppCast
Cpp2, with Herb Sutter

CppCast

Play Episode Listen Later Mar 31, 2023 70:25


Herb Sutter joins Phil and Timur. We catch up on the news about LLVM 16 being released, a new book on initialisation in C++ and a couple of new user groups. Then we talk to Herb about his new language/ alternate syntax, Cpp2, which compiles down to C++ in much the same way that C with Classes compiled down to C. Show Notes News LLVM 16.0.0 released "C++ initialisation story" - a new book by Bartlomiej Filipek New user group forming in Prague - Miloš Anđelković New user group forming in Helsinki - Timur Doumler Links CppFront - the compiler for Cpp2 "Can C++ be 10x Simpler & Safer?" - Herb's CppCon keynote introducing Cpp2 and CppFront  

Linux Action News
Linux Action News 279

Linux Action News

Play Episode Listen Later Feb 9, 2023 16:53


We round up some news from FOSDEM 2023, update a 21-year-old project, and the Fedora fix that's been a few releases in the making.

airhacks.fm podcast with adam bien
Supercharging the GraalVM

airhacks.fm podcast with adam bien

Play Episode Listen Later Jan 22, 2023 47:04


An airhacks.fm conversation with Alina Yurenko (@alina_yurenko) about: 2012 MacBook Air, enjoying a Symbian mobile phone, GCP meetups, from firebase to C++, starting as Developer Advocate for GraalVM, GraalVM JIT, GraalVM native, GraalVM Polyglot, doom on GraalVM, JavaScript and python are interpreted at GraalVM, the closed world assumption - the dependencies have to be known at compile time, GraalVM tracing agent provides dependency configuration, GraalVM Reachability Metadata Repository, GraalVM Visual Studio Code extensions, GraalVM and LLVM runtime, GraalVM isolate, the GraalVM native image performance, Github Actions for GraalVM, Alibaba uses Native Image in production, Disney Streaming uses GraalVM to reduce cold starts, article: Disney Streaming using GraalVM on AWS Lambda, Adyen uses GraalVM as safe execution environment for native code, article: GraalVM: running C/C++ application safely in the Java world, Supercharge your Native Image applications in 5 steps Alina Yurenko on twitter: @alina_yurenko

Coder Radio
500: Internal Server Error

Coder Radio

Play Episode Listen Later Jan 11, 2023 43:47


After sacrificing our pound of flesh for episode 500, we get into some spicy Big Tech dynamics and the performance mess of WebAssembly runtimes.

Python Bytes
#317 Most loved and most dreaded dev tools of 2022

Python Bytes

Play Episode Listen Later Jan 3, 2023 48:31


Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Michael #1: StackOverflow 2022 Developer Survey Last year we saw Git as a fundamental tool for being a developer. This year it appears that Docker is becoming a similarly fundamental tool for professional developers, increasing from 55% to 69%. Language: Rust is […] the most loved language, with 87% of developers saying they want to continue using it. JS Frameworks: Angular.js is in its third year as the most dreaded. Let me Google that for you: 62% of all respondents spend more than 30 minutes a day searching for answers or solutions to problems, and 25% spend more than an hour each day. The demise of the full-stack developer is overrated. I do wish there were more women in the field. Databases: Postgres is #1 and MongoDB is still going strong. The “which web framework do you use?” question is a full-on train wreck. Why is this so hard for people to write the question? Node.js or Express (built on Node) vs. FastAPI or Flask (but no Python?) Most loved language is Rust, and Python and Rust are tied for most wanted. The worked-with vs. want-to-work-with comparison has some interesting graphics. Brian #2: PePy.tech - PyPI download stats with package version breakdown Petru Rares Sincraian We've discussed pypistats.org before, which highlights daily downloads, downloads per major/minor Python version, and downloads per OS. PePy is a bit more useful for me: by default it shows the last few versions and the total for the current major version. The “select versions” box is editable; clicking in it shows a dropdown with downloads per version already there, and you can add * for a graph of the total, or other major versions if you want to compare. Daily/weekly/monthly views are nice for rounding out some noise and seeing larger trends. Oddity I noticed: the daily graph doesn't cover the same dates as the table; it's off by a day on both sides. Not a big deal, but I notice these kinds of things. Michael #3: Codon Python Compiler via Jeff Hutchins and Abdulaziz Alqasem A high-performance, zero-overhead, extensible Python compiler using LLVM. You can scale performance and produce executables, even when using third-party libraries such as matplotlib. It also supports writing and executing GPU kernels, which is an interesting feature. See how it works at exaloop.io. BTW, really terrible licensing: free for non-commercial use (great), “contact us” for commercial use (it's fine to charge, but give us a price). Brian #4: 8 Levels of Using Type Hints in Python Yang Zhou (yahng cho) A progression of using type hints that seems to track how I've picked them up. Type hints for basic data types: x: int Define a constant using Final: DB: Final = 'PostgreSQL' (OK, I haven't used this one at all yet) Adding multiple type hints to one variable: int | None Using general type hints: def func(nums: Iterable) Also using Optional Type hints for functions: def func(name: str) → str: (I probably would put this at #2) Aliases of type hints (not used this yet, but looks cool): PostsType = dict[int, str] new_posts: PostsType = {1: 'Python Type Hints', 2: 'Python Tricks'} Type hints for a class itself, i.e. the Self type: from typing import Self class ListNode: def __init__(self, prev_node: Self) -> None: pass Providing literals for a variable (not used this yet, but looks cool): from typing import Literal weekend_day: Literal['Saturday', 'Sunday'] weekend_day = 'Saturday' weekend_day = 'Monday' # will be a type error Extras Brian: I hear a heartbeat for Test & Code, so it must not be dead yet. Michael: New article: Welcome Back RSS From this I learned about Readwise, Kustosz, and Python's reader. Year progress == 100% PyTorch discloses malicious dependency chain compromise over holidays (of course found over RSS and Reeder; see article above) Joke: vim switch
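To make the type-hint levels above concrete, here is a minimal, self-contained Python sketch covering Final, union types, general hints, aliases, Self, and Literal. It assumes Python 3.11 or newer (for typing.Self and the | union syntax), and the names used (DB, PostsType, total, greet, ListNode, weekend_day) are illustrative examples rather than code taken from the episode.

from typing import Final, Iterable, Literal, Optional, Self

DB: Final = "PostgreSQL"  # Final: type checkers treat DB as a constant that must not be reassigned

PostsType = dict[int, str]  # a simple type alias
new_posts: PostsType = {1: "Python Type Hints", 2: "Python Tricks"}

def total(nums: Iterable[int]) -> int:
    # a general ("abstract") hint: any iterable of ints is accepted, not just a list
    return sum(nums)

def greet(name: str, prefix: Optional[str] = None) -> str:
    # Optional[str] is the same as str | None
    return f"{prefix or 'Hello'}, {name}!"

class ListNode:
    def __init__(self, prev_node: Self | None = None) -> None:
        # Self refers to the class itself, so subclasses keep a precise type
        self.prev_node = prev_node

weekend_day: Literal["Saturday", "Sunday"] = "Saturday"
# weekend_day = "Monday"  # a type checker such as mypy would reject this value

if __name__ == "__main__":
    print(total([1, 2, 3]), greet("world"), new_posts[1], weekend_day)

A type checker such as mypy should accept the file as written and flag the commented-out reassignment of weekend_day, which is exactly what the Literal example is meant to demonstrate.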

Ruby Rogues
Building Desktop and Mobile Video Games with DragonRuby with Amir Rajan - RUBY 572

Ruby Rogues

Play Episode Listen Later Dec 7, 2022 69:45


Game Developer and CEO of DragonRuby, Amir Rajan returns to the show. He joins the rogues to talk about DragonRuby. DragonRuby is a zero-dependency, cross-platform Ruby runtime built on top of mruby, libSDL, and LLVM. Additionally, Amir talks about how it allows you to use the Ruby language to build video games. He also shares his experience of working with mruby. About this Episode All about DragonRuby Building VR games using Ruby Runtime and how it works Sponsors AppSignal Developer Book Club starting with Clean Architecture by Robert C. Martin Become a Top 1% Dev with a Top End Devs Membership Links 272 RR Game Development and RubyMotion with Amir Rajan RR 333: RubyMotion and the Aesthetic of Ruby with Amir Rajan RUBYMOTION DragonRuby Flappy Dragon by DragonRuby mruby Simple DirectMedia Layer Ryan C. Gordon's Homepage fiddle.dragonruby.org Chipmunk2D Physics Toby Fox GitHub: DragonRuby/dragonruby-game-toolkit-contrib Intro to DragonRuby Game Toolkit Pico-8 Fancine Duelists amirrajan.net Twitter: @amirrajan Picks Amir - Project Hail Mary Amir - We Are Legion (We Are Bob) Amir - The Broken Earth Trilogy Charles - King of Tokyo Charles - Command your coding career Charles - Rails Remote Conference 2023 Luke - UTM Luke - Modules! Magnets! MiRage Mk3: The Mechanical Keyboard You're Meant to Modify! Luke - Real Hardware Hacking (with a hacksaw): My New Wearable Computer Valentino - Apple Watch Ultra

Rustacean Station
Presser with Gray Olson

Rustacean Station

Play Episode Listen Later Dec 2, 2022 71:15


Allen Wyma talks with Gray Olson, developer of Presser, a library that aims to make it easier to safely work with byte buffers. Contributing to Rustacean Station Rustacean Station is a community project; get in touch with us if you'd like to suggest an idea for an episode or offer your services as a host or audio editor! Twitter: @rustaceanfm Discord: Rustacean Station GitHub: @rustacean-station Email: hello@rustacean-station.org Timestamps [@00:00] - Gray's background and introduction [@04:18] - Gray's art and graphic design work for Embark Studios [@08:40] - Ray tracing and fractals [@13:44] - The most expensive process in a video game [@16:48] - Vector graphics are so hard on the GPU [@18:57] - What makes triangles very useful in drawing and designing [@22:41] - Matrix math as a fundamental building block of computer graphics [@28:13] - Understanding the concept of uninitialized memory and why Presser is necessary [@36:31] - LLVM's “No Uninitialized Memory” attribute [@39:06] - Rust's virtual machine [@40:52] - Allocating memory for data [@49:34] - Safety invariants and validity invariants in the Rust ecosystem [@53:19] - How to use unsafe code in a way that does not violate the validity invariant of Rust [@1:04:01] - Embark Studios' mission to enable those who play games to also modify the game worlds they play in [@1:07:27] - Embark Studios' Rust game projects [@1:09:08] - Parting thoughts Credits Intro Theme: Aerocity Audio Editing: Plangora Hosting Infrastructure: Jon Gjengset Show Notes: Plangora Hosts: Allen Wyma

airhacks.fm podcast with adam bien
Low Code, No Code, WYSIWYG …and some CRaC

airhacks.fm podcast with adam bien

Play Episode Listen Later Nov 13, 2022 61:08


An airhacks.fm conversation with John Ceccarelli (@jceccarelli1) about: Macintosh 512K, writing short stories and playing Dark Castle, studying European politics, enjoying Brno and Prague, learning Czech from a communist book, technical writing for Sun Microsystems, working on NetBeans Matisse, WYSIWYG precision is challenging, NetBeans Visual Web Pack was extremely popular, Sun's JSF Woodstock, separation of generated and implemented code is challenging, explaining AWS Lambdas with EJBs, visual representation of complex code is challenging, NetBeans vs. IntelliJ strategies, Installing Java Support in Visual Studio Code, working on JVM internals at Azul Systems, Azul JVMs: Zulu vs. Prime, the Falcon JIT, optimising the JVM for Apache Cassandra, the Renaissance Suite, memento and OpenJDK CRaC, Azul's CRaC optimization, crowdsourcing the optimizations, Quarkus on Azul's CRaC, Azul Prime is based on LLVM, Foojay and Azul. John Ceccarelli on Twitter: @jceccarelli1

The Nonlinear Library
LW - Examples of AI Increasing AI Progress by ThomasWoodside

The Nonlinear Library

Play Episode Listen Later Jul 18, 2022 2:58


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Examples of AI Increasing AI Progress, published by ThomasWoodside on July 17, 2022 on LessWrong. Recursive self-improvement is already here. This point is far from original. It's been described before, for instance here, in Drexler's Reframing Superintelligence, and (as I was working on this post) in Jack Clark's newsletter and even by Yann LeCun. But sometimes I still hear people talk about preparing for “when recursive self-improvement kicks in,” implying that it hasn't already. The kinds of recursive self-improvement mentioned here aren't exactly the frequently-envisioned scenario of a single AI system improving itself unencumbered. They instead rely on humans to make them work, and humans are inevitably slow and thus currently inhibit a discontinuous foom scenario. It may be tempting to dismiss the kind of recursive self-improvement happening today as not real recursive self-improvement. To think about it as some future event that will start to happen that we need to prepare for. Yes, we need to prepare for increasing amounts of it, but it's not in the future, it's in the present. Here are some currently existing examples (years given for the particular example linked): (2016) Models play against themselves in order to iteratively improve their performance in games, most notably in AlphaGo and its variants. (2016) Some neural architecture search techniques use one neural network to optimize the architectures of different neural networks. (2016) AI is being used to optimize data center cooling, helping reduce the cost of further scaling. (2021) Code generation tools like GitHub Copilot can be helpful to software engineers, including presumably some AI research engineers (anecdotally, I've found it helpful when doing engineering). Engineers may thus be faster at designing AI systems, including Copilot-like systems. (2021) Google uses deep reinforcement learning to optimize their AI accelerators. (2022) Neural networks, running on NVIDIA GPUs, have been used to design more efficient GPUs which can in turn run more neural networks. (2022) Neural networks are being used for compiler optimization in the popular LLVM compiler framework, which PyTorch's just-in-time compiler is based on. Inspired by Victoria Krakovna's specification gaming spreadsheet, I've made a spreadsheet here with these examples. Feel free to submit more here. I think the number of examples will continue to grow, making it useful to keep track of them. If this feels underwhelming compared with the kinds of recursive self-improvement often written about, you're right. But consider that the start of an exponential often feels underwhelming. As time goes on, I expect that humans will become less and less involved in the development of AI, with AI automating more and more of the process. This could very well feel sudden, but it won't be unprecedented: it's already begun. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

BSD Now
463: The 1.0 Legend

BSD Now

Play Episode Listen Later Jul 14, 2022 55:11


Differences between base and ports LLVM in OpenBSD, Netgraph for FreeBSD's bhyve Networking, Audio on FreeBSD – Quick Guide, FreeBSD's Legend starts at 1.0, Hacker News running on FreeBSD, TrueNAS 13, and more NOTES This episode of BSDNow is brought to you by Tarsnap (https://www.tarsnap.com/bsdnow) and the BSDNow Patreon (https://www.patreon.com/bsdnow) Headlines Differences between base and ports LLVM in OpenBSD (https://www.cambus.net/differences-between-base-and-ports-llvm-in-openbsd/) Using Netgraph for FreeBSD's bhyve Networking (https://klarasystems.com/articles/using-netgraph-for-freebsds-bhyve-networking/?utm_source=bsdweekly) News Roundup Audio on FreeBSD – Quick Guide (https://freebsdfoundation.org/freebsd-project/resources/audio-on-freebsd/) [Legends start at 1.0! – FreeBSD in 1993] Part 1 (https://eerielinux.wordpress.com/2022/06/18/legends-start-at-1-0-freebsd-in-1993-pt-1/) Part 2 (https://eerielinux.wordpress.com/2022/06/19/legends-start-at-1-0-freebsd-in-1993-pt-2/) Hacker News running on FreeBSD. Take that, Linux! (https://news.ycombinator.com/item?id=16076041) TrueNAS 13 (https://www.theregister.com/2022/05/11/truenas_13_released/) Beastie Bits Notable OpenBSD news you may have missed, 2022-06-28 edition (http://undeadly.org/cgi?action=article;sid=20220628135253) rEFInd design for all the BSDs (https://github.com/indgy/refind-bsd-black) OpenBGPD 7.4 released (https://undeadly.org/cgi?action=article;sid=20220619185920) Hotfix GhostBSD 22.06.18 ISO is now available (http://ghostbsd.org/22.06.18_iso_is_now_available) Tarsnap This week's episode of BSDNow was sponsored by our friends at Tarsnap, the only secure online backup you can trust your data to. Even paranoids need backups. Feedback/Questions Brad - Jails Question (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/463/feedback/Brad%20-%20Jails%20Question.md) Freezr - A few questions (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/463/feedback/Freezr%20-%20A%20few%20questions.md) A different Brad - Drive question (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/463/feedback/A%20different%20Brad%20-%20Drive%20question.md) Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv)

Rustacean Station
Zig with Andrew Kelley

Rustacean Station

Play Episode Listen Later Jun 24, 2022 56:25


Allen Wyma talks with Andrew Kelley, creator of Zig. Zig is a general-purpose programming language and toolchain for maintaining robust, optimal, and reusable software. Contributing to Rustacean Station Rustacean Station is a community project; get in touch with us if you'd like to suggest an idea for an episode or offer your services as a host or audio editor! Twitter: @rustaceanfm Discord: Rustacean Station Github: @rustacean-station Email: hello@rustacean-station.org Timestamps [@0:51] - Andrew's introduction [@2:55] - Rust vs Zig [@5:27] - What is undefined behavior (UB) and what causes it? [@11:37] - How does Zig deal with undefined behavior? [@16:09] - How well does Zig work in production? [@22:46] - Deeper dive into Andrew's programming background [@33:35] - Zig's mission statement and what they're doing as a non-profit [@37:38] - Zig's update release management [@40:06] - Andrew's OkCupid project [@42:20] - Andrew's preparations and motivations for making a language [@46:11] - Zig using LLVM [@49:12] - What's next for Zig? [@54:20] - Parting thoughts Other Resources Zig's Github Andrew's Github Credits Intro Theme: Aerocity Audio Editing: Plangora Hosting Infrastructure: Jon Gjengset Show Notes: Plangora Hosts: Allen Wyma

Trail of Bits
It Depends

Trail of Bits

Play Episode Listen Later Jun 20, 2022 21:05


FEATURED VOICES IN THIS EPISODE
Clint Bruce
Clint Bruce is a former Navy Special Warfare Officer, a graduate of the US Naval Academy, a decorated athlete, and a seasoned entrepreneur. A 4-year letter winner at Navy playing middle linebacker, captain and MVP of the '96 Aloha Bowl Championship team, he was named to multiple all-star teams his senior year. He enjoyed opportunities with both the Baltimore Ravens and New Orleans Saints and was inducted into the Navy/Marine Corps Stadium Hall of Fame in 2009. Clint's desire to serve was deep and firmly rooted. He left the NFL to pursue becoming a Navy SEAL and successfully completed BUD/S (Basic Underwater Demolition/SEAL training) in 1998 with Class 217. Joining SEAL Team FIVE, Clint completed multiple deployments pre- and post-9/11, directly involved in counter-terrorism and national security missions globally. He is a co-founder of Carry the Load, which was founded to restore true meaning to Memorial Day and celebrate the service and sacrifice of Police, Fire, and Rescue personnel and their families during the month of May. Clint lives in Dallas with his college sweetheart and three daughters who are not impressed that he played football or was a Navy SEAL.
Patrick Gray
Patrick Gray is the producer and presenter of Risky Business, a weekly information security podcast that launched in 2007. He was formerly a journalist for publications including Wired.com, ZDNet Australia, The Sydney Morning Herald, The Age, The Bulletin (magazine), and Men's Style Australia.
Eric Olson
Eric Olson is the Director of Threat Intelligence for JetBlue Airways. A threat intelligence professional for more than 20 years, Eric has held executive roles including Senior Vice President of Product Management and Vice President, Intelligence Operations, at LookingGlass Cyber Solutions, and was VP of Product Strategy at Cyveillance.
Allan Friedman
Allan Friedman is Senior Advisor and Strategist at the United States Cybersecurity and Infrastructure Security Agency, and one of the nation's leading experts on Software Bill of Materials. Allan leads CISA's efforts to coordinate SBOM initiatives inside and outside the US government, and around the world. He is known for applying technical and policy expertise to help audiences understand the pathways to change in an engaging fashion, and is frequently invited to speak or keynote to industry, academic, and public audiences. Wearing the hats of both a technologist and a policy maker, Allan has over 15 years of experience in international cybersecurity and technology policy. His experience and research focus on economic and market analyses of information security. On the practical side, he has designed, convened, and facilitated national and international multistakeholder processes that have produced real results, helping diverse organizations find common ground on contentious, cutting-edge issues.
Evan Sultanik, PhD
Evan Sultanik is a Principal Computer Security Researcher at Trail of Bits. A computer scientist with extensive experience both in industry (as a software engineer) and academia, Evan is an active contributor to open source software. He is the author of more than two dozen peer-reviewed academic papers, and is particularly interested in intelligent, distributed/peer-to-peer systems. Evan is editor of and a frequent contributor to the International Journal of PoC||GTFO.
William Woodruff
William Woodruff is a senior security engineer at Trail of Bits, contributing to the engineering and research practices in work for corporate and governmental clients. He has developed several of our open-source projects (e.g., twa, winchecksec, KRF, and mishegos). His work focuses on fuzzing, program analysis, and automated vulnerability reasoning. Outside of Trail of Bits, William helps to maintain the Homebrew project, the dominant macOS package manager. Before joining Trail of Bits, he was a software engineering intern at Cipher Tech Solutions, a small defense subcontractor. He has participated in the Google Summer of Code for four years (two as a student, two as a mentor) and taught a class in ethical hacking as a college senior. William holds a BA in philosophy from the University of Maryland (2018).
HOST: Nick Selby
An accomplished information and physical security professional, Nick leads the Software Assurance Practice at Trail of Bits, giving customers at some of the world's most targeted companies a comprehensive understanding of their security landscape. He is the creator of the Trail of Bits podcast, and does everything from writing scripts to conducting interviews to audio engineering to Foley (e.g. biting into pickles). Prior to Trail of Bits, Nick was Director of Cyber Intelligence and Investigations at the NYPD; the CSO of a blockchain startup; and VP of Operations at an industry analysis firm.
PRODUCTION STAFF
Story Editor: Chris Julin
Associate Editor: Emily Haavik
Executive Producer: Nick Selby
Executive Producer: Dan Guido
RECORDING
Recorded at Rocky Hill Studios, Ghent, NY - Nick Selby, Engineer; 22Springroad Tonstudio, Übersee, Germany - Volker Lesch, Engineer. Remote recordings were conducted at Whistler, BC, Canada (Nick Selby); Clint Bruce was recorded in a Google Meet session; Patrick Gray provided recordings of himself from Australia, courtesy of the Risky Business podcast; Eric Olson recorded himself on an iPhone; Washington, DC (tape sync of Allan Friedman by George Mocharko). Trail of Bits supports and adheres to the Tape Syncers United Fair Rates Card.
Edited by Emily Haavik and Chris Julin
Mastered by Chris Julin
MUSIC
Dispatches From Technology's Future, the Trail of Bits theme, Chris Julin
EVERYBODY GET UP - No Vocals & FX - Ian Post
JD SCAVENGER by Randy Sharp
RIPPLES by Tamuz Dekel
FUTURE PERFECT, Evgeny Bardyuzha
THE SWINDLER, The Original Orchestra
BLUE - ALTERNATIVE - INSTRUMENTAL VERSION by Faith Richards
OU ALLONS NOUS D'ICI - INSTRUMENTAL, Dan Zeitune
LITTLE EDGY, Chris Julin
SCAPES: Gray North
Reproduction
With the exception of any copyrighted music herein, Trail of Bits Season 1 Episode 3; It Depends © 2022 by Trail of Bits is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International. This license allows reuse: reusers may copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only (noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation), provided that reusers give credit to Trail of Bits as the creator. No derivatives or adaptations of this work are permitted. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Referenced in this Episode:
The original blog post announcing the availability of It Depends describes the history you just heard with more technical specificity, and also of course links to the GitHub repository where you can download It Depends and try it for yourself. That blog post also links to the repository where you can download pip-audit, and give that a whirl. In the 2021 Executive Order on Improving the Nation's Cybersecurity, the Biden Administration announced that it would require SBOMs for all software vendors selling to the federal government. Dependabot is a tool available to GitHub users. If you're interested in the catalog of open source projects Trail of Bits participates in and contributes to, please read the blog post Celebrating our 2021 Open Source Contributions. There, you can read about our work contributing, for example, to LLVM - the compiler and toolchain technologies we discuss in the podcast episode Future - and to Pwndbg, a GDB plug-in that makes debugging with GDB “suck less.” The post includes links to contributions our engineer consultants have made to a huge range of open source projects, from assert-rs to ZenGo-X.
Meet the Team:
CHRIS JULIN
Chris Julin has spent years telling audio stories and helping other people tell theirs. These days he works as a story editor and producer for news outlets like APM Reports, West Virginia Public Broadcasting, and Marketplace. He has also taught and mentored hundreds of young journalists as a professor. For the Trail of Bits podcast, he serves as story and music editor, sound designer, and mixing and mastering engineer.
EMILY HAAVIK
For the past 10 years Emily Haavik has worked as a broadcast journalist in radio, television, and digital media. She's spent time writing, reporting, covering courts, producing investigative podcasts, and serving as an editorial manager. She now works as an audio producer for several production shops including Us & Them from West Virginia Public Broadcasting and PRX, and APM Reports. For the Trail of Bits podcast, she helps with scripting, interviews, story concepts, and audio production.

Trail of Bits
Future

Trail of Bits

Play Episode Listen Later Jun 20, 2022 21:37


FEATURED VOICES IN THIS EPISODE
Dan Guido
Dan Guido is the CEO of Trail of Bits, a cybersecurity firm he founded in 2012 to address software security challenges with cutting-edge research. In his tenure leading Trail of Bits, Dan has grown the team to 80 engineers, led the team to compete in the DARPA Cyber Grand Challenge, built an industry-leading blockchain security practice, and refined open-source tools for the endpoint security market. In addition to his work at Trail of Bits, he's active on the boards of four early-stage technology companies. Dan contributes to cybersecurity policy papers from RAND, CNAS, and Harvard. He runs Empire Hacking, a 1,500-member meetup group focused on NYC-area cybersecurity professionals. His latest hobby coding project -- AlgoVPN -- is the Internet's most recommended self-hosted VPN. In prior roles, Dan taught a capstone course on software exploitation at NYU as a faculty member and the Hacker in Residence, consulted at iSEC Partners (now NCC Group), and worked as an incident responder for the Federal Reserve System.
Nat Chin
Nat Chin is a security engineer 2 at Trail of Bits, where she performs security reviews of blockchain projects, and develops tools that are useful when working with Ethereum. She is the author of solc-select, a tool to help switch Solidity versions. She worked as a smart contract developer and taught as a Blockchain Professor at George Brown College, before transitioning to blockchain security when she joined Trail of Bits.
Opal Wright
Opal Wright is a cryptography analyst at Trail of Bits. Two of the following three statements about her are true: (a) she's a long-distance unicyclist; (b) she invented a public-key cryptosystem; (c) she designed and built an award-winning sex toy.
Jim Miller
Jim Miller is the cryptography team lead at Trail of Bits. Before joining Trail of Bits, Jim attended graduate programs at both Cambridge and Yale, where he studied and researched both Number Theory and Cryptography, focusing on topics such as lattice-based cryptography and zero-knowledge proofs. During his time at Trail of Bits, Jim has led several security reviews across a wide variety of cryptographic applications and has helped lead the development of multiple projects, such as ZKDocs and PrivacyRaven.
Josselin Feist
Josselin Feist is a principal security engineer at Trail of Bits where he participates in assessments of blockchain software and designs automated bug-finding tools for smart contracts. He holds a Ph.D. in static analysis and symbolic execution and regularly speaks at both academic and industrial conferences. He is the author of various security tools, including Slither, a static analyzer framework for Ethereum smart contracts, and Tealer, a static analyzer for Algorand contracts.
Peter Goodman
Peter Goodman is a Staff Engineer in the Research and Engineering practice at Trail of Bits, where he leads all de/compilation efforts. He is the creator of various static and dynamic program analysis tools, ranging from the Remill library for lifting machine code into LLVM bitcode, to the GRR snapshot/record/replay-based fuzzer. When Peter isn't writing code, he's mentoring a fleet of interns to push the envelope. Peter holds a Master's in Computer Science from the University of Toronto.
Host: Nick Selby
An accomplished information and physical security professional, Nick leads the Software Assurance practice at Trail of Bits, giving customers at some of the world's most targeted companies a comprehensive understanding of their security landscape. He is the creator of the Trail of Bits podcast, and does everything from writing scripts to conducting interviews to audio engineering to Foley (e.g. biting into pickles). Prior to Trail of Bits, Nick was Director of Cyber Intelligence and Investigations at the NYPD; the CSO of a blockchain startup; and VP of Operations at an industry analysis firm.
Production Staff
Story Editor: Chris Julin
Associate Editor: Emily Haavik
Executive Producer: Nick Selby
Executive Producer: Dan Guido
Recording
Rocky Hill Studios, Ghent, New York. Nick Selby, Engineer
Preuss-Projekt Tonstudio, Salzburg, Austria. Christian Höll, Engineer
Remote recordings: Whistler, BC, Canada (Nick Selby); Queens, NY; Brooklyn, NY; Rochester, NY (Emily Haavik); Toronto, ON, Canada. TAPES//TYPES, Russell W. Gragg, Engineer
Trail of Bits supports and adheres to the Tape Syncers United Fair Rates Card
Edited by Emily Haavik and Chris Julin
Mastered by Chris Julin
Music
DISPATCHES FROM TECHNOLOGY'S FUTURE, THE TRAIL OF BITS THEME, Chris Julin
OPEN WINGS, Liron Meyuhas
NEW WORLD, Ian Post
FUNKYMANIA, Omri Smadar, The Original Orchestra
GOOD AS GONE, INSTRUMENTAL VERSION, Bunker Buster
ALL IN YOUR STRIDE, Abe
BREATHE EASY, Omri Smadar
TREEHOUSE, Lingerwell
LIKE THAT, Tobias Bergson
SCAPES, Gray North
Reproduction
With the exception of any copyrighted music herein, Trail of Bits Season 1 Episode 0; Immutable © 2022 by Trail of Bits is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International. This license allows reuse: reusers may copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only (noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation), provided that reusers give credit to Trail of Bits as the creator. No derivatives or adaptations of this work are permitted. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Meet the Team:
CHRIS JULIN
Chris Julin has spent years telling audio stories and helping other people tell theirs. These days he works as a story editor and producer for news outlets like APM Reports, West Virginia Public Broadcasting, and Marketplace. He has also taught and mentored hundreds of young journalists as a professor. For the Trail of Bits podcast, he serves as story and music editor, sound designer, and mixing and mastering engineer.
EMILY HAAVIK
For the past 10 years Emily Haavik has worked as a broadcast journalist in radio, television, and digital media. She's spent time writing, reporting, covering courts, producing investigative podcasts, and serving as an editorial manager. She now works as an audio producer for several production shops including Us & Them from West Virginia Public Broadcasting and PRX, and APM Reports. For the Trail of Bits podcast, she helps with scripting, interviews, story concepts, and audio production.

Untold Stories
How to Create an Open Web with Evan Cheng

Untold Stories

Play Episode Listen Later Apr 8, 2022 44:31


My guest today is Evan Cheng, co-founder & CEO of Mysten Labs. Mysten Labs is a team of leading distributed systems, programming languages, and cryptography experts whose founders led Meta's Novi Research and helped develop the Diem blockchain and Move programming language. The mission of Mysten Labs is to create foundational infrastructure for Web3. They partner with crucial ecosystem builders to make step-change improvements to their networks. Before co-founding Mysten Labs, Evan helped lead Novi Financial and was the research and development director. Evan focused on systems & security, programming languages & formal verification, and new product experimentation as the Director of R&D at Novi Financial. Novi Financial is Meta's crypto R&D division. Before leading Novi Financial, Evan worked at Facebook as the Director of Programming Languages and Runtime. Evan focused on supporting Facebook's world-class team in programming languages, runtimes, and compilers: Hack, HHVM, Skip, Flow, Python, Reactive programming, C++, LLVM, Android + iOS mobile runtimes and optimization pipelines, mobile JavaScript platform and runtimes, Glow, and more. Evan worked at Apple for over a decade, where he was a third-level manager overseeing seven engineering teams. He focused on leading efforts in static and runtime compilation for both CPUs and GPUs, Swift performance, static and dynamic linkers, shared cache optimizations, debuggers, and bitcode tooling. He collaborated on HW transitioning plans, a simulation tool for early GPU architecture exploration, built & integrated tools, App Store tools, and security hardening. Evan bootstrapped various new "engineering effectiveness" initiatives. In our conversation, we discussed multiple topics, including: Mysten Labs, the future of Web3, the infrastructure of Web3, decentralization, and much more. We began our conversation by discussing what attracted Evan to crypto and what compelled him to make the jump to start Mysten Labs with his co-founders. Evan explains why we are at the precipice of the next innovation wave in crypto. Our next conversation topic centered around the infrastructure layer of crypto. Evan discusses the various limitations current blockchains have to meet the rigorous demands of scaling and composability. Evan addresses how Mysten Labs is building tools and infrastructure to solve these bottlenecks. Our conversation pivots to discuss the future of Web3 and NFTs. Evan outlines why we are still in the early days of NFTs and Web3. He discusses how the NFT space may evolve as the tools and infrastructure built around NFTs and Web3 mature. One significant evolution we discussed is how NFTs will be able to send and receive data, unlocking a tremendous amount of potential for collaboration and value for both builders and owners of NFTs. We finish our conversation by discussing how we imagine NFTs and Web3 changing how society interacts and communicates. Please enjoy my conversation with Evan Cheng. -- This podcast is powered by Blockworks. For exclusive content and events that provide insights into the crypto and blockchain space, visit them at https://blockworks.co

Linux Action News
Linux Action News 234

Linux Action News

Play Episode Listen Later Mar 31, 2022 15:18

