Podcasts about ResNet

  • 60 PODCASTS
  • 260 EPISODES
  • 33m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • May 12, 2025 LATEST

POPULARITY

[Popularity chart, 2017–2024]


Best podcasts about ResNet

Latest podcast episodes about ResNet

RESTalk
EP140 The Future of QA at RESNET®: Automation, Oversight, and ENERGY STAR Updates with Scott Doyle

RESTalk

May 12, 2025 · 26:09


"You can't manage what you can't measure." — Peter Drucker   In this insightful episode of RESTalk, host Bill Spohn welcomes Scott Doyle, RESNET's Managing Director of Quality Assurance (QA), for a comprehensive update on what's ahead in the world of RESNET QA®. Scott unpacks the most significant changes hitting the registry, Chapter 9 standards, and the QA app—all designed to modernize, streamline, and strengthen the QA process. The conversation delves into how these updates will affect HERS® raters and providers, with a major focus on the ENERGY STAR QA/QC program. Scott outlines the move toward increased documentation, real-time oversight, and the eventual integration of automation and AI into RESNET's workflow. With the industry's expectations for speed and accuracy climbing, these changes aim to ensure trust, defensibility, and better service. He also gives a call to action: start adapting now by reviewing ENERGY STAR Field Checklist Revision 14. Whether you're a provider, QAD, or rater, this episode equips you with both the “why” and the “what's next” behind the QA evolution—and how to stay ahead of the curve. Note: This episode was recorded on May 1, prior to the recent speculation that the Trump administration is planning to eliminate the EPA and ENERGY STAR. RESNET has been active in advocating for the preservation of the 45L tax credit and ENERGY STAR Homes and will continue to provide updates as more information becomes available.  Scott's LinkedIn: https://www.linkedin.com/in/scott-doyle-84750823/ Link to RESTalk Episode 134: Boosting Efficiency: How RESNET's QA App Is Transforming the Industry https://directory.libsyn.com/episode/index/id/33586437 Energy Star National Field Rater Checklist, Rev 14: https://www.energystar.gov/sites/default/files/2025-01/National%20Rater%20Field%20Checklist_Rev%2014.pdf RESNET QA Compliance Specialist Job Postings: https://www.resnet.us/articles/job-posting-resnet-qa-compliance-specialists-regional-positions/ To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. For more info on this topic, contact RESNET at INFO@RESNET.US  

RESTalk
EP139 Concrete Innovations: The Story of Arizona's First 3D-Printed Home with Jason Barlow

RESTalk

Apr 14, 2025 · 30:18


"Innovation is taking two things that exist and putting them together in a new way." — Tom Freston   In this episode of the RESTalk podcast, Bill Spohn sits down with Jason Barlow, former President & CEO of Habitat for Humanity Central Arizona, to discuss a groundbreaking achievement in homebuilding—the first 3D-printed concrete home in Arizona. Jason shares how this innovative project, born from a collaboration with ASU graduates and a German scaffolding company, PERI, pushed the boundaries of affordable housing construction. From overcoming extreme heat challenges to designing a home with unique structural and artistic elements, Jason details the incredible journey of bringing this home to life. The conversation dives into the long-term implications of 3D printing in construction, its potential for large-scale housing developments, and the lessons learned from Habitat's pioneering efforts. Jason also touches on Habitat's broader mission, its work in providing energy-efficient and affordable housing, and how listeners can engage with their local Habitat affiliates. This episode is a fascinating look at the intersection of technology, sustainability, and community impact.   Here is the LinkedIn profile for our podcast guest Jason Barlow: https://www.linkedin.com/in/jasbarlow/   Additional information as well on the 3D home.   https://habitatcaz.org/habitats-first-3d-printed-home-in-the-u-s/   Habitat Resnet Presentation 2025.pdf Habitat for Humanity's first 3D-printed home in Arizona Why not us? Habitat's 3D-Printed home in Tempe, Arizona (3 min)   To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. For more info on this topic, contact RESNET at INFO@RESNET.US  

BUILDTank / buildCAST
#8-2025 Steve Baden – A retrospective with RESNET's founding Executive Director

BUILDTank / buildCAST

Apr 13, 2025 · 65:52


Thanks for listening to the buildCAST. In this episode we hear from Steve Baden, the founding Executive Director of the Residential Energy Services Network, or RESNET. RESNET is the governing body of the HERS home energy rating industry. It was established in 1995 by the National Association of State Energy Officials, Energy Rated Homes of America, and the National Mortgage Industry Association to develop a national market for home energy rating systems and energy-efficient mortgages. I was in the first energy rater training and became the 32nd certified rater in Colorado in 1995. Soon after, I met Steve at the first RESNET conference, held at the Florida Solar Energy Center, when I was on the board of directors of Energy Rated Homes of Colorado.

Even before Jimmy Carter's infamous "wear a sweater" speech from the Oval Office, Steve's path to energy efficiency came through politics. He found himself in Alaska at the state energy office leading an initiative called "Warming Homes for Alaskans," which received the 1993 national award for the most outstanding state housing program, and which set the stage for the national home energy rating program that RESNET grew out of. Steve has worked in the residential energy efficiency field for 30 years, including eighteen years with home energy ratings and energy mortgages on both the state and national levels, and ten years administering the Alaska State Energy Office. Steve is the founding executive director of RESNET and has just announced that he will be retiring on December 31st, 2025. I have known Steve since the founding of RESNET and my time as a board member of Energy Rated Homes of Colorado, and was so happy to have the opportunity to recap his career on the buildCAST before his last day. Thanks for all you have done for the industry, Steve.

Steve Baden on LinkedIn
RESNET - Residential Energy Services Network

RESTalk
EP138 From Learning to Leading: How the ELC Fellows Are Transforming the Industry with Jennifer Goldberg, Dorian Gothard and Alex Haworth

RESTalk

Mar 10, 2025 · 31:55


"Leadership and learning are indispensable to each other."  – John F. Kennedy   In this episode of RESTalk, host Bill Spohn welcomes the 2025 Emerging Leadership Council (ELC) fellows—Jennifer Goldberg, Dorian Gothard, and Alex Haworth—for an insightful discussion on their career journeys, experiences at the RESNET conference, and their visions for the future of energy efficiency and building science. Each fellow shares how they entered the industry, what drew them to the ELC program, and how they apply their knowledge professionally and personally. The conversation covers key takeaways from the RESNET conference, including the industry's ambitious goal of 1 million HERS ratings by 2028, the importance of HVAC grading, and the role of building science in improving homes and communities. Throughout the episode, the fellows highlight the value of collaboration, education, and continuous learning, inspiring those new to the industry and seasoned professionals. Whether you're a rater, builder, or someone passionate about energy efficiency, this episode provides valuable insights into the next generation of industry leaders shaping the future.   Here are the LinkedIn profiles for our podcast guests: Alex Haworth: https://www.linkedin.com/in/alexander-haworth-aa1b90227/ Jennifer Goldberg: https://www.linkedin.com/in/jennifer-goldberg-4688b6107/ Dorian Gothard: https://www.linkedin.com/in/dorian-gothard/ To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. For more info on this topic, contact RESNET at INFO@RESNET.US

The BERcast
The BERcast | Season 2 Episode 4 | Recap of the 2025 RESNET Conference

The BERcast

Feb 14, 2025 · 58:22


Join BER's Chris McTaggart and Gabriel Pasillas, along with Tony Lisanti of Integral Building + Design, Inc., for expert insights on the future of solid-state heat pumps and electrification, a recap of the 2025 RESNET Conference, and the latest advancements in the Phius program!

RESTalk
Inside RESNET®: Quality Assurance, Collaboration, and Impact with Ryan Moore, Jordi Kimbrough, and Michael Matthews

RESTalk

Feb 10, 2025 · 28:40


“Excellence is never an accident. It results from high intention, sincere effort, intelligent direction, skillful execution, and the vision to see obstacles as opportunities." – Aristotle (adapted interpretation)

In this insightful episode of RESTalk, host Bill Spohn interviews three key members of RESNET's team: Ryan Moore and Jordi Kimbrough in Quality Assurance, and Michael Matthews in Programs. Each brings a unique background and perspective to their roles, strengthening RESNET's mission to ensure high-quality standards in energy-efficient home ratings. Ryan Moore, the Quality Assurance Investigations Program Manager, shares his journey from a varied background in environmental studies, construction, and tourism to his pivotal role at RESNET. Ryan emphasizes the importance of a level playing field in the rating industry, using investigations to ensure compliance and education among stakeholders. Jordi Kimbrough, Quality Assurance Project Manager, reflects on her serendipitous entry into the rating world and her focus on making providers' challenging roles more manageable. She highlights her goals of streamlining processes, providing training, and gathering feedback to reduce ambiguity and enhance the industry's efficiency. Michael Matthews, Programs Engagement Specialist, rounds out the conversation with his passion for connecting people and advancing energy efficiency. From his early days in weatherization to his current role, Michael facilitates stakeholder conversations, promotes better building practices, and supports RESNET's ambitious goals. The episode concludes with a discussion of the value of teamwork, service, and a shared commitment to sustainability, underscoring the human connection behind RESNET's impactful work.

Here is the contact info for our guests:
Jordi Kimbrough: Jordi@resnet.us
Michael Matthews: Michael@resnet.us, https://www.linkedin.com/in/michael-matthews-a8931928/
Ryan Moore: Ryanmoore@resnet.us, https://www.linkedin.com/in/rymoore/

Here is a link to the complaint resolution process: https://www.hersindex.com/about-resnet/complaint-resolution-process/ and the complaint form: https://www.hersindex.com/file-complaint-resolution/

Job posting for regional QA specialist positions (posted 12-4-24): https://www.resnet.us/articles/job-posting-resnet-qa-compliance-specialists-regional-positions/

To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. For more info on this topic, contact RESNET at INFO@RESNET.US

RESTalk
EP136 Your Guide to RESNET® 2025: Tracks, Events, and Exciting Updates with Clara Hedrick

RESTalk

Jan 13, 2025 · 19:12


"Networking is not about just connecting people. It's about connecting people with ideas and opportunities." – Michele Jennae

In this episode of RESTalk, Bill Spohn chats with Clara Hedrick, Lead Events Coordinator for RESNET®, to discuss the upcoming 2025 RESNET® Conference. Clara shares details about the event, which is set to take place from January 26-29, 2025, in Tempe, Arizona. Attendees can look forward to engaging sessions, dynamic networking opportunities, and exciting receptions hosted by sponsors like NAIMA and Knauf. Clara highlights the innovative offerings this year, including a live demonstration of the new RESNET® insulation grading standard and Knauf's VR training system. Clara also delves into the conference's tracks, which cover a broad range of topics such as building science, workforce development, and energy codes. Special events like the offsite Fulton Homes tour and training opportunities add further value to the conference. Looking ahead, Clara announces future RESNET® conference locations: San Antonio, Texas, in 2026 and Savannah, Georgia, in 2027, both offering unique settings for continued learning and connection. Bill wraps up by encouraging listeners to explore the conference page at resnet.us/2025 for more details and to download the Whova app to easily access schedules, session info, and networking opportunities.

Here's the link to the conference website: https://whova.com/web/1qIMgnn8Jh5Y4xsupaZaVuFO57dR7GOfzNfNibqilSE%3D/

To reach the conference team, you can email them at: conference@resnet.us

To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US

Digital Pathology Podcast
119: DigitPath Digest #19 | Cytology's Digital Revolution, Prostate cancer tsunami + Live AI Demo

Digital Pathology Podcast

Jan 11, 2025 · 31:53 · Transcription Available


In this episode of the Digital Pathology Podcast, you will learn about cytology's entrance into the digital pathology space, including successful AI and scanner implementations. We cover AI's role in rapid on-site evaluation for lung cancer and share insights on a looming prostate cancer surge and how digital pathology and AI can help. You will also hear a live demo of me using an AI assistant to decode a scientific paper in real time. Tune in to stay on top of digital pathology research in 2025!

00:00 Welcome to DigiPath Digest
00:53 Introduction and New Year Greetings
01:41 Diving into DigiPath Digest
01:44 AI in Respiratory Cytology
06:11 The Role of AI in Pathology
09:49 Multi-Omics and AI
11:28 Radiomics and Pathomics
14:44 Live Q&A and Future Plans
20:09 Prostate Cancer Tsunami
22:34 Thyroid Cytology and Live AI-Assistant Demo
31:07 Conclusion and the Option to Send Texts :)

Links and Resources:
Subscribe to Digital Pathology Podcast on YouTube
Free E-book "Pathology 101"
YouTube (unedited) version of this episode
Try Perplexity with my referral link
My new page built with Perplexity

Publications Discussed Today:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all, all our LS supporters who helped fund the gorgeous venue and A/V production!

For NeurIPS last year we did our standard conference podcast coverage, interviewing selected papers (as we have now also done for ICLR and ICML); however, we felt we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap the 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver.

The single most requested domain was computer vision, and we could think of no one better to help us recap 2024 than our friends at Roboflow, who were one of our earliest guests in 2023 and again had one of this year's top episodes in 2024. Roboflow has since raised a $40m Series B!

Links

Their slides are here:

All the trends and papers they picked:

* Isaac Robinson
  * Sora (see our Video Diffusion pod) - extending diffusion from images to video
  * SAM 2: Segment Anything in Images and Videos (see our SAM2 pod) - extending prompted masks to full video object segmentation
  * DETR Dominance: DETRs show Pareto improvement over YOLOs
    * RT-DETR: DETRs Beat YOLOs on Real-time Object Detection
    * LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
    * D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement
* Peter Robicheaux
  * MMVP (Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs)
  * Florence 2 (Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks)
  * PaliGemma / PaliGemma 2
    * PaliGemma: A versatile 3B VLM for transfer
    * PaliGemma 2: A Family of Versatile VLMs for Transfer
  * AIMv2 (Multimodal Autoregressive Pre-training of Large Vision Encoders)
* Vik Korrapati - Moondream

Full Talk on YouTube

Want more content like this? Like and subscribe to stay updated on our latest talks, interviews, and podcasts.

Transcript/Timestamps

[00:00:00] Intro

[00:00:05] AI Charlie: Welcome to Latent Space Live, our first mini-conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. When we were thinking of ways to add value to our academic conference coverage, we realized that there was a lack of good talks just recapping the best of 2024, going domain by domain.

[00:00:36] AI Charlie: We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the Latent Space Network to cover each field. 200 of you joined us in person throughout the day, with over 2,200 watching live online. Our second featured keynote is The Best of Vision 2024, with Peter Robicheaux and Isaac [00:01:00] Robinson of Roboflow, with a special appearance from Vik Korrapati of Moondream.

[00:01:05] AI Charlie: When we did a poll of our attendees, the highest-interest domain of the year was vision. And so our first port of call was our friends at Roboflow. Joseph Nelson helped us kickstart our vision coverage in episode 7 last year, and this year came back as a guest host with Nikhila Ravi of Meta to cover Segment Anything 2.

[00:01:25] AI Charlie: Roboflow have consistently been the leaders in open-source vision models and tooling, with their Supervision library recently eclipsing PyTorch's Vision library.
And Roboflow Universe hosts hundreds of thousands of open-source vision datasets and models. They have since announced a $40 million Series B led by Google Ventures.

[00:01:46] AI Charlie: Woohoo.

[00:01:48] Isaac's picks

[00:01:48] Isaac Robinson: Hi, we're Isaac and Peter from Roboflow, and we're going to talk about the best papers of 2024 in computer vision. So, for us, we defined best as what made [00:02:00] the biggest shifts in the space. And to determine that, we looked at what are some major trends that happened and what papers most contributed to those trends.

[00:02:09] Isaac Robinson: So I'm going to talk about a couple trends, Peter's going to talk about a trend, and then we're going to hand it off to Moondream. The trends I'm interested in talking about are a major transition from models that run on a per-image basis to models that run, using the same basic ideas, on video, [00:02:28] and also how DETRs are starting to take over the real-time object detection scene from the YOLOs, which have been dominant for years.

[00:02:37] Sora, OpenSora and Video Vision vs Generation

[00:02:37] Isaac Robinson: So as a highlight, we're going to talk about Sora, which from my perspective is the biggest paper of 2024, even though it came out in February. [00:02:48] Sora is just a blog post, so I'm going to fill it in with details from replication efforts, including Open-Sora and related work such as Stable Video [00:03:00] Diffusion. And then we're also going to talk about SAM 2, which applies the SAM strategy to video, and then the improvements in 2024 to DETRs that are making them a Pareto improvement over YOLO-based models.

[00:03:15] Isaac Robinson: So to start this off, we're going to talk about the state of the art of video generation at the end of 2023: MagViT. MagViT is a discrete-token video tokenizer akin to VQGAN, but applied to video sequences, and it actually outperforms state-of-the-art handcrafted video compression frameworks [00:03:38] in terms of bit rate versus human preference for quality. Videos generated by autoregressing on these discrete tokens look pretty nice, but only up to about five seconds in length, and not super detailed. And then suddenly, a few months later, we have this, which when I saw it was totally mind-blowing to me. [00:03:59] 1080p, [00:04:00] a whole minute long. We've got light reflecting in puddles. Reminds me of those RTX demonstrations for next-generation video games, such as Cyberpunk, but with better graphics. You can see some issues in the background if you look closely, but as with a lot of these models, the issues tend to be things people aren't going to pay attention to unless they're looking for them, [00:04:24] in the same way that six fingers on a hand is a giveaway you're not going to notice unless you're looking for it.

So yeah, as we said, Sora does not have a paper, so we're going to be filling it in with context from the rest of the computer vision scene attempting to replicate these efforts. So the first step: you have an LLM caption a huge amount of videos. [00:04:48] Isaac Robinson: This is a trick they introduced in DALL·E 3, where they train an image captioning model to generate very high quality captions for a huge corpus and then train a diffusion model [00:05:00] on that.
Sora and the replication efforts also show a bunch of other steps that are necessary for good video generation, [00:05:09] Isaac Robinson: including filtering by aesthetic score and filtering to make sure the videos have enough motion, so the generator isn't just learning to generate static frames.

[00:05:29] Isaac Robinson: Then we encode our video into a series of space-time latents. Once again, Sora is very sparse on details, so among the replication works, Open-Sora actually uses MagViT-v2 itself to do this, but swaps out the discretization step for a classic VAE autoencoder framework. They show there's a lot of benefit from the temporal compression, which makes a lot of sense, as [00:05:53] sequential frames in videos carry mostly redundant information. By compressing in the temporal dimension, you allow the latent to hold [00:06:00] a lot more semantic information while avoiding that duplication.

So, we've got our space-time latents, obtained via some 3D VAE, presumably a MagViT-v2, and then you throw them into a diffusion transformer. [00:06:19] Isaac Robinson: I think it's personally interesting to note that Open-Sora is using a MagViT-v2, which originally used an autoregressive transformer decoder to model the latent space, but is now using a diffusion transformer. So there's still a transformer happening; the question is just whether it's [00:06:37] parameterizing the stochastic differential equation or parameterizing a conditional distribution via autoregression. It's also worth noting that most high-performance diffusion models today are switching away from the classic DDPM (denoising diffusion probabilistic modeling) framework to rectified flows. [00:06:57] Isaac Robinson: Rectified flows have a very interesting property: as [00:07:00] they converge, they actually get closer to being able to be sampled with a single step, which means that in practice you can generate high quality samples much faster. A major problem of DDPM and related models for the past four years is just that they require many, many steps to generate high quality samples.

[00:07:22] Isaac Robinson: And naturally, the third step is throwing lots of compute at the problem. I never figured out how to get this video to loop, but we see very little compute, medium compute, lots of compute. This is so interesting because the original diffusion transformer paper from Facebook actually showed that, in fact, the specific hyperparameters of the transformer didn't really matter that much; [00:07:48] what mattered was that you were just increasing the amount of compute the model had. So I love how in the, once again, little blog post they don't even talk about [00:08:00] the specific hyperparameters. They say, we're using a diffusion transformer and we're just throwing more compute at it, and this is what happens. [00:08:08] Open-Sora shows similar results. The primary issue is that no one else has a 32x compute budget, so we end up in the middle of the domain with most of the related work, which is still super, super cool, just a little disappointing considering the context.
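Isaac's single-step claim about rectified flows is concrete enough to sketch. Below is a minimal Euler sampler over a learned velocity field; the `velocity_model` here is a toy stand-in (an assumption, not Sora's or Open-Sora's code), and the only point is that a straighter flow tolerates a smaller `num_steps`.

```python
import torch

@torch.no_grad()
def rectified_flow_sample(velocity_model, shape, num_steps=8, device="cpu"):
    """Euler-integrate dx/dt = v(x, t) from noise (t=0) toward data (t=1).

    The straighter ("more rectified") the learned flow, the fewer steps are
    needed; in the limit, num_steps=1 gives a usable sample in a single
    network evaluation, which is the speedup over many-step DDPM sampling.
    """
    x = torch.randn(shape, device=device)           # start from Gaussian noise
    ts = torch.linspace(0.0, 1.0, num_steps + 1, device=device)
    for t0, t1 in zip(ts[:-1], ts[1:]):
        v = velocity_model(x, t0.expand(shape[0]))  # predicted velocity at t0
        x = x + (t1 - t0) * v                       # one Euler step
    return x

# Toy stand-in for a trained DiT-style velocity network (an assumption):
velocity_model = lambda x, t: -x                    # a real model is learned
sample = rectified_flow_sample(velocity_model, shape=(2, 4, 8, 8), num_steps=4)
print(sample.shape)
```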
So I think this is a beautiful extension of the framework introduced in '22 and '23 for very high quality per-image generation, extended to videos. [00:08:39] Isaac Robinson: It's awesome. And it's GA as of Monday, except no one can seem to get access to it because they keep shutting down the login.

[00:08:46] SAM and SAM2

[00:08:46] Isaac Robinson: The next paper I wanted to talk about is SAM. We at Roboflow allow users to label data and train models on that data. SAM, for us, has saved our users 75 years of [00:09:00] labeling time. We are, to the best of my knowledge, the largest SAM API that exists. SAM also allows our users to train pure bounding-box regression models and use those to generate high quality masks, which has the great side effect of requiring less training data to reach meaningful convergence. [00:09:20] Isaac Robinson: Most people are data-limited in the real world, so anything that requires less data to get to a useful result is super useful. Many of our users run per-frame object detectors on every frame in a video, and SAM 2 falls into this category of taking something that really, really works and applying it to video, which has the wonderful benefit of being plug-and-play with many of our users' use cases. [00:09:53] Isaac Robinson: We're still building out a sufficiently mature pipeline to take advantage of that, but it's in the works. [00:10:00] So here we've got a great example: we can click on cells and then follow them. You'll even notice the cell goes away and comes back, and we can still keep track of it, which is very challenging for existing object trackers.

[00:10:14] Isaac Robinson: A high-level overview of how SAM 2 works: there's a simple pipeline where we provide some type of prompt and it fills out the likely masks for that object throughout the rest of the video. So here we're giving a bounding box in the first frame, a set of positive/negative points, or even just a simple mask. [00:10:36] Isaac Robinson: I'm going to assume people are somewhat familiar with SAM, so I'll just give a high-level overview: you have an image encoder that runs on every frame. SAM 2 can be used on a single image, in which case the only difference between SAM 2 and SAM is the image encoder: SAM used a standard ViT; [00:11:00] SAM 2 replaced that with Hiera, a hierarchical encoder, which gets approximately the same results but gives six times faster inference, which is [00:11:11] excellent, especially considering how one trend of '23 was replacing the ViT with more efficient backbones. In the case where you're doing video segmentation, the difference is that you create a memory bank, and you cross-attend the features from the image encoder against the memory bank.

[00:11:31] Isaac Robinson: So, well, I'll go more into it in a couple of slides, but essentially we take the features from the past couple frames, plus a set of object pointers and the set of prompts, and use those to generate our new masks. Then we fuse the new masks for this frame with the [00:11:57] image features and add that to the memory bank. [00:12:00] I'll say more in a minute.
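The memory-bank mechanism Isaac sketches maps to a small amount of code. This is a toy illustration of the idea only; the dimensions, the deque length, and the single cross-attention layer are all assumptions, not Meta's SAM 2 implementation.

```python
import torch
import torch.nn as nn
from collections import deque

class ToyMemoryBank(nn.Module):
    """FIFO memory of recent frame features; the current frame cross-attends
    to it before mask prediction (SAM 2-style, heavily simplified)."""

    def __init__(self, dim=256, max_frames=6):
        super().__init__()
        self.memory = deque(maxlen=max_frames)   # oldest frame falls out (FIFO)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8,
                                                batch_first=True)

    def forward(self, frame_feats):              # (B, n_tokens, dim)
        if self.memory:                          # condition on past frames
            mem = torch.cat(list(self.memory), dim=1)
            frame_feats, _ = self.cross_attn(frame_feats, mem, mem)
        # In SAM 2 the memory entry is the frame features fused with the
        # predicted mask; here we just store the conditioned features.
        self.memory.append(frame_feats.detach())
        return frame_feats

bank = ToyMemoryBank()
for _ in range(10):                              # ten incoming video frames
    feats = torch.randn(1, 64, 256)              # stand-in encoder output
    _ = bank(feats)
print(len(bank.memory))                          # capped at 6 by the deque
```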
Just like SAM, SAM 2 actually uses a data engine to create its dataset: they assembled a huge amount of reference data, used people to label some of it, trained the model, used the model to label more of it, and asked people to refine the model's predictions. [00:12:20] Isaac Robinson: And then ultimately the dataset is just created from the engine's final output on the reference data. It's very interesting. This paradigm is so interesting to me because it unifies a model and a dataset in a way that is very unique. It seems unlikely that another model could come in and have such a tight coupling.

[00:12:37] Isaac Robinson: So, a brief overview of how the memory bank works; the paper did not have a great visual, so I'm going to fill in a bit more. We take the last couple of frames from our video and attend to them, along with the set of prompts that we provided (they could come from the future, [00:13:00] they could come from anywhere in the video) as well as reference object pointers saying, by the way, here's what we've found so far. Attending to the last few frames has the interesting benefit of allowing the model to handle complex object motion; [00:13:18] Isaac Robinson: by limiting the number of frames that you attend to, you manage to keep the model running in real time. This is such an interesting topic for me because one would assume that attending to all of the frames, or having some type of summarization of all the frames, is essential for high performance. [00:13:35] Isaac Robinson: But we see in their later ablation that that actually is not the case. So here, just to make sure there is some benchmarking happening, they compare to some of the stuff that came out prior, and indeed the SAM 2 strategy does improve on the state of the art. [00:13:59] Isaac Robinson: [00:14:00] This ablation deep in their appendices was super interesting to me. We see in section C the number of memories. One would assume that increasing the count of memories would meaningfully increase performance, and we see that it has some impact, but not the type you'd expect, and it meaningfully decreases speed, which justifies, in my mind, just having this FIFO queue of memories. [00:14:20] Isaac Robinson: Although in the future I'm super interested to see a more dedicated summarization of the whole preceding video, not just a stacking of the last frames. So that's another extension of beautiful per-frame work into the video domain.

[00:14:42] Realtime detection: DETRs > YOLO

[00:14:42] Isaac Robinson: The next trend I'm interested in talking about: at Roboflow, we're super interested in training real-time object detectors. [00:14:50] Isaac Robinson: That's our bread and butter, and so we're doing a lot to keep track of what is actually happening in that space. We are finally starting to see something change. So, [00:15:00] for years, YOLOs have been the dominant way of doing real-time object detection, and we can see here that they've essentially stagnated. [00:15:08] Isaac Robinson: The performance between YOLOv10 and YOLOv11 is not meaningfully different, at least in this type of high-level chart, and even across the last couple of series there's not a major change. So YOLOs have hit a plateau; DETRs have not. We can look here and see the YOLO series has this plateau.
And then RT-DETR, LW-DETR, and D-FINE have meaningfully changed that plateau, so that in fact the best D-FINE models are plus 4.6 AP on COCO at the same latency. [00:15:43] Isaac Robinson: So, three major steps to accomplish this. The first is RT-DETR, which is technically a 2023 preprint but was published officially in '24, so I'm going to include it; I hope that's okay. [00:16:00] RT-DETR showed that we could actually match or out-speed the YOLOs. [00:16:04] Isaac Robinson: Then LW-DETR showed that pre-training is hugely effective on DETRs and much less so on YOLOs. And then D-FINE added the types of bells and whistles that we expect in this arena. The major improvement RT-DETR made was taking the multi-scale features that DETRs typically pass into their encoder and decoupling them into a much more efficient transformer encoder. [00:16:30] Isaac Robinson: The transformer is, of course, quadratic complexity, so decreasing the amount of stuff you pass in at once is super helpful for increasing your throughput. That change basically brought us up to YOLO speed, and then they do a hardcore analysis on benchmarking YOLOs, including the NMS step. [00:16:54] Isaac Robinson: Once you include the NMS in the latency calculation, you see that in fact these DETRs [00:17:00] are outperforming, at least at this point, the YOLOs that existed.

Then LW-DETR goes in and suggests that, in fact, the huge boost here is from pre-training. So, this is the D-FINE line, and this is the D-FINE line without pre-training. [00:17:19] Isaac Robinson: It's within range; it's still an improvement over the YOLOs, but the really huge boost comes from the benefit of pre-training. When YOLOX came out in 2021, they showed that they got much better results from a much, much longer training time, but they found that when they did that, they actually did not benefit from pre-training. [00:17:40] Isaac Robinson: So you see in this graph from LW-DETR that, in fact, YOLOs do get a real benefit from pre-training, but it goes away as we increase the training time. The DETRs, meanwhile, converge much faster: LW-DETR trains for only 50 epochs, RT-DETR for 60 epochs. So one could assume that, in fact, [00:18:00] the entire extra gain from pre-training is that you're not destroying your original weights [00:18:06] Isaac Robinson: by relying on a long training cycle. And LW-DETR also shows superior performance on our favorite dataset, Roboflow 100, which means they do better on the real world, not just on COCO. Then D-FINE throws all the bells and whistles at it. YOLO models tend to have a lot of very specific, complicated loss functions; [00:18:26] Isaac Robinson: D-FINE brings those into the DETR world and shows consistent improvement across a variety of DETR-based frameworks. So bring these all together, and we see that suddenly we have almost 60 AP on COCO while running in like 10 milliseconds. Huge, huge stuff. So we're spending a lot of time trying to build models that work better with less data, and DETRs are clearly becoming a promising step in that direction. What we're interested in seeing [00:18:56] Isaac Robinson: [00:19:00] from the DETRs in this trend next: Co-DETR and the models currently sitting at the top of the leaderboard for large-scale inference scale really well as you switch out the backbone.
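Isaac's methodological point, that YOLO latency numbers are only comparable to end-to-end DETRs once NMS is counted, can be illustrated with a toy timing harness like the one below. The box counts and thresholds are made up; `torchvision.ops.nms` is the real post-processing step YOLO-style detectors need and DETRs skip.

```python
import time
import torch
from torchvision.ops import nms

def time_ms(fn, iters=50):
    # Wall-clock average; on GPU you would also call torch.cuda.synchronize().
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

# Stand-in for a YOLO head's raw output at 640x640: ~8400 candidate boxes.
boxes = torch.rand(8400, 4) * 640
boxes[:, 2:] += boxes[:, :2]          # make valid xyxy boxes (x2 > x1, y2 > y1)
scores = torch.rand(8400)

# DETR-style models emit a small, fixed set of final boxes: nothing to
# suppress. YOLO-style models must run NMS on every frame, so a fair
# benchmark adds this cost to the model's forward-pass time:
nms_ms = time_ms(lambda: nms(boxes, scores, iou_threshold=0.7))
print(f"NMS adds ~{nms_ms:.2f} ms per frame on top of the forward pass")
```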
We're very interested in seeing people publish a paper, potentially us, on what happens if you take these real-time models and then throw a Swin at them. [00:19:23] Isaac Robinson: Like, do we have a Pareto curve that extends from the real-time domain all the way up to the super slow but high-performance domain? We also want to see people benchmarking on RF100 more, because that type of data is what's relevant for most users. And we want to see more pre-training, because pre-training works now. [00:19:43] Isaac Robinson: It's super cool.

[00:19:48] Peter's Picks

[00:19:48] Peter Robicheaux: Alright, so in that theme, one of the big things we're focusing on is how we get more out of our pre-trained models, and one of the lenses to look at this through is sort of [00:20:00] this new requirement for fine-grained visual detail in the representations extracted from your foundation model. [00:20:08] Peter Robicheaux: So that's sort of the hook for this. Oh yeah, this is just a list of all the papers I'm going to mention; I just want to make sure I cite the actual papers so you can find them later.

[00:20:18] MMVP (Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs)

[00:20:18] Peter Robicheaux: Yeah, so the big hook here is that I make the claim that LLMs can't see. If you go to Claude or ChatGPT and ask it to look at this watch and tell you what time it is, it fails, right? [00:20:34] Peter Robicheaux: And so you could say, okay, this is a very classic test of an LLM, but maybe the image is too zoomed out, and it'll do better if we increase the resolution, so it has an easier time finding fine-grained features like where the watch hands are pointing. [00:20:53] Peter Robicheaux: No dice. And you can say, okay, well, maybe the model just doesn't know how to tell time from the position of the hands. But if you prompt [00:21:00] it textually, it's very easy for it to tell the time. So this, to me, is proof that these LLMs literally cannot see the position of the watch hands; it can't see those details. [00:21:08] Peter Robicheaux: And for you Anthropic heads out there, Claude fails too. So my first pick for best paper of 2024 in vision is this MMVP paper, which tries to investigate why LLMs don't have the ability to see fine-grained details. For instance, it comes up with a lot of images like this, where you ask a question that seems very visually apparent to us, like, which way is the school bus facing? [00:21:32] Peter Robicheaux: And it gets it wrong, and then, of course, it makes up details to support its wrong claim. The process by which it finds these images is sort of contained in its hypothesis for why the models can't see these details. It hypothesizes that models initialized with CLIP as their vision encoder lack fine-grained detail in the features extracted using CLIP, because CLIP doesn't need to find these fine-grained [00:22:00] details to do its job correctly, which is just to match captions and images, right? [00:22:04] Peter Robicheaux: And at a high level, even if ChatGPT's vision encoder wasn't initialized with CLIP and wasn't trained contrastively at all, in order to do its job of captioning the image it could still do a pretty good job without actually finding the exact position of all the objects and visual features in the image, right?
Still, in order to do its job of capturing the image it could do a pretty good job without actually finding the exact position of all the objects and visual features in the image, right?[00:22:21] Peter Robicheaux: So This paper finds a set of difficult images for these types of models. And the way it does it is it looks for embeddings that are similar in clip space, but far in DynaV2 space. So DynaV2 is a foundation model that was trained self supervised purely on image data. And it kind of uses like some complex student teacher framework, but essentially, and like, it patches out like certain areas of the image or like crops with certain areas of the image and tries to make sure that those have consistent representations, which is a way for it to learn very fine grained visual features.[00:22:54] Peter Robicheaux: And so if you take things that are very close in clip space and very far in DynaV2 space, you get a set of images [00:23:00] that Basically, pairs of images that are hard for a chat GPT and other big language models to distinguish. So, if you then ask it questions about this image, well, as you can see from this chart, it's going to answer the same way for both images, right?[00:23:14] Peter Robicheaux: Because to, to, from the perspective of the vision encoder, they're the same image. And so if you ask a question like, how many eyes does this animal have? It answers the same for both. And like all these other models, including Lava do the same thing, right? And so this is the benchmark that they create, which is like finding clip, like clip line pairs, which is pairs of images that are similar in clip space and creating a data set of multiple choice questions based off of those.[00:23:39] Peter Robicheaux: And so how do these models do? Well, really bad. Lava, I think, So, so, chat2BT and Jim and I do a little bit better than random guessing, but, like, half of the performance of humans who find these problems to be very easy. Lava is, interestingly, extremely negatively correlated with this dataset. It does much, much, much, much worse [00:24:00] than random guessing, which means that this process has done a very good job of identifying hard images for, for Lava, specifically.[00:24:07] Peter Robicheaux: And that's because Lava is basically not trained for very long and is initialized from Clip, and so You would expect it to do poorly on this dataset. So, one of the proposed solutions that this paper attempts is by basically saying, Okay, well if clip features aren't enough, What if we train the visual encoder of the language model also on dyno features?[00:24:27] Peter Robicheaux: And so it, it proposes two different ways of doing this. One, additively which is basically interpolating between the two features, and then one is interleaving, which is just kind of like training one on the combination of both features. So there's this really interesting trend when you do the additive mixture of features.[00:24:45] Peter Robicheaux: So zero is all clip features and one is all DynaV2 features. So. It, as you, so I think it's helpful to look at the right most chart first, which is as you increase the number of DynaV2 features, your model does worse and worse and [00:25:00] worse on the actual language modeling task. And that's because DynaV2 features were trained completely from a self supervised manner and completely in image space.[00:25:08] Peter Robicheaux: It knows nothing about text. These features aren't really compatible with these text models. 
And so you can train an adapter all you want, but it seems the features are in such an alien language that it's a very hard optimization problem for these models to solve. And that kind of supports what's happening on the left, which is that, yeah, the model gets better at answering these questions as you include more DINOv2 features, up to a point, but when you oversaturate, it completely loses its ability to [00:25:36] Peter Robicheaux: answer language and do language tasks. You can also see that with the interleaving they essentially double the number of tokens going into these models and just train on both, and it still doesn't really solve the MMVP task. It gets LLaVA 1.5 above random guessing by a little bit, but it's still not close to ChatGPT or any human performance, obviously. [00:26:00] So clearly this proposed solution of just using DINOv2 features directly isn't going to work, and basically what that means is that, as a vision foundation model, DINOv2 is going to be insufficient for language tasks, right?

[00:26:14] Florence 2 (Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks)

[00:26:14] Peter Robicheaux: So my next pick for best paper of 2024 would be Florence-2, which tries to solve this problem by incorporating not only this dimension of spatial hierarchy, which is to say pixel-level understanding, but also what they call semantic granularity. The goal is basically to have features that are sufficient for finding objects in the image, so they have enough pixel information, but that can also be talked about and reasoned about; that's the semantic granularity axis. [00:26:44] Peter Robicheaux: So here's an example of the three different paradigms of labeling they use to create a big dataset. One is text, which is just captioning, and you would expect a model trained [00:27:00] only on captioning to perform like ChatGPT: no spatial hierarchy, no features that are meaningful at the pixel level. [00:27:08] Peter Robicheaux: So they add another type, region-text pairs, which is essentially classifying a region, doing object detection or instance segmentation on that region, or captioning that region. And then they have text-phrase-region annotations, which are essentially triples: not only do you have a region that you've described, you also find its place in a descriptive paragraph about the image, which is basically trying to introduce even more semantic understanding of these regions. [00:27:39] Peter Robicheaux: So, for instance, if you're saying a woman riding on the road, you have to know what a woman is and what the road is and that she's on top of it. That's basically composing a bunch of objects in this visual space, but also thinking about it semantically, right? And the way they do this is they basically just dump features from a vision encoder [00:28:00] straight into an encoder-decoder transformer.
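The three Florence-2 labeling granularities Peter walks through might look like this as records; the field names and values here are invented for illustration, not the paper's actual schema.

```python
# Image-level text: a caption with no spatial grounding at all.
image_text = {"caption": "A woman rides a bike down a wet road."}

# Region-text pair: one box plus a label or caption for just that region
# (classification, detection, or region-captioning supervision).
region_text = {"box": [412, 128, 668, 540],            # xyxy, pixels
               "text": "woman riding a bicycle"}

# Text-phrase-region triple: a full caption, a phrase inside it, and the
# region that phrase grounds to -- spatial and semantic supervision at once.
text_phrase_region = {
    "caption": "A woman rides a bike down a wet road.",
    "phrase_span": (2, 7),                             # chars of "woman"
    "box": [412, 128, 668, 540],
}
```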
So they introduced a bunch of new tokens to point to locations and[00:28:22] Peter Robicheaux: So how does it work? How does it actually do? We can see if you look at the graph on the right, which is using the, the Dino, the the Dino framework your, your pre trained Florence 2 models transfer very, very well. They get 60%, 60 percent map on Cocoa, which is like approaching state of the art and they train[00:28:42] Vik Korrapati: with, and they[00:28:43] Peter Robicheaux: train with a much more more efficiently.[00:28:47] Peter Robicheaux: So they, they converge a lot faster, which both of these things are pointing to the fact that they're actually leveraging their pre trained weights effectively. So where is it falling short? So these models, I forgot to mention, Florence is a 0. 2 [00:29:00] billion and a 0. 7 billion parameter count. So they're very, very small in terms of being a language model.[00:29:05] Peter Robicheaux: And I think that. This framework, you can see saturation. So, what this graph is showing is that if you train a Florence 2 model purely on the image level and region level annotations and not including the pixel level annotations, like this, segmentation, it actually performs better as an object detector.[00:29:25] Peter Robicheaux: And what that means is that it's not able to actually learn all the visual tasks that it's trying to learn because it doesn't have enough capacity.[00:29:32] PalíGemma / PaliGemma 2[00:29:32] Peter Robicheaux: So I'd like to see this paper explore larger model sizes, which brings us to our next big paper of 2024 or two papers. So PolyGemma came out earlier this year.[00:29:42] Peter Robicheaux: PolyGemma 2 was released, I think like a week or two ago. Oh, I forgot to mention, you can actually train You can, like, label text datasets on RoboFlow and you can train a Florence 2 model and you can actually train a PolyGemma 2 model on RoboFlow, which we got into the platform within, like, 14 hours of release, which I was really excited about.[00:29:59] Peter Robicheaux: So, anyway, so [00:30:00] PolyGemma 2, so PolyGemma is essentially doing the same thing, but instead of doing an encoder decoder, it just dumps everything into a decoder only transformer model. But it also introduced the concept of location tokens to point to objects in pixel space. PolyGemma 2, so PolyGemma uses Gemma as the language encoder, and it uses Gemma2B.[00:30:17] Peter Robicheaux: PolyGemma 2 introduces using multiple different sizes of language encoders. So, the way that they sort of get around having to do encoder decoder is they use the concept of prefix loss. Which basically means that when it's generating, tokens autoregressively, it's all those tokens in the prefix, which is like the image that it's looking at and like a description of the task that it's trying to do.[00:30:41] Peter Robicheaux: They're attending to each other fully, full attention. Which means that, you know, it can sort of. Find high level it's easier for the, the prefix to color, to color the output of the suffix and also to just find like features easily. 
So this is sort of [00:31:00] an example of one of the tasks it was trained on: you describe the task in English, you ask it to segment two classes of objects, and it finds their locations using these location tokens and their masks using some encoding of the masks into tokens. [00:31:24] Peter Robicheaux: And, yeah, so one of my critiques, I guess, of PaliGemma 1, at least, is that performance saturates as a pre-trained model after only 300 million examples seen. What this graph is representing: each blue dot is performance on some downstream task, and you can see that after seeing 300 million examples, it does about as well on all of the downstream tasks they tried it on, which was a lot, as it does after 1 billion examples, which to me also suggests a lack of capacity in this model. [00:32:00] With PaliGemma 2, you can see the results on object detection, transferred to COCO, and this also points to an increase in capacity being helpful: as both the resolution and the parameter count of the language model increase, performance increases. [00:32:16] Peter Robicheaux: Resolution makes sense, obviously; it helps to find small objects in the image. But it also makes sense for another reason, which is that it kind of gives the model a thinking register, more tokens to process when making its predictions. But yeah, you could say, oh, 43.6, that's not that great; [00:32:30] Peter Robicheaux: Florence-2 got 60. But this is not training a DINO or a DETR on top of this image encoder; it's doing the raw language modeling task on COCO. So it doesn't have any of the bells and whistles. It doesn't have the fancy losses. It doesn't even have bipartite graph matching or anything like that. [00:32:52] Okay, the big result, and one of the reasons I was really excited about this paper, is that they blow everything else away [00:33:00] on MMVP. I mean, 47.3, sure, that's nowhere near human accuracy, which, again, is 94%, but for a 2-billion-parameter language model to beat ChatGPT, that's quite the achievement. [00:33:12] Peter Robicheaux: And that sort of brings us to our final pick for paper of the year, which is AIMv2. So AIMv2 sort of says, okay, maybe coming up with all these specific annotations to find features with high fidelity in pixel space isn't actually necessary, and we can come up with an even simpler, more beautiful idea for combining image tokens and text tokens in a way that's interfaceable for language tasks. [00:33:44] Peter Robicheaux: And this is nice because it can scale; you can come up with lots more data if you don't have to come up with all these annotations, right? So the way it works [00:33:59] is it does something very similar to PaliGemma, where you have a vision encoder that dumps image tokens into a decoder-only transformer. But [00:34:00] the interesting thing is that it is also trained autoregressively to reconstruct the image tokens, with a mean squared error loss.
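The combined objective Peter describes, next-token cross-entropy on text plus autoregressive mean-squared-error reconstruction of image patch tokens, might look roughly like this; the toy `decoder` and all shapes here are assumptions for illustration, not the AIMv2 implementation.

```python
import torch
import torch.nn.functional as F

def aim_style_losses(decoder, img_tokens, txt_tokens, txt_logits):
    """img_tokens: (B, N, D) continuous patch embeddings (the targets).
    A causal decoder predicts patch t+1 from patches <= t; the prediction
    is scored with MSE instead of a softmax, alongside ordinary next-token
    cross-entropy on the caption that follows the image.
    """
    pred = decoder(img_tokens[:, :-1])               # predict the next patch
    img_loss = F.mse_loss(pred, img_tokens[:, 1:])   # autoregressive MSE

    txt_loss = F.cross_entropy(                      # usual language loss
        txt_logits[:, :-1].flatten(0, 1),
        txt_tokens[:, 1:].flatten(),
    )
    return img_loss + txt_loss

# Toy usage; the Linear "decoder" and every shape here are assumptions:
decoder = torch.nn.Linear(64, 64)
img = torch.randn(2, 16, 64)                         # B=2, 16 patches, dim 64
txt = torch.randint(0, 100, (2, 8))                  # token ids, vocab 100
logits = torch.randn(2, 8, 100)                      # decoder text logits
print(aim_style_losses(decoder, img, txt, logits))
```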
So instead of having to come up with fancy object detection or segmentation labels, you can just try to reconstruct the image and have it learn fine-grained features that way. [00:34:16] Peter Robicheaux: And it does this in, I think, a beautiful way that's compatible with the PaliGemma line of thinking: randomly sampling a prefix [00:34:35] Peter Robicheaux: length and using only that number of image tokens as the prefix. So the causal-with-prefix setup is the attention mask on the right: it's doing full block attention with some randomly sampled number of image tokens, and then reconstructing the rest of the image and the downstream caption for that image. And so this is the dataset they train on: internet-scale image data, very high quality, created essentially by the Data Filtering Networks paper, which is maybe the best CLIP data that exists. [00:34:59] Peter Robicheaux: [00:35:00] And we can see that this is finally a model that doesn't saturate. Even at the highest parameter count, it appears to keep improving in performance with more and more samples seen. And so you can think that if we just keep bumping the parameter count and increasing the examples seen, which is the line of thinking for language models, then it'll keep getting better. [00:35:27] Peter Robicheaux: It also improves with resolution, which you would expect. This is the ImageNet classification accuracy, and it does better as you increase the resolution, which means it's actually leveraging and finding fine-grained visual features. [00:35:44] Peter Robicheaux: And so how does it actually do compared to CLIP on COCO? Well, you can see that if you slap a transformer detection head on it, it gets 60.2 on COCO, which is within spitting distance of SOTA, which means it does a very good job of [00:36:00] finding visual features. But you could say, okay, well, wait a second. [00:36:03] CLIP got to 59.1, so how does this prove your claim at all? Doesn't that mean CLIP, which is known to be CLIP-blind and do badly on MMVP, is able to achieve very high performance on this fine-grained visual-features task of object detection? Well, they train on tons of data. They train on Objects365, COCO, Flickr, and everything else. [00:36:24] Peter Robicheaux: And so I think this benchmark doesn't do a great job of selling how good of a pre-trained model AIMv2 is, and we would like to see its performance with fewer examples, not trained to convergence on object detection. So seeing it in the real world on a dataset like Roboflow 100, I think, would be quite interesting. [00:36:48] Peter Robicheaux: And our final pick for paper of 2024 would be Moondream. So, introducing Vik to talk about that.

[00:36:54] swyx: But overall, that was exactly what I was looking for. Like, best of 2024, an amazing job. Yeah, [00:37:00] if there are any other questions while Vik gets set up, like vision stuff, [00:37:07] yeah, [00:37:11] Vik, go ahead.

[00:37:13] Vik Korrapati / Moondream

[00:37:13] question: Hi, well, while we're getting set up, over here. Thanks for the really awesome talk.
One of the things that's been weird and surprising is that the foundation model companies, even these multimodal LLMs, they're just, like, worse than RT-DETR at detection still. Like, if you wanted to pay a bunch of money to auto-label your detection dataset, giving it to OpenAI or Claude would be like a big waste. [00:37:37] question: So I'm curious, since even PaliGemma 2 is worse, I'm curious to hear your thoughts on how come nobody's cracked the code on a generalist that really, you know, beats a specialist model in computer vision like they have in LLM land. [00:38:00]

[00:38:01] Isaac Robinson: Okay. It's a very, very interesting question. I think it depends on the specific domain. For image classification, it's basically there. AIMv2 showed a simple attentional probe on the pre-trained features gets like 90%, which is as well as anyone does. The bigger question is why it isn't transferring to object detection, especially real-time object detection. [00:38:25] Isaac Robinson: I think, in my mind, there are two answers. One is that object detection architectures are really, really domain-specific. You know, we see all these super complicated things, and it's not easy to build something that just transfers naturally like that, whereas with image classification, you know, CLIP pre-training transfers super quickly. [00:38:48] Isaac Robinson: And the other thing is, until recently, the real-time object detectors didn't even really benefit from pre-training. Like, you see the YOLOs, essentially saturated, showing very little [00:39:00] difference from pre-training improvements, or from using a pre-trained model at all. It's not surprising, necessarily, that [00:39:12] Isaac Robinson: people aren't looking at the effects of better and better pre-training on real-time detection. Maybe that'll change in the next year. Does that answer your question?

[00:39:17] Peter Robicheaux: Can you guys hear me? Yeah, one thing I want to add, just to summarize, basically, is that until 2024, you know, we hadn't really seen a combination of transformer-based object detectors and fancy losses, and PaliGemma suffers from the same problem, which is basically to say that these ResNets, the convolutional models, have all these extreme optimizations for doing object detection, but essentially, I think it's kind of been shown now that convolutional models just don't benefit from pre-training the way transformers do, and just don't have the level of intelligence of transformer models.

[00:39:56] swyx: Awesome. Hi, [00:39:59] Vik Korrapati: can [00:40:00] you hear me?

[00:40:01] swyx: Cool. I hear you. See you. Are you sharing your screen?

[00:40:04] Vik Korrapati: Hi. Might have forgotten to do that. Let me do that. [00:40:07] Sorry, should have done that.

[00:40:17] swyx: Here's your screen. Oh, classic. You might have to quit Zoom and restart. What? It's fine. We have a capture of your screen. [00:40:34] So let's get to it.

[00:40:35] Vik Korrapati: Okay, easy enough. [00:40:49] Vik Korrapati: All right. Hi, everyone. My name is Vik. I've been working on Moondream for almost a year now. Like Shawn mentioned, I just went and looked, and it turns out the first version I released was December [00:41:00] 29, 2023. It's been a fascinating journey. So Moondream started off as a tiny vision language model.
Since then, we've expanded scope a little bit to also try and build some tooling, client libraries, et cetera, to help people really deploy it.[00:41:13] Vik Korrapati: Unlike traditional large models that are focused on assistant-type use cases, we're laser focused on building capabilities that developers can use to build vision applications that can run anywhere. So, in a lot of cases for vision, more so than for text, you really care about being able to run on the edge, run in real time, etc.[00:41:40] Vik Korrapati: So that's really important. We have different output modalities that we support. There's query, where you can ask general English questions about an image and get back human-like answers. There's captioning, which a lot of our users use for generating synthetic datasets to then train diffusion models and whatnot.[00:41:57] Vik Korrapati: We've done a lot of work to minimize hallucinations there. [00:42:00] So that's used a lot. We have open-vocabulary object detection built in, similar to a couple of more recent models like PaliGemma, et cetera, where rather than having to train a dedicated model, you can just say "show me soccer balls in this image" or "show me if there are any deer in this image," and it'll detect it.[00:42:14] Vik Korrapati: More recently, earlier this month, we released pointing capability, where if all you're interested in is the center of an object, you can just ask it to point out where that is. This is very useful when you're doing, you know, UI automation type stuff. Let's see, we have two models out right now.[00:42:33] Vik Korrapati: There's a general-purpose 2B param model, which runs fine if you're running on a server. It's good for our local LLaMA desktop friends, and it can run on flagship mobile phones. And there's the 0.5B model, which uses less memory, even with our not yet fully optimized inference client.[00:43:06] Vik Korrapati: So the way we built our 0.5B model was to start with the 2 billion parameter model and prune it while doing continual training to retain performance. Our objective during the pruning was to preserve accuracy across a broad set of benchmarks. So the way we went about it was to estimate the importance of different components of the model, like attention heads, channels, MLP rows and whatnot, using basically a technique based on the gradient.[00:43:37] Vik Korrapati: I'm not sure how much people want to know the details; we'll be writing a paper about this, but feel free to grab me if you have more questions. Then we iteratively prune a small chunk that will minimize loss in performance, retrain the model to recover performance, and bring it back. The 0.5B we released is more of a proof of concept that this is possible.[00:43:54] Vik Korrapati: I think the thing that's really exciting about this is it makes it possible for developers to build using the 2B param [00:44:00] model and just explore, build their application, and then once they're ready to deploy, figure out what exactly they need out of the model and prune those capabilities into a smaller form factor that makes sense for their deployment target.[00:44:12] Vik Korrapati: So yeah, very excited about that. Let me talk to you folks a little bit about another problem I've been working on recently, which is similar to the clocks example we've been talking about.
We had a customer reach out who had a bunch of gauges out in the field. This is very common in manufacturing and oil and gas, where you have a bunch of analog devices that you need to monitor.[00:44:34] Vik Korrapati: It's expensive to have humans look at them and monitor stuff and make sure that the system gets shut down when the temperature goes over 80 or something. So I was like, yeah, this seems easy enough. Happy to help you distill that. Let's get it going. Turns out our model couldn't do it at all.[00:44:51] Vik Korrapati: I went and looked at other open source models to see if I could just generate a bunch of data and learn from that. Did not work either. So I was like, let's look at what the folks with [00:45:00] hundreds of billions of dollars in market cap have to offer. And yeah, that doesn't work either. My hypothesis is that the way these models are trained is using a large amount of image-text data scraped from the internet.[00:45:15] Vik Korrapati: And that can be biased. In the case of gauges, most gauge images aren't gauges in the wild, they're product images. Detail images like these, where it's always set to zero. It's paired with an alt text that says something like GIVTO, pressure sensor, PSI, zero to 30 or something. And so the models are fairly good at picking up those details.[00:45:35] Vik Korrapati: It'll tell you that it's a pressure gauge. It'll tell you what the brand is, but it doesn't really learn to pay attention to the needle over there. And so, yeah, that's a gap we need to address. So naturally my mind goes to, let's use synthetic data to solve this problem. That works, but it's problematic, because it turned out we needed millions of synthetic gauge images to get to reasonable performance.[00:45:57] Vik Korrapati: And thinking about it, reading a gauge is [00:46:00] not a zero-shot process in our minds, right? Like if you had to tell me the reading in Celsius for this real-world gauge: there's two dials on there. So first you have to figure out which one you have to be paying attention to, like the inner one or the outer one.[00:46:14] Vik Korrapati: You look at the tip of the needle, you look at what labels it's between, and you count how many ticks and do some math to figure out what that probably is. So what happens if we just add that as a chain of thought, to allow the model to better learn the subtasks it needs to perform to accomplish this goal?[00:46:37] Vik Korrapati: So you can see in this example, this was actually generated by the latest version of our model. It's like, okay, Celsius is the inner scale. It's between 50 and 60. There's 10 ticks. So the second tick. It's a little debatable here, like there's a weird shadow situation going on, the dial is off, so I don't know what the ground truth is, but it works okay.[00:46:57] Vik Korrapati: The points [00:47:00] over there are actually grounded. I don't know if this is easy to see, but when I click on those, there's a little red dot that moves around on the image. The model actually has to predict where these points are. I was already trying to do this with bounding boxes, but then Molmo came out with pointing capabilities.[00:47:15] Vik Korrapati: And pointing is a much better paradigm to represent this. We see pretty good results. This one's actually for clock reading.
I couldn't find our chart for gauge reading at the last minute, so the light blue chart is with our grounded chain of thought. We built a clock-reading benchmark of about 500 images, and this measures accuracy on that.[00:47:37] Vik Korrapati: You can see it's a lot more sample efficient when you're using the chain of thought to train the model. Another big benefit from this approach is you can kind of understand how the model is doing it and how it's failing. So in this example, the actual correct reading is 54 Celsius; the model output [00:48:00] 56. Not too bad, but you can actually go and see where it messed up. Like, it got a lot of these right, except instead of saying it was on the 7th tick, it actually predicted that it was the 8th tick, and that's why it went with 56.[00:48:14] Vik Korrapati: So now that you know that it's failing in this way, you can adjust how you're doing the chain of thought to maybe say, actually count out each tick from 40, instead of just trying to say it's the eighth tick. Or you might say, okay, I see that there's that middle thing, I'll count from there instead of all the way from 40.[00:48:31] Vik Korrapati: So it helps a ton. The other thing I'm excited about is few-shot prompting, or test-time training with this. Like if a customer has a specific gauge that we're seeing minor errors on, they can give us a couple of examples where, if it's misdetecting the needle, they can go in and correct that in the chain of thought.[00:48:49] Vik Korrapati: And hopefully that works the next time. Now, it's an exciting approach, but we've only applied it to clocks and gauges. The real question is, is it going to generalize? Probably. Like, there are some signs from text models that when you train on a broad number of tasks, it does generalize. And I'm seeing some signs with our model as well.[00:49:05] Vik Korrapati: So, in addition to the image-based chain of thought stuff, I also added some spelling-based chain of thought to help it better understand OCR, I guess. I don't understand why everyone doesn't do this, by the way. Like, it's a trivial benchmark question. It's very, very easy to nail. But I also wanted to support it for stuff like license plate partial matching, like, hey, does any license plate in this image start with WHA or whatever?[00:49:29] Vik Korrapati: So yeah, that sort of worked. All right, that ends my story about the gauges. If you think about what's going on over here, it's interesting that LLMs are showing enormous progress in reasoning, especially with the latest set of models that we've seen, but I have a feeling that VLMs are lagging behind, as we can see with these tasks that should be very simple for a human to do [00:50:00] and that are very easy to find VLMs failing at.[00:50:04] Vik Korrapati: My hypothesis on why this is the case is because on the internet, there's a ton of data that talks about how to reason. There's books about how to solve problems. There's books critiquing the books about how to solve problems. But humans are just so good at perception that we never really talk about it.[00:50:20] Vik Korrapati: Like, maybe in art books where it's like, hey, to show that that mountain is further away, you need to desaturate it a bit or whatever. But the actual data on how to look at images isn't really present. Also, the data we have is kind of sketchy.
The best source of data we have is image alt-text pairs on the internet, and that's pretty low quality.[00:50:40] Vik Korrapati: So yeah, I think our solution here is really just: we need to teach them how to operate on individual tasks and figure out how to scale that out. All right. Yep. So, conclusion. At Moondream we're trying to build amazing VLMs that run everywhere. Very hard problem. Much work ahead, but we're making a ton of progress and I'm really excited [00:51:00] about it. If anyone wants to chat about more technical details about how we're doing this, or you're interested in collaborating, please, please hit me up.[00:51:08] Isaac Robinson: Yeah,[00:51:09] swyx: like, when people say multimodality, you know, I always think about vision as the first among equals in all the modalities. So, I really appreciate having the experts in the room. Get full access to Latent Space at www.latent.space/subscribe
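To make the causal-with-prefix idea from the AIMv2/PaliGemma discussion above concrete, here is a minimal sketch of that attention mask in PyTorch: a randomly sampled number of image tokens attend to each other with full block (bidirectional) attention, while everything after them stays causal. The function name and tensor layout are illustrative, not taken from either paper's code.

```python
# Sketch: causal-with-prefix attention mask, as described for AIMv2/PaliGemma.
# Assumes a sequence of image tokens followed by text tokens; illustrative only.
import torch

def prefix_causal_mask(seq_len: int, num_image_tokens: int) -> torch.Tensor:
    """Boolean mask where True means the query (row) may attend to the key (column)."""
    # Randomly sample how many image tokens form the bidirectional prefix.
    prefix_len = int(torch.randint(1, num_image_tokens + 1, (1,)))
    # Start from a standard causal (lower-triangular) mask over the whole sequence.
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    # Full block attention within the prefix: prefix tokens see each other
    # bidirectionally; everything after the prefix remains causal.
    mask[:prefix_len, :prefix_len] = True
    return mask

# Example: 16 tokens total, of which the first 8 are image tokens.
print(prefix_causal_mask(16, 8).int())  # 1 = attention allowed
```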
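Vik describes estimating the importance of attention heads, channels, and MLP rows "using basically a technique based on the gradient" without giving details. One common criterion in that family is a first-order Taylor score, the magnitude of weight times gradient summed over each structural unit. The sketch below illustrates that generic idea; it is not Moondream's actual method, which has not been published.

```python
# Sketch of a gradient-based importance score for structured pruning
# (a generic first-order Taylor criterion, not Moondream's actual recipe):
# score each output row of a linear layer by |weight * gradient|, then
# treat the lowest-scoring rows as pruning candidates.
import torch
import torch.nn as nn

def row_importance(linear: nn.Linear) -> torch.Tensor:
    # Requires that .backward() has already populated linear.weight.grad.
    return (linear.weight * linear.weight.grad).abs().sum(dim=1)

# Toy example: one linear layer standing in for an MLP block.
layer = nn.Linear(64, 32)
x = torch.randn(128, 64)
loss = layer(x).pow(2).mean()       # stand-in for the real training loss
loss.backward()

scores = row_importance(layer)      # one score per output row
prune_idx = scores.argsort()[:4]    # the 4 least important rows
print("candidate rows to prune:", prune_idx.tolist())
```

In an iterative scheme like the one Vik outlines, you would prune a small chunk of the lowest-scoring units, retrain briefly to recover accuracy, and repeat.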

The BERcast
The BERcast | Season 2 Episode 2 | Honoring David Goldstein & New Year's Resolutions

The BERcast

Play Episode Listen Later Dec 20, 2024 83:59


Join BER's Chris McTaggart and Sandy Gallo as they talk New Year's resolutions, climate change, and honor home energy rating and energy policy advocate David Goldstein with RESNET's Steve Baden and FSEC's Philip Fairey!

RESTalk
EP135 RESTalk Rewind 2024: Milestones, Innovations, and the Road Ahead with Bill Spohn

RESTalk

Play Episode Listen Later Dec 9, 2024 38:01


"The future belongs to those who believe in the beauty of their dreams."  – Eleanor Roosevelt   Welcome to a special episode of RESTalk: "Best of RESTalk 2024"! This year has been transformative for RESNET® and the residential energy efficiency industry. From setting bold goals to launching innovative tools and forming impactful collaborations, 2024 has redefined what's possible in energy efficiency and sustainability. In this episode, we revisit the year's most insightful and impactful moments, featuring highlights from the conversations that shaped the industry. Join us as we dive into the ambitious goal of achieving 1 million RESNET® HERS ratings annually, explore the evolution of new QA tools, and unpack the growing Build-to-Rent housing trend. We'll also reflect on RESNET®'s strides in water and carbon efficiency programs and the game-changing advocacy efforts that have amplified energy efficiency policies nationwide. Below is a table of all the topics in this episode with links to the full podcasts. Reflection on 2024: As we recap the highlights, we celebrate the progress and resilience of the RESNET® community in driving energy efficiency and sustainability. This episode highlights the industry's achievements and innovation, from RESNET®'s ambitious goals and policy wins to groundbreaking tools like the QA app and evolving market trends like Build-to-Rent housing. Looking Ahead: 2025 promises to bring even more exciting initiatives, broader adoption of RESNET® programs, and a collective journey toward achieving 1 million HERS ratings annually. Don't miss the opportunity to engage with RESNET® through upcoming events like the RESNET® 2025 Conference and continued advocacy efforts. Call to Action: If this episode inspired you, revisit the full conversations for deeper insights and ideas. We thank our listeners, guests, and stakeholders for their contributions to an impactful year. Here's to even greater achievements in 2025! Topic & Episode Topic Full Episode link Part A GOALS & MACRO TRENDS     124 Ambitious goals 2028 www.bit.ly/RT-124 124 Stretch goals www.bit.ly/RT-124 128 Defining BTR & HERS Ratings www.bit.ly/RT-128 Part B SOFTWARE & DATA     125 Development of the QA app www.bit.ly/RT-125 134 Purpose and goals for QA App www.bit.ly/RT-134 131 Highlights from the Trends report www.bit.ly/RT-131 132 QA is the backbone www.bit.ly/RT-132 Part C PEOPLE     126 Perspectives on Women in the HERS industry www.bit.ly/RT-126 127 Robert's advocacy experience www.bit.ly/RT-127 127 Carl on how advocacy days are structured www.bit.ly/RT-127 129 Paulette explains her role in outreach www.bit.ly/RT-129 133 Evolution of IECC/HERS Compliance Specialist www.bit.ly/RT-133 130 Preview of the 2025 RESNET Conference www.bit.ly/RT-130   To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US  

RESTalk
EP134 Boosting Efficiency: How RESNET®'s QA App Is Transforming the Industry with Kiro Bondarev, Kelvin Abong and Billy Giblin

RESTalk

Play Episode Listen Later Nov 11, 2024 30:31


Without continual growth and progress, such words as improvement, achievement, and success have no meaning. – Benjamin Franklin In this episode of RESTalk, Bill Spohn speaks with three guests to discuss the RESNET® QA app, an innovative tool designed to streamline the quality assurance process for home energy ratings. Bill is joined by Billy Giblin, RESNET®'s quality assurance field specialist, and Kelvin Abong and Kiro Bondarev from The Fourth Dimension (4D), RESNET®'s technology partner. Together, they explore the app's journey from ideation to execution, its role in harmonizing quality assurance efforts, and the benefits it brings to users.   Billy provides insight into the app's development, explaining how it shifts the industry from inconsistent spreadsheet-based reviews to a more transparent, data-driven approach. Kelvin discusses 4D's collaborative efforts to design a user-friendly app and highlights future improvements based on user feedback. Kiro offers technical details, describing the app's ability to integrate with external QA tools via an API, ensuring flexibility for providers.   The conversation touches on the importance of industry adoption, the app's intuitive design, and its positive impact. The episode also shares exciting news about upcoming features, including cloud syncing and expanded integration with Energy Star. Here is a link to RESTalk EP125, where we first introduced the QA App on this podcast https://restalk.libsyn.com/ep125-qa-in-your-pocket-how-a-mobile-app-is-empowering-resnet-hers-ratings-with-cassandra-wright-leo-jansen-and-billy-giblin   A press release on the RESNET® QA App, including links to download the app: https://www.resnet.us/articles/resnet-launches-new-qa-app-for-resnet-rating-providers-and-qads/   More details on the RESNET® QA App: https://www.resnet.us/about/qa/resnet-qa-app/   To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US  

Oracle University Podcast
Oracle AI Vector Search: Part 1

Oracle University Podcast

Play Episode Listen Later Oct 22, 2024 13:14


In this episode, Senior Principal APEX and Apps Dev Instructor Brent Dayley joins hosts Lois Houston and Nikita Abraham to discuss Oracle AI Vector Search. Brent provides an in-depth overview, shedding light on the brand-new vector data type, vector embeddings, and the vector workflow.   Oracle Database 23ai: Oracle AI Vector Search Fundamentals: https://mylearn.oracle.com/ou/course/oracle-database-23ai-oracle-ai-vector-search-fundamentals/140188/   Oracle Database 23ai: SQL Workshop: https://mylearn.oracle.com/ou/course/oracle-database-23ai-sql-workshop/137830/   Oracle University Learning Community: https://education.oracle.com/ou-community   LinkedIn: https://www.linkedin.com/showcase/oracle-university/   Twitter: https://twitter.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Radhika Banka, and the OU Studio Team for helping us create this episode.   ---------------------------------------------------------   Episode Transcript:   00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started!   00:26 Lois: Hello and welcome to the Oracle University Podcast! I'm Lois Houston, Director of Innovation Programs here at Oracle University. Joining me as always is our Team Lead of our Editorial Services, Nikita Abraham. Nikita: Hi everyone! Thanks for tuning in over the last few months as we've been discussing all the Oracle Database 23ai new features. We're coming to the end of the season, and to close things off, in this episode and the next one, we're going to be talking about the fundamentals of Oracle AI Vector Search. In today's episode, we'll try to get an overview of what vector search is, why Oracle Vector Search stands out, and dive into the new vector data type. We'll also get insights into vector embedding models and the vector workflow. 01:11 Lois: To take us through all of this, we're joined by Brent Dayley, who is a Senior Principal APEX and Apps Development Instructor with Oracle University. Hi Brent! Thanks for joining us today. Can you tell us about the new vector data type? Brent: So this data type was introduced in Oracle Database 23ai. And it allows you to store vector embeddings alongside other business data. The vector data type provides a foundation for storing vector embeddings. 01:42 Lois: And what are vector embeddings, Brent? Brent: Vector embeddings are mathematical representations of data points. They assign mathematical representations based on the meaning and context of your unstructured data. You have to generate vector embeddings from your unstructured data either outside or within the Oracle Database. In order to get vector embeddings, you can either use ONNX embedding machine learning models or access third-party REST APIs. Embeddings can be used to represent almost any type of data, including text, audio, or visual, such as pictures. And they are used in proximity searches. 02:28 Nikita: Hmmm, proximity search. And similarity search, right? Can you break down what similarity search is and how it functions? Brent: So vector data tends to be unevenly distributed and clustered into groups that are semantically related. Doing a similarity search based on a given query vector is equivalent to retrieving the k nearest vectors to your query vector in your vector space. 
What this means is that basically you need to find an ordered list of vectors by ranking them, where the first row is the closest or most similar vector to the query vector. The second row in the list would be the second closest vector to the query vector, and so on, depending on your data set. What we need to do is to find the relative order of distances. And that's really what matters rather than the actual distance. Now, similarity searches tend to get data from one or more clusters, depending on the value of the query vector and the fetch size. Approximate searches using vector indexes can limit the searches to specific clusters. Exact searches visit vectors across all clusters. 03:44 Lois: Ok. I want to move on to vector embedding models. What are they and why are they valuable? Brent: Vector embedding models allow you to assign meaning to a word, a sentence, the pixels in an image, or perhaps audio. They allow you to quantify features or dimensions. Most modern vector embeddings use a transformer model. Bear in mind that convolutional neural networks can also be used. Depending on the type of your data, you can use different pretrained open source models to create vector embeddings. As an example, for textual data, sentence transformers can transform words, sentences, or paragraphs into vector embeddings. 04:33 Nikita: And what about visual data? Brent: For visual data, you can use a residual network, also known as ResNet, to generate vector embeddings. You can also use a visual spectrogram representation for audio data, and that allows us to use the audio data to fall back into the visual data case. Now, these can also be based on your own data set. Each model also determines the number of dimensions for your vectors. 05:02 Lois: Can you give us some examples of this, Brent? Brent: Cohere's embedding model, embed English version 3.0, has 1,024 dimensions. OpenAI's embedding model, text-embedding-3-large, has 3,072 dimensions. 05:24 Want to get the inside scoop on Oracle University? Head over to the Oracle University Learning Community. Attend exclusive events. Read up on the latest news. Get first-hand access to new products. Read the OU Learning Blog. Participate in Challenges. And stay up-to-date with upcoming certification opportunities. Visit mylearn.oracle.com to get started.  05:50 Nikita: Welcome back! Let's now get into the practical side of things. Brent, how do you import embedding models? Brent: Although you can generate vector embeddings outside the Oracle Database using pre-trained open source embeddings or your own embedding models, you also have the option of doing those within the Oracle Database. In order to use those within the Oracle Database, you need to use models that are compatible with the Open Neural Network Exchange standard, or ONNX (pronounced "Onyx"). Oracle Database implements an ONNX runtime directly within the database, and this is going to allow you to generate vector embeddings directly inside the Oracle Database using SQL. 
Now this is a breakthrough generative AI technique that combines large language models and private business data. And this allows you to deliver responses to natural language questions. RAG provides higher accuracy and avoids having to expose private data by including it in the large language model training data. 07:43 Nikita: In the last part of our conversation today, I want to ask you about the Oracle AI Vector Search workflow, starting with generating vector embeddings. Brent: Generate vector embeddings from your data, either outside the database or within the database. Now, embeddings are a mathematical representation of what your data means. So what does this long sentence mean, for instance? What are the main keywords out of it? You can also generate embeddings not only on your typical string type of data, but also on other types of data, such as pictures or perhaps audio waveforms. 08:28 Lois: Could you give us some examples? Brent: Maybe we want to convert text strings to embeddings or convert files into text. And then from text, maybe we can chunk that up into smaller chunks and then generate embeddings on those chunks. Maybe we want to convert files to embeddings, or maybe we want to use embeddings for end-to-end search. Now you have to generate vector embeddings from your unstructured data, either outside or within the Oracle Database. You can either use the ONNX embedding machine learning models or you can access third-party REST APIs. You can import pre-trained models in ONNX format for vector generation within the database. You can download pre-trained embedding machine learning models and convert them into the ONNX format if they are not already in that format. Then you can import those models into the Oracle Database and generate vector embeddings from your data within the database. Oracle also allows you to convert pre-trained models to the ONNX format using Oracle Machine Learning for Python. This enables the use of text transformers from different companies. 09:51 Nikita: Ok, so that was about generating vector embeddings. What about the next step in the workflow—storing vector embeddings? Brent: So you can create one or more columns of the vector data type in your standard relational data tables. You can also store those in secondary tables that are related to the primary tables using primary key/foreign key relationships. You can store vector embeddings on structured data and relational business data in the Oracle Database. You do store the resulting vector embeddings and associated unstructured data with your relational business data inside the Oracle Database. 10:30 Nikita: And the third step is creating vector indexes? Brent: Now you may want to create vector indexes in the event that you have huge vector spaces. This is an optional step, but it is beneficial for running similarity searches over those huge vector spaces. So once you have generated the vector embeddings, stored those vector embeddings, and possibly created the vector indexes, you can then query your data with similarity searches. This allows for native SQL operations and allows you to combine similarity searches with relational searches in order to retrieve relevant data. 11:15 Lois: Ok. I think I've got it. So, Step 1, generate the vector embeddings from your unstructured data. Step 2, store the vector embeddings. Step 3, create vector indices. And Step 4, combine similarity and keyword search. Brent: Now there is another optional step. 
You could generate a prompt and send it to a large language model for a full RAG inference. You can use the similarity search results to generate a prompt and send it to your generative large language model in order to complete your RAG pipeline. 11:59 Lois: Thank you for sharing such valuable insights about Oracle AI Vector Search, Brent. We can't wait to have you back next week to talk about vector indices and memory. Nikita: And if you want to know more about Oracle AI Vector Search, visit mylearn.oracle.com and check out the Oracle Database 23ai: Oracle AI Vector Search Fundamentals course. Lois: Yes, and if you're serious about advancing in your development journey, we recommend taking the Oracle Database 23ai SQL workshop. It's designed for those who might be familiar with SQL from other database platforms or even those completely new to SQL. Nikita: Yeah, we'll add the link to the workshop in the show notes so you can find it easily. Until next week, this is Nikita Abraham… Lois: And Lois Houston signing off! 12:45 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
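A minimal sketch of workflow steps 2 and 4 above (store vector embeddings, then query with a similarity search) using the Oracle 23ai VECTOR column type from Python. It assumes a reachable 23ai database and python-oracledb 2.x with vector support; the connection details, table, and 3-dimensional toy vectors are placeholders. Real embeddings would come from an ONNX model in the database or a third-party REST API, as Brent describes, and typically have hundreds or thousands of dimensions.

```python
# Sketch: storing and querying an Oracle 23ai VECTOR column from Python.
# Connection details and toy 3-dimensional vectors are placeholders.
import array
import oracledb

conn = oracledb.connect(user="demo", password="demo", dsn="localhost/freepdb1")
cur = conn.cursor()

# Step 2: a vector column alongside ordinary business data.
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id        NUMBER PRIMARY KEY,
        body      VARCHAR2(400),
        embedding VECTOR(3, FLOAT32)
    )""")

cur.execute(
    "INSERT INTO docs VALUES (:1, :2, :3)",
    [1, "pump maintenance guide", array.array("f", [0.9, 0.1, 0.0])],
)
conn.commit()

# Step 4: rank rows by distance to the query vector. As Brent notes, the
# relative order of distances is what matters, not the absolute values.
query_vec = array.array("f", [0.8, 0.2, 0.1])
cur.execute("""
    SELECT id, body
      FROM docs
     ORDER BY VECTOR_DISTANCE(embedding, :qv, COSINE)
     FETCH FIRST 3 ROWS ONLY""", qv=query_vec)
print(cur.fetchall())
```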

BUILDTank / buildCAST
#22-2024 Scott Doyle - RESNET Quality Assurance

BUILDTank / buildCAST

Play Episode Listen Later Oct 21, 2024 58:59


Scott Doyle is the managing director of quality assurance for the Residential Energy Services Network, or RESNET. RESNET governs the home energy rating system, or the HERS® Energy Rating Index. Scott became a HERS® Rater in 2003 and since being certified has inspected and tested the energy performance of thousands of homes across a variety of construction types and climate zones. In 2007 he became a RESNET certified Quality Assurance Designee and Trainer and oversaw quality assurance activities for EnergyLogic and five other RESNET HERS® Rating Providers. This involved quality assurance oversight of nearly 10,000 homes annually.  Scott's next move was becoming the quality assurance manager at RESNET itself, and he has been one of the guiding influences in protecting the RESNET brand and the home energy rating industry in general. As you will hear, I was one of Scott's trainers and employers, which made this conversation even more enjoyable, as it has been super fun to watch Scott's career evolve over the years. Scott Doyle on LinkedIn Residential Energy Services Network / RESNET

RESTalk
EP133 Bridging the Gap: The Future of Energy Code Compliance with RESNET® HERS® Raters with Mark Johnson and Steve Baden

RESTalk

Play Episode Listen Later Oct 7, 2024 32:02


Knowledge is of no value unless you put it into practice. Anton Chekhov   In this episode of the RESTalk podcast, host Bill Spohn talks with Mark Johnson from the International Code Council (ICC) and Steve Baden from RESNET® about a new collaboration to enhance energy code compliance. They discuss the growing complexities of energy codes, particularly the challenge of ensuring compliance amid increasing technical and regulatory demands. To address this, ICC and RESNET® are introducing a program to involve HERS raters as third-party verifiers, providing quality assurance and supporting local code officials, especially where resources are limited.   The conversation delves into the origins and development of this program, which has been over a decade in the making. Steve and Mark describe their journey, starting with joint training initiatives and establishing standards such as the Energy Rating Index (ERI). The latest effort includes certification of RESNET® HERS® raters to act as recognized compliance specialists under the International Energy Conservation Code (IECC). This certification ensures that raters understand building science and are well-versed in energy code requirements, enhancing credibility with code officials.   Mark and Steve emphasize the need for collaboration and transparency in implementing these new compliance measures. They highlight that this partnership helps jurisdictions struggling with limited code enforcement capacity and provides career growth opportunities for HERS raters. The program aims to be fully launched by the end of 2025, with both guests encouraging raters to obtain their certifications soon to position themselves at the forefront of this compliance evolution.   Here's the link to the ICC website with all of the info on the IECC/HERS Compliance Specialist Designation: https://www.iccsafe.org/content/ecs-designation/ Flyer with a quick summary: https://cdn-www-v2.iccsafe.org/wp-content/uploads/ICC_HERSComplianceSpecialist_MessagingToCodeOfficials_infographic-002.jpg.webp   To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US

The Water Zone
Live from WaterSmart Innovations 2024: Groundbreaking Solutions for Water Efficiency and Sustainability

The Water Zone

Play Episode Listen Later Sep 30, 2024 55:08


Rob Starr and Chris Davey broadcast from their recent visit to the 2024 WaterSmart Innovations Conference in Las Vegas, NV, where several companies shared their innovations addressing water and environmental challenges. RESNET's Paulette McGhie and EPA's Jonah Schein discuss new building standards for energy and water efficiency. Atoco's Magnus Bach highlights sustainable atmospheric water harvesting and CO2 capture technologies. John Green of BlueGreen Water Technologies introduces solutions for controlling blue-green algae. Kamstrup's Joe Ball details how ultrasonic water meters optimize revenue and water management. Lastly, Biogreen's Teresa Kim explains the benefits of Agromon biodegradable mulching film for sustainable farming. Podcast Recorded September 26, 2024

RESTalk
From 360K to 1 Million: RESNET®'s Vision for the Future with Scott Doyle and Ryan Meres

RESTalk

Play Episode Listen Later Sep 9, 2024 26:14


Quality is never an accident; it is always the result of intelligent effort. John Ruskin     In this episode of RESTalk, host Bill Spohn is joined by Scott Doyle and Ryan Meres from RESNET® to discuss their ambitious goal of achieving one million ratings annually by 2028. The conversation begins with an overview of RESNET®'s significant progress: 360,000 ratings were recorded in 2023, a 136% increase since 2013. The discussion highlights the importance of quality assurance in maintaining the credibility of the HERS rating system, particularly as RESNET® scales its operations and engages more stakeholders.   Scott Doyle, the Managing Director of Quality Assurance at RESNET®, emphasizes the critical role of quality assurance in ensuring consistency and reliability in the HERS rating process. This is especially important as RESNET®'s influence grows, necessitating the trust of external stakeholders. Ryan elaborates on how the RESNET® self-funding model, primarily through fees for homes submitted to their registry, has allowed the organization to expand its staff and services since 2017 without relying on government grants.   The episode also explores opportunities to expand HERS® ratings in new and existing home markets. Ryan and Scott discuss how federal tax credits, builder incentives, and new technologies drive interest in HERS® ratings, particularly among large national production builders. They also touch on the potential for growth in the existing homes market, especially among companies that manage large portfolios of older properties. The conversation concludes with a call to action for HERS® raters to build capacity in anticipation of the growing demand for energy, water, and carbon ratings and energy code services, and a reminder of the importance of attending the upcoming RESNET® conference to stay informed and connected in the industry.   Link to the 2025 RESNET® Conference: https://whova.com/web/O69mUZ3%40Ukqwwk5w4pS%40yN-VrNyD68R4WHS7uRqUeos%3D/ To the RESNET® community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US      

R-Value
Innovative Dehumidification Technologies: What's New in the Market? with Nikki Krueger, Santa Fe

R-Value

Play Episode Listen Later Sep 1, 2024 40:17


On today's R-Value podcast, IDI's Ken Allison welcomes Nikki Krueger from Santa Fe Dehumidifiers. As the industry grapples with the challenges of increasingly airtight homes, Nikki sheds light on the latest innovations in dehumidification technology and their role in maintaining healthy, comfortable living spaces.   Nikki brings over 20 years of experience in the indoor air quality (IAQ) industry to her role as Director of Marketing & Business Development at Santa Fe Dehumidifiers. A RESNET certified home energy rater and member of various industry committees, including the ACCA Manual Low Load Homes (LLH) Advisory Committee, Nikki is at the forefront of developing effective and sustainable solutions for ventilation and moisture control in buildings.    Ken and Nikki debunk dehumidification myths and reveal surprising insights about current technologies. Nikki's expertise shines as she explains why some popular solutions may not be as effective as commonly believed. As she pointedly states, "Everybody wants to use a technology to dehumidify that's not a dehumidifier... The reality of keeping our mechanical systems separate in order for them to focus and being able to deliver what we actually need is the simplest solution for the HVAC community."   In this episode:
- The impact of tighter building envelopes on indoor air quality and the need for mechanical ventilation
- Strategies for managing humidity in energy-efficient homes, including whole-house dehumidifiers
- The importance of proper HVAC system sizing in modern, well-insulated buildings
- Challenges with exhaust-only ventilation systems and the benefits of supply ventilation
- Discussion of dehumidification needs in various climate zones and building types, including Passive Houses
- The role of occupant behavior in managing indoor humidity and comfort
- Sizing considerations for dehumidifiers in different applications, such as crawl spaces and living spaces

RESTalk
EP131 HERS Ratings Unveiled: Insights into America's Energy-Efficient Homes with Ryan Meres

RESTalk

Play Episode Listen Later Aug 12, 2024 24:54


The greatest value of a picture is when it forces us to notice what we never expected to see. John Tukey     In this episode of RESTalk, host Bill Spohn welcomes his most frequent guest, RESNET staff member Ryan Meres, to discuss the latest trends from the 2024 edition of the HERS® Trends Data Report on HERS-rated homes. This report, created as an initiative of the RESNET Suppliers Advisory Board, highlights data trends and analysis in the residential construction market. The report shows a consistent year-over-year increase in HERS ratings, with the total number last year reaching 362,000, and projections for this year indicating that the number may surpass 400,000. Notably, states like Massachusetts and Arizona have seen significant adoption rates of HERS ratings in new homes, with Massachusetts leading at 98%. The conversation delves into various aspects of the report, such as the geographic distribution of HERS ratings, the types of ventilation systems used in homes, and the rise of all-electric homes. Meres notes that certain states, like Texas, have seen substantial increases in the percentage of homes receiving HERS ratings. The report also explores energy cost savings, insulation values, and air leakage rates, highlighting that homes with solar panels tend to have better insulation compared to those without. Additionally, the trend towards high-efficiency mechanical equipment and the decreasing use of gas heating and water heating systems is discussed, reflecting a gradual shift towards more energy-efficient building practices. Finally, the podcast touches on the growing importance of the HERSH2O rating system, which measures water efficiency in homes. Ryan explains that the program has been expanding, particularly in the Southwest, and is beginning to gain traction in other regions as well. The report indicates that HERSH2O rated homes can save significant amounts of water annually. The conversation concludes with a reflection on the value of the data provided in the report, emphasizing its role in helping builders and decision-makers understand and implement best practices for achieving energy and water efficiency in residential construction. Link to the report: https://www.resnet.us/wp-content/uploads/RESNET_2024_HERSTrendsDataReport_FINAL.pdf Ryan's LinkedIn: https://www.linkedin.com/in/ryan-meres-58977110/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US  

RESTalk
EP130 RESNET 2025 Preview: Celebrating 30 Years and 4 Million Homes With Clara Hedrick

RESTalk

Play Episode Listen Later Jul 8, 2024 16:30


Great things in business are never done by one person. They're done by a team of people.  Steve Jobs   Bill Spohn welcomes RESNET staff member Clara Hedrick to discuss the upcoming RESNET 2025 conference. Clara, the lead events coordinator for RESNET, shares her excitement about the event and details her responsibilities. The conference is scheduled for January 27-30, 2025, in Tempe, Arizona, a location praised for its accessibility, nightlife, and natural beauty. Clara highlights the venue, Tempe Mission Palms, chosen for its amenities and feedback from previous events. With a goal of 500 attendees, Clara anticipates a sellout crowd and emphasizes the importance of early registration. She stresses that early registration is crucial to secure accommodation and participation, as the event is expected to be in high demand. The conference theme celebrates 30 years of RESNET and 4 million homes rated. Clara outlines the general schedule, starting with an opening reception sponsored by NAIMA on January 26. The following days will feature general sessions, panel discussions, and breakout sessions focusing on various themes, including building science, energy codes, and EPA programs. New session tracks like RESNET 101 and separate business and workforce development sessions aim to provide comprehensive insights for both newcomers and seasoned professionals. Special events, such as offsite tours and training sessions, will take place later in the week, providing educational and networking opportunities. Clara also discusses the exhibit hall, which will be adjacent to the general session room, offering booth spaces and tabletops for exhibitors. The conference aims to balance formal sessions with ample networking time, ensuring attendees can connect with peers and industry leaders. This is a unique opportunity to forge new connections and strengthen existing ones. Clara encourages listeners to stay informed through the RESNET mailing list and highlights the importance of feedback in shaping future events. The episode concludes with Clara expressing her passion for the event and her commitment to making it a success, thanking the RESNET community for their continued support and engagement.   Link to the conference RSVP form: https://www.resnet.us/2025-resnet-conference-rsvp/ Contact the conference team at: conference@resnet.us RESNET Newsletter sign up (the KEY to so much info): https://signup.e2ma.net/signup/1878040/1889360/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US  

RESTalk
EP129 Empowering Builders: RESNET's Push for Water and Energy Efficiency With Paulette McGhie and Ryan Meres

RESTalk

Play Episode Listen Later Jun 10, 2024 28:40


"In the end, we will conserve only what we love; we will love only what we understand; and we will understand only what we are taught."  — Baba Dioum   In the latest episode of RESTalk podcast, Bill Spohn hosts Paulette McGhie and Ryan Meres to discuss RESNET®'s innovative approaches to water, carbon, and building codes. Paulette McGhie, who recently joined RESNET, shares her extensive background in energy compliance and her passion for energy transparency, driven by a personal experience involving her mother's high utility bills. She now focuses on outreach, education, and advocacy, aiming to promote net-zero energy homes by 2040. The discussion highlights RESNET®'s recent activities, including a significant meeting with Utah's water conservation board to address water reduction goals. Paulette emphasizes the importance of collaborative efforts among key stakeholders to create incentives and recognition programs for builders adopting water-efficient practices. Ryan Meres discusses RESNET®'s recent policy forum in Washington, D.C., where they advocated for a federal tax credit for water efficiency, similar to the existing 45L tax credit for energy-efficient homes. The podcast also covers the challenges in promoting water efficiency and adopting advanced building codes. Paulette and Ryan acknowledge the difficulties builders face, particularly in regions with strict outdoor water use regulations. They stress the need for continuous education and advocacy to overcome these obstacles. Both guests are optimistic about increasing builder participation in RESNET® programs and establishing a robust energy code compliance program, aiming for significant progress in the next year.   Paueltte's LinkedIn: https://www.linkedin.com/in/paulette-mcghie-b4675516/ Ryan's LinkedIn: https://www.linkedin.com/in/ryan-meres-58977110/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US

HVAC School - For Techs, By Techs
What is Standard 310? w/ Eric Kaiser & Chris Hughes

HVAC School - For Techs, By Techs

Play Episode Listen Later May 30, 2024 60:10


Standard 310 is a technical workflow created by ACCA, RESNET, and ANSI for grading the installation of HVAC systems, typically in new home construction. It plays a crucial role in obtaining Energy Star certification, which can qualify homeowners for tax credits under the Inflation Reduction Act. The five steps of Standard 310 are design review, duct leakage test, total system airflow, blower fan watt draw, and refrigerant charge verification. In this podcast episode, host Bryan Orr is joined by guests Chris Hughes and Eric Kaiser to discuss Standard 310 and its implications for HVAC contractors. The standard aims to ensure that HVAC systems are installed correctly and operate as designed. The process involves a third-party HERS rater conducting various tests and measurements, which contractors need to be prepared for. Proper duct sealing, airflow settings, and refrigerant charging are critical for passing the assessments. One of the challenging aspects highlighted is the refrigerant charge verification step. The standard requires either non-invasive testing (which has temperature limitations) or weigh-in verification with geotagged photos. Chris Hughes suggests manufacturers could develop more consistent commissioning protocols to streamline this process. Topics covered in the podcast:
- Overview of Standard 310 and its five steps
- Importance for Energy Star certification and tax credits
- Role of HERS raters and HVAC contractors
- Duct leakage testing and proper sealing
- Airflow measurement methods
- Blower fan watt draw challenges
- Refrigerant charge verification options
- Need for consistent commissioning protocols
- Coordination and documentation required
- Future improvements to the standard
Have a question that you want us to answer on the podcast? Submit your questions at https://www.speakpipe.com/hvacschool.  Purchase your virtual tickets for the 5th Annual HVACR Training Symposium at https://hvacrschool.com/Symposium24.  Subscribe to our podcast on your iPhone or Android.   Subscribe to our YouTube channel.  Check out our handy calculators here or on the HVAC School Mobile App for Apple and Android.

Papers Read on AI
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval

Papers Read on AI

Play Episode Listen Later May 16, 2024 36:53


State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on. 2024: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever
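The zero-shot transfer the abstract describes can be sketched in a few lines: embed the image and one natural-language prompt per candidate class, then pick the class whose text embedding best matches the image. This example uses the Hugging Face port of the original CLIP checkpoint; the labels and image path are placeholders.

```python
# Sketch: CLIP-style zero-shot classification via natural-language prompts.
# Labels and image path are placeholders.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of a dog", "a photo of a cat", "a photo of a pressure gauge"]
image = Image.open("example.jpg")

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds scaled image-text similarities; softmax gives
# a probability over the candidate labels without any task-specific training.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```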

RESTalk
EP128 Building the Future: The Rise of Build-to-Rent Housing With Thomas Cochran and Laurel Elam

RESTalk

Play Episode Listen Later May 13, 2024 25:51


"Build-to-rent isn't just about providing a place to live; it's about crafting communities where flexibility and convenience meet modern living standards." In this episode of the RESTalk podcast, host Bill Spohn welcomed returning guest Laurel Elam and new participant Thomas Cochran. The discussion primarily centered on the burgeoning trend of build-to-rent (BTR) housing. Thomas, who serves as the Senior Vice President of National Business Development and Marketing at ARCXIS, shared insights on his role which involves strategic partnerships and initiatives that span regional to national scales, focusing lately on sustainability and energy efficiency in building projects. The conversation delved into the specifics of the build-to-rent sector, which Thomas described as detached multifamily housing projects or "horizontal apartments." These developments are typically managed by property management firms after being constructed by home builders. Laurel, who has been actively presenting with Thomas on this topic, mentioned RESNET's involvement through a newly formed advisory group aimed at aligning the rating industry with the build-to-rent movement. This initiative reflects a significant opportunity for energy rating companies to contribute to this growing sector. Further, both speakers discussed the practical applications and implications of the build-to-rent model. They emphasized the importance of energy ratings not only for individual homeowners but also for institutional investors and property managers, highlighting the broad appeal and utility of the HERS (Home Energy Rating System) index in these projects. The podcast touched on the geographic expansion of build-to-rent projects, particularly noting active markets in Texas, Arizona, Florida, and Georgia, and how these projects cater to a diverse demographic, including younger generations and retirees looking for flexibility and convenience in housing. Thomas' LinkedIn: https://www.linkedin.com/in/thomas-c-549bab41/ Laurel's LinkedIn: https://www.linkedin.com/in/laurel-elam-5404817/ Related articles: https://www.resnet.us/articles/build-to-rent-housing-sees-dynamic-growth-in-2023/ https://www.resnet.us/articles/resnet-appoints-btr-advisory-group-arcxis-thomas-cochran-as-chair/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US    

RESTalk
EP127 Energizing Change: Advocates Power Up Policy at the RESNET Policy Forum With Robert Pegues and Carl Chidlow

RESTalk

Play Episode Listen Later Apr 1, 2024 24:44


"Advocacy is the mirror reflecting the voices of the community into the corridors of power." - Unknown In today's podcast we cover the topic of advocacy and public policy in the context of the residential energy sector learning from our guests Robert Pegues, General Manager of Technical Delivery at US Ecologic, and RESNET board member, and Carl Chidlow, a lobbyist representing RESNET. Robert shares his background in the sector, highlighting his experience at last year's policy forum and his involvement with regional and national energy councils. Carl provides insights into RESNET's efforts to engage with policymakers through the RESNET Policy Forum, emphasizing the importance of practitioner involvement in legislative processes and the positive outcomes from bipartisan support on issues like the 45L tax credit and VA home loans. Our main discussion revolves around the upcoming 2024 RESNET Policy Forum, where industry professionals will convene to advocate for energy efficiency and housing policies. Carl explains the event's objectives and the practical aspects of participating, such as scheduled meetings with Congress members and the importance of constituency in policy advocacy, a truly turnkey process for the participant. Our narrative underlines the significance of direct engagement in shaping policies that affect the residential energy sector. The conversation also touches on the practical and personal aspects of participating in such an event, with Robert sharing his initial apprehensions and eventual satisfaction from influencing policy and advocating for industry concerns. Both Carl and Robert encourage listeners to participate in the policy forum, highlighting the opportunity to affect change and the foundational American right to petition the government. The episode concludes with a call to action for listeners to consider attending the policy forum and engaging in the democratic process to advocate for their industry and interests. Robert's LinkedIn: https://www.linkedin.com/in/robert-pegues-785622a7/ Carl's LinkedIn: https://www.linkedin.com/in/carl-chidlow-7816ba37/ Link to the 2024 Policy Forum: https://www.resnet.us/2024-policy-forum/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US    

レアジョブ英会話 Daily News Article Podcast
Trial to see whether AI app accurately detects TB cough

レアジョブ英会話 Daily News Article Podcast

Play Episode Listen Later Mar 26, 2024 2:07


At the Kenya Medical Research Institute, research is underway to create a mobile phone application that uses AI to diagnose tuberculosis and other respiratory diseases. In a specially contained quiet room, Dr. Videlis Nduba and his team record coughs from people with respiratory diseases like tuberculosis as well as people without disease. The aim is to create software that can differentiate between the two and make a mobile phone application that can accurately recognize a cough connected to TB and other serious diseases. Natural or forced coughs are collected using three microphones, including a cheap version, a high-definition one, and a microphone on a smartphone. The recordings are sent to the University of Washington, which puts them through an existing machine-learning model called ResNet-18. Nduba believes that if the software can be proven in trials to perform accurately, it can shorten the time before a patient can get a diagnosis and treatment, and that will help curb the spread of TB. "The biggest achievement is reduced time to diagnosis. So, from when someone develops TB symptoms to when a doctor determines they have TB and they need treatment, sometimes the average can run from two to three months to one year. And when they are in the community, they are infectious and they are transmitting TB. The moment they get a cough, if you can just expose them to this software and determine this is TB, it would reduce TB transmission in the community, and a lot of TB is due to transmission," he says. But the software is not yet accurate enough to meet the standard required by the World Health Organization. The WHO says the application must be at least 90% accurate in recognizing a TB infection and it must be at least 80% accurate at detecting if no infection exists. Nduba's trials so far have shown 80% accuracy at detecting TB and 70% accuracy for detecting there is no TB. This article was provided by The Associated Press.
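The article only says the recordings are run through ResNet-18, so the sketch below shows the general shape such a pipeline could take: convert a cough recording to a log-mel spectrogram and feed it to a ResNet-18 with a two-way head (TB cough vs. non-TB cough). The preprocessing, file path, and untrained weights are assumptions, not the research team's actual system.

```python
# Sketch: classifying a cough recording with ResNet-18 via a spectrogram.
# Illustrative only; the actual KEMRI/University of Washington pipeline
# and trained weights are not public here.
import torch
import torchaudio
import torchvision

# Load a cough recording (placeholder path) and compute a mel spectrogram.
waveform, sample_rate = torchaudio.load("cough.wav")
spec = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate)(waveform)
spec = torch.log1p(spec)                      # compress dynamic range
spec = spec.mean(dim=0, keepdim=True)         # mix down to one channel
spec = spec.unsqueeze(0).repeat(1, 3, 1, 1)   # tile to 3 channels for ResNet

# ResNet-18 with a 2-way head: TB cough vs. non-TB cough.
model = torchvision.models.resnet18(weights=None, num_classes=2)
model.eval()
with torch.no_grad():
    logits = model(spec)
print(logits.softmax(dim=-1))  # untrained weights, so outputs are meaningless
```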

RESTalk
EP126 From Audits to Advocacy: Trailblazing Women in Energy Efficiency with Sharla Riead and Emelie Cuppernell-Glitch

RESTalk

Play Episode Listen Later Mar 25, 2024 25:39


"I never dreamed about success. I worked for it." - Estée Lauder   We welcome Sharla Riead, Lead Instructor at Energy Smart Institute and Emelie Cuppernell Glitch, VP Programs at Performance Systems Development to focusing on the recognition of women in the HERS (Home Energy Rating System) rating industry. Sharla shares her extensive background, beginning with founding an energy auditing company in 1979, which evolved into a HERS rating company, and subsequently into roles in quality assurance and training within the industry. She emphasizes her company's impact and the shift she has observed towards greater female involvement in the sector. Emelie details her journey in the industry, from her initial interest in science and residential energy to her current role as vice president of Performance Systems Development. She discusses the challenges she faced as a woman in the field, including overcoming assumed biases and the importance of establishing credibility. She reflects on the changing landscape for women in the industry, noting improvements and sharing a personal anecdote illustrating past gender assumptions. The conversation concludes with advice for women entering the industry, highlighting the importance of continuous learning, self-trust, and getting outside one's comfort zone. Both guests underline the evolving nature of the industry, noting an increase in female participation, the growing list of role models and encouraging more women to join. They stress the value of diversity and the need for different perspectives in building science and energy efficiency, illustrating the industry's ongoing transformation and the significant contributions of women. Links To Sharla and Emelie on LinkedIn are below along with a Press Release from the RESNET's Inaugural Class Recognizing Women Pioneers in the HERS Industry during the 2023 conference held in San Diego, CA. https://www.linkedin.com/in/sharla-riead/ https://www.linkedin.com/in/emelie-cuppernell-glitch-a8647b26/ https://www.resnet.us/about/inaugural-class-of-resnet-recognition-of-women-pioneers-in-hers-industry/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US

RESTalk
EP125 QA in Your Pocket How a Mobile App is Empowering RESNET HERS Ratings with Cassandra Wright, Leo Jansen and Billy Giblin

RESTalk

Play Episode Listen Later Feb 12, 2024 30:31


The best way to predict your future is to create it. - Abraham Lincoln   Data is king, they say. How can the RESNET Quality Assurance (QA) app's data insights revolutionize how we rate energy-efficient homes? Imagine a world where getting your home energy rating is made easier by using a smartphone app. Is that the future the RESNET QA app promises? Our trio of guests, Quality Assurance Designees (QADs) Cassandra Wright from Strand Systems and Leo Jansen from Energy Efficient Homes Midwest, and RESNET Quality Assurance Field Specialist Billy Giblin, help us understand the development and application of the new RESNET QA app, a tool designed to improve the quality and consistency of home energy ratings. Here are some key points:
Background: The QA checklist was created to address concerns about inconsistencies in quality assurance (QA) reviews. The checklist evolved through several versions and became a mandatory provider requirement. The need for better data access and tracking led to the development of the QA app.
The App: Available on the Apple App Store, Google Play, and as a web app. Streamlines the QA process by pulling data from the RESNET Buildings Registry. Saves time and reduces manual entries. Allows providers to control access for their QADs.
Future Plans: Integrate with other energy efficiency programs like Energy Star, Indoor airPlus, and Zero Energy Ready Homes. Create a forum for QADs to ask questions and share best practices. Develop dashboards for providers and QADs to track their QA and performance. Implement an API for providers who want to use their own tools for QA.
Benefits: Improves transparency and consistency in QA reviews. Provides valuable data for tracking and improving performance. Reduces the workload for QADs and providers.
Challenges: Providers who use a retroactive QA model may need to adjust their process. There is a learning curve for using the new app and API.
Overall, the RESNET QA app is a positive step towards improving the quality and consistency of home energy ratings. Everyone will need time to adjust to the new system, but the potential benefits are significant. Link to RESNET site with info on the QA app: www.resnet.us/about/qa/resnet-qa-app/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us. Or for more info on this topic, contact RESNET at INFO@RESNET.US

Papers Read on AI
Matryoshka Representation Learning

Papers Read on AI

Play Episode Listen Later Jan 30, 2024 40:07


Learned representations are a central component in modern ML systems, serving a multitude of downstream tasks. When training such representations, it is often the case that computational and statistical constraints for each downstream task are unknown. In this context, rigid, fixed-capacity representations can be either over- or under-accommodating to the task at hand. This leads us to ask: can we design a flexible representation that can adapt to multiple downstream tasks with varying computational resources? Our main contribution is Matryoshka Representation Learning (MRL), which encodes information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks. MRL minimally modifies existing representation learning pipelines and imposes no additional cost during inference and deployment. MRL learns coarse-to-fine representations that are at least as accurate and rich as independently trained low-dimensional representations. The flexibility within the learned Matryoshka Representations offers: (a) up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for long-tail few-shot classification, all while being as robust as the original representations. Finally, we show that MRL extends seamlessly to web-scale datasets (ImageNet, JFT) across various modalities -- vision (ViT, ResNet), vision + language (ALIGN) and language (BERT). MRL code and pretrained models are open-sourced at https://github.com/RAIVNLab/MRL. 2022: Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, V. Ramanujan, William Howard-Snyder, Kaifeng Chen, S. Kakade, Prateek Jain, Ali Farhadi https://arxiv.org/pdf/2205.13147v3.pdf
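The core trick is easy to sketch. Below is a toy Python illustration of the idea in the abstract, not the authors' code (that is at the linked repo): one encoder is trained so that every nested prefix of its output is itself a usable embedding, and cheaper downstream tasks simply truncate. The dimensions, the ten-class head, and the random data are arbitrary choices for illustration.

import torch

# Toy encoder producing one 512-d "Matryoshka" embedding per input.
encoder = torch.nn.Sequential(
    torch.nn.Linear(784, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 512),
)
granularities = [64, 128, 256, 512]   # nested prefix sizes

x = torch.randn(32, 784)              # a dummy batch
z = encoder(x)                        # full 512-d embeddings

# Training-time sketch: apply the task loss at every prefix size so each
# prefix stays accurate on its own.
heads = torch.nn.ModuleList(torch.nn.Linear(m, 10) for m in granularities)
labels = torch.randint(0, 10, (32,))
loss = sum(
    torch.nn.functional.cross_entropy(heads[i](z[:, :m]), labels)
    for i, m in enumerate(granularities)
)
loss.backward()

# Deployment-time adaptivity: a constrained task just truncates the same
# embedding, with no re-encoding and no extra inference cost.
z_cheap = torch.nn.functional.normalize(z.detach()[:, :64], dim=1)  # 8x smaller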

Non Toxic Environments Home Health & Wellness
NTE Live w/Nikki Krueger - Santa Fe Dehumidifiers

Non Toxic Environments Home Health & Wellness

Play Episode Listen Later Jan 27, 2024 54:08


Today's episode is all about moisture mitigation in our homes. We all know that excess moisture can lead to mold, but it's also responsible for increased chemical off-gassing. This show is for everyone, even if you think you live in a dry climate and don't have anything to worry about. Nikki Krueger is the Director of Marketing & Business Development for Santa Fe freestanding and whole-house ventilating dehumidifiers. She has been involved in the indoor air quality industry for over 20 years. She is a RESNET certified home energy rater, a member of the ACCA Manual Low Load Homes (LLH) Advisory Committee, sits on the building envelope committee for the Spray Polyurethane Foam Association, has completed the ACCA Residential Design for Quality Installation Certification, and is serving on the 2024 National Green Building Standard update committee. She educates homeowners, builders, HVAC contractors, architects, engineers, crawl space contractors, and other professionals in the industry on the building science of ventilation and moisture control in buildings.

The Art of Construction
333: Deep Dive Series: Applying Building Science, Episode 1: Pragmatic Building Think Tank

The Art of Construction

Play Episode Listen Later Jan 25, 2024 44:50


This is episode 1 of 4 discussing the national organization BS + Beer. "Concentrate on that control and predictability, everything is about control and predictability." Join us this week as we talk with Robby Schwarz about BUILDTank and the foundations of building science and energy modeling. We kick off this Deep Dive Series with Robby and Devon discussing Building Science in our studio. They set up future episodes that will discuss Energize Denver and feature a LIVE podcast at BS + Beer Denver!  Robby has been a champion of home performance for over 25 years, focusing on building performance, applied building science, and systems thinking. He is committed to helping industry partners, builders, code jurisdictions, and others understand residential energy, applied building science, systems thinking, home performance, and our role in the built environment. His interest started in 1995, while exploring how to incorporate green building materials into the production building environment. Soon after, Robby founded his first company, BuiltWright, Inc. In 2006, he cofounded EnergyLogic, now the largest energy rating firm of its kind in Colorado. In 2020, he became the principal thinker and founder of BUILDTank, Inc., a pragmatic building think tank specializing in actionable applied building science solutions. When getting his hands dirty doing the work, he finds that ideas are stimulated and innovative change can occur. Using what he has learned from working on thousands of homes, Robby has helped train and lead the industry. He is actively involved in energy code development, builder and trade training, and educating the next generation of residential energy experts. He has helped Colorado jurisdictions develop and implement their energy codes, presented code language that was successfully adopted at the national ICC® code hearings, and encouraged implementation of the simulated performance path for code compliance. His ability to integrate applied building science and systems thinking with building programs such as Energy Star®, Indoor airPlus®, and DOE Zero Energy Ready Homes® has led to thousands of certified homes in Colorado. Robby is a sought-after trainer and routinely presents at RESNET®, EEBA, the ENERGY STAR® Summit, the Colorado Chapter of the IECC®, Colorado Energy Office, and local Home Builders Associations. Watch a promo video for this Deep Dive Series!  Listen to Robby's podcast, BuildCAST: https://open.spotify.com/show/2M8c6vIFahbGsCBl675BAr 

Oil and Gas Startups Podcast
Resnet on Oil and Gas Startups

Oil and Gas Startups Podcast

Play Episode Listen Later Jan 18, 2024 57:45


In this episode, we engage in an insightful conversation with Ryan Rice, discussing his journey in the oil and gas industry and the innovative work at Resnet, a software and services startup focused on revolutionizing production operations and reservoir engineering. This episode dives into:
* Introduction to Resnet: Exploring the mission and services of Resnet, which aims to unite field and office operations through innovative software solutions and technical services.
* Well Tender - Field Service Management: Discussing Resnet's Well Tender field service management and field dispatching solution, designed to optimize production operations by ensuring efficient job allocation and safety.
* Gamification in Operations: The unique approach of incorporating gamification into operational processes to enhance engagement and efficiency.
* Intelligent Flow Control Services: Delving into Resnet's specialized services in reservoir engineering, particularly in conducting advanced well testing campaigns to understand well communication and influence asset development programs.
* Building a Tech-Driven Business: Ryan Rice shares insights on self-funding Resnet, balancing software development with consulting services, and the importance of cash flow in a startup.
* Evolution of Resnet's Vision: The journey from the initial concept of production as a service to the current focus on addressing communication and data access challenges in the oil and gas industry.
* Influence of Salesforce on Resnet: How the adoption of Salesforce at Rice Energy inspired the development of Resnet's platform, emphasizing the need for efficient and integrated business operations.
* Ryan Rice's Background and Career Path: From his early days working in the field at Rice Energy to pursuing petroleum engineering and eventually co-founding Resnet.
* The Future of Oil and Gas Technology: Discussing the evolving landscape of the oil and gas industry and the role of technology in driving future developments.
This episode offers a deep dive into the innovative world of oil and gas technology, highlighting Ryan Rice's unique perspective and the transformative work being done at Resnet.

RESTalk
EP124 RESNET's audacious goal for 2028 and supporting initiatives with Steve Baden and Mark Johnson

RESTalk

Play Episode Listen Later Jan 15, 2024 29:24


Time is neutral and does not change things. With courage and initiative, leaders change things. - Jesse Jackson   The advent of a new year naturally brings broader thinking, more introspection, and the creation of future goals. What future goal has RESNET's board recently adopted? What near-term initiatives create the path to that goal? How confident is RESNET in meeting that goal? What is the "embodied passion index"? In 2024's inaugural episode of RESTalk, Steve Baden and Mark Johnson discuss RESNET's ambitious goal of achieving one million ratings by the end of 2028. This goal represents a significant shift towards a goal-driven budget, breaking away from traditional year-to-year budgeting. The board's decision to set such an audacious goal is based on several factors, including the increasing demand for housing, the need for energy efficiency in homes, and the availability of new tools and initiatives. One key initiative is the collaboration between RESNET and the International Code Council (ICC) to improve compliance with energy codes. This involves training and certifying RESNET HERS (Home Energy Rating System) raters, who can play a crucial role in ensuring compliance. The extension and amendment of the 45L tax credit also provides incentives for builders to focus on energy efficiency in homes. Additionally, the use of HERS Index scores as a metric for Environmental, Social, and Governance (ESG) reporting by production builders and the growth of green bond programs by finance organizations further demonstrate the industry's commitment to sustainability. The interview also highlights RESNET's efforts in developing a water efficiency index, a carbon index for homes, and the promising build-to-rent movement, which addresses the housing shortage while promoting energy-efficient homes. The conversation concludes with both Steve and Mark expressing confidence in RESNET's ability to achieve its ambitious goal, driven by the passion and commitment of the industry and the growing demand for energy-efficient and sustainable homes. RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP123 The Best of RESTalk 2023 with Bill Spohn

RESTalk

Play Episode Listen Later Dec 11, 2023 45:55


That's the funny thing about time. It is only in looking back that it's easy to connect the dots. To see exactly why everything needed to happen the way that it did. - Rebecca Serle   11 episodes, 11 topics, 17 different guests, and over 5 hours of RES-Talking in 2023! We covered a lot of ground in the RESTalk podcast in 2023, now going into our 6th year of episodes. We hope we stimulated your thinking and moved you into action in this ever-evolving world of home energy ratings and peripheral topics. Listen in to this fast-paced recap of the year in RESTalk; maybe you missed a detail or two in these nuggets we have mined. If you'd like to dig a little deeper into the topics covered in these episodes, see the list in each section below.
PART A: Organization, Systems & Affiliates
EP112 - Steve Baden & John Hensley - 2023 RESNET Mission, Goals and Priorities
EP114 - Mark Johnson & Cy Kilbourn - Meet the new RESNET board leaders
EP121 - Clara Hedrick & Emma Bennett - Back Together and Stronger, the RESNET 2023 Conference
PART B: Buildings and Building Data
EP116 - Ryan Meres - RESNET Data Sheds Light on Energy Efficient Home Trends
EP117 - James Rodriguez & Ned Munoz - Learn about the HERS Index and Texas House Bill 3215
EP120 - Robert Broad - Energy Efficient Homebuilding at Scale with Robert Broad of AMH Development
EP122 - Michael Lee - Best Practices in Construction Technology Education
PART C: Carbon, ESG and HERS Zero
EP113 - Philip Fairey & David Goldstein - Update on RESNET Carbon Index
EP115 - Chris Magwood - The New RESNET Embodied Carbon Advisory Committee
EP118 - Rob Lochner & David Best - Santa Fe's Habitat for Humanity building homes with HERS-ZERO scores
EP119 - Matthew Cooper - RESNET® Appoints New ESG Advisory Group
RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP122 Best Practices in Construction Technology Education with Michael Lee, Texarkana College

RESTalk

Play Episode Listen Later Nov 13, 2023 25:40


Whatever good things we build end up building us. -Jim Rohn   In today's episode, we chat with Michael Lee, coordinator of the Construction Technology Program at Texarkana College in Texarkana, TX. We learn of the backstory that led him to this role. He thoughtfully describes the approach he has taken to build the construction technology department at Texarkana College. Michael describes the students that come through the program, where they come from, what they're taught, and where they end up working. He spends extra effort enriching his instruction by getting the students out into the field at events (like the recent RESNET Texas Home Builders Association event), into factories, or by bringing in guest instructors. We also discuss how he includes the principles of energy-efficient construction, including the house-as-a-system approach, HERS scoring, and using the latest techniques and materials. Details on the construction program in the following links: https://www.texarkanacollege.edu/programs/construction-technology/ Degree plan: https://catalog.texarkanacollege.edu/article/construction-technology-associate-applied-science/ The Construction Technology program making the news: https://www.texarkanagazette.com/news/2022/oct/11/tc-construction-students-build-new-picnic-tables/ A couple of Mike's favorite learning resources: https://www.apawood.org/publication-search https://buildshownetwork.com/ https://www.youtube.com/@essentialcraftsman You can reach Michael via the Texarkana College website at: https://www.texarkanacollege.edu/faculty-and-staff-resources/? RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

My Climate Journey
An Expert's Advice to Home Energy Efficiency

My Climate Journey

Play Episode Listen Later Oct 12, 2023 55:22


This episode is part of our Skilled Labor Series hosted by MCJ partner, Yin Lu. This series is focused on amplifying the voices of folks from the skilled labor workforce, including electricians, farmers, ranchers, HVAC installers, and others who are on the front lines of rewiring our infrastructure. David Holtzclaw is the founder and principal of Transduction Technologies, a small engineering firm based out of Omaha, Nebraska that provides building science analysis, testing, and energy consulting services to residential and small commercial clients. In this episode, we are talking about weatherization and home energy efficiency. David and his team perform a number of services including energy evaluations, duct leak testing, ventilation testing, pressure mapping, combustion testing, infrared imaging, and cost-benefit analysis of implementing renewable energy systems as a whole. We discuss how the home energy efficiency market has grown over the past few decades, the top things you can do to your home to improve your energy efficiency, and both the tailwinds and headwinds the IRA bill is bringing to consumers and contractors alike in Nebraska. In this episode, we cover:
[03:11]: Origin of home energy auditing in the 1980s and creation of ResNet
[05:29]: Home Energy Score (HES) for existing homes, Home Energy Rating System (HERS) for new homes
[07:23]: ResNet's relationship with BPI (Building Performance Institute)
[09:04]: Emergence of the first energy code for new construction, the IECC (International Energy Conservation Code)
[11:17]: The impact of high interest rates on the demand for energy audits
[14:47]: David's transition from aerospace and NASA to founding an energy efficiency company
[20:43]: An overview of his customer base
[24:27]: The main culprits of an energy-inefficient home
[29:45]: David's approach to customizing homes during the design process
[32:11]: Insights into mechanical ventilation
[34:30]: How upfront investments like triple pane windows pay off
[38:50]: Why cheaper heat pumps may be pushed over better models with the IRA
[42:08]: The impact of politics on state energy efficiency funding
[49:22]: Advice and cautions for listeners planning to electrify and weatherize their homes
Get connected: David Holtzclaw LinkedIn, Yin X / LinkedIn, MCJ Podcast / Collective / Instagram
*You can also reach us via email at info@mcjcollective.com, where we encourage you to share your feedback on episodes and suggestions for future topics or guests. Episode recorded on Aug 17, 2023 (Published on Oct 12, 2023)

RESTalk
EP121 Back Together and Stronger, the RESNET 2023 Conference with Clara Hedrick & Emma Bennett

RESTalk

Play Episode Listen Later Oct 9, 2023 26:18


Coming together is a beginning, keeping together is progress; working together is success. -Edward Everett Hale   On today's podcast we welcome Clara Hedrick and Emma Bennett to give us up-to-date and inside information about the upcoming RESNET conference, Nov 15-17, 2023, in San Diego, CA. Since the last face-to-face conference was in 2020, and after virtual conferences in 2021 and 2022, Emma shares what will be the same and what will be different as compared to past conferences. We also cover a few of the specific events that attendees can look forward to, as well as recommended activities outside of the conference. Breakout session topics include Carbon/ESG, Water Efficiency & HERSH2O®, HERS® as the Gold Standard, California – Here We Come, New Opportunities for the Rating Industry, Energy Codes, Tapping the Existing Homes Market, Latest Developments in Building Science, Workforce Development, and Financing Improving the Energy Efficiency of Homes. We close by sharing what we are each most looking forward to about the conference. Links mentioned in the episode: View the schedule, find travel details and learn more about the conference at: https://www.resnet.us/conference-2023/ Find details on the KB Homes Microgrid home tour: https://www.resnet.us/conference-2023/kb-homes-microgrid-tour/ Learn more about the RESNET Emerging Leadership Council at: https://www.resnet.us/raters/emerging-leadership-council-elc/ The RESTalk podcast EP115 with Chris Magwood on the Embodied Carbon Advisory Committee: https://restalk.libsyn.com/ep115-the-new-resnet-embodied-carbon-advisory-committee-with-chris-magwood-from-rmi RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP120 Energy Efficient Homebuilding at Scale with Robert Broad of AMH Development

RESTalk

Play Episode Listen Later Sep 18, 2023 25:48


Sustainability is no longer about doing less harm. It's about doing more good. -Jochen Zeitz   In this episode of the RESTalk podcast, we are joined by Robert Broad, Senior Vice President of Development at AMH (NYSE: AMH), where he oversees new home and community development operations - including land acquisition, land development, purchasing, product development, and construction. Robert shares with us AMH's desire to create more energy-efficient homes using scalable techniques and measurable goals. During this conversation, we learn of some new and innovative approaches by AMH. Robert also shares with us why AMH chooses to have its homes HERS® Rated and what he feels are the special aspects of a HERS® Rated home. We hear that in 2022, AMH built 2,183 homes with an average HERS® score of 61.9, or 0.9 better than in 2021. Robert tells us how RESNET offers builders a quantifiable program where you can excel, with a wide range of markers that are comparable across multiple regions and climate zones. With AMH investing in being a long-term owner-operator, durability is a key operating metric in their new builds. RESNET helps AMH establish their baseline in sustainability efforts to create a strong foundation they can use to track progress year over year. Robert's LinkedIn profile: https://www.linkedin.com/in/robert-broad-a182384/ https://www.resnet.us/articles/american-homes-4-rent-sets-path-to-zero-energy-homes-through-hers-ratings/ https://www.youtube.com/watch?v=IMMFx-3m2g4 RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We're collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)! If AI is so important, why is its software so bad? This was the motivating question for Chris Lattner as he reconnected with his product counterpart on Tensorflow, Tim Davis, and started working on a modular solution to the problem of sprawling, monolithic, fragmented platforms in AI development. They announced a $30m seed in 2022 and, following their successful double launch of Modular/Mojo

RESTalk
EP119 RESNET appoints a new ESG Advisory Group with Matthew Cooper, PEG

RESTalk

Play Episode Listen Later Aug 14, 2023 28:05


Sustainable development is the pathway to the future we want for all. It is a framework to generate economic growth, achieve social justice, exercise environmental stewardship and strengthen governance. -Ban Ki-moon   ESG has become an integral factor in the financial world. The Structured Finance Association estimates that $11.6 trillion, or $1 of every $4 invested in the United States, was invested under ESG investment strategies. RESNET® HERS® Ratings are increasingly becoming the metric for Environmental, Social, and Governance (ESG) reporting on the energy performance of homes. How will RESNET keep track of the needs and opportunities presented by this movement in the industry? RESNET® Executive Director Steve Baden has recently appointed a RESNET ESG Advisory Group to better track these emerging opportunities and to develop recommendations on how RESNET® should position itself to take advantage of them. The group is composed of a select group of rating companies and builder representatives that are active in this area of the industry. We are joined on this podcast by Matthew Cooper of PEG, chair of the advisory group. Matthew shares with us his insights and details on the initial efforts of the task group, which will be targeted at increasing the presence of RESNET® and HERS® Ratings among financiers investing in this economic activity. Other members of the group are:
Jacob Atalla, KB Home
Erin Bordelon, D.R. Horton
Thomas Cochran, ARCXIS
Ian Hughes, Meritage Homes
Eric Johnson, US Eco-Logic
Nathan Kahre, EnergyLogic
Chris Urbanus, Burgess Construction Consultants, Inc.
Links: Matthew's LinkedIn https://www.linkedin.com/in/matthew-cooper-peg/ https://www.resnet.us/articles/green-bond-market-for-single-family-energy-efficient-homes-grows/ https://www.resnet.us/articles/american-homes-4-rent-sets-path-to-zero-energy-homes-through-hers-ratings/ https://www.resnet.us/articles/the-nations-big-builders-drive-up-hers-ratings-in-2020/ RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Aug 10, 2023 52:10


We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you'd like to support. We are facing a massive GPU crunch. As both startups and VCs hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There's just one weird trick: compilation. And there's one person uniquely qualified to do it. We had the pleasure to sit down with Tianqi Chen, who's an Assistant Professor at CMU, where he both teaches the MLC course and runs the MLC group. You might also know him as the creator of XGBoost, Apache TVM, and MXNet, as well as the co-founder of OctoML. The MLC (short for Machine Learning Compilation) group has released a lot of interesting projects:
* MLC Chat: an iPhone app that lets you run models like RedPajama-3B and Vicuna-7B on-device. It gets up to 30 tok/s!
* Web LLM: Run models like LLaMA-70B in your browser (!!) to offer local inference in your product.
* MLC LLM: a framework that allows any language models to be deployed natively on different hardware and software stacks.
The MLC group has just announced new support for AMD cards; we previously talked about the shortcomings of ROCm, but using MLC you can get performance very close to NVIDIA's counterparts. This is great news for founders and builders, as AMD cards are more readily available. Here are their latest results on AMD's 7900s vs some of the top NVIDIA consumer cards. If you just can't get a GPU at all, MLC LLM also supports ARM and x86 CPU architectures as targets by leveraging LLVM. While speed performance isn't comparable, it allows for non-time-sensitive inference to be run on commodity hardware. We also enjoyed getting a peek into TQ's process, which involves a lot of sketching. With all the other work going on in this space with projects like ggml and Ollama, we're excited to see GPUs becoming less and less of an issue to get models in the hands of more people, and innovative software solutions to hardware problems!
Show Notes
* TQ's Projects: XGBoost, Apache TVM, MXNet, MLC, OctoML, CMU Catalyst
* ONNX
* GGML
* Mojo
* WebLLM
* RWKV
* HiPPO
* Tri Dao's Episode
* George Hotz Episode
People:
* Carlos Guestrin
* Albert Gu
Timestamps
* [00:00:00] Intros
* [00:03:41] The creation of XGBoost and its surprising popularity
* [00:06:01] Comparing tree-based models vs deep learning
* [00:10:33] Overview of TVM and how it works with ONNX
* [00:17:18] MLC deep dive
* [00:28:10] Using int4 quantization for inference of language models
* [00:30:32] Comparison of MLC to other model optimization projects
* [00:35:02] Running large language models in the browser with WebLLM
* [00:37:47] Integrating browser models into applications
* [00:41:15] OctoAI and self-optimizing compute
* [00:45:45] Lightning Round
Transcript
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, writer and editor of Latent Space. [00:00:20]Swyx: Okay, and we are here with Tianqi Chen, or TQ as people call him, who is assistant professor in ML computer science at CMU, Carnegie Mellon University, also helping to run Catalyst Group, also chief technologist of OctoML. You wear many hats. Are those, you know, your primary identities these days? Of course, of course. [00:00:42]Tianqi: I'm also, you know, very enthusiastic about open source.
So I'm also a VP and PMC member of the Apache TVM project and so on. But yeah, these are the things I've been up to so far. [00:00:53]Swyx: Yeah. So you did Apache TVM, XGBoost, and MXNet, and we can cover any of those in any amount of detail. But maybe what's one thing about you that people might not learn from your official bio or LinkedIn, you know, on the personal side? [00:01:08]Tianqi: Let me say, yeah, so normally when I do, I really love coding, even though like I'm trying to run all those things. So one thing that I keep a habit on is I try to do sketchbooks. I have a book, like real sketchbooks to draw down the design diagrams and the sketchbooks I keep sketching over the years, and now I have like three or four of them. And it's kind of a usually a fun experience of thinking the design through and also seeing how open source project evolves and also looking back at the sketches that we had in the past to say, you know, all these ideas really turn into code nowadays. [00:01:43]Alessio: How many sketchbooks did you get through to build all this stuff? I mean, if one person alone built one of those projects, he'll be a very accomplished engineer. Like you built like three of these. What's that process like for you? Like it's the sketchbook, like the start, and then you think about the code or like. [00:01:59]Swyx: Yeah. [00:02:00]Tianqi: So, so usually I start sketching on high level architectures and also in a project that works for over years, we also start to think about, you know, new directions, like of course generative AI language model comes in, how it's going to evolve. So normally I would say it takes like one book a year, roughly at that rate. It's usually fun to, I find it's much easier to sketch things out and then gives a more like a high level architectural guide for some of the future items. Yeah. [00:02:28]Swyx: Have you ever published these sketchbooks? Cause I think people would be very interested on, at least on a historical basis. Like this is the time where XGBoost was born, you know? Yeah, not really. [00:02:37]Tianqi: I started sketching like after XGBoost. So that's a kind of missing piece, but a lot of design details in TVM are actually part of the books that I try to keep a record of. [00:02:48]Swyx: Yeah, we'll try to publish them and publish something in the journals. Maybe you can grab a little snapshot for visual aid. Sounds good. [00:02:57]Alessio: Yeah. And yeah, talking about XGBoost, so a lot of people in the audience might know it's a gradient boosting library, probably the most popular out there. And it became super popular because many people started using it in like machine learning competitions. And I think there's like a whole Wikipedia page of like all state-of-the-art models. They use XGBoost and like, it's a really long list. When you were working on it, so we just had Tri Dao, who's the creator of FlashAttention on the podcast. And I asked him this question, it's like, when you were building FlashAttention, did you know that like almost any transformer-based model will use it? And so I asked the same question to you when you were coming up with XGBoost, like, could you predict it would be so popular or like, what was the creation process? And when you published it, what did you expect? We have no idea. [00:03:41]Tianqi: Like, actually, the original reason that we built that library is that at that time, deep learning just came out. Like that was the time where AlexNet just came out.
And one of the ambitious missions that myself and my advisor, Carlos Guestrin, had then is we want to think about, you know, try to test the hypothesis. Can we find alternatives to deep learning models? Because then, you know, there are other alternatives like, you know, support vector machines, linear models, and of course, tree-based models. And our question was, if you build those models and feed them with big enough data, because usually like one of the key characteristics of deep learning is that it's taking a lot [00:04:22]Swyx: of data, right? [00:04:23]Tianqi: So we will be able to get the same amount of performance. That's a hypothesis we're setting out to test. Of course, if you look at now, right, that's a wrong hypothesis, but as a byproduct, what we find out is that, you know, most of the gradient boosting libraries out there were not efficient enough for us to test that hypothesis. So I happen to have quite a bit of experience in the past of building gradient boosting trees and their variants. So, effectively, XGBoost was kind of like a byproduct of that hypothesis testing. At that time, I'm also competing a bit in data science challenges, like I worked on KDDCup and then Kaggle kind of became bigger, right? So I kind of think maybe it's becoming useful to others. One of my friends convinced me to try to do a Python binding of it. That turned out to be like a very good decision, right, in retrospect. Usually when I build it, we feel like maybe a command line interface is okay. And now we have a Python binding, we have R bindings. And then it realized, you know, it started getting interesting. People started contributing different perspectives, like visualization and so on. So we started to push a bit more on to building distributive support to make sure it works on any platform and so on. And even at that time point, when I talked to Carlos, my advisor, later, he said he never anticipated that we'll get to that level of success. And actually, why I pushed for gradient boosting trees, interestingly, at that time, he also disagreed. He thinks that maybe we should go for kernel machines then. And it turns out, you know, actually, we are both wrong in some sense, and Deep Neural Network was the king of the hill. But at least the gradient boosting direction got into something fruitful. [00:06:01]Swyx: Interesting. [00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of like coming up with it? And how much of it is a collaborative with like other people that you're working with versus like trying to be, you know, obviously, in academia, it's like very paper-driven kind of research driven. [00:06:19]Tianqi: I would say the XGBoost improvement at that time point was more on like, you know, I'm trying to figure out, right. But it's combining lessons. Before that, I did work on some of the other libraries on matrix factorization. That was like my first open source experience. Nobody knew about it, because you'll find, likely, if you go and try to search for the package SVDFeature, you'll find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I'm trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM are much, much more collaborative in a sense that... But, of course, XGBoost has become bigger, right? So when we started that project myself, and then we have, it's really amazing to see people come in.
Michael, who was a lawyer, and now he works on the AI space as well, on contributing visualizations. Now we have people from our community contributing different things. So XGBoost even today, right, it's a community of committers driving the project. So it's definitely something collaborative and moving forward on getting some of the things continuously improved for our community. [00:07:37]Alessio: Let's talk a bit about TVM too, because we got a lot of things to run through in this episode. [00:07:42]Swyx: I would say that at some point, I'd love to talk about this comparison between XGBoost or tree-based type AI or machine learning compared to deep learning, because I think there is a lot of interest around, I guess, merging the two disciplines, right? And we can talk more about that. I don't know where to insert that, by the way, so we can come back to it later. Yeah. [00:08:04]Tianqi: Actually, what I said, when we test the hypothesis, the hypothesis is kind of, I would say it's partially wrong, because the hypothesis we want to test now is, can you run tree-based models on image classification tasks, where deep learning is certainly a no-brainer right [00:08:17]Swyx: now today, right? [00:08:18]Tianqi: But if you try to run it on tabular data, still, you'll find that most people opt for tree-based models. And there's a reason for that, in the sense that when you are looking at tree-based models, the decision boundaries are naturally rules that you're looking at, right? And they also have nice properties, like being able to be agnostic to scale of input and be able to automatically compose features together. And I know there are attempts at building neural network models that work for tabular data, and I also sometimes follow them. I do feel like it's good to have a bit of diversity in the modeling space. Actually, when we're building TVM, we build cost models for the programs, and actually we are using XGBoost for that as well. I still think tree-based models are going to be quite relevant, because first of all, it's really easy to get them to work out of the box. And also, you will be able to get a bit of interpretability and control monotonicity [00:09:18]Swyx: and so on. [00:09:19]Tianqi: So yes, it's still going to be relevant. I also sometimes keep coming back to think about, are there possible improvements that we can build on top of these models? And definitely, I feel like it's a space that can have some potential in the future. [00:09:34]Swyx: Are there any current projects that you would call out as promising in terms of merging the two directions? [00:09:41]Tianqi: I think there are projects that try to bring a transformer-type model for tabular data. I don't remember specifics of them, but I think even nowadays, if you look at what people are using, tree-based models are still one of their toolkits. So I think maybe eventually it's not even a replacement, it will be just an ensemble of models that you can call. Perfect. [00:10:07]Alessio: Next up, about three years after XGBoost, you built this thing called TVM, which is now a very popular compiler framework for models. Let's talk about, so this came out about at the same time as ONNX. So I think it would be great if you could maybe give a little bit of an overview of how the two things work together. Because it's kind of like the model, then goes to ONNX, then goes to TVM. But I think a lot of people don't understand the nuances. Can you give a bit of a backstory on that?
[00:10:33]Tianqi: So actually, that's kind of an ancient history. Before XGBoost, I worked on deep learning for two years or three years. I got a master's before I started my PhD. And during my master's, my thesis focused on applying convolutional restricted Boltzmann machines for ImageNet classification. That is the thing I'm working on. And that was before the AlexNet moment. So effectively, I had to handcraft NVIDIA CUDA kernels on, I think, a GTX 2070 card. It took me about six months to get one model working. And eventually, that model is not so good, and we should have picked a better model. But that was like an ancient history that really got me into this deep learning field. And of course, eventually, we find it didn't work out. So in my master's, I ended up working on recommender systems, which got me a paper, and I applied and got a PhD. But I always want to come back to work on the deep learning field. So after XGBoost, I think I started to work with some folks on this particular MXNet. At that time, the frameworks like Caffe, Theano, PyTorch hadn't yet come out. And we're really working hard to optimize for performance on GPUs. At that time, I found it's really hard, even for NVIDIA GPU. It took me six months. And then it's amazing to see on different hardware how hard it is to go and optimize code for the platforms that are interesting. So that gets me thinking, can we build something more generic and automatic? So that I don't need an entire team of so many people to go and build those frameworks. So that's the motivation of starting working on TVM. There was really too much machine learning engineering needed to support deep learning models on the platforms that we're interested in. I think it started a bit earlier than ONNX, but once it got announced, I think it's in a similar time period at that time. So overall, how it works is that TVM, you will be able to take a subset of machine learning programs that are represented in what we call a computational graph. Nowadays, we can also represent a loop-level program ingest from your machine learning models. Usually, you have model formats like ONNX, or in PyTorch, they have FX Tracer that allows you to trace the FX graph. And then it goes through TVM. We also realized that, well, yes, it needs to be more customizable, so it will be able to perform some of the compilation optimizations like fusing operators together, doing smart memory planning, and more importantly, generating low-level code. So that works for NVIDIA and also is portable to other GPU backends, even non-GPU backends [00:13:36]Swyx: out there. [00:13:37]Tianqi: So that's a project that actually has been my primary focus over the past few years. And it's great to see how it started from where I think we are the very early initiator of machine learning compilation. I remember there was a visit one day, one of the students asked me, are you still working on deep learning frameworks? I tell them that I'm working on ML compilation. And they said, okay, compilation, that sounds very ancient. It sounds like a very old field. And why are you working on this? And now it's starting to get more traction, like if you say Torch Compile and other things. I'm really glad to see this field starting to pick up. And also we have to continue innovating here.
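Editor's note: as a concrete illustration of the flow Tianqi describes (ingest a model, apply graph optimizations, generate low-level code for a chosen backend), here is a minimal Python sketch using TVM's documented Relay API. The ONNX file name, input name, and shape are placeholders, and "llvm" (CPU) stands in for any supported target such as "cuda" or "vulkan".

import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Ingest an exported model into TVM's graph-level IR (Relay).
onnx_model = onnx.load("model.onnx")  # placeholder file
mod, params = relay.frontend.from_onnx(onnx_model, {"input": (1, 3, 224, 224)})

# Graph optimizations (operator fusion, memory planning, ...) plus
# low-level code generation for the chosen backend.
target = "llvm"  # or "cuda", "vulkan", "metal", ...
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# Run the compiled module.
dev = tvm.device(target, 0)
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
print(module.get_output(0).numpy().shape)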
[00:14:17]Alessio: I think the other thing that I noticed is, it's kind of like a big jump in terms of area of focus to go from XGBoost to TVM, it's kind of like a different part of the stack. Why did you decide to do that? And I think the other thing about compiling to different GPUs and eventually CPUs too, did you already see some of the strain that models could have just being focused on one runtime, only being on CUDA and that, and how much of that went into it? [00:14:50]Tianqi: I think it's less about trying to get impact, more about wanting to have fun. I like to hack code, I had great fun hacking CUDA code. Of course, being able to generate CUDA code is cool, right? But now, after being able to generate CUDA code, okay, by the way, you can do it on other platforms, isn't that amazing? So it's more of that attitude to get me started on this. And also, I think when we look at different researchers, myself is more like a problem solver type. So I like to look at a problem and say, okay, what kind of tools we need to solve that problem? So regardless, it could be building better models. For example, while we built XGBoost, we built certain regularizations into it so that it's more robust. It also means building system optimizations, writing low-level code, maybe trying to write assembly and build compilers and so on. So as long as they solve the problem, definitely go and try to do them together. And I also see it's a common trend right now. Like if you want to be able to solve machine learning problems, it's no longer at the algorithm layer, right? You kind of need to solve it from both the algorithm, data, and systems angle. And this entire field of machine learning systems, I think it's kind of emerging. And there's now a conference around it. And it's really good to see a lot more people are starting to look into this. [00:16:10]Swyx: Yeah. Are you talking about ICML or something else? [00:16:13]Tianqi: So machine learning and systems, right? So not only machine learning, but machine learning and systems. So there's a conference called MLSys. It's definitely a smaller community than ICML, but I think it's also an emerging and growing community where people are talking about what are the implications of building systems for machine learning, right? And how do you go and optimize things around that and co-design models and systems together? [00:16:37]Swyx: Yeah. And you were area chair for ICML and NeurIPS as well. So you've just had a lot of conference and community organization experience. Is that also an important part of your work? Well, it's kind of expected for academic. [00:16:48]Tianqi: If I hold an academic job, I need to do services for the community. Okay, great. [00:16:53]Swyx: Your most recent venture in MLSys is going to the phone with MLC LLM. You announced this in April. I have it on my phone. It's great. I'm running Llama 2, Vicuna. I don't know what other models that you offer. But maybe just kind of describe your journey into MLC. And I don't know how this coincides with your work at CMU. Is that some kind of outgrowth? [00:17:18]Tianqi: I think it's more like a focused effort that we want in the area of machine learning compilation. So it's kind of related to what we built in TVM. So when we built TVM, that was five years ago, right? And a lot of things happened. We built the end-to-end machine learning compiler that works, the first one that works. But then we captured a lot of lessons there. So then we are building a second iteration called TVM Unity. That allows us to allow ML engineers to quickly capture new models and build the optimizations they demand. And MLC LLM is kind of like an MLC.
It's more like a vertically driven organization where we go and build tutorials and projects like LLM solutions, so as to really show like, okay, you can take machine learning compilation technology and apply it and bring something fun forward. Yeah. So yes, it runs on phones, which is really cool. But the goal here is not only making it run on phones, right? The goal is making it deploy universally. So we do run on Apple M2 Macs, the 70 billion models. Actually, on a single batch inference, more recently on CUDA, we get, I think, the best performance you can get out there already on 4-bit inference. Actually, as I alluded earlier before the podcast, we just had a result on AMD. And on a single batch, actually, we can get the latest AMD GPU. This is a consumer card. It can get to about 80% of the 4090, so NVIDIA's best consumer card out there. So it's not yet on par, but thinking about the diversity of what you can enable and the prices you can get on that card, it's really amazing what you can do with this kind of technology. [00:19:10]Swyx: So one thing I'm a little bit confused by is that most of these models are in PyTorch, but you're running this inside TVM. I don't know. Was there any fundamental change that you needed to do, or was this basically the fundamental design of TVM? [00:19:25]Tianqi: So the idea is that, of course, it comes back to program representation, right? So effectively, TVM has this program representation called TVM script that contains more like computational graph and operational representation. So yes, initially, we do need to take a bit of effort to bring those models onto the program representation that TVM supports. Usually, there are a mix of ways, depending on the kind of model you're looking at. For example, for vision models and stable diffusion models, usually we can just do tracing that takes a PyTorch model onto TVM. That part is still being robustified so that we can bring more models in. On language model tasks, actually what we do is we directly build some of the model constructors and try to directly map from Hugging Face models. The goal is if you have a Hugging Face configuration, we will be able to bring that in and apply optimizations on them. So one fun thing about model compilation is that your optimization doesn't happen only at the source language level, right? For example, if you're writing PyTorch code, you just go and try to use a better fused operator at a source code level. Torch compile might help you do a bit of things in there. In most of the model compilations, it not only happens at the beginning stage, but we also apply generic transformations in between, also through a Python API. So you can tweak some of that. So that part of optimization helps a lot of uplifting in getting both performance and also portability on the environment. And another thing that we do have is what we call universal deployment. So if you get the ML program into this TVM script format, where there are functions that take in tensors and output tensors, we will be able to have a way to compile it. So they will be able to load the function in any of the language runtimes that TVM supports. So you could load it in JavaScript, and that's a JavaScript function that you can take in tensors and output tensors. If you're loading it in Python, of course, and C++ and Java. So the goal there is really to bring the ML model to the language that people care about and be able to run it on a platform they like.
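Editor's note: a quick sketch of what that universal-deployment story looks like from Python, assuming a module compiled as in the earlier sketch. The file name is again a placeholder; the same compiled artifact can be loaded from TVM's C++, Java, or JavaScript/WASM runtimes.

import numpy as np
import tvm
from tvm.contrib import graph_executor

# At compile time (see the earlier sketch), persist the compiled output:
#   lib.export_library("compiled_model.so")
# Later, in any process that has the TVM runtime, load the artifact back.
lib = tvm.runtime.load_module("compiled_model.so")
dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))

# Compiled functions take tensors in and produce tensors out.
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
print(module.get_output(0).numpy().shape)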
[00:21:37]Swyx: It strikes me that I've talked to a lot of compiler people, but you don't have a traditional compiler background. You're inventing your own discipline called machine learning compilation, or MLC. Do you think that this will be a bigger field going forward? [00:21:52]Tianqi: First of all, I do work with people working on compilation as well. So we're also taking inspirations from a lot of early innovations in the field. Like for example, TVM initially, we take a lot of inspirations from Halide, which is just an image processing compiler. And of course, since then, we have evolved quite a bit to focus on the machine learning related compilations. If you look at some of our conference publications, you'll find that machine learning compilation is already kind of a subfield. So if you look at papers in both machine learning venues, the MLSys conferences, of course, and also system venues, every year there will be papers around machine learning compilation. And in the compiler conference called CGO, there's a C4ML workshop that is also kind of trying to focus on this area. So definitely it's already starting to gain traction and becoming a field. I wouldn't claim that I invented this field, but definitely I helped to work with a lot of folks there. And I try to bring a perspective, of course, trying to learn a lot from the compiler optimizations as well as trying to bring in knowledge in machine learning and systems together. [00:23:07]Alessio: So we had George Hotz on the podcast a few episodes ago, and he had a lot to say about AMD and their software. So when you think about TVM, are you still restricted in a way by the performance of the underlying kernel, so to speak? So if your target is like a CUDA runtime, you still get better performance, no matter like TVM kind of helps you get there, but then that level you don't take care of, right? [00:23:34]Swyx: There are two parts in here, right? [00:23:35]Tianqi: So first of all, there is the lower level runtime, like CUDA runtime. And then actually for NVIDIA, a lot of the moat came from their libraries, like Cutlass, cuDNN, right? Those library optimizations. And also for specialized workloads, actually you can specialize them. Because a lot of cases you'll find that if you go and do benchmarks, it's very interesting. Like two years ago, if you try to benchmark ResNet, for example, usually the NVIDIA library [00:24:04]Swyx: gives you the best performance. [00:24:06]Tianqi: It's really hard to beat them. But as soon as you start to change the model to something, maybe a bit of a variation of ResNet, not for the traditional ImageNet detections, but for latent detection and so on, there will be some room for optimization because people sometimes overfit to benchmarks. These are people who go and optimize things, right? So people overfit the benchmarks. So that's the largest barrier, like being able to get low-level kernel libraries, right? In that sense, the goal of TVM is actually we try to have a generic layer to both, of course, leverage libraries when available, but also be able to automatically generate [00:24:45]Swyx: libraries when possible. [00:24:46]Tianqi: So in that sense, we are not restricted by the libraries that they have to offer. That's why we will be able to run Apple M2 or WebGPU where there's no library available because we are kind of like automatically generating libraries. That makes it easier to support less well-supported hardware, right? For example, WebGPU is one example.
From a runtime perspective, AMD, I think before, their Vulkan driver was not very well supported. Recently, they are getting good. But even before that, we were able to support AMD through this GPU graphics backend called Vulkan, which is not as performant, but it gives you a decent portability across those hardware. [00:25:29]Alessio: And I know we got other MLC stuff to talk about, like WebLLM, but I want to wrap up on the optimization that you're doing. So there's kind of four core things, right? Kernel fusion, which we talked a bit about in the FlashAttention episode and the tinygrad one, memory planning, and loop optimization. I think those are like pretty, you know, self-explanatory. I think the one that people have the most questions about, can you quickly explain those? [00:25:54]Tianqi: So there are kind of different things, right? Kernel fusion means that, you know, if you have an operator like convolution, or in the case of a transformer like an MLP, you have other operators that follow it, right? You don't want to launch two GPU kernels. You want to be able to put them together in a smart way, right? And as for memory planning, it's more about, you know, hey, if you run like Python code, every time you generate a new array, you are effectively allocating a new piece of memory, right? Of course, PyTorch and other frameworks try to optimize for you. So there is a smart memory allocator behind the scenes. But actually, in a lot of cases, it's much better to statically allocate and plan everything ahead of time. And that's where a compiler can come in. First of all, actually for language models, it's much harder because of dynamic shapes. So you need to be able to do what we call symbolic shape tracing. So we have a symbolic variable that tells you the shape of the first tensor is n by 12. And the shape of the third tensor is also n by 12. Or maybe it's n times 2 by 12. Although you don't know what n is, right? But you will be able to know that relation and be able to use that to reason about fusion and other decisions. So besides this, I think loop transformation is quite important. And it's actually non-traditional. Originally, if you simply write code and you want to get performance, it's very hard. For example, you know, if you write a matrix multiply, the simplest thing you can do is: for i, j, k: C[i][j] += A[i][k] * B[k][j]. But that code is 100 times slower than the best available code that you can get. So we do a lot of transformation, like being able to take the original code, trying to put things into shared memory, and making use of tensor cores, making use of memory copies, and all this. Actually, all these things, we also realize that, you know, we cannot do all of them. So we also make the ML compilation framework available as a Python package, so that people will be able to continuously improve that part of engineering in a more transparent way. So we find that's very useful, actually, for us to be able to get good performance very quickly on some of the new models. Like when LLaMA 2 came out, we were able to go and look at the whole thing, here's the bottleneck, and we can go and optimize those. [00:28:10]
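To make the loop-transformation point concrete, here is a hedged sketch in NumPy terms: the naive triple loop from the episode next to a blocked version, which is the cache-tiling move that a compiler like TVM applies automatically (on a GPU, staging tiles into shared memory plays the same role). The block size and shapes are illustrative.

```python
import numpy as np

def matmul_naive(A, B):
    # The "for i, j, k: C[i][j] += A[i][k] * B[k][j]" loop from the episode:
    # correct, but roughly 100x off the best code because of poor cache reuse.
    n, kd = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(n):
        for j in range(m):
            for k in range(kd):
                C[i, j] += A[i, k] * B[k, j]
    return C

def matmul_blocked(A, B, T=32):
    # One classic loop transformation: tile i/j/k so each T x T block stays hot
    # in cache. On a GPU the analogous move is staging tiles into shared memory.
    n, kd = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, T):
        for j0 in range(0, m, T):
            for k0 in range(0, kd, T):
                C[i0:i0+T, j0:j0+T] += A[i0:i0+T, k0:k0+T] @ B[k0:k0+T, j0:j0+T]
    return C

A = np.random.rand(128, 128).astype(np.float32)
B = np.random.rand(128, 128).astype(np.float32)
assert np.allclose(matmul_blocked(A, B), A @ B, atol=1e-3)
```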
Alessio: And then the fourth one being weight quantization. So everybody wants to know about that. And just to give people an idea of the memory saving, if you're doing FP32, it's like four bytes per parameter. Int8 is like one byte per parameter. So you can really shrink down the memory footprint. What are some of the trade-offs there? How do you figure out what the right target is? And what are the precision trade-offs, too? [00:28:37]Tianqi: Right now, a lot of people also mostly use int4 now for language models. So that really shrinks things down a lot. And more recently, actually, we started to think that, at least in MLC, we don't want to have a strong opinion on what kind of quantization we want to bring, because there are so many researchers in the field. So what we can do is we can allow developers to customize the quantization they want, but we still bring the optimized code for them. So we are working on this item called bring your own quantization. In fact, hopefully MLC will be able to support more quantization formats. And definitely, I think there's an open field that's being explored. Can you bring more sparsity? Can you quantize activations as much as possible, and so on? And it's going to be something that's going to be relevant for quite a while. [00:29:27]Swyx: You mentioned something I wanted to double back on, which is most people use int4 for language models. This is actually not obvious to me. Are you talking about the GGML type people, or even the researchers who are training the models also using int4? [00:29:40]Tianqi: Sorry, so I'm mainly talking about inference, not training, right? So when you're doing training, of course, int4 is harder, right? Maybe you could do some form of mixed precision there. For inference, I think in a lot of cases, you will be able to get away with int4. And actually, that does bring a lot of savings in terms of the memory overhead, and so on.
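As a rough illustration of what 4-bit weight quantization does to memory and precision, here is a minimal sketch of symmetric, group-wise int4 quantization. This is not MLC's actual bring-your-own-quantization API; the group size and symmetric scheme are assumptions, and a real implementation would also pack two 4-bit values per byte instead of storing int8 as below.

```python
import numpy as np

def quantize_int4_groups(w, group_size=128):
    # Symmetric 4-bit quantization with one scale per group of weights,
    # the rough shape of 4-bit LLM inference schemes (details vary by method).
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0  # map max |w| into int4 range
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale  # a real kernel would pack two 4-bit values per byte

def dequantize(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(4096 * 128).astype(np.float32)
q, s = quantize_int4_groups(w)
print("mean abs error:", np.abs(dequantize(q, s) - w).mean())  # small vs. |w| ~ 0.8
```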
[00:30:09]Alessio: Yeah, that's great. Let's talk a bit about maybe GGML, then there's Mojo. How should people think about MLC? How do all these things play together? I think GGML is focused on model level re-implementation and improvements. Mojo is a language, a superset of Python. You're more at the compiler level. Do you all work together? Do people choose between them? [00:30:32]Tianqi: So I think in this case, I think it's great to see the ecosystem become so rich, with so many different ways. So in our case, GGML is more like you're implementing something from scratch in C, right? So that gives you the ability to go and customize for each particular hardware backend. But then you will need to write your own CUDA kernels, and write them again for AMD, and so on. So the kind of engineering effort is a bit more broadened in that sense. Mojo, I have not looked at the specific details yet. I think it's good to start by saying it's a language, right? I believe there will also be machine learning compilation technologies behind it. So it sits in an interesting place in there. In the case of MLC, our case is that we do not want to have an opinion on how, where, and in which language people want to develop, deploy, and so on. And we also realize that actually there are two phases. You want to be able to develop and optimize your model. By optimization, I mean really bring in the best CUDA kernels and do some of the machine learning engineering in there. And then there's a phase where you want to deploy it as a part of the app. So if you look at the space, you'll find that GGML is more like, I'm going to develop and optimize in the C language, right? And then most of the low-level languages they have. And Mojo is that you want to develop and optimize in Mojo, right? And you deploy in Mojo. In fact, that's the philosophy they want to push for. In the MLC case, we find that actually if you want to develop models, the machine learning community likes Python. Python is the language that you should focus on. So in the case of MLC, we really want to be able to enable, not only being able to just define your model in Python, that's very common, right? But also do ML optimization, like engineering optimization, CUDA kernel optimization, memory planning, all those things in Python, so that it's customizable and so on. But when you do deployment, we realize that people want a bit of a universal flavor. If you are a web developer, you want JavaScript, right? If you're maybe an embedded systems person, maybe you would prefer C++ or C or Rust. And people sometimes do like Python in a lot of cases. So in the case of MLC, we really want to have this vision of: you build a generic optimization in Python, then you deploy that universally onto the environments that people like. [00:32:54]Swyx: That's a great perspective and comparison, I guess. One thing I wanted to make sure that we cover is that I think you are one of these emerging set of academics that also very much focus on your artifacts of delivery. Of course. Something we've talked about for years, being very focused on your GitHub. And obviously you treated XGBoost like a product, you know? And then now you're publishing an iPhone app. Okay. Yeah. Yeah. What is your thinking about academics getting involved in shipping products? [00:33:24]Tianqi: I think there are different ways of making impact, right? Definitely, you know, there are academics that are writing papers and building insights for people so that people can build products on top of them. In my case, I think the particular field I'm working on, machine learning systems, I feel like really we need to be able to get it into the hands of people so that really we see the problem, right? And we show that we can solve a problem. And it's a different way of making impact. And there are academics that are doing similar things. Like, you know, if you look at some of the people from Berkeley, right? Every few years, they will come up with big open source projects. Certainly, I think it's just a healthy ecosystem to have different ways of making impact. And I feel like really being able to do open source and work with the open source community is really rewarding, because we have a real problem to work on when we build our research. Actually, that research comes together and people will be able to make use of it. And we also start to see interesting research challenges that we wouldn't otherwise see, right, if you're just trying to do a prototype and so on. So I feel like it's something that is one interesting way of making impact, making contributions. [00:34:40]Swyx: Yeah, you definitely have a lot of impact there. And having experience publishing Mac stuff before, the Apple App Store is no joke. It is the hardest compilation, human compilation effort. So one thing that we definitely wanted to cover is running in the browser. You have a 70 billion parameter model running in the browser. That's right. Can you just talk about how? Yeah, of course. [00:35:02]Tianqi: So I think that there are a few elements that need to come in, right? First of all, you know, we do need a MacBook, the latest one, like the M2 Max, because you need the memory to be big enough to cover that. So for a 70 billion model, it takes you about, I think, 50 gigabytes of RAM. 
So the M2 Max, the upper version, will be able to run it, right? And it also leverages machine learning compilation. Again, what we are doing is the same. Whether it's running on iPhone, on server cloud GPUs, on AMD, or on a MacBook, we all go through that same MLC pipeline. Of course, in certain cases, maybe we'll do a bit of customization iteration for particular ones. And then it runs on the browser runtime, this package called WebLLM. So what we do is we will take that original model and compile to what we call WebGPU. And then WebLLM will pick it up. And WebGPU is this latest GPU technology that major browsers are shipping right now. So you can get it in Chrome already. It allows you to be able to access your native GPUs from a browser. And then effectively, that language model is just invoking the WebGPU kernels through there. So actually, when LLaMA 2 came out, initially, we asked the question about, can you run 70 billion on a MacBook? That was the question we were asking. So first, we actually... Jin Lu, who is the engineer pushing this, he got 70 billion on a MacBook. We had a CLI version. So in MLC, you will be able to... That runs through a Metal accelerator. So effectively, you use the Metal programming language to get the GPU acceleration. So we found, okay, it works for the MacBook. Then we asked, we had a WebGPU backend. Why not try it there? So we just tried it out. And it's really amazing to see everything up and running. And actually, it runs smoothly in that case. So I do think there are some kind of interesting use cases already in this, because everybody has a browser. You don't need to install anything. I think it doesn't make sense yet to really run a 70 billion model in a browser, because you kind of need to be able to download the weights and so on. But I think we're getting there. Effectively, the most powerful models you will be able to run on a consumer device. It's kind of really amazing. And also, in a lot of cases, there might be use cases. For example, if I'm going to build a chatbot that I talk to and it answers questions, maybe some of the components, like the voice to text, could run on the client side. And so there are a lot of possibilities of being able to have something hybrid that contains the edge component or something that runs on a server. [00:37:47]Alessio: Do these browser models have a way for applications to hook into them? So if I'm using, say, you can use OpenAI or you can use the local model. Of course. [00:37:56]Tianqi: Right now, actually, we are building... So there's an NPM package called WebLLM, right? So that you will be able to, if you want to embed it onto your web app, you will be able to directly depend on WebLLM and use it. We are also having a REST API that's OpenAI compatible. So that REST API, I think, right now, is actually running on a native backend, so that if you have a CUDA server, it's faster to run on the native backend. But also we have a WebGPU version of it that you can go and run. So yeah, we do want to be able to have easier integrations with existing applications. And the OpenAI API is certainly one way to do that. Yeah, this is great.
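As a hedged sketch of what calling that OpenAI-compatible REST endpoint could look like from Python: the localhost address, port, and model id below are placeholders rather than documented values, so check the MLC LLM docs for the actual server command and model names.

```python
# Hypothetical client for an OpenAI-compatible local endpoint; the URL and
# model id are placeholders, not documented values.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",  # assumed local server address
    json={
        "model": "Llama-2-7b-chat",               # placeholder model id
        "messages": [{"role": "user", "content": "Say hi in five words."}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```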
[00:38:37]Swyx: I actually did not know there's an NPM package. That makes it very, very easy to try out and use. I want to actually... One thing I'm unclear about is the chronology. Because as far as I know, Chrome shipped WebGPU the same time that you shipped WebLLM. Okay, yeah. So did you have some kind of secret chat with Chrome? [00:38:57]Tianqi: The good news is that Chrome is doing a very good job of trying to have early releases. So although the official shipment of Chrome WebGPU is the same time as WebLLM, actually, you were able to try out WebGPU technology in Chrome before that. There is an unstable version called Canary. I think as early as two years ago, there was a WebGPU version. Of course, it's getting better. So we had a TVM-based WebGPU backend two years ago. Of course, at that time, there were no language models. It was running on less interesting, well, still quite interesting models. And then this year, we really started to see it getting matured and performance keeping up. So we have a more serious push of bringing the language model compatible runtime onto WebGPU. [00:39:45]Swyx: I think you agree that the hardest part is the model download. Has there been conversations about a one-time model download and sharing between all the apps that might use this API? That is a great point. [00:39:58]Tianqi: I think it's already supported in some sense. When we download the model, WebLLM will cache it into a special Chrome cache. So if a different web app uses the same WebLLM JavaScript package, you don't need to redownload the model again. So there is already something there. But of course, you have to download the model once at least to be able to use it. [00:40:19]Swyx: Okay. One more thing just in general before we're about to zoom out to OctoAI. Just the last question is, you're not the only project working on, I guess, local models. That's right. Alternative models. There's gpt4all, there's Ollama that just recently came out, and there's a bunch of these. What would be your advice to them on what's a valuable problem to work on? And what is just a thin wrapper around GGML? Like, what are the interesting problems in this space, basically? [00:40:45]Tianqi: I think making the API better is certainly something useful, right? In general, one thing that we do try to push very hard on is this idea of easier universal deployment. So we are also looking forward to actually having more integration with MLC. That's why we're trying to build APIs like WebLLM and other things. So we're also looking forward to collaborating with all those ecosystems and working on support to bring in models more universally and be able to also keep up the best performance when possible in a more push-button way. [00:41:15]Alessio: So as we mentioned in the beginning, you're also the co-founder of OctoML. Recently, OctoML released OctoAI, which is a compute service, basically focused on optimizing model runtimes and acceleration and compilation. What has been the evolution there? So Octo started as kind of like a traditional MLOps tool, where people were building their own models and you helped them on that side. And then it seems like now most of the market is shifting to starting from pre-trained generative models. Yeah, what has been that experience for you and what have you seen the market evolve? And how did you decide to release OctoAI? [00:41:52]Tianqi: One thing that we found out is that on one hand, it's really easy to go and get something up and running, right? But then if you start to consider all the possible availability and scalability issues, and even integration issues, it becomes kind of interesting and complicated. So we really want to make sure to help people to get that part easy, right? 
And now a lot of things, if we look at the customers we talk to and the market, certainly generative AI is something that is very interesting. So that is something that we really hope to help elevate. And also building on top of the technology we built to enable things like portability across hardware. And you will be able to not worry about the specific details, right? Just focus on getting the model out. We'll try to work on the infrastructure and other things that help on the other end. [00:42:45]Alessio: And when it comes to getting optimization on the runtime, I see, when we run an early adopters community, most enterprises' issue is how to actually run these models. Do you see that as one of the big bottlenecks now? I think a few years ago it was like, well, we don't have a lot of machine learning talent. We cannot develop our own models. Versus now it's like, there are these great models you can use, but I don't know how to run them efficiently. [00:43:12]Tianqi: That depends on how you define running, right? On one hand, it's easy to download MLC, like you download it, you run it on a laptop. But then there are also different decisions, right? What if you are trying to serve a larger user request? What if that request changes? What if the availability of hardware changes? Right now it's really hard to get the latest NVIDIA hardware, unfortunately, because everybody's trying to work on things using the hardware that's out there. So I think when the definition of run changes, there are a lot more questions around things. And also in a lot of cases, it's not only about running models, it's also about being able to solve problems around them. How do you manage your model locations and how do you make sure that you get your model close to your execution environment more efficiently? So definitely a lot of engineering challenges out there. That we hope to alleviate, yeah. And also, if you think about our future, definitely I feel like right now, given the technology and the kind of hardware availability we have today, we will need to make use of all the possible hardware available out there. That will include a mechanism for cutting down costs, bringing something to the edge and cloud in a more natural way. So I feel like this is still a very early stage of where we are, but it's already good to see a lot of interesting progress. [00:44:35]Alessio: Yeah, that's awesome. I would love, I don't know how much we're going to go in depth into it, but what does it take to actually abstract all of this from the end user? You know, like they don't need to know what GPUs you run, what cloud you're running them on. You take all of that away. What was that like as an engineering challenge? [00:44:51]Tianqi: So I think that there are engineering challenges there. In fact, first of all, you will need to be able to support all the kinds of hardware backends you have, right? On one hand, if you look at the NVIDIA library, you'll find, very surprisingly, not too surprisingly, most of the latest libraries work well on the latest GPU. But there are other GPUs out there in the cloud as well. So certainly being able to have the know-how and being able to do model optimization is one thing, right? Also infrastructure, on being able to scale things up, locate models. And in a lot of cases, we do find that for typical models, it also requires kind of vertical iterations. So it's not about, you know, building a silver bullet and that silver bullet is going to solve all the problems. 
It's more about, you know, we're building a product, we work with the users, and we find out there are interesting opportunities at a certain point. And then our engineers will go and solve that, and it will automatically be reflected in the service. [00:45:45]Swyx: Awesome. [00:45:46]Alessio: We can jump into the lightning round until, I don't know, Sean, if you have more questions or TQ, if you have more stuff you wanted to talk about that we didn't get a chance to touch on. [00:45:54]Alessio: Yeah, we have talked a lot. [00:45:55]Swyx: So, yeah. We always would like to ask, you know, do you have a commentary on other parts of AI and ML that is interesting to you? [00:46:03]Tianqi: So right now, I think one thing that we are really pushing hard for is this question about how far can we bring open source, right? I'm kind of like a hacker and I really like to put things together. So I think it's unclear what the future of AI looks like. On one hand, it could be possible that, you know, you just have a few big players, you just try to talk to those bigger language models and they can do everything, right? On the other hand, one of the things that we in academia are really excited about and pushing for, and that's one reason why I'm pushing for MLC, is: can we build something where you have different models? You have personal models that know the best movie you like, but you also have bigger models that maybe know more, and you get those models to interact with each other, right? And be able to have a wide ecosystem of AI agents that helps each person while still being able to do things like personalization. Some of them can run locally, some of them, of course, running on a cloud, and how do they interact with each other? So I think that is a very exciting time where the future is yet undecided, but I feel like there is something we can do to shape that future as well. [00:47:18]Swyx: One more thing, which is something I'm also pursuing, which is, and this kind of goes back into predictions, but also back in your history, do you have any idea, or are you looking out for anything post-transformers as far as architecture is concerned? [00:47:32]Tianqi: I think, you know, in a lot of these cases, you can find there are already promising models for long contexts, right? There are state space models, where, like, you know, some of our colleagues, like Albert Gu, worked on the HiPPO models, right? And then there is an open source version called RWKV. It's like a recurrent model that allows you to summarize things. Actually, we are bringing RWKV to MLC as well, so maybe you will be able to see one of those models. [00:48:00]Swyx: We actually recorded an episode with one of the RWKV core members. It's unclear because there's no academic backing. It's just open source people. Oh, I see. So you like the merging of recurrent networks and transformers? [00:48:13]Tianqi: I do love to see this model space continue growing, right? And I feel like in a lot of cases, it's just that the attention mechanism is getting changed in some sense. So I feel like definitely there are still a lot of things to be explored here. And that is also one reason why we want to keep pushing machine learning compilation, because one of the things we are trying to push on is productivity. So that, for machine learning engineering, as soon as some of the models come out, we will be able to, you know, enable them on those environments that are out there. 
[00:48:43]Swyx: Yeah, it's a really good mission. Okay. Very excited to see that RWKV and state space model stuff. I'm hearing increasing chatter about that stuff. Okay. Lightning round, as always fun. I'll take the first one. Acceleration. What has already happened in AI that you thought would take much longer? [00:48:59]Tianqi: The emergence of conversational chatbot ability is something that kind of surprised me before it came out. This is like one piece that I feel originally I thought would take much longer, but yeah, [00:49:11]Swyx: it happens. And it's funny because the original ELIZA chatbot was something that goes all the way back in time. Right. And then we just suddenly came back again. Yeah. [00:49:21]Tianqi: It's always interesting to think about, but with a kind of different technology [00:49:25]Swyx: in some sense. [00:49:25]Alessio: What about the most interesting unsolved question in AI? [00:49:31]Swyx: That's a hard one, right? [00:49:32]Tianqi: So I can tell you what I'm excited about. So, so I think that I have always been excited about this idea of continuous learning and lifelong learning in some sense. So how AI continues to evolve with the knowledge that has been there. It seems that we're getting much closer with all those recent technologies. So being able to develop systems to support that, and to think about how AI continues to evolve, is something that I'm really excited about. [00:50:01]Swyx: So specifically, just to double click on this, are you talking about continuous training? [00:50:06]Tianqi: I feel like, you know, training, adaptation, it's all similar things, right? You want to think about the entire life cycle, right? The life cycle of collecting data, training, fine tuning, and maybe having your local context that's getting continuously curated and fed into models. So I think all these things are interesting and relevant here. [00:50:29]Swyx: Yeah. I think this is something that people are really asking, you know, right now we have moved a lot into the sort of pre-training phase and off the shelf, you know, the model downloads and stuff like that, which seems very counterintuitive compared to the continuous training paradigm that people want. So I guess the last question would be for takeaways. What's basically one message that you want every listener, every person to remember today? [00:50:54]Tianqi: I think it's getting more obvious now, but I think one of the things that I always want to mention in my talks is that, you know, when you're thinking about AI applications, originally people think about algorithms a lot more, right? Algorithms and models are still very important. But usually when you build AI applications, it takes, you know, both the algorithm side, the system optimizations, and the data curation, right? So it takes a combination of so many facets to be able to bring together an AI system, and being able to look at it from that holistic perspective is really useful when we start to build modern applications. I think it's going to continue to be more important in the future. [00:51:35]Swyx: Yeah. Thank you for showing the way on this. And honestly, just making things possible that I thought would take a lot longer. So thanks for everything you've done. [00:51:46]Tianqi: Thank you for having me. [00:51:47]Swyx: Yeah. [00:51:47]Alessio: Thanks for coming on TQ. [00:51:49]Swyx: Have a good one. [00:51:49] Get full access to Latent Space at www.latent.space/subscribe

R-Value
Get to Know Your HERS rater

R-Value

Play Episode Listen Later Jul 24, 2023 40:42


Building standards and regulations have changed A LOT recently. It can be difficult to keep up with and communicate all of the changes, leading to failed inspections and lost revenue. The pace and volume of change is a major challenge for builders across the country and non-profit organizers are stepping into the maze to help solve the problem.   People like Cardice Howard, Deputy Director of RESNET. RESNET is a not-for-profit, membership corporation that created and maintains the HERS® Index to allow for easy comparison of energy performance of homes. Their mission is to make the energy use of all homes transparent, thereby driving residential sector energy use toward net zero.   Cardice has 27 years of operations management experience as an insulation contractor in the Dallas/Fort Worth, Texas market along with her passion as an insulation industry advocate for the RESNET team. Cardice has developed and maintained cohesive relationships with builders, code officials, sales and operations staff over that time. She also has built teams to support over $2 million in sales per month in the Dallas/Fort Worth market, focusing on business development and strategic planning.   Ken and Cardice discuss how the HERS rating can be used as a standalone compliance pathway to meet state energy codes, helping builders to qualify for significant tax credits and making green building practices more economically viable. They also talk about software solutions that help achieve energy efficiency requirements in construction projects, prescriptive vs. performance building processes and the importance of unity and collaborative efforts among builders, insulators, code officials, and third-party raters in achieving overall project success.
In this podcast…
1:00 - How Cardice's wide range of contracting experience and outside-the-box approach helps her help builders dealing with a number of on-site challenges.
8:40 - How RESNET is working to simplify compliance for builders in the United States
12:00 - The dynamics of the HERS Index and a discussion about new regulations and ratings, including how HERS raters can act as code officials in some jurisdictions
17:45 - How RESNET is working closely with distributors and manufacturers in the construction industry to educate them on the latest standards and regulations.
26:38 - What if I'm working on a project that doesn't have a HERS rater?
31:56 - BONUS: Cardice shares her experiences with projects that she knew would be improved by a HERS rater

Building HVAC Science - Building Performance, Science, Health & Comfort
EP124 Business Philosophy, why did we hire an ELK? With Eric Kaiser (July 2023)

Building HVAC Science - Building Performance, Science, Health & Comfort

Play Episode Listen Later Jul 14, 2023 25:43


A simple episode where Eric Kaiser (ELK) and Bill Spohn (OverKill Bill) riff on TruTech's business philosophies: Core Values, Purpose and Niche, and what kind of actions we take to put these words into action in the world. Also what we feel it means to be a good steward to the industry. OH and cheer on the #HVACLIFE team in the Maine 70.3 Ironman on July 30, 2023, featuring Chris Hughes (TEC, 13.2 miles running), Chad Simpson (Simpson Salute, 56 miles bicycling) and our own Eric Kaiser (1.2 miles swimming). https://www.ironman.com/im703-maine
Links mentioned in the podcast:
Alabama Power training center: https://www.alabamapower.com/business/business-customers-and-services/hvac/hvac-training.html
RESNET 2023 Proposals: https://www.resnet.us/wp-content/uploads/RESNET_Session-Topics_TOC_06-27.pdf
RESNET 2023 Session Voting: https://docs.google.com/forms/d/1YQ0ezGAXFrmdwdaRht8K1Ga9YdGZQ7OhP9yWQDoxMGU/viewform?edit_requested=true&pli=1
RESNET Conference: https://www.resnet.us/conference-2023/
Magazine article on Building Science Summer Camp: https://www.jlconline.com/how-to/insulation/highlights-from-building-science-summer-camp_o
Article on the 2022 United Association Event: https://www.pmmag.com/articles/104405-ua-builds-toward-the-future-with-68th-annual-instructor-training-program
TEC Train the Trainer Event: https://energyconservatory.com/september-2023-hvac-system-performance-train-the-trainer/
HPC New England: https://building-performance.org/events/regional/new-england/
This episode was recorded in July 2023.

RESTalk
EP118 Santa Fe's Habitat for Humanity Homes achieve HERS-ZERO scores with Rob Lochner and David Best

RESTalk

Play Episode Listen Later Jul 10, 2023 28:53


“Example is not the main thing in influencing others. It is the only thing.” -Albert Schweitzer   Habitat for Humanity's mission is to help families build and improve places to call home, in the process developing strong and stable communities with affordable housing. Often household utility costs make affordability a challenge. What factors need to fall into place to address this challenge of affordability? How are cozy, comfortable, attractive homes that use no net energy being built in the Habitat for Humanity model?   Join us as Rob Lochner, Construction Director at Habitat for Humanity, Santa Fe, and David Best, HERS Rater with Evergreen Building Solutions (www.evergreenbuildingsolutions.com/), share with us the fascinating story of how 15 NetZero / HERS-ZERO homes have been built in the Habitat for Humanity program in Santa Fe, New Mexico. David describes some of the straightforward technical practices and measures that have been used as the program has evolved since 2020. While these homes are attracting interest and providing influence, Rob shares insights into what seems to be preventing more widespread adoption of these practices in more projects. To learn more or contact our guests, refer to the links below.
Links for our guests:
David Best: drb713@gmail.com
David's LinkedIn: https://www.linkedin.com/in/david-best-30359a33/
Rob Lochner: rob@santafehabitat.org
Video with Rob: https://www.linkedin.com/posts/build-block-building-systems_video-icfs-and-habitat-for-humanity-activity-6947958189782958080-Uh5d?
Santa Fe Habitat for Humanity Website: https://santafehabitat.org/habitat-homes/
Albuquerque Journal Story: https://www.abqjournal.com/2599030/headline-goes-here-258.html
RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP117 Learning more about the HERS Index and Texas House Bill 3215 with James Rodriguez and Ned Munoz

RESTalk

Play Episode Listen Later Jun 12, 2023 32:03


“The trend is your friend.” -Mark Zweig    With Texas as the leading state in new home starts, we can expect some housing trends to arise there first. Why was House Bill 3215 developed and enacted in Texas? What entities and organizations participated in developing the language in the bill? How will this impact the HERS industry in Texas and elsewhere?   Joining us on today's podcast is Ned Munoz, V.P. of Regulatory Affairs and General Counsel at the Texas Association of Builders, and James Rodriguez, Executive Vice President at Fox Energy Specialists. James shares with us the desire to set up a hybrid approach, that is, to introduce a second universal pathway of state energy compliance that is performance-based. This means Texas now recognizes Home Energy Rating System (HERS) Index scores as a standalone compliance pathway, which untangles it from the current ICC/IRC versions of the energy rating index (ERI) pathway. Ned explains how the Texas Association of Builders recognizes the impact that mandated energy codes have on housing affordability while still encouraging energy-efficient home building, and how these concepts influenced the wording of the bill. It's important to note that the State of Texas overlay energy code for single-family new construction was not updated or changed by the bill.  Below are the HERS Index pathway thresholds remaining in effect for 10 years.
Timeline               Climate Zone 2 HERS Index   Climate Zone 3 HERS Index
09/1/19 to 08/31/22    63                          63
09/1/22 to 08/31/25    59                          59
09/1/25 to 08/31/28    57                          57
09/1/28 to 09/01/31    55                          55
Note: 2018 IECC mandatory items and building envelope thresholds must be upheld as part of these HERS scores.
LINKS:
Two articles on the topic:
https://www.resnet.us/builders/hb3215/
https://www.foxenergyspecialists.com/news/texas-house-bill-hb-3215-to-take-effect-on-september-1st
James Rodriguez on LinkedIn: https://www.linkedin.com/in/james-rodriguez-0523001/
Ned Munoz on LinkedIn: https://www.linkedin.com/in/ned-mu%C3%B1oz-13a5606/
FAQs on TX HB 3215: https://www.resnet.us/wp-content/uploads/HB-3215-FAQs_08062021.pdf
The actual TX HB 3215: https://www.resnet.us/wp-content/uploads/87R-HB-3215-Enrolled-version.pdf
RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later May 20, 2023 66:43


We are excited to be the first podcast in the world to release an in-depth interview on the new SOTA in commercially licensed open source models - MosaicML MPT-7B! The Latent Space crew will be at the NYC Lux AI Summit next week, and have two meetups in June. As usual, all events are on the Community page! We are also inviting beta testers for the upcoming AI for Engineers course. See you soon!
One of GPT-3's biggest limitations is context length - you can only send it up to 4000 tokens (3k words, 6 pages) before it throws a hard error, requiring you to bring in LangChain and other retrieval techniques to process long documents and prompts. But MosaicML recently open sourced MPT-7B, the newest addition to their Foundation Series, with context length going up to 84,000 tokens (63k words, 126 pages):
This transformer model, trained from scratch on 1 trillion tokens of text and code (compared to 300B for Pythia and OpenLLaMA, and 800B for StableLM), matches the quality of LLaMA-7B. It was trained on the MosaicML platform in 9.5 days on 440 GPUs with no human intervention, costing approximately $200,000. Unlike many open models, MPT-7B is licensed for commercial use and it's optimized for fast training and inference through FlashAttention and FasterTransformer.
They also released 3 finetuned models starting from the base MPT-7B:
* MPT-7B-Instruct: finetuned on dolly_hhrlhf, a dataset built on top of dolly-5k (see our Dolly episode for more details).
* MPT-7B-Chat: finetuned on the ShareGPT-Vicuna, HC3, Alpaca, Helpful and Harmless, and Evol-Instruct datasets.
* MPT-7B-StoryWriter-65k+: finetuned with a context length of 65k tokens on a filtered fiction subset of the books3 dataset. While 65k is the advertised size, the team has gotten up to 84k tokens in response when running on a single node of A100-80GB GPUs. ALiBi is the dark magic that makes this possible. Turns out The Great Gatsby is only about 68k tokens, so the team used the model to create new epilogues for it!
On top of the model checkpoints, the team also open-sourced the entire codebase for pretraining, finetuning, and evaluating MPT via their new MosaicML LLM Foundry. The table we showed above was created using LLM Foundry's in-context-learning eval framework itself!
In this episode, we chatted with the leads of MPT-7B at Mosaic: Jonathan Frankle, Chief Scientist, and Abhinav Venigalla, Research Scientist who spearheaded the MPT-7B training run. We talked about some of the innovations they've brought into the training process to remove the need for 2am on-call PagerDutys, why the LLM dataset mix is such an important yet dark art, and why some of the traditional multiple-choice benchmarks might not be very helpful for the type of technology we are building.
Show Notes
* Introducing MPT-7B
* Cerebras
* Lottery Ticket Hypothesis
* Hazy Research
* ALiBi
* Flash Attention
* FasterTransformer
* List of naughty words for C4 https://twitter.com/code_star/status/1661386844250963972
* What is Sparsity?
* Hungry Hungry Hippos
* BF16 FP
p.s. yes, MPT-7B really is codenamed LLongboi!
Timestamps
* Introductions [00:00:00]
* Intro to Mosaic [00:03:20]
* Training and Creating the Models [00:05:45]
* Data Choices and the Importance of Repetition [00:08:45]
* The Central Question: What Mix of Data Sets Should You Use? [00:10:00]
* Evaluation Challenges of LLMs [00:13:00]
* Flash Attention [00:16:00]
* Fine-tuning for Creativity [00:19:50]
* Open Source Licenses and Ethical Considerations [00:23:00]
* Training Stability Enhancement [00:25:15]
* Data Readiness & Training Preparation [00:30:00]
* Dynamic Real-time Model Evaluation [00:34:00]
* Open Science for Affordable AI Research [00:36:00]
* The Open Approach [00:40:15]
* The Future of Mosaic [00:44:11]
* Speed and Efficiency [00:48:01]
* Trends and Transformers [00:54:00]
* Lightning Round and Closing [01:00:55]
Transcript
Alessio: [00:00:00] Hey everyone. Welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners. I'm joined by my co-host, Swyx, writer and editor of Latent Space.Swyx: Hey, and today we have Jonathan and Abhi from MosaicML. Welcome to our studio.Jonathan: Guys, thank you so much for having us. Thanks so much.Swyx: How's it feel?Jonathan: Honestly, I've been doing a lot of podcasts during the pandemic, and it has not been the same.Swyx: No, not the same actually. So you have on your bio that you're primarily based in Boston,Jonathan: New York. New York, yeah. My Twitter bio was a probability distribution over locations.Swyx: Exactly, exactly. So I DMd you because I was obviously very interested in MPT-7B and DMd you, I was like, for the 0.2% of the time that you're in San Francisco, can you please come to a podcast studio, and you're like, I'm there next week.Jonathan: Yeah, it worked out perfectly. Swyx: We're really lucky to have you. I'll read off a few intros that people should know about you and then you can fill in the blanks.So Jonathan, you did your BS and MS at Princeton in programming languages and then found your way into ML for your PhD at MIT, where you made a real splash with the lottery ticket hypothesis in 2018, which people can check up on. I think you've done a few podcasts about it over the years, which has been highly influential, and we'll talk about sparse models at Mosaic. You have also had some side [00:01:30] quests. You taught programming for lawyers and you did some law and privacy stuff in, in DC and also did some cryptography stuff. Um, and you've been an assistant professor at Harvard before earning your PhD.Jonathan: I've yet to start.Swyx: You, you yet to start. Okay. But you just got your PhD.Jonathan: I technically just got my PhD. I was at Mosaic, which delayed my defense by about two years. It was, I was at 99% done for two years. Got the job at Harvard, Mosaic started, and I had better things to do than write my dissertation for two years. Swyx: You know, you know, this is very out of order.Jonathan: Like, oh, completely out of order, completely backwards. Go talk to my advisor about that. He's also an advisor at Mosaic and has been from the beginning. And, you know, go talk to him about finishing on time.Swyx: Great, great, great. And just to fill it out, Abhi, you did your BS and MS at MIT, you were a researcher at Cerebras, and you're now a research scientist at Mosaic. Just before we go into Mosaic stuff, I'm actually very curious about Cerebras and, uh, just that, that space in general. Um, what are they doing that people should know about?Abhinav: Yeah, absolutely. 
Um, I think the biggest thing about Cerebras is that they're really building, you know, kind of the next-gen computing platform beyond, like, GPUs.Um, they're trying to build a system that uses an entire wafer, you know, rather than cutting up a wafer into smaller chips, and trying to train a model on that entire system, or actually more recently on many such wafers. Um, so it's, it's really extraordinary. I think it's like the first time ever that kind of wafer scale computing has ever really worked. And so it's a really exciting time to be there, trying to figure out how we can map ML workloads to work, um, on a much, much bigger chip.Swyx: And do you use like [00:03:00] a different programming language or framework to do that? Or is that like..Abhinav: Yeah, so I mean, things have changed a bit since I was there.I think, um, you can actually run just normal TensorFlow and PyTorch on there. Um, so they've built a kind of software stack that compiles it down. So it actually just kind of works naturally. But yeah.Jonathan: Compiled versions of Python is a hot topic at the moment with Mojo as well. Swyx: And then Mosaic, you, you spearheaded the MPT-7B effort.INTRO TO MOSAIC [00:03:20]Abhinav: Uh, yeah. Yeah, so it's kind of like, it's been maybe six months, 12 months in the making. We kind of started working on LLMs sort of back in the summer of last year. Um, and then we came out with this blog post where we kind of profiled a lot of LLMs and saw, hey, the cost of training is actually a lot lower than what people might think.Um, and then since then, you know, being inspired by kind of, you know, Meta's release of the LLaMA models and lots of other open source work, we kind of started working towards, well, what if we were to release a really good kind of 7 billion parameter model? And that's what MPT is. Alessio: You know, we mentioned some of the podcasts you had done, Jonathan, I think in one of them you mentioned Mosaic was not planning on building a model and releasing it, and obviously you eventually did. So what are some of the things that got you there? Maybe obviously LLaMA you mentioned was an inspiration. You now have both the training and like inference products that you offer. Was this more of a research challenge in a way, uh, that you wanted to do?Or how did the idea come to be?Jonathan: I think there were a couple of things. So we still don't have a first class model. We're not an OpenAI where, you know, our business is people coming to use our one great model. Our business is built around customers creating their own models. But at the end of the day, if customers are gonna create their own models, we have to have the tools to help them do that, and to have the tools to help them do that and know that they work, we have to create our own models to start. We have to know that we can do something great if customers are gonna do something great. And one too many people may have challenged me on Twitter about the fact that, you know, Mosaic claims all these amazing numbers, but, you know, I believe, not to, you know, call out Ross Wightman here, but, you know, I believe he said at some point, you know, show us the pudding.Um, and so Ross, you know, please let me know how the pudding tastes. But in all seriousness, like I think there is something, this is a demo in some sense. This is to say we did this in 9.5 days for a really reasonable cost, straight through, no intervention. 200K. Yep. 
Um, you can do this too.Swyx: Uh, and just to reference the numbers that you're putting out, this is, uh, the last year you were making a lot of noise for training GPT-3 under $450K, which was your, your initial estimate.Um, and then it went down to $100K, and Stable Diffusion at $160K going down to less than $50K as well.Jonathan: So I will be careful about that $100K number. That's certainly the challenge I've given Abhi to hit. Oh, I wouldn't make the promise that we've hit it yet, but you know, it's certainly a target that we have.And I, you know, Abhi may kill me for saying this. I don't think it's crazy. TRAINING AND CREATING THE MODELS [00:05:45] Swyx: So we definitely want to get into like estimation math, right? Like what, what needs to happen for those big order-of-magnitude changes in, in infrastructure costs. But, uh, let's kind of stick to the MPT-7B story. Yeah. Tell us everything.Like you have, uh, three different models, one of them state of the art, essentially, on context length. Let's talk about the process of training them, the, uh, the decisions that you made. Um, I can go into, you know, individual details, but I just wanna let you, let you rip.Abhinav: Yeah, so I mean, I think, uh, we started off with the base model, which is kind of, for all practical purposes, a recreation of LLaMA 7B.Um, so it's a 7 billion parameter model trained on a trillion tokens. Um, and our goal was like, you know, we should do it efficiently. We should be able to do it like, kind of hands free, so we don't have to babysit the runs as they're doing them. And it could be kind of a, a launching point for these fine tuned models, and those fine tuned models, you know, on, on the one hand they're kind of really fun for the community, like the story writer model, which has like a 65,000 length context window and you can even kind of extrapolate beyond that. Um, but they're, they're also kind of just inspirations really. So you could kind of start with an MPT-7B base and then build your own custom, you know, downstream. If you want a long context code model, you could do that with our platform. If you wanted one that was for a particular language, you could do that too.But yeah, so we picked kind of the three variants, chat and instruct and story writer, just kind of like inspirations looking at what people were doing in the community today. Yeah. Alessio: And what's the beginning of the math to come up with? You know, how many tokens you wanna train it on? How many parameters do you want in a model? 7 billion and 30 billion seem to be kind of like two of the magic numbers going around right now. Abhinav: Yeah, definitely. Definitely. Yeah, I think like there's sort of these scaling laws which kind of tell you how to best spend your training compute if that's all you cared about. So if you wanna spend $200,000 exactly in the most efficient way, there'd be a recipe for doing that.Um, and there we usually go by the Chinchilla laws. Now for these models, we actually didn't quite do that, because we wanted to make sure that people could actually run these at home and that they [00:07:30] were good for inference. So we trained them kind of beyond those chinchilla points, so that we're almost over-training them.I think there's like a joke going on online that they're like long boys, and that came up internally because we were training them for really, really long durations. So that 7B model, the chinchilla point might be 140 billion tokens. Instead, we trained a trillion, so almost seven times longer than you normally would.
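The back-of-envelope math behind those numbers, using the usual rule-of-thumb reading of the Chinchilla result (roughly 20 training tokens per parameter; the exact ratio is an approximation, not a law):

```python
params = 7e9                     # MPT-7B
chinchilla_tokens = 20 * params  # ~140B tokens: the rough compute-optimal point
actual_tokens = 1e12             # what MPT-7B was actually trained on
print(actual_tokens / chinchilla_tokens)  # ~7.1, i.e. ~7x past the chinchilla point
```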
Instead, we trained a trillion, so almost seven times longer than you normally would.Swyx: So longboi was the code name. So is it, is it the trading method? Is it the scaling law that you're trying to coin or is it the code name for the 64 billion?Jonathan: Uh, 64. It was just an internal joke for the, for training on way more tokens than you would via chinchilla. Okay. Um, we can coin it long boy and it, it really stuck, but just to, you know, long boys filled with two ELs at the beginning.Yeah. Cause you know, we wanted the lLLaMA thing in there as well. Jonathan: Yeah, yeah, yeah. Our darn CEO we have to rein him in that guy, you know, you can't, yeah. I'm gonna take away his Twitter password at some point. Um, but you know, he had to let that one out publicly. And then I believe there was a YouTube video where someone happened to see it mentioned before the model came out and called it the Long G boy or something like that.Like, so you know, now it's out there in the world. It's out there. It's like Sydnee can't put it back inSwyx: There's a beautiful picture which I think Naveen tweeted out, which, um, shows a long boy on a whiteboard.Jonathan: That was the origin of Long Boy. In fact, the legs of the lLLaMA were the two Ls and the long boy.DATA CHOICES AND THE IMPORTANCE OF REPETITION [00:08:45]Swyx: Well, talk to me about your data choices, right? Like this is your passion project. Like what can you tell us about it?Jonathan: Yeah, I think Abhi wanted to kill me by the end for trying to use all the GPUs on data and none of them on actually training the model. Um, at the end of the day, We know that you need to train these models and [00:09:00] lots of data, but there are a bunch of things we don't know.Number one is what kinds of different data sources matter. The other is how much does repetition really matter? And really kind of repetition can be broken down into how much does quality versus quantity matter. Suppose I had the world's best 10 billion tokens of data. Would it be better to train on that a hundred times or better to train on a trillion tokens of low quality, fresh data?And obviously there's, there's a middle point in between. That's probably the sweet spot. But how do you even know what good quality data is? And. So, yeah, this is, nobody knows, and I think the more time I spent, we have a whole data team, so me and several other people, the more time that we spent on this, you know, I came away thinking, gosh, we know nothing.Gosh, if I were back in academia right now, I would definitely go and, you know, write a paper about this because I have no idea what's going on.Swyx: You would write a paper about it. I'm interested in such a paper. I haven't come across any that exists. Could you frame the central question of such a paper?THE CENTRAL QUESTION: WHAT MIX OF DATA SETS SHOULD YOU USE? [00:10:00]Jonathan: Yeah. The central question is what mix of data sets should you use? Okay. Actually I've, you know, you had mentioned my law school stuff. I went back to Georgetown Law where I used to teach, um, in the midst of creating this model, and I actually sat down with a class of law students and asked them, I gave them our exact data sets, our data mixes, um, like how many tokens we had, and I said, Create the best data set for your model.Knowing they knew nothing about large language models, they just know that data goes in and it's going to affect the behavior. Um, and I was like, create a mix and they basically covered all the different trade-offs. 
Um, you probably want a lot of English language [00:10:30] text to start with. You get that from the web, but do you want it to be multilingual?If so, you're gonna have a lot less English text. Maybe it'll be worse. Do you wanna have code in there? There are all these beliefs that code leads to models being better at logical reasoning, of which I've seen zero evidence. It's not, um, I mean, Replit really made a great code model, but code models leading to better chain of thought reasoning on the part of language, or code being in the training set leading to better chain of thought reasoning?People claim this all the time, but I've still never seen any real evidence beyond that. You know, one of the generations of the GPT-3 model started supposedly from code-davinci. Yes. And so there's a belief that, you know, maybe that helped. But again, no evidence. You know, there's a belief that spending a lot of time on good sources like Wikipedia is good for the model.Again, no evidence. At the end of the day, we tried a bunch of different data mixes and the answer was that there are some that are better or worse than others. We did find that The Pile, for example, was a really solid data mix, but you know, there were stronger data mixes by our evaluation metrics. And I'll get back to the evaluation question in a minute cuz that's a really important one.This data set called C4, which is what the original T5 model was trained on, is weirdly good. And everybody, when I posted on this on Twitter, like Stella Biderman from EleutherAI mentioned this, I think someone else mentioned this as well. C4 does really well in the metrics and we have no idea why. We de-duplicated it against our evaluation set.So it's not like it memorized the data, it is just one web scrape from 2019. If you actually look at the T5 paper and see how it was pre-processed, it looks very silly. Mm-hmm. They removed anything that had the word JavaScript in it because they didn't want to get like no JavaScript [00:12:00] warnings. They removed anything with curly braces cuz they didn't wanna get JavaScript in it.They looked at this list of bad words, um, and removed anything that had those bad words. If you actually look at the list of bad words, words like gay are on that list. And so there's, you know, it is a very problematic, you know, list of words, but that was the cleaning that leads to a data set that seems to be unbeatable.So that to me says that we know nothing about data. We, in fact, used a data set called mC4 as well, which is, they supposedly did the same pre-processing of C4 just on more web crawls. The English portion is much worse than C4 for reasons that completely escape us. So in the midst of all that, basically I set two criteria.One was I wanted to be at least as good as mC4 English, like make sure that we're not making things actively worse. And mC4 English is a nice step up over other stuff that's out there. And two was to go all in on diversity after that, making sure that we had some code, we had some scientific papers, we had Wikipedia, because people are gonna use this model for all sorts of different purposes.But I think the most important thing, and I'm guessing Abhi had a million opinions on this, is you're only as good as your evaluation. And we don't know how to evaluate models for the kind of generation we ask them to do. So past a certain point, you have to kinda shrug and say, well, my evaluation's not even measuring what I care about.Mm-hmm. So let me just make reasonable choices. 
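A hedged sketch of the knob being described here: once you fix a token budget, a chosen mix of sources turns into an effective number of epochs over each source. All sizes and weights below are invented for illustration, not MPT-7B's actual mix.

```python
budget = 1e12            # 1T training tokens
sources = {              # name: (tokens available, mix weight) -- all invented
    "web_english": (600e9, 0.60),
    "code":        (100e9, 0.15),
    "papers":      ( 50e9, 0.15),
    "wikipedia":   (  5e9, 0.10),
}
for name, (avail, weight) in sources.items():
    used = budget * weight
    print(f"{name:12s} {used / 1e9:6.0f}B tokens -> {used / avail:5.1f} epochs")
# Small high-quality sources get repeated many times; whether that helps or hurts
# past some point is exactly the open question discussed above.
```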
EVALUATION CHALLENGES OF LLMs [00:13:00]

Swyx: So you're saying MMLU, BIG-bench, that kind of stuff is not convincing for you?

Jonathan: With a lot of this stuff, you've got two kinds of tasks. Some are multiple-choice-style tasks where there is a right answer: either you ask the model to spit out A, B, C, or D, or, if you're more sophisticated, you look at the perplexity of each possible answer and pick the one the model is most likely to generate. But we don't ask these models to do multiple-choice questions; we ask them to do open-ended generation. There are also open-ended generation tasks, like summarization, where you compare using things like a BLEU score or a ROUGE score, which are known to be very bad ways of comparing text. At the end of the day, there are a lot of great summaries of a paper, and a lot of great ways to do open-form generation, so humans are, to some extent, the gold standard. But humans are very expensive. It turns out we can't put them into our eval pipeline and just have humans look at our model every ten minutes. Not yet, anyway. Maybe soon. Are you volunteering, Abhi?

Abhinav: I just know we have a great eval team who's helping us build new metrics. So if they're listening...

Jonathan: Evaluation of large language models is incredibly hard, and I don't think any of these metrics really, truly capture what we expect from the models in practice.

Swyx: Yeah, and we might draw wrong conclusions. There's been a debate recently about the emergence phenomenon, whether or not it's a mirage. I don't know if you guys have opinions about that.

Abhinav: Yeah, I've seen that paper, and plots from different people suggesting maybe it's just an artifact of log scaling, or of the metrics: we're measuring accuracy, which is a very harsh zero-one thing, rather than something more continuous. But similar to what Jonathan was saying about evals, there's the issue of the diversity of eval metrics. When we put these models up, even the chat ones, the instruct ones, people are using them for such a variety of tasks that there's almost no way we can measure every dimension ahead of time. And particularly at the 7B scale, these models are still not super great at the really hard tasks, like some of the hardest tasks in MMLU. Sometimes they're barely scoring above random chance on the really, really hard tasks. So potentially, as we aim for higher- and higher-quality models, some of these metrics will become more useful to us. But we kind of had to develop MPT-7B flying a little bit blind on what we knew was coming out, just going off of a small set of common sense reasoning tasks and, of course, comparing those metrics against other open source models.

Alessio: Fast training and inference was one of the goals, right? So there's always the trade-off between doing the hardest thing and doing all the other things quickly.

Abhinav: Yeah, absolutely.
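[Editor's note: the "look at the perplexity of each possible answer" scoring Jonathan describes is easy to sketch with an off-the-shelf causal LM: score each option by the total log-likelihood of its tokens given the question, then pick the argmax. A minimal sketch assuming the Hugging Face transformers library, with "gpt2" as a stand-in model; real harnesses also handle tokenizer boundary effects and length normalization.]

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def option_logprob(prompt: str, option: str) -> float:
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    ids = tok(prompt + " " + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # row i predicts token i+1
    targets = ids[0, 1:]
    per_token = logprobs[torch.arange(targets.numel()), targets]
    return per_token[prompt_len - 1:].sum().item()        # sum over option tokens only

question = "Q: What is the capital of France?\nA:"
options = ["Paris", "London", "Rome", "Berlin"]
print(max(options, key=lambda o: option_logprob(question, o)))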
Abhinav: Yeah. Even at the 7B scale, people are trying to run these things on CPUs at home, and people are trying to port these to their phones. Basically, prioritizing the fact that the small scale would lead to broader adoption was a big thing going on.

Alessio: Yeah, and you mentioned FlashAttention and FasterTransformer as two of the core things. Can you maybe explain some of the benefits, and maybe why other models don't use them?

FLASH ATTENTION [00:16:00]

Abhinav: Yeah, absolutely. FlashAttention is basically a faster implementation of full attention, a mathematical equivalent, developed by some of our collaborators at Stanford, at Hazy Research.

Jonathan: What does the name Hazy Research mean?

Abhinav: I actually have no idea.

Swyx: I have no clue either. All these labs have fun names; I always like the stories behind them.

Abhinav: We really, really liked FlashAttention. I think we had to integrate it into our repo as early as September of last year, and it really helps with training speed and also inference speed, and we baked it into the model architecture. This is unique among all the other Hugging Face models you see out there: with ours, you can actually toggle between normal torch attention, which will work anywhere, and FlashAttention, which will work on GPUs, right out of the box. That way, you get almost a 2x speedup at training time and somewhere between a 50% and 100% speedup at inference time as well. Again, we really wanted people to use these and feel an improvement, and we have the team to help deliver that.

Swyx: Another part of your choices was ALiBi position encodings, which people are very interested in. A lot of people just take position encodings as a given, but there's actually a lot of active research there, and honestly it's very opaque; people don't know how to evaluate encodings, including position encodings. Could you explain ALiBi and your choice?

Abhinav: Yeah, for sure. ALiBi and the FlashAttention thing all go together in interesting ways, even with training stability too. What ALiBi really does is eliminate the need for positional embeddings in your model. Previously, if you were at token position one, you had a particular embedding that you'd add, and you couldn't really go beyond your max position, which is usually about 2,000. With ALiBi, they get rid of that and instead just add a bias to the attention map itself, kind of a slope. And if at inference time you want to go much, much larger, they just stretch that slope out over a longer number of positions. Because the slope is continuous and you can interpret it, it all works out. Now, one of the funny things we found is that FlashAttention saved so much memory and improved performance so much that even as early as last year, we were profiling models with very long context lengths, up to the 65k that you've seen in the release. We just never really got around to using it, because we didn't really know what we might use it for, and also it's very hard to train stably.
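[Editor's note: a minimal sketch of the ALiBi bias Abhinav describes: no position embeddings, just a fixed per-head linear penalty added to the attention scores, which can be stretched to longer sequences at inference. The slope formula follows the geometric sequence from the ALiBi paper for power-of-two head counts; everything else here is toy code.]

import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Geometric sequence 2^(-8/n), 2^(-16/n), ... (power-of-two head counts)
    start = 2.0 ** (-8.0 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    pos = torch.arange(seq_len)
    # 0 on the diagonal, increasingly negative the further back the key is
    dist = (pos[None, :] - pos[:, None]).clamp(max=0).float()
    return alibi_slopes(n_heads)[:, None, None] * dist  # shape: (heads, q, k)

# The bias is added to raw attention scores before the causal mask and softmax.
# To run at a longer context than trained, build the bias for a bigger seq_len.
heads, seq = 8, 128
scores = torch.randn(heads, seq, seq)  # toy attention scores
attn = torch.softmax(scores + alibi_bias(heads, seq), dim=-1)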
Abhinav: So we started experimenting with ALiBi integration, and then we suddenly found that, oh wow, stability improves dramatically, and now we can actually use ALiBi at long context lengths. That's how we got to our StoryWriter model, where we can stably train these models out to very, very long context lengths and use them performantly.

Jonathan: Yeah.

Swyx: And it's also why you don't have a firm number. Most people have a firm number on the context length; you're just like, eh, 65 to 85.

Abhinav: Oh yeah, there was a big debate: 64k or 65k? 65k plus.

Swyx: Just do powers of two. So 64, you know.

Jonathan: Right, right. But technically the context length is infinite. If you give me enough memory, we can just keep going forever. We had a debate over what number to claim as the longest we could handle. We picked 84k as the longest I'd expect people to see easily in practice, but we played around with even longer than that, and I don't see why we couldn't go longer.

Swyx: And for those who haven't read the blog post, you put The Great Gatsby in there and asked it to write an epilogue, which seemed pretty impressive.

Jonathan: Yeah, there are a bunch of epilogues floating around internally at Mosaic. That wasn't my favorite; I think we all have our own favorites. But there are a bunch of really, really good ones. There was one where it's Gatsby's funeral, and then Nick starts talking to Gatsby's ghost, and Gatsby's father shows up, and then he's at the police station with Tom. It was very plot-heavy: this is what comes next. And a bunch of them were just very Fitzgerald-esque, beautiful writing. It was cool to see that, wow, the model seemed to actually be working with all this input. It's exciting. You can think of a lot of things you could do with that kind of context length.

FINE-TUNING FOR CREATIVITY [00:19:50]

Swyx: Is there a trick to fine-tuning for a creative task rather than a factual task?

Jonathan: I don't know what it is, but probably. The person who did this, Alex, did fine-tune the model explicitly on books. The goal was to get a model that was really a story writer. Beyond that, I'm not entirely sure. Actually, it's a great question; I'll ask you back: how would you measure that?

Swyx: God, human feedback is the solve to all things. I think there is a labeling question. In computer vision, we had a really good episode with Roboflow on the Segment Anything Model, where you start with human feedback on something like 0.5% of the overall final labels, then augment them and fully automate the rest. I think that could be applied to text. It seems intuitive, and probably people like Snorkel have already raced ahead on this stuff, but I just haven't seen it applied in the language domain yet.

Jonathan: There are a lot of things that seem like they make a lot of sense in machine learning that never work, and a lot of things that make zero sense that seem to work. So I've given up trying to even predict. Until I see the data or try it, I just kind of shrug my shoulders and hope for the best. Bring data or else, right?
Swyx: Yeah, exactly.

Alessio: On the fine-tuning on books: Books3 is one of the big data sets, and there was the whole Twitter thing about it. You know, I used to be a community moderator at Genius.com, and we ran into a lot of this. If you're explaining lyrics, do you have the right to redistribute the lyrics? I know you ended up changing the license on the model from commercial use permitted.

Swyx: I'm not sure they did.

Jonathan: So we flipped it for about a couple of hours.

Swyx: Okay. Can we introduce the story from the start, just for people who are out of the loop?

Jonathan: Yeah, I can tell the story very simply. The Books3 data set does contain a lot of books, and it is, as I discovered, a data set that provokes very strong feelings from a lot of folks. Well, from one person in particular, in fact, and that's about it. But it turns out one person who wants a lot of attention can get enough attention that we're talking about it now. So we had a discussion internally after that conversation, and we talked about flipping the license, and very late at night I thought, you know, maybe that's a good thing to do. Then we decided it was probably better to just stand pat, and the license is still Apache 2.0. One of the conversations we had (we hadn't thought about this because we had our heads down) was that the Hollywood writers' strike took place basically the moment we released the model. We were releasing a model that could do AI-generated creative content, and that is one of the big sticking points in the strike.

Swyx: Oh, the optics are not good.

Jonathan: The optics aren't good, and that's not what we want to convey. This is really a demo of the ability to do really long sequence lengths, and boy, that's not timing we appreciated. So we talked a lot internally that night: we've had time to read the news, we've had time to take a breath, and we don't really love this. We came to the conclusion that it was better to just leave it as it is and learn the lesson for the future. Certainly one of my takeaways is that there's a societal context around this stuff that's easy to forget when you're in the trenches just trying to get the model to train. In hindsight, I might have gone with a different demo than a story writer. I might have gone with a coder, because we seem to have no problem putting programmers out of work with these models.

Swyx: Oh yeah. Please, please, take this stuff away from me.

OPEN SOURCE LICENSES AND ETHICAL CONSIDERATIONS [00:23:00]

Jonathan: Right. The copyright concerns I leave to the lawyers. If I learned one thing teaching at a law school, it's that I'm not a lawyer, and all this stuff is complicated. Open source licenses, especially, were not designed for this kind of world. They were designed for a world of forcing people to be more open, not forcing people to be more closed. And I think that was part of the impetus here: trying to use licenses to make things more closed, which is, I think, against the grain of the open source ethos.
Jonathan: So that struck me as a little bit strange. But I think the most important part is that we want to be thoughtful and we want to do the right thing. In that case, I hope that through all the licensing back-and-forth you saw, we were trying to be really thoughtful about this, and it's hard. I learned a lot from that experience.

Swyx: There's also, I think, an open question of fair use. Is training on words fair use? You don't have a monopoly on words, but on certain arrangements of words you do. And who is to say how much is memorization by a model versus actually learning and internalizing, and then sometimes happening to land at the same result?

Jonathan: If I've learned one lesson, it's that I'm not gonna be the person to answer that question. My position is: we will try to make this stuff open and available, and let the community make decisions about what they are or aren't comfortable using. And at the end of the day, it still strikes me as a little bit weird that someone is trying to use these open source licenses to close the ecosystem rather than make things more open. That's very much against the ethos of why these licenses were created.

Swyx: So the official Mosaic position, I guess, is: before you use MPT-7B for anything commercial, check with your own lawyers. Trust your lawyers, not Mosaic's lawyers.

Jonathan: Yeah. Our lawyers are not your lawyers. Make the best decision for yourself. We've tried to be respectful of the content creators, and at the end of the day, this is complicated. It's an area of law that hasn't been established yet, and it's a place where we're gonna continue to try to do the right thing. One of the commenters said something I really appreciated: well, they're trying to do the right thing, but nobody knows what the right thing even is. I guess the most "right" thing would have been to literally not release a model at all, but I don't think that would have been the best thing for the community either.

Swyx: Cool. Well, thanks; well handled. We had to cover it, just because...

Jonathan: Oh, yes, no worries. It's a big piece of news. It's been on my mind a lot.

TRAINING STABILITY ENHANCEMENT [00:25:15]

Swyx: A lot of these other ideas in terms of architecture, FlashAttention, ALiBi, and the other data sets were contributions from the rest of, let's call it, the open community of machine learning advancements. But Mosaic in particular had some stability improvements to mitigate "loss spikes," which I took to mean your existing set of tools. Maybe we kind of covered that already, and I don't want to put words in your mouth, but when you say things like "please enjoy my empty logbook," how much of an oversell is that? How much is marketing versus reality?

Abhinav: Oh yeah, that one's real. It's fully end-to-end.

Swyx: So maybe, what specific features of MosaicML?

Abhinav: Totally. I'll break it into two parts. One is training stability: knowing that your model is basically gonna get to the end of training without loss spikes.
Abhinav: At the 7B scale, for some models it's not that big of a deal, but as you train for longer and longer durations, we found it's trickier and trickier to avoid these loss spikes. So we spent a long time figuring out what we could do about our initialization, our optimizers, and the architecture to basically prevent these loss spikes. Even in our training run, if you zoom in, you'll see small intermittent spikes, but they recover within a few hundred steps. That's kind of the magical bit: our first line of defense is that we recover from loss spikes just naturally. Our second line of defense was that we used determinism and really smart resumption strategies, so that if something catastrophic happened, we could resume very quickly from a few batches before and apply some interventions. So we had those preparations as a plan B, but we didn't have to use them at all for the MPT-7B training, which was kind of a lucky break. The third part of getting all the way to the empty logbook is having the right training infrastructure. This is one of the big selling points of the platform. When you try to train these models on hundreds of GPUs (not many people outside deep industry research know this), the GPUs fail a lot: I would say almost once every thousand A100-days. So for us, on a big 512-GPU cluster, the run will fail basically every two days. This is either due to GPUs literally "falling off the bus" (that's a real error we see) or networking failures or something like that. In those situations, what people have normally done is keep an on-call team sitting round the clock, 24/7, on Slack. Once something goes wrong, they try to inspect the cluster, take out the nodes that are broken, and restart the job, and it's a huge pain. We ourselves did this for a few months. And because we're building such a platform, we step by step automated every single one of those processes. So now, when a run fails, we have an automatic watchdog that's watching. It will stop the job, test the nodes, cordon any that are broken, and relaunch. And because our software is all deterministic and has fast resumption, it just continues on gracefully. So within that logbook you can see that sometimes, at maybe 2:00 AM, the run failed, and within a few minutes it was back up and running while all of us were sleeping peacefully.

Jonathan: I do want to say that was hard-won. This is not how things were going many months ago. For hardware failures, we had on-calls getting up at two in the morning to figure out which node had died and for what reason, restart the job, and cordon the node. We were seeing catastrophic loss spikes really frequently, even at the 7B scale, that were completely derailing runs. So this was step-by-step ratcheting our way there, as Abhi said, to the point where many models are training at the moment and I'm sitting here in the studio not worrying one bit about whether the runs are gonna continue.
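[Editor's note: in pseudocode terms, the automated recovery loop Abhi describes looks something like the sketch below. Every name here is a hypothetical stand-in (MosaicML has not published this exact API); it only captures the stop, health-check, cordon, and relaunch cycle.]

import time

def gpu_health_check(node) -> bool:
    # Stand-in: in practice, something like an NCCL all-reduce or GPU burn test.
    return True

def watchdog(run):
    while not run.finished():
        time.sleep(60)
        if run.healthy():
            continue
        run.stop()                                   # halt the failed job
        bad = [n for n in run.nodes() if not gpu_health_check(n)]
        run.cordon(bad)                              # pull broken nodes from the pool
        # Deterministic training plus fast resumption means relaunching from the
        # latest checkpoint continues the run exactly where it left off.
        run = run.relaunch(resume_from=run.latest_checkpoint())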
Swyx: I'm not so much of a data center hardware kind of guy, but isn't there existing software to do this for CPUs? What's different about this domain? Does that question make sense at all?

Jonathan: Yeah. I think back to all the Google fault-tolerance papers I read as an undergrad or grad student about building distributed systems. A lot of it is that each CPU is doing, say, an individual unit of work. You've got a database that's distributed across your cluster, and you want to make sure that one CPU failing, or one machine failing, can't delete data, so you replicate it. You have protocols like Paxos, where you've literally got state machines that are replicated, with leaders and backups and things like that. In this case, you're performing one giant computation where you cannot afford to lose any node. If you lose a node, you lose model state; if you lose a node, you can't continue. Maybe in the future we'll create new versions of a lot of our distributed training libraries that do have backups, where data is replicated so that if you lose a node, you can detect which node you lost and just continue training without having to stop the run, pull from a checkpoint, and restart on different hardware. But for now, we're certainly in a world where if anything dies, that's the end of the run, and you have to go back and recover from it.

DATA READINESS & TRAINING PREPARATION [00:30:00]

Abhinav: Yeah, a big phrase there is synchronous data parallelism. We're basically saying that on every step, every GPU does some work; they stay in sync with each other, average their gradients, and continue. Now, there are algorithmic techniques to get around this. You could say: oh, if a GPU dies, just forget about it and all the data it was going to see; we're not gonna train on it. But we don't like to do that currently, because it makes us give up determinism and things like that. Maybe in the future, as we go to extreme scales, we'll start looking at some of those methods. But at the current time, we want determinism. We wanted a run we could perfectly replicate if we needed to. The goal was to figure out how to run it on a big cluster without humans having to babysit it.

Alessio: So, as you mentioned, these models are kind of the starting point for a lot of your customers. You have an inference product and a training product, and you previously had a Composer product that's now rolled into a superset of it, the LLM Foundry. How are you seeing that change from the usual MLOps stack, from how people trained things before to now, when they're starting from one of these MPT models? What should teams think about as they come to you and start their journey?

Jonathan: I think there's a key distinction to make here: when you say starting from MPT models, you can mean two things. One is actually starting from one of our checkpoints, which I think very few of our customers are actually going to do, and one is starting from our configuration.
Jonathan: You can look at our friends at Replit for that. MPT was in progress when Replit came to us and said: hey, we need a 3-billion-parameter model by next week, on all of our data. We're like, well, here you go; this is what we're doing, and if it's good enough for us, hopefully it's good enough for you. That's basically the message we want to send to our customers. MPT is clearing a path all the way through, where they know they can come bring their data, use our training infrastructure, and use all of our amazing orchestration and other tools that Abhi just mentioned for fault tolerance. They can use Composer, which is still at the heart of our stack. And then the LLM Foundry is really the specific model configuration. They can come in knowing that thing is gonna train well, because we've already done it multiple times.

Swyx: Let's dig in a little bit more on what people should have ready before they come talk to you: data, architecture, evals that they're looking at, et cetera.

Abhinav: Yeah, I mean, we'll accept customers at any stage in their pipeline. There are archetypes of people who have built products around some of these API companies and have reached a maturity level where it's like: we want our own custom models now, either for the purpose of reducing cost (our inference service is quite a bit cheaper than using the APIs) or because they want some kind of customization you can't really get from the API providers. I'd say the most important thing to have before training a big model is good eval metrics: some kind of score you can track as you're training your models and scaling up that tells you you're progressing. And it's really funny: a lot of times customers will be really excited about training the models. It's really fun to launch jobs on hundreds of GPUs, super fun. But then they'll be like, wait, what are we gonna measure? Not just the training loss, right? It's gotta be more than that. So eval metrics are a good prerequisite. Also your data: either coming with your own pre-training or fine-tuning data and having a strategy to clean it, or we can help clean it too; we're building a lot of tooling around that. Once you have those two kinds of inputs, plus the budget you want to spend, we can pretty much walk you through the rest of it. That's kind of what we do. A while back, for example, we helped build CRFM's model for biomedical language.

Jonathan: That's the Center for Research on Foundation Models, spelling it out for people.

Abhinav: Exactly. You've done more of these than I have. Basically, we can help you figure out which models to train while scaling up, so that when you go for your big run, your hero run, it's predictable. You can feel confident it's gonna work, and you'll roughly know what quality you're gonna get out before you have to spend a few hundred thousand dollars.

DYNAMIC REAL-TIME MODEL EVALUATION [00:34:00]

Alessio: Reza from Replit was on the podcast last week, and they had HumanEval and then AmjadEval, which is vibe-based.
Jonathan: And I do think the vibe-based eval cannot be underrated. At the end of the day, we did stop our models and do vibe checks. As we monitored our models, one of our evals was just a bunch of prompts; we would watch the answers as the model trained and see if they changed, because, honestly, I don't really believe any of these eval metrics capture what we care about. I think one of our prompts was to suggest games for a three-year-old and a seven-year-old that would be fun to play. That was a lot more valuable to me personally: seeing how that answer evolved and changed over the course of training. And just to clarify for folks: HumanEval is an automated evaluation metric; there are no humans in it at all. It's really badly named. I got so confused the first time someone brought it to me. I was like, no, we're not bringing humans in. And no, it's automated; they just gave it a bad name. And there are only a hundred-some problems in it.

Abhinav: Yeah, and it's for code specifically, right?

Jonathan: Yeah. It's a weird, confusing name that I hate. But when other metrics are called HellaSwag, you just gotta roll with it at this point.

Swyx: You're doing live evals now. One of the tweets I saw from you was that it's important that you do it parallelized. Maybe you want to explain what you guys did.

Abhinav: Yeah, for sure. LLM Foundry has many pieces to it. There's obviously the core training piece, but there are also tools for evaluating models. We've built, I think, one of the fastest evaluation frameworks: it's multi-GPU compatible, it runs with Composer, and it can support really, really big models. Our framework runs so fast that even as our models are training, we can run these metrics live during the training. So if you have a dashboard like Weights & Biases, you can watch all these eval metrics, and we have like 15 or 20 of them, honestly, that we track during the run, with negligible overhead. We can actually watch as our models train and feel confident. It's not like we wait until the very last day to test whether the model is good or not.

Jonathan: That's amazing. I love that we've gotten this far into the conversation and still haven't talked about efficiency and speed; those are usually our two watchwords at Mosaic. That says we're doing a lot of other cool stuff. But at the end of the day, cost comes first. If you can't afford it, it doesn't matter. Getting things cheap enough that we can monitor in real time, getting things cheap enough that we can even do it in the first place: that's the basis for everything we do.
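[Editor's note: the "run evals live during training" idea reduces to a cheap periodic hook in the training loop. LLM Foundry does this multi-GPU with Composer; the single-GPU toy sketch below, with made-up data set names, is only meant to show the shape of it.]

import torch
import torch.nn.functional as F

EVAL_EVERY = 500  # steps between in-flight eval passes

def eval_suite(model, eval_sets) -> dict:
    model.eval()
    scores = {}
    with torch.no_grad():
        for name, (x, y) in eval_sets.items():  # e.g. {"lambada": (inputs, targets)}
            scores[name] = (model(x).argmax(-1) == y).float().mean().item()
    model.train()
    return scores

def train(model, opt, batches, eval_sets, logger):
    for step, (x, y) in enumerate(batches):
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
        opt.zero_grad()
        if step % EVAL_EVERY == 0:  # negligible overhead if the eval pass is fast
            logger.log({"train_loss": loss.item(), **eval_suite(model, eval_sets)})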
OPEN SCIENCE FOR AFFORDABLE AI RESEARCH [00:36:00]

Alessio: Do you think a lot of the questions we have around which data sets we should use, and things like that, exist just because training was so expensive before, so we haven't run enough experiments to figure them out? Is one of your goals to make it cheaper, so that we can actually get the answers?

Jonathan: Yeah, that's a big part of my personal conviction for being here. In my heart, I'm still the second-year grad student who was jealous of all his friends who had GPUs when he didn't, and who couldn't train any models except on his laptop. I mean, the lottery ticket experiments began on my laptop; I had to beg for one K80 so that I could run MNIST. I'm still that person deep down. And I'm a believer that if we want to do science and really understand these systems, understand how to make them work well, how they behave, and what makes them safe and reliable, we need to make it cheap enough that we can actually do science. And science involves running dozens of experiments. When I finally cleaned out my GCS bucket from my PhD, I deleted a million model checkpoints. I'm not kidding: there were over a million model checkpoints. That is the kind of science we need; that's just what it takes. In the same way that in a biology lab you don't just grow one cell and say, eh, the drug seems to work on that cell: there's a lot more science you have to do before you really know.

Abhinav: Yeah. And one of the special things about Mosaic's position is that we have so many customers all trying to train models that we have the incentive to devote all these resources and time to doing this science. When we learn which pieces actually work and which don't, we get to help many, many people. That kind of aggregation process is really important for us. I remember way back there was a paper from Google that investigated batch sizes, a paper that must have cost a few million dollars to run all the experiments, and it was just like: wow, what a benefit to the whole community. Now we all get to learn from that, and we don't have to spend those millions of dollars anymore. So Mosaic's science, the insights we get on data, on pre-training, on architecture, on all these different things: that's why customers come to us.

Swyx: Yeah, you guys did some really good stuff on PubMedGPT as well. That's the first time I heard of you, and that's also published to the community.

Abhinav: Yeah, that one was really fun. We were like, well, no one's really trained fully-from-scratch domain-specific models before; what if we just did a biomed one? Would it still work? And yeah, it did. We'll probably have some follow-up soon, I think, later this summer.

Jonathan: Yes, stay tuned on that. But I will say, in general, it's a really important value for us to be open. In some sense, we have no incentive not to be open: we make our money off of helping people train better. There's no cost to us in sharing what we learn with the community, because at the end of the day we make our money off of custom models, great infrastructure, and putting all the pieces together. That's honestly where the Mosaic name came from.
Jonathan: Not off of, oh, we've got this one cool secret trick that we won't tell you about, or closing things up. In the past couple of weeks I've talked to my friends at places like Brain, or what used to be Brain, now Google DeepMind. RIP, Brain. I spent a lot of time there, and it was a really formative time for me, so I miss it. But I kind of feel like we're one of the biggest open research labs left in industry, which is a very sad state of affairs, because we're not very big.

Swyx: Can you say how big the team is, actually?

Jonathan: We're about 15 researchers, so we're tiny compared to the huge armies of researchers I remember at Brain, or at FAIR, or at DeepMind back during their heydays. But everybody else has kind of closed up and isn't saying very much anymore. We're gonna keep talking, and we're gonna keep sharing, and we will try to be that vanguard to the best of our ability. We're very small, and I can't promise we're gonna do what those labs used to do in terms of scale or quantity of research, but we will share what we learn, and we will try to create resources for the community. I just believe in openness fundamentally. I'm an academic at heart, and it's sad to me to watch that go away from a lot of the big labs.

THE OPEN APPROACH [00:40:15]

Alessio: We just had a live pod about the "no moat" post that came out, and it was one of the first times I really dove into LoRA and some of these new techniques. How are you thinking about what it's gonna take for the open approach to really work? Obviously, today GPT-4 is still the state-of-the-art model for a lot of tasks. Do you think the innovations and fine-tuning methods we have today are enough, if enough people like you are running research groups that are open? Or do you think we still need a step-function improvement there?

Jonathan: I think one important point here is the idea of coexistence. Who won, Linux or Windows? The answer is yes. Microsoft bought GitHub and has the Windows Subsystem for Linux. Linux runs a huge number of our servers, and Microsoft is still a wildly profitable company, probably the most successful tech company right now. So who won, open source or closed source? Yes. And I think that's a similar world we're gonna be in here, where it's gonna be different things for different purposes. I would not run Linux on my laptop personally, because I like connecting to wifi and printing things, but I wouldn't run Windows on one of my servers. So I do think what we're seeing with a lot of our customers is: do they choose OpenAI or Mosaic? Yes. There's a purpose for each of these. You have to send your data off to somebody else with OpenAI's models, and that's a risk. GPT-4 is amazing, and I would never promise someone that if they come to Mosaic, they're gonna get a GPT-4-quality model; that's way beyond our means and not what we're trying to do anyway. But there's also a whole world of domain-specific models, context-specific models that are really specialized, proprietary, and trained on your own data, that can do things you could never do with one of these big models.
Jonathan: You can customize in crazy ways. GPT-4 is not gonna hit 65k context length for a very long time; they've already trained that model, and they haven't even released the 32k version yet. So we can do things differently by being flexible. I think the answer to all of this is yes. But we can't let the open source ecosystem disappear, and that's the scariest thing for me. I hear a lot of talk in academia about, whatever happened to that academic research on the field called information retrieval? Well, in 1999 it disappeared. Why? Because Google came along, and who cares about information retrieval research when you have a Google-scale, web-scale database? So there's a balance here. We need to have both.

Swyx: I want to applaud you; we'll maybe edit in a little crowd-applause line. Because I think that, as a research community, as people interested in progress, we need to see these things, instead of just seeing marketing papers advertising GPT-4.

Jonathan: To get on my soapbox for ten more seconds: when I talk to policymakers about the AI ecosystem, the usual fear I bring up is that innovation will slow because of a lack of openness. I've been complaining about this for years, and it's finally happened. Why was Google sharing these papers? Why was OpenAI sharing these papers? There are a lot of reasons; I have my own beliefs. But it's not something we should take for granted that everybody shares the work they do. I think we took it for granted for a while, and now it's gone. I think it's gonna slow the pace of progress. In a lot of cases, each of these labs has a bit of a monoculture, and being able to pass ideas back and forth was a lot of what kept scientific progress moving. So it's imperative, not just for the open source community and for academia but for the progress of technology, that we have a vibrant open source research community.

THE FUTURE OF MOSAIC [00:44:11]

Swyx: That's a preview of the ecosystem commentary we're gonna do. But I want to close out some stuff on Mosaic. You launched a bunch of stuff this month, a lot of stuff. I was listening to you on Gradient Dissent and other podcasts we know and love, and you said you were not gonna do inference. And then last week you were like: here's MosaicML Inference. Oops. So maybe just at a high level: what was MosaicML, and what is it growing into? How do you conceptualize this?

Jonathan: I will say that when Gradient Dissent was recorded, we weren't doing inference and had no plans to do it. It took a little while for the podcast to get out. In the meantime: one thing I've learned at a startup, and I'm sure Abhi can comment on this as well, is that focus is the most important thing. We have done our best work when we've been focused on doing one thing really well, and our worst work when we've tried to do lots of things. So we didn't want to do inference; we didn't want to have had to do inference. At the end of the day, our customers were begging us to do it, because they wanted a good way to serve the models and they liked our ecosystem. So in some sense, we got dragged into it kicking and screaming.
Jonathan: We're very excited to have the product. We're going to put our best foot forward and make something really, truly amazing. But that's something we were reluctant to do. Our customers convinced us it would be good for our business; it's been wonderful for business, and we're gonna put everything into this. Back when Gradient Dissent came out, though, I was thinking: focus is the most important thing. I've learned that the hard way multiple times at Mosaic; Abhi can tell you, I've made a lot of mistakes by not focusing enough. And boy, inference is a whole second thing, a whole different animal from training. When we founded the company, our belief was that inference was relatively well served at the time; there were a lot of great inference companies out there. Training was not well served, especially efficient training, and we had something to add there. I think we've discovered that, as the nature of the models changed, the nature of what we had to add to inference changed a lot, and there became an opportunity for us to contribute something. That was not the plan. But now we do want to be the place people come to when they want to train these big, complex, difficult models and know that it's gonna go right the first time, and that they'll have something they can serve right away. Really, the Replit example: with ten days to go, saying, hey, can you please train that model? Three or four days later, the model was trained, and we were just having fun doing interesting fine-tuning work on it for the rest of the ten days. That also requires good inference.

Swyx: That's true: running evals and fine-tuning. I'm just putting my business hat on, and Alessio's as well. I've actually had fights with potential co-founders about this, about the primary business being training, which is essentially a one-time cost.

Jonathan: Who told you it was a one-time cost? Who told you that?

Swyx: No, no. Correct me.

Jonathan: Let me correct you in two ways. As our CEO Naveen would say if he were here: when you create version 1.0 of your software, do you then fire all the engineers? Of course not. MPT has a thousand different things we wanted to do that we never got to, so there will be future models.

Abhinav: And the data it's trained on is also changing over time, right? If you want it to know anything about, say, May of 2023, we'll have to retrain it, and so on. And this is especially true for customers who run the kinds of things that need to be up to date on world knowledge. The other thing I would say is that the models we have today are certainly not the best models we'll ever produce. They're gonna get smaller, faster, cheaper, lower latency, and higher quality. So you always want the next-gen version of MPT, and the one after that, and the one after that. There's a reason that even the GPT series goes 3, 4, and we know there's gonna be a 5. So I also don't see it as a one-time cost.

Jonathan: Yeah. And if you want to cite a stat on this, there are very, very

RESTalk
EP116 RESNET Data Sheds Light on Energy Efficient Home Trends with Ryan Meres

RESTalk

Play Episode Listen Later May 8, 2023 23:12


“Data is the most valuable asset in the world.” - Brittany Kaiser

Housing trends have significant importance to the RESNET community. In addition to tracking progress with RESNET HERS Ratings, the hundreds of data points collected on each home provide insights into the trends in energy efficiency, component selection and electrification. What are some of the latest trends in HERS Rated homes? What value can this data provide to all the stakeholders in RESNET?

RESTalk welcomes back RESNET's Program Director, Ryan Meres, to share with us some insights he has gathered from the 2023 report: Trends in HERS Rated Homes, a statistical abstract. Ryan shares with us the impetus for this report, as well as the latest trends in its third year of publication. Some of the big-picture trends he notes are:

- A 129% increase in annual Ratings from 2013 to 2022
- Just shy of 338,000 Ratings last year (80% single family / 20% multifamily)
- Year-over-year increases for a decade
- Massachusetts is number one again, with 82% of new homes receiving a Rating
- Indiana comes in second with 68%
- 8 states achieved 50% or more of new homes rated
- When looking only at the total number of ratings, Texas comes out on top with over 81,000 ratings

Note: in the upcoming RESTalk Episode 117 we will learn about the HERS Index and Texas House Bill 3215. We also cover other topics in the report, including efficiency, component types and usage, and trends toward electrification.

A link to the 2021 report: https://www.resnet.us/wp-content/uploads/2021-Data-Trends-Report-of-HERS-Rated-Homes.pdf

A link to the 2022 report: https://t.e2ma.net/click/6263j7/ysd8cfc/u5splqh

A link to the 2023 report will be available in the RESNET Newsletter. If you are not a subscriber, please subscribe here: https://signup.e2ma.net/signup/1878040/1889360/

RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP115 The New RESNET Embodied Carbon Advisory Committee with Chris Magwood from RMI

RESTalk

Play Episode Listen Later Apr 17, 2023 29:24


“Doing your best in this moment puts you in the best place for the next moment.” -Oprah Winfrey

A critical element in the decarbonization of homes is the embodied carbon produced in the construction of homes. As the current RESNET Carbon Index® only covers the carbon produced by energy used in a home, what do we do next? What types of standards exist to perform this assessment? What are the challenges, and what is the roadmap ahead?

Recently, the RESNET Board of Directors formally authorized the creation of an effort to explore the development of a residential embodied carbon standard. The first step in this process is the creation of an advisory committee that will review the development of the standard, provide suggestions on how to proceed, and vet drafts of the guidelines.

Our guest today is Chris Magwood, Chair of the RESNET Embodied Carbon Advisory Committee. Chris's full-time role is Manager of Carbon-Free Buildings at RMI.

I found this topic fascinating as we explored the data and the impact on people, businesses, and the environment. My big takeaway was that the stacked benefits are wins all the way around, and pursuing these efforts is not likely to be cost-prohibitive; it may even be cost-beneficial.

LINKS:
Chris on LinkedIn: https://www.linkedin.com/in/chris-magwood-8a8a9738/
RESNET blog post on the RESNET Embodied Carbon Advisory Committee: https://www.resnet.us/about/resnet-carbon-rating-index/resnet-appoints-advisory-committee-to-investigate-development-of-standard-to-calculate-the-embodied-carbon-in-homes/
RESTalk EP113 on the Carbon Index update: https://restalk.libsyn.com/ep113-update-on-resnet-carbon-index-with-philip-fairey-and-david-goldstein
Info on the RESNET Carbon Index: https://www.resnet.us/about/resnet-carbon-rating-index/

RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US

RESTalk
EP114 Meet the new RESNET Board leaders, Mark Johnson & Cy Kilbourn

RESTalk

Play Episode Listen Later Mar 13, 2023 28:15


“Leadership is the capacity to translate vision into reality.” -Warren Bennis

What are some of the qualities embodied in outstanding board leaders?

- Someone who has experience in business leadership roles.
- A person who has a proven track record in developing and executing strategy.
- Someone who demonstrates a forward-thinking mentality.

In today's podcast, we are joined by two outstanding board leaders: Mark Johnson, Executive VP and Director of Business Development at the International Code Council, and Cy Kilbourn, Vice President of Engineering at Ekotrope. We learn about their backgrounds and growing involvement with RESNET over the years, and how fitting they are to take on these roles: Mark as Board President and Cy as Board Vice President.

Mark describes his plans to continue to ensure that RESNET is the gold standard within the industry, which includes the growing interest in and use of the Carbon Index. Cy notes how he will help encourage and support the opportunities offered in field code compliance work, ESG reporting, and the extension of the 45L tax credit. Mark and Cy are focusing on the realization and execution of these and other initiatives.

LINKS:
RESNET's 2023 Mission, Goals, and Priorities: https://www.resnet.us/wp-content/uploads/RESNET_2023_MissionGoalsPriorities_11-01-2022.pdf
RESNET Board Page: https://www.resnet.us/about/resnet-board-of-directors-members/
Mark Johnson on LinkedIn: https://www.linkedin.com/in/mark-johnson-79b71a7/
Cy Kilbourn on LinkedIn: https://www.linkedin.com/in/cy-kilbourn-038bb019/

RESTalk: To the RESNET community, we hear you and want to engage. Learn more at www.RESNET.us Or for more info on this topic contact RESNET at INFO@RESNET.US