Podcasts about Latency

  • 454 podcasts
  • 721 episodes
  • 44m average duration
  • 1 weekly episode
  • Latest episode: May 12, 2025

POPULARITY (chart: 2017–2024)


Best podcasts about Latency

Latest podcast episodes about Latency

Crazy Wisdom
Episode #460: Voice First, Future Forward: The AI Agents Are Here

Crazy Wisdom

Play Episode Listen Later May 12, 2025 53:07


I, Stewart Alsop, welcomed Alex Levin, CEO and co-founder of Regal, to this episode of the Crazy Wisdom Podcast to discuss the fascinating world of AI phone agents. Alex shared some incredible insights into how AI is already transforming customer interactions and what the future holds for company agents, machine-to-machine communication, and even the nature of knowledge itself. Check out this GPT we trained on the conversation!

Timestamps
00:29 Alex Levin shares that people are often more honest with AI agents than human agents, especially regarding payments.
02:41 The surprising persistence of voice as a preferred channel for customer interaction, and how AI is set to revolutionize it.
05:15 Discussion of the three types of AI agents: personal, work, and company agents, and how conversational AI will become the main interface with brands.
07:12 Exploring the shift to machine-to-machine interactions and how AI changes what knowledge humans need versus what machines need.
10:56 The looming challenge of centralization versus decentralization in AI, and how Americans often prioritize experience over privacy.
14:11 Alex explains how tokenized data can offer personalized experiences without compromising individual privacy.
25:44 Voice is predicted to become the primary way we interact with brands and technology due to its naturalness and efficiency.
33:21 Why AI agents are easier to implement in contact centers due to different entropy compared to typical software.
38:13 How Regal ensures AI agents stay on script and avoid "hallucinations" through proper training and guardrails.
46:11 The technical challenges in replicating human conversational latency and nuances in AI voice interactions.

Key Insights
AI Elicits Honesty: People tend to be more forthright with AI agents, particularly in financially sensitive situations like discussing overdue payments. Alex speculates this is because individuals may feel less judged by an AI, leading to more truthful disclosures compared to interactions with human agents.
Voice is King, AI is its Heir: Despite predictions of its decline, voice remains a dominant channel for customer interactions. Alex believes that within three to five years, AI will handle as much as 90% of these voice interactions, transforming customer service with its efficiency and availability.
The Rise of Company Agents: The primary interface with most brands is expected to shift from websites and apps to conversational AI agents. This is because voice is a more natural, faster, and more emotive way for humans to interact, a behavior already seen in younger generations.
Machine-to-Machine Future: We're moving toward a world where AI agents representing companies will interact directly with AI agents representing consumers. This "machine-to-machine" (M2M) paradigm will redefine commerce and the nature of how businesses and customers engage.
Ontology of Knowledge: As AI systems process vast amounts of information, creating a clear "ontology of knowledge" becomes crucial. This means structuring and categorizing information so AI can understand the context and the user's underlying intent, rather than just processing raw data.
Tokenized Data for Privacy: A potential solution to privacy concerns is "tokenized data." Instead of providing AI with specific personal details, users could share generalized tokens (e.g., "high-intent buyer in 30s") that allow for personalized experiences without revealing sensitive, identifiable information.
AI Highlights Human Inconsistencies: Implementing AI often brings to light existing inconsistencies or unacknowledged issues within a company. For instance, AI might reveal discrepancies between official scripts and how top-performing human agents actually communicate, forcing companies to address these differences.
Influence as a Key Human Skill: In a future increasingly shaped by AI, Sam Altman (via Alex) suggests that the ability to "influence" others will be a paramount human skill. This uniquely human trait will be vital, whether for interacting with other people or for guiding and shaping AI systems.

Contact Information
Regal AI: regal.ai
Email: hello@regal.ai
LinkedIn: www.linkedin.com/in/alexlevin1/
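The "tokenized data" idea from the Key Insights can be sketched in a few lines of Python. Everything here (the field names, the age-band scheme, the 0.8 intent threshold) is a hypothetical illustration, not Regal's actual design:

```python
def tokenize_profile(profile: dict) -> list[str]:
    """Map specific personal details to coarse, non-identifying tokens."""
    tokens = []
    age = profile.get("age")
    if age is not None:
        tokens.append(f"age-band:{(age // 10) * 10}s")  # e.g. 34 -> "age-band:30s"
    if profile.get("purchase_intent", 0.0) >= 0.8:  # arbitrary illustrative threshold
        tokens.append("high-intent-buyer")
    return tokens

# The agent sees only the coarse tokens, never the raw profile.
print(tokenize_profile({"name": "Ada", "age": 34, "purchase_intent": 0.9}))
# → ['age-band:30s', 'high-intent-buyer']
```

The point of the design is that the tokens are generalized enough to personalize an experience without identifying the individual.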

Heal Thy Self with Dr. G
When No One Tells You About Herpes! #378

Heal Thy Self with Dr. G

Play Episode Listen Later May 5, 2025 26:47


Over 70% of the world has herpes—yet it's still taboo. In this episode, Dr. G breaks down the truth about HSV-1 & HSV-2, from how it spreads to how to heal physically and emotionally. He shares the Heal Thyself protocol, featuring powerful supplements, nervous system tools, and mindset shifts to reduce outbreaks and reclaim your peace. #wellnessjourney #herpes #wellness ==== Thank You To Our Sponsors! Calroy: Head on over to calroy.com/drg and save over $50 when you purchase the Vascanox and Arterosil bundle! ==== Timestamps: 00:00 - Understanding the Herpes Virus 02:56 - Prevalence, Latency & Treatment 06:00 - Transmission: Myths & Facts 08:58 - Triggers, Treatments & Misconceptions 12:02 - Antiviral Drugs & Holistic Healing 15:09 - Treatment: Sleep, Stress & Supplements 18:09 - Natural Herpes Remedies 21:15 - Treatment & Emotional Roots 24:10 - Healing Herpes: Shame & Self-Ownership Be sure to like and subscribe to #HealThySelf Hosted by Doctor Christian Gonzalez N.D. Follow Doctor G on Instagram @doctor.gonzalez https://www.instagram.com/doctor.gonzalez/ Sign up for our newsletter! https://drchristiangonzalez.com/newsletter/

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Inside Deep Research with Isa Fulford: Building the Future of AI Agents

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Play Episode Listen Later Apr 24, 2025 30:45


On this episode of No Priors, Sarah sits down with Isa Fulford, one of the masterminds behind deep research. They unpack how the initiative began, the role of human expert data, and what it takes to build agents with real-world capability and even taste. Isa shares the differences between deep research and OpenAI's o3 model, the challenges around latency, and how she sees agent capabilities evolving. Plus, OpenAI has announced that deep research is free for all US users starting today. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @IsaFulf Show Notes: 0:00 Deep research's inception & evolution 6:12 Data creation  7:20 Reinforcement fine-tuning 9:05 Why human expert data matters 11:23 Failure modes of agents 13:55 The roadmap ahead for Deep Research 18:32 How do agents develop taste?  19:29 Experience and path to building a broadly capable agent 22:03 Deep research vs. o3 25:55 Latency 27:56 Predictions for agent capabilities

365 Message Center Show
Changes to Copilot Pages. Teams town hall ultra-low latency | Ep 375

365 Message Center Show

Play Episode Listen Later Apr 24, 2025 29:20


To keep you in your flow, M365 Copilot Pages will now open in the M365 Copilot App home, rather than opening in Loop. Keep curating and conversing with Copilot. Also, Teams town hall meetings that have been created using a Teams Premium license will enjoy a much lower latency between the production and audience experience.  - Microsoft Viva Connections: New Engage card in Connections dashboard  - Removing Microsoft 365 Copilot Actions from Targeted Release  - Microsoft Purview | Retiring Classic Content Search, Classic eDiscovery (Standard) Cases, Export PowerShell Parameters  - Microsoft 365 Copilot: Create a PowerPoint slide from a file or prompt  - Microsoft Teams Premium: Ultra-low latency (ULL) attendee experience for town halls  - Microsoft Teams: town hall organizers, co-organizers, presenters can join the event to preview as attendee  - Pages created in Microsoft 365 Copilot Chat will open in Microsoft 365 Copilot app  Join Daniel Glenn and Darrell as a Service Webster as they cover the latest messages in the Microsoft 365 Message Center.   Check out Darrell & Daniel's own YouTube channels at:  Darrell - https://youtube.com/modernworkmentor  Daniel - https://youtube.com/DanielGlenn   

Database School
Building a serverless database replica with Carl Sverre

Database School

Play Episode Listen Later Apr 18, 2025 88:59


Want to learn more SQLite? Check out my SQLite course: https://highperformancesqlite.com In this episode, Carl Sverre and I discuss why syncing everything is a bad idea and how his new project, Graft, makes edge-native, partially replicated databases possible. We dig into SQLite, object storage, transactional guarantees, and why Graft might be the foundation for serverless database replicas. SQLSync: https://sqlsync.dev Stop syncing everything blog post: https://sqlsync.dev/posts/stop-syncing-everything Graft: https://github.com/orbitinghail/graft Follow Carl: Twitter: https://twitter.com/carlsverre LinkedIn: https://www.linkedin.com/in/carlsverre Website: https://carlsverre.com/ Follow Aaron: Twitter: https://twitter.com/aarondfrancis LinkedIn: https://www.linkedin.com/in/aarondfrancis Website: https://aaronfrancis.com - find articles, podcasts, courses, and more. Chapters: 00:00 - Intro and Carl's controversial blog title 01:00 - Why “stop syncing everything” doesn't mean stop syncing 02:30 - The problem with full database syncs 03:20 - Quick recap of SQL Sync and multiplayer SQLite 04:45 - How SQL Sync works using physical replication 06:00 - The limitations that led to building Graft 09:00 - What is Graft? A high-level overview 16:30 - Syncing architecture: how Graft scales 18:00 - Graft's stateless design and Fly.io integration 20:00 - S3 compatibility and using Tigris as backend 22:00 - Latency tuning and express zone support 24:00 - Can Graft run locally or with Minio? 
27:00 - Page store vs meta store in Graft 36:00 - Index-aware prefetching in SQLite 38:00 - Prefetching intelligence: Graft vs driver 40:00 - The benefits of Graft's architectural simplicity 48:00 - Three use cases: apps, web apps, and replicas 50:00 - Sync timing and perceived latency 59:00 - Replaying transactions vs logical conflict resolution 1:03:00 - What's next for Graft and how to get involved 1:05:00 - Hacker News reception and blog post feedback 1:06:30 - Closing thoughts and where to find Carl

Mac Minutes
Episode 277, Lossless audio and ultra low latency audio come to AirPods Max; iOS 18.4 updates; and WWDC2025 announced

Mac Minutes

Play Episode Listen Later Apr 4, 2025 8:59


In this episode, we discuss three recent Apple announcements: a new software update that brings lossless audio and ultra-low-latency audio to AirPods Max, delivering the ultimate listening experience and even greater performance for music production; new operating system updates; and the scheduling of WWDC2025. Let's go to the show to learn more.

PodRocket - A web development podcast from LogRocket
Put your database in the browser with Ben Holmes

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Apr 3, 2025 32:25


Ben Holmes, product engineer at Warp, joins PodRocket to talk about local-first web apps and what it takes to run a database directly in the browser. He breaks down how moving data closer to the user can reduce latency, improve performance, and simplify frontend development. Learn about SQLite in the browser, syncing challenges, handling conflicts, and tools like WebAssembly, IndexedDB, and CRDTs. Plus, Ben shares insights from building his own SimpleSyncEngine and where local-first development is headed! Links https://bholmes.dev https://www.linkedin.com/in/bholmesdev https://www.youtube.com/@bholmesdev https://x.com/bholmesdev https://bsky.app/profile/bholmes.dev https://github.com/bholmesdev We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com, or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod). Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers! What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at https://logrocket.com/signup/?pdr. Special Guest: Ben Holmes.
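One of the syncing challenges mentioned above is merging concurrent edits from different replicas. A minimal last-write-wins merge, shown here as a generic baseline strategy (an illustrative sketch, not Ben's SimpleSyncEngine), might look like:

```python
def merge_lww(local: dict, remote: dict) -> dict:
    """Last-write-wins merge: for each key, keep whichever replica's
    value carries the newer timestamp. Values are (timestamp, payload) pairs."""
    merged = dict(local)
    for key, (ts, payload) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, payload)
    return merged

local = {"title": (5, "Draft"), "body": (9, "hello world")}
remote = {"title": (7, "Final"), "tags": (3, ["notes"])}
print(merge_lww(local, remote))
# → {'title': (7, 'Final'), 'body': (9, 'hello world'), 'tags': (3, ['notes'])}
```

Last-write-wins silently discards one side of a true conflict, which is exactly the limitation that motivates the CRDT approaches discussed in the episode.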

The Industrial Talk Podcast with Scott MacKenzie
Marcus McCarthy with Siemens Grid Software

The Industrial Talk Podcast with Scott MacKenzie

Play Episode Listen Later Apr 2, 2025 24:22 Transcription Available


Industrial Talk is onsite at DistribuTech 2025 and talking to Marcus McCarthy, Sr. Vice President at Siemens Grid Software, about "Energy Solutions for the Future". Scott MacKenzie and Marcus McCarthy discuss the evolving utility industry and the role of digital twins in improving efficiency and reliability. Marcus highlights the challenges of aging infrastructure, increased power demand, and the need for carbon removal. He emphasizes the importance of accurate digital models for better planning and decision-making. Marcus explains how Siemens' digital twin solutions enable real-time operations and scenario simulations, enhancing network management. They also touch on the practicality of cloud technology and the industry's readiness to adopt new technologies. The conversation underscores the urgency for utilities to invest in digital twins to meet future energy demands and optimize grid performance. Action Items [ ] Connect with Marcus McCarthy on LinkedIn or at marcus.mccarthy@siemens.com to discuss further [ ] Establish accurate digital models of the utility network (digital twins) with temporal stamping to simulate future scenarios [ ] Explore how high-energy consumption facilities like data centers can optimize their interaction with the grid Outline Introduction and Welcome Scott MacKenzie as a passionate industry professional dedicated to transferring cutting-edge industry innovations and trends. Scott MacKenzie welcomes listeners to the Industrial Talk Podcast, highlighting the celebration of industry professionals worldwide. Scott mentions the podcast is brought to you by Siemens Smart Infrastructure and Grid Software, encouraging listeners to visit siemens.com for more information. Scott and Marcus discuss the massive scale of the DistribuTech conference in Dallas, Texas, and Scott's limited time to explore the solutions. Background on Marcus McCarthy Marcus shares his background, mentioning his move from Ireland to the US about 12 years ago.
Marcus discusses his career in utilities, focusing on distribution and transmission software systems. Scott and Marcus agree on the positive aspects of the utility industry, including the people and the current market dynamics. Marcus reflects on the industry's shift from a quiet period to a time of rapid change and innovation. Challenges and Pressures in the Utility Industry Marcus highlights the increasing demand for power and the need for safe and reliable delivery. Scott and Marcus discuss the challenges of aging infrastructure and the need for modernization. Marcus explains the complexities of meeting future power demands while addressing carbon removal efforts. Scott shares his experience as a lineman and the evolution of the utility industry from a linear design to a more distributed energy system. Digital Twin and Its Importance Scott expresses his enthusiasm for digital twin technology and its potential for simulation and decision-making. Marcus explains the critical role of digital twin in achieving faster, better decision-making in complex environments. Marcus discusses the importance of standardized models and the sharing of planning data among different players in the industry. Marcus highlights the need for real-time operations and the challenges of integrating planning and operations data. Cloud Technology and Latency

HPE Tech Talk
Manufacturing and the struggle to become 'smart'

HPE Tech Talk

Play Episode Listen Later Mar 27, 2025 21:30


In this episode we are looking at a sector where IT and tech innovation is taking efficiency to a whole new level - manufacturing. Manufacturing is in a precarious position as an industry. In the global north, growth is largely stagnant, according to UN statistics. Even in high-growth economies like China, it's slowing down. It's also notoriously inefficient. So, can tech help? And if so, what does that look like? Joining us to discuss is Dan Klein, an advisor on data and digital transformation with a special interest in the manufacturing sector. This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it. About this week's guest, Dan Klein: https://www.linkedin.com/in/dplklein/?originalSubdomain=uk Sources cited in this week's episode: UN stats on the state of global manufacturing: https://stat.unido.org/portal/storage/file/publications/qiip/World_Manufacturing_Production_2024_Q1.pdf Statista report on global manufacturing and efficiency: https://www.statista.com/outlook/io/manufacturing/worldwide Water on Mars: https://pubs.geoscienceworld.org/gsa/geology/article/52/12/939/648640/Seismic-discontinuity-in-the-Martian-crust

Tech behind the Trends on The Element Podcast | Hewlett Packard Enterprise
Manufacturing and the struggle to become 'smart'

Tech behind the Trends on The Element Podcast | Hewlett Packard Enterprise

Play Episode Listen Later Mar 27, 2025 21:30


In this episode we are looking at a sector where IT and tech innovation is taking efficiency to a whole new level - manufacturing. Manufacturing is in a precarious position as an industry. In the global north, growth is largely stagnant, according to UN statistics. Even in high-growth economies like China, it's slowing down. It's also notoriously inefficient. So, can tech help? And if so, what does that look like? Joining us to discuss is Dan Klein, an advisor on data and digital transformation with a special interest in the manufacturing sector. This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it. About this week's guest, Dan Klein: https://www.linkedin.com/in/dplklein/?originalSubdomain=uk Sources cited in this week's episode: UN stats on the state of global manufacturing: https://stat.unido.org/portal/storage/file/publications/qiip/World_Manufacturing_Production_2024_Q1.pdf Statista report on global manufacturing and efficiency: https://www.statista.com/outlook/io/manufacturing/worldwide Water on Mars: https://pubs.geoscienceworld.org/gsa/geology/article/52/12/939/648640/Seismic-discontinuity-in-the-Martian-crust

HPE Tech Talk, SMB
Manufacturing and the struggle to become 'smart'

HPE Tech Talk, SMB

Play Episode Listen Later Mar 27, 2025 21:30


In this episode we are looking at a sector where IT and tech innovation is taking efficiency to a whole new level - manufacturing. Manufacturing is in a precarious position as an industry. In the global north, growth is largely stagnant, according to UN statistics. Even in high-growth economies like China, it's slowing down. It's also notoriously inefficient. So, can tech help? And if so, what does that look like? Joining us to discuss is Dan Klein, an advisor on data and digital transformation with a special interest in the manufacturing sector. This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it. About this week's guest, Dan Klein: https://www.linkedin.com/in/dplklein/?originalSubdomain=uk Sources cited in this week's episode: UN stats on the state of global manufacturing: https://stat.unido.org/portal/storage/file/publications/qiip/World_Manufacturing_Production_2024_Q1.pdf Statista report on global manufacturing and efficiency: https://www.statista.com/outlook/io/manufacturing/worldwide Water on Mars: https://pubs.geoscienceworld.org/gsa/geology/article/52/12/939/648640/Seismic-discontinuity-in-the-Martian-crust

Unexplored Territory
#093 - Best practices for Latency Sensitive Workloads featuring Mark A!

Unexplored Territory

Play Episode Listen Later Mar 23, 2025 36:10


A new white paper was recently released on the topic of latency-sensitive workloads. I invited Mark Achtemichuk (X, LinkedIn) to the show to go over the various recommendations and best practices. Mark highlights many important configuration settings and recommends that everyone read not only the white paper but also the vSphere 8 performance documentation. His VMware Explore session also comes highly recommended; make sure to watch it! Disclaimer: The thoughts and opinions shared in this podcast are our own/guest(s)', and not necessarily those of Broadcom or VMware by Broadcom.

No Sharding - The Solana Podcast
Increase Bandwidth, Reduce Latency w/ Mateo Ward and Andrew McConnell (Malbec Labs)

No Sharding - The Solana Podcast

Play Episode Listen Later Feb 25, 2025 60:20


In this episode of Validated, Austin discusses his new venture DoubleZero with co-founders Andrew McConnell and Matteo Ward. They discuss the necessity of creating high-performance networking specifically tailored for blockchains, comparing it to the traditional internet and private networks. They delve into their backgrounds and how their experiences in telecom and high-frequency trading influence the development of DoubleZero. This episode covers various technical topics including the limitations of the public internet, the importance of a purpose-built network, and how DoubleZero provides decentralized, efficient, and secure connectivity for blockchain validators. DISCLAIMER The content herein is provided for educational, informational, and entertainment purposes only, and does not constitute an offer to sell or a solicitation of an offer to buy any securities, options, futures, or other derivatives related to securities in any jurisdiction, nor should it be relied upon as advice to buy, sell or hold any of the foregoing. This content is intended to be general in nature and is not specific to you, the user or anyone else. You should not make any decision, financial, investment, trading or otherwise, based on any of the information presented without undertaking independent due diligence and consultation with a professional advisor. The Solana Foundation and its agents, advisors, council members, officers and employees (the “Foundation Parties”) make no representation or warranties, expressed or implied, as to the accuracy of the information herein and expressly disclaim any and all liability that may be based on such information or any errors or omissions therein. The Foundation Parties shall have no liability whatsoever, under contract, tort, trust or otherwise, to any person arising from or related to the content or any use of the information contained herein by you or any of your representatives.
All opinions expressed herein are the speakers' own personal opinions and do not reflect the opinions of any entities.

This Week in Neuroscience
TWiN 57: Repetitive injury, herpes, and Alzheimer's

This Week in Neuroscience

Play Episode Listen Later Feb 4, 2025 40:09


TWiN discusses a study showing that repetitive injury reactivates HSV-1 in a human brain tissue model and induces phenotypes associated with Alzheimer's disease. Hosts: Vincent Racaniello and Tim Cheung Subscribe (free): Apple Podcasts, Google Podcasts, RSS Links for this episode MicrobeTV Discord Server Repetitive injury, herpes, and Alzheimers (Sci Signal) The tau of herpesvirus (TWiV 1187) Fishing for viruses in senile (TWiV 519) Timestamps by Jolene Ramsey. Thanks! Music is by Ronald Jenkees Send your neuroscience questions and comments to twin@microbe.tv

The Six Five with Patrick Moorhead and Daniel Newman
The View from Davos with Qualcomm's Cristiano Amon

The Six Five with Patrick Moorhead and Daniel Newman

Play Episode Listen Later Jan 24, 2025 15:12


What's Qualcomm's CEO Cristiano Amon saying from Davos? He has great optimism for growth and sees a crucial role for collaboration between public-private partnerships in driving progress. Find out why below. Hosts Daniel Newman and Patrick Moorhead are back with another interview on The View From Davos. They met up with Qualcomm's Cristiano Amon, President and Chief Executive Officer, to discuss the latest tech advancements and market trends observed at this year's WEF. Cristiano shares his valuable insights from the forum, including his optimism for growth and the crucial role of collaboration between public-private partnerships in driving progress. Check out the full interview for more on: AI in real-world applications and tangible value creation are top of mind for business and government leaders Edge computing is key to unlocking AI's potential: Latency, privacy, and cost are driving a shift towards distributed computing power The lines between cloud and edge are blurring Qualcomm's role in powering AI innovation across industries, from mobile to automotive to industrial IoT A new era of IoT is dawning: Advances in AI, edge computing, and connectivity are creating opportunities for a resurgence of the Internet of Things.

Packet Pushers - Full Podcast Feed
Tech Bytes: Can SD-WAN Solve Latency Issues for Modern Applications? (Sponsored)

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Jan 20, 2025 21:08


Traditional SD-WAN ensures that business-critical apps get the best-performing network path to deliver a good user experience and meet service levels. But as SaaS and cloud adoption increase, the best path across a WAN may not be enough. Techniques like WAN ops and legacy caching techniques may have worked for enterprise or private apps, but... Read more »

Packet Pushers - Briefings In Brief
Tech Bytes: Can SD-WAN Solve Latency Issues for Modern Applications? (Sponsored)

Packet Pushers - Briefings In Brief

Play Episode Listen Later Jan 20, 2025 21:08


Traditional SD-WAN ensures that business-critical apps get the best-performing network path to deliver a good user experience and meet service levels. But as SaaS and cloud adoption increase, the best path across a WAN may not be enough. Techniques like WAN ops and legacy caching techniques may have worked for enterprise or private apps, but... Read more »

Environment Variables
Finding Signal Amongst the Noise in Carbon Aware Software

Environment Variables

Play Episode Listen Later Jan 9, 2025 35:10


In this episode of Environment Variables, host Chris Adams is joined by Tammy Sukprasert, a PhD student at the University of Massachusetts Amherst, to dive deep into her research on carbon-aware computing. Tammy explores the concept of shifting computing workloads across time and space to reduce carbon emissions, focusing on the benefits and limitations of this approach. She explains how moving workloads to cleaner regions or delaying them until cleaner energy sources are available can help cut emissions, but also discusses the challenges that come with real-world constraints like server capacity and latency. Together they discuss the findings from her recent papers, including the differences between average and marginal carbon intensity signals and how they impact decision-making. The conversation highlights the complexity of achieving carbon savings and the need for better metrics and strategies in the world of software development.
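The space-time shifting Tammy describes amounts to a constrained optimization: among candidate regions and time slots, pick the one with the lowest carbon intensity that still meets a latency bound. A toy sketch of that decision (the slot fields and all the numbers are made up for illustration):

```python
def pick_slot(slots, max_latency_ms):
    """Among candidate (region, time) slots, return the one with the lowest
    carbon intensity that still satisfies the latency constraint."""
    feasible = [s for s in slots if s["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None  # no slot is both clean enough and close enough
    return min(feasible, key=lambda s: s["gco2_per_kwh"])

slots = [
    {"region": "us-east", "latency_ms": 20, "gco2_per_kwh": 450},
    {"region": "eu-north", "latency_ms": 90, "gco2_per_kwh": 40},
    {"region": "us-west", "latency_ms": 60, "gco2_per_kwh": 210},
]
print(pick_slot(slots, max_latency_ms=70)["region"])  # → us-west
```

With a 70 ms cap, the cleanest region (eu-north) is infeasible, illustrating the tension between carbon savings and real-world constraints like latency that the episode examines.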

Mixing Music with Dee Kei | Audio Production, Technical Tips, & Mindset
Does Latency Play a Role in Parallel Compression?

Mixing Music with Dee Kei | Audio Production, Technical Tips, & Mindset

Play Episode Listen Later Jan 8, 2025 14:17


Thank you for being a subscriber to this exclusive content! ⁠SUBSCRIBE TO YOUTUBE⁠ ⁠Join the ‘Mixing Music Podcast' Discord!⁠ ⁠HIRE DEE KEI⁠ ⁠HIRE JAMES⁠ Find Dee Kei, Braeden, and James on social media: Instagram: ⁠@DeeKeiMixes⁠  ⁠@JamesDeanMixes⁠ Twitter: ⁠@DeeKeiMixes⁠  ⁠CHECK OUT OUR OTHER RESOURCES⁠ Join the ‘Mixing Music Podcast' Group: ⁠Discord⁠ & ⁠Facebook⁠ The Mixing Music Podcast is sponsored by ⁠Izotope⁠, ⁠Antares (Auto Tune)⁠, ⁠Plugin Boutique⁠, ⁠Lauten Audio⁠, ⁠Spreaker⁠, ⁠Filepass⁠, & ⁠Canva⁠ The Mixing Music Podcast is a video and audio series on the art of music production and post-production. Dee Kei and Lu are both professionals in the Los Angeles music industry having worked with names like Keyshia Cole, Trey Songz, Ray J, Smokepurrp, Benny the Butcher, Sueco the Child, Ari Lennox, G-Eazy, Phresher, Lucky Daye, DDG, Lil Xan, Masego, $SNOT, Kanye West, King Kanja, Dreamville, BET, Universal Music, Interscope Records, etc. This video podcast is meant to be used for educational purposes only. This show is filmed at IN THE MIX STUDIOS located in North Hollywood, California. If you would like to sponsor the show, please email us at ⁠deekeimixes@gmail.com⁠.

Python Bytes
#415 Just put the fries in the bag bro

Python Bytes

Play Episode Listen Later Dec 23, 2024 32:59 Transcription Available


Topics covered in this episode: dbos-transact-py; Typed Python in 2024: Well adopted, yet usability challenges persist; RightTyper; Lazy self-installing Python scripts with uv; Extras; Joke. Watch on YouTube.
About the show: Sponsored by us! Support our work through our courses at Talk Python Training, The Complete pytest Course, and our Patreon supporters. Connect with the hosts: Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky); Brian: @brianokken@fosstodon.org / @brianokken.bsky.social; Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky). Join us on YouTube at pythonbytes.fm/live to be part of the audience, usually Monday at 10am PT. Older video versions are available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.
Michael #1: dbos-transact-py. DBOS Transact is a Python library providing ultra-lightweight durable execution. Durable execution means your program is resilient to any failure: if it is ever interrupted or crashes, all your workflows will automatically resume from the last completed step. Under the hood, DBOS Transact works by storing your program's execution state (which workflows are currently executing and which steps they've completed) in a Postgres database. It is incredibly fast, for example 25x faster than AWS Step Functions.
Brian #2: Typed Python in 2024: Well adopted, yet usability challenges persist. Aaron Pollack on the Engineering at Meta blog. Overall findings: 88% of respondents "always" or "often" use types in their Python code. IDE tooling, documentation, and catching bugs are drivers for the high adoption of types in survey responses. The usability of types and the ability to express complex patterns are still challenges that leave some code unchecked. Latency in tooling and a lack of types in popular libraries are limiting the effectiveness of type checkers. Inconsistency in type-checker implementations and poor discoverability of the documentation create friction when onboarding types into a project and when seeking help using the tools. Notes: this seems to be a different survey than the 2023 (current) developer survey, with a different time frame and results (July 29 - October 8, 2024).
Michael #3: RightTyper. A fast and efficient type assistant for Python, including tensor shape inference.
Brian #4: Lazy self-installing Python scripts with uv. Trey Hunner. Creating your own ~/bin full of single-file command-line scripts is common for *nix folks, still powerful but underutilized on Mac, and trickier but still useful on Windows. Python has been difficult to use for standalone scripts in the past if you need dependencies, but that's no longer the case with uv. Trey walks through user scripts (*nix and Mac), using #! for scripts that don't have dependencies, using #! with uv run --script and /// script for dependencies, and a discussion of how uv handles that.
Extras: Brian: courses at pythontest.com (if you live in a place, or are in a place in your life, where these prices are too much, let me know; I had a recent request and I really appreciate it). Michael: Python 3.14 update released; top episodes of 2024 at Talk Python; universal check for updates on macOS: Settings > Keyboard > Keyboard shortcuts > App shortcuts > +, then add a shortcut for a single app, ^U, and the menu title. Joke: Python with rizz.
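The durable-execution idea behind dbos-transact-py can be sketched without the library: record each completed step (and its result) in a database, and on a re-run return stored results instead of re-executing. The toy version below is illustrative only; it is not the DBOS API, the `steps` table and `DurableWorkflow` helper are invented for the example, and SQLite stands in for Postgres.

```python
import sqlite3


class DurableWorkflow:
    """Toy durable execution: completed steps are recorded in a database,
    so a crashed-and-restarted run resumes after the last completed step."""

    def __init__(self, db_path: str, workflow_id: str):
        # ":memory:" is fine for a demo; surviving a real crash requires
        # a file path (or, as in DBOS Transact itself, a Postgres database).
        self.conn = sqlite3.connect(db_path)
        self.workflow_id = workflow_id
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS steps ("
            "workflow_id TEXT, step TEXT, result TEXT, "
            "PRIMARY KEY (workflow_id, step))"
        )

    def step(self, name, fn):
        # If this step already completed in a previous run, return the
        # stored result instead of executing fn again.
        row = self.conn.execute(
            "SELECT result FROM steps WHERE workflow_id = ? AND step = ?",
            (self.workflow_id, name),
        ).fetchone()
        if row is not None:
            return row[0]
        result = fn()
        self.conn.execute(
            "INSERT INTO steps VALUES (?, ?, ?)",
            (self.workflow_id, name, result),
        )
        self.conn.commit()
        return result


# First run: both steps execute. A rerun with the same workflow_id and a
# persistent database would skip them and return the stored results.
wf = DurableWorkflow(":memory:", "order-42")
charged = wf.step("charge", lambda: "charged $10")
shipped = wf.step("ship", lambda: "shipped")
```

In DBOS Transact itself this bookkeeping is handled by the library against Postgres, which is how interrupted workflows resume from the last completed step.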
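The shebang-plus-inline-metadata pattern Trey describes can be sketched as a single file. The `slugify` script itself is a made-up example; a script needing third-party packages would list them under `dependencies` in the `/// script` block, and uv would install them on first run.

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = []
# ///
# After `chmod +x`, running this file directly hands it to uv, which
# creates (and caches) an environment matching the metadata above
# before executing the script inside it.
import re
import sys


def slugify(text: str) -> str:
    """Lowercase, drop punctuation, and join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)


if __name__ == "__main__":
    print(slugify(" ".join(sys.argv[1:])))
```

With an empty dependencies list this behaves like a plain script; adding entries (say, a hypothetical `"requests"`) is what makes it "lazily self-installing."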

Get Your Tech On
Unleash Ultra-Fast Internet: Low Latency DOCSIS Explained

Get Your Tech On

Play Episode Listen Later Dec 11, 2024


We had a very busy episode with a lot of watcher questions. Watch or listen as Brady Volpe and John Downey go in depth on the latest research on low-latency DOCSIS technology presented in "Latency Outcomes Across Access Network Architectures" at SCTE TechExpo! We'll be breaking down the key findings of 'Latency Measurement - Baselines'. The post Unleash Ultra-Fast Internet: Low Latency DOCSIS Explained appeared first on Volpe Firm.

New Books Network
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books Network

Play Episode Listen Later Dec 8, 2024 72:04


Over the last seven decades, some states successfully leveraged the threat of acquiring atomic weapons to compel concessions from superpowers. For many others, however, this coercive gambit failed to work. When does nuclear latency--the technical capacity to build the bomb--enable states to pursue effective coercion? In Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology (Oxford UP, 2023), Tristan A. Volpe argues that having greater capacity to build weaponry doesn't translate to greater coercive advantage. Volpe finds that there is a trade-off between threatening proliferation and promising nuclear restraint. States need just enough bomb-making capacity to threaten proliferation but not so much that it becomes too difficult for them to offer nonproliferation assurances. The boundaries of this sweet spot align with the capacity to produce the fissile material at the heart of an atomic weapon. To test this argument, Volpe includes comparative case studies of four countries that leveraged latency against superpowers: Japan, West Germany, North Korea, and Iran. Volpe identifies a generalizable mechanism--the threat-assurance trade-off--that explains why more power often makes compellence less likely to work. Volpe proposes a framework that illuminates how technology shapes broader bargaining dynamics and helps to refine policy options for inhibiting the spread of nuclear weapons. As nuclear technology continues to cast a shadow over the global landscape, Leveraging Latency systematically assesses its coercive utility. Our guest today is Tristan Volpe, an Assistant Professor in the Defense Analysis Department at the Naval Postgraduate School and a nonresident fellow in the Nuclear Policy Program at the Carnegie Endowment for International Peace. Our host is Eleonora Mattiacci, an Associate Professor of Political Science at Amherst College. She is the author of "Volatile States in International Politics" (Oxford University Press, 2023). 
Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/new-books-network

New Books in Political Science
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in Political Science

Play Episode Listen Later Dec 8, 2024 72:04



New Books in World Affairs
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in World Affairs

Play Episode Listen Later Dec 8, 2024 72:04



New Books in National Security
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in National Security

Play Episode Listen Later Dec 8, 2024 72:04



New Books in Science, Technology, and Society
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in Science, Technology, and Society

Play Episode Listen Later Dec 8, 2024 72:04



New Books in Korean Studies
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in Korean Studies

Play Episode Listen Later Dec 8, 2024 72:04



New Books in Diplomatic History
Tristan A. Volpe, "Leveraging Latency: How the Weak Compel the Strong with Nuclear Technology" (Oxford UP, 2023)

New Books in Diplomatic History

Play Episode Listen Later Dec 8, 2024 72:04



Automating Scientific Discovery, with Andrew White, Head of Science at Future House

Play Episode Listen Later Dec 5, 2024 118:32


In this episode of The Cognitive Revolution, Nathan interviews Andrew White, Professor of Chemical Engineering at the University of Rochester and Head of Science at Future House. We explore groundbreaking AI systems for scientific discovery, including PaperQA and Aviary, and discuss how large language models are transforming research. Join us for an insightful conversation about the intersection of AI and scientific advancement with this pioneering researcher in his first-ever podcast appearance. Check out Future House: https://www.futurehouse.org Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
CHAPTERS: (00:00:00) Teaser (00:01:13) About the Episode (00:04:37) Andrew White's Journey (00:10:23) GPT-4 Red Team (00:15:33) GPT-4 & Chemistry (00:17:54) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote (00:20:19) Biology vs Physics (00:23:14) Conceptual Dark Matter (00:26:27) Future House Intro (00:30:42) Semi-Autonomous AI (00:35:39) Sponsors: Shopify (00:37:00) Lab Automation (00:39:46) In Silico Experiments (00:45:22) Cost of Experiments (00:51:30) Multi-Omic Models (00:54:54) Scale and Grokking (01:00:53) Future House Projects (01:10:42) Paper QA Insights (01:16:28) Generalizing to Other Domains (01:17:57) Using Figures Effectively (01:22:01) Need for Specialized Tools (01:24:23) Paper QA Cost & Latency (01:27:37) Aviary: Agents & Environments (01:31:42) Black Box Gradient Estimation (01:36:14) Open vs Closed Models (01:37:52) Improvement with Training (01:40:00) Runtime Choice & Q-Learning (01:43:43) Narrow vs General AI (01:48:22) Future Directions & Needs (01:53:22) Future House: What's Next? (01:55:32) Outro
SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

The Evolution of AI Agents: Lessons from 2024, with MultiOn CEO Div Garg

Play Episode Listen Later Dec 3, 2024 90:21


In this episode of The Cognitive Revolution, Nathan welcomes back Div Garg, Co-Founder and CEO of MultiOn, for his third appearance to discuss the evolving landscape of AI agents. We explore how agent development has shifted from open-ended frameworks to intelligent workflows, MultiOn's unique approach to agent development, and their journey toward achieving human-level performance. Dive into fascinating insights about data collection strategies, model fine-tuning techniques, and the future of agent authentication. Join us for an in-depth conversation about why 2025 might be the breakthrough year for AI agents. Check out MultiOn: https://www.multion.ai/ Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS: (00:00:00) Teaser (00:00:40) About the Episode (00:04:10) The Rise of AI Agents (00:06:33) Open-Ended vs On-Rails (00:10:00) Agent Architecture (00:12:01) AI Learning & Feedback (00:14:01) Data Collection (Part 1) (00:18:27) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote (00:20:51) Data Collection (Part 2) (00:22:25) Self-Play & Rewards (00:25:04) Model Strategy & Agent Q (00:33:28) Sponsors: Weights & Biases RAG++ (00:34:39) Understanding Agent Q (00:43:16) Search & Learning (00:45:39) Benchmarks vs Reality (00:50:18) Positive Transfer & Scale (00:51:47) Fine-Tuning Strategies (00:55:16) Vision Strategy (01:00:16) Authentication & Security (01:03:48) Future of AI Agents (01:16:14) Cost, Latency, Reliability (01:19:30) Avoiding the Bitter Lesson (01:25:58) Agent-Assisted Future (01:27:11) Outro
SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

HPE Tech Talk
Private Cellular Networking and the future of secure wide-area networks

HPE Tech Talk

Play Episode Listen Later Nov 28, 2024 17:18


In this episode we are looking at private cellular networks, a hot topic in the networking space. In 2023, the 5G private network market was worth $2 billion. That's expected to grow to over $30 billion by 2030 (see the Kaleido report below), despite 5G being unlikely to overtake 4G as the dominant private networking technology until 2027. So, why is private 5G networking such a growth area, and what could it mean for our organizations? Joining us to discuss is Richard Band, HPE's Senior Sales Director for Private Networking in Europe, the Middle East, Africa, and Latin America. This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it. Do you have a question for the expert? Ask it here using this Google form: https://forms.gle/8vzFNnPa94awARHMA
About this week's guest: Richard Band: https://www.linkedin.com/in/richardband76?originalSubdomain=fr
Sources cited in this week's episode:
Kaleido Intelligence report into 5G Private Networks: https://kaleidointelligence.com/private-cellular-networks-annual-spend/
GrandView research into 5G Private Networks: https://kaleidointelligence.com/private-cellular-networks-annual-spend/
Uranus' unusual moons: https://www.nature.com/articles/s41550-024-02389-3#:~:text=The%20inner%20three%20of%20the,present%20beneath%20their%20surfaces47%2C

Tech behind the Trends on The Element Podcast | Hewlett Packard Enterprise
Private Cellular Networking and the future of secure wide-area networks

Tech behind the Trends on The Element Podcast | Hewlett Packard Enterprise

Play Episode Listen Later Nov 28, 2024 17:18



HPE Tech Talk, SMB
Private Cellular Networking and the future of secure wide-area networks

HPE Tech Talk, SMB

Play Episode Listen Later Nov 28, 2024 17:18



Designing the Future: Inside Canva's AI Strategy with John Milinovich, GenAI Product Lead at Canva

Play Episode Listen Later Nov 23, 2024 86:05


Nathan explores the world of AI-powered design with John Milinovich, Head of Generative AI Product at Canva. In this episode of The Cognitive Revolution, we dive into Canva's innovative approach to AI integration, from task automation to human augmentation. Join us for an insightful discussion about fine-tuning foundation models, AI's impact on architecture, and practical tips for AI product development at scale. Check out Canva: https://www.canva.com Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess
SPONSORS:
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Incogni: Take your personal data back with Incogni! Use code REVOLUTION at the link below and get 60% off an annual plan: https://incogni.com/revolution
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
Brave: The Brave search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference, all while remaining affordable with developer-first pricing. Integrating the Brave search API into your workflow translates to more ethical data sourcing and more human-representative data sets. Try the Brave search API for free for up to 2000 queries per month at https://bit.ly/BraveTCR
RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS: (00:00:00) Teaser (00:01:13) Sponsors: SelectQuote (00:02:26) About the Episode (00:04:33) Introduction - Creativity vs Design (00:08:39) AI-Assisted Experiences (00:10:25) Automation & Augmentation (00:15:27) Pixels to Objects to Concepts (00:17:58) Sponsors: Incogni | Shopify (00:20:40) Concept-Level Interfaces (00:23:35) The Future of Design (00:29:39) Human Element in Design (00:32:49) AI Talking to AI (00:35:52) Sponsors: Oracle Cloud Infrastructure (OCI) | Brave (00:38:04) Purpose-Specific AI Experiences (00:45:29) GPT-4 Image Editing (00:51:17) Graduated Approach to Launch (00:55:09) Fine-Tuning GPT-4 (00:59:10) Cost & Latency (01:01:29) Hiring AI Engineers (01:05:02) Engineering Best Practices (01:09:00) Inspiration in the AI Space (01:18:28) The Gen AI Application Layer (01:24:26) Outro
SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

No Latency
The Bullet Train PT2 - No Latency Live #SDCC2024

No Latency

Play Episode Listen Later Nov 20, 2024 56:50


No Latency Live at SDCC24! Part 2! The Crew and some new and old friends take on the Bullet Train. Domino needs them to steal a rare piece of Cybertech, and there's very little time before the train leaves the city and it's gone for good. Jade, Evan and Tracie are joined by Dayeanne Hutton, Lemar the Con Guy and Utahmie live on stage at San Diego Comic Con 2024! With a special Cameo performance from Spoon as Jeb. Full video versions of this episode and Part 2 (coming next week) are available on our Youtube Channel. More info can be found here: linktr.ee/NoLatency If you'd like to support us, we now have a Patreon! Patreon.com/nolatency Even more information and MERCH is on our website! www.nolatencypodcast.com Twitter: @nolatencypod Instagram: @nolatencypod Find @SkullorJade, @Miss_Magitek and @Binary_Dragon, @retrodatv on twitch, for live D&D, TTRPGs and more. #cyberpunkred #actualplay #ttrpg #radioplay #scifi #cyberpunk #drama #comedy #LIVE #SDCC24

Solana Weekly
Solana Weekly #92 - How Chris is Increasing Bandwidth and Reducing Latency (IBRL) On Solana with TitanDex

Solana Weekly

Play Episode Listen Later Nov 20, 2024 55:53


No Latency
The Bullet Train PT1 - No Latency Live #SDCC2024

No Latency

Play Episode Listen Later Nov 13, 2024 46:55


No Latency Live at SDCC24! The Crew and some new and old friends take on the Bullet Train. Domino needs them to steal a rare piece of Cybertech, and there's very little time before the train leaves the city and it's gone for good. Jade, Evan and Tracie are joined by Dayeanne Hutton, Lemar the Con Guy and Utahmie live on stage at San Diego Comic Con 2024! With a special Cameo performance from Spoon as Jeb. Full video versions of this episode and Part 2 (coming next week) are available on our Youtube Channel. More info can be found here: linktr.ee/NoLatency If you'd like to support us, we now have a Patreon! Patreon.com/nolatency Even more information and MERCH is on our website! www.nolatencypodcast.com Twitter: @nolatencypod Instagram: @nolatencypod Find @SkullorJade, @Miss_Magitek and @Binary_Dragon, @retrodatv on twitch, for live D&D, TTRPGs and more. #cyberpunkred #actualplay #ttrpg #radioplay #scifi #cyberpunk #drama #comedy #LIVE #SDCC24

Unleashed - How to Thrive as an Independent Professional
586. Sanjay Iyer, How to Analyze a Telecommunications Company

Unleashed - How to Thrive as an Independent Professional

Play Episode Listen Later Nov 11, 2024 30:56


Sanjay Iyer, a consultant for 25 years, discusses the evolution of telecommunications companies, focusing on network, infrastructure, quality, and coverage analysis. He explains that coverage is the first aspect of a network, determining the reach and number of homes it can deliver service to. The structure of networks has evolved over the years, with different types of networks for broadband, such as fiber to the home, hybrid fiber coax, and fixed wireless access. Assessing the Infrastructure Quality Sanjay explains the process of assessing the infrastructure quality of a telecommunications company, which involves evaluating speeds, latency, and other factors such as the density of homes in the neighborhood. Speeds are rated in megabits per second, but factors like the number of people using television, the density of homes, and latency can affect the speed of upstream and downstream packets. Latency is another factor that reflects systemic network design quality. Sanjay also mentions that there are temporary issues in a coax network, such as fluctuating noise and overhead versus underground cables. To understand the total quality of a network, it is essential to separate temporary issues from systemic problems. He suggests measuring quality at the home level, rather than at the broad network level. Network Assessment Factors Sanjay explains the importance of assessing network outcomes such as latency and speed when choosing a provider, and why companies should focus on outcome metrics and infrastructure quality. He then discusses two further metrics: capital expenditure efficiency and network upgrades. Sanjay explains that networks have been continuously groomed and expanded to deliver more bandwidth over the years; understanding how operators have done this historically, and what it will take and cost to achieve the gold standard of one gigabit per second downstream to every home, is crucial.  
Challenges Faced when Analyzing Networks The conversation turns to the challenges companies face in analyzing their own networks, as there is no single source of truth for determining their network coverage. One challenge is the cost of bandwidth, which can be expensive and unpredictable. To get the bandwidth right, companies must build a capex efficiency model, which assumes an average number of households per node and extrapolates it to the entire country. This model is often incorrect, leading to unpredictable network costs. Another challenge is fiber optic and broadband penetration analysis. The Federal Communications Commission has created a national database that tracks every household's speed and coverage from service providers. This information is publicly available and can be used to analyze homes and serviceable locations. The FCC has also created a service coverage map at a national scale, which can be used to allocate government capital to underserved areas and subsidize network bills. Analyzing Market Share Sanjay discusses the process of analyzing market share in a given market. He uses the FCC database to measure network footprint, focusing on census block group levels to determine customer penetration. Machine learning is particularly useful here, providing insights from customer profiles and economic or household-level information that can help predict underperformance, overperformance, and areas for improvement. Iyer is currently working on building tools to predict the ROI of broadband investments, analyzing existing footprints and adjacent locations, and predicting expansion paths. He is also involved in generative AI, which is popular but not widely adopted due to issues with LLM tech adoption. 
Iyer is developing a governance model that looks at all aspects of Gen AI, from use cases to production and costs, and is building products with an AI-first approach, using tools like ChatGPT to develop software products based on specific requirements. Timestamps: 04:30: Assessing Infrastructure Quality and Network Economics 08:37: Capital Expenditure Efficiency and Network Upgrades  13:27: Challenges in Network Data Availability  17:52: Fiber Optic and Broadband Penetration Analysis  21:21: Customer Churn Rate and Retention Strategy 25:45: Subscriber-Based Growth and Market Share Analysis  27:32: Sanjay Iyer's Current Practice and AI Focus    Links: LinkedIn: https://www.linkedin.com/in/sanjay-iyer/ Website: https://www.combinatree.com/ Resource: https://umbrex.com/resources/how-to-analyze-a-telecommunications-company/ Unleashed is produced by Umbrex, which has a mission of connecting independent management consultants with one another, creating opportunities for members to meet, build relationships, and share lessons learned. Learn more at www.umbrex.com.  
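The capex efficiency model described in the episode — a single average households-per-node figure extrapolated across an entire footprint — can be sketched in a few lines. This is an illustrative sketch only; the function name and all figures below are hypothetical, not from the episode:

```python
def upgrade_cost_per_home(total_homes: float, avg_homes_per_node: float,
                          cost_per_node_upgrade: float) -> float:
    """Naive capex-efficiency estimate: assume one national average
    node density and extrapolate a per-node upgrade cost across it."""
    nodes = total_homes / avg_homes_per_node
    total_capex = nodes * cost_per_node_upgrade
    return total_capex / total_homes

# Illustrative numbers: 1M homes passed, ~250 homes per node,
# $50k to upgrade a node to gigabit downstream.
cost = upgrade_cost_per_home(1_000_000, 250, 50_000)
print(cost)  # dollars of capex per home passed
```

As the conversation points out, the single national average is exactly where this model goes wrong: node densities vary widely by neighborhood, so a per-node figure that is right on average can still badly misprice the build in any given market.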

Packet Pushers - Full Podcast Feed
N4N002: Bandwidth and Latency Explained

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Nov 7, 2024 24:42


In this episode of N Is For Networking, co-hosts Ethan Banks and Holly Metlitzky take a question from college student Douglas that turns into a ride on the networking highway as they navigate the lanes of bandwidth and latency. Ethan and Holly define the concepts of bandwidth and latency and discuss current data transfer protocols... Read more »

Packet Pushers - Fat Pipe
N4N002: Bandwidth and Latency Explained

Packet Pushers - Fat Pipe

Play Episode Listen Later Nov 7, 2024 24:42


In this episode of N Is For Networking, co-hosts Ethan Banks and Holly Metlitzky take a question from college student Douglas that turns into a ride on the networking highway as they navigate the lanes of bandwidth and latency. Ethan and Holly define the concepts of bandwidth and latency and discuss current data transfer protocols... Read more »

Modern Web
Modern Web Podcast S12E39- Fly.io for Easier Cloud Deployment with Annie Sexton

Modern Web

Play Episode Listen Later Nov 6, 2024 39:10


Annie Sexton, Developer Advocate at Fly.io, joins the show to discuss Fly.io's approach to simplifying cloud deployment. Annie shares Fly.io's unique position as a public cloud that offers the flexibility of infrastructure control with a streamlined developer experience. They explore Fly.io's private networking and distributed app capabilities, allowing developers to deploy applications close to users worldwide with ease. Annie also addresses common challenges in distributed systems, including latency, data replication, and the balance between global reach and simple, single-region projects. Chapters: - 00:00 - 01:32 Introduction to the Modern Web Podcast and Guests - 01:33 - 04:00 Overview of Fly.io and Annie's Role as Developer Advocate - 04:01 - 06:35 What Makes Fly.io Stand Out Among Cloud Platforms - 06:36 - 08:57 Distributed Applications: Benefits and Use Cases - 08:58 - 11:28 Understanding Distributed Web Servers and Private Networking - 11:29 - 13:49 Challenges in Distributed Data and Replication Techniques - 13:50 - 16:12 Fly.io's Unique Solutions for Data Consistency - 16:13 - 18:34 When to Consider a Distributed Setup for Your Application - 18:35 - 20:35 Tools and Tips for Evaluating Geographical Distribution Needs - 20:36 - 22:22 Simplifying Global Deployment with Fly.io's Command Features - 22:23 - 24:18 Considerations for Latency and Performance Optimization - 24:19 - 26:45 Balancing Simplicity with Advanced Control for Developers - 26:46 - 29:04 Easy Deployment for Hobbyists and Smaller Projects - 29:05 - 31:27 Getting Started on Fly.io with Fly Launch - 31:28 - 33:48 Developer Advocacy and Meeting Diverse Needs in the Cloud - 33:49 - 36:15 Catering to Beginners and Experienced Developers Alike - 36:16 - End Closing Remarks and Where to Find Fly.io and the Hosts Follow Annie Sexton on Social Media Twitter: https://x.com/_anniebabannie_ Linkedin: https://www.linkedin.com/in/annie-sexton-11472a46/ Github: https://github.com/anniebabannie

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Apologies for lower audio quality; we lost recordings and had to use backup tracks. Our guests today are Anastasios Angelopoulos and Wei-Lin Chiang, leads of Chatbot Arena, fka LMSYS, the crowdsourced AI evaluation platform developed by the LMSys student club at Berkeley, which became the de facto standard for comparing language models. For many folks, Arena Elo is cited more often than MMLU scores, and more than 1,000,000 people have cast votes since its launch, leading top model trainers to cite it over their own formal academic benchmarks:

The Limits of Static Benchmarks

We've done two benchmarks episodes: Benchmarks 101 and Benchmarks 201. One issue we've always brought up with static benchmarks is that 1) many are getting saturated, with models scoring almost perfectly on them, and 2) they often don't reflect production use cases, making it hard for developers and users to use them as guidance. The fundamental challenge in AI evaluation isn't technical - it's philosophical. How do you measure something that increasingly resembles human intelligence? Rather than trying to define intelligence upfront, Arena lets users interact naturally with models and collects comparative feedback. It's messy and subjective, but that's precisely the point - it captures the full spectrum of what people actually care about when using AI.

The Pareto Frontier of Cost vs Intelligence

Because the Elo scores are remarkably stable over time, we can put all the chat models on a map against their respective cost to gain a view of at least 3 orders of magnitude of model sizes/costs and observe the remarkable shift in intelligence per dollar over the past year:

This frontier stood remarkably firm through the recent release of o1-preview and the price cuts of Gemini 1.5:

The Statistics of Subjectivity

In our Benchmarks 201 episode, Clémentine Fourrier from HuggingFace thought this design choice was one of the shortcomings of arenas: they aren't reproducible. 
You don't know who ranked what and what exactly the outcome was at the time of ranking. That same person might rank the same pair of outputs differently on a different day, or might ask harder questions to better models compared to smaller ones, making it imbalanced. Another argument that people have brought up is confirmation bias. We know humans prefer longer responses and are swayed by formatting - Rob Mulla from Dreadnode had found some interesting data on this in May:

The approach LMArena is taking is to use logistic regression to decompose human preferences into constituent factors. As Anastasios explains: "We can say what components of style contribute to human preference and how they contribute." By adding these style components as parameters, they can mathematically "suck out" their influence and isolate the core model capabilities. This extends beyond just style - they can control for any measurable factor: "What if I want to look at the cost adjusted performance? Parameter count? We can ex post facto measure that." This is one of the most interesting things about Arena: you have a data generation engine which you can clean and turn into leaderboards later. If you wanted to create a leaderboard for poetry writing, you could get existing data from Arena and normalize it by identifying these style components. Whether or not it's possible to really understand WHAT bias the voters have, that's a different question.

Private Evals

One of the most delicate challenges LMSYS faces is maintaining trust while collaborating with AI labs. The concern is that labs could game the system by testing multiple variants privately and only releasing the best performer. This was brought up when 4o-mini was released and it ranked as the second best model on the leaderboard:

But this fear misunderstands how Arena works. Unlike static benchmarks where selection bias is a major issue, Arena's live nature means any initial bias gets washed out by ongoing evaluation. 
As Anastasios explains: "In the long run, there's way more fresh data than there is data that was used to compare these five models." The other big question is WHAT model is actually being tested; as people often talk about on X / Discord, the same endpoint will randomly feel "nerfed" like it happened for "Claude European summer" and corresponding conspiracy theories:

It's hard to keep track of these performance changes in Arena as these changes (if real…?) are not observable.

The Future of Evaluation

The team's latest work on RouteLLM points to an interesting future where evaluation becomes more granular and task-specific. But they maintain that even simple routing strategies can be powerful - like directing complex queries to larger models while handling simple tasks with smaller ones. Arena is now going to expand beyond text into multimodal evaluation and specialized domains like code execution and red teaming. But their core insight remains: the best way to evaluate intelligence isn't to simplify it into metrics, but to embrace its complexity and find rigorous ways to analyze it. To go after this vision, they are spinning out Arena from LMSys, which will stay as an academia-driven group at Berkeley.

Full Video Podcast

Chapters

* 00:00:00 - Introductions
* 00:01:16 - Origin and development of Chatbot Arena
* 00:05:41 - Static benchmarks vs. Arenas
* 00:09:03 - Community building
* 00:13:32 - Biases in human preference evaluation
* 00:18:27 - Style Control and Model Categories
* 00:26:06 - Impact of o1
* 00:29:15 - Collaborating with AI labs
* 00:34:51 - RouteLLM and router models
* 00:38:09 - Future of LMSys / Arena

Show Notes

* Anastasios Angelopoulos
* Anastasios' NeurIPS Paper Conformal Risk Control
* Wei-Lin Chiang
* Chatbot Arena
* LMSys
* MTBench
* ShareGPT dataset
* Stanford's Alpaca project
* LLMRouter
* E2B
* Dreadnode

Transcript

Alessio [00:00:00]: Hey everyone, welcome to the Latent Space podcast. 
This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai. Swyx [00:00:14]: Hey, and today we're very happy and excited to welcome Anastasios and Wei Lin from LMSys. Welcome guys. Wei Lin [00:00:21]: Hey, how's it going? Nice to see you. Anastasios [00:00:23]: Thanks for having us. Swyx [00:00:24]: Anastasios, I actually saw you, I think at last year's NeurIPS. You were presenting a paper, which I don't really super understand, but it was some theory paper about how your method was very dominating over other sort of search methods. I don't remember what it was, but I remember that you were a very confident speaker. Anastasios [00:00:40]: Oh, I totally remember you. Didn't ever connect that, but yes, that's definitely true. Yeah. Nice to see you again. Swyx [00:00:46]: Yeah. I was frantically looking for the name of your paper and I couldn't find it. Basically I had to cut it because I didn't understand it. Anastasios [00:00:51]: Is this conformal PID control or was this the online control? Wei Lin [00:00:55]: Blast from the past, man. Swyx [00:00:57]: Blast from the past. It's always interesting how NeurIPS and all these academic conferences are sort of six months behind what people are actually doing, but conformal risk control, I would recommend people check it out. I have the recording. I just never published it just because I was like, I don't understand this enough to explain it. Anastasios [00:01:14]: People won't be interested. Wei Lin [00:01:15]: It's all good. Swyx [00:01:16]: But ELO scores, ELO scores are very easy to understand. You guys are responsible for the biggest revolution in language model benchmarking in the last few years. Maybe you guys want to introduce yourselves and maybe tell a little bit of the brief history of LMSys. Wei Lin [00:01:32]: Hey, I'm Wei Lin. 
I'm a fifth year PhD student at UC Berkeley, working on Chatbot Arena these days, doing crowdsourced AI benchmarking. Anastasios [00:01:43]: I'm Anastasios. I'm a sixth year PhD student here at Berkeley. I did most of my PhD on like theoretical statistics and sort of foundations of model evaluation and testing. And now I'm working 150% on this Chatbot Arena stuff. It's great. Alessio [00:02:00]: And what was the origin of it? How did you come up with the idea? How did you get people to buy in? And then maybe what were one or two of the pivotal moments early on that kind of made it the standard for these things? Wei Lin [00:02:12]: Yeah, yeah. Chatbot Arena project was started last year in April, May, around that. Before that, we were basically experimenting in a lab how to fine tune a chatbot open source based on the Llama 1 model that Meta released. At that time, Llama 1 was like a base model and people didn't really know how to fine tune it. So we were doing some explorations. We were inspired by Stanford's Alpaca project. So we basically, yeah, grow a data set from the internet, which is called ShareGPT data set, which is like a dialogue data set between user and ChatGPT conversation. It turns out to be like pretty high quality data, dialogue data. So we fine tune on it and then we train it and release the model called Vicuna. And people were very excited about it because it kind of like demonstrate open-weight model can reach this conversation capability similar to ChatGPT. And then we basically release the model weights and also build a demo website for the model. People were very excited about it. But during the development, the biggest challenge to us at the time was like, how do we even evaluate it? How do we even argue this model we trained is better than others? And then what's the gap between this open source model and other proprietary offerings? At that time, it was like GPT-4 was just announced, and there was Claude 1. What's the difference between them? 
And then after that, like every week, there's a new model being fine tuned, released. So even until still now, right? And then we have that demo website for Vicuna now. And then we thought like, okay, maybe we can add a few more of the model as well, like API model as well. And then we quickly realized that people need a tool to compare between different models. So we have like a side by side UI implemented on the website so that people can choose, you know, compare. And we quickly realized that maybe we can do something like, like a battle on top of these LLMs, like just anonymize it, anonymize the identity, and let people vote which one is better. So the community decides which one is better, not us, not us arguing, you know, our model is better or what. And that turns out to be like, people are very excited about this idea. And then we tweet, we launch, and that's, yeah, that's April, May. And then in the first two, three weeks, there were just a few hundred thousand views on our launch tweets. And then we had regular updates, twice weekly in the beginning, adding new models, GPT-4 as well. So it was like, that was the, you know, the initial. Anastasios [00:04:58]: Another pivotal moment, just to jump in, would be private models, like the gpt2-chatbot. Wei Lin [00:05:04]: The gpt2-chatbot. That was this year. That was this year. Anastasios [00:05:07]: Huge. Wei Lin [00:05:08]: That was also huge. Alessio [00:05:09]: In the beginning, I saw the initial release was May 3rd of the beta board. On April 6, we did a benchmarks 101 episode for a podcast, just kind of talking about, you know, how so much of the data is like in the pre-training corpus and blah, blah, blah. And like the benchmarks are really not what we need to evaluate whether or not a model is good. Why did you not make a benchmark? 
Maybe at the time, you know, it was just like, Hey, let's just put together a whole bunch of data again, run a, make a score that seems much easier than coming out with a whole website where like users need to vote. Any thoughts behind that? Wei Lin [00:05:41]: I think it's more like fundamentally, we don't know how to automate this kind of benchmarks when it's more like, you know, conversational, multi-turn, and more open-ended task that may not come with a ground truth. So let's say if you ask a model to help you write an email for you for whatever purpose, there's no ground truth. How do you score them? Or write a story or a creative story or many other things like how we use ChatGPT these days. It's more open-ended. You know, we need human in the loop to give us feedback, which one is better. And I think nuance here is like, sometimes it's also hard for human to give the absolute rating. So that's why we have this kind of pairwise comparison, easier for people to choose which one is better. So from that, we use these pairwise comparisons to calculate the leaderboard. Yeah. You can add more about this methodology. Anastasios [00:06:40]: Yeah. I think the point is that, and you guys probably also talked about this at some point, but static benchmarks are intrinsically, to some extent, unable to measure generative model performance. And the reason is because you cannot pre-annotate all the outputs of a generative model. You change the model, it's like the distribution of your data is changing. New labels to deal with that. New labels are great automated labeling, right? Which is why people are pursuing both. And yeah, static benchmarks, they allow you to zoom in to particular types of information like factuality, historical facts. We can build the best benchmark of historical facts, and we will then know that the model is great at historical facts. But ultimately, that's not the only axis, right? And we can build 50 of them, and we can evaluate 50 axes. 
But it's just so, the problem of generative model evaluation is just so expansive, and it's so subjective, that it's just maybe not intrinsically impossible, but at least we don't see a way. We didn't see a way of encoding that into a fixed benchmark. Wei Lin [00:07:47]: But on the other hand, I think there's a challenge where this kind of online dynamic benchmark is more expensive than static benchmark, offline benchmark, where people still need it. Like when they build models, they need static benchmark to track where they are. Anastasios [00:08:03]: It's not like our benchmark is uniformly better than all other benchmarks, right? It just measures a different kind of performance that has proved to be useful. Swyx [00:08:14]: You guys also published MTBench as well, which is a static version, let's say, of Chatbot Arena, right? That people can actually use in their development of models. Wei Lin [00:08:25]: Right. I think one of the reasons we still do this static benchmark, we still wanted to explore, experiment whether we can automate this, because people, eventually, model developers need it to fast iterate their model. So that's why we explored LLM as a judge, and ArenaHard, trying to filter, select high-quality data we collected from Chatbot Arena, the high-quality subset, and use that as questions and then automate the judge pipeline, so that people can quickly get high-quality benchmark signals, using this online benchmark. Swyx [00:09:03]: As a community builder, I'm curious about just the initial early days. Obviously when you offer effectively free A-B testing inference for people, people will come and use your arena. What do you think were the key unlocks for you? Was it funding for this arena? Was it marketing? When people came in, do you see a noticeable skew in the data? Which obviously now you have enough data sets, you can separate things out, like coding and hard prompts, but in the early days, it was just all sorts of things.
Which obviously now you have enough data sets, you can separate things out, like coding and hard prompts, but in the early days, it was just all sorts of things.Anastasios [00:09:31]: Yeah, maybe one thing to establish at first is that our philosophy has always been to maximize organic use. I think that really does speak to your point, which is, yeah, why do people come? They came to use free LLM inference, right? And also, a lot of users just come to the website to use direct chat, because you can chat with the model for free. And then you could think about it like, hey, let's just be kind of like more on the selfish or conservative or protectionist side and say, no, we're only giving credits for people that battle or so on and so forth. Strategy wouldn't work, right? Because what we're trying to build is like a big funnel, a big funnel that can direct people. And some people are passionate and interested and they battle. And yes, the distribution of the people that do that is different. It's like, as you're pointing out, it's like, that's not as they're enthusiastic.Wei Lin [00:10:24]: They're early adopters of this technology.Anastasios [00:10:27]: Or they like games, you know, people like this. And we've run a couple of surveys that indicate this as well, of our user base.Wei Lin [00:10:36]: We do see a lot of developers come to the site asking polling questions, 20-30%. Yeah, 20-30%.Anastasios [00:10:42]: It's obviously not reflective of the general population, but it's reflective of some corner of the world of people that really care. And to some extent, maybe that's all right, because those are like the power users. And you know, we're not trying to claim that we represent the world, right? We represent the people that come and vote.Swyx [00:11:02]: Did you have to do anything marketing-wise? Was anything effective? Did you struggle at all? Was it success from day one?Wei Lin [00:11:09]: At some point, almost done. Okay. 
Because as you can imagine, this leaderboard depends on community engagement participation. If no one comes to vote tomorrow, then no leaderboard. Anastasios [00:11:23]: So we had some period of time when the number of users was just, after the initial launch, it went lower. Yeah. And, you know, at some point, it did not look promising. Actually, I joined the project a couple months in to do the statistical aspects, right? As you can imagine, that's how it kind of hooked into my previous work. At that time, it wasn't like, you know, it definitely wasn't clear that this was like going to be the eval or something. It was just like, oh, this is a cool project. Like Wei-Lin seems awesome, you know, and that's it. Wei Lin [00:11:56]: Definitely. In the beginning, because people don't know us, people don't know what this is for. So we had a hard time. But I think we were lucky enough that we have some initial momentum. And as well as the competition between model providers just becoming, you know, became very intense. Intense. And then that makes the eval onto us, right? Because always number one is number one. Anastasios [00:12:23]: There's also an element of trust. Our main priority in everything we do is trust. We want to make sure we're doing everything like all the I's are dotted and the T's are crossed and nobody gets unfair treatment and people can see from our profiles and from our previous work and from whatever, you know, we're trustworthy people. We're not like trying to make a buck and we're not trying to become famous off of this or that. It's just, we're trying to provide a great public leaderboard community venture project. Wei Lin [00:12:51]: Yeah. Swyx [00:12:52]: Yes. I mean, you are kind of famous now, you know, that's fine. Just to dive in more into biases and, you know, some of this is like statistical control. 
The classic one for human preference evaluation is humans demonstrably prefer longer contexts or longer outputs, which is actually something that we don't necessarily want. You guys, I think maybe two months ago put out some length control studies. Apart from that, there are just other documented biases. Like, I'd just be interested in your review of what you've learned about biases and maybe a little bit about how you've controlled for them. Anastasios [00:13:32]: At a very high level, yeah. Humans are biased. Totally agree. Like in various ways. It's not clear whether that's good or bad, you know, we try not to make value judgments about these things. We just try to describe them as they are. And our approach is always as follows. We collect organic data and then we take that data and we mine it to get whatever insights we can get. And, you know, we have many millions of data points that we can now use to extract insights from. Now, one of those insights is to ask the question, what is the effect of style, right? You have a bunch of data, you have votes, people are voting either which way. We have all the conversations. We can say what components of style contribute to human preference and how do they contribute? Now, that's an important question. Why is that an important question? It's important because some people want to see which model would be better if the lengths of the responses were the same, were to be the same, right? People want to see the causal effect of the model's identity controlled for length or controlled for markdown, number of headers, bulleted lists, is the text bold? Some people don't, they just don't care about that. The idea is not to impose the judgment that this is not important, but rather to say ex post facto, can we analyze our data in a way that decouples all the different factors that go into human preference? Now, the way we do this is via statistical regression. 
That is to say the arena score that we show on our leaderboard is a particular type of linear model, right? It's a linear model that takes, it's a logistic regression that takes model identities and fits them against human preference, right? So it regresses human preference against model identity. What you get at the end of that logistic regression is a parameter vector of coefficients. And when the coefficient is large, it tells you that GPT 4.0 or whatever, very large coefficient, that means it's strong. And that's exactly what we report in the table. It's just the predictive effect of the model identity on the vote. The other thing that you can do is you can take that vector, let's say we have M models, that is an M dimensional vector of coefficients. What you can do is you say, hey, I also want to understand what the effect of length is. So I'll add another entry to that vector, which is trying to predict the vote, right? That tells me the difference in length between two model responses. So we have that for all of our data. We can compute it ex post facto. We added it into the regression and we look at that predictive effect. And then the idea, and this is formally true under certain conditions, not always verifiable ones, but the idea is that adding that extra coefficient to this vector will kind of suck out the predictive power of length and put it into that M plus first coefficient and quote, unquote, de-bias the rest so that the effect of length is not included. And that's what we do in style control. Now we don't just do it for M plus one. We have, you know, five, six different style components that have to do with markdown headers and bulleted lists and so on that we add here. Now, where is this going? You guys see the idea. It's a general methodology. If you have something that's sort of like a nuisance parameter, something that exists and provides predictive value, but you really don't want to estimate that. You want to remove its effect. 
In causal inference, these things are called like confounders often. What you can do is you can model the effect. You can put them into your model and try to adjust for them. So another one of those things might be cost. You know, what if I want to look at the cost adjusted performance of my model, which models are punching above their weight, parameter count, which models are punching above their weight in terms of parameter count, we can ex post facto measure that. We can do it without introducing anything that compromises the organic nature of the
Wei Lin [00:17:17]: data that we collect.
Anastasios [00:17:18]: Hopefully that answers the question.
Wei Lin [00:17:20]: It does.
Swyx [00:17:21]: So I guess with a background in econometrics, this is super familiar.
Anastasios [00:17:25]: You're probably better at this than me for sure.
Swyx [00:17:27]: Well, I mean, so I used to be, you know, a quantitative trader and so, you know, controlling for multiple effects on stock price is effectively the job. So it's interesting. Obviously the problem is proving causation, which is hard, but you don't have to do that.
Anastasios [00:17:45]: Yes. Yes, that's right. And causal inference is a hard problem and it goes beyond statistics, right? It's like you have to build the right causal model and so on and so forth. But we think that this is a good first step and we're sort of looking forward to learning from more people. You know, there's some good people at Berkeley that work on causal inference for the learning from them on like, what are the really most contemporary techniques that we can use in order to estimate true causal effects if possible.
Swyx [00:18:10]: Maybe we could take a step through the other categories. So style control is a category. It is not a default. I have thought that when you wrote that blog post, actually, I thought it would be the new default because it seems like the most obvious thing to control for. But you also have other categories, you have coding, you have hard prompts. We consider that.
Anastasios [00:18:27]: We're still actively considering it. It's just, you know, once you make that step, once you take that step, you're introducing your opinion and I'm not, you know, why should our opinion be the one? That's kind of a community choice. We could put it to a vote.
Wei Lin [00:18:39]: We could pass.
Anastasios [00:18:40]: Yeah, maybe do a poll. Maybe do a poll.
Swyx [00:18:42]: I don't know. No opinion is an opinion.
Wei Lin [00:18:44]: You know what I mean?
Swyx [00:18:45]: Yeah.
Wei Lin [00:18:46]: There's no neutral choice here.
Swyx [00:18:47]: Yeah. You have all these others. You have instruction following too. What are your favorite categories that you like to talk about? Maybe you tell a little bit of the stories, tell a little bit of like the hard choices that you had to make.
Wei Lin [00:18:57]: Yeah. Yeah. Yeah. I think the, uh, initially the reason why we want to add these new categories is essentially to answer some of the questions from our community, which is we won't have a single leaderboard for everything. So these models behave very differently in different domains. Let's say this model is trend for coding, this model trend for more technical questions and so on. On the other hand, to answer people's question about like, okay, what if all these low quality, you know, because we crowdsource data from the internet, there will be noise. So how do we de-noise? How do we filter out these low quality data effectively? So that was like, you know, some questions we want to answer. So basically we spent a few months, like really diving into these questions to understand how do we filter all these data because these are like medias of data points. And then if you want to re-label yourself, it's possible, but we need to kind of like to automate this kind of data classification pipeline for us to effectively categorize them to different categories, say coding, math, structure, and also harder problems. So that was like, the hope is when we slice the data into these meaningful categories to give people more like better signals, more direct signals, and that's also to clarify what we are actually measuring for, because I think that's the core part of the benchmark. That was the initial motivation. Does that make sense?
Anastasios [00:20:27]: Yeah. Also, I'll just say, this does like get back to the point that the philosophy is to like mine organic, to take organic data and then mine it x plus factor.
Alessio [00:20:35]: Is the data cage-free too, or just organic?
Anastasios [00:20:39]: It's cage-free.
Wei Lin [00:20:40]: No GMO. Yeah. And all of these efforts are like open source, like we open source all of the data cleaning pipeline, filtering pipeline. Yeah.
Swyx [00:20:50]: I love the notebooks you guys publish. Actually really good just for learning statistics.
Wei Lin [00:20:54]: Yeah. I'll share this insights with everyone.
Alessio [00:20:59]: I agree on the initial premise of, Hey, writing an email, writing a story, there's like no ground truth. But I think as you move into like coding and like red teaming, some of these things, there's like kind of like skill levels. So I'm curious how you think about the distribution of skill of the users. Like maybe the top 1% of red teamers is just not participating in the arena. So how do you guys think about adjusting for it? And like feels like this where there's kind of like big differences between the average and the top. Yeah.
Anastasios [00:21:29]: Red teaming, of course, red teaming is quite challenging. So, okay. Moving back.
There's definitely like some tasks that are not as subjective that like pairwise human preference feedback is not the only signal that you would want to measure. And to some extent, maybe it's useful, but it may be more useful if you give people better tools. For example, it'd be great if we could execute code with an arena, be fantastic.
Wei Lin [00:21:52]: We want to do it.
Anastasios [00:21:53]: There's also this idea of constructing a user leaderboard. What does that mean? That means some users are better than others. And how do we measure that? How do we quantify that? Hard in chatbot arena, but where it is easier is in red teaming, because in red teaming, there's an explicit game. You're trying to break the model, you either win or you lose. So what you can do is you can say, Hey, what's really happening here is that the models and humans are playing a game against one another. And then you can use the same sort of Bradley Terry methodology with some, some extensions that we came up with in one of you can read one of our recent blog posts for, for the sort of theoretical extensions. You can attribute like strength back to individual players and jointly attribute strength to like the models that are in this jailbreaking game, along with the target tasks, like what types of jailbreaks you want.
Wei Lin [00:22:44]: So yeah.
Anastasios [00:22:45]: And I think that this is, this is a hugely important and interesting avenue that we want to continue researching. We have some initial ideas, but you know, all thoughts are welcome.
Wei Lin [00:22:54]: Yeah.
Alessio [00:22:55]: So first of all, on the code execution, the E2B guys, I'm sure they'll be happy to help
Wei Lin [00:22:59]: you.
Alessio [00:23:00]: I'll please set that up. They're big fans. We're investors in a company called Dreadnought, which we do a lot in AI red teaming. I think to me, the most interesting thing has been, how do you do sure? Like the model jailbreak is one side. We also had Nicola Scarlini from DeepMind on the podcast, and he was talking about, for example, like, you know, context stealing and like a weight stealing. So there's kind of like a lot more that goes around it. I'm curious just how you think about the model and then maybe like the broader system, even with Red Team Arena, you're just focused on like jailbreaking of the model, right? You're not doing kind of like any testing on the more system level thing of the model where like, maybe you can get the training data back, you're going to exfiltrate some of the layers and the weights and things like that.
Wei Lin [00:23:43]: So right now, as you can see, the Red Team Arena is at a very early stage and we are still exploring what could be the potential new games we can introduce to the platform. So the idea is still the same, right? And we build a community driven project platform for people. They can have fun with this website, for sure. That's one thing, and then help everyone to test these models. So one of the aspects you mentioned is stealing secrets, stealing training sets. That could be one, you know, it could be designed as a game. Say, can you still use their credential, you know, we hide, maybe we can hide the credential into system prompts and so on. So there are like a few potential ideas we want to explore for sure. Do you want to add more?
Anastasios [00:24:28]: I think that this is great. This idea is a great one. There's a lot of great ideas in the Red Teaming space. You know, I'm not personally like a Red Teamer. I don't like go around and Red Team models, but there are people that do that and they're awesome. They're super skilled. When I think about the Red Team arena, I think those are really the people that we're building it for. Like, we want to make them excited and happy, build tools that they like. And just like chatbot arena, we'll trust that this will end up being useful for the world. And all these people are, you know, I won't say all these people in this community are actually good hearted, right? They're not doing it because they want to like see the world burn. They're doing it because they like, think it's fun and cool. And yeah. Okay. Maybe they want to see, maybe they want a little bit.
Wei Lin [00:25:13]: I don't know. Majority.
Anastasios [00:25:15]: Yeah.
Wei Lin [00:25:16]: You know what I'm saying.
Anastasios [00:25:17]: So, you know, trying to figure out how to serve them best, I think, I don't know where that fits. I just, I'm not expressing. And give them credits, right?
Wei Lin [00:25:24]: And give them credit.
Anastasios [00:25:25]: Yeah. Yeah. So I'm not trying to express any particular value judgment here as to whether that's the right next step. It's just, that's sort of the way that I think we would think about it.
Swyx [00:25:35]: Yeah. We also talked to Sander Schulhoff of the HackerPrompt competition, and he's pretty interested in Red Teaming at scale. Let's just call it that. You guys maybe want to talk with him.
Wei Lin [00:25:45]: Oh, nice.
Swyx [00:25:46]: We wanted to cover a little, a few topical things and then go into the other stuff that your group is doing. You know, you're not just running Chatbot Arena. We can also talk about the new website and your future plans, but I just wanted to briefly focus on O1. It is the hottest, latest model. Obviously, you guys already have it on the leaderboard. What is the impact of O1 on your evals?
Wei Lin [00:26:06]: Made our interface slower.
Anastasios [00:26:07]: It made it slower.
Swyx [00:26:08]: Yeah.
Wei Lin [00:26:10]: Because it needs like 30, 60 seconds, sometimes even more to, the latency is like higher. So that's one. Sure. But I think we observe very interesting things from this model as well. Like we observe like significant improvement in certain categories, like more technical or math.
Yeah.
Anastasios [00:26:32]: I think actually like one takeaway that was encouraging is that I think a lot of people before the O1 release were thinking, oh, like this benchmark is saturated. And why were they thinking that? They were thinking that because there was a bunch of models that were kind of at the same level. They were just kind of like incrementally competing and it sort of wasn't immediately obvious that any of them were any better. Nobody, including any individual person, it's hard to tell. But what O1 did is it was, it's clearly a better model for certain tasks. I mean, I used it for like proving some theorems and you know, there's some theorems that like only I know because I still do a little bit of theory. Right. So it's like, I can go in there and ask like, oh, how would you prove this exact thing? Which I can tell you has never been in the public domain. It'll do it. It's like, what?
Wei Lin [00:27:19]: Okay.
Anastasios [00:27:20]: So there's this model and it crushed the benchmark. You know, it's just like really like a big gap. And what that's telling us is that it's not saturated yet. It's still measuring some signal. That was encouraging. The point, the takeaway is that the benchmark is comparative. There's no absolute number. There's no maximum ELO. It's just like, if you're better than the rest, then you win. I think that was actually quite helpful to us.
Swyx [00:27:46]: I think people were criticizing, I saw some of the academics criticizing it as not apples to apples. Right. Like, because it can take more time to reason, it's basically doing some search, doing some chain of thought that if you actually let the other models do that same thing, they might do better.
Wei Lin [00:28:03]: Absolutely.
Anastasios [00:28:04]: To be clear, none of the leaderboard currently is apples to apples because you have like Gemini Flash, you have, you know, all sorts of tiny models like Lama 8B, like 8B and 405B are not apples to apples.
Wei Lin [00:28:19]: Totally agree. They have different latencies.
Anastasios [00:28:21]: Different latencies.
Wei Lin [00:28:22]: Control for latency. Yeah.
Anastasios [00:28:24]: Latency control. That's another thing. We can do style control, but latency control. You know, things like this are important if you want to understand the trade-offs involved in using AI.
Swyx [00:28:34]: O1 is a developing story. We still haven't seen the full model yet, but it's definitely a very exciting new paradigm. I think one community controversy I just wanted to give you guys space to address is the collaboration between you and the large model labs. People have been suspicious, let's just say, about how they choose to A-B test on you. I'll state the argument and let you respond, which is basically they run like five anonymous models and basically argmax their Elo on LMSYS or chatbot arena, and they release the best one. Right? What has been your end of the controversy? How have you decided to clarify your policy going forward?
Wei Lin [00:29:15]: On a high level, I think our goal here is to build a fast eval for everyone, and including everyone in the community can see the data board and understand, compare the models. More importantly, I think we want to build the best eval also for model builders, like all these frontier labs building models. They're also internally facing a challenge, which is how do they eval the model? That's the reason why we want to partner with all the frontier lab people, and then to help them testing. That's one of the... We want to solve this technical challenge, which is eval. Yeah.
Anastasios [00:29:54]: I mean, ideally, it benefits everyone, right?
Wei Lin [00:29:56]: Yeah.
Anastasios [00:29:57]: And people also are interested in seeing the leading edge of the models. People in the community seem to like that. Oh, there's a new model up. Is this strawberry? People are excited. People are interested. Yeah. And then there's this question that you bring up of, is it actually causing harm?
Wei Lin [00:30:15]: Right?
Anastasios [00:30:16]: Is it causing harm to the benchmark that we are allowing this private testing to happen? Maybe stepping back, why do you have that instinct? The reason why you and others in the community have that instinct is because when you look at something like a benchmark, like an image net, a static benchmark, what happens is that if I give you a million different models that are all slightly different, and I pick the best one, there's something called selection bias that plays in, which is that the performance of the winning model is overstated. This is also sometimes called the winner's curse. And that's because statistical fluctuations in the evaluation, they're driving which model gets selected as the top. So this selection bias can be a problem. Now there's a couple of things that make this benchmark slightly different. So first of all, the selection bias that you include when you're only testing five models is normally empirically small.
Wei Lin [00:31:12]: And that's why we have these confidence intervals constructed.
Anastasios [00:31:16]: That's right. Yeah. Our confidence intervals are actually not multiplicity adjusted. One thing that we could do immediately tomorrow in order to address this concern is if a model provider is testing five models and they want to release one, and we're constructing the models at level one minus alpha, we can just construct the intervals instead at level one minus alpha divided by five. That's called Bonferroni correction. What that'll tell you is that the final performance of the model, the interval that gets constructed, is actually formally correct. We don't do that right now, partially because we know from simulations that the amount of selection bias you incur with these five things is just not huge. It's not huge in comparison to the variability that you get from just regular human voters. So that's one thing. But then the second thing is the benchmark is live, right? So what ends up happening is it'll be a small magnitude, but even if you suffer from the winner's curse after testing these five models, what'll happen is that over time, because we're getting new data, it'll get adjusted down. So if there's any bias that gets introduced at that stage, in the long run, it actually doesn't matter. Because asymptotically, basically in the long run, there's way more fresh data than there is data that was used to compare these five models against these private models.
Swyx [00:32:35]: The announcement effect is only just the first phase and it has a long tail.
Anastasios [00:32:39]: Yeah, that's right. And it sort of like automatically corrects itself for this selection adjustment.
Swyx [00:32:45]: Every month, I do a little chart of Ellim's ELO versus cost, just to track the price per dollar, the amount of like, how much money do I have to pay for one incremental point in ELO? And so I actually observe an interesting stability in most of the ELO numbers, except for some of them. For example, GPT-4-O August has fallen from 12.90
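The Bonferroni adjustment described in the transcript can be written down directly. The numbers below are hypothetical, and the plain normal-approximation interval is a simplification used only to show the effect of shrinking alpha; it is not the arena's actual interval construction.

```python
import math
from statistics import NormalDist

def win_rate_ci(wins, n, alpha):
    """Normal-approximation interval for a win rate at level 1 - alpha."""
    p = wins / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(p * (1 - p) / n)
    return (p - half, p + half)

# Hypothetical scenario: a lab privately tests m = 5 anonymous variants and
# ships the winner, which won 1150 of 2000 battles. Bonferroni: construct
# each of the m intervals at level 1 - alpha/m, so the interval reported for
# the selected model is still formally valid despite the selection step.
m, alpha = 5, 0.05
naive = win_rate_ci(1150, 2000, alpha)         # ordinary 95% interval
adjusted = win_rate_ci(1150, 2000, alpha / m)  # Bonferroni-corrected interval
```

The corrected interval is strictly wider than the naive one; that extra width is the price of guarding against the winner's curse across the five candidates, and, as noted in the conversation, the live benchmark's stream of fresh votes washes out most of the residual bias anyway.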

Infinite Machine Learning
Voice-to-Voice Foundation Models

Infinite Machine Learning

Play Episode Listen Later Oct 30, 2024 39:08


Alan Cowen is the cofounder and CEO of Hume, a company building voice-to-voice foundation models. They recently raised their $50M Series B from Union Square Ventures, Nat Friedman, Daniel Gross, and others. Alan's favorite book: 1984 (Author: George Orwell)(00:01) Introduction(00:06) Defining Voice-to-Voice Foundation Models(01:26) Historical Context: Handling Voice and Speech Understanding(03:54) Emotion Detection in Voice AI Models(04:33) Training Models to Recognize Human Emotion in Speech(07:19) Cultural Variations in Emotional Expressions(09:00) Semantic Space Theory in Emotion Recognition(12:11) Limitations of Basic Emotion Categories(15:50) Recognizing Blended Emotional States(20:15) Objectivity in Emotion Science(24:37) Practical Aspects of Deploying Voice AI Systems(28:17) Real-Time System Constraints and Latency(31:30) Advancements in Voice AI Models(32:54) Rapid-Fire Round--------Where to find Prateek Joshi: Newsletter: https://prateekjoshi.substack.com Website: https://prateekj.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 Twitter: https://twitter.com/prateekvjoshi 

Finding Genius Podcast
Breaking Boundaries In HIV Research: Leor Weinberger On Viral Latency & Revolutionary Therapies

Finding Genius Podcast

Play Episode Listen Later Oct 4, 2024 34:17


In today's episode, we are honored to be joined by Leor Weinberger, the William and Ute Bowes Distinguished Professor of Virology, director of the Gladstone Center for Cell Circuitry, professor of pharmaceutical chemistry, and professor of biochemistry and biophysics at Gladstone Institutes/University of California, San Francisco. As a world-renowned virologist and quantitative biologist, Leor has made a significant impact in the field of HIV research with his groundbreaking discovery of the HIV virus latency circuit. Leor's lab studies the fundamental processes of viral biology in the pursuit of developing innovative first-in-class therapies against HIV. They use computational and experimental approaches, including quantitative, single-cell and single-molecule microscopy and mathematical modeling… Click play to find out:  How quantitative and theoretical biophysics apply to HIV. Why HIV latency has always been a problem with successful treatment.  What happens when viral loads are lower in the blood of infected individuals.  When to administer a therapeutic that overcomes barriers to biodistribution.  How are Leor and his team tackling the biggest challenges in human health? Tune in now to learn more about their unique and innovative approach to disrupting the way science is done – and how these discoveries have the potential to change lives! You can follow along with Leor and his fascinating work with the Gladstone Center for Cell Circuitry here. Episode also available on Apple Podcast: http://apple.co/30PvU9

The SaaS CFO
$77M Raised to Build Gen AI Apps Faster and More Efficient

The SaaS CFO

Play Episode Listen Later Oct 1, 2024 23:25


Welcome back to The SaaS CFO Podcast! Today, we're thrilled to have a remarkable guest joining us—Lin Qiao, the CEO and co-founder of Fireworks AI. Lin brings a wealth of expertise from her days as a researcher and software engineer, having worked at industry giants like IBM and Facebook. She'll share her inspiring journey from academia to leading innovations in the AI space. In this episode, Lin takes us through the revolutionary products Fireworks AI offers, aimed at transforming how application developers and product engineers leverage GenAI technology. Imagine building advanced AI-driven applications in days instead of years, without needing an army of engineers. Lin expertly breaks down the company's offerings using a compelling car assembly analogy, making it easy for us to grasp the intricate layers of their technology. We'll also dive into the diverse industry applications of Fireworks AI, spanning startups, digital-native enterprises, and traditional companies. Lin discusses the groundbreaking shifts this technology can bring, the unique challenges and costs involved, and how her company aims to simplify this landscape for developers and enterprises alike. Furthermore, Lin opens up about Fireworks AI's impressive milestones, including their fundraising journey, key metrics, and future product directions. Whether you're an app developer or an enterprise executive, this conversation promises invaluable insights into accelerating innovation with GenAI. So, grab your headphones and join us for an enlightening discussion with Lin Qiao on The SaaS CFO Podcast! Show Notes: 00:00 From researcher to impactful software engineer at Facebook. 05:47 Mobile apps revolutionized industries, creating vast opportunities. 06:49 Create videos for automation and business efficiency. 09:55 Latency challenges with engaging, large GenAI models. 13:33 Accelerating strategy and execution for diverse companies.
17:01 Critical partnerships, funding growth, expand AI system, teamwork. 22:05 Explore and deploy models with Fireworks AI. Links: SaaS Fundraising Stories: https://www.thesaasnews.com/news/fireworks-ai-secures-25-million-in-series-a https://www.thesaasnews.com/news/fireworks-ai-raises-52-million-in-series-b Lin Qiao's LinkedIn: https://www.linkedin.com/in/lin-qiao-22248b4/ Fireworks AI's LinkedIn: https://www.linkedin.com/company/fireworks-ai/ Fireworks AI's Website: https://fireworks.ai/ To learn more about Ben check out the links below: Subscribe to Ben's daily metrics newsletter: https://saasmetricsschool.beehiiv.com/subscribe Subscribe to Ben's SaaS newsletter: https://mailchi.mp/df1db6bf8bca/the-saas-cfo-sign-up-landing-page SaaS Metrics courses here: https://www.thesaasacademy.com/ Join Ben's SaaS community here: https://www.thesaasacademy.com/offers/ivNjwYDx/checkout Follow Ben on LinkedIn: https://www.linkedin.com/in/benrmurray

The Data Center Frontier Show
Future-Ready Cabling for AI: The Journey towards 800G

The Data Center Frontier Show

Play Episode Listen Later Sep 26, 2024 16:25


Join us for this podcast as we explore the dynamic landscape of data centers and how Artificial Intelligence (AI) has reshaped them. We'll delve into the shift from a 'north-south' traffic system to the sophisticated 'east-west' system that revolutionized data processing. Our guest, Dave Hessong from Corning, illustrates the crucial role of high-speed connections like 800G in meeting AI's demands. The discussion reveals how upgrading to this speed is not just beneficial, but essential in optimizing your data center. Latency, a key factor in network performance, is also a core topic of our conversation. Understanding its significance and how reducing it can enhance performance provides an edge in today's competitive market. The discussion further delves into the importance of state-of-the-art fiber optic cables, connectors, and cabling architecture in boosting a data center's performance. The complexities of AI deployment, its impact on fiber density, and the innovative solutions it necessitates are also explored. As we unveil the future of data centers, the estimated rise in AI capacity and the associated challenges are discussed. These include the increased power requirements and the need for a more organized cable and fiber infrastructure. While 800G might seem like just the beginning, the discussion elaborates on how this transition can future-proof your data centers for the next three to seven years. The extraordinary and transformative impact of AI, still in its infancy, on business and society is also a key highlight. Looking to the future, the anticipated growth in bandwidth as AI continues to evolve, and the exciting prospect of technology reaching 1.6Tbps next year, are discussed. We encourage you to tune in and engage with us as we navigate this rapidly evolving field. Regardless of your level of expertise, this conversation promises valuable insights into the future of data centers. 
Join us on this enlightening journey into the world of AI and data centers.

Let's Get Legal
Vogelzang Law: What is the latency period for asbestos exposure and why is it important?

Let's Get Legal

Play Episode Listen Later Sep 21, 2024


Christian Luciano Santiago of Vogelzang Law joins Jon Hansen on Let's Get Legal to talk about asbestos and its uses that people might not have known about. Christian explains how exposure could have happened through one's household, automobiles, and construction sites. For more information, call (312) 466-1669.

On The Homefront with Jeff Dudan
Rushing to Conflict: 2024 Business Tips | On The Homefront With Jeff Dudan #95

On The Homefront with Jeff Dudan

Play Episode Listen Later Aug 6, 2024 9:27


Join the conversation in the comments below! Do you Rush To Conflict?
In This Episode:
00:00:00 - Introduction: Addressing Business Delays
00:00:15 - Understanding Latency in Business
00:01:00 - The Impact of Latency on Productivity
00:01:45 - The Three Words to Combat Latency
00:02:30 - Why "Rush to Conflict" Matters
00:03:15 - Examples of Effective Conflict Resolution
00:04:00 - Building a Conflict-Ready Organization
00:04:45 - The Role of Clarity in Reducing Latency
00:05:30 - Avoiding Common Pitfalls in Business Communication
00:06:15 - Conclusion: Implementing the Rush to Conflict Strategy
Want to own your own business? Take our business ownership quiz
For your FREE Discernment eBook
Join our Exclusive Facebook Group
Visit our Instagram
Join and be a part of On The Homefront
Connect with Jeff Dudan

Lex Fridman Podcast
#438 – Elon Musk: Neuralink and the Future of Humanity

Lex Fridman Podcast

Play Episode Listen Later Aug 2, 2024


Elon Musk is CEO of Neuralink, SpaceX, Tesla, xAI, and CTO of X. DJ Seo is COO & President of Neuralink. Matthew MacDougall is Head Neurosurgeon at Neuralink. Bliss Chapman is Brain Interface Software Lead at Neuralink. Noland Arbaugh is the first human to have a Neuralink device implanted in his brain.

Transcript: https://lexfridman.com/elon-musk-and-neuralink-team-transcript

Please support this podcast by checking out our sponsors: https://lexfridman.com/sponsors/ep438-sc

SPONSOR DETAILS:
- Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off
- MasterClass: https://masterclass.com/lexpod to get 15% off
- Notion: https://notion.com/lex
- LMNT: https://drinkLMNT.com/lex to get free sample pack
- Motific: https://motific.ai
- BetterHelp: https://betterhelp.com/lex to get 10% off

CONTACT LEX:
Feedback - give feedback to Lex: https://lexfridman.com/survey
AMA - submit questions, videos or call-in: https://lexfridman.com/ama
Hiring - join our team: https://lexfridman.com/hiring
Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
Neuralink's X: https://x.com/neuralink
Neuralink's Website: https://neuralink.com/
Elon's X: https://x.com/elonmusk
DJ's X: https://x.com/djseo_
Matthew's X: https://x.com/matthewmacdoug4
Bliss's X: https://x.com/chapman_bliss
Noland's X: https://x.com/ModdedQuad
xAI: https://x.com/xai
Tesla: https://x.com/tesla
Tesla Optimus: https://x.com/tesla_optimus
Tesla AI: https://x.com/Tesla_AI

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here are the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(09:26) - Elon Musk
(12:42) - Telepathy
(19:22) - Power of human mind
(23:49) - Future of Neuralink
(29:04) - Ayahuasca
(38:33) - Merging with AI
(43:21) - xAI
(45:34) - Optimus
(52:24) - Elon's approach to problem-solving
(1:09:59) - History and geopolitics
(1:14:30) - Lessons of history
(1:18:49) - Collapse of empires
(1:26:32) - Time
(1:29:14) - Aliens and curiosity
(1:36:48) - DJ Seo
(1:44:57) - Neural dust
(1:51:40) - History of brain–computer interface
(1:59:44) - Biophysics of neural interfaces
(2:10:12) - How Neuralink works
(2:16:03) - Lex with Neuralink implant
(2:36:01) - Digital telepathy
(2:47:03) - Retracted threads
(2:52:38) - Vertical integration
(2:59:32) - Safety
(3:09:27) - Upgrades
(3:18:30) - Future capabilities
(3:47:46) - Matthew MacDougall
(3:53:35) - Neuroscience
(4:00:44) - Neurosurgery
(4:11:48) - Neuralink surgery
(4:30:57) - Brain surgery details
(4:46:40) - Implanting Neuralink on self
(5:02:34) - Life and death
(5:11:54) - Consciousness
(5:14:48) - Bliss Chapman
(5:28:04) - Neural signal
(5:34:56) - Latency
(5:39:36) - Neuralink app
(5:44:17) - Intention vs action
(5:55:31) - Calibration
(6:05:03) - Webgrid
(6:28:05) - Neural decoder
(6:48:40) - Future improvements
(6:57:36) - Noland Arbaugh
(6:57:45) - Becoming paralyzed
(7:11:20) - First Neuralink human participant
(7:15:21) - Day of surgery
(7:33:08) - Moving mouse with brain
(7:58:27) - Webgrid
(8:06:28) - Retracted threads
(8:14:53) - App improvements
(8:21:38) - Gaming
(8:32:36) - Future Neuralink capabilities
(8:35:31) - Controlling Optimus robot
(8:39:53) - God

Build with Leila Hormozi
Become an Effective Leader with These 3 Principles | Ep 163

Build with Leila Hormozi

Play Episode Listen Later Jul 22, 2024 27:11


“In order to move forward, you must direct people into the future rather than resurrect the past.” Today, join Leila (@LeilaHormozi) as she guest speaks about three essential leadership principles: prioritizing immediate reinforcement over powerful but delayed feedback, differentiating constructive criticism from harmful insults, and focusing on future improvements rather than dwelling on past mistakes. She breaks down effective leadership into operational steps for creating an unshakable business culture and driving your team towards consistent growth.

Welcome to Build, where we talk about the lessons I have learned in scaling big businesses, gaining millions in sales, and helping our portfolio companies do the same. Buckle up, because we're creating an unshakeable business.

Timestamps:
(1:33) - Principle 1: Latency over intensity
(7:34) - Principle 2: Criticism vs. insults
(19:50) - Principle 3: Future-focused feedback

Follow Leila Hormozi's Socials:
LinkedIn | Instagram | YouTube | Twitter | Acquisition