Part 2 of this series could easily have been renamed "AI for science: The expert's guide to practical machine learning." We continue our discussion with Christoph Molnar and Timo Freiesleben, looking at how scientists can apply the supervised machine learning techniques from the previous episode in their research.

Introduction to supervised ML for science (0:00)
Welcome back to Christoph Molnar and Timo Freiesleben, co-authors of "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box"

The model as the expert? (1:00)
Evaluation metrics have profound downstream effects on all modeling decisions
Data augmentation offers a simple yet powerful way to incorporate domain knowledge (see the sketch after these notes)
Domain expertise is often undervalued in data science despite being crucial

Measuring causality: Metrics and blind spots (10:10)
Causality approaches in ML range from exploring associations to inferring treatment effects

Connecting models to scientific understanding (18:00)
Interpretation methods must stay within realistic data distributions to yield meaningful insights

Robustness across distribution shifts (26:40)
Robustness requires understanding what distribution shifts affect your model
Pre-trained models and transfer learning provide promising paths to more robust scientific ML

Reproducibility challenges in ML and science (35:00)
Reproducibility challenges differ between traditional science and machine learning

Go back and listen to part one of this series for the conceptual foundations that support these practical applications.

Check out Christoph and Timo's book "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box," available online now.

What did you think? Let us know.

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
LinkedIn - Episode summaries, shares of cited articles, and more.
YouTube - Was it something that we said? Good. Share your favorite quotes.
Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
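The point above about data augmentation being a simple way to encode domain knowledge can be made concrete with a small sketch. This is our own illustration, not code from the book: it assumes image-like data whose labels are known, from domain expertise, to be invariant to flips and rotations (true for many microscopy or satellite images, not for digits or text).

```python
import numpy as np

def augment_with_symmetries(images, labels):
    """Expand a dataset with 90-degree rotations and horizontal flips.

    Encodes the domain assumption that class labels are invariant to
    these transformations.
    """
    aug_images, aug_labels = [], []
    for img, lab in zip(images, labels):
        for k in range(4):                      # 0, 90, 180, 270 degree rotations
            rot = np.rot90(img, k)
            aug_images.append(rot)
            aug_labels.append(lab)
            aug_images.append(np.fliplr(rot))   # mirrored variant
            aug_labels.append(lab)
    return np.stack(aug_images), np.array(aug_labels)

# Toy usage: 10 random 32x32 grayscale "images" become 80 training examples.
X = np.random.rand(10, 32, 32)
y = np.random.randint(0, 2, size=10)
X_aug, y_aug = augment_with_symmetries(X, y)
print(X_aug.shape, y_aug.shape)  # (80, 32, 32) (80,)
```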
// SPONSORS //
The Farm at Okefenokee: https://okefarm.com/
iCoin: https://icointechnology.com/breedlove
Heart and Soil Supplements (use discount code BREEDLOVE): https://heartandsoil.co/
In Wolf's Clothing: https://wolfnyc.com/
Blockware Solutions: https://mining.blockwaresolutions.com/breedlove
On Ramp: https://onrampbitcoin.com/?grsf=breedlove
Mindlab Pro: https://www.mindlabpro.com/breedlove
Coinbits: https://coinbits.app/breedlove

// PRODUCTS I ENDORSE //
Protect your mobile phone from SIM swap attacks: https://www.efani.com/breedlove
Noble Protein (discount code BREEDLOVE for 15% off): https://nobleorigins.com/
Lineage Provisions (use discount code BREEDLOVE): https://lineageprovisions.com/?ref=breedlove_22
Colorado Craft Beef (use discount code BREEDLOVE): https://coloradocraftbeef.com/

// SUBSCRIBE TO THE CLIPS CHANNEL //
https://www.youtube.com/@robertbreedloveclips2996/videos

// OUTLINE //
0:00 - WiM Episode Trailer
1:22 - Nassim Taleb's "Anti-Fragile"
11:36 - Why Anti-Fragile?
19:15 - The Farm at Okefenokee
20:25 - iCoin Bitcoin Wallet
21:55 - The Central Triad: Fragile, Robust, and Anti-Fragile
26:45 - Learning
31:05 - Skin in the Game: Decisions and Risk
36:28 - Heart and Soil Supplements
37:28 - Helping Lightning Startups with In Wolf's Clothing
38:20 - Skin in the Game and Anti-Fragility
41:34 - Academia and WTF Happened in 1971
47:54 - Central Triad Explained
57:43 - Mine Bitcoin with Blockware Solutions
59:05 - OnRamp Bitcoin Custody
1:00:28 - Modern Problems: Health, Big Pharma, Raising Children
1:12:28 - Surviving Scarcity and Weimar Hyperinflation
1:18:17 - Fragile vs Anti-Fragile Systems (Centralization vs Free Markets)
1:20:44 - Mind Lab Pro Supplements
1:21:53 - Buy Bitcoin with Coinbits
1:23:22 - Hormesis and Iatrogenics: The Benefits of Adversity
1:29:46 - Money Printing and Anti-Fragility
1:32:54 - Alcoholism and Money Printing
1:36:22 - Have Agency: Humans are Programmable
1:40:20 - Closing Thoughts

// PODCAST //
Podcast Website: https://whatismoneypodcast.com/
Apple Podcast: https://podcasts.apple.com/us/podcast/the-what-is-money-show/id1541404400
Spotify: https://open.spotify.com/show/25LPvm8EewBGyfQQ1abIsE
RSS Feed: https://feeds.simplecast.com/MLdpYXYI

// SUPPORT THIS CHANNEL //
Bitcoin: 3D1gfxKZKMtfWaD1bkwiR6JsDzu6e9bZQ7
Sats via Strike: https://strike.me/breedlove22
Dollars via Paypal: https://www.paypal.com/paypalme/RBreedlove
Dollars via Venmo: https://account.venmo.com/u/Robert-Breedlove-2
Featured Attorney Panel
Peter Fitzgerald - Partner at Angelina and Herrick www.ah-lawyers.com/
William Lohman - Head of Real Estate Law @ Castle Law www.castlelaw.com
Mike Bradt - Law Offices of Michael Bradt https://mbradtlaw.com/

The Multi-Board Residential Real Estate Contract has been updated to version 8.0. Stay ahead of the curve in this session, which provides an in-depth overview of the changes to the new contract - equipping you with the knowledge and tools to navigate these changes confidently and effectively. This powerful session provides a comprehensive overview of the new contract's key elements, reformatted structure, legal considerations, and potential impact on transactions. And our expert panel will guide you through real-world scenarios, practical tips, and critical updates to help you master the revisions and minimize risks for your clients and business. A must-watch session for all REALTORS.

Introduction and Welcome (00:00:00)
Panelist Introductions (00:01:39)
History of the 8.0 Contract (00:04:07)
Feedback on Contract Changes (00:09:00)
Key Enhancements to the 8.0 Contract (00:10:58)
Impact of Organizational Changes (00:12:10)
Transparency in Broker Commissions (00:14:21)
Understanding Property Types (00:15:10)
Robustness of Buyer Broker Agreements (00:16:24)
Value of Buyer Agents (00:17:14)
Expanded Personal Property Paragraph (00:18:03)
Impact of Solar Panels (00:18:39)
Clarification of Notice Options (00:20:42)
Homeowner and Flood Insurance Changes (00:21:33)
As Is Clause Clarification (00:24:29)
Concerns with As Is Paragraph (00:25:49)
Health and Safety Exclusions (00:28:01)
Negotiating As Is Transactions (00:30:50)
Risks of Making Requests (00:33:04)
Opportunities for Buyers (00:34:53)
Investor Strategies (00:35:14)
Offer Games (00:36:09)
Expectations in Real Estate (00:37:18)
Inspection Paragraphs (00:38:07)
Knowledge Representation Changes (00:39:00)
Escrow Language Changes (00:41:08)
Performance Clause Expansion (00:42:56)
Cancellation of Prior Contracts (00:44:17)
Alternative Energy Paragraph (00:48:24)
Inconsistent Representation Language (00:49:43)
The Cost of Solar Panels (00:52:08)
Seller's Preparation (00:52:34)
Contract Addendum Overview (00:52:53)
Reverse Contingency Addendum (00:53:05)
Repair Addendum Insights (00:53:40)
Organizing the Addendum List (00:53:54)
Notice and Response Form (00:54:18)
Appraisal Addendum Discussion (00:54:29)
Broker's Role in Transactions (00:56:47)
Intent to Escalate (00:58:42)
Multi-Unit Addendum (01:00:00)
Repair Request Agreement (01:00:20)
Reverse Contingency in Seller's Market (01:01:11)
Introduction of Land Trust Team (01:02:14)
Real Estate Fraud Session Follow-up (01:03:09)
Closing Remarks (01:03:17)
Escrow Discussion (01:04:20)
Valuing Time in Escrow Management (01:05:38)
Final Goodbyes (01:06:21)
Dr. Max Bartolo from Cohere discusses machine learning model development, evaluation, and robustness. Key topics include model reasoning, the DynaBench platform for dynamic benchmarking, data-centric AI development, model training challenges, and the limitations of human feedback mechanisms. The conversation also covers technical aspects like influence functions, model quantization, and the PRISM project.

Max Bartolo (Cohere):
https://www.maxbartolo.com/
https://cohere.com/command

TRANSCRIPT:
https://www.dropbox.com/scl/fi/vujxscaffw37pqgb6hpie/MAXB.pdf?rlkey=0oqjxs5u49eqa2m7uaol64lbw&dl=0

TOC:
1. Model Reasoning and Verification
[00:00:00] 1.1 Model Consistency and Reasoning Verification
[00:03:25] 1.2 Influence Functions and Distributed Knowledge Analysis
[00:10:28] 1.3 AI Application Development and Model Deployment
[00:14:24] 1.4 AI Alignment and Human Feedback Limitations
2. Evaluation and Bias Assessment
[00:20:15] 2.1 Human Evaluation Challenges and Factuality Assessment
[00:27:15] 2.2 Cultural and Demographic Influences on Model Behavior
[00:32:43] 2.3 Adversarial Examples and Model Robustness
3. Benchmarking Systems and Methods
[00:41:54] 3.1 DynaBench and Dynamic Benchmarking Approaches
[00:50:02] 3.2 Benchmarking Challenges and Alternative Metrics
[00:50:33] 3.3 Evolution of Model Benchmarking Methods
[00:51:15] 3.4 Hierarchical Capability Testing Framework
[00:52:35] 3.5 Benchmark Platforms and Tools
4. Model Architecture and Performance
[00:55:15] 4.1 Cohere's Model Development Process
[01:00:26] 4.2 Model Quantization and Performance Evaluation
[01:05:18] 4.3 Reasoning Capabilities and Benchmark Standards
[01:08:27] 4.4 Training Progression and Technical Challenges
5. Future Directions and Challenges
[01:13:48] 5.1 Context Window Evolution and Trade-offs
[01:22:47] 5.2 Enterprise Applications and Future Challenges

REFS:
[00:03:10] Research at Cohere with Laura Ruis et al., Max Bartolo https://cohere.com/research/papers/procedural-knowledge-in-pretraining-drives-reasoning-in-large-language-models-2024-11-20
[00:04:15] Influence functions in machine learning, Koh & Liang https://arxiv.org/abs/1703.04730
[00:08:05] Studying Large Language Model Generalization with Influence Functions, Roger Grosse et al. https://storage.prod.researchhub.com/uploads/papers/2023/08/08/2308.03296.pdf
[00:11:10] The LLM ARChitect: Solving ARC-AGI Is A Matter of Perspective, Daniel Franzen, Jan Disselhoff, and David Hartmann https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:12:10] Hugging Face model repo for C4AI Command A, Cohere and Cohere For AI https://huggingface.co/CohereForAI/c4ai-command-a-03-2025
[00:13:30] OpenInterpreter https://github.com/KillianLucas/open-interpreter
[00:16:15] Human Feedback is not Gold Standard, Tom Hosking, Max Bartolo, Phil Blunsom https://arxiv.org/abs/2309.16349
[00:27:15] The PRISM Alignment Dataset, Hannah Kirk et al. https://arxiv.org/abs/2404.16019
[00:32:50] How adversarial examples arise, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry https://arxiv.org/abs/1905.02175
[00:43:00] DynaBench platform paper, Douwe Kiela et al. https://aclanthology.org/2021.naacl-main.324.pdf
[00:50:15] Sara Hooker's work on compute limitations, Sara Hooker https://arxiv.org/html/2407.05694v1
[00:53:25] DataPerf: Community-led benchmark suite, Mazumder et al. https://arxiv.org/abs/2207.10062
[01:04:35] DROP, Dheeru Dua et al. https://arxiv.org/abs/1903.00161
[01:07:05] GSM8k, Cobbe et al. https://paperswithcode.com/sota/arithmetic-reasoning-on-gsm8k
[01:09:30] ARC, François Chollet https://github.com/fchollet/ARC-AGI
[01:15:50] Command A, Cohere https://cohere.com/blog/command-a
[01:22:55] Enterprise search using LLMs, Cohere https://cohere.com/blog/commonly-asked-questions-about-search-from-coheres-enterprise-customers
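For listeners unfamiliar with the influence functions referenced at [00:04:15], the core quantity from the Koh & Liang paper linked above estimates how much up-weighting a training point z would change the loss on a test point z_test. The sketch below states it in standard notation (L is the loss, theta-hat the fitted parameters, H the Hessian of the training loss); this is a reference formula from that paper, not from the episode transcript:

```latex
% Influence of up-weighting training point z on the loss at z_test (Koh & Liang, 2017)
\mathcal{I}_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\,\nabla_\theta L(z_{\mathrm{test}}, \hat{\theta})^{\top}
      H_{\hat{\theta}}^{-1}\,
      \nabla_\theta L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla^2_\theta L(z_i, \hat{\theta})
```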
Man in his pomp yet without understanding is like the beasts that perish. - Psalm 49:20

This Episode's Links and Timestamps:
00:00 – Scripture Reading
02:59 – Introduction
10:41 – Commentary on Psalm 49
34:06 – Reviewing 'Antifragile' by Nassim Taleb
1:12:12 – Robustness and Antifragility
1:22:06 – Interventionism and Iatrogenics
1:29:36 – Ethics and Foundational Asymmetry
Daniel Franzen and Jan Disselhoff, the "ARChitects", are the official winners of the ARC Prize 2024. Filmed at Tufa Labs in Zurich, they revealed how they achieved a remarkable 53.5% accuracy by creatively utilising large language models (LLMs) in new ways. Discover their innovative techniques, including depth-first search for token selection, test-time training, and a novel augmentation-based validation system. Their results were extremely surprising.

SPONSOR MESSAGES:
***
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting! https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Go to https://tufalabs.ai/
***

Jan Disselhoff: https://www.linkedin.com/in/jan-disselhoff-1423a2240/
Daniel Franzen: https://github.com/da-fr
ARC Prize: http://arcprize.org/

TRANSCRIPT AND BACKGROUND READING:
https://www.dropbox.com/scl/fi/utkn2i1ma79fn6an4yvjw/ARCHitects.pdf?rlkey=67pe38mtss7oyhjk2ad0d2aza&dl=0

TOC
1. Solution Architecture and Strategy Overview
[00:00:00] 1.1 Initial Solution Overview and Model Architecture
[00:04:25] 1.2 LLM Capabilities and Dataset Approach
[00:10:51] 1.3 Test-Time Training and Data Augmentation Strategies
[00:14:08] 1.4 Sampling Methods and Search Implementation
[00:17:52] 1.5 ARC vs Language Model Context Comparison
2. LLM Search and Model Implementation
[00:21:53] 2.1 LLM-Guided Search Approaches and Solution Validation
[00:27:04] 2.2 Symmetry Augmentation and Model Architecture
[00:30:11] 2.3 Model Intelligence Characteristics and Performance
[00:37:23] 2.4 Tokenization and Numerical Processing Challenges
3. Advanced Training and Optimization
[00:45:15] 3.1 DFS Token Selection and Probability Thresholds
[00:49:41] 3.2 Model Size and Fine-tuning Performance Trade-offs
[00:53:07] 3.3 LoRA Implementation and Catastrophic Forgetting Prevention
[00:56:10] 3.4 Training Infrastructure and Optimization Experiments
[01:02:34] 3.5 Search Tree Analysis and Entropy Distribution Patterns

REFS
[00:01:05] Winning ARC 2024 solution using 12B param model, Franzen, Disselhoff, Hartmann https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:03:40] Robustness of analogical reasoning in LLMs, Melanie Mitchell https://arxiv.org/html/2411.14215
[00:07:50] Re-ARC dataset generator for ARC task variations, Michael Hodel https://github.com/michaelhodel/re-arc
[00:15:00] Analysis of search methods in LLMs (greedy, beam, DFS), Chen et al. https://arxiv.org/html/2408.00724v2
[00:16:55] Language model reachability space exploration, University of Toronto https://www.youtube.com/watch?v=Bpgloy1dDn0
[00:22:30] GPT-4 guided code solutions for ARC tasks, Ryan Greenblatt https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt
[00:41:20] GPT tokenization approach for numbers, OpenAI https://platform.openai.com/docs/guides/text-generation/tokenizer-examples
[00:46:25] DFS in AI search strategies, Russell & Norvig https://www.amazon.com/Artificial-Intelligence-Modern-Approach-4th/dp/0134610997
[00:53:10] Paper on catastrophic forgetting in neural networks, Kirkpatrick et al. https://www.pnas.org/doi/10.1073/pnas.1611835114
[00:54:00] LoRA for efficient fine-tuning of LLMs, Hu et al. https://arxiv.org/abs/2106.09685
[00:57:20] NVIDIA H100 Tensor Core GPU specs, NVIDIA https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/
[01:04:55] Original MCTS in computer Go, Yifan Jin https://stanford.edu/~rezab/classes/cme323/S15/projects/montecarlo_search_tree_report.pdf
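The "DFS token selection with probability thresholds" covered at [00:45:15] can be illustrated with a minimal sketch. This is our own toy illustration, not the ARChitects' code: it assumes a `next_token_logprobs(prefix)` callable returning per-token log-probabilities from some language model, and it enumerates every completion whose cumulative probability stays above a fixed threshold.

```python
import math

def dfs_sample(next_token_logprobs, prefix, eos_token, min_logprob, max_len=64):
    """Depth-first enumeration of completions whose total log-probability
    stays above `min_logprob` (i.e. cumulative probability above a threshold).

    next_token_logprobs(prefix) -> dict mapping token -> log-probability.
    Returns (completion, logprob) pairs, most probable first; branches that
    hit max_len without producing eos_token are simply dropped.
    """
    results = []

    def expand(seq, logprob):
        if seq and seq[-1] == eos_token:
            results.append((seq, logprob))
            return
        if len(seq) >= max_len:
            return
        for token, lp in next_token_logprobs(seq).items():
            if logprob + lp >= min_logprob:      # prune low-probability branches
                expand(seq + [token], logprob + lp)

    expand(list(prefix), 0.0)
    return sorted(results, key=lambda r: -r[1])

# Toy usage with a hand-written "model" over tokens A, B and <eos>.
def toy_model(seq):
    return {"A": math.log(0.6), "B": math.log(0.3), "<eos>": math.log(0.1)}

for completion, lp in dfs_sample(toy_model, [], "<eos>", min_logprob=math.log(0.05), max_len=4):
    print(completion, round(math.exp(lp), 3))
```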
Vinu Sankar Sadasivan is a CS PhD ... currently working as a full-time Student Researcher at Google DeepMind on jailbreaking multimodal AI models.

Robustness, Detectability, and Data Privacy in AI // MLOps Podcast #289 with Vinu Sankar Sadasivan, Student Researcher at Google DeepMind.

// Abstract
Recent rapid advancements in Artificial Intelligence (AI) have made it widely applicable across various domains, from autonomous systems to multimodal content generation. However, these models remain susceptible to significant security and safety vulnerabilities. Such weaknesses can enable attackers to jailbreak systems, allowing them to perform harmful tasks or leak sensitive information. As AI becomes increasingly integrated into critical applications like autonomous robotics and healthcare, the importance of ensuring AI safety is growing. Understanding the vulnerabilities in today's AI systems is crucial to addressing these concerns.

// Bio
Vinu Sankar Sadasivan is a final-year Computer Science PhD candidate at The University of Maryland, College Park, advised by Prof. Soheil Feizi. His research focuses on Security and Privacy in AI, with a particular emphasis on AI robustness, detectability, and user privacy. Currently, Vinu is a full-time Student Researcher at Google DeepMind, working on jailbreaking multimodal AI models. Previously, Vinu was a Research Scientist intern at Meta FAIR in Paris, where he worked on AI watermarking. Vinu is a recipient of the 2023 Kulkarni Fellowship and has earned several distinctions, including the prestigious Director's Silver Medal. He completed a Bachelor's degree in Computer Science & Engineering at IIT Gandhinagar in 2020. Prior to his PhD, Vinu gained research experience as a Junior Research Fellow in the Data Science Lab at IIT Gandhinagar and through internships at Caltech, Microsoft Research India, and IISc.

// MLOps Swag/Merch
https://shop.mlops.community/

// Related Links
Website: https://vinusankars.github.io/

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Richard on LinkedIn: https://www.linkedin.com/in/vinusankars/
Analysis of image classifiers demonstrates that it is possible to understand backprop networks at the task-relevant run-time algorithmic level. In these systems, at least, networks gain their power from deploying massive parallelism to check for the presence of a vast number of simple, shallow patterns. https://betterwithout.ai/images-surface-features

This episode has a lot of links:
David Chapman's earliest public mention, in February 2016, of image classifiers probably using color and texture in ways that "cheat": twitter.com/Meaningness/status/698688687341572096
Jordana Cepelewicz's "Where we see shapes, AI sees textures," Quanta Magazine, July 1, 2019: https://www.quantamagazine.org/where-we-see-shapes-ai-sees-textures-20190701/
"Suddenly, a leopard print sofa appears," May 2015: https://web.archive.org/web/20150622084852/http://rocknrollnerd.github.io/ml/2015/05/27/leopard-sofa.html
"Understanding How Image Quality Affects Deep Neural Networks," April 2016: https://arxiv.org/abs/1604.04004
Goodfellow et al., "Explaining and Harnessing Adversarial Examples," December 2014: https://arxiv.org/abs/1412.6572
"Universal adversarial perturbations," October 2016: https://arxiv.org/pdf/1610.08401v1.pdf
"Exploring the Landscape of Spatial Robustness," December 2017: https://arxiv.org/abs/1712.02779
"Overinterpretation reveals image classification model pathologies," NeurIPS 2021: https://proceedings.neurips.cc/paper/2021/file/8217bb4e7fa0541e0f5e04fea764ab91-Paper.pdf
"Approximating CNNs with Bag-of-Local-Features Models Works Surprisingly Well on ImageNet," ICLR 2019: https://openreview.net/forum?id=SkfMWhAqYQ
Baker et al.'s "Deep convolutional networks do not classify based on global object shape," PLOS Computational Biology, 2018: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006613
François Chollet's Twitter threads about AI producing images of horses with extra legs: twitter.com/fchollet/status/1573836241875120128 and twitter.com/fchollet/status/1573843774803161090
"Zoom In: An Introduction to Circuits," 2020: https://distill.pub/2020/circuits/zoom-in/
Geirhos et al., "ImageNet-Trained CNNs Are Biased Towards Texture; Increasing Shape Bias Improves Accuracy and Robustness," ICLR 2019: https://openreview.net/forum?id=Bygh9j09KX
Dehghani et al., "Scaling Vision Transformers to 22 Billion Parameters," 2023: https://arxiv.org/abs/2302.05442
Hasson et al., "Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks," February 2020: https://www.gwern.net/docs/ai/scaling/2020-hasson.pdf
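Since the Goodfellow et al. adversarial-examples paper is linked above, here is a minimal sketch of its fast gradient sign method (FGSM) for readers who want to see the one-line core of the attack. This is an illustrative toy (random model and input, PyTorch assumed available), not code from the episode:

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, label, epsilon):
    """Fast Gradient Sign Method (Goodfellow et al., 2014):
    perturb the input in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # One step of size epsilon along the sign of the input gradient.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.detach()

# Toy usage: a random linear "classifier" on a random image-sized input.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(1, 3, 32, 32)
label = torch.tensor([3])
x_adv = fgsm_attack(model, x, label, epsilon=0.03)
print((x_adv - x).abs().max())  # perturbation is bounded by epsilon
```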
In the 19th SIMM-podcast episode we listen in on a conversation between biologist Olivier Hamant (Institut Michel Serre) and musician Jean-Luc Plouvier (Ictus Ensemble) about ideas and experiences they share with each other from their work on the natural world and the world of music-making. SIMM-founder Lukas Pairon interviews them.

Referenced during this podcast-episode: Philippe Boesmans, John Cage, "Un homme ça s'empêche" (Albert Camus), eco-anxiety, free jazz, Philip Glass, Glenn Gould, Goodhart's law, Olivier Hamant's 'Antidote to the cult of performance', Olivier Hamant's 'De l'incohérence - philosophie politique de la robustesse', Olivier Hamant's 'La troisième voix du vivant', Ictus Ensemble, Steve Reich's 'Music for 18 Musicians', Ircam, robustness, serendipity, Michel Serre Institut, Reich's psychoacoustic by-products of repetition and phase-shifting, Simon Sinek's conference on 'The Infinite Game', systems thinking, Gertrude Stein, stochastic processes in biology.

The transcription of this episode can be found here. During this episode music is briefly heard from Steve Reich's 'Music for 18 Musicians' (played by the Steve Reich Ensemble), as well as rehearsal recordings of the Kinshasa-based traditional drummers ensemble Beta Mbonda.
Modern-day supply chains are obsessed with the goal of becoming resilient. But is that enough?
Today, Dr. Patrick Schloss, Professor in the Department of Microbiology and Immunology in the School of Medicine at the University of Michigan, joins the #QualityQuorum to discuss how the human microbiome is studied, possible pitfalls in such data analysis, and what tools he and his coworkers have developed to lead toward repeatable, hypothesis-driven science.

Host: Mark O. Martin
Guest: Patrick Schloss
Subscribe: Apple Podcasts, Spotify
Become a patron of Matters Microbial!

Links for this episode:
An overview of how the gut microbiome is analyzed.
One of the articles discussed by Dr. Schloss exploring reproducibility in microbiome studies: "Identifying and Overcoming Threats to Reproducibility, Replicability, Robustness, and Generalizability in Microbiome Research."
Another article discussed by Dr. Schloss, regarding the link between the microbiome and obesity: "Looking for a Signal in the Noise: Revisiting Obesity and the Microbiome."
An article from Dr. Schloss' research team that explores a link between the human microbiome and a type of colorectal cancer.
A link to the MOTHUR project, used to analyze microbiome data.
A link to a video by Dr. Schloss: "Understanding Disease Through the Lens of the Microbiome."
Dr. Schloss' YouTube channel about data analysis.
Dr. Schloss' research group website.
Dr. Schloss' faculty website.
Intro music is by Reber Clark.
Send your questions and comments to mattersmicrobial@gmail.com
Where are your delicate and vulnerable spots that, when touched, cause a defensive reaction? In this thought-provoking episode, Ali explores the concept of fragility with Kelly Wendorf, founder of Equus and author of "Flying Lead Change." Kelly, a horse-wise woman and insightful author, unpacks how our most vulnerable spots trigger defensive reactions in both personal and professional settings. She illuminates the origins of our fragile responses, examining how shame triggers can impact our leadership and decision-making, and contrasts fragility's "power-over" mindset with the transformative potential of robustness.

The dialogue moves beyond identifying fragile behaviors to examine practical approaches for cultivating robustness. Kelly shares insights on developing a more open, accountable, and genuinely connected way of being in the world. Drawing from her extensive experience with equine wisdom, she offers unique perspectives on authenticity and wholeness, challenging conventional notions of power and control in both personal growth and leadership contexts.

Leave us a review on Apple Podcasts! Follow our step by step guides:
How To: Leave a Review on Your Computer
How To: Leave a Review on Your iPhone

Never miss an episode! Sign up for our newsletter to stay up to date on all our episode releases.

Fragility, Feedback, Conflict, Radical Self-inquiry
In this episode of Learning from Machine Learning, we explore the intersection of pure mathematics and modern data science with Leland McInnes, the mind behind an ecosystem of tools for unsupervised learning including UMAP, HDBSCAN, PyNN Descent and DataMapPlot. As a researcher at the Tutte Institute for Mathematics and Computing, McInnes has fundamentally shaped how we approach and understand complex data.

McInnes views data through a unique geometric lens, drawing from his background in algebraic topology to uncover hidden patterns and relationships within complex datasets. This perspective led to the creation of UMAP, a breakthrough in dimensionality reduction that preserves both local and global data structure to allow for incredible visualizations and clustering. Similarly, his clustering algorithm HDBSCAN tackles the messy reality of real-world data, handling varying densities and noise with remarkable effectiveness.

But perhaps what's most striking about McInnes isn't just his technical achievements – it's his philosophy toward algorithm development. He champions the concept of "decomposing black box algorithms," advocating for transparency and understanding over blind implementation. By breaking down complex algorithms into their fundamental components, McInnes argues, we gain the power to adapt and innovate rather than simply consume.

For those entering the field, McInnes offers poignant advice: resist the urge to chase the hype. Instead, find your unique angle, even if it seems unconventional. His own journey – applying concepts from algebraic topology and fuzzy simplicial sets to data science – demonstrates how breakthrough innovations often emerge from unexpected connections.

Throughout our conversation, McInnes's passion for knowledge and commitment to understanding shine through. His approach reminds us that the most powerful advances in data science often come not from following the crowd, but from diving deep into fundamentals and drawing connections across disciplines. There's immense value in understanding the tools you use, questioning established approaches, and bringing your unique perspective to the field. As McInnes shows us, sometimes the most significant breakthroughs come from seeing familiar problems through a new lens.

Resources for Leland McInnes:
Leland's Github
UMAP
HDBSCAN
PyNN Descent
DataMapPlot
EVoC

References:
Maarten Grootendorst
Learning from Machine Learning Episode 1
Vincent Warmerdam - Calmcode
Learning from Machine Learning Episode 2
Matt Rocklin
Emily Riehl - Category Theory in Context
Lorena Barba
David Spivak - Fuzzy Simplicial Sets
Improving Mapper's Robustness by Varying Resolution According to Lens-Space Density

Learning from Machine Learning:
Youtube
https://mindfulmachines.substack.com/
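For listeners who haven't used the tools discussed, a minimal sketch of the common UMAP-then-HDBSCAN workflow looks like the following. It assumes the `umap-learn` and `hdbscan` packages are installed and uses synthetic data purely for illustration; parameter values are typical defaults, not recommendations from the episode:

```python
import numpy as np
import umap          # pip install umap-learn
import hdbscan       # pip install hdbscan

# Synthetic high-dimensional data: three loose clusters in 50 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(200, 50)) for c in (0.0, 5.0, 10.0)])

# Reduce to 2D with UMAP, preserving local and global structure.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
embedding = reducer.fit_transform(X)

# Density-based clustering on the embedding; -1 labels mark noise points.
clusterer = hdbscan.HDBSCAN(min_cluster_size=25)
labels = clusterer.fit_predict(embedding)

print(embedding.shape)    # (600, 2)
print(np.unique(labels))  # cluster ids, e.g. [0 1 2], possibly with -1 for noise
```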
Join Nathan for an expansive conversation with Dan Hendrycks, Executive Director of the Center for AI Safety and Advisor to Elon Musk's XAI. In this episode of The Cognitive Revolution, we explore Dan's groundbreaking work in AI safety and alignment, from his early contributions to activation functions to his recent projects on AI robustness and governance. Discover insights on representation engineering, circuit breakers, and tamper-resistant training, as well as Dan's perspectives on AI's impact on society and the future of intelligence. Don't miss this in-depth discussion with one of the most influential figures in AI research and safety.

Check out some of Dan's research papers:
MMLU: https://arxiv.org/abs/2009.03300
GELU: https://arxiv.org/abs/1606.08415
Machiavelli Benchmark: https://arxiv.org/abs/2304.03279
Circuit Breakers: https://arxiv.org/abs/2406.04313
Tamper Resistant Safeguards: https://arxiv.org/abs/2408.00761
Statement on AI Risk: https://www.safe.ai/work/statement-on-ai-risk

Apply to join over 400 Founders and Execs in the Turpentine Network: https://www.turpentinenetwork.co/

SPONSORS:
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive.
LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake. Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr.
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Oracle: Oracle Cloud Infrastructure (OCI) is a single platform for your infrastructure, database, application development, and AI needs. OCI has four to eight times the bandwidth of other clouds, offers one consistent price, and nobody does data better than Oracle. If you want to do more and spend less, take a free test drive of OCI at https://oracle.com/cognitive

CHAPTERS:
(00:00:00) Teaser
(00:00:48) About the Show
(00:02:17) About the Episode
(00:05:41) Intro
(00:07:19) GELU Activation Function
(00:10:48) Signal Filtering
(00:12:46) Scaling Maximalism
(00:18:35) Sponsors: Shopify | LMNT
(00:22:03) New Architectures
(00:25:41) AI as Complex System
(00:32:35) The Machiavelli Benchmark
(00:34:10) Sponsors: Notion | Oracle
(00:37:20) Understanding MMLU Scores
(00:45:23) Reasoning in Language Models
(00:49:18) Multimodal Reasoning
(00:54:53) World Modeling and Sora
(00:57:07) Arc Benchmark and Hypothesis
(01:01:06) Humanity's Last Exam
(01:08:46) Benchmarks and AI Ethics
(01:13:28) Robustness and Jailbreaking
(01:18:36) Representation Engineering
(01:30:08) Convergence of Approaches
(01:34:18) Circuit Breakers
(01:37:52) Tamper Resistance
(01:49:10) Interpretability vs. Robustness
(01:53:53) Open Source and AI Safety
(01:58:16) Computational Irreducibility
(02:06:28) Neglected Approaches
(02:12:47) Truth Maxing and XAI
(02:19:59) AI-Powered Forecasting
(02:24:53) Chip Bans and Geopolitics
(02:33:30) Working at CAIS
(02:35:03) Extinction Risk Statement
(02:37:24) Outro
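Since the GELU activation (one of Dan's early contributions, linked above) comes up at (00:07:19), here is a minimal NumPy/SciPy sketch of it for reference. The exact form uses the Gaussian CDF; the tanh form is the common fast approximation from the paper. This is a reference illustration, not code discussed in the episode:

```python
import numpy as np
from scipy.stats import norm

def gelu_exact(x):
    """GELU(x) = x * Phi(x), where Phi is the standard normal CDF."""
    return x * norm.cdf(x)

def gelu_tanh_approx(x):
    """Common tanh-based approximation from the GELU paper."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.linspace(-3, 3, 7)
print(np.round(gelu_exact(x), 4))
print(np.round(gelu_tanh_approx(x), 4))  # close to the exact values
```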
Welcome to another episode of Category Visionaries — the show that explores GTM stories from tech's most innovative B2B founders. In today's episode, we're speaking with Saman Farid, CEO & Founder of Formic, a robotics platform that has raised over $60 Million in funding.

Here are the most interesting points from our conversation:
On-demand robot workforce: Formic provides factories with robots that can work on an hourly basis, aiming to solve labor shortages by making robotic solutions more accessible.
COVID and labor shortages: Saman founded Formic during the pandemic, driven by supply chain disruptions and the unfilled jobs crisis in manufacturing, which showed a need for better automation solutions.
From VC to founder: Saman transitioned back to being a founder because he believed in the massive potential of robotics. He saw a $10 billion opportunity in automating manufacturing tasks and felt compelled to take the leap.
Robustness over cutting-edge tech: Formic focuses on reliability, prioritizing proven, dependable robotics technology that can run two shifts daily without downtime, over pursuing the latest but less mature advancements.
Unique business model - "Pay for Productivity": Formic guarantees performance and productivity, only charging clients for the output they get, which has been a game-changer in driving faster adoption of robotics.
Scaling with $250M in financing: Formic uses a sophisticated capital stack to finance deployments, which allows them to offer a "try before you buy" model, making robotics more affordable and less risky for manufacturers.

// Sponsors:
Front Lines — We help B2B tech companies launch, manage, and grow podcasts that drive demand, awareness, and thought leadership. www.FrontLines.io
The Global Talent Co. — We help tech startups find, vet, hire, pay, and retain amazing marketing talent that costs 50-70% less than the US & Europe. www.GlobalTalent.co
In this episode, Kevin Werbach is joined by Reggie Townsend, VP of Data Ethics at SAS, a business analytics software company. Together they discuss SAS's nearly 50-year history of supporting business technology and the recent implementation of responsible AI initiatives. Reggie introduces model cards and the importance of variety in AI systems across diverse stakeholders and sectors. Reggie and Kevin explore the increase in both consumer trust and purchases when consumers feel a brand is ethical in its use of AI, and the importance of trustworthy AI in employee retention and recruitment. Their discussion approaches the idea of bias in an unconventional way, highlighting the positive humanistic nature of bias and learning to manage the negative implications. Finally, Reggie shares his insights on fostering ethical AI practices through literacy and open dialogue, stressing the importance of authentic commitment and collaboration among developers, deployers, and regulators.

SAS adds to its trustworthy AI offerings with model cards and AI governance services
Article by Reggie Townsend: Talking AI in Washington, DC

Reggie Townsend oversees the Data Ethics Practice (DEP) at SAS Institute. He leads the global effort for consistency and coordination of strategies that empower employees and customers to deploy data-driven systems that promote human well-being, agency and equity. He has over 20 years of experience in strategic planning, management, and consulting focusing on topics such as advanced analytics, cloud computing and artificial intelligence. With visibility across multiple industries and sectors where the use of AI is growing, he combines this extensive business and technology expertise with a passion for equity and human empowerment.

Want to learn more? Engage live with Professor Werbach and other Wharton faculty experts in Wharton's new Strategies for Accountable AI online executive education program. It's perfect for managers, entrepreneurs, and advisors looking to harness AI's power while addressing its risks.
On a special edition of the BioCentury This Week podcast, BioCentury's editors deliver their takeaways from the debut Grand Rounds conference, which focused on whether biotech can write a more successful playbook for translating from target to product. Weaving together takeaways from the panels, fireside chat and keynote at the conference, the editors assess the tensions between generalizability and fit-for-purpose models, between having control and capturing complexity, and, in human data, between scale and robustness/reliability, particularly for longitudinal readouts. The editors also discuss BioCentury's Q&A with USC Keck School's Patrick Lyden, who explained how high-quality, reproducible preclinical science can be feasible.

View full story: https://www.biocentury.com/article/653631

00:00 - Introduction
02:56 - Generalizability vs. Fit for Purpose
10:03 - Control vs. Complexity
17:57 - Scale vs. Robustness in Human Data
26:25 - Hypothesis-Driven vs. Unbiased Research
28:50 - Grand Rounds 2025

To submit a question to BioCentury's editors, email the BioCentury This Week team at podcasts@biocentury.com. Reach us by sending a text.
Episode 305 is with Lead YDP Physiotherapist at AFC Bournemouth, Aaron Hull.

We discussed:
▫️Moving from Australia
▫️Future of Physical Performance
▫️Studying S&C as a physio
▫️Running a successful MDT
& much more!

You can connect with Aaron on LinkedIn or via our Online Community.

If you enjoy this episode make sure to check out the previous episodes below:
Brett Bartholomew - https://youtu.be/W95WIZXl5u0
Gareth Sandford & Damien Harper - https://youtu.be/BQUYkihCeD8
Stu McMillan - https://youtu.be/ya5b3TCm9Ws

Keep up to date with the amazing work our sponsors are doing here:
Good Prep - https://thegoodprep.com - Discover the power of nutrition at WWW.THEGOODPREP.COM and use code FFF15 for 15% off your first order
Rezzil - rezzil.com
Hytro - hytro.com - Maximise your athletic potential with Hytro BFR. Easier, safer and more practical BFR for squads to prepare for and recover from exercise than ever before. Click the link [[ https://bit.ly/3ILVsbU ]] to speak to our Pro Sports team about how to get Hytro BFR at your club.

Join our online community & get access to the very best Football Fitness content as well as the ability to connect with Sport Scientists and Strength & Conditioning coaches from around the world. To get FULL access to all of these & even more like this, sign up for a FREE month on our online community at the link below.
www.footballfitfed.com/forum/index.aspx

Keep up to date with everything that is going on at Football Fitness Federation at the following links:
Twitter - @FootballFitFed
Instagram - @FootballFitFed
Website - www.footballfitfed.com
Andrew Ilyas is a PhD student at MIT who is about to start as a professor at CMU. We discuss: data modeling and understanding how datasets influence model predictions; adversarial examples in machine learning and why they occur; robustness in machine learning models; black box attacks on machine learning systems; biases in data collection and dataset creation, particularly in ImageNet; and self-selection bias in data and methods to address it.

MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api

Andrew's site: https://andrewilyas.com/
https://x.com/andrew_ilyas

TOC:
00:00:00 - Introduction and Andrew's background
00:03:52 - Overview of the machine learning pipeline
00:06:31 - Data modeling paper discussion
00:26:28 - TRAK: Evolution of data modeling work
00:43:58 - Discussion on abstraction, reasoning, and neural networks
00:53:16 - "Adversarial Examples Are Not Bugs, They Are Features" paper
01:03:24 - Types of features learned by neural networks
01:10:51 - Black box attacks paper
01:15:39 - Work on data collection and bias
01:25:48 - Future research plans and closing thoughts

References:
Adversarial Examples Are Not Bugs, They Are Features https://arxiv.org/pdf/1905.02175
Adversarial Examples Are Not Bugs, They Are Features https://gradientscience.org/adv/
TRAK: Attributing Model Behavior at Scale https://arxiv.org/pdf/2303.14186
Datamodels: Predicting Predictions from Training Data https://arxiv.org/pdf/2202.00622
IMAGENET-TRAINED CNNS https://arxiv.org/pdf/1811.12231
ZOO: Zeroth Order Optimization Based Black-box https://arxiv.org/pdf/1708.03999
A Spline Theory of Deep Networks https://proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf
Scaling Monosemanticity https://transformer-circuits.pub/2024/scaling-monosemanticity/
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies https://proceedings.mlr.press/v235/bartoldson24a.html
Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors https://arxiv.org/abs/1807.07978
Estimation of Standard Auction Models https://arxiv.org/abs/2205.02060
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks https://arxiv.org/abs/2005.11295
What Makes A Good Fisherman? Linear Regression under Self-Selection Bias https://arxiv.org/abs/2205.03246
Towards Tracing Factual Knowledge in Language Models Back to the Training Data [Akyürek] https://arxiv.org/pdf/2205.11482
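The "data modeling" idea discussed in the first part of the conversation (the Datamodels paper linked above) can be sketched very roughly: train the target model many times on random subsets of the training data, record some output of interest on a test example, then fit a linear model from the subset-inclusion mask to that output. The toy below substitutes a trivial "model" (the mean of the included values) for a real training run, purely to show the regression step; it is our own illustration, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_subsets = 50, 2000
train_values = rng.normal(size=n_train)   # stand-in for training examples

# 1) Sample random training subsets and record a "model output" for each.
masks = rng.random((n_subsets, n_train)) < 0.5               # which examples are included
outputs = np.array([train_values[m].mean() for m in masks])  # toy model output per subset

# 2) Fit a linear datamodel: output ~ masks @ weights + bias.
X = np.hstack([masks.astype(float), np.ones((n_subsets, 1))])
weights, *_ = np.linalg.lstsq(X, outputs, rcond=None)

# The learned weight for each training example estimates its influence on the
# test-time output; for this toy it should roughly track train_values.
print(np.corrcoef(weights[:-1], train_values)[0, 1])  # close to 1
```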
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed, published by johnswentworth on August 22, 2024 on The AI Alignment Forum.

This post walks through the math for a theorem. It's intended to be a reference post, which we'll link back to as-needed from future posts. The question which first motivated this theorem for us was: "Redness of a marker seems like maybe a natural latent over a bunch of parts of the marker, and redness of a car seems like maybe a natural latent over a bunch of parts of the car, but what makes redness of the marker 'the same as' redness of the car? How are they both instances of one natural thing, i.e. redness? (or 'color'?)". But we're not going to explain in this post how the math might connect to that use-case; this post is just the math.

Suppose we have multiple distributions $P_1, \ldots, P_k$ over the same random variables $X_1, \ldots, X_n$. (Speaking somewhat more precisely: the distributions are over the same set, and an element of that set is represented by values $(x_1, \ldots, x_n)$.) We take a mixture of the distributions: $P[X] := \sum_j \alpha_j P_j[X]$, where $\sum_j \alpha_j = 1$ and $\alpha$ is nonnegative.

Then our theorem says: if an approximate natural latent exists over $P[X]$, and that latent is robustly natural under changing the mixture weights $\alpha$, then the same latent is approximately natural over $P_j[X]$ for all $j$.

Mathematically: the natural latent over $P[X]$ is defined by $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x]$, and naturality means that the distribution $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \, P[X = x]$ satisfies the naturality conditions (mediation and redundancy). The theorem says that, if the joint distribution $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \sum_j \alpha_j P_j[X = x]$ satisfies the naturality conditions robustly with respect to changes in $\alpha$, then $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \, P_j[X = x]$ satisfies the naturality conditions for all $j$.

"Robustness" here can be interpreted in multiple ways - we'll cover two here, one for which the theorem is trivial and another more substantive, but we expect there are probably more notions of "robustness" which also make the theorem work.

Trivial Version

First notion of robustness: the joint distribution $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \sum_j \alpha_j P_j[X = x]$ satisfies the naturality conditions to within $\epsilon$ for all values of $\alpha$ (subject to $\sum_j \alpha_j = 1$ and $\alpha$ nonnegative).

Then: the joint distribution satisfies the naturality conditions to within $\epsilon$ specifically for $\alpha_j = \delta_{jk}$, i.e. $\alpha$ which is 0 in all entries except a 1 in entry $k$. In that case, the joint distribution is $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \, P_k[X = x]$, therefore $\Lambda$ is natural over $P_k$. Invoke for each $k$, and the theorem is proven.

... but that's just abusing an overly-strong notion of robustness. Let's do a more interesting one.

Nontrivial Version

Second notion of robustness: the joint distribution $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \sum_j \alpha_j P_j[X = x]$ satisfies the naturality conditions to within $\epsilon$, and the gradient of the approximation error with respect to (allowed) changes in $\alpha$ is (locally) zero.

We need to prove that the joint distributions $(x, \lambda) \mapsto P[\Lambda = \lambda \mid X = x] \, P_j[X = x]$ satisfy both the mediation and redundancy conditions for each $j$. We'll start with redundancy, because it's simpler.

Redundancy

We can express the approximation error of the redundancy condition with respect to $X_i$ under the mixed distribution as

$$D_{KL}(P[\Lambda, X] \,\|\, P[X] P[\Lambda \mid X_i]) = \mathbb{E}_X\left[D_{KL}(P[\Lambda \mid X] \,\|\, P[\Lambda \mid X_i])\right]$$

where, recall, $P[\Lambda, X] := P[\Lambda \mid X] \sum_j \alpha_j P_j[X]$.

We can rewrite that approximation error as:

$$\mathbb{E}_X\left[D_{KL}(P[\Lambda \mid X] \,\|\, P[\Lambda \mid X_i])\right] = \sum_X \sum_j \alpha_j P_j[X] \, D_{KL}(P[\Lambda \mid X] \,\|\, P[\Lambda \mid X_i]) = \sum_j \alpha_j \, \mathbb{E}^j_X\left[D_{KL}(P[\Lambda \mid X] \,\|\, P[\Lambda \mid X_i])\right]$$

Note that $P_j[\Lambda \mid X] = P[\Lambda \mid X]$ is the same under all the distributions (by definition), so:

$$= \sum_j \alpha_j \, D_{KL}(P_j[\Lambda, X] \,\|\, P_j[X] P[\Lambda \mid X_i])$$

and by factorization transfer:

$$\geq \sum_j \alpha_j \, D_{KL}(P_j[\Lambda, X] \,\|\, P_j[X] P_j[\Lambda \mid X_i])$$

In other words: if $\epsilon_{ji}$ is the redundancy error with respect to $X_i$ under distribution $j$, and $\epsilon_i$ is the redundancy error with respect to $X_i$ under the mixed distribution $P$, then

$$\epsilon_i \geq \sum_j \alpha_j \epsilon_{ji}$$

The redundancy error of the mixed distribution is a...
Can your AI models survive a big disaster? While a recent major IT incident with CrowdStrike wasn't AI related, the magnitude and reaction reminded us that no system, no matter how proven, is immune to failure. AI modeling systems are no different. Neglecting the best practices of building models can lead to unrecoverable failures. Discover how the three-tiered framework of robustness, resiliency, and antifragility can guide your approach to creating AI infrastructures that not only perform reliably under stress but also fail gracefully when the unexpected happens.

Show Notes

Technology, incidents, and why basics matter (00:00:03)
While the recent CrowdStrike incident wasn't caused by AI, its impact was a wake-up call for the people and processes that support critical systems.
As AI is increasingly being used at both experimental and production levels, we can expect AI incidents are a matter of if, not when. What can you do to prepare?

The "7Ps": Are you capable of handling the unexpected? (00:09:05)
The 7Ps is an adage, dating back to WWII, that aligns with our "do things the hard way" approach to AI governance and modeling systems.
Let's consider the levels of building a performant system: robustness, resiliency, and antifragility.

Model robustness (00:10:03)
Robustness is a very important but often overlooked component of building modeling systems. We suspect that part of the problem is due to:
The Kaggle-driven upbringing of data scientists
Assumed generalizability of modeling systems, when models are optimized to perform well on their training data but do not generalize enough to perform well on unseen data. (A minimal robustness check along these lines is sketched after these notes.)

Model resilience (00:16:10)
Resiliency is the ability to absorb adverse stimuli without destruction and return to the pre-event state.
In practice, robustness and resiliency testing and planning are often easy components to leave out. This is where risks and threats are exposed.
See also, Episode 8. Model validation: Robustness and resilience

Models and antifragility (00:25:04)
Unlike resiliency, which is the ability to absorb damaging inputs without breaking, antifragility is the ability of a system to improve from challenging stimuli (i.e., the human body).
A key question we need to ask ourselves: if we are not actively building our AI systems to be antifragile, why are we using AI systems at all?

What did you think? Let us know.

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
LinkedIn - Episode summaries, shares of cited articles, and more.
YouTube - Was it something that we said? Good. Share your favorite quotes.
Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
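As flagged in the notes above, one simple way to probe the gap between training-set performance and robustness is to compare a model's accuracy on clean held-out data against its accuracy on perturbed copies of that data. This is a generic sketch of the idea (scikit-learn assumed, synthetic data, Gaussian noise standing in for a distribution shift), not the hosts' own tooling:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic tabular data and a standard train/test split.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Robustness probe: accuracy on clean test data vs. increasingly perturbed copies.
rng = np.random.default_rng(0)
for noise in [0.0, 0.5, 1.0, 2.0]:
    X_shift = X_te + rng.normal(scale=noise, size=X_te.shape)
    print(f"noise={noise:.1f}  accuracy={model.score(X_shift, y_te):.3f}")
```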
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Fields that I reference when thinking about AI takeover prevention, published by Buck Shlegeris on August 13, 2024 on The AI Alignment Forum.

Is AI takeover like a nuclear meltdown? A coup? A plane crash?

My day job is thinking about safety measures that aim to reduce catastrophic risks from AI (especially risks from egregious misalignment). The two main themes of this work are the design of such measures (what's the space of techniques we might expect to be affordable and effective) and their evaluation (how do we decide which safety measures to implement, and whether a set of measures is sufficiently robust). I focus especially on AI control, where we assume our models are trying to subvert our safety measures and aspire to find measures that are robust anyway. Like other AI safety researchers, I often draw inspiration from other fields that contain potential analogies. Here are some of those fields, my opinions on their strengths and weaknesses as analogies, and some of my favorite resources on them.

Robustness to insider threats

The single field that seems most closely analogous to AI control is reducing vulnerability to threats resulting from insiders, especially threats that arise from access to IT systems. That is, you have a bunch of employees who need to have access to various IT systems in order to do their jobs, and you're worried that some of these employees will abuse their access in order to accomplish some nefarious purpose. In a bank, you might worry about your tellers withdrawing money from customer accounts inappropriately. In a software company, you might worry about your employees sabotaging your code out of spite, violating user privacy for fun, or stealing your data to give it to a competitor.

This is quite similar to the problem faced by AI developers when they're trying to control AIs (especially in internal deployments, which I argue is important here): All of the access the AI has is access the developer decided to give it; the developer faces a constant tradeoff between the productivity of the AI (which is maximized by being permissive) and the danger it poses (which is minimized by being restrictive). And obviously, because it's an AI it only has access to resources that are connected to the computer network where it's running, so we can restrict our attention to the IT resources. It's easy for the developer to constantly surveil the AI, which substantially reduces the space of strategies the AI can take. The developer has a lot of control over workflows, and is able to choose them to make it harder for the AI to subtly cause problems. For example, they can restrict affordances in ways that make it harder for AIs to cause problems without being noticed. An example of a similar technique in the insider threat prevention case is the use of administrative APIs instead of direct access to a system, as described in "Building Secure and Reliable Systems" chapter 5. That is: instead of having your engineers resolve issues by SSHing into production machines (which implicitly grants them very broad access that can lead to problems, intentional or accidental), you design an administrative API which allows the engineers to access just the affordances that usually suffice for them to do their job.
Of course, they'll sometimes need to take actions that aren't available in the administrative API; this is called a breakglass mechanism. But because breakglass actions are rare, it's affordable to apply more scrutiny to them: you might use multi-party authorization, requiring another engineer to sign off on the commands before they take effect, or you might do auditing (e.g. apparently some SRE teams review all the breakglass actions at a weekly team meeting, which allows them to spot irregularities and discourage gratuitous use of the mechanism). Similar t...
Podcast: Emerson Automation Experts
Episode: Building Cybersecurity Robustness in Pipeline Operations Podcast
Pub date: 2024-07-25

Manufacturers and producers across all industries know the challenges in keeping their operations cyber-secure. Industries such as pipeline transportation and electrical & gas distribution networks face additional challenges in the wide geographic spread of their operations and the need for reliance on public communications networks. In this podcast, I'm joined by Emerson cybersecurity expert Steve Hill to discuss these additional challenges and ways the companies in these industries, suppliers, and federal regulators are collaborating to develop and implement best practices for strong cyber resiliency.

Give the podcast a listen and visit the SCADA Solutions & Software for Energy Logistics on Emerson.com and the AspenTech Digital Grid Management page for methods and solutions to improve your cybersecurity defenses and ongoing programs.

Transcript

Jim: Hi, everyone. This is Jim Cahill with another "Emerson Automation Experts" podcast. Pipelines cover a wide geographic area and require continuous monitoring for safe, efficient, and reliable operations. Today, I'm joined by Steve Hill to discuss the challenges pipeline operators face in keeping their pipeline networks cybersecure. Welcome to the podcast, Steve.

Steve: Thanks, Jim. Pleasure to be here.

Jim: Well, it's great to have you. I guess, let's get started by asking you to share your background and path to your current role here with us at Emerson.

Steve: Thanks, yeah. I've been in the automation and SCADA industry for about 40 years, started on the hardware design and communications and then moved over to software. And it's nearly 20 years I've been with Emerson. I joined as part of the Bristol Babcock acquisition. My main focus now is working in wide-area SCADA as the director of SCADA Solutions for Emerson, and most of that's working in the oil and gas industry, working with Emerson sales and the engineering teams and our customers as they design systems and products for the industry. And also, alongside that, for the last few years, I've been collaborating with CISA. That's the U.S. government Cybersecurity and Infrastructure Security Agency, as part of the Joint Cyber Defense Collaborative.

Jim: Okay. That's a nice, varied background. That's really good for our discussion. So, what exactly do you mean by wide-area SCADA?

Steve: That's a great question. That's a SCADA system where the software is monitoring equipment across a very wide area. It might be a very large geographic area, like a pipeline or gas or water distribution network, or perhaps a well field. I mean, some of the systems, for example, I was speaking to a customer last week who is monitoring an entire pipeline across Peru, and yet their control centers are actually in Mexico. So, to do that kind of thing, the equipment is usually connected via public networks. You know, private networks don't extend that far, and even the control centers may be widely distributed. And as part of that, compared to in-plant control, there's an assumption that your communications are clearly not gonna be 100% perfect. You're gonna lose communications either momentarily, like with cellular networks, or, for example, like we've got in Texas this week, natural events like hurricanes can cut communications for hours.
But because these systems are all critical infrastructure, such as pipelines or electrical distribution, the actual operations, the process, must never be interrupted. Today, we're talking about cybersecurity, and that same sensitivity is why these systems are now the target of some of the most sophisticated cyberattacks.

Jim: Okay, that gives a picture of the breadth of these types of SCADA systems, and you had mentioned you'd worked with CISA, the Cybersecurity and Infrastructure Security Agency, and the Joint Cyber Defense Collaborative, which I'll just call JCDC for short. Can you give some more examples of that work?

Steve: Yeah. Really, I could give you a bit of background. Probably many of our listeners know that there have been several successful cyberattacks against critical infrastructure over the last few years. Probably the most famous in the pipeline industry was an attack that's referred to as the Colonial Pipeline attack. That was actually a criminal ransomware attack that resulted in gasoline and jet fuel shortages across the Eastern U.S. for several days, and that was criminals basically trying to get money. And it was almost a random attack, it wasn't targeted. However, there have been actual state-sponsored attacks, and probably the one that was most successful was prior to the Russian military attack against Ukraine. They actually instituted several successful cyberattacks against the Ukrainian power grid. And very concerning is that, in recent months, U.S. infrastructure, including pipelines, has been successfully infiltrated by a group called Volt Typhoon, who are thought to be from the People's Republic of China. So JCDC and CISA are working hard to really counter and protect against these threats.

Jim: Wow. Well, that's clearly a huge concern. What is the JCDC doing to address these challenges?

Steve: Well, in 2023, so last year, JCDC facilitated the development of something called the Pipeline Reference Architecture. Basically, Emerson, alongside other industry vendors and also pipeline operators, participated in the development of this Pipeline Reference Architecture, which I'll refer to as the PRA. It's a fairly short document that outlines the design and operating principles for SCADA systems in the pipeline industry. And one thing the government is keen to point out: it's not a regulatory document, but it does set out the best principles and is intended as guidance for the industry. Really, they want to work with the industry to come up with best practices.

Jim: Well, it sounds like this PRA is another set of standards to address cybersecurity. Why is another document needed in an industry where a bunch of standards exist now?

Steve: Yeah, that's a question I and other members get asked quite a lot. The main reason is that wide-area SCADA represents a very different set of challenges to traditional SCADA, which we refer to as inside the wire. So, for example, in a refinery or a manufacturing plant, everything is in one location. But as I mentioned before, wide-area SCADA has got a very wide displacement, physically. It also actually has a lot of remote field workers. There may be folks working on that system hundreds of miles from base, and you're also using communications networks that are not even owned or operated by the owners of the pipeline. Though this PRA is really intended for the pipeline industry, clearly it's applicable to almost any wide-area SCADA, whether that's the water or electrical industry as well.

Jim: Okay, that makes sense.
So those are definitely challenges that don't exist for most automation systems, which are, as you say, inside the wire. Tell us more about how the PRA addresses these. Steve: Well, the big thing is segmentation: basically, taking the network and splitting it into different levels that represent different areas of the operation. For example, the internet would be what's referred to as level zero, and moving all the way down to the bottom of the network, that's level nine. And the levels in between represent different levels of trust. Now, those who are familiar with cybersecurity and SCADA are probably familiar with something called the Purdue model, which I think first came out in the late 1980s, and that also splits up SCADA and control networks, and actually business networks, into different levels. However, when that came out, the internet was in its infancy. No one would ever have used the internet, or even really public IP networks, for their connectivity. So it doesn't really take into account many of the things we take for granted today in these systems. So the PRA is intended to expand on that and take into account the reality that, for example, some of this critical data will actually be transiting a public network, right? And in order to achieve that with this segmentation, we're using a concept called Defense in Depth. As you go down the different levels of the network, the assumption is you can trust each item on that network more. So, for example, on the internet, you don't trust anything, but when you get down, let's say, to the communications between an RTU [remote terminal unit] and a gas chromatograph on a local serial link, you might completely trust that. Now, it's interesting that, although that's part of the PRA model, it does actually conflict with a security concept called Zero Trust, which is something that Emerson has really based our products on. But both zero trust and defense in depth are valid. Jim: Now, you had mentioned a couple of concepts I'd like to explore a little bit more, and let's start with zero trust. Can you explain that concept to us? Steve: Oh, yeah. Zero trust is a concept where any piece of equipment or software should trust nothing: don't trust anything else on the network, don't trust the network to be safe, and don't rely on anything else for protection. Historically, SCADA was protected, for example, by firewalls. You would take products that were known to not be secure, because they were developed perhaps 20 or 30 years ago, and hide them behind firewalls, and that's really how we've handled security to date. But there's a realization that you can't do that anymore. So we now need to design products so that they don't trust anything. But the reality is many of our customers, Emerson's customers and pipeline operators, have devices that were installed perhaps 30 years ago. That's the typical lifespan of some RTUs and controllers in this industry. So as a result, when you get down to the lower levels of the network, zero trust doesn't work, and you do have to have additional levels of protection. So for example, if you had a Modbus link, which is basically insecure almost by design, that should be protected by additional levels of firewalls and so on. But if you're designing a modern product, it should be designed so it doesn't rely on anything else. And that's the concept of zero trust. Jim: Okay, got it. So don't trust anything. Everything must be proven out.
And the other concept you talked about was defense in depth. So, what does that mean? Steve: Well, the phrase is most commonly used where we're talking about a network with multiple levels. So when you come from, for example, the internet into your business network, you would have a set of firewalls and what's called a demilitarized zone. And then when you go from your business network down to your controls network, you'd have another set of firewalls. So it's multiple levels of protection. However, that same concept should be used within products as well. And, in fact, Emerson takes that very seriously with our secure development lifecycle certifications, IEC 62443, and how we design those products. Jim: Well, that's good. So as you put in more modern technology, it complies and has that cybersecurity built in from the start. Can you give us an example of how it's built in? Steve: Yeah, that's a great one. If I take, for example, the Emerson FB3000 RTU, that's a flow computer and a controller device that's designed specifically for the oil and gas industry, especially for pipelines, and an obvious concern is that it may be attacked externally to modify the firmware. Now, at the first level, the RTU itself has secure protocols. It uses something called DNP3, which would, in theory, provide access to the device. But then the firmware: when we issue new firmware, we put it on a website, so we have protection of the website itself, and we also publish a hash, which is basically a unique key that the customer downloading the firmware can check to confirm it hasn't been modified by anyone attacking the website. But then, when they actually put it into the RTU, when they're updating firmware, the RTU will check that that firmware was developed by Emerson and was intended for that device. It does that by verifying certificates on the load. Now, once it's in the device and it's running in the field, you might say, "Well, the task is done," but there's an additional level of protection. It will continually, and on boot, check that firmware to make sure the certificate still matches and it hasn't been changed. And if it has been changed, it will actually revert to a known good factory firmware that's basically embedded in the device. So you can see that there are really five or six different things all checking and ensuring that the firmware in that device has not been compromised. So basically, there are multiple levels within the device, and in addition, there are multiple levels on the network. So the bad guys have to get through a lot of different levels to damage or compromise the device. And we're trying to do that with everything we design today. Jim: Yeah. And with modern cryptography, any change will completely change that hash, making it practically impossible to slip something in without it being noticed. So that's really a nice thing. Steve: Yeah. And the fact that even if it detects it, it then goes back to factory firmware, which may be a slightly older version, but your operation will keep running. It will keep controlling, which is a very nice feature. Jim: Yeah, that's a great example there. I guess, going back to the PRA, what else does it include other than the segmentation that you discussed? Steve: There are about 10 high-level principles that cover aspects of the design and operation of the SCADA system. And for each of these, there are various examples and guidance on how to actually follow the principle in a real-world system.
So, for example, there was a whole section on how to manage third-party devices and contractors, because on a pipeline system you're almost certainly gonna have, for example, engineers from Emerson coming in as third parties. So it gives examples of the real-world aspects of operating the system. Jim: Are there other examples from it you can share? Steve: Yeah. One important one is that when you're designing the system, you should identify and document all of the different data flows that occur. When I say data flow, I mean a communication or conversation between different pieces of equipment. So, for example, this RTU may communicate with that SCADA platform on this particular machine and may communicate with a measurement system on another machine. Document all of those data flows, and then deny all other data flows by default. Then, after the system is running, continually monitor it passively. And if you see an additional communication, say, between two pieces of equipment that normally never communicated, or that didn't communicate on a particular IP socket, flag that immediately, because it may be something going on that was unexpected. It certainly was outside the original design of the system. Jim: This has been very educational. Thank you so much, Steve. Where can our listeners go to learn more? Steve: Well, really a couple of places. If you go to the CISA blog, which is at www.cisa.gov/news-events, there are details there. The actual PRA was published on March 26th of this year. And also, if you want to discover more about Emerson's involvement in wide-area SCADA and the cybersecurity associated with it, go to Emerson.com/SCADAforEnergy and you'll find some information there. Jim: Okay, great. And I'll add some links to that and to some of the other things we discussed in the transcript. Well, thank you so much for joining us today, Steve. Steve: Not a problem. It's a pleasure. -End of transcript- The podcast and artwork embedded on this page are from Emerson Team, which is the property of its owner and not affiliated with or endorsed by Listen Notes, Inc.
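For readers who want to see the deny-by-default idea from this conversation in concrete form, here is a minimal, illustrative sketch (not Emerson's tooling; the device names, ports, and alerting below are hypothetical) of checking observed traffic against a documented list of data flows:

```python
# Illustrative sketch only: a deny-by-default check of observed traffic
# against documented data flows, as described in the PRA discussion above.
# The flow list, device names, and alerting are hypothetical examples.

from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    src: str        # source host or device name
    dst: str        # destination host or device name
    port: int       # TCP/UDP port the conversation uses

# Documented data flows from the system design (example values).
ALLOWED_FLOWS = {
    Flow("rtu-01", "scada-primary", 20000),   # DNP3 polling
    Flow("rtu-01", "measurement-01", 502),    # Modbus to measurement system
}

def check_observed_flow(src: str, dst: str, port: int) -> bool:
    """Return True if the flow is documented; otherwise flag it for review."""
    observed = Flow(src, dst, port)
    if observed in ALLOWED_FLOWS:
        return True
    # Deny by default: anything not documented is flagged immediately.
    print(f"ALERT: undocumented flow {src} -> {dst}:{port}")
    return False

# Example: an unexpected conversation between two devices is flagged.
check_observed_flow("rtu-01", "engineering-laptop", 22)
```

In a real deployment the documented flows would come from the system design and the observations from passive network monitoring, but the flag-anything-undocumented logic is the same.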
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Does robustness improve with scale?, published by ChengCheng on July 25, 2024 on The AI Alignment Forum. Adversarial vulnerabilities have long been an issue in various ML systems. Large language models (LLMs) are no exception, suffering from issues such as jailbreaks: adversarial prompts that bypass model safeguards. At the same time, scale has led to remarkable advances in the capabilities of LLMs, leading us to ask: to what extent can scale help solve robustness? In this post, we explore this question in the classification setting: predicting the binary label of a text input. We find that scale alone does little to improve model robustness, but that larger models benefit more from defenses such as adversarial training than do smaller models. We study models in the classification setting as there is a clear notion of "correct behavior": does the model output the right label? We can then naturally define robustness as the proportion of the attacked dataset that the model correctly classifies. We evaluate models on tasks such as spam detection and movie sentiment classification. We adapt pretrained foundation models for classification by replacing the generative model's unembedding layer with a randomly initialized classification head, and then fine-tune the models on each task. We focus on adversarial-suffix style attacks: appending an adversarially chosen prompt to a benign prompt in an attempt to cause the model to misclassify the input, e.g., classify a spam email as not-spam. We consider two attacks: the state-of-the-art Greedy Coordinate Gradient method (Zou et al., 2023), and a baseline random token attack. This simple threat model has the advantage of being unlikely to change the semantics of the input. For example, a spam email is still spam even if a handful of tokens are appended to it. Of course, attackers are not limited to such a simple threat model: studying more open-ended threat models (such as rephrasing the prompt, or replacing words with synonyms) and corresponding attack methods (such as LLM generated adversarial prompts) is an important direction that we hope to pursue soon in future work. For more information, see our blog post or paper. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
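For readers who want to see what the described evaluation looks like in code, here is a minimal sketch under stated assumptions: the classifier interface, token vocabulary, and toy data are placeholders, and only the simple random-token baseline attack is shown, not the Greedy Coordinate Gradient method.

```python
# Minimal sketch of the evaluation described above: robustness is the share of
# attacked inputs the classifier still labels correctly. The classifier
# interface and token vocabulary here are placeholders, not the authors' code.

import random

def random_suffix_attack(text: str, vocab: list[str], n_tokens: int = 10,
                         rng: random.Random | None = None) -> str:
    """Baseline attack: append randomly chosen tokens to a benign input."""
    rng = rng or random.Random(0)
    suffix = " ".join(rng.choice(vocab) for _ in range(n_tokens))
    return f"{text} {suffix}"

def robustness(classify, dataset, vocab):
    """Fraction of attacked (input, label) pairs classified correctly."""
    correct = 0
    for text, label in dataset:
        attacked = random_suffix_attack(text, vocab)
        if classify(attacked) == label:
            correct += 1
    return correct / len(dataset)

# Toy usage with a stand-in classifier that flags the word "prize" as spam.
toy_data = [("claim your prize now", 1), ("meeting moved to 3pm", 0)]
toy_vocab = ["xq", "zr", "lk", "wv"]
print(robustness(lambda t: int("prize" in t), toy_data, toy_vocab))
```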
Why are robustness and quality so important when it comes to data centers? In this fifth episode of the “We Power AI” podcast series, our expert Danny Clavette, Distinguished System Architecture Engineer, will provide some very interesting insights into server rack structure and failure rates in data center components, highlighting how Infineon solutions can support better reliability in the GenAI world. Learn more about Infineon's offering for data center and computing here. Request a topic for our next episodes or drop any questions at wepowerai@infineon.com. Stay tuned and don't miss our next episode on July 30th!
Guest: Campbell Watson, Senior Research Scientist at IBM ResearchAs artificial intelligence, or AI, continues to become more pervasive in our technology, it's only natural to wonder what it means for meteorology and climatology. Believe it or not, AI is already revolutionizing how we develop models in the Earth and Space sciences. Joining us today is Campbell Watson, a Senior Research Scientist at IBM Research, to discuss how we are creating these AI models, and the opportunities and advancements we hope to learn from using them.Chapters00:00 Introduction to AI in Meteorology and Climatology01:35 Campbell Watson's Background and Interest in Weather04:18 The Use of AI in Weather Forecasting05:24 The Evolution of AI in Weather Forecasting06:26 Understanding AI at its Core08:09 Difference Between AI and Conventional Weather Model Forecasting10:44 The Difference Between AI and Conventional Weather Model Forecasting13:31 AI Models Learning from Weather Models20:30 Foundation Models and Unsupervised Learning25:12 Verification of AI in Weather Forecasting27:57 The Robustness of AI Models28:24 Learning from Observations29:23 AI Models Learning from Observations30:09 The Importance of Observations30:38 Comparing Deep Learning and AI with Traditional Weather Forecasting Models35:38 Trustworthiness in AI Forecasting36:12 Adopting AI in Weather Forecasting41:46 Nowcasting with AI44:03 AI in Climate Forecasting47:38 The Future of AI in Weather ForecastingSee Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Beyond the Board: Exploring AI Robustness Through Go, published by AdamGleave on June 19, 2024 on The AI Alignment Forum. Last year, we showed that supposedly superhuman Go AIs can be beaten by human amateurs playing specific "cyclic" patterns on the board. Vulnerabilities have previously been observed in a wide variety of sub- or near-human AI systems, but this result demonstrates that even far superhuman AI systems can fail catastrophically in surprising ways. This lack of robustness poses a critical challenge for AI safety, especially as AI systems are integrated in critical infrastructure or deployed in large-scale applications. We seek to defend Go AIs, in the process developing insights that can make AI applications in various domains more robust against unpredictable threats. We explored three defense strategies: positional adversarial training on handpicked examples of cyclic patterns, iterated adversarial training against successively fine-tuned adversaries, and replacing convolutional neural networks with vision transformers. We found that the two adversarial training methods defend against the original cyclic attack. However, we also found several qualitatively new adversarial strategies that can overcome all these defenses. Nonetheless, finding these new attacks is more challenging than against an undefended KataGo, requiring more training compute resources for the adversary. For more information, see our blog post, project website or paper. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Artificial General Intelligence (AGI) Show with Soroush Pour
We speak with Stephen Casper, or "Cas" as his friends call him. Cas is a PhD student at MIT in the Computer Science (EECS) department, in the Algorithmic Alignment Group advised by Prof Dylan Hadfield-Menell. Formerly, he worked with the Harvard Kreiman Lab and the Center for Human-Compatible AI (CHAI) at Berkeley. His work focuses on better understanding the internal workings of AI models (better known as “interpretability”), making them robust to various kinds of adversarial attacks, and calling out the current technical and policy gaps when it comes to making sure our future with AI goes well. He's particularly interested in developing automated ways of finding & fixing flaws in how deep neural nets handle human-interpretable concepts. We talk to Stephen about:* His technical AI safety work in the areas of: * Interpretability * Latent attacks and adversarial robustness * Model unlearning * The limitations of RLHF* Cas' journey to becoming an AI safety researcher* How he thinks the AI safety field is going and whether we're on track for a positive future with AI* Where he sees the biggest risks coming with AI* Gaps in the AI safety field that people should work on* Advice for early career researchers. Hosted by Soroush Pour. Follow me for more AGI content:Twitter: https://twitter.com/soroushjpLinkedIn: https://www.linkedin.com/in/soroushjp/== Show links ==-- Follow Stephen --* Website: https://stephencasper.com/* Email: (see Cas' website above)* Twitter: https://twitter.com/StephenLCasper* Google Scholar: https://scholar.google.com/citations?user=zaF8UJcAAAAJ-- Further resources --* Automated jailbreaks / red-teaming paper that Cas and I worked on together (2023) - https://twitter.com/soroushjp/status/1721950722626077067* Sam Marks paper on Sparse Autoencoders (SAEs) - https://arxiv.org/abs/2403.19647* Interpretability papers involving downstream tasks - See section 4.2 of https://arxiv.org/abs/2401.14446* MEMIT paper on model editing - https://arxiv.org/abs/2210.07229* Motte & bailey definition - https://en.wikipedia.org/wiki/Motte-and-bailey_fallacy* Bomb-making papers tweet thread by Cas - https://twitter.com/StephenLCasper/status/1780370601171198246* Paper: undoing safety with as few as 10 examples - https://arxiv.org/abs/2310.03693* Recommended papers on latent adversarial training (LAT) - * https://ai-alignment.com/training-robust-corrigibility-ce0e0a3b9b4d * https://arxiv.org/abs/2403.05030* Scoping (related to model unlearning) blog post by Cas - https://www.alignmentforum.org/posts/mFAvspg4sXkrfZ7FA/deep-forgetting-and-unlearning-for-safely-scoped-llms* Defending against failure modes using LAT - https://arxiv.org/abs/2403.05030* Cas' systems for reading for research - * Follow ML Twitter * Use a combination of the following two search tools for new Arxiv papers: * https://vjunetxuuftofi.github.io/arxivredirect/ * https://chromewebstore.google.com/detail/highlight-this-finds-and/fgmbnmjmbjenlhbefngfibmjkpbcljaj?pli=1 * Skim a new paper or two a day + take brief notes in a searchable notes app* Recommended people to follow to learn about how to impact the world through research - * Dan Hendrycks * Been Kim * Jacob Steinhardt * Nicolas Carlini * Paul Christiano * Ethan Perez. Recorded May 1, 2024
‘Ranges of Robustness' is the new album by the Belgian composer and pianist Wim Mertens. As the title suggests, on this album he explores the ranges of robustness. Mertens is an international artist who has performed around the world as a soloist and with his ensemble. His well-known albums include ‘Vergessen' and ‘Struggle for Pleasure'. Presented by: Andrew Makkinga
Our 170th episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris) Feel free to leave us feedback here. Read out our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai and/or hello@gladstone.ai Timestamps + Links: Tools & Apps(00:03:33) KLING is the latest AI video generator that could rival OpenAI's Sora (00:09:16) ‘Apple Intelligence' will automatically choose between on-device and cloud-powered AI (00:12:21) Udio introduces new udio-130 music generation model and more advanced features (00:14:38) Perplexity AI's new feature will turn your searches into shareable pages (00:16:35) ElevenLabs' AI generator makes explosions or other sound effects with just a prompt (00:18:37) Google's updated AI-powered NotebookLM expands to India, UK and over 200 other countries Applications & Business(00:19:40) OpenAI is restarting its robotics research group (00:25:01) Saudi fund invests in China effort to create rival to OpenAI (00:29:34) UAE seeks ‘marriage' with US over artificial intelligence deals (00:33:01) Zoox to test self-driving cars in Austin and Miami (00:35:49) Microsoft Lays Off 1,500 Workers, Blames "AI Wave" (00:38:28) Avengers, assemble—Google, Intel, Microsoft, AMD and more team up to develop an interconnect standard to rival Nvidia's NVLink Projects & Open Source(00:40:39) GLM-4-9B-Chat-1M (00:46:37) Hugging Face and Pollen Robotics show off first project: an open source robot that does chores (00:49:40) Zyphra debuts Zyda, a 1.3T language modeling dataset it claims outperforms Pile, C4, arxiv (00:51:59) Stability AI debuts new Stable Audio Open for sound design Research & Advancements(00:54:05) Scaling and evaluating sparse autoencoders (01:04:54) Improving Alignment and Robustness with Short Circuiting (01:12:11) Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach (01:16:20) GPT-4 didn't ace the bar exam after all, MIT research suggests — it didn't even break the 70th percentile Policy & Safety(01:20:11) Former OpenAI researcher foresees AGI reality in 2027 (01:28:03) OpenAI Insiders Warn of a ‘Reckless' Race for Dominance (01:33:52) Testing and mitigating elections-related risks (01:36:26) Teams of LLM Agents can Exploit Zero-Day Vulnerabilities Synthetic Media & Art(01:43:23) The Uncanny Rise of the World's First AI Beauty Pageant (01:46:25) Outro + AI Song
In this episode of the HatsOff Podcast, the CEO of 17hats, Amanda Rae, and guest Carissa Woo delve into the critical role of automation and process in business growth. They emphasize how lead automation and immediate engagement can significantly enhance client trust and conversion rates. Amanda highlights the newly launched Marketplace by 17hats, connecting users with verified, successful partners to share practical tools and strategies. This episode also offers advice for solopreneurs on maintaining resilience, creating a vision with clear milestones, and the importance of proactive client communication for sustained success. What You'll Learn 02:48 Exploring Amanda's Roots: Growing Up in Small-Town Texas 17:34 The Beginnings of 17hats: From User to CEO and Beyond 22:49 Mastering Lead Automation: Boosting Engagement and Conversions 36:51 Launching the 17hats Marketplace: Connecting Consumers with Verified Experts 41:05 Staying True to the Vision: Evolving with Simplicity and Robustness
Dan is joined by Mayukh Bhattacharya, Executive Director, Engineering, at Synopsys. Mayukh has been with Synopsys since 2003. For the first 14 years, he made many technical contributions to PrimeSim XA. Currently, he leads R&D teams for PrimeSim Design Robustness and PrimeSim Custom Fault products. He was one of the early…
About the Guest Dr. Dritz currently provides global technical support for PIC, the porcine division of Genus. PIC directly employs people in 18 countries worldwide and PIC products are available in more than 50 countries across the globe. Previously, Dr. Dritz was a Professor at the College of Veterinary Medicine, Kansas State University from 1995 […]
About the Guests Dr. Sergio Canavate was born in Spain and obtained his veterinarian degree from the University of Murcia. He started his career in the pork industry in Agropor SL, where he learned the core principles and values of pork production. In 2010 he joined PIC Spain, and since then, he has shown willingness […]
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: $250K in Prizes: SafeBench Competition Announcement, published by Center for AI Safety on April 3, 2024 on The Effective Altruism Forum. TLDR: CAIS is distributing $250,000 in prizes for benchmarks that empirically assess AI safety. This project is supported by Schmidt Sciences; submissions are open until February 25th, 2025. Winners will be announced April 25th, 2025. To view additional info about the competition, including submission guidelines and FAQs, visit https://www.mlsafety.org/safebench If you are interested in receiving updates about SafeBench, feel free to sign up on the homepage using the link above. About the Competition: The Center for AI Safety is offering prizes for the best benchmarks across the following four categories: Robustness: designing systems to be reliable in the face of adversaries and highly unusual situations. Monitoring: detecting malicious use, monitoring predictions, and discovering unexpected model functionality. Alignment: building models that represent and safely optimize difficult-to-specify human values. Safety Applications: using ML to address broader risks related to how ML systems are handled. Judges: Zico Kolter, Carnegie Mellon Mark Greaves, AI2050 Bo Li, University of Chicago Dan Hendrycks, Center for AI Safety Timeline: Mar 25, 2024: Competition Opens Feb 25, 2025: Submission Deadline Apr 25, 2025: Winners Announced Competition Details: Prizes: There will be three prizes worth $50,000 and five prizes worth $20,000. Eligibility: Benchmarks released prior to the competition launch are ineligible for prize consideration. Benchmarks released after competition launch are eligible. More details about prize eligibility can be found in our terms and conditions. Evaluation criteria: Benchmarks will be assessed according to these evaluation criteria. In order to encourage progress in safety, without also encouraging general advances in capabilities, benchmarks must clearly delineate safety and capabilities. Submission Format: If you have already written a paper on your benchmark, submit that (as long as it was published after the SafeBench launch date of March 25th, 2024). Otherwise, you may submit a thorough write-up of your benchmark, including source code. An example of such a write-up can be found in this document. Dataset Policy: By default, we will require the code and dataset for all submissions to be publicly available on Github. However, if the submission deals with a dangerous capability, we will review whether to publicly release the dataset on a case-by-case basis. If you are interested in receiving updates about SafeBench, feel free to sign up on the homepage: https://www.mlsafety.org/safebench Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Join us for a thought-provoking session on the next episode of The Uncertainty of show with James Burnell discussing Safety Culture Improvement: engineered or emergent? As humans we are experts at developing engineering solutions which provide a level of Certainty. A popular belief is that Management can envision an ideal safety culture. We can then engineer this desired culture in the same way we design and build a robust machine. In this show we explore another region of the real world that is Uncertain and Unpredictable. For context we will look at the Aviation Industry from a Robustness and Resilience perspective. Questions we will address: Why does the Blunt end of the Safety Spear (regulators, elected officials) prefer a robust approach to improve a resilient safety culture? What if there was a different view of safety culture - not one that can be engineered but one that emerges naturally from systemic changes? How might using sense-making methods and tools be ‘faster, better, cheaper' to improve safety culture?
Robustness. Abstract: Philip and Fred discuss the idea of a robust design for a product or system. Key Points: Join Philip and Fred as they discuss robust design. Topics include: To improve the ability of a product to withstand unexpected stresses, improve the robustness. Design techniques to improve robustness. Various methods to improve reliability like stress-strength analysis, […] The post SOR 953 Robustness appeared first on Accendo Reliability.
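As a pointer to what stress-strength analysis involves, here is a small worked example assuming, purely for illustration and not from the episode, that applied stress and part strength are independent and normally distributed:

```python
# Worked example of stress-strength interference, assuming (for illustration
# only) that applied stress and part strength are independent and normally
# distributed. Reliability is the probability that strength exceeds stress.

from scipy.stats import norm

def reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    # For independent normals, strength - stress is also normal, so
    # P(strength > stress) = Phi((mu_strength - mu_stress) / sqrt(sd_strength^2 + sd_stress^2)).
    margin_mean = mu_strength - mu_stress
    margin_sd = (sd_strength**2 + sd_stress**2) ** 0.5
    return norm.cdf(margin_mean / margin_sd)

# Example values: strength 50 +/- 5 units, stress 35 +/- 4 units.
print(f"Reliability: {reliability(50, 5, 35, 4):.4f}")  # ~0.99
```

Increasing the strength margin or reducing the variability of either distribution raises this probability, which is the quantitative face of "improve the robustness."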
Champions League Clash on The Overlap! We are BACK, dissecting the burning question: who will reign supreme in Europe this season? Prepare for fiery takes, tactical tussles, and maybe even a friendly wager as we reveal our top contenders. But, two teams caused a major rift (can you guess which ones?). Will logic prevail (plot twist: nope)? Tune in, grab your scarves, and join the conversation. Don't forget to share your own predictions – who are YOU backing? #UCL #Predictions #TheOverlap
Chris Scott is from Coventry and has been training since 2010. He was educated as a sports therapist and is a personal trainer and strength and conditioning coach by trade. He has delved down the rabbit hole of physical training for parkour, maybe deeper than anyone else in the parkour community. This has led him to become recognised by the strength and conditioning world and invited as a guest onto various strength and conditioning podcasts, which have aided a better understanding of, and respect for, the undeniable athleticism of our unruly and rugged community of street rats. Instagram - @chris94scott If you'd rather watch the podcast and occasionally see some added clips and pics relevant to the conversation, you can access it by becoming a YouTube channel member on Storror's YouTube, along with other weekly members-only content - https://www.youtube.com/channel/UCdPui8EYr_sX6q1xNXCRPXg/join
On this week's Education Gadfly Show podcast, Debbie Veney, a senior vice president at the National Alliance for Public Charter Schools, joins Mike to discuss the growth in the charter sector since 2019. Then, on the Research Minute, Amber reviews a new study investigating the effects of mock instruction and coaching on pre-service teacher performance.Recommended content:“Believing in public education: A demographic and state-level analysis of public charter school and district public school enrollment trends” —Drew Jacobs and Debbie Veney, National Alliance for Public Charter Schools“Which large school districts provide fertile terrain for charter growth?” —Amber M. Northern and Michael J. Petrilli, Thomas B. Fordham Institute Julie Cohen, Anandita Krishnamachari, Vivian C. Wong, and Steffen Erickson, “Experimental Evidence on the Robustness of Coaching Supports in Teacher Education,” Educational Researcher (December 2023).Feedback Welcome: Have ideas for improving our podcast? Send them to Daniel Buck at dbuck@fordhaminstitute.org.
Outline00:00 - Intro00:47 - Mind the gap (metric)03:16 - Moment problems and Nevanlinna interpolation06:53 - “Everything relates to everything else”11:27 - Distances between power spectra 16:08 - Optimal mass transport and Schrödinger bridges32:25 - Sinkhorn iteration and Wasserstein geometry37:45 - Color of turbulence41:38 - Thermodynamics and energy harvesting from heat baths55:01 - Quantum mechanics57:55 - Adjustable one-ports1:07:20 - “Aha moments” and advice to future generations1:12:20 - OutroLinksTryphon's website: https://georgiou.eng.uci.edu/Robustness analysis of nonlinear feedback systems: an input-output approach (paper): https://tinyurl.com/4785kxnyA topological approach to Nevanlinna–Pick interpolation (paper): https://tinyurl.com/4vudtz8cA generalized entropy criterion for Nevanlinna-Pick interpolation with degree constraint (paper): https://tinyurl.com/39enabucThe meaning of distances in spectral analysis, IEEE CDC plenary 2007 (slides): https://tinyurl.com/mrytp7j8Optimal Steering of a Linear Stochastic System to a Final Probability Distribution, Part I (paper): https://tinyurl.com/2nbm3sa6Y. Chen - https://tinyurl.com/37frfx67M. Pavon - https://tinyurl.com/yjhbawhbMittag-Leffler - https://tinyurl.com/29cjum7jStochastic control liaisons (paper): https://tinyurl.com/4s8y775bSinkhorn iteration: https://tinyurl.com/ym5catr2Color of turbulence (paper): https://tinyurl.com/5n77bepbHarvesting energy from a periodic heat bath (paper): https://tinyurl.com/2vadpu93Principles of lossless adjustable one-ports (paper): https://tinyurl.com/53v23yt4Inerter: https://tinyurl.com/ya2bkkhwSupport the showPodcast infoPodcast website: https://www.incontrolpodcast.com/Apple Podcasts: https://tinyurl.com/5n84j85jSpotify: https://tinyurl.com/4rwztj3cRSS: https://tinyurl.com/yc2fcv4yYoutube: https://tinyurl.com/bdbvhsj6Facebook: https://tinyurl.com/3z24yr43Twitter: https://twitter.com/IncontrolPInstagram: https://tinyurl.com/35cu4kr4Acknowledgments and sponsorsThis episode was supported by the National Centre of Competence in Research on «Dependable, ubiquitous automation» and the IFAC Activity fund. The podcast benefits from the help of an incredibly talented and passionate team. Special thanks to L. Seward, E. Cahard, F. Banis, F. Dörfler, J. Lygeros, ETH studio and mirrorlake. Music was composed by A New Element.
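For listeners curious about the Sinkhorn iteration mentioned in the outline, here is a minimal sketch of the algorithm for entropically regularized optimal transport between two discrete distributions; the numbers below are illustrative only.

```python
# Minimal sketch of the Sinkhorn iteration mentioned in the outline:
# entropically regularized optimal transport between two discrete
# distributions a and b with cost matrix C. Values below are illustrative.

import numpy as np

def sinkhorn(a, b, C, epsilon=0.1, n_iters=500):
    """Return an approximate optimal transport plan P with marginals a and b."""
    K = np.exp(-C / epsilon)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # scale to match column marginals
        u = a / (K @ v)               # scale to match row marginals
    return np.diag(u) @ K @ np.diag(v)

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
C = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P = sinkhorn(a, b, C)
print(P)                              # transport plan
print(P.sum(axis=1), P.sum(axis=0))   # approximately a and b
```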
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Adversarial Robustness Could Help Prevent Catastrophic Misuse, published by Aidan O'Gara on December 11, 2023 on The AI Alignment Forum. There have been several discussions about the importance of adversarial robustness for scalable oversight. I'd like to point out that adversarial robustness is also important under a different threat model: catastrophic misuse. For a brief summary of the argument: Misuse could lead to catastrophe. AI-assisted cyberattacks, political persuasion, and biological weapons acquisition are plausible paths to catastrophe. Today's models do not robustly refuse to cause harm. If a model has the ability to cause harm, we should train it to refuse to do so. Unfortunately, GPT-4, Claude, Bard, and Llama all have received this training, but they still behave harmfully when facing prompts generated by adversarial attacks, such as this one and this one. Adversarial robustness will likely not be easily solved. Over the last decade, thousands of papers have been published on adversarial robustness. Most defenses are near useless, and the best defenses against a constrained attack in a CIFAR-10 setting still fail on 30% of inputs. Redwood Research's work on training a reliable text classifier found the task quite difficult. We should not expect an easy solution. Progress on adversarial robustness is possible. Some methods have improved robustness, such as adversarial training and data augmentation. But existing research often assumes overly narrow threat models, ignoring both creative attacks and creative defenses. Refocusing research with good evaluations focusing on LLMs and other frontier models could lead to valuable progress. This argument requires a few caveats. First, it assumes a particular threat model: that closed source models will have more dangerous capabilities than open source models, and that malicious actors will be able to query closed source models. This seems like a reasonable assumption over the next few years. Second, there are many other ways to reduce risks from catastrophic misuse, such as removing hazardous knowledge from model weights, strengthening societal defenses against catastrophe, and holding companies legally liable for sub-extinction level harms. I think we should work on these in addition to adversarial robustness, as part of a defense-in-depth approach to misuse risk. Overall, I think adversarial robustness should receive more effort from researchers and labs, more funding from donors, and should be a part of the technical AI safety research portfolio. This could substantially mitigate the near-term risk of catastrophic misuse, in addition to any potential benefits for scalable oversight. The rest of this post discusses each of the above points in more detail. Misuse could lead to catastrophe There are many ways that malicious use of AI could lead to catastrophe. AI could enable cyberattacks, personalized propaganda and mass manipulation, or the acquisition of weapons of mass destruction. Personally, I think the most compelling case is that AI will enable biological terrorism. Ideally, ChatGPT would refuse to aid in dangerous activities such as constructing a bioweapon. 
But by using an adversarial jailbreak prompt, undergraduates in a class taught by Kevin Esvelt at MIT evaded this safeguard: In one hour, the chatbots suggested four potential pandemic pathogens, explained how they can be generated from synthetic DNA using reverse genetics, supplied the names of DNA synthesis companies unlikely to screen orders, identified detailed protocols and how to troubleshoot them, and recommended that anyone lacking the skills to perform reverse genetics engage a core facility or contract research organization. Fortunately, today's models lack key information about building bioweapons. It's not even clear that they're more u...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What's new at FAR AI, published by AdamGleave on December 4, 2023 on The AI Alignment Forum. Summary We are FAR AI: an AI safety research incubator and accelerator. Since our inception in July 2022, FAR has grown to a team of 12 full-time staff, produced 13 academic papers, opened the coworking space FAR Labs with 40 active members, and organized field-building events for more than 160 ML researchers. Our organization consists of three main pillars: Research. We rapidly explore a range of potential research directions in AI safety, scaling up those that show the greatest promise. Unlike other AI safety labs that take a bet on a single research direction, FAR pursues a diverse portfolio of projects. Our current focus areas are building a science of robustness (e.g. finding vulnerabilities in superhuman Go AIs), finding more effective approaches to value alignment (e.g. training from language feedback), and model evaluation (e.g. inverse scaling and codebook features). Coworking Space. We run FAR Labs, an AI safety coworking space in Berkeley. The space currently hosts FAR, AI Impacts, SERI MATS, and several independent researchers. We are building a collaborative community space that fosters great work through excellent office space, a warm and intellectually generative culture, and tailored programs and training for members. Applications are open to new users of the space (individuals and organizations). Field Building. We run workshops, primarily targeted at ML researchers, to help build the field of AI safety research and governance. We co-organized the International Dialogue for AI Safety bringing together prominent scientists from around the globe, culminating in a public statement calling for global action on AI safety research and governance. We will soon be hosting the New Orleans Alignment Workshop in December for over 140 researchers to learn about AI safety and find collaborators. We want to expand, so if you're excited by the work we do, consider donating or working for us! We're hiring research engineers, research scientists and communications specialists. Incubating & Accelerating AI Safety Research Our main goal is to explore new AI safety research directions, scaling up those that show the greatest promise. We select agendas that are too large to be pursued by individual academic or independent researchers but are not aligned with the interests of for-profit organizations. Our structure allows us to both (1) explore a portfolio of agendas and (2) execute them at scale. Although we conduct the majority of our work in-house, we frequently pursue collaborations with researchers at other organizations with overlapping research interests. Our current research falls into three main categories: Science of Robustness. How does robustness vary with model size? Will superhuman systems be vulnerable to adversarial examples or "jailbreaks" similar to those seen today? And, if so, how can we achieve safety-critical guarantees? Relevant work: Vulnerabilities in superhuman Go AIs, AI Safety in a World of Vulnerable Machine Learning Systems. Value Alignment. How can we learn reliable reward functions from human data? Our research focuses on enabling higher bandwidth, more sample-efficient methods for users to communicate preferences for AI systems; and improved methods to enable training with human feedback. 
Relevant work: VLM-RM: Specifying Rewards with Natural Language, Training Language Models with Language Feedback. Model Evaluation: How can we evaluate and test the safety-relevant properties of state-of-the-art models? Evaluation can be split into black-box approaches that focus only on externally visible behavior ("model testing"), and white-box approaches that seek to interpret the inner workings ("interpretability"). These approaches are complementary, with ...
In this episode, Nathan chats with Josh Albrecht, CTO of Imbue. They discuss how to create agents for reasoning, reliability, and robustness. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period. RECOMMENDED PODCAST: Every week investor and writer of the popular newsletter The Diff, Byrne Hobart, and co-host Erik Torenberg discuss today's major inflection points in technology, business, and markets – and help listeners build a diversified portfolio of trends and ideas for the future. Subscribe to “The Riff” with Byrne Hobart and Erik Torenberg: https://www.youtube.com/@TheRiffPodcast SPONSORS: Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and 1,000,000s of other entrepreneurs across 175 countries.From their all-in-one e-commerce platform, to their in-person POS system – wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts – from blog posts to product descriptions using AI. Sign up for $1/month trial period: https://shopify.com/cognitive MasterClass https://masterclass.com/zen get two memberships for the price of 1 Learn from the best to become your best. Learn how to negotiate a raise with Chris Voss or manage your relationships with Esther Perel. Boost your confidence and find practical takeaways you can apply to your life and at work. If you own a business or are a team leader, use MasterClass to empower and create future-ready employees and leaders. Moment of Zen listeners will get two memberships for the price of one at https://masterclass.com/zen Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off. X/SOCIAL @labenz (Nathan) @eriktorenberg (Erik) @CogRev_Podcast TIMESTAMPS: (00:00:00) – Episode Preview (00:07:14) – What does it mean to be a research company? (00:10:25) – How is the reasoning landscape these days and how might it evolve? (00:11:03) – Data quality is highly important (00:21:15) – What's the difference between good features and a good world model? (00:27:31) – The impact of new modalities on reasoning (00:29:15) – How much can reasoning and knowledge be separated? (00:45:13) – Imbue demo and are they building their own LLMs or using others? (00:49:37) – Does Imbue have a deal with Nvidia? (00:57:48) – Carbs framework (01:12:57) – Imbue's involvement with policy and and AI safety (01:16:23) – Takeaways from AI Safety Summit and Biden's Order
In episode 100 of The Gradient Podcast, Daniel Bashir speaks to Professor Thomas Dietterich.Professor Dietterich is Distinguished Professor Emeritus in the School of Electrical Engineering and Computer Science at Oregon State University. He is a pioneer in the field of machine learning, and has authored more than 225 refereed publications and two books. His current research topics include robust artificial intelligence, robust human-AI systems, and applications in sustainability. He is a former President of the Association for the Advancement of Artificial Intelligence, and the founding President of the International Machine Learning Society. Other major roles include Executive Editor of the journal Machine Learning, co-founder of the Journal for Machine Learning Research, and program chair of AAAI 1990 and NIPS 2000. He currently serves as one of the moderators for the cs.LG category on arXiv.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Episode 100 Note* (02:03) Intro* (04:23) Prof. Dietterich's background* (14:20) Kuhn and theory development in AI, how Prof Dietterich thinks about the philosophy of science and AI* (20:10) Scales of understanding and sentience, grounding, observable evidence* (23:58) Limits of statistical learning without causal reasoning, systematic understanding* (25:48) A challenge for the ML community: testing for systematicity* (26:13) Forming causal understandings of the world* (28:18) Learning at the Knowledge Level* (29:18) Background and definitions* (32:18) Knowledge and goals, a note on LLMs* (33:03) What it means to learn* (41:05) LLMs as learning results of inference without learning first principles* (43:25) System I/II thinking in humans and LLMs* (47:23) “Routine Science”* (47:38) Solving multiclass learning problems via error-correcting output codes* (52:53) Error-correcting codes and redundancy* (54:48) Why error-correcting codes work, contra intuition* (59:18) Bias in ML* (1:06:23) MAXQ for hierarchical RL* (1:15:48) Computational sustainability* (1:19:53) Project TAHMO's moonshot* (1:23:28) Anomaly detection for weather stations* (1:25:33) Robustness* (1:27:23) Motivating The Familiarity Hypothesis* (1:27:23) Anomaly detection and self-models of competence* (1:29:25) Measuring the health of freshwater streams* (1:31:55) An open set problem in species detection* (1:33:40) Issues in anomaly detection for deep learning* (1:37:45) The Familiarity Hypothesis* (1:40:15) Mathematical intuitions and the Familiarity Hypothesis* (1:44:12) What's Wrong with LLMs and What We Should Be Building Instead* (1:46:20) Flaws in LLMs* (1:47:25) The systems Prof Dietterich wants to develop* (1:49:25) Hallucination/confabulation and LLMs vs knowledge bases* (1:54:00) World knowledge and linguistic knowledge* (1:55:07) End-to-end learning and knowledge bases* (1:57:42) Components of an intelligent system and separability* (1:59:06) Thinking through external memory* (2:01:10) OutroLinks:* Research — Fundamentals (Philosophy of AI)* Learning at the Knowledge Level* What Does it Mean for a Machine to Understand?* Research – “Routine science”* Ensemble methods in ML and error-correcting output codes* Solving multiclass learning problems via error-correcting output codes* An experimental comparison of bagging, boosting, and randomization* ML Bias, Statistical Bias, and Statistical 
Variance of Decision Tree Algorithms* The definitive treatment of these questions, by Gareth James* Discovering/Exploiting structure in MDPs:* MAXQ for hierarchical RL* Exogenous State MDPs (paper with George Trimponias, slides)* Research — Ecosystem Informatics and Computational Sustainability* Project TAHMO* Challenges for ML in Computational Sustainability* Research — Robustness* Steps towards robust AI (AAAI President's Address)* Benchmarking NN Robustness to Common Corruptions and Perturbations with Dan Hendrycks* The familiarity hypothesis: Explaining the behavior of deep open set methods* Recent commentary* Toward High-Reliability AI* What's Wrong with Large Language Models and What We Should Be Building Instead Get full access to The Gradient at thegradientpub.substack.com/subscribe
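For readers curious about the error-correcting output codes mentioned in the episode outline, here is a quick, illustrative example using scikit-learn's implementation; the dataset and settings are arbitrary choices for demonstration, not from the episode.

```python
# Quick illustration of error-correcting output codes (ECOC) for multiclass
# classification, as discussed in the outline. Dataset and parameters are
# arbitrary demonstration choices, using scikit-learn's implementation.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each class gets a binary codeword; code_size controls codeword length
# relative to the number of classes (longer codes add error-correcting slack).
ecoc = OutputCodeClassifier(
    estimator=LogisticRegression(max_iter=1000),
    code_size=4.0,
    random_state=0,
)
ecoc.fit(X_train, y_train)
print(f"Test accuracy: {ecoc.score(X_test, y_test):.3f}")
```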
Kevin Mannion is a Strength and Conditioning coach at Premiership Rugby Club Sale Sharks. Previously he has worked for Gloucester Rugby and Rugby League side Wigan Warriors.Show notes:Developing Successful Academy Programs - 1:31Kevin Mannion shares insights on what made the Wigan period successful, including the players' understanding of the responsibility of playing for Wigan and the presence of leaders like Andy Farrell and Chris Identy. He emphasizes the importance of creating a cultural underpinning of young local players playing for their home team, which generates a sense of pride from the fans. Michael Wright adds to the discussion by highlighting the importance of players wanting to be part of the group and how it translates to the pitch and the fans.Characteristics of Players in Rugby Union - 11:15Kevin Mannion and Michael Wright discuss the characteristics required for players in rugby union, including the need for robustness and adaptability. Mannion explains how the program is designed to cater to the diverse characteristics of players, with different programs for the front five, back row, and backs. He emphasizes the importance of functional weight and creating a suit of armor rather than just adding weight for weight's sake.Managing Player Progression and Loans - 21:41Kevin and Michael discuss the loan system in rugby and how it can be used to develop players. They talk about the importance of timing and ensuring that players are physically and mentally ready for the demands of the game at different levels. They also discuss the challenges of managing the progression and pathway of young players, including the need to balance the needs of the first team with the development of young players.Player Preparation and Robustness - 30:13Kevin and Michael discuss the importance of player preparation during the season, with Kevin emphasizing the need for recovery and freshness for games. They also discuss how to prepare players to be more robust, with Kevin mentioning a blend of touch, grab, and shoulders-on training with physio-led gym prep. Kevin also mentions that every player has individual bolt-ons based on screening, and that they try to fit in three different elements of preparation within the week.Importance of Pre-Game Preparation and Natural Consequences - 34:23Michael and Kevin emphasize the importance of pre-game preparation and how it can affect a player's performance and opportunities. They discuss how natural consequences can motivate players to do the necessary work and how injuries can be a result of not doing the work. They also mention the challenges of working with cross organizations to ensure players are able to peak and perform at the right times.Overview of the Academy Schedule and Player Management - 43:41Kevin Mannion explains the academy schedule and player management, including the different pathways for under 16s and the importance of managing player fatigue. He also discusses the interaction between the academy and school systems and the need to ensure players have the best experiences to promote themselves. The academy also manages high potential players, known as England Academy players, with different interactions and considerations around their playing time.Maximizing Point of Difference - 55:10Kevin and Michael explore the concept of point of difference in athletes and how to help them embrace and improve it.
They discuss the importance of identifying specific skills and techniques for different positions in football and how to manage training to maximize performance. Hosted on Acast. See acast.com/privacy for more information.
Watch behind the scenes with Bert on Patreon: https://www.patreon.com/posts/bert-de-vries-93230722 https://discord.gg/aNPkGUQtc5 https://twitter.com/MLStreetTalk Note, there is some mild background music on chapter 1 (Least Action), 3 (Friston) and 5 (Variational Methods) - please skip ahead if annoying. It's a tiny fraction of the overall podcast. YT version: https://youtu.be/2wnJ6E6rQsU Bert de Vries is Professor in the Signal Processing Systems group at Eindhoven University. His research focuses on the development of intelligent autonomous agents that learn from in-situ interactions with their environment. His research draws inspiration from diverse fields including computational neuroscience, Bayesian machine learning, Active Inference and signal processing. Bert believes that development of signal processing systems will in the future be largely automated by autonomously operating agents that learn purposefully from situated environmental interactions. Bert received his M.Sc. (1986) and Ph.D. (1991) degrees in Electrical Engineering from Eindhoven University of Technology (TU/e) and the University of Florida, respectively. From 1992 to 1999, he worked as a research scientist at Sarnoff Research Center in Princeton (NJ, USA). Since 1999, he has been employed in the hearing aids industry, both in engineering and managerial positions. De Vries was appointed part-time professor in the Signal Processing Systems Group at TU/e in 2012. Contact: https://twitter.com/bertdv0 https://www.tue.nl/en/research/researchers/bert-de-vries https://www.verses.ai/about-us Panel: Dr. Tim Scarfe / Dr. Keith Duggar TOC: [00:00:00] Principle of Least Action [00:05:10] Patreon Teaser [00:05:46] On Friston [00:07:34] Capm Peterson (VERSES) [00:08:20] Variational Methods [00:16:13] Dan Mapes (VERSES) [00:17:12] Engineering with Active Inference [00:20:23] Jason Fox (VERSES) [00:20:51] Riddhi Jain Pitliya [00:21:49] Hearing Aids as Adaptive Agents [00:33:38] Steven Swanson (VERSES) [00:35:46] Main Interview Kick Off, Engineering and Active Inference [00:43:35] Actor / Streaming / Message Passing [00:56:21] Do Agents Lose Flexibility with Maturity? [01:00:50] Language Compression [01:04:37] Marginalisation to Abstraction [01:12:45] Online Structural Learning [01:18:40] Efficiency in Active Inference [01:26:25] SEs become Neuroscientists [01:35:11] Building an Automated Engineer [01:38:58] Robustness and Design vs Grow [01:42:38] RXInfer [01:51:12] Resistance to Active Inference? [01:57:39] Diffusion of Responsibility in a System [02:10:33] Chauvinism in "Understanding" [02:20:08] On Becoming a Bayesian Refs: RXInfer https://biaslab.github.io/rxinfer-website/ Prof. Ariel Caticha https://www.albany.edu/physics/faculty/ariel-caticha Pattern recognition and machine learning (Bishop) https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf Data Analysis: A Bayesian Tutorial (Sivia) https://www.amazon.co.uk/Data-Analysis-Bayesian-Devinderjit-Sivia/dp/0198568320 Probability Theory: The Logic of Science (E. T. Jaynes) https://www.amazon.co.uk/Probability-Theory-Principles-Elementary-Applications/dp/0521592712/ #activeinference #artificialintelligence
Artificial intelligence and machine learning (AI/ML) have been extremely successful in predicting, optimizing, and controlling the behavior of complex interacting systems. Robustness and explainability of existing AI/ML methods, however, remain big challenges, and clearly new approaches are needed. In this program, Ambuj K. Singh, Distinguished Professor of Computer Science at UC Santa Barbara, explains that the human brain motivated the early development of the field of deep learning, and that neuroscientific concepts have contributed to the profound success of deep learning algorithms across many areas. The next leap in AI/ML may again come from a deeper understanding of the modularity, robustness, and adaptability of brain architectures. Series: "GRIT Talks" [Science] [Show ID: 39331]
Episode 8. This is the first in a series of episodes dedicated to model validation. Today, we focus on model robustness and resilience. From complex financial systems to why your gym might be overcrowded at New Year's, you've been directly affected by these aspects of model validation.AI hype and consumer trust (0:03) FTC article highlights consumer concerns about AI's impact on lives and businesses (Oct 3, FTC)Increased public awareness of AI and the masses of data needed to train it led to increased awareness of potential implications for misuse.Need for transparency and trust in AI's development and deployment.Model validation and its importance in AI development (3:42)Importance of model validation in AI development, ensuring models are doing what they're supposed to do.FTC's heightened awareness of responsibility and the need for fair and unbiased AI practices.Model validation (targeted, specific) vs model evaluation (general, open-ended).Model validation and resilience in machine learning (8:26)Collaboration between engineers and businesses to validate models for resilience and robustness.Resilience: how well a model handles adverse data scenarios.Robustness: model's ability to generalize to unforeseen data.Aerospace Engineering: models must be resilient and robust to perform well in real-world environments.Statistical evaluation and modeling in machine learning (14:09)Statistical evaluation involves modeling distribution without knowing everything, using methods like Monte Carlo sampling.Monte Carlo simulations originated in physics for assessing risk and uncertainty in decision-making.Monte Carlo methods for analyzing model robustness and resilience (17:24)Monte Carlo simulations allow exploration of potential input spaces and estimation of underlying distribution.Useful when analytical solutions are unavailable.Sensitivity analysis and uncertainty analysis as major flavors of analyses.Monte Carlo techniques and model validation (21:31)Versatility of Monte Carlo simulations in various fields.Using Monte Carlo experiments to explore semantic space vectors of language models like GPT.Importance of validating machine learning models through negative scenario analysis.Stress testing and resiliency in finance and engineering (25:48)Importance of stress testing in finance, combining traditional methods with Monte Carlo techniques.Synthetic data's potential in modeling critical systems.Identifying potential gaps and vulnerabilities in critical systems.Using operations research and model validation in AI development (30:13)Operations research can help find an equilibrium in overcrowding in gyms.Robust methods for solving complex problems in logistics and hWhat did you think? Let us know.Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics: LinkedIn - Episode summaries, shares of cited articles, and more. YouTube - Was it something that we said? Good. Share your favorite quotes. Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
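As an illustration of the Monte Carlo style of robustness analysis described in this episode summary, here is a minimal sketch: sample perturbed inputs, re-run a model, and estimate how often its output leaves an acceptable range. The toy model, noise levels, and threshold are made-up values for demonstration only.

```python
# Minimal sketch of a Monte Carlo robustness check in the spirit of the
# episode: perturb a model's inputs, re-run it many times, and estimate how
# often the output leaves an acceptable range. The toy "model", noise levels,
# and threshold below are made-up values for illustration only.

import random

def toy_model(x1: float, x2: float) -> float:
    """Stand-in for a trained model or simulation."""
    return 3.0 * x1 - 2.0 * x2

def monte_carlo_failure_rate(n_samples: int = 10_000, threshold: float = 0.0,
                             seed: int = 0) -> float:
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        # Sample inputs around their nominal values to mimic adverse,
        # unforeseen data rather than the clean training distribution.
        x1 = rng.gauss(mu=2.0, sigma=0.5)
        x2 = rng.gauss(mu=1.0, sigma=0.8)
        if toy_model(x1, x2) < threshold:   # output must stay non-negative
            failures += 1
    return failures / n_samples

print(f"Estimated failure rate: {monte_carlo_failure_rate():.3%}")
```

Sensitivity analysis follows the same recipe: vary one input's noise at a time and watch how the estimated failure rate responds.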