A podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses.
Dr. Andrew Clark & Sid Mangalik
Dr. Michael Zargham provides a systems engineering perspective on AI agents, emphasizing accountability structures and the relationship between the principals who deploy agents and the agents themselves. In this episode, he brings clarity to the often misunderstood concept of agents in AI by grounding them in established engineering principles rather than treating them as mysterious or elusive entities.

Show highlights
• Agents should be understood through the lens of the principal-agent relationship, with clear lines of accountability
• True validation of AI systems means ensuring outcomes match intentions, not just optimizing loss functions
• LLMs by themselves are "high-dimensional word calculators," not agents; agents are more complex systems with LLMs as components
• Guardrails provide deterministic constraints ("musts" or "shalls") versus constitutional AI's softer guidance ("shoulds") (see the sketch after these notes)
• Systems engineering approaches from civil engineering and materials science offer valuable frameworks for AI development
• Authority and accountability must align; people shouldn't be held responsible for systems they don't have the authority to control
• The transition from static input-output models to closed-loop dynamical systems represents the shift toward truly agentic behavior
• Robust agent systems require both exploration (lab work) and exploitation (hardened deployment) phases with different standards

Explore Dr. Zargham's work
Protocols and Institutions (Feb 27, 2025)
Comments Submitted by BlockScience, University of Washington APL Information Risk and Synthetic Intelligence Research Initiative (IRSIRI), Cognitive Security and Education Forum (COGSEC), and the Active Inference Institute (AII) to the Networking and Information Technology Research and Development National Coordination Office's Request for Comment on the Creation of a National Digital Twins R&D Strategic Plan, NITRD-2024-13379 (Aug 8, 2024)
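To make the guardrail distinction concrete, here is a minimal, hypothetical sketch (names and the toy "transfer" policy are ours, not Dr. Zargham's): the LLM is just one component that proposes actions, a deterministic guardrail enforces a hard "must" before anything reaches the environment, and the environment's response feeds back into the next observation, closing the loop.

```python
# Hypothetical sketch: an LLM as one component inside a closed-loop agent,
# with a deterministic guardrail (a "must") enforced outside the model.
from dataclasses import dataclass

@dataclass
class Observation:
    account_balance: float

def llm_propose_action(obs: Observation) -> dict:
    """Stand-in for an LLM call: maps an observation to a proposed action."""
    return {"type": "transfer", "amount": obs.account_balance * 0.5}

def guardrail(action: dict) -> bool:
    """Deterministic constraint: transfers over a hard limit are never allowed."""
    return action["type"] != "transfer" or action["amount"] <= 1000.0

def environment_step(action: dict, obs: Observation) -> Observation:
    """Apply the action and return the next observation, closing the loop."""
    if action["type"] == "transfer":
        return Observation(account_balance=obs.account_balance - action["amount"])
    return obs

obs = Observation(account_balance=5000.0)
for step in range(3):
    proposed = llm_propose_action(obs)
    if not guardrail(proposed):
        proposed = {"type": "no_op"}  # hard stop, unlike a soft "should"
    obs = environment_step(proposed, obs)
    print(step, proposed, obs)
```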
Part 2 of this series could have easily been renamed "AI for science: The expert's guide to practical machine learning." We continue our discussion with Christoph Molnar and Timo Freiesleben to look at how scientists can apply the supervised machine learning techniques from the previous episode to their research.

Introduction to supervised ML for science (0:00)
Welcome back to Christoph Molnar and Timo Freiesleben, co-authors of "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box"

The model as the expert? (1:00)
• Evaluation metrics have profound downstream effects on all modeling decisions
• Data augmentation offers a simple yet powerful way to incorporate domain knowledge
• Domain expertise is often undervalued in data science despite being crucial

Measuring causality: Metrics and blind spots (10:10)
• Causality approaches in ML range from exploring associations to inferring treatment effects

Connecting models to scientific understanding (18:00)
• Interpretation methods must stay within realistic data distributions to yield meaningful insights

Robustness across distribution shifts (26:40)
• Robustness requires understanding which distribution shifts affect your model
• Pre-trained models and transfer learning provide promising paths to more robust scientific ML

Reproducibility challenges in ML and science (35:00)
• Reproducibility challenges differ between traditional science and machine learning

Go back and listen to part one of this series for the conceptual foundations that support these practical applications.

Check out Christoph and Timo's book "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box," available online now.
Machine learning is transforming scientific research across disciplines, but many scientists remain skeptical about using approaches that focus on prediction over causal understanding. That's why we are excited to have Christoph Molnar return to the podcast with Timo Freiesleben. They are co-authors of "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box." We talk about the perceived problems with automation in certain sciences and find out how scientists can use machine learning without losing scientific accuracy.

• Different scientific disciplines have varying goals beyond prediction, including control, explanation, and reasoning about phenomena
• Traditional scientific approaches build models from simple to complex, while machine learning often starts with complex models
• Scientists worry about using ML due to the lack of interpretability and causal understanding
• ML can both integrate domain knowledge and test existing scientific hypotheses
• "Shortcut learning" occurs when models find predictive patterns that aren't meaningful
• Machine learning adoption varies widely across scientific fields
• Ecology and medical imaging have embraced ML, while other fields remain cautious
• Future directions include ML potentially discovering scientific laws humans can understand
• Researchers should view machine learning as another tool in their scientific toolkit

Stay tuned! In part 2, we'll shift the discussion with Christoph and Timo to putting these concepts into practice.
Unlock the secrets to AI's modeling paradigms. We emphasize the importance of modeling practices, how they interact, and how they should be considered in relation to each other before you act. Using the right tool for the right job is key. We hope you enjoy these examples of where the greatest AI and machine learning techniques exist in your routine today.

More AI agent disruptors (0:56)
• Proxy from London start-up Convergence AI
• Another hit to OpenAI: this product is available for free, unlike OpenAI's Operator

AI Paris Summit - What's next for regulation? (4:40)
• [Vice President] Vance tells Europeans that heavy regulation can kill AI
• The US federal administration is withdrawing from the previous trend of sweeping big-tech regulation of modeling systems
• The EU is pushing to reduce bureaucracy but not regulatory pressure

Modeling paradigms explained (10:33)
As companies look for an edge in high-stakes computations, we've seen best-in-class teams rediscovering expert-system-based techniques that, with modern computing power, are getting new life breathed into them.
• Paradigm 1: Agents (11:23)
• Paradigm 2: Generative (14:26)
• Paradigm 3: Mathematical optimization (regression) (18:33)
• Paradigm 4: Predictive (classification) (23:19)
• Paradigm 5: Control theory (24:37)

The right modeling paradigm for the job? (28:05)
Agentic AI is the latest foray into big-bet promises for businesses and society at large. While promising autonomy and efficiency, AI agents raise fundamental questions about their accuracy, governance, and the potential pitfalls of over-reliance on automation. Does this story sound vaguely familiar? Hold that thought. This discussion about the over-under of certain promises is for you.

Show Notes
The economics of LLMs and DeepSeek R1 (00:00:03)
• Reviewing recent developments in AI technologies and their implications
• Discussing the impact of DeepSeek's R1 model on the AI landscape and NVIDIA

The origins of agentic AI (00:07:12)
• Status quo of AI models to date: is big tech backing away from the promise of generative AI?
• Agentic AI is designed to perceive, reason, act, and learn

Governance and agentic AI (00:13:12)
• Examining the tension between cost efficiency and performance risks [LangChain State of AI Agents Report]
• Highlighting governance concerns related to AI agents

Issues with agentic AI implementation (00:21:01)
• Considering the limitations of AI agents and their adoption in the workplace
• Analyzing real-world experiments with AI agent technologies, like Devin

What's next for complex and agentic AI systems (00:29:27)
• Offering insights on the cautious integration of these systems in business practices
• Encouraging a thoughtful approach to leveraging AI capabilities for measurable outcomes
What if privacy could be as dynamic and socially aware as the communities it aims to protect? Sebastian Benthall, a senior research fellow at NYU's Information Law Institute, shows us how privacy is complex. He uses Helen Nissenbaum's work on contextual integrity and concepts from differential privacy to explain the complexity of privacy. Our talk explains how privacy is not just about protecting data but also about following social rules in different situations, from healthcare to education. These rules can change privacy regulations in big ways.

Show notes
Intro: Sebastian Benthall (0:03)
• Research: Designing Fiduciary Artificial Intelligence (Benthall, Shekman)
• Integrating Differential Privacy and Contextual Integrity (Benthall, Cummings)

Exploring differential privacy and contextual integrity (1:05)
• Discussion about the origins of each subject
• How are differential privacy and contextual integrity used to reinforce each other?

Accepted context or legitimate context? (9:33)
• Does context develop from what society accepts over time?
• Approaches to determining situational context and legitimacy

Next steps in contextual integrity (13:35)
• Is privacy as we know it ending?
• Areas where integrated differential privacy and contextual integrity can help (Cummings)

Interpretations of differential privacy (14:30)
• Not a silver bullet
• New questions posed by NIST about its application

Privacy determined by social norms (20:25)
• Game theory and its potential for understanding social norms

Agents and governance: what will ultimately decide privacy? (25:27)
• Voluntary disclosures and the biases they can present toward groups that are least concerned with privacy
• Avoiding self-fulfilling prophecies from data and context
What if the secret to successful AI governance lies in understanding the evolution of model documentation? In this episode, our hosts challenge the common belief that model cards marked the start of documentation in AI. We explore model documentation practices, from their crucial beginnings in fields like finance to their adaptation in Silicon Valley. Our discussion also highlights the important role of early modelers and statisticians in advocating for a complete approach that covers the entire model development lifecycle.

Show Notes
Model documentation origins and best practices (1:03)
• Documenting a model is a comprehensive process that requires giving users and auditors a clear understanding: Why was the model built? What data goes into the model? How is the model implemented? What does the model output? (A minimal sketch of this structure follows these notes.)

Model cards - pros and cons (7:33)
• Model cards for model reporting, Association for Computing Machinery
• Evolution from this research, to Google's definition, to today
• How the market perceives them vs. what they are
• Why the analogy "nutrition labels for models" needs a closer look

System cards - pros and cons (12:03)
• To their credit, OpenAI's system cards somewhat bridge the gap between proper model documentation and a model card
• Contain complex descriptions of evaluation methodologies along with results; extra points for reporting red-teaming results
• Represent third-party opinions of the social and ethical implications of the model's release

Automating model documentation with generative AI (17:17)
• Finding the balance of automation in a sound governance strategy
• Generative AI can provide an assist in editing and personal workflow

Improving documentation for AI governance (23:11)
• As the model expert, engage from the beginning by writing the bulk of the model documentation by hand
• The exercise of documenting your models solidifies your understanding of the model's goals, values, and methods for the business
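As a rough illustration of the four questions above, here is a minimal sketch of a documentation record kept alongside a model. The fields and example values are hypothetical, not a template prescribed in the episode.

```python
# Hypothetical sketch: a structured record answering the four documentation
# questions (why built, what data, how implemented, what it outputs).
from dataclasses import dataclass, field

@dataclass
class ModelDocumentation:
    purpose: str                      # why was the model built?
    input_data: list[str]             # what data goes into the model?
    implementation: str               # how is the model implemented?
    outputs: str                      # what does the model output?
    limitations: list[str] = field(default_factory=list)

doc = ModelDocumentation(
    purpose="Estimate probability of loan default to support underwriting review.",
    input_data=["applicant income", "debt-to-income ratio", "payment history"],
    implementation="Gradient-boosted trees, monthly retraining, versioned pipeline.",
    outputs="Default probability in [0, 1], plus top contributing features.",
    limitations=["Not validated for applicants outside the training population."],
)
print(doc)
```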
Are businesses ready for large language models as a path to AI? In this episode, the hosts reflect on the past year: what has changed and what hasn't in the world of LLMs. Join us as we debunk the latest myths and emphasize the importance of robust risk management in AI integration. The good news is that many decisions about adoption have forced businesses to discuss their future and impact in the face of emerging technology. You won't want to miss this discussion.

Intro and news: The veto of California's AI Safety Bill (00:00:03)
• Can state-specific AI regulations really protect consumers, or do they risk stifling innovation? (Gov. Newsom's response)
• The veto highlights the critical need for risk-based regulations that don't rely solely on the size and cost of language models
• Arguments to be made for a cohesive national framework that ensures consistent AI regulation across the United States

Are businesses ready to embrace large language models, or are they underestimating the challenges? (00:08:35)
• The myth that acquiring a foundational model is a quick fix for productivity woes
• The essential role of robust risk management strategies, especially in sensitive sectors handling personal data
• Review of model cards, OpenAI's system cards, and the importance of thorough testing, validation, and stricter regulations to prevent a false sense of security
• Transparency alone is not enough; objective assessments are crucial for genuine progress in AI integration

From hallucinations in language models to ethical energy use, we tackle some of the most pressing problems in AI today (00:16:29)
• Reinforcement learning with annotators and the controversial use of other models for review
• Yann LeCun's energy-based models and retrieval-augmented generation (RAG) offer intriguing alternatives that could reshape modeling approaches

The ethics of advancing AI technologies: consider the parallels with past monumental achievements and the responsible allocation of resources (00:26:49)
• There is good news about developments and lessons learned from LLMs, but there is also a long way to go
• Our original prediction for LLMs in episode 2 still rings true: "Reasonable expectations of LLMs: Where truth matters and risk tolerance is low, LLMs will not be a good fit"
• With increased hype and awareness from LLMs came varying levels of interest in how all model types and their impacts are governed in a business
Our special guest, astrophysicist Rachel Losacco, explains the intricacies of galaxies, modeling, and the computational methods that unveil their mysteries. She shares stories about how advanced computational resources enable scientists to decode galaxy interactions over millions of years with true-to-life accuracy. Sid and Andrew discuss transferable practices for building resilient modeling systems.

Prologue: Why it's important to bring stats back [00:00:03]
• Announcement from the American Statistical Association (ASA): Data Science Statement Updated to Include "and AI"

Today's guest: Rachel Losacco [00:02:10]
• Rachel is an astrophysicist who has worked with major galaxy formation simulations for many years. She hails from Leiden (Lie-den) University and the University of Florida. As a Senior Data Scientist, she works on modeling road safety.

Defining complex systems through astrophysics [00:02:52]
• Discussion about the origins and adoption of complex systems
• Difficulties with complex systems: nonlinearity, chaos and randomness, collective dynamics and hierarchy, and emergence

Complexities of nonlinear systems [00:08:20]
• Linear models (least squares, GLMs, SVMs) can be incredibly powerful, but they cannot model all possible functions (e.g., a decision boundary of concentric circles; a quick sketch follows these notes)
• Non-linearity and how it exists in the natural world

Chaos and randomness [00:11:30]
• Enter references to Jurassic Park and the butterfly effect
• "In universe simulations, a change to a single parameter can govern whether entire galaxy clusters will ever form" - Rachel

Collective dynamics and hierarchy [00:15:45]
• Interactions between agents don't occur globally and are often mediated through effects that only happen on specific sub-scales
• Adaptation: components of systems break out of linear relationships between inputs and outputs to better serve the function of the greater system

Emergence and complexity [00:23:36]
• New properties arise from the system that cannot be explained by the base rules governing the system

Examples in astrophysics [00:24:34]
• These difficulties are part of solving previously impossible problems
• Consider this lecture from IIT Delhi on Complex Systems to get a sense of what is required to study and formalize a complex system and its collective dynamics (https://www.youtube.com/watch?v=yJ39ppgJlf0)

Consciousness and reasoning from a new point of view [00:31:45]
• Non-linearity, hierarchy, feedback loops, and emergence may be ways to study consciousness. The brain is a complex system that a simple set of rules cannot fully define.
• See: Brain modeling from scratch of C. elegans
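A quick sketch of the concentric-circles point above, added for illustration (not from the episode): on scikit-learn's synthetic circles data, a linear classifier sits near chance while a non-linear (RBF-kernel) model separates the rings easily.

```python
# Linear vs. non-linear decision boundaries on concentric circles.
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_circles(n_samples=1000, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear = LogisticRegression().fit(X_train, y_train)          # linear boundary
nonlinear = SVC(kernel="rbf").fit(X_train, y_train)          # non-linear boundary

print("linear accuracy:    ", linear.score(X_test, y_test))     # near chance (~0.5)
print("non-linear accuracy:", nonlinear.score(X_test, y_test))  # near 1.0
```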
Can your AI models survive a big disaster? While a recent major IT incident with CrowdStrike wasn't AI-related, the magnitude and reaction reminded us that no system, no matter how proven, is immune to failure. AI modeling systems are no different. Neglecting the best practices of building models can lead to unrecoverable failures. Discover how the three-tiered framework of robustness, resiliency, and anti-fragility can guide your approach to creating AI infrastructures that not only perform reliably under stress but also fail gracefully when the unexpected happens.

Show Notes
Technology, incidents, and why basics matter (00:00:03)
• While the recent CrowdStrike incident wasn't caused by AI, its impact was a wake-up call for the people and processes that support critical systems
• As AI is increasingly used at both experimental and production levels, we can expect AI incidents to be a matter of when, not if. What can you do to prepare?

The "7 Ps": Are you capable of handling the unexpected? (00:09:05)
• The 7 Ps is an adage, dating back to WWII, that aligns with our "do things the hard way" approach to AI governance and modeling systems
• Let's consider the levels of building a performant system: robustness, resiliency, and antifragility

Model robustness (00:10:03)
• Robustness is a very important but often overlooked component of building modeling systems. We suspect that part of the problem is due to:
• The Kaggle-driven upbringing of data scientists
• Assumed generalizability of modeling systems, when models are optimized to perform well on their training data but do not generalize enough to perform well on unseen data

Model resilience (00:16:10)
• Resiliency is the ability to absorb adverse stimuli without destruction and return to the pre-event state
• In practice, robustness and resiliency testing and planning are often easy components to leave out. This is where risks and threats are exposed.
• See also Episode 8, Model validation: Robustness and resilience

Models and antifragility (00:25:04)
• Unlike resiliency, which is the ability to absorb damaging inputs without breaking, antifragility is the ability of a system to improve from challenging stimuli (i.e., the human body)
• A key question we need to ask ourselves: if we are not actively building our AI systems to be antifragile, why are we using AI systems at all?
Join us as we chat with Patrick Hall, Principal Scientist at Hallresearch.ai and Assistant Professor at George Washington University. He shares his insights on the current state of AI, its limitations, and the potential risks associated with it. The conversation also touches on the importance of responsible AI, the role of the National Institute of Standards and Technology (NIST) AI Risk Management Framework (RMF) in adoption, and the implications of using generative AI in decision-making.

Show notes
Governance, model explainability, and high-risk applications (00:00:03)
• Intro to Patrick
• His latest book: Machine Learning for High-Risk Applications: Approaches to Responsible AI (2023)

The benefits of the NIST AI Risk Management Framework (00:04:01)
• NIST does not have a profit motive, which avoids potential conflicts of interest when providing guidance on responsible AI
• It solicits, adjudicates, and incorporates feedback from the public and other stakeholders
• NIST guidance is not law; however, its recommendations set companies up for outcome-based reviews by regulators

Accountability challenges in "blame-free" cultures (00:10:24)
• Patrick notes that these cultures have the hardest time with the framework's recommendations
• Practices like documentation and fair model reviews need accountability and objectivity
• If everyone's responsible, no one's responsible

The value of explainable models vs. black-box models (00:15:00)
• Concerns about replacing explainable models with LLMs for LLMs' sake
• Why generative AI is bad for decision-making

AI and its impact on students (00:21:49)
• Students are more indicative of where the hype and market are today
• Teaching them how to work through the best model for the job despite the hype

AI incidents and contextual failures (00:26:17)
• AI Incident Database
• AI, as it currently stands, is a memorizing and calculating technology. It lacks the ability to incorporate real-world context.
• McDonald's AI drive-thru debacle is a warning to us all

Generative AI and homogenization problems (00:34:30)

Recommended resources from Patrick:
• Ed Zitron, "Better Offline"
• NIST ARIA
• "AI Safety Is a Narrative Problem"
Ready to uncover the secrets of modern systems engineering and the future of AI? Join us for an enlightening conversation with Matt Barlin, the Chief Science Officer of Valence. Matt's extensive background in systems engineering and data lineage sets the stage for a fascinating discussion. He sheds light on the historical evolution of the field, the critical role of documentation, and the early detection of defects in complex systems. This episode promises to expand your understanding of model-based systems and data issues, offering valuable insights that only an expert of Matt's caliber can provide.

In the heart of our episode, we dive into the fundamentals and transformative benefits of data lineage in AI. Matt draws intriguing parallels between data lineage and the engineering life cycle, stressing the importance of tracking data origins, access rights, and verification processes. Discover how decentralized identifiers are paving the way for individuals to control and monetize their own data. With the phasing out of third-party cookies and the challenges of human-generated training data shortages, we explore how systems like retrieval-augmented generation (RAG) and compliance regulations like the EU AI Act are shaping the landscape of AI data quality and compliance. Don't miss this thought-provoking episode that promises to keep you at the forefront of responsible AI.
Explore the basics of differential privacy and its critical role in protecting individual anonymity. The hosts explain the latest guidelines and best practices for applying differential privacy to data for models such as AI. Learn how this method also makes sure that personal data remains confidential, even when datasets are analyzed or hacked.

Show Notes
Intro and AI news (00:00)
• Google AI search tells users to glue pizza and eat rocks
• Gary Marcus on break? (Maybe, and only from X)

What is differential privacy? (06:34)
• Differential privacy is a process for sensitive data anonymization that offers each individual in a dataset the same privacy they would experience if they were removed from the dataset entirely.
• NIST's recent paper SP 800-226 IPD: "Any privacy harms that result from a differentially private analysis could have happened if you had not contributed your data."
• There are two main types of differential privacy: global (NIST calls it central) and local

Why should people care about differential privacy? (11:30)
• Interest has been increasing for organizations to intentionally and systematically prioritize the privacy and safety of user data
• Speed up deployments of AI systems for enterprise customers, since connections to raw data do not need to be established
• Increase data security for customers that use sensitive data in their modeling systems
• Minimize the risk of sensitive data exposure through your data privileges - i.e., don't be THAT organization

Guidelines and resources for applied differential privacy
• Guidelines for Evaluating Differential Privacy Guarantees: NIST De-Identification

Practical examples of applied differential privacy (15:58)
• Continuous features - cite: Dwork, McSherry, Nissim, and Smith's seminal 2006 paper, "Calibrating Noise to Sensitivity in Private Data Analysis," which introduces a concept called ε-differential privacy (a minimal sketch follows these notes)
• Categorical features - cite: Warner (1965) created a randomized response technique in his paper "Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias"

Summary and key takeaways (23:59)
• Differential privacy is going to be part of how many of us manage data privacy, especially when data providers can't give us anonymized data for analysis or when anonymization isn't enough for our privacy needs
• Hopeful that cohort targeting takes over from individual targeting
• Remember: differential privacy does not prevent bias!
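For the two cited papers, here is a hedged sketch of the classic mechanisms: the Laplace mechanism from Dwork et al. (2006) for a continuous counting query, and Warner's (1965) randomized response for a yes/no question. The parameter choices are illustrative only, not a production recommendation.

```python
# Sketch of two classic differential privacy mechanisms.
import numpy as np

rng = np.random.default_rng(0)

def laplace_count(data: np.ndarray, epsilon: float) -> float:
    """Dwork et al. (2006): add Laplace noise scaled to sensitivity/epsilon.
    A counting query changes by at most 1 when one person is added or removed,
    so its sensitivity is 1."""
    true_count = float(np.sum(data))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

def randomized_response(answer: bool, p_truth: float = 0.75) -> bool:
    """Warner (1965): report the true answer with probability p_truth,
    otherwise report a uniformly random answer (a local-DP style mechanism)."""
    if rng.random() < p_truth:
        return answer
    return rng.random() < 0.5

has_condition = rng.random(10_000) < 0.1          # sensitive yes/no attribute
print("noisy count (eps=0.5):", laplace_count(has_condition, epsilon=0.5))
print("one randomized answer:", randomized_response(True))
```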
Artificial Intelligence (AI) stands at a unique intersection of technology, ethics, and regulation. The complexities of responsible AI are brought into sharp focus in this episode featuring Anthony Habayeb, CEO and co-founder of Monitaur. As responsible AI is scrutinized for its role in profitability and innovation, Anthony and our hosts discuss the imperatives of safe and unbiased modeling systems, the role of regulations, and the importance of ethics in shaping AI.

Show notes
Prologue: Why responsible AI? Why now? (00:00:00)
• Deviating from our normal topics about modeling best practices
• Context about where regulation plays a role in industries besides big tech
• Can we learn from other industries about the role of "responsibility" in products?

Special guest, Anthony Habayeb (00:02:59)
• Introductions and start of the discussion
• Of all the companies you could build around AI, why governance?

Is responsible AI the right phrase? (00:11:20)
• Should we even call good modeling and business practices "responsible AI"?
• Is responsible AI a "want to have" or a "need to have"?

Importance of AI regulation and responsibility (00:14:49)
• People in the AI and regulation worlds have started pushing back on responsible AI
• Do regulations impede freedom?
• Discussing the big picture of responsibility and governance: explainability, repeatability, records, and audit

What about bias and fairness? (00:22:40)
• You can have fair models that operate with bias
• Bias in practice identifies inequities that models have learned
• Fairness corrects for societal biases to level the playing field so that safer business and modeling practices can prevail

Responsible deployment and business management (00:35:10)
• Discussion about what organizations get right about responsible AI
• And what organizations can get completely wrong if they aren't careful

Embracing responsible AI practices (00:41:15)
• Getting your teams, companies, and individuals involved in the movement toward building AI responsibly
Baseline modeling is a necessary part of model validation. In our expert opinion, it should be required before model deployment. There are many types of baseline models, and in this episode we discuss their use cases, strengths, and weaknesses. We're sure you'll appreciate a fresh take on how to improve your modeling practices.

Show notes
Introductions and news: why reporting and visibility are a good thing for AI (0:03)
• Spoiler alert: providing visibility into AI bias audits does NOT mean exposing trade secrets. Some reports claim otherwise.
• Discussion about AI regulation in the context of current events and how regulation is playing out between Boeing and the FAA (tbc)

Understanding baseline modeling for machine learning (7:41)
• Establishing baselines allows us to understand how models perform relative to simple rules-based models, aka heuristics
• Reporting results without baselines to compare against is like giving a movie a rating of 5 without telling the listener that you were using a 10-point scale
• Baseline comparisons are part of rigorous model validation and should always be conducted during early model development and final production deployment
• They pair with analyses of theoretical upper bounds for modeling performance to show how your technique scores between acceptable worst-case and best-case performance
• We often find complex models deployed in the real world that haven't proven their value over simpler, explainable baseline models

Classification baselines and model performance comparison (19:40)
• Uniform random selection - simulate how your model does against a baseline that guesses classes randomly, like rolling a die
• Most frequent class (MFC) - often the most telling test in the case of highly skewed data with inappropriate metrics
• Single-feature modeling - validates how much the complex signal from your data and model improves over a bare-minimum explainable model
• And more… (a quick sketch of these baselines follows these notes)

Exploring regression and more advanced baselines for modeling (24:11)
• Regression baselines: mean, median, mode, single-variable linear regression, lag-1, and least-5% re-interpretation
• Advanced baselines in language and vision

Conclusions (35:39)
• Baseline modeling is a necessary part of model validation
• There are differing flavors of baselines that are appropriate for all types of modeling
• Baselines are needed to establish fair and realistic lower bounds for performance
• If your model can't perform significantly better than a baseline, consider scrapping the model and trying a new approach
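A minimal sketch of the classification baselines above, using scikit-learn's built-in dummy estimators plus a single-feature model; the dataset is just for illustration.

```python
# Baseline comparisons: random, most-frequent-class, single-feature, and full model.
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

baselines = {
    "uniform random":      DummyClassifier(strategy="uniform", random_state=0),
    "most frequent class": DummyClassifier(strategy="most_frequent"),
}
for name, model in baselines.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))

# Single-feature baseline: how far does one column get you?
single = LogisticRegression(max_iter=1000).fit(X_tr[:, [0]], y_tr)
print("single feature", single.score(X_te[:, [0]], y_te))

# The "real" model only earns its complexity if it clearly beats all of the above.
full = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("full model", full.score(X_te, y_te))
```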
In this episode, we explore information theory and the not-so-obvious shortcomings of its popular metrics for model monitoring, and where non-parametric statistical methods can serve as the better option.

Introduction and latest news (0:03)
• Gary Marcus has written an article questioning the hype around generative AI, suggesting it may not be as transformative as previously thought
• This is in contrast to announcements out of the NVIDIA conference during the same week

Information theory and its applications in AI (3:45)
• The importance of information theory in computer science, citing its applications in cryptography and communication
• The basics of information theory, including the concept of entropy, which measures the uncertainty of a random variable
• Information theory as a fundamental discipline in computer science, and how it has been applied in recent years, particularly in the field of machine learning
• The speakers clarify the difference between a metric and a divergence, which is crucial to understanding how information theory is being misapplied in some cases

Information theory metrics and their limitations (7:05)
• Divergences are a type of measurement that don't follow simple rules like distance, and they have some nice properties but can be troublesome in certain use cases
• KL divergence is a popular test for monitoring changes in data distributions, but it's not symmetric and can lead to incorrect comparisons
• Sid explains that KL divergence measures the extra surprisal (the entropy difference) incurred when moving from one data distribution to another, and is not the same as the KS test

Metrics for monitoring AI model changes (10:41)
• The limitations of KL divergence and its alternatives, including Jensen-Shannon divergence and the population stability index (a quick sketch of these follows these notes)
• The issues with KL divergence, such as asymmetry and the handling of zeros; the advantages of Jensen-Shannon divergence, which handles both issues, and the population stability index, which provides a quantitative measure of changes in model distributions
• The popularity of information theory metrics in AI and ML is largely due to legacy and a lack of understanding of the underlying concepts
• Information theory metrics may not be the best choice for quantifying change in risk in the AI and ML space, but they are commonly used due to familiarity and ease of use

Using nonparametric statistics in modeling systems (15:09)
• Information theory divergences are not useful for monitoring production model performance, according to the speakers
• Andrew Clark highlights the advantages of using nonparametric statistics in machine learning, including distribution agnosticism and the ability to test for significance without knowing the underlying distribution
• Sid Mangalik…
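A short sketch of the monitoring measures discussed above, assuming binned histograms of a baseline and a current sample: it shows KL's asymmetry, the symmetric Jensen-Shannon distance, one common PSI formulation, and the non-parametric KS test on the raw samples. The data and bin count are arbitrary choices for illustration.

```python
# Comparing drift measures: KL (asymmetric), Jensen-Shannon, PSI, and KS test.
import numpy as np
from scipy.stats import entropy, ks_2samp
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 5000)     # training-time distribution
current = rng.normal(0.3, 1.2, 5000)      # shifted production distribution

# Bin both samples on a common grid to get discrete distributions p and q.
edges = np.histogram_bin_edges(np.concatenate([baseline, current]), bins=20)
p, _ = np.histogram(baseline, bins=edges)
q, _ = np.histogram(current, bins=edges)
p = (p + 1e-6) / (p + 1e-6).sum()          # smooth to avoid zero bins
q = (q + 1e-6) / (q + 1e-6).sum()

print("KL(p||q):", entropy(p, q))          # not equal to KL(q||p)
print("KL(q||p):", entropy(q, p))
print("Jensen-Shannon distance:", jensenshannon(p, q))
print("PSI:", float(np.sum((q - p) * np.log(q / p))))
print("KS test:", ks_2samp(baseline, current))
```

The same pattern applies to model scores: bin the baseline and current score distributions and track a symmetric measure over time rather than a one-directional KL.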
In this episode, the hosts focus on the basics of anomaly detection in machine learning and AI systems, including its importance and how it is implemented. They also touch on the topic of large language models, the (in)accuracy of data scraping, and the importance of high-quality data when employing various detection methods. You'll even gain some techniques you can use right away to improve your training data and your models.

Intro and discussion (0:03)
• Questions about information theory from our non-parametric statistics episode
• Google CEO calls out chatbots (WSJ)
• A statement about anomaly detection as it was regarded in 2020 (Forbes)
• In the year 2024, are we using AI to detect anomalies, or are we detecting anomalies in AI? Both?

Understanding anomalies and outliers in data (6:34)
• Anomalies or outliers are data that are so unexpected that their inclusion raises warning flags about inauthentic or misrepresented data collection
• The detection of these anomalies is present in many fields of study, but canonically in finance, sales, networking, security, machine learning, and systems monitoring
• A well-controlled modeling system should have few outliers
• Where anomalies come from, including data entry mistakes, data scraping errors, and adversarial agents
• Biggest dinosaur example: https://fivethirtyeight.com/features/the-biggest-dinosaur-in-history-may-never-have-existed/

Detecting outliers in data analysis (15:02)
• High-quality, highly curated data is crucial for effective anomaly detection
• Domain expertise plays a significant role in anomaly detection, particularly in determining what makes up an anomaly

Anomaly detection methods (19:57)
• Discussion and examples of various methods used for anomaly detection (a small sketch follows these notes):
• Supervised methods
• Unsupervised methods
• Semi-supervised methods
• Statistical methods

Anomaly detection challenges and limitations (23:24)
• Anomaly detection is a complex process that requires careful consideration of various factors, including the distribution of the data, the context in which the data is used, and the potential for errors in data entry
• Perhaps we're detecting anomalies in human research design, not AI itself?
• A simple first step to anomaly detection is to visually plot numerical fields. "Just look at your data, don't take it at face value and really examine if it does what you think it does and it has what you think it has in it." This basic practice, devoid of any complex AI methods, can be an effective starting point in identifying potential anomalies.
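A small sketch of the "look at your data first" advice: a plain statistical flag (z-scores) on one numeric field, followed by an unsupervised method (Isolation Forest) as one example of the model-based approaches mentioned above. The data and thresholds are made up for illustration.

```python
# Simple statistical outlier flag, then an unsupervised anomaly detector.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Mostly plausible body weights (kg) plus a few data-entry style errors.
weights_kg = np.concatenate([rng.normal(70, 10, 995), [720, -5, 0, 6500, 810]])

# Step 1: plain statistics. Note that extreme outliers inflate the std,
# which can mask smaller ones (one reason to also plot the field).
z = (weights_kg - weights_kg.mean()) / weights_kg.std()
print("z-score outliers:", weights_kg[np.abs(z) > 3])

# Step 2: an unsupervised model that scores each point by how easily it isolates.
iso = IsolationForest(contamination=0.01, random_state=0)
labels = iso.fit_predict(weights_kg.reshape(-1, 1))   # -1 = anomaly, 1 = normal
print("isolation forest outliers:", weights_kg[labels == -1])
```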
We're taking a slight detour from modeling best practices to explore questions about AI and consciousness. With special guest Michael Herman, co-founder of Monitaur and TestDriven.io, the team discusses different philosophical perspectives on consciousness and how these apply to AI. They also discuss the potential dangers of AI in its current state and why starting fresh, instead of iterating, can make all the difference in achieving characteristics of AI that might resemble consciousness.

Show notes
Why consciousness for this episode?
• Enough listeners have randomly asked the hosts if Skynet is on the horizon
• Does modern or future AI have the wherewithal to take over the world, and is it even conscious or intelligent?
• Do we even have a good definition of consciousness?

Introducing Michael Herman as guest speaker
• Co-founder of Monitaur, engineer extraordinaire, and creator of TestDriven.io, a training company that focuses on educating and upskilling mid-level and senior-level web developers
• Degree and studies in philosophy and technology

Establishing the philosophical foundation of consciousness
• Consciousness is around us everywhere. It can mean different things to different people.
• Most discussion about the subject bypasses the Mind-Body Problem and a few key theories:
• Dualism - the mind and body are distinct
• Materialism - matter is king, and consciousness arises in complex material systems
• Panpsychism - consciousness is king. It underlies everything at the quantum level.

The potential dangers of achieving consciousness in AI
• While there is potential for AI to reach consciousness, we're far from that point. Dangers are more related to manipulation and misinformation than to the risk of conscious machines turning against humanity.

The need for a new approach to developing AI systems
• There's a need to start from scratch if the goal is to achieve consciousness in AI systems
• Current modeling techniques might not lead to AI achieving consciousness. A new paradigm might be required.
• There's a need to define what consciousness in AI means and to develop a test for it

Final thoughts and wrap-up
• If consciousness is truly the goal, starting from scratch allows for fairness and ethics to be established foundationally
• AI systems should be built with human values in mind
Data scientists, researchers, engineers, marketers, and risk leaders find themselves at a crossroads: expand their skills or risk obsolescence. The hosts discuss how a growth mindset and "the fundamentals" of AI can help.

Our episode shines a light on this vital shift, equipping listeners with strategies to elevate their skills and integrate multidisciplinary knowledge. We share stories from the trenches on how each role affects robust AI solutions that adhere to ethical standards, and how embracing a T-shaped model of expertise can empower data scientists to lead the charge in industry-specific innovations.

Zooming out to the executive suite, we dissect the complex dance of aligning AI innovation with core business strategies. Business leaders, take note as we debunk the myth of AI as a panacea and advocate for a measured, customer-centric approach to technology adoption. We emphasize the decisive role executives play in steering their companies through the AI terrain, ensuring that every technological choice propels the business forward rather than chasing the ephemeral allure of AI trends.

Suggested courses, public offerings:
• Undergraduate-level Stanford course (Coursera): Machine Learning Specialization
• Graduate-level MIT OpenCourseWare: Machine Learning

We hope you enjoy this candid conversation that could reshape your outlook on the future of AI and the roles and responsibilities that support it.

Resources mentioned in this episode
• LinkedIn's Jobs on the Rise 2024
• 3 questions to separate AI from marketing hype
• Disruption or distortion? The impact of AI on future operating models
• The Obstacle Is the Way by Ryan Holiday
Get ready for 2024 and a brand new episode! We discuss non-parametric statistics in data analysis and AI modeling. Learn more about applications in user research methods, as well as the importance of key assumptions in statistics and data modeling that must not be overlooked. After you listen to the episode, be sure to check out the supplemental material in Exploring non-parametric statistics.

Welcome to 2024 (0:03)
• AI, privacy, and marketing in the tech industry
• OpenAI's GPT store launch (The Verge)
• Google's changes to third-party cookies (Gizmodo)

Non-parametric statistics and its applications (6:49)
• A solution for modeling in environments where data knowledge is limited
• Contrast non-parametric statistics with parametric statistics, plus their respective strengths and weaknesses

Assumptions in statistics and data modeling (9:48)
• The importance of understanding statistics in data science, particularly in modeling and machine learning (probability distributions, Wikipedia)
• Discussion about the common assumption of representing data with a normal distribution, which oversimplifies complex real-world phenomena
• The importance of understanding data assumptions when using statistical models

Statistical distributions and their importance in data analysis (15:08)
• The importance of subject matter experts in evaluating data distributions, as assumptions about data shape can lead to missed power and incorrect modeling
• Examples of distributions used in various situations, such as the Poisson for wait times and counts, discrete distributions like the uniform, and the Gaussian (normal) for continuous events
• The complexity of selecting the appropriate distribution for statistical analysis; understand the specific distribution and its properties

Non-parametric statistics and its applications in data analysis (19:31)
• Non-parametric statistics are more robust to outliers and can generalize across different datasets without requiring domain expertise or data massaging
• Methods rely on rank ordering and have less statistical power compared to parametric methods, but are more flexible and can handle complex data sets better
• Discussion of their usefulness and limitations; they require more data to detect meaningful changes compared to parametric tests

Non-parametric tests for comparing data sets (24:15)
• Non-parametric tests, including the K-S test and the chi-square test, can compare two sets of data without assuming a specific distribution (a minimal sketch follows these notes)
• They can also be used for machine learning, classification, and regression tasks, even when the underlying data…
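A minimal sketch of the tests mentioned above using SciPy: a rank-based two-sample comparison (Mann-Whitney U), the two-sample KS test, and a chi-square test on a contingency table. The data are synthetic and for illustration only.

```python
# Distribution-free comparisons with SciPy.
import numpy as np
from scipy.stats import mannwhitneyu, ks_2samp, chi2_contingency

rng = np.random.default_rng(0)
control = rng.lognormal(mean=3.0, sigma=0.5, size=200)    # skewed, non-normal
treatment = rng.lognormal(mean=3.2, sigma=0.5, size=200)

# Rank-based comparison of two samples: no normality assumption.
print("Mann-Whitney U:", mannwhitneyu(control, treatment))

# Distribution-free comparison of the full shapes of the two samples.
print("KS two-sample:", ks_2samp(control, treatment))

# Chi-square test for categorical data: does the class mix differ between groups?
table = np.array([[90, 110],    # group A: counts of category 1 vs. category 2
                  [70, 130]])   # group B
chi2, p, dof, _ = chi2_contingency(table)
print("chi-square:", chi2, "p-value:", p)
```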
It's the end of 2023 and our first season. The hosts reflect on what's happened with the fundamentals of AI regulation, data privacy, and ethics. Spoiler alert: a lot! And we're excited to share our outlook for AI in 2024.

AI regulation and its impact in 2024 (2:36)
• Hosts reflect on AI regulation discussions from their first 10 episodes, discussing what went well and what didn't
• Its potential impact on innovation

AI innovation, regulation, and best practices (7:05)

AI, privacy, and data security in healthcare (11:08)
• Data scientists face the contradictory goals of fairness and privacy in AI, with increased interest in balancing both in 2024
• HIPAA privacy and the increasing use of machine learning and AI in healthcare
• Special thanks to the team at Shifting Privacy Left for a more in-depth discussion about privacy and synthetic data

AI safety and ethics in NLP research (15:40)
• Does OpenAI's closed model set a bad precedent for research?
• The tension in NLP research: AI safety and OpenAI's approach

Modeling mindset and reality between scientists and AI experts (18:44)
• Thanks to the special guests who joined us in 2023
• Josh Pyle and his insights on bias and actuarial bias, and how they apply to his work. See episode 10.
• Christoph Molnar for explaining the differences in mindset between scientists and AI scientists when modeling reality. See episode 4.

AI ethics and its challenges in the industry (21:46)
• Andrew Clark emphasizes the importance of understanding the basics of time-series analysis and choosing the right tools for the job, rather than relying on automated methods or blindly using existing techniques
• Sid Mangalik: the AI ethics discussion needs to level up, as people are confusing models with AGI and not addressing practical issues
• Andrew Clark: AI ethics is at a bad crossroads, with high-level discussions divorced from reality and companies paying lip service to ethical concerns

AI ethics and responsible development (26:10)
• Companies with ethical bodies and practices are better equipped to handle AI ethics concerns
• Andrew expresses concern about the ethical implications of AI, particularly in the context of Google's Gemini project
• Comparing AI safety to carbon credits, and the importance of a proactive approach to ethical considerations

AI ethics and its importance in business (29:31)
• Susan highlights the importance of ethics in AI, citing examples of challenges between board and executive teams
Joshua Pyle joins us for a discussion about managing bias in the actuarial sciences. Together with Andrew's and Sid's perspectives from the economics and data science fields, they deliver an interdisciplinary conversation about bias that you'll only find here.

OpenAI news plus new developments in language models (0:03)
• The hosts discuss the aftermath at OpenAI and Sam Altman's return as CEO
• Tension between OpenAI's board and researchers over the push for slow, responsible AI development vs. fast, breakthrough model-making
• Microsoft researchers find that smaller, high-quality data sets can be more effective for training language models than larger, lower-quality sets (Orca 2)
• Google announces Gemini, a trio of models with varying parameters, including an ultra-light version for phones

Bias in actuarial sciences with Joshua Pyle, FCAS (9:29)
• Josh shares insights on managing bias in the actuarial sciences, drawing on his 20 years of experience in the field
• Bias in actuarial work is defined as differential treatment leading to unfavorable outcomes, with protected classes including race, religion, and more

Actuarial bias and model validation in ratemaking (15:48)
• The importance of analyzing the impact of pricing changes on protected classes, and the potential for unintended consequences when using proxies in actuarial ratemaking
• Three major causes of unfair bias in ratemaking (Contingencies, Nov 2023)
• Gaps in the actuarial process that could lead to bias, including the lack of a standardized governance framework for model validation and calibration

Actuarial standards, bias, and credibility (20:45)
• Complex state-level regulations and limited data pose challenges for predictive modeling in insurance
• Actuaries debate the definition and mitigation of bias in continuing education

Bias analysis in actuarial modeling (27:16)
• The importance of dislocation analysis in bias analysis
• Analyze two versions of a model to compare the predictive power of including vs. excluding a protected class such as race (a rough sketch follows these notes)

Bias in AI models in the actuarial field (33:56)
• Actuaries can learn from data scientists' tendency to over-engineer models
• Actuaries may feel excluded from the Big Data era due to their need to explain their methods
• Standardization is needed to help actuaries identify and mitigate bias

Interdisciplinary approaches to AI modeling and governance (42:11)
• Sid hopes to see more systematic and published approaches to addressing bias in the data science field
• Andrew emphasizes the importance of interdisciplinary collaboration between actuaries, data scientists, and economists to create more accurate and fair modeling systems
• Josh agrees and highlights the need for better governance structures to support this collaboration, citing the lack of good journals and academic silos as a challenge
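A rough sketch, on synthetic data with hypothetical feature names, of the two-model comparison described above: fit the model with and without the protected class, compare predictive power, and look at how group-level predictions dislocate. This is our illustration of the general idea, not the actuarial procedure from the episode.

```python
# Compare predictive power and group-level prediction shifts with vs. without
# a protected class in the feature set (synthetic, illustrative data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
protected = rng.integers(0, 2, n)                      # hypothetical protected class
risk_factor = rng.normal(0, 1, n) + 0.5 * protected    # correlated proxy feature
claim = (rng.random(n) < 1 / (1 + np.exp(-(0.8 * risk_factor - 1)))).astype(int)

X_full = np.column_stack([risk_factor, protected])
X_nopc = risk_factor.reshape(-1, 1)
idx_tr, idx_te = train_test_split(np.arange(n), random_state=0)

m_full = LogisticRegression().fit(X_full[idx_tr], claim[idx_tr])
m_nopc = LogisticRegression().fit(X_nopc[idx_tr], claim[idx_tr])

p_full = m_full.predict_proba(X_full[idx_te])[:, 1]
p_nopc = m_nopc.predict_proba(X_nopc[idx_te])[:, 1]

print("AUC with protected class:   ", roc_auc_score(claim[idx_te], p_full))
print("AUC without protected class:", roc_auc_score(claim[idx_te], p_nopc))

# Dislocation: how much do predicted rates move for each group when the
# protected class is excluded?
for g in (0, 1):
    mask = protected[idx_te] == g
    print(f"group {g}: mean prediction with={p_full[mask].mean():.3f} "
          f"without={p_nopc[mask].mean():.3f}")
```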
Episode 9. Continuing our series about model validation. In this episode, the hosts focus on aspects of performance: why we need to do statistics correctly, and not use metrics without understanding how they work, to ensure that models are evaluated in a meaningful way.

AI regulations, red team testing, and physics-based modeling (0:03)
• The hosts discuss the Biden administration's executive order on AI and its implications for model validation and performance

Evaluating machine learning models using accuracy, recall, and precision (6:52)
• The four types of results in classification: true positive, false positive, true negative, and false negative
• The three standard metrics are composed of these elements: accuracy, recall, and precision

Accuracy metrics for classification models (12:36)
• Precision and recall are interrelated aspects of accuracy in machine learning
• Using the F1 score and F-beta score in classification models, particularly when dealing with imbalanced data (a quick sketch of these metrics follows these notes)

Performance metrics for regression tasks (17:08)
• Handling imbalanced outcomes in machine learning, particularly in regression tasks
• The different metrics used to evaluate regression models, including mean squared error

Performance metrics for machine learning models (19:56)
• Mean squared error (MSE) as a metric for evaluating the accuracy of machine learning models, using the example of predicting house prices
• Mean absolute error (MAE) as an alternative metric, which penalizes large errors less heavily and is more straightforward to compute

Graph theory and operations research applications (25:48)
• Graph theory in machine learning, including the shortest path problem and clustering
• Euclidean distance is a popular benchmark for measuring distances between data points

Machine learning metrics and evaluation methods (33:06)

Model validation using statistics and information theory (37:08)
• Entropy, its roots in classical mechanics and thermodynamics, and its application in information theory, particularly the Shannon entropy calculation
• The importance of the use case and validation metrics for machine learning models
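A quick sketch of the classification and regression metrics walked through above, plus a direct Shannon entropy calculation; the numbers are toy values for illustration.

```python
# Classification, regression, and entropy metrics side by side.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, fbeta_score, mean_squared_error,
                             mean_absolute_error)

# Classification: imbalanced ground truth vs. predictions.
y_true = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("F2 (recall-weighted):", fbeta_score(y_true, y_pred, beta=2))

# Regression: MSE punishes large errors more heavily than MAE does.
house_true = np.array([200_000, 310_000, 150_000])
house_pred = np.array([210_000, 250_000, 155_000])
print("MSE:", mean_squared_error(house_true, house_pred))
print("MAE:", mean_absolute_error(house_true, house_pred))

# Shannon entropy of a class distribution (in bits): uncertainty of one draw.
p = np.array([0.7, 0.3])
print("entropy:", -np.sum(p * np.log2(p)))
```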
Episode 8. This is the first in a series of episodes dedicated to model validation. Today, we focus on model robustness and resilience. From complex financial systems to why your gym might be overcrowded at New Year's, you've been directly affected by these aspects of model validation.

AI hype and consumer trust (0:03)
An FTC article highlights consumer concerns about AI's impact on lives and businesses (Oct 3, FTC).
Increased public awareness of AI, and of the masses of data needed to train it, has raised awareness of the potential for misuse.
The need for transparency and trust in AI's development and deployment.

Model validation and its importance in AI development (3:42)
The importance of model validation in AI development: ensuring models are doing what they're supposed to do.
The FTC's heightened attention to responsibility and the need for fair and unbiased AI practices.
Model validation (targeted, specific) vs. model evaluation (general, open-ended).

Model validation and resilience in machine learning (8:26)
Collaboration between engineers and businesses to validate models for resilience and robustness.
Resilience: how well a model handles adverse data scenarios.
Robustness: a model's ability to generalize to unforeseen data.
Aerospace engineering: models must be resilient and robust to perform well in real-world environments.

Statistical evaluation and modeling in machine learning (14:09)
Statistical evaluation involves modeling a distribution without knowing everything about it, using methods like Monte Carlo sampling.
Monte Carlo simulations originated in physics for assessing risk and uncertainty in decision-making.

Monte Carlo methods for analyzing model robustness and resilience (17:24)
Monte Carlo simulations allow exploration of potential input spaces and estimation of the underlying distribution.
Useful when analytical solutions are unavailable.
Sensitivity analysis and uncertainty analysis as the two major flavors of analyses (a minimal code sketch follows these notes).

Monte Carlo techniques and model validation (21:31)
The versatility of Monte Carlo simulations in various fields.
Using Monte Carlo experiments to explore the semantic space vectors of language models like GPT.
The importance of validating machine learning models through negative scenario analysis.

Stress testing and resiliency in finance and engineering (25:48)
The importance of stress testing in finance, combining traditional methods with Monte Carlo techniques.
Synthetic data's potential in modeling critical systems.
Identifying potential gaps and vulnerabilities in critical systems.

Using operations research and model validation in AI development (30:13)
Operations research can help find an equilibrium for overcrowding in gyms.
Robust methods for solving complex problems in logistics.
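To make the sensitivity and uncertainty flavors concrete, here is a minimal Monte Carlo sketch in Python using only NumPy. The model function and noise scales are placeholders, not anything specified in the episode.

```python
# Minimal Monte Carlo sketch: propagate input uncertainty through a placeholder model.
import numpy as np

rng = np.random.default_rng(42)

def model(x):
    # Stand-in for any trained model or simulator; a simple nonlinear function here.
    return 3.0 * x[0] + x[1] ** 2

nominal = np.array([1.0, 2.0])
n_samples = 10_000

# Uncertainty analysis: sample the input space around the nominal point
# and look at the distribution of outputs.
inputs = nominal + rng.normal(scale=[0.1, 0.5], size=(n_samples, 2))
outputs = np.array([model(x) for x in inputs])
print("mean output :", outputs.mean())
print("95% interval:", np.percentile(outputs, [2.5, 97.5]))

# Crude one-at-a-time sensitivity analysis: which input drives output variance?
for i, scale in enumerate([0.1, 0.5]):
    perturbed = np.tile(nominal, (n_samples, 1))
    perturbed[:, i] += rng.normal(scale=scale, size=n_samples)
    spread = np.array([model(x) for x in perturbed]).std()
    print(f"output std from perturbing input {i} alone:", spread)
```

Because the whole analysis is just repeated sampling, the same pattern works whether the "model" is a closed-form function, a trained ML model, or an expensive simulator, which is why Monte Carlo is so useful when no analytical solution exists.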
Episode 7. To use or not to use? That is the question about digital twins that the fundamentalists explore. Many solutions continue to be proposed for making AI systems safer, but can digital twins really deliver for AI what we know they can do for physical systems? Tune in and find out.

Show notes

Digital twins by definition. 0:03
Digital twins are one-to-one digital models of real-life products, systems, or processes, used for simulation, testing, monitoring, maintenance, or practicing decommissioning.
The digital twin should be indistinguishable from the physical twin, allowing for safe and efficient problem-solving in a computerized environment.

Digital twins in manufacturing and aerospace engineering. 2:22
Digital twins are virtual replicas of physical processes, useful in manufacturing and space, but often misunderstood as just simulations or models.
Sid highlights the importance of identifying digital twin trends and distinguishing them from simulations or sandbox environments.
Andrew emphasizes the need for data standards and ETL processes to handle different vendors and data forms, clarifying that digital twins are not a one-size-fits-all solution.

Digital twins, AI models, and validation in a hybrid environment. 6:51
Validation is crucial for deploying mission-critical AI models, including generative AI.
Sid clarifies the misconception that AI models can directly replicate physical systems, emphasizing the importance of modeling specific data and context.
Andrew and Susan discuss the confusion around modeling and its limitations, including the need to validate models on specific datasets and avoid generalizing across contexts.
Referenced article from Venture Beat: 10 digital twin trends for 2023.

Digital twins, IoT, and their applications. 11:05
Susan and Sid discuss the limitations of digital twins, including their inability to interact with the real world and the complexity of modeling systems.
They reference a 2012 NASA paper that popularized the term "digital twin" and highlight the potential for confusion in its application to various industries.
Sid: Digital twinning requires more than just IoT devices; it's a complex process that involves monitoring and IoT devices across the physical system to create a faithful digital twin.
Andrew: Digital twins raise security and privacy concerns, especially in healthcare, where there are many IoT devices and lots of personal data that need to be protected.

Data privacy and security in digital twin technology. 17:03
Digital twins and data privacy face off in the IoT debate.
Susan and Andrew discuss data privacy concerns with AI and IoT, highlighting the potential for data breaches and lack of transparency.

Digital twins in healthcare and technology. 20:16
Susan and Andrew discuss digital twins in various industries, emphasizing their impact.
Episode 6. What does systems engineering have to do with AI fundamentals? In this episode, the team discusses what data and computer science as professions can learn from systems engineering, and how the methods and mindset of the latter can boost the quality of AI-based innovations.

Show notes

News and episode commentary. 0:03
ChatGPT usage is down for the second straight month.
The importance of understanding the data and how it affects the quality of synthetic data for non-tabular use cases like text. (Episode 5, Synthetic data)
Business decisions: the 2012 case of Target using algorithms in their advertising. (CIO, June 2023)

Systems engineering thinking. 3:45
The difference between building algorithms, building models, and building systems.
The term systems engineering came from Bell Labs in the 1940s and came into its own with the NASA Apollo program.
A system is a way of looking at the world: there is emergent behavior and complex interactions and relationships between data.
AI systems and ML systems are often distant from the expertise of people who do systems engineering.

Learning the hard way. 9:25
Systems engineering is about doing things the hard way: learning the physical sciences, the math, and how things work.
What else can be learned from the Apollo program.
Developing a system, and how important it is to align the development process with the criticality and safety of the project.
Systems engineering is often incorrectly associated with waterfall in software engineering.

What is a safer model to build? 14:26
What is a safer model, and how is systems engineering going to fit in with this world?
The data science hacker culture can be counterintuitive to this approach. For example, actuaries have a professional code of ethics and a set way that they learn.

Step back and review your model. 18:26
Peer review your model and see if reviewers can break it and stress-test it. Build monitoring around knowing where the fault points are, and also talk to business leaders.
Be careful about the other impacts the model can have on the business or, externally, on the people who start using it.
Marketing this type of engineering as robustness of the model, identifying what it is good at and what it's bad at, can in itself be a selling point.
Systems thinking gives us a chance to create lasting models.
Episode 5. This episode about synthetic data is very real. The fundamentalists uncover the pros and cons of synthetic data, as well as reliable use cases and the best techniques for safe and effective use in AI. When even SAG-AFTRA and OpenAI make synthetic data a household word, you know this is an episode you can't miss.

Show notes

What is synthetic data? 0:03
The definition is not a succinct one-liner, which is one of the key issues with assessing synthetic data generation.
Using general information scraped from the web for ML is backfiring.

Synthetic data generation and data recycling. 3:48
OpenAI is running into the problem that they don't have enough data for the scale at which they're trying to operate.
The poisoning effect that happens when trying to train on your own generated data.
Synthetic data generation is not a panacea, and it is not an exact science; it's more of an art than a science.

The pros and cons of using synthetic data. 6:46
The pros and cons of using synthetic data to train AI models, and how it differs from traditional medical data.
The importance of diversity in the training of AI models.
Synthetic data is a nuanced field, taking away the complexity of building data that is representative of a solution.

Differences between randomized and synthetic data. 9:52
Differential privacy is a lot more difficult to execute than many people acknowledge.
Anonymization is a huge piece of the application for fairness and bias, especially with larger deployments.
The hardest part is capturing complex interrelationships (e.g., Fukushima reactor stress testing didn't assume extreme enough scenarios).

The pros and cons of ChatGPT. 13:54
Invalid use cases for synthetic data in more depth.
Examples where humans cannot anonymize effectively.
Creating new data for where the company is right now before diving into use cases such as differential privacy.

Meaningful use cases for synthetic data. 16:38
Meaningful use cases for synthetic data: using the power of synthetic data correctly to generate outcomes that are important to you.
Pros and cons of using synthetic data in controlled environments.

The fallacy of "fairness through awareness." 18:39
Synthetic data is helpful for stress testing systems, edge-case scenario thought experiments, simulation, stress testing system design, and scenario-based methodologies.
The recent push to use synthetic data.

Data augmentation and digital twin work. 21:26
Synthetic data as the only data is where the difficulties arise.
Data augmentation is a better use case for synthetic data (a minimal code sketch follows these notes).
Examples of digital twin methodology.
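As a rough illustration of augmentation as a use case for synthetic data, here is a minimal Python sketch. It assumes a simple multivariate-normal fit with NumPy, which is only one of many generation techniques and not one the episode prescribes; real augmentation pipelines need far more care about representativeness and interrelationships.

```python
# Minimal sketch: augment (not replace) real tabular data with synthetic rows
# sampled from a simple parametric fit of the real data. Illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Pretend "real" data: two correlated numeric features.
real = rng.multivariate_normal(mean=[50, 100], cov=[[25, 12], [12, 36]], size=200)

# Fit a parametric model of the real data...
mu, cov = real.mean(axis=0), np.cov(real, rowvar=False)

# ...and draw synthetic rows from it to augment the real data.
synthetic = rng.multivariate_normal(mu, cov, size=100)
augmented = np.vstack([real, synthetic])

print(augmented.shape)                              # (300, 2)
print(np.corrcoef(augmented, rowvar=False)[0, 1])   # correlation roughly preserved
```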
Episode 4. The AI Fundamentalists welcome Christoph Molnar to discuss the characteristics of a modeling mindset in a rapidly innovating world. He is the author of multiple data science books, including Modeling Mindsets, Interpretable Machine Learning, and his latest book, Introduction to Conformal Prediction with Python. We hope you enjoy this enlightening discussion from a model builder's point of view.

To keep in touch with Christoph's work, subscribe to his newsletter Mindful Modeler - "Better machine learning by thinking like a statistician. About model interpretation, paying attention to data, and always staying critical."

Summary

Introduction. 0:03
Introduction to the AI Fundamentalists podcast.
Welcome, Christoph Molnar.

What is machine learning? How do you look at it? 1:03
AI systems and machine learning systems.
Separating machine learning from classical statistical modeling.

What's the best machine learning approach? 3:41
Confusion in the space between statistical learning and machine learning.
The importance of modeling mindsets.
Different approaches to using interpretability in machine learning.
Holistic AI in systems engineering.

Modeling is the most fun part, but also just the beginning. 8:19
Modeling is the most fun part of machine learning.
How to get lost in modeling.

How can we use the techniques in interpretable ML to create a system that we can explain to non-technical stakeholders? 10:36
How to interpret at the non-technical level.
Reproducibility is a big part of explainability.

Conformal prediction vs. interpretability tools. 12:51
Explainability to a data scientist vs. a regulator.
Interpretability is not a panacea.
Conformal prediction with Python (a minimal code sketch follows this summary).
Roadblocks to conformal prediction being used in industry.

What's the best technique for a job in data science? 17:20
The bandwagon effect of Netflix and machine learning.
The mindset difference between data science and other professions.

Machine learning is always catching up with the best practices in the industry. 19:21
The machine learning industry is catching up with best practices.
Synthetic data to fill in gaps.
The barrier to entry in machine learning.
How to learn from new models.
How to train your mindset.
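For listeners new to the topic, here is a minimal sketch of split conformal prediction in Python, assuming scikit-learn and synthetic data. It is only illustrative; see Christoph's book for a proper treatment rather than treating this as his method verbatim.

```python
# Minimal sketch of split conformal prediction intervals for regression.
# Data, model choice, and alpha are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 3))
y = 2 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=1_000)

# Split: fit on training data, calibrate on held-out data.
X_train, X_calib, y_train, y_calib = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Calibration residuals give a quantile that turns point predictions into intervals.
alpha = 0.1  # target 90% coverage
residuals = np.abs(y_calib - model.predict(X_calib))
level = np.ceil((len(residuals) + 1) * (1 - alpha)) / len(residuals)
q = np.quantile(residuals, level)

x_new = rng.normal(size=(1, 3))
pred = model.predict(x_new)[0]
print(f"prediction {pred:.2f}, 90% interval [{pred - q:.2f}, {pred + q:.2f}]")
```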
Episode 3. Get ready because we're bringing stats back! An AI model can only learn from the data it has seen. And business problems can't be solved without the right data. The Fundamentalists break down the basics of data, from collection to regulation to bias to quality in AI.

Introduction to this episode
Why data matters.

How do big tech's LLM models stack up to the proposed EU AI Act?
How major models such as OpenAI's and Bard stack up against current regulations.
Stanford HAI - Do Foundation Model Providers Comply with the Draft EU AI Act?
Risk management documentation and practices.
The EU is adding teeth outside of the banking and financial sectors now.
Time - Exclusive: OpenAI Lobbied the E.U. to Water Down AI Regulation

Bringing stats back: Why does data matter in all this madness?
How AI is taking us away from human intelligence.
Having quality data and bringing stats back!
The importance of having representative data and sampling data.
What are your business objectives? Don't just throw data at them.
Understanding the use case of the data.

GDPR and EU AI regulations.
The AI field was caught off guard by new regulations.
Expectations for regulatory data.

What is data governance? How do you validate data?
Data management, data governance, and data quality.
Structured data collection for financial companies.

What else should we learn about our data collection and data processes?
Example: US Census data collection and data processes.
The importance of representativeness and being representative of the community in the census.
Step one: the fine curation of data, the intentional and knowledgeable creation of data that meets the specific business need.
Step two: fairness through awareness.
The importance of data curation and data selection in data quality.
What data quality looks like at a high level.
Rights to be forgotten.
The importance of data provenance and data governance in data science.
Synthetic data and privacy.

Data governance seems to be 40% of the path to AI model governance. What else needs to be in place?
What companies are missing with machine learning.
The impact that data will have on the future of AI.
The future of general AI.
Truth-based AI: Large language models (LLMs) and knowledge graphs - The AI Fundamentalists, Episode 2

Show Notes

What's NOT new and what is new in the world of LLMs. 3:10
Getting back to the basics of modeling best practices and rigor.

What is AI, and subsequently LLM, regulation going to look like for tech organizations? 5:55
Recommendations for reading on the topic.
Andrew talks about regulation, monitoring, assurance, and alarms.

What does it mean to regulate generative AI models? 7:51
Concerns with regulating generative AI models.
Concerns about the call for regulation from OpenAI.

What is data privacy going to look like in the future? 10:16
Regulation of AI models and data privacy.
The NIST AI Risk Management Framework.
Making sure it's being used as a productivity tool.
How it's different from existing processes.

What's different about these models vs. old models? 15:07
Public perception of new machine learning models vs. old models.
Hallucination in the field.

Does the use of chatbots change the tendency toward hallucinations? 17:27
Bing still suffers from the same problem with their LLMs.
Multi-objective modeling and multi-language modeling.

What does truth-based AI look like? 20:17
Public perception vs. modeling best practices.
Knowledge graphs vs. generative AI: ideal use cases for each.

Algorithms have an interesting potential application: a plugin library model. 23:00
Algorithms have an interesting potential application.
The benefits of a plugin library model.

What's the future of large language models? 25:35
Practical uses for ML and knowledge bases and knowledge databases.
Predictions on ML and ML-based databases.
Finding a way to make LLMs useful.
Next episodes of the podcast.
The AI Fundamentalists - Ep 1 Summary

Welcome to the first episode. 0:03
Welcome to the first episode of the AI Fundamentalists podcast.
Introducing the hosts.

Introducing Sid and Andrew. 1:23
Introducing Andrew Clark, co-founder and CTO of Monitaur.
Introduction of the podcast topic.

What is the proper rigorous process for using AI in manufacturing? 3:44
Large language models and AI.
Rigorous systems for manufacturing and innovation.

Predictive maintenance as an example in manufacturing. 6:28
Predictive maintenance in manufacturing.
The Apollo program.

The key things you can see when you're new to running. 8:31
The importance of taking a step back.
Getting past the plateau in software engineering.

What's the game changer in these generative models? 10:47
Can ChatGPT become a lawyer, doctor, or teacher?
The inflection point with generative models.

How can we put guardrails in place for these systems so they know when not to answer? 13:46
How to put guardrails in place for these systems.
The concept of multiple constraints.

Generative AI isn't new; it's embedded in our daily lives. 16:20
Generative AI is not a new technology.
Examples of generative AI.

The importance of data in machine learning. 19:01
The fundamental building blocks of machine learning.
AI is revolutionary, but it's been around for years.

What can AI learn from systems engineering? 20:59
The NASA Apollo program and systems engineering.
The systems engineering fundamentals world: rigor, testing, and validating.
Understanding the why, the data, and holistic systems management.
The AI curmudgeons, the AI fundamentalists.