Academic journal
In this episode, we delve into induction and deduction and talk further about issues related to generalizability. Shownotes: Popper, K. (1959). The Logic of Scientific Discovery. Hutchinson & Co. (Originally published in 1935). Yarkoni, T. (2022). The generalizability crisis. Behavioral and Brain Sciences, 45, e1. Mook, D. G. (1983). In defense of external invalidity. American Psychologist, 38(4), 379-387. Salmon, W. C. (1981). Rational Prediction. The British Journal for the Philosophy of Science, 32(2), 115-125. https://doi.org/10.1093/bjps/32.2.115 Reichenbach, H. (1938) [2006]. Experience and Prediction: An Analysis of the Foundations and the Structure of Knowledge. Chicago: University of Chicago Press. Senn, S. (2007). Statistical Issues in Drug Development (2nd ed.). John Wiley & Sons. Ernst, M. D. (2004). Permutation Methods: A Basis for Exact Inference. Statistical Science, 19(4), 676-685. Bacon, F. (1620). Instauratio magna [Novum organum]. London: John Bill. Urbach, P. (1982). Francis Bacon as a Precursor to Popper. The British Journal for the Philosophy of Science, 33(2), 113-132.
Domna Ladopoulou, a researcher in the Department of Statistical Science at UCL, is working on improving the efficiency and reliability of wind energy production through statistical and machine learning modelling approaches. Her research focuses on developing a probabilistic condition monitoring system for wind farms using SCADA data to detect faults and failures early. This system aims to enhance the sustainability of wind farms by reducing maintenance costs and improving overall reliability. Domna's methodology involves non-parametric probabilistic methods like Gaussian processes and probabilistic neural networks, which offer flexibility and computational efficiency. She emphasizes the importance of informed decision-making in sustainability and the potential for her research to be scaled globally, particularly in regions with high wind power reliance. Date of episode recording: 2024-05-30T00:00:00Z Duration: 00:17:34 Language of episode: English Presenter: Stephanie Dickinson Guests: Domna Ladopoulou Producer: Nathan Green
In this episode we interview Professor Jim Griffin from the Department of Statistical Science at University College London. This is the first in a series of interviews with Statistical Science academics about how their research crosses over with the discipline of Sustainability. We discuss the potential of environmental DNA analysis for biodiversity monitoring, highlighting its cost-effectiveness but also the challenges associated with reliability. Jim emphasizes the crucial role of statistics in environmental monitoring and decision-making, and the importance of mathematical and statistical modelling for quantifying environmental phenomena. The conversation also acknowledges the need for better data and understanding to inform decision-making and lead to more sustainable outcomes. Finally, we cover the importance of statistical literacy in comprehending environmental concerns and improving decision-making in various fields. Date of episode recording: 2024-05-13T00:00:00Z Duration: 00:33:43 Language of episode: English Presenter: Stephanie Dickinson Guests: Jim Griffin Producer: Nathan Green
Our guest in this episode is Sebastien Motsch, an assistant professor at Arizona State University, working in the School of Mathematical and Statistical Science. He works on modeling self-organized biological systems to understand how complex patterns emerge.
ORIGINAL AIR DATE: NOV 29, 2017. Back in the mid-nineties a peer-reviewed article was published that sought to legitimize the idea that the Hebrew text of Genesis encrypted meaningful information about modern persons and events. The authors' method for detecting the presumed encrypted knowledge was known as equidistant letter sequencing (ELS). This article (Witztum, Rips, and Rosenberg) became a reference point for journalist Michael Drosnin, who wrote the bestselling book, The Bible Code, shortly thereafter. Subsequent to the success of Drosnin's book, Bible-code research expanded to the full Torah and beyond, to the rest of the Hebrew Bible. In this episode we ask whether there is such a thing as ELS Bible codes. Have other statisticians and biblical scholars agreed with Witztum, Rips, and Rosenberg, or are there serious problems with the method and its assumptions? Articles: Witztum, Doron, Eliyahu Rips, and Yoav Rosenberg, “Equidistant letter sequences in the Book of Genesis,” Statistical Science 9.3 (1994): 429-438. McKay, Brendan, Dror Bar-Natan, Maya Bar-Hillel, and Gil Kalai, “Solving the Bible Code puzzle,” Statistical Science (1999): 150-173. Richard A. Taylor, “The Bible Code: ‘Teaching them [wrong] things’,” Journal of the Evangelical Theological Society 43, no. 4 (2000): 619-636. Paul J. Tanner, “Decoding the Bible Code,” Bibliotheca Sacra 157 (2000): 141-159.
In this alumni series from the Department of Statistical Science at UCL, we speak with Michael Baxter about his time at UCL and subsequent years working in government, including a wide variety of projects, from unearthing a national Census undercount to informing on the effects of compulsory seatbelt wearing. Presenter: Nathan Green Guests: Michael Baxter Producer: Chih Ching Chen Date of episode recording: 2023-07-20 Transcription link: https://www.ucl.ac.uk/statistics/transcript-episode-10
Our guest is Mine Çetinkaya-Rundel, a Professor in the Department of Statistical Science at Duke University in the United States. Mine's work is so much more than just educating students about statistics and data science. She's a big proponent of what she calls ‘open education', the open source sharing of how stats and data science are taught, and innovating the pedagogy. In this episode, we explore how Mine is doing this, and why this type of work is so important. Our host is Cynthia Huang, a PhD Candidate at Monash University in the Department of Econometrics and Business Statistics.
In this interview from the Department of Statistical Science at UCL, we speak with Sam Tickle who is a Data Science Research Fellow at the Heilbronn Institute for Mathematical Research, based at the University of Bristol. We discuss Sam's research in changepoint detection, his new method called OMEN and the study of the Global Terrorism Database (GTD) that inspired it, and some milestones of his career path into statistical science. For more information and to access the transcript: https://www.ucl.ac.uk/statistics/ Date of episode recording: 2023-04-27 Duration: 00:37:05 Language of episode: English Presenter: Omar Rivasplata Guests: Sam Tickle Producer: Nathan Green
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch! Decision-making and cost-effectiveness analyses rarely get as important as in health systems, where matters of life and death are not a metaphor. Bayesian statistical modeling is extremely helpful in this field, with its ability to quantify uncertainty, include domain knowledge, and incorporate causal reasoning. Specialized in all these topics, Gianluca Baio was the person to talk to for this episode. He'll tell us about these kinds of models, and how to understand them. Gianluca is currently the head of the Department of Statistical Science at University College London. He studied Statistics and Economics at the University of Florence (Italy), and completed a PhD in Applied Statistics, again at the beautiful University of Florence. He's also a very skilled pizzaiolo, so now I have two reasons to come back to visit Tuscany… Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ ! Thank you to my Patrons for making this episode possible! Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Thomas Wiecki, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, David Haas, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Trey Causey, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, and Arkady. Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;) Links from the show: Gianluca's website: https://gianluca.statistica.it/ Gianluca on GitHub: https://github.com/giabaio Gianluca on Mastodon: https://mas.to/@gianlubaio Gianluca on Twitter: https://twitter.com/gianlubaio Gianluca on...
What is value of information in health research? How does it inform policy changes? How does Bayesian inference help with designing innovative clinical trials? Anna Heath will answer all these questions for you. Dr. Heath is Canada Research Chair in Statistical Trial Design, a scientist in the Child Health Evaluative Sciences Program at the SickKids Research Institute, an assistant professor in the Division of Biostatistics at the University of Toronto, and an Honorary Research Associate in the Department of Statistical Science at University College London. Beyond all these fancy titles, she was also the supervisor for my master's practicum project and opened the door to the whole new, amazing world of Bayesian statistics for me. So of course, I am thrilled to be hosting Dr. Heath on the show today! I believe you will be as thrilled as I was to learn that Dr. Heath is also a very keen ballroom dancer who participates in international contests, a violinist and a cricket lover! Now let's dive into this episode to see what she shared with us! Please visit our website https://biostatisticspodcast.github.io/ for the latest updates and transcripts of the published episodes. If you have any comments, feedback or suggestions, feel free to email us via biostatisticspodcast@gmail.com or DM us on Twitter @BiostatsPodcast Stay tuned :D Intro Music: Chris - C418 Outro Music: Wet Hands - C418 Powered by Firstory Hosting
Cynthia Rudin, the Earl D. McLean, Jr. Professor of Computer Science, Electrical and Computer Engineering, Statistical Science, Mathematics, and Biostatistics & Bioinformatics at Duke University (
In this two-part interview from the Department of Statistical Science at UCL, we speak with Tim Swartz who is a Professor of Statistics at Simon Fraser University. We discuss a variety of topics including: synchronicity in cricket, pulling the goalie in ice hockey, and horse racing. For more information and to access the transcript: www.ucl.ac.uk/statistics/episode-7-transcript Date of episode recording: 2022-04-12 Duration: 16.24 Language of episode: English TAGS: stats_UCL Presenter: Terry Soo Guests: Tim Swartz Producer: Nathan Green
Professor of Statistical Science at the University of Oxford, and founder and CEO of Genomics PLC, Sir Peter Donnelly tells us about exactly what genetic screening can tell us about our health and what we can do to stay healthy regardless of our genes.
List of names and affiliations mentioned in the UK Biobank context: Mark McCarthy (Genentech), Slavé Petrovski (AstraZeneca), Chris Whelan (Johnson and Johnson), Erin Smith (Takeda Pharmaceuticals), Melissa Miller (Pfizer), Ben Sun (Biogen). The ASHG announcement honoring Sir Peter Donnelly, CEO of Genomics plc and Emeritus Professor of Statistical Science at the University of Oxford, with the William Allan Award. A list of the many abstracts and plenary presentations at ASHG 2022 that mention Olink can be found here (PDF). The Ben Sun et al. preprint mentioned is titled “Genetic regulation of the human plasma proteome in 54,306 UK Biobank participants” and is available on BioRxiv here. An overview published in 2013 of the STRUCTURE software (one of the “most widely used population analysis tools”) is available here. To join the All of Us research program, visit https://www.joinallofus.org/ and the GenomeWeb report on Francis Collins asking for an African Moonshot is here. If you would like to contact Dale, Cindy or Sarantis, feel free to email us at info@olink.com. In case you were wondering, Proteomics in Proximity refers to the principle underlying Olink Proteomics assay technology called the Proximity Extension Assay (PEA), and more information about the assay and how it works can be found here.
Welcome to episode 9 of 搞乜咁科學 GMG Science! This episode's theme is love. Keith explains how to use mathematics to find true love within one short lifetime, and Abellona explains why some people think human genes have already determined that men are bound to be more promiscuous and women more choosy. Hey, curiosity, it's time to wake up :) Social Media: Got Something for GMG (anything you want to tell us, you can share it here): https://forms.gle/26RSEgW9NeeSMc4a7 搞乜咁科學 website: www.gmgscience.com 搞乜咁科學 IG: www.instagram.com/gmgscience 搞乜咁科學 YouTube: https://www.youtube.com/channel/UCFj2cwjDASS2SyYsj3pkNSQ Abellona IG: www.instagram.com/_doctor_u Keith IG: www.instagram.com/keith.poonsir Keith YouTube: https://www.youtube.com/channel/UC9fh5paH2jh5kfBVDEPC1YA Show Notes and Links: Most of the pictures related to this episode can be seen on our IG: www.instagram.com/gmgscience Keith's segment: 林家謙 下一位前度 - Wikipedia. Optimal stopping - Wikipedia. Secretary problem - Wikipedia. Optimal stopping problem Excel. Average age of marriage in Hong Kong - The Standard. Further reading: Ferguson, Thomas S. (August 1989). "Who Solved the Secretary Problem?". Statistical Science. 4 (3): 282-289. doi:10.1214/ss/1177012493. Fry, H. (2015). The mathematics of love: Patterns, proofs, and the search for the ultimate equation. Simon and Schuster. Abellona's segment: Bateman's principle - Wikipedia. Fisherian runaway - Wikipedia. Handicap principle - Wikipedia. Sexy son hypothesis - Wikipedia. A modern replication that challenges Bateman's principle: Gowaty PA, Kim YK, Anderson WW. No evidence of sexual selection in a repetition of Bateman's classic study of Drosophila melanogaster. Proc Natl Acad Sci U S A. 2012 Jul 17;109(29):11740-5. doi: 10.1073/pnas.1207851109. Epub 2012 Jun 11. PMID: 22689966; PMCID: PMC3406809. Bateman's principle may not apply directly to human society: Brown GR, Laland KN, Mulder MB. Bateman's principles and human sex roles. Trends Ecol Evol. 2009 Jun;24(6):297-304. doi: 10.1016/j.tree.2009.02.005. Epub 2009 May 4. Erratum in: Trends Ecol Evol. 2013 Oct;28(10):622. PMID: 19403194; PMCID: PMC3096780. A bird called the jacana is an example of reversed sex roles: Why female jacana birds do all the fighting - Slate
In this interview from the Department of Statistical Science at UCL, we speak with Dr Mine Dogucu who is a Lecturer in the Department of Statistical Science at UCL. Dr Dogucu shares with us her experiences of teaching both frequentist and Bayesian statistics to undergraduates. She also explains what accessibility means in education and in the context of statistics, including being part of changing knitr and R Markdown to improve accessibility with image alternative text. Bayes Rules! book: www.bayesrulesbook.com/ New in knitr: Improved Accessibility with Image Alt Text: www.rstudio.com/blog/knitr-fig-alt/ Teach Access: teachaccess.org/ BrailleR: cran.r-project.org/web/packages/BrailleR/ Writing Alt Text for Data Visualization, Amy Cesal: medium.com/nightingale/writing…zation-2a218ef43f81 gradetools R package: federicazoe.github.io/gradetools/ Papers: Framework for Accessible and Inclusive Teaching Materials for Statistics and Data Science Courses: arxiv.org/abs/2110.06355 Teaching Visual Accessibility in Introductory Data Science Classes with Multi-Modal Data Representations: arxiv.org/abs/2208.02565 Date of episode recording: 2022-09-29 Duration: 00:25:37 Language of episode: English Presenter: Nathan Green Guests: Mine Dogucu Producer: Nathan Green
Science and Faith on Tour - Season 3 - Faith Journeys in Science - Ep7 Speakers include Professor Jim McManus and Professor Daniela de Angelis. Jim is a Generation Q Fellow, Director of Public Health at Hertfordshire County Council and Interim President of the Association of Directors of Public Health. Jim is Lead for Population Health for the Hertfordshire and West Essex Integrated Care System and is an Honorary Vice-President of the Chartered Institute of Environmental Health. Jim is a Chartered Psychologist, Chartered Scientist and Fellow of the British Psychological Society. He led ADPH policy work on local outbreak plans and covid-19 suppression including ADPH ‘Living Safely with Covid' policy papers. Jim co-created the national public mental health collaborative for Covid-19 and co-chaired the national review of suicide prevention plans in England. He has just completed three years as Chair of the Behavioural Science and Public Health Network and was a co-author of the National Strategy for Behavioural Science in Public Health. He was a member of the national Faith Taskforce on Covid and is a member of the Oversight Group for the National HIV Strategy. Daniela is Professor of Statistical Science for Health at the University of Cambridge in the Department of Primary Care and Public Health, Deputy Director and Programme Leader at the Medical Research Council Biostatistics Unit (MRC-BSU). At the BSU, Daniela has overall responsibility for the research theme “Statistical methods Using data Resources to improve Population Health” and leads the research programme on ‘Evidence synthesis to inform population health-related decision making'. 
Daniela has over 25 years of experience of working at the interface between statistics and infectious disease epidemiology, focusing on the development of statistical methods for the characterisation of epidemics, including natural history, burden and prediction of future evolution, informing the implementation and evaluation of public health policies. Daniela is a member of a number of local, national and international scientific advisory groups such as NICE, WHO, and UNAIDS and collaborates widely with health agencies nationally and internationally. She is also currently a member of SPI-M (Scientific Pandemic Influenza Advisory Committee, subgroup on Modelling), which reports into SAGE; a member of the Modelling and Analytics Board for the NHSx Covid 19 App; and a member of the Royal Statistical Society Task Force for Covid-19. Daniela also recently won the University of Cambridge Vice-Chancellor Award for Impact and Engagement, in the Established Academic category.
We talk a lot about generative modeling on this podcast, at least since episode 6, with Michael Betancourt! And an area where this way of modeling is particularly useful is healthcare, as Maria Skoularidou will tell us in this episode. Maria is a final year PhD student at the University of Cambridge. Her thesis is focused on probabilistic machine learning and, more precisely, on using generative modeling in… you guessed it: healthcare! But her fields of interest are diverse: from theory and methodology of machine intelligence to Bayesian inference; from theoretical computer science to information theory, Maria is knowledgeable in a lot of topics! That's why I also had to ask her about mixture models, a category of models that she uses frequently. Prior to her PhD, Maria studied Computer Science and Statistical Science at Athens University of Economics and Business. She's also invested in several efforts to bring more diversity and accessibility in the data science world. When she's not working on all this, you'll find her playing the ney, trekking or rowing. Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ ! Thank you to my Patrons for making this episode possible! Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, Adam Bartonicek, William Benton, Alan O'Donnell, Mark Ormsby, James Ahloy, Robin Taylor, Thomas Wiecki, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Elea McDonnell Feit, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran, Paul Oreto, Colin Caprani, George Ho, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Matthew McAnear, Michael Hankin, Cameron Smith, Luis Iberico, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Aaron Jones, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton and Jeannine Sue. Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;) Links from the show: Maria on Twitter: https://twitter.com/skoularidou Maria on LinkedIn: https://www.linkedin.com/in/maria-skoularidou-1289b62a/ Maria's webpage: https://www.mrc-bsu.cam.ac.uk/people/in-alphabetical-order/n-to-s/maria-skoularidou/ Mixture models in PyMC: https://www.pymc.io/projects/examples/en/latest/gallery.html#mixture-models LBS #4 Dirichlet Processes and Neurodegenerative Diseases, with Karin Knudson: https://learnbayesstats.com/episode/4-dirichlet-processes-and-neurodegenerative-diseases-with-karin-knudson/ Bayesian mixtures with an unknown number of components: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00095 Markov Chain sampling methods for Dirichlet Processes: https://www.tandfonline.com/doi/abs/10.1080/10618600.2000.10474879 Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models: 
https://academic.oup.com/biomet/article-abstract/95/1/169/219181...
This week Patrick is joined by Sir Peter Donnelly, CEO of Genomics PLC and Professor of Statistical Science at the University of Oxford. They discuss how to get from data to implementation in the clinic, the challenges of polygenic risk scores including prediction across different ethnic backgrounds, and the role of genomics in drug discovery.
In this contributed interview from the Department of Statistical Science at UCL, we speak with Professor Chris Holmes. Chris is Professor of Biostatistics at the departments of Statistics and the Nuffield Department of Medicine at the University of Oxford. He is also the Director for the Health Programme at the Alan Turing Institute. We discussed a recent presentation given at the department on Bayesian predictive inference and his involvement with statistical modelling to support the UK government's COVID response. UCL seminar recording: https://youtu.be/Y9S4n42n0cY Martingale posterior distributions: https://arxiv.org/abs/2103.15671 Interoperability of statistical models in pandemic preparedness: principles and reality: https://arxiv.org/abs/2109.13730 Chris Holmes' group website: https://www.chrisholmeslab.com Date of episode recording: 2022-02-14 Duration: 00:24:19 Language of episode: English Presenter: Brieuc Lehmann Guests: Chris Holmes Producer: Nathan Green
This is one episode where a passion for math, statistics and computers all come together. I have a very interesting conversation with Ravin, a data scientist at Google, where he uses data to inform decisions. He has previously worked at Sweetgreen, designing systems that would benefit team members and communities through sustainable and healthy food, and SpaceX, creating tools that would ultimately launch rocket ships. All opinions in this episode are his own and none of the companies he has worked for are represented. This episode is brought to you by RailzAI. The Railz API connects to major accounting platforms to provide you with quick access to normalized and analyzed financial data. Get free access to their API and more. Just tell them you came through the Data Science at Home podcast. And by Amethix Technologies. Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business. References: Bayesian Modeling and Computation in Python (Chapman & Hall/CRC Texts in Statistical Science), amazon.com. https://twitter.com/canyon289
In today's episode, Jason, an Assistant Professor of Statistical Science at Duke University, talks about his research on K power means. K power means is an algorithm newly developed by Jason and his team that aims to solve the problem of local minima in classical K-means without demanding heavy computational resources. Listen to find out the outcome of Jason's study. Thanks to our Sponsors: ClearML is an open-source MLOps solution users love to customize, helping you easily Track, Orchestrate, and Automate ML workflows at scale. https://clear.ml
In this interview from the Department of Statistical Science at UCL, Clair Barnes, a recent PhD student, talks to Dr Terry Soo about her research into weather forecasting and her experiences of doing a PhD. We discover the difference between weather and climate and how to tell if ancient homes were randomly built! To access the transcript: https://www.ucl.ac.uk/statistics/transcript-episode-3 Date of episode recording: 2022-02-23 Duration: 00:30:51 Language of episode: English Presenter: Terry Soo Guests: Clair Barnes Producer: Nathan Green
In this episode we examine the use of secret or black box algorithms for high-stake decisions, particularly in the criminal justice system. How do they factor in the decisions made every day by state and federal courts concerning bail, sentencing, and parole? Are black box algorithms fair and unbiased? Do they help counteract or support societal prejudices? Is their use in criminal justice cases serving the public's best interest? We discuss these issues and more with two experts on the topic: Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, Statistical Science, Mathematics, and Biostatistics & Bioinformatics and Director of the Interpretable Machine Learning Lab at Duke University and Brandon Garrett, Professor of Law and founder of the Wilson Center for Science and Justice at Duke University.
Dr Anna Heath is a biostatistician at SickKids Hospital in Toronto within the Child Health Evaluative Sciences (CHES) program, an assistant professor at the University of Toronto in the Dalla Lana School of Public Health and an honorary research fellow at University College London in the Department of Statistical Science. We spoke with Anna about the application of health economic ideas to the conception, design and analysis of clinical trials, and in particular the concept of the Value of Information whereby we can quantify the economic value of obtaining information through research.
In this contributed interview from the Department of Statistical Science at UCL, recorded for the Royal Statistical Society Conference 2021, we speak with a former member of the department, Dr Anna Heath. Dr Heath is currently a biostatistician at SickKids Hospital in Toronto within the Child Health Evaluative Sciences (CHES) program, and an assistant professor at the University of Toronto in the Dalla Lana School of Public Health. We spoke with Anna about the application of health economic ideas to the conception, design and analysis of clinical trials, and in particular the concept of the Value of Information whereby we can quantify the economic value of obtaining information through research. Date of episode recording: 2022-01-17 Duration: 00:16:36 Language of episode: English Presenter: Dr Nathan Green Guests: Dr Anna Heath Producer: Nathan Green
In this contributed series from the Department of Statistical Science at UCL, we speak with Professor Tom Fearn about how UCL is acknowledging and addressing its historical links with the eugenics movement, and in particular the roles of the prominent statisticians and eugenicists Francis Galton and Karl Pearson. For more information and to access the transcript: https://www.ucl.ac.uk/statistics/transcript-episode-1
In today's episode, we chat with Vincent Maposa. Vincent is the CEO and Founder of Wetility. Wetility is a solar energy technology company focusing on rooftop solar under one megawatt in South Africa, with links to other parts of Southern Africa. Vincent studied for a BSc in Mathematics and Statistical Science at the University of Zimbabwe. He has also completed the CFA level 2 exam. Vincent started his journey with Frost & Sullivan as a Research Intern and progressed through the ranks to Industry Analyst, and then became a management consultant for Deloitte Consulting. He is passionate about the industrialization of Africa and seeks to have affordable renewable energy in every household on the continent.
Income inequality is an issue Americans continually wrestle with, and the COVID-19 pandemic has brought to light how housing, health, and general wellbeing are impacted by the unequal distribution of wealth. Income inequality in the United States is the focus of this episode of Stats and Stories with guest Joseph Gastwirth. Dr. Gastwirth is a Professor of Statistics and Economics at George Washington University. Over the course of his career he has written over 300 peer-reviewed articles, which have appeared in the Annals of Statistics, Journal of the American Statistical Association, Journal of the Royal Statistical Society, Econometrica, Review of Economics and Statistics, Statistical Science, Annals of Human Genetics, Human Heredity, Jurimetrics and Statistics and Public Policy. His research has covered a variety of topics in statistical methodology and applications. Of special note are his early work on order and non-parametric statistics, his research on estimating measures of economic inequality, fairness and discrimination, and his work on the role of statistical evidence in jury discrimination, equal employment and other types of legal cases. The American Statistical Association awarded him the Noether Award for his contributions to nonparametric statistics in 2012 and the Karl E. Peace Award for outstanding statistical contributions for the betterment of society in 2019.
בשנת 1997 פרסם העיתונאי מייקל דרוזנין את ספרו "הצופן התנ"כי" שהפך עד מהרה לרב מכר עולמי. בספר נטען כי התורה מלאה בצפנים נסתרים המוחבאים בצורה של דילוגי אותיות וביכולתם לחזות את העתיד. רעיון הצופן התנ"כי מועלה לא פעם על מנת לאשש את הטענה כי התנ"ך נכתב בהשראה אלוהית, אם כי דרוזנין עצמו ייחס את ההצפנה לחוצנים בעלי יכולת טכנולוגית מתקדמת. לכאורה- פסאודו מדע, רעיון שאנו ב"עושים תנ"ך היינו פוסלים על הסף כיוון שלכאורה הרעיון אינו עומד באמות המידה המדעיות המקובלות.האמנם? מסתבר כי כבר בשנת 1994 פרסמו 3 מדענים ישראליים מאמר מדעי בכתב העת Statistical Science שעליו נסמך ספרו רב המכר של דרוזנין.בשעתו אפילו חתן פרס נובל, המתמטיקאי ישראל אומן, תמך במאמר וקיבל את תוצאותיו. האם אכן יש דברים בגו? האם באמת יש צפנים מוסתרים בתנ"ך? האם המאמר עומד באמות המידה המדעיות? מה היו הכשלים במאמר ומה התגובות של מתמטיקאים עמיתים למסקנות שעלו ממנו? ספוילר - אפילו פרופ' אומן חזר בו לבסוף מתמיכתו במסקנות המאמר. אבל...אולי בכל זאת יש בתורה מימד נסתר? ד"ר אבי דנטלסקי מתארח באולפן וצולל יחד איתי אל עומקם של הכתובים.האזנה נעימה,יותם.https://www.ads.ranlevi.com/2021/10/04/jamesrichardson-osimtanach-otiyot/
David Dunson | Advancing Statistical Science | Philosophy of Data Science Series A fundamental question in the philosophy of science is "what does it mean to make scientific progress?" We will have a series of episodes centered around this question for statistics and data science. In our first episode in the series, David Dunson (Duke University) discusses important advances in Bayesian analysis, big data, uncertainty, and scientific discovery. Topic Timestamps 0:00 Intro to David Dunson 1:54 What does it mean to advance data science and statistics? 6:14 Industry & Optimization, Science & Uncertainty 8:14 Prediction & Discovery / Bayesian Modeling 14:13 What is “complex” data? 22:49 Big Data, Bayes, and Nonparametrics 33:50 Ad hoc approaches vs principled methods 37:08 Should Machine Learning Publications Refocus on Scientific Discovery? 39:50 Mathematically principled data science & statistics 51:40 Do Bayesians just use priors as regularizers? 55:16 Bayesian Priors and Tuning Inference Methods 1:00:00 Prioritize the Most Important Work in Data Science 1:07:07 Good Practices of Star Grad Students 1:13:17 The Science in Statistical *Science* #datascience #science #statistics
Genetic testing is on the cusp of a major revolution, which has the potential to shift not just how we understand our risk for disease, but how we practice healthcare. In the clinic today, genetic testing is used only in cases where we know that mutations have a big impact on physiology (BRCA mutations in breast cancer, for example). But our knowledge of how our genetics influences our risk for disease has evolved, and we now know that many (tens of thousands to even millions) small changes in our genes, each of which individually has a tiny effect, combine to influence our risk profile. This new appreciation, coupled with powerful statistical methods and massive datasets, has fueled the creation of a new tool to quantify the risk of a broad range of common diseases: the polygenic risk score. On this episode, which originally aired on January 18, 2021, host Lauren Richardson (@lr_bio) is joined by Peter Donnelly (@genemodeller), Professor of Statistical Science at the University of Oxford and the CEO of Genomics PLC, and Vineeta Agarwala (@vintweeta), physician-scientist and general partner at a16z, to discuss these scores and how they can reshape healthcare, away from a paradigm of treating illness and towards prevention and maintenance of health.
A mathematician uses statistical science to prove gerrymandering, and courts are sometimes convinced
Kristian’s interest in statistics and algorithmic fairness has taken her on a winding career path from academia to business, to public service, and back to academia. As she made these career changes, she didn’t choose between academia, industry, and non-profit work; it was more about the problem she was interested in working on at the moment and what else was happening in her life. After she earned her PhD in Statistical Science from Duke University, she worked as a research professor at Virginia Tech, where she did microsimulation and agent-based modeling in a simulation lab. After that, she tried a data visualization and analytics startup called DataPad that was quickly acquired. When she was thinking about the next step in her career, she wanted to do something with social impact. She was fascinated by the work of the Human Rights Data Analysis Group (HRDAG), which was applying statistical models to casualty data to estimate the number of undocumented conflict casualties. She spent a summer working for HRDAG in Colombia and then decided to join the organization full time. She spent five years as HRDAG’s lead statistician, leading the group’s project on criminal justice in the United States, focused on algorithmic fairness and predictive policing. Predictive policing uses algorithms to help the police decide where to deploy their resources based on crime statistics: the places where crimes are predicted to be most likely are the places that get policed more often. Kristian’s work showed that these algorithms could actually perpetuate historical over-policing and racial bias in minority communities. Early this year, she moved from HRDAG back to academia. She started her new position at the University of Pennsylvania in the Computer and Information Science Department on March 2, and a week later Penn closed down for COVID.
Over this year, she has learned that she needs to adjust her expectations for herself, and not be so frustrated when she can't get things done that she could under normal circumstances. It's not just working from home with her daughter nearby; it's the stress of everything that's going on and the additional mental fatigue of having to do all these risk calculations. This year has also made her appreciate the increasingly critical role of data science in driving data-driven decision making. RELATED LINKS: Connect with Kristian Lum on LinkedIn and Twitter. Learn more about Penn Engineering. Learn more about HRDAG. Connect with Margot Gerritsen on Twitter (@margootjeg) and LinkedIn. Find out more about Margot on her Stanford Profile.
Genetic testing is on the cusp of a major revolution, which has the potential to shift not just how we understand our risk for disease, but how we practice healthcare. In the clinic today, genetic testing is used only in cases where we know that mutations have a big impact on physiology (BRCA mutations in breast cancer, for example). But our knowledge of how our genetics influences our risk for disease has evolved, and we now know that many (tens of thousands to even millions) small changes in our genes, each of which individually has a tiny effect, combine to influence our risk profile. This new appreciation, coupled with powerful statistical methods and massive datasets, has fueled the creation of a new tool to quantify the risk of a broad range of common diseases: the polygenic risk score. On this episode, host Lauren Richardson (@lr_bio) is joined by Dr. Peter Donnelly (@genemodeller), Professor of Statistical Science at the University of Oxford and the CEO of Genomics PLC, and Vineeta Agarwala (@vintweeta), physician-scientist and general partner at a16z, to discuss these scores and how they can reshape healthcare, away from a paradigm of treating illness and towards prevention and maintenance of health.
With so much uncertainty hanging over the US presidential election, one place we look for clarity is in the numbers. Pollsters learned valuable lessons from the 2016 election results that they’ve applied in the current election cycle to try to yield more accurate predictions. From our GBH and PRX partners, NOVA Now, and as a special in The World's podcast feed, host Alok Patel interviews a pollster and a statistician. They delve into a brief history of political polling in the US, what went wrong in 2016, and how statistical concepts like data weighting and margin of error make all the difference in the accuracy of the polls. Subscribe, and learn more by visiting pbs.org/novanowpodcast
With so much uncertainty on the eve of the U.S. presidential election, one place we look for clarity is in the numbers. Pollsters learned valuable lessons from the 2016 election results that they've applied in the current election cycle to try to yield more accurate predictions. Host Alok Patel interviews a pollster and a statistician, delving into a brief history of political polling in the U.S., what went wrong in 2016, and how statistical concepts like data weighting and margin of error make all the difference in the accuracy of the poll.
Why Georgetown's new MSBA program may be just what you're looking for! [Show Summary] Dr. Sudipta Dasmohapatra, Academic Director of Georgetown’s MSBA program, goes in-depth on who the new online master’s program is for and how it will prepare students for careers in data science leadership. Earn your Master of Science in business analytics online in just 16 months, while continuing to work. [Show Notes] Are you attracted to business analytics but don't want to take the time off work to enter a full-time program to master the topic? Georgetown's McDonough School of Business has just what you're looking for: an online master's designed for working professionals. Today's guest is the director of that program, and we're going to learn all about it. Dr. Sudipta Dasmohapatra earned her bachelor’s and master’s degrees in India and her PhD from Penn State, and she's always been a numbers geek. She started her career as a data scientist in 2004 and her teaching career at North Carolina State, where she was Associate Professor of Marketing and Customer Analytics until 2017. In 2017, she became, among other duties, the Director of the Master's in Statistical Science at Duke University, and in 2020 she joined Georgetown as Academic Director of its brand new online MS in Business Analytics (MSBA) program for working professionals. Can you provide an overview of Georgetown's MSBA program? [2:06] The Georgetown Master’s in Business Analytics offered by the McDonough School of Business is coming online in January 2021. It is an online, 16-month, comprehensive analytics program. We have designed it with the goal of preparing future business leaders and managers who are interested in learning how to understand data and how to use data to create, share, and sustain value. The graduates of this program will be prepared to lead in key growth sectors where graduates with deep business analytics skills are highly sought.
This particular program has been rigorously curated to meet the needs of the current marketplace. We have integrated the technical aspects of data analytics with the managerial business functions so that the students can learn to speak and communicate the language of data. Within this particular program, we have both asynchronous classes, as well as synchronous sessions in which the students can join their cohort, as well as a prominent faculty, in a virtual classroom. Students also have the opportunity to come to campus. We have two week-long campus-based residencies that provide integrated hands-on activities to the students through a very intensive, week-long curriculum. One of the cornerstones of this particular program that I want to talk about is the capstone project. The capstone project, over six months, applies the program's concepts and methods and tools that the students learn in a challenging business data analytics assignment with a project sponsor. Students that come into this program can not only leverage the core McDonough and Georgetown community network, but being in the global capital city of Washington DC, we are going to connect the students to a network of people, programs, and complex data that have real-world consequences. https://www.youtube.com/watch?v=17Lc59Bxemc Could you give an example? What do you think a capstone project would be? [4:17] A capstone project could be from a real-life sponsor. We might have, let's say, the military that has all this different data to do, say, cyber analytics. We might go to them and leverage our existing contacts and get data from there, and students would be able to work on projects where they can see the application of what they have learned in a real-life project. That's what we are envisioning right now with the capstone project. Wow. How large of a cohort do you plan to admit this year and in subsequent years? Is it a lockstep program, where all participants take the same courses together,
One of the most common causes for problems we see in manuscripts at JAMA is an inappropriately calculated study sample size. This seemingly mysterious process is explained by Lynne Stokes, PhD, professor of Statistical Science at Southern Methodist University in Dallas, Texas.
This week, our host Dr Rob Doubleday sits down with Prof Daniela De Angelis, Professor of Statistical Science for Health at the University of Cambridge, to discuss applying statistical methods to epidemiology, disease transmission, and how we're using models to understand the burden on the NHS posed by covid-19. CSaP's Science and Policy Podcast is a production of the Centre for Science and Policy at the University of Cambridge. This series on science, policy and pandemics is produced in partnership with Cambridge Infectious Diseases and the Cambridge Immunology Network. Our guest this week: Professor De Angelis works on developing and applying statistical methods to characterise epidemics, exploiting the complex body of available information. She is Deputy Director of the MRC Biostatistics Unit at the University of Cambridge. Professor De Angelis has been working throughout the covid-19 response as part of an epidemiological modelling group advising the UK Government. -- This series is hosted by CSaP Executive Director Dr Rob Doubleday, and is edited and produced by CSaP Communications Coordinator Kate McNeil. If you have feedback about this episode, or questions you'd like us to address in a future week, please email enquiries@csap.cam.ac.uk.
We discuss the exploration-exploitation dilemma and near-optimal solutions found by mathematicians. Some relevant resources include: Bayesian Adaptive Methods for Clinical Trials. CRC Press. Berry, Carlin, Lee & Muller (2010). https://www.crcpress.com/Bayesian-Adaptive-Methods-for-Clinical-Trials/Berry-Carlin-Lee-Muller/p/book/9781439825488 Bayesian adaptive clinical trials: a dream for statisticians only? Statistics in Medicine. Chevret (2011). https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.4363 Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges. Statistical Science. Villar, Bowden & Wason (2015). "Across this literature, the use of bandit models to optimally design clinical trials became a typical motivating application, yet little of the resulting theory has ever been used in the actual design and analysis of clinical trials." https://arxiv.org/pdf/1507.08025.pdf Machine learning applications in drug development. Computational and Structural Biotechnology Journal. Réda, Kaufmann & Delahaye-Duriez (2019). https://www.sciencedirect.com/science/article/pii/S2001037019303988 Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments. Kaibel & Bieman (2019) https://journals.sagepub.com/doi/abs/10.1177/1094428119854153 Cancer specialists in disagreement about purpose of clinical trials. Journal of the National Cancer Institute (2012). https://www.eurekalert.org/pub_releases/2002-12/jotn-csi121202.php WHO launches global megatrial of the four most promising coronavirus treatments. Science Mag. Kupferschmidt & Cohen (2020). https://www.sciencemag.org/news/2020/03/who-launches-global-megatrial-four-most-promising-coronavirus-treatments
In this episode, Sir Peter Donnelly, CEO of Genomics Plc and Professor of Statistical Science at the University of Oxford, explores the relationship between genetic variation and complex human diseases and talks about his career at the intersection of statistics and genetics.
“Explainability” is a big buzzword in AI right now. AI decision-making is beginning to change the world, and explainability is about the ability of an AI model to explain the reasons behind its decisions. The challenge for AI is that unlike previous technologies, how and why the models work isn’t always obvious, and that has big implications for trust, engagement and adoption. Nicole Rigillo breaks down the definition of explainability and other key ideas including interpretability and trust. Cynthia Rudin talks about her work on explainable models, improving the parole-calculating models used in some U.S. jurisdictions and assessing seizure risk in medical patients. Benjamin Thelonious Fels says humans learn by observation, and that any explainability techniques need to take human nature into account. Guests: Nicole Rigillo, Berggruen Research Fellow at Element AI; Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, and Statistical Science at Duke University; Benjamin Thelonious Fels, founder of AI healthcare startup macro-eyes. Show Notes 01:11 - Facebook Chief AI Scientist Yann LeCun says rigorous testing can provide explainability 01:58 - Berggruen Institute, Transformation of the Human Program 05:34 - Judging Machines. Philosophical Aspects of Deep Learning - Arno Schubbach 06:31 - Do People Trust Algorithms More Than Companies Realize?
- Harvard Business Review 08:25 - Introducing Activation Atlases - OpenAI 10:52 - Learning certifiably optimal rule lists for categorical data (CORELS) - YouTube 11:00 - CORELS: Learning Certifiably Optimal RulE ListS 11:45 - Stop Gambling with Black Box and Explainable Models on High-Stakes Decisions 16:52 - Transparent Machine Learning Models for Predicting Seizures in ICU Patients - Informs Magazine Podcast 19:49 - The Last Mile: Challenges of deployment - StartupFest Talk 24:41 - Developing predictive supply-chains using machine learning for improved immunization coverage - macro-eyes with UNICEF and the Bill and Melinda Gates Foundation Further Reading: A missing ingredient for mass adoption of AI: trust - Element AI; Breaking down AI’s trustability challenges - Element AI; The Why of Explainable AI - Element AI. Follow Us: Element AI Twitter, Element AI Facebook, Element AI Instagram, Alex Shee’s Twitter, Alex Shee’s LinkedIn -- “Explainability” is a big buzzword in AI right now. AI decision-making is beginning to change the world, and explainability concerns the ability of an AI model to explain the reasons underlying its decisions. The challenge for artificial intelligence is that, unlike previous technologies, the way the models work and the reasons why they work are not always obvious, and that has major repercussions for trust, engagement and adoption. Nicole Rigillo breaks down the definition of explainability and other key ideas, including interpretability and trust. Cynthia Rudin talks about her work on explainable models, improving the parole-calculating models used in some U.S. jurisdictions and assessing seizure risk in medical patients.
Benjamin Thelonious Fels believes that humans learn by observation and that any explainability technique must take human nature into account. Guests: Nicole Rigillo, Berggruen Institute researcher at Element AI; Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, and Statistical Science at Duke University; Benjamin Thelonious Fels, founder of macro-eyes, a startup working on AI in healthcare. Show Notes 01:11 – Yann LeCun, Chief AI Scientist at Facebook, says that rigorous testing can provide explainability. 01:58 – Berggruen Institute, Transformation of the Human Program 05:34 – Machines de
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
You asked, we listened! Today, by listener request, we are joined by Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, and Statistical Science at Duke University. Cynthia is passionate about machine learning and social justice, with extensive work and leadership in both areas. In this episode we discuss: her paper, ‘Please Stop Explaining Black Box Models for High Stakes Decisions’; how interpretable models make for less error-prone and more comprehensible decisions, and why we should care; and a breakdown of black box and interpretable models, including their development, sample use cases, and more! Check out the complete show notes at https://twimlai.com/talk/290.
At the Gulaschprogrammiernacht 2019, Sebastian met the podcaster Data Science Phil, Philipp Packmohr (@PPackmohr). His interest in data science arose during his life sciences studies at Hochschule Furtwangen, in the fields of molecular and technical medicine and Medical Diagnostic Technologies. In his master's thesis, supervised by Prof. Dr. Matthias Kohl, he worked on the statistical processing of observational studies, more precisely on causal inference from observational data using propensity score matching algorithms. Causal inference, the drawing of conclusions about causal relationships from observations, is very important in all empirical sciences, such as economics, psychology, political science, sociology, and medicine. Ideally, studies should be conducted as randomized controlled trials, since only then can conscious or unconscious influence on the results be prevented. For example, evaluations at universities at the end of lecture courses or degree programs often suffer from survivorship bias, because only the people who held out to the end are surveyed. However, for various reasons (such as high costs), not all studies are randomized, and this was also the case for the observational dataset central to his thesis, provided by Prof. Dr. Konrad Reinhart at the Department of Intensive Care Medicine of Universitätsklinikum Jena, on therapies for preventing acute kidney failure. The dataset covered 21757 patients with sociodemographic and biological characteristics from the electronic health record, with up to 209 variables, as well as the therapy chosen and whether or not kidney failure occurred.
In the analysis, the variables are referred to as confounders, disturbance factors, or covariates: they are not regarded as causal for the course of the therapy, but they can nevertheless influence it. In a non-randomized study, the confounders will not be evenly distributed across the types of therapy and will therefore distort the aggregated results in an undesirable way. Adjusting for the confounders can never fully replace a randomized study, however, because confounders that do not appear in the data, such as athletic status, cannot be taken into account. In propensity score matching, put simply, the success rates of therapies are computed as success rates weighted by a score that relates the observed frequencies of the confounders to their expected frequencies. Handling missing data values is a problem here, since only a fraction of the records actually define all variables. Sensible data imputation methods had to be used for this. The analysis was carried out with the free open source project R (a platform for statistical computing), which provides a large number of methods and algorithms. The methods developed in the course of the thesis can be found in the GitHub repository for the analysis methods. The analysis of the observational dataset yielded risk rates of 15.6% to 11.5% for kidney failure. This does not have to mean that one therapy is always preferable to the other, since many criteria have to be considered when choosing a therapy. Personalized or predictive medicine tries to use observational analyses to give even more far-reaching therapy recommendations that depend on the confounders of the individual patient. Philipp found the impetus for the Data Science Phil podcast in a call by the YouTuber Martin Jung.
The English-language podcast covers fundamental methods of data science as well as more advanced topics, which he discusses with guests at conferences. Literature and further information: P. R. Rosenbaum, D. B. Rubin: The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika, 70(1): 41–55, 1983. J. Pearl: Causality: Models, Reasoning, and Inference, Cambridge University Press, 2019. D. Ho, K. Imai, G. King, E. Stuart: MatchIt - Nonparametric Preprocessing for Parametric Causal Inference, Journal of Statistical Software, 42(8), 1-28, 2011. D. Ho, K. Imai, G. King, E. Stuart: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference, R package, 2018. E. A. Stuart: Matching Methods for Causal Inference: A review and a look forward, Statistical Science 25(1): 1-21, 2010. ResearchGate profile of Philipp Packmohr. Github profile of Philipp Packmohr. Science Days at Europapark Rust. Data Science Blog by Philipp Packmohr. stamats by Prof. Dr. Matthias Kohl. Podcasts: Data Science Phil Podcast. P. Packmohr, S. Ritterbusch: Neural Networks, Data Science Phil, Episode 16, 2019. I. Hinneburg: EbPharm-Magazin im September, Adjustierung in epidemiologischen Studien, Podcast Evidenzbasierte Pharmazie, 2017. GPN19 Special: P. Packmohr, S. Ritterbusch: Neural Networks, Data Science Phil, Episode 16, 2019. P. Packmohr, S. Ritterbusch: Propensity Score Matching, conversation in the Modellansatz Podcast, episode 207, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2019. http://modellansatz.de/propensity-score-matching GPN18 Special: D. Gnad, S. Ritterbusch: FPGA Seitenkanäle, conversation in the Modellansatz Podcast, episode 177, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2018. http://modellansatz.de/fpga-seitenkanaele B. Sieker, S. Ritterbusch: Flugunfälle, conversation in the Modellansatz Podcast, episode 175, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2018.
http://modellansatz.de/flugunfaelle A. Rick, S. Ritterbusch: Erdbebensicheres Bauen, conversation in the Modellansatz Podcast, episode 168, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2018. http://modellansatz.de/erdbebensicheres-bauen GPN17 Special: Sibyllinische Neuigkeiten: GPN17, episode 4 of the CCC Essen podcast, 2017. A. Rick, S. Ritterbusch: Bézier Stabwerke, conversation in the Modellansatz Podcast, episode 141, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2017. http://modellansatz.de/bezier-stabwerke F. Magin, S. Ritterbusch: Automated Binary Analysis, conversation in the Modellansatz Podcast, episode 137, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2017. http://modellansatz.de/binary-analyis M. Lösch, S. Ritterbusch: Smart Meter Gateway, conversation in the Modellansatz Podcast, episode 135, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2017. http://modellansatz.de/smart-meter GPN16 Special: A. Krause, S. Ritterbusch: Adiabatische Quantencomputer, conversation in the Modellansatz Podcast, episode 105, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2016. http://modellansatz.de/adiabatische-quantencomputer S. Ajuvo, S. Ritterbusch: Finanzen damalsTM, conversation in the Modellansatz Podcast, episode 97, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2016. http://modellansatz.de/finanzen-damalstm M. Fürst, S. Ritterbusch: Probabilistische Robotik, conversation in the Modellansatz Podcast, episode 95, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2016. http://modellansatz.de/probabilistische-robotik J. Breitner, S. Ritterbusch: Incredible Proof Machine, conversation in the Modellansatz Podcast, episode 78, Department of Mathematics, Karlsruhe Institute of Technology (KIT), 2016. http://modellansatz.de/incredible-proof-machine
Why are some Russians put on extremist watch lists for saving or posting memes online? Maria Motuznaya was investigated by police after saving edgy memes on her account on the social network VKontakte. Hundreds of Russians are being targeted for using memes declared to be racist, offensive or against the Russian Orthodox Church. People on the list have their bank accounts frozen and some face criminal charges. Will a blogger’s campaign make a difference? Are you more chimp or Neanderthal? We often hear scientists talking about how we are related, but what’s the difference between 96% similarity and sharing 20% of our DNA, and do some of us literally have pieces of Neanderthal within us? Tim Harford talks to Peter Donnelly, Professor of Statistical Science at the University of Oxford. Why is the relationship between fathers and sons so important? Nastaran Tavakoli-Far investigates. (Photo: A pair of hands in handcuffs hold a mobile phone showing the VKontakte website. Credit: Anton Vaganov/Interpress/TASS)
In episode four of season four we talk more about natural and artificial intelligences and thinking about diversity in systems. Reading "Can a Biologist Fix a Radio" is a great paper around these ideas. We take a listener question about moving into machine learning after having advanced training in a different program. Our guest on this episode is our second second-time guest, Peter Donnelly, Professor of Statistical Science at the University of Oxford, Director of the Wellcome Trust Center for Human Genetics and a Fellow of the Royal Society.
There is a global call for a ban on killer robots, warning that technological advances could revolutionize warfare and create new weapons of terror. More than one hundred robotics and artificial intelligence entrepreneurs intend to send a letter to the UN calling for action to prevent the development of autonomous weapons. The renewed plea on autonomous weapons was released as the International Joint Conference on Artificial Intelligence got underway in Melbourne, Australia, with more than 2000 of the world's top AI and robotics experts taking part. Tsepiso Makwetla spoke to Professor Bhekisipho Twala, Director in Artificial Intelligence and Statistical Science at the University of Johannesburg's Institute for Intelligent Systems.