In today's episode, we are reviewing numerical and categorical data in various forms (e.g., histograms, circle graphs, scatterplots, two-way tables). This is part of a multi-part review of what YOU need to know to pass the Mathematics subtest of the GK. About FTCE Seminar: How do you PASS the Florida Teacher Certification Exams (FTCE)? On this podcast, we will be discussing concepts from the FTCE Testing Blueprint to help you prepare for the exam. Not only is each episode based on the FTCE General Knowledge essay subtest, English Language Skills subtest, Reading subtest, and Mathematics subtest, but I am also using my experience as an FTCE tutor and 10-year classroom teacher who has passed the FTCE GK Exam, FTCE Professional Education Exam, FTCE Exceptional Student Education Exam, FTCE English 6-12 Exam, FTCE Journalism Exam, and the Reading Endorsement to help you pass and start teaching. How do educational podcasts work? Each podcast covers one concept from the FTCE Testing Blueprint. This method is called micro-learning: you listen repeatedly to concepts to reinforce your knowledge and understanding. Try it out! Check it out! And leave your questions and comments below.
Audio-only version of the Displaying and Describing Categorical Data concept video. For video: https://www.youtube.com/channel/UCHVyc1NJuYvzpoom-L3nBpg/ For more info: https://blogs.lt.vt.edu/jmrussell/topics/ Support this podcast: https://anchor.fm/john-russell10/support
Hugo speaks with Renée Teate about the many paths to becoming a data scientist. Renée is a Data Scientist at higher ed analytics start-up HelioCampus, and creator and host of the Becoming a Data Scientist Podcast. In addition to discussing the many possible ways to become a data scientist, they will discuss the common data scientist profiles and how to figure out which ones may be a fit for you. They’ll also dive into the fact that you need to figure out both where you are in terms of skills and knowledge and where you want to go in terms of your career. Renée has a bunch of great suggestions for aspiring data scientists and also flags several important pitfalls and warnings. On top of this, they'll dive into how much statistics, linear algebra and calculus you need to know in order to become an effective data scientist and/or data analyst. Links from the show. From the interview: Becoming a Data Scientist (Renée's Blog); Renée's Twitter; Data Sci Guide (Data Science Learning Directory). From the segments: Statistical Distributions and their Stories (with Justin Bois at ~19:20): Justin's Website at Caltech; Probability distributions and their stories. Programming Topic of the Week (with Emily Robinson at ~43:20): Categorical Data in the Tidyverse, a DataCamp Course taught by Emily Robinson; R for Data Science Book by Hadley Wickham (Factors Chapter); Inference for Categorical Data, a DataCamp Course taught by Andrew Bray; stringsAsFactors: An unauthorized biography (Roger Peng, July 24, 2015); Wrangling categorical data in R (Amelia McNamara & Nicholas J Horton, August 30, 2017). Original music and sounds by The Sticks.
Dr. LaMotte and Dr. Wells discuss their NIJ-funded research on PMI and how it can help crime scene investigators. This is their abstract submitted for the R&D Symposium: "To our knowledge, an estimate of time since death is almost never accompanied by the kind of mathematically explicit probability statement that is the standard in most scientific disciplines. This has been a problem both for death investigation casework (and court testimony) and for research, because scientists have not known how to design decomposition experiments to provide adequate statistical power for postmortem interval (PMI) estimation. We have been developing methods for calculating statistical confidence limits about a PMI estimate based on either continuous quantitative or categorical data. The examples we present are from forensic entomology, but the approach is suitable for any postmortem variable. To do this we extended and adapted the time-tested statistical method of inverse prediction (IP, also called calibration) to the PMI estimation setting. Methods to produce valid p-values for this process are known for single, quantitative y and x that follow a linear regression relation and with y having constant variance. Some exist for multivariate y, but only for settings where y has constant variance. Many measurements used for PMI estimation do not fit these criteria. The current project builds on earlier work in which we developed IP methods for non-constant variance of a single, quantitative y (e.g. estimating carrion maggot age using a single size measurement, Wells and LaMotte 1995), and in which we developed the first-ever method for IP based on categorical data (e.g. estimating PMI based on carrion insect succession, LaMotte and Wells 2000). One possible barrier to the adoption of these new inverse prediction methods by researchers and death investigators has been that they are not implemented in statistical software packages.
In this presentation we will show how IP using categorical data can be done by simply reading a table. Concerning quantitative data we will show how inverse prediction of PMI can be performed using statistical analysis software already widely available for general linear mixed models, where the statistical theory and methodology are well-established. We will show how flexible models using polynomial splines can be fit for both the means and variance-covariance matrices, and how to use dummy variables over a grid of values of x to get the p-values required for confidence sets automatically. Attendees familiar with mixed models and their applications will be able to implement these methods in standard statistical packages." Statistical Methods for Combining Multivariate and Categorical Data in Postmortem Interval Estimation, 2013-DN-BXK042. Lynn R. LaMotte (1) and Jeffrey D. Wells (2). (1) Biostatistics Program, LSU School of Public Health. To learn more, visit www.ForensicCOE.org
Louisville Lectures Internal Medicine Lecture Series Podcast
Brian Guinn discusses different ways to interpret categorical data. This lecture specifically looks at risk ratios, odds ratios, and the chi-squared test. Some items in this lecture may have come from the lecturer’s personal academic files or have been cited in-line or at the end of the lecture. For more information, see our citation page. ©2015 LouisvilleLectures.org
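For listeners who want to see the three measures named in this lecture side by side, here is a minimal sketch in Python. It is not taken from the lecture: the function name, the table layout, and the counts are made up for illustration, and the chi-squared statistic is the plain Pearson version without a continuity correction.

```python
import math

def two_by_two_measures(a, b, c, d):
    """Measures of association for a 2x2 table of counts.

    Assumed layout (hypothetical labels):
                     outcome   no outcome
        exposed         a          b
        unexposed       c          d
    """
    # Risk ratio: risk of the outcome among exposed vs. unexposed.
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed

    # Odds ratio: cross-product ratio of the table.
    odds_ratio = (a * d) / (b * c)

    # Pearson chi-squared statistic for a 2x2 table (no continuity correction).
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return risk_ratio, odds_ratio, chi2, p_value

# Made-up counts: 30 of 100 exposed and 10 of 100 unexposed have the outcome.
rr, odds, chi2, p = two_by_two_measures(30, 70, 10, 90)
print(f"risk ratio={rr:.2f}, odds ratio={odds:.2f}, chi2={chi2:.2f}, p={p:.5f}")
```

The sketch assumes no cell count that would make a denominator zero; a real analysis would also check that expected counts are large enough for the chi-squared approximation.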
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Due to the increasing power of data acquisition and data storage technologies, a large number of data sets with complex structure are being collected in the era of data explosion. Rather than being simply represented by low-dimensional numerical features, such data range from high-dimensional feature spaces to graph data describing relationships among objects. Many techniques exist in the literature for mining simple numerical data, but only a few approaches address the growing challenge of mining complex data, such as high-dimensional vectors of non-numerical data types, time series data, graphs, and multi-instance data where each object is represented by a finite set of feature vectors. Besides, there are many important data mining tasks for high-dimensional data, such as clustering, outlier detection, dimensionality reduction, similarity search, classification, prediction and result interpretation. Many algorithms have been proposed to solve these tasks separately, although in some cases they are closely related. Detecting and exploiting the relationships among them is another important challenge. This thesis aims to solve these challenges in order to gain new knowledge from complex high-dimensional data. We propose several new algorithms combining different data mining tasks to acquire novel knowledge from complex high-dimensional data: ROCAT (Relevant Overlapping Subspace Clusters on Categorical Data) automatically detects the most relevant overlapping subspace clusters on categorical data. It integrates clustering, feature selection and pattern mining without any input parameters in an information-theoretic way. The next algorithm, MSS (Multiple Subspace Selection), finds multiple low-dimensional subspaces for moderately high-dimensional data, each exhibiting an interesting cluster structure. For better interpretation of the results, MSS visualizes the clusters in multiple low-dimensional subspaces in a hierarchical way.
SCMiner (Summarization-Compression Miner) focuses on bipartite graph data and integrates co-clustering, graph summarization, link prediction, and the discovery of the hidden structure of a bipartite graph on the basis of data compression. Finally, we propose a novel similarity measure for multi-instance data. The Probabilistic Integral Metric (PIM) is based on a probabilistic generative model requiring few assumptions. Experiments demonstrate the effectiveness and efficiency of PIM for similarity search (multi-instance data indexing with M-tree), explorative data analysis and data mining (multi-instance classification). To sum up, we propose algorithms combining different data mining tasks for complex data with various data types and data structures to discover the novel knowledge hidden in complex data.
LISA: Laboratory for Interdisciplinary Statistical Analysis - Short Courses
R is a free software environment for statistical computing and graphics. Part III of this short course consists of 3 sections: Section 5 introduces the concept of generalized linear models. R will be used for performing logistic regression and Poisson regression. Section 6 introduces the concept of categorical data analysis. Topics to be covered include: graphical displays of categorical data, measures of association, and contingency table analyses. Section 7 will cover writing functions in R. Users can write functions in R to carry out operations and return one or more values. Examples of functions will be given and participants will also be given exercises to help with writing their own functions. Note: experience using R or attending Part I and Part II of this series is suggested but not required for Part III. R can be downloaded here: http://www.r-project.org/ RStudio can be downloaded here: http://rstudio.org/download/desktop Course files available here: www.lisa.stat.vt.edu/?q=node/5039.
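As a rough illustration of the contingency table topic in Section 6, the sketch below cross-tabulates two categorical variables and computes row proportions. It is not part of the course files: the course itself uses R, while this sketch uses Python, and the function names and survey responses are made up for illustration.

```python
from collections import Counter

def two_way_table(rows, cols):
    """Cross-tabulate two equal-length sequences of category labels."""
    if len(rows) != len(cols):
        raise ValueError("rows and cols must have the same length")
    counts = Counter(zip(rows, cols))  # missing pairs count as 0
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}

def row_proportions(table):
    """Convert each row of counts into proportions that sum to 1."""
    result = {}
    for r, cells in table.items():
        total = sum(cells.values())
        result[r] = {c: n / total for c, n in cells.items()}
    return result

# Made-up responses: one group label and one answer per respondent.
group = ["A", "A", "B", "B", "B", "A"]
answer = ["yes", "no", "yes", "yes", "no", "yes"]

table = two_way_table(group, answer)
props = row_proportions(table)
for level, cells in table.items():
    print(level, cells)
```

Like the functions described in Section 7, each helper carries out one operation and returns a value, so the counts and the proportions can be inspected separately.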