Welcome to UVA Data Points, a podcast exploring the world of data science.
To view the full digital exhibition visit: https://story.datascience.virginia.edu/ The Story of Us is a digital brand showcase that chronicles the creation of a data science school at the University of Virginia. From an idea to an institute, and now to the first school of its kind in the nation, this bespoke, interactive website tells the story through written and spoken narrative supported by multimedia, archival documentation, and testimony from a diversity of voices. The result is a unique brand piece that combines technology with art, a retrospective of one school's past and how it will inform its future. It was commissioned by the School of Data Science to mark the opening of its new building in April 2024. The Story of Us website was built by John Baxton, senior web developer for the School of Data Science, who worked in close collaboration with Sue Haas, the School's information technology director. The designs for the site were created by the Charlottesville-based Journey Group. Video interviews and the audio version of The Story of Us were recorded and edited by Cody Huff, multimedia producer for the School of Data Science, who also coordinated the curation of assets for the project. Monica Manney provided the narration. Research and editorial content for the project were provided by the School of Data Science communications team of Huff, Cooper Allen, Alyssa Brown, Emma Candelier, and Kirsten Samuels. The Story of Us would not have been possible without the cooperation and generous time provided by the key figures interviewed for this project. They are Cathy Anderson, Phil Bourne, Don Brown, Arlyn Burgess, Rick Horwitz, Reggie Leonard, Melissa Phillips, Jim Ryan, Teresa Sullivan, and Jaffray Woodriff. And to all the students, faculty, and staff who contributed to this story over the years, the School of Data Science will forever be grateful.
The UVA School of Data Science was formed in September 2019 and has since grown in its collaborations, partnerships, program offerings, and teaching and research personnel. We are now constructing a new facility that will house the School of Data Science at the University of Virginia. The new building is in the first phase of development and, once complete, will link the University's Central Grounds with the athletic fields and North Grounds. The 60,000-square-foot building is the future home of the UVA School of Data Science and will serve as the gateway to the new Emmet-Ivy Corridor and the Discovery Nexus. This bonus episode is a conversation between UVA architect Alice Raucher and Mike Taylor, a principal with Hopkins Architects. Both Alice and Mike have been instrumental in the building's design. Alice has also played a key role in the development of the Ivy Corridor. Mike and Alice take a deep dive into the thought process behind the building's design, its relationship to the University and its history, the land's unique topography, and its significance to future projects along the Ivy Corridor. Links: Hopkins Architects School of Data Science New Building Website
In his new book, The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, Eric Siegel offers a detailed playbook for how business professionals can launch machine learning projects, providing both success stories where private industry got it right as well as cautionary tales others can learn from. Siegel laid out the key findings of his book in our latest episode during a wide-ranging conversation with Marc Ruggiano, director of the University of Virginia's Collaboratory for Applied Data Science in Business, and Michael Albert, an assistant professor of business administration at UVA's Darden School. The discussion, featuring three experts in business analytics, takes an in-depth look at the intersection of artificial intelligence, machine learning, business, and leadership. Links: http://www.bizML.com https://www.darden.virginia.edu/faculty-research/centers-initiatives/data-analytics/bodily-professor https://pubsonline.informs.org/do/10.1287/LYTX.2023.03.10/full/ https://www.kdnuggets.com/survey-machine-learning-projects-still-routinely-fail-to-deploy CRISP-DM: https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining CRM: https://en.wikipedia.org/wiki/Customer_relationship_management
This panel delves into how the faculty at UVA's School of Data Science are actively working to craft a liberal arts curriculum suitable for the digital age, one that not only adapts to but embraces changes in technology and practice. The panel discusses the future of data science education, including in K-12, the school's guiding philosophy for its undergraduate and graduate programs (minor, B.S., online and residential M.S., Ph.D.), and the merits as well as challenges that arise when constructing a new educational curriculum for a new discipline.
Because of advances in machine learning, wearable technology, and computer vision, the field of sport analytics is a whole new game. This episode gets into the details on what is new, the impact of analytics and technology on athletes and sports, as well as the ethics surrounding its implementation. Three experts from the University of Virginia School of Data Science met to discuss this exciting topic: Natalie Kupperman, Stephen Baek, and Don Brown. On behalf of everyone here at the School of Data Science, thank you, and we'll see you next year.
Artificial intelligence has the potential to change our societies, economies, and political systems in both intentional and unintended ways. While it is difficult to understand the full extent of what the long-term impacts may be, we have enough shared knowledge and expertise to predict the likely shapes that these changes may take, both for better and for worse. More importantly, we should ask ourselves what kind of future we want AI to help us create: what we want from the future of AI should ultimately determine the future of AI. This panel will bring together experts to discuss the intersection of AI and society and offer suggestions for how AI might work within a just, inclusive, sustainable, and fair digital future. Panelists: Farhana Faruqe, Assistant Professor of Data Science; Sarah Lebovitz, Assistant Professor of Commerce; Larry Medsker, Research Professor, George Washington University; Mar Hicks, Associate Professor of Data Science (moderator)
The latest episode of UVA Data Points features Don Brown, the senior associate dean for research at the School of Data Science, and professor Bill Basener as they discuss remote sensing, which is the process of collecting data about an object without contacting it. The discussion traces the history of remote sensing, its many applications, and the challenges involved in gathering accurate information. The two take an in-depth look at Basener's research, including his work with LiDAR and hyperspectral imaging. Basener also explains the one aspect of this burgeoning technology that keeps him up at night.
This episode is a collaboration between UVA Data Points and Hoos in STEM. This episode of UVA Data Points features Ken Ono discussing the growth of data science at UVA and its increasing importance in various disciplines, including how he uses it to help swimmers improve performance. Ono is a professor of mathematics and STEM advisor to the provost, as well as a professor of data science by courtesy. He recently supported the women's team at the U.S. Olympic Trials in Japan. Ono speaks with three UVA swimmers who are pursuing graduate degrees in data science and statistics while also performing as student-athletes: August Lamb, Kate Douglass, and Will Tenpas. They discuss student life, balancing academics with swimming, and how data science and mathematics are helping them win championships.
Excavating the Mother Lode of Human-Generated Text: A Systematic Review of Research That Uses the Wikipedia Corpus
In this episode we're looking at the past, present, and future of artificial intelligence in higher education. To explore this topic we're featuring a conversation between Phil Bourne, the dean of the UVA School of Data Science, and Jeffrey Blume, the Associate Dean for Academic and Faculty Affairs, also at UVA Data Science. Jeffrey and Phil discuss the recent trends in artificial intelligence and look at how these will impact the student experience, the faculty and staff experience, and the research landscape in higher education.
This episode was recorded during the Miller Center's 2023 William and Carol Stevenson Conference, U.S.-China Tech Competition: Has Democracy Met Its Match? For more info on this conference, as well as to watch the video versions, follow this link: https://millercenter.org/news-events/events/us-china-tech-competition-has-democracy-met-its-match This episode features the first panel discussion from the conference, entitled "Apps, platforms, and surveillance." How might apps and other technology platforms play a role in Chinese government data-gathering efforts? What are potential policy responses to the increasingly complex data flows between the United States and China? This panel addresses the long-term stability of U.S. technology infrastructure and related concerns for U.S. national security. Panelists: Josh Chin, Kara Frederick, Shanthi Kalathil, Aynne Kokas (moderator)
In this episode, we're thrilled to have Dr. Aynne Kokas, a C.K. Yen Professor at the Miller Center and an associate professor of media studies at the University of Virginia. Kokas' research examines Sino-U.S. media and technology relations. Dr. Kokas is also the author of the critically acclaimed book "Trafficking Data: How China Is Winning the Battle for Digital Sovereignty," which we will be referring to frequently throughout this conversation. We will also touch on a few topics that were discussed in her recent conference at the Miller Center titled "U.S.-China Tech Competition: Has Democracy Met Its Match?" During the event, Dr. Kokas and other experts discussed a variety of issues related to the ongoing tech competition between the US and China. For example, they explored the ways in which apps and other technology platforms may be used by the Chinese government for data-gathering purposes, and examined potential policy responses to the increasingly complex data flows between the two countries. Additionally, they discussed the long-term stability of US technology infrastructure and its implications for national security. In addition, there were panels that discussed the digital economy, climate, tech infrastructure, and political influence between China and the US. In this episode we'll be discussing data policy for US-China technology, a topic that has become increasingly relevant in recent years as the two countries continue to compete for dominance in the tech industry.
We'll delve into the differences in approach to data policy between China and the United States, the implications of these differences, and how China's Digital Silk Road initiative is expanding its influence over the global digital economy. We'll also discuss the challenges of balancing economic benefits against concerns about national security and human rights, and the future of the technology industry in light of these trends. Links: U.S.–China tech competition: Has democracy met its match? Aynne Kokas website: https://www.aynnekokas.com/ Trafficking Data: How China Is Winning the Battle for Digital Sovereignty
This episode explores the intersection of neuroscience and data science with three experts in the field, Drs. John Darrell van Horn, Tanya Evans, and Teague Henry. As we know, the brain is complicated. People have been charting paths through the brain for decades, making breakthroughs and discoveries that have changed the world. In recent years though, new methodologies in brain research have made significant impacts. Advances in computing power, as well as techniques like machine learning, neural networks, and computer vision, have allowed researchers to ask questions and make discoveries that were not possible even ten years ago. Given these new approaches to studying the world's most complicated organ, one could say that brain science is data science. Our guests make a compelling case.
This episode features a conversation between Lane Rasberry, Wikimedian-in-Residence at the University of Virginia School of Data Science, and Virginia Eubanks, author, journalist, and associate professor of political science at the University at Albany. The conversation was recorded in 2019 but the topics are still relevant today. Eubanks looks toward the future, warning of the unintended (or at times intended) consequences of emerging technologies. The discussion focuses on the effects of algorithmic automation, as well as the practice, policies, and implementation of these algorithms. Although she critiques the tech world, Eubanks also provides many reasons for optimism. Virginia Eubanks authored the 2018 book Automating Inequality, which is a detailed investigation into data-based discrimination. She is also the author of Digital Dead End: Fighting for Social Justice in the Information Age and the co-editor, with Alethia Jones, of Ain't Gonna Let Nobody Turn Me Around: Forty Years of Movement Building with Barbara Smith. She also writes for various outlets, including the Guardian, American Scientist, and the New York Times. Recently, Virginia began the PTSD Bookclub, an ongoing project that explores books about trauma and its aftermath. You can find this project and Virginia Eubanks's other projects at virginia-eubanks.com.
In this episode we’re bringing you a conversation on the future of academic data science recorded live at UVA Data Science’s Datapalooza 2022 event. Datapalooza is a flagship event for the School of Data Science. It’s typically held each year in November and features presentations by researchers here at UVA, as well as friends and collaborators of the School of Data Science. In this episode we’re featuring a panel discussion between Doug Hague, the Executive Director at UNC-Charlotte’s School of Data Science; H.V. Jagadish, Director of the Michigan Institute for Data Science at the University of Michigan; Phil Bourne, Dean of the UVA School of Data Science; and Micaela Parker, Founder and Executive Director of the Academic Data Science Alliance. Micaela also serves as the moderator for this panel discussion. Links: Future of Academic Data Science video recording; Michigan Institute for Data Science; UNC Charlotte’s School of Data Science; UVA School of Data Science
For our exploration of Analytics, we are diving into the world of sports. Because of advances in machine learning, wearable technology, and computer vision, the field of sport analytics is a whole new game. This episode gets into the details on what is new, the impact of analytics and technology on athletes and sports, as well as the ethics surrounding its implementation. Three experts from the University of Virginia School of Data Science met to discuss this exciting topic: Natalie Kupperman, Stephen Baek, and Don Brown.
This bonus episode features a conversation between Lane Rasberry, Wikimedian-In-Residence at the UVA School of Data Science, and Lloyd Sy, a Ph.D. candidate in the UVA Department of English. In this conversation, Lane and Lloyd take a deep dive into the expansive world of Wikidata and ask the existential question, "What makes a person a person?" Or, more specifically, what data points make up a person? To help answer this question, Lloyd developed a large-scale data model of the biographical data contained within the Wikidata platform. This project serves as the foundation for their conversation. They also take a wide view of biographical data as it pertains to research and academia, including the process of gathering the data, the ethics of utilizing the data, personal ownership of the data, and much more. Anyone interested in these concepts should find this discussion valuable. Links: WikiProject Biography Music: "Screen Saver" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License http://creativecommons.org/licenses/by/4.0/
This episode on Systems explores the challenges of cloud computing within the framework of biomedical research. Phil Bourne, Dean of the UVA School of Data Science, speaks with computational biologist and associate professor Nathan Sheffield about a paper they co-wrote on systemic issues from cloud platforms that do not support FAIRness, including platform lock-in, poor integration across platforms, and duplicated efforts for users and developers. They suggest instead prioritizing microservices and access to modular data in smaller chunks or summarized form. Emphasizing modularity and interoperability would lead to a more powerful Unix-like ecosystem of web services for biomedical analysis and data retrieval. The two discuss how funders, developers, and researchers can support microservices as the next generation of cloud-based bioinformatics. From Cloud Computing to Microservices: Next Steps in FAIR Data and Analysis https://pubmed.ncbi.nlm.nih.gov/36075919/
This bonus episode deviates from our central theme around the Domains of Data Science. The UVA School of Data Science was formed in September 2019 and has since grown in its collaborations, partnerships, program offerings, and teaching and research personnel. We are now constructing a new facility that will house the School of Data Science at the University of Virginia. The new building is in the first phase of development and, once complete, will link the University's Central Grounds with the athletic fields and North Grounds. The 60,000-square-foot building is the future home of the UVA School of Data Science and will serve as the gateway to the new Emmet-Ivy Corridor and the Discovery Nexus. This bonus episode is a conversation between UVA architect Alice Raucher and Mike Taylor, a principal with Hopkins Architects. Both Alice and Mike have been instrumental in the building’s design. Alice has also played a key role in the development of the Ivy Corridor. Mike and Alice take a deep dive into the thought process behind the building’s design, its relationship to the University and its history, the land's unique topography, and its significance to future projects along the Ivy Corridor. Links: Hopkins Architects School of Data Science New Building Website
Multepal Links: Digital Edition of the Popol Vuh: https://multepal.github.io/app-aanalte/xom-all-flat-mod-pnums-lbids.html Multepal Project: https://multepal.spanitalport.virginia.edu/ Multepal GitHub: https://github.com/Multepal Books Mentioned: Mining Language by Allison Bigelow
This bonus episode features Matthew Thomas, a data scientist at Inclusively and a graduate of the UVA M.S. in Data Science program. He talks about how Inclusively works to create and maintain a job board designed specifically for job seekers with disabilities. Matthew explains how typical job boards come with many built-in biases that can screen out qualified individuals without them even knowing. He discusses the challenges of removing biases from algorithms and the importance of honesty and self-criticism when examining a data science project. As Cathy O’Neil challenged in Episode 1 of UVA Data Points, we should always ask ourselves, “For whom does this fail?” Matthew’s work is a good illustration of this sentiment in practice. In addition to discussing his work, Matthew also gives solid career advice for anyone seeking a similar career path in data science.
UVA Data Points sits down with Cathy O'Neil, author of Weapons of Math Destruction, and Brian Wright, Assistant Professor of Data Science at the University of Virginia. The candid dialogue ranges from O'Neil's new book The Shame Machine to her work as an algorithm audit consultant. The two also draw comparisons between data science problems and knitting, as well as discuss educating future data scientists. Links: https://mathbabe.org (Cathy O'Neil's website) https://datascience.virginia.edu (UVA School of Data Science website) Books mentioned: The Shame Machine Weapons of Math Destruction
Before diving into the complex world of data science, it seemed wise to establish a shared definition of the field. Here at the UVA School of Data Science, we have defined data science with the 4 + 1 Model. This model serves as an outline for the first series of UVA Data Points. It also serves as a guiding definition within the School of Data Science, touching everything from research to course planning. In this introduction trailer, host Monica Manney discusses the history, development, and function of the 4 + 1 Model of Data Science with its main author, Raf Alvarado. Below is a brief excerpt from An Outline of the 4 + 1 Model of Data Science by Raf Alvarado: “The point of the 4 + 1 model, abstract as it is, is to provide a practical template for strategically planning the various elements of a school of data science. To serve as an effective template, a model must be general. But generality is often purchased at the cost of intuitive understanding. The following caveats may help make sense of the model when considering its usefulness when applied to various concrete activities. The model describes areas of academic expertise, not objective reality. It is a map of a division of labor writ large. Although each of the areas has clear connections to the others, the question to ask when deciding where an activity belongs is: who would be an expert at doing it? The realms help refine this question: the analytics area, for example, contains people who are good at working with abstract machinery. The four areas have the virtue of isolating intuitively correct communities of expertise. For example, people who are great at data product design may not know the esoteric depths of machine learning, and that adepts at machine learning are not usually experts in understanding human society and normative culture. Each area in the model contains a collection of subfields that need to be teased out. Some areas will have more subfields than others.
Although some areas may be smaller than others in terms of number of experts (faculty) and courses, each area has a major impact on the overall practice of data science and the quality of the school’s activities. In addition, these subfields are in an important sense “more real” than the categories. We can imagine them forming a dense network in which the areas define communities with centroids, and which are more interconnected than the clean-cut image of the model implies. The areas of the model are like the components of a principal component analysis of the vector space of data science. They capture the variance that exists within the field, and, crucially, provide a framework for realigning (rebasing) the academy along a new set of axes. One effect of this is to both disperse and recombine older fields, such as computer science, statistics, and operations research, into new clusters. Thus we separate computer science subfields such as complexity analysis and database design. One possible salutary result of this will be the formation of new syntheses of fields that share concerns but differ in vocabularies and customs..."