Journal Club


Welcome to a brand new show from Data Skeptic entitled "Journal Club". Each episode will feature a regular panel and one revolving guest seat. The group will discuss a few topics related to data science and focus on one featured scholarly paper which is discussed in detail.

Data Skeptic

  • Latest episode: Sep 16, 2020
  • New episodes: every other week
  • Average duration: 36m
  • Episodes: 28



Latest episodes from Journal Club

Science Fiction, Training Thousands at Home, and AutoML-Zero

Sep 16, 2020 • 30:31


This week we are back with our regular panelists! Kyle brings us a short article exploring how science fiction shapes views of AI, titled "Survey Finds Science Fiction One of Many Factors Impacting Views of AI Technology." George brings us an article about using thousands of computers from universities, companies, and volunteers to train one huge transformer, titled "Train Vast Neural Networks Together." Last but not least, Lan brings us the paper this week! She discusses "AutoML-Zero: Evolving Machine Learning Algorithms From Scratch."

DNA Storage, Assessing COVID-19 Risks, and NLP Beyond English

Sep 10, 2020 • 52:46


We are back with another guest this week! NLP/ML research scientist Fredrik Olsson joins us to discuss the work "Why You Should Do NLP Beyond English." Lan brings us a news item, "Research News: DNA Storage." George talks about the article "Discovering Symbolic Models from Deep Learning with Inductive Biases." Last but not least, Kyle brings us the paper this week: "Machine Reasoning to Assess Pandemics Risks: Case of USS Theodore Roosevelt."

Sound Detection, Exam Algorithm, and Search Result Fairness

Sep 1, 2020 • 40:36


Rachel Bittner, a research scientist at Spotify, joins our discussion this week! She brings us the paper "Few-Shot Sound Event Detection." Lan discusses an article about the fairness of search results. George talks about a blog post on England's exam-grading algorithm and the controversy around it. Last but not least, Kyle brings us an article titled "Eye-Catching Advances in Some AI Fields are not Real."

Stitch Fix, Carbon Footprints, and AI with Human Values

Aug 26, 2020 • 39:28


Back with our regular panelists! This week Lan brings us an article from the clothing subscription service Stitch Fix's Multithreaded technology blog, "Multi-Armed Bandits and the Stitch Fix Experimentation Platform." Kyle discusses a news item titled "Shrinking Deep Learning's Carbon Footprint." Last but not least, George brings the paper this week, titled "Aligning AI With Shared Human Values."

Facial Recognition, Identifying Birds with AI, and Zebrafish Tail Buds

Aug 14, 2020 • 34:00


Back again with our regular panelists! This week George brings us an interesting article about an AI developed to identify individual birds without tagging them. Kyle discusses the news item "New York legislature votes to halt facial recognition tech in schools for two years." Last but not least, Lan brings the paper this week! She discusses the paper "A deep learning approach for staging embryonic tissue isolates with small data."

Gradio, Psychology in AI, and Automatic Test Scoring

Aug 9, 2020 • 33:22


Back again with our regular panelists! Today Lan brings us an article she found on Reddit about Gradio. George brings us a post about the psychology techniques used in AI. Last but not least, Kyle brings us our paper this week, about automatic essay test scoring!

Robotic Skin, Social Isolation, and Reinforcement Learning

Jul 29, 2020 • 34:43


Back again with another episode featuring our regular panelists! This week Kyle discusses the article from The Scientist, "How Social Isolation Affects the Brain." Lan brings us an article titled "National University of Singapore used Intel Neuromorphic chip to develop touch-sensing robotic 'skin.'" Last but not least, George brings us the paper this week! He talks about reinforcement learning, bringing the paper "One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control." All works are linked in the show notes.

Lottery Tickets, Forecasting COVID-19, and NetHack

Jul 21, 2020 • 34:22


Back again with our regular panelists! George leads a discussion about the game NetHack via the blog post "The NetHack Learning Environment." Kyle brings us an article titled "Where the Latest COVID-19 Models Think We're Headed - And Why They Disagree." Last but not least, Lan brings us the paper this week! She brings us "Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask." All works are linked in the show notes.

200 Tools, Blurry to Photorealistic, and Logical Neural Networks

Jul 14, 2020 • 32:53


This week we have our regular panel of Lan, George, and Kyle! George brings us the blog post "What I Learned From Looking at 200 Machine Learning Tools," in which the author talks through a compiled list of all the AI/ML tools they could find. Lan brings us a news article, "Making Blurry Faces Photorealistic Only Goes So Far," which discusses how AI researchers discovered an inherent resolution limit to "upsampling" pixelated faces. Last but not least, Kyle brings us the paper for this week, "Logical Neural Networks," which proposes a novel framework seamlessly providing key properties of both neural nets (learning) and symbolic logic (knowledge and reasoning). All works are linked in the show notes.

Deep Fakes in a Court Room, Mass COVID-19 Testing with Biosensors, and BLEURT

Jul 6, 2020 • 38:05


We are back with our regular panel this week! Starting off, Lan brings us the article "Biosensors May Hold the Key to Mass Coronavirus Testing," which talks about tech startups beginning to develop chips that signal the presence of coronavirus RNA, antibodies, and antigens. George brings us a blog post all about BLEURT, titled "BLEURT: Learning Robust Metrics for Text Generation." Last but not least, Kyle discusses the main paper this week, on deepfakes popping up in courtrooms, titled "Courts and Lawyers Struggles With Growing Prevalence of DeepFakes." All works are linked in the show notes.

Covid-19 Misinformation, GPT-3, and Movement Pruning

Jul 1, 2020 • 41:52


We're back with special guest panelist Leonardo Apolonio! He brings us the main paper this week, titled "Movement Pruning: Adaptive Sparsity by Fine-Tuning." George shows us a blog post discussing GPT-3. Lan introduces us to an article about misinformation related to Covid-19. Last but not least, Kyle covers another Covid-19 topic: contact tracing apps!

Open Source AI for Everyone, Diagnosing Blindness and Histogram Reweighting

Jun 24, 2020 • 32:07


Another week, another episode! We are back again with our regular panelists. George brings us a clinical field study of an AI being used to diagnose blindness. Lan discusses the article titled "AI Infrastructure for Everyone, Now Open Sourced." Last but not least, Kyle brings us our paper for the week, "Extending Machine Learning Classification Capabilities with Histogram Reweighting."

Chip Design, Teaching Google, and Fooling LIME and SHAP

Jun 16, 2020 • 32:42


This week's episode has the regular panel back together! George brings us the blog post from Google AI, "Chip Design with Deep Reinforcement Learning." Kyle brings us a news item from CNET, "How People with Down Syndrome are Improving Google Assistant." Lan brings us the paper this week! She discusses "Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods." All works mentioned are linked in the show notes.

Hateful Memes, Carbon Emissions, and Detecting Ear Infections with Neural Networks

Jun 10, 2020 • 44:09


This week on Journal Club we have another panelist! Jesus Rogel-Salazar joins us to discuss the paper "Automatic Detection of Tympanic Membrane and Middle Ear Infection." Kyle talks about the relationship between Covid-19 and carbon emissions. George tells us about the new Hateful Memes Challenge from Facebook. Lan talks about Google's AI Explorables. All mentioned work can be found in the show notes.

Animal Olympics, Whatsapp, and Models for Healthcare

Jun 3, 2020 • 41:38


This week we have a guest joining us, Francisco J. Azuaje G! He brings us the paper "How to Develop Machine Learning Models for Healthcare." Lan discusses the "Animal AI Olympics," a reinforcement learning competition inspired by animal cognition. Kyle talks about WhatsApp, discussing the article "Why New Contact Tracing Apps Have A Critical WhatsApp-Sized Problem." Last but not least: George! He brings us his blog post comparing TF-IDF and BERT vectorisation for speaker prediction. All works discussed can be found in the show notes.

Deeply Tough Framework, Grammar for Agents, and Too Much Screen Time?

May 26, 2020 • 34:24


Today on the show, Kyle discusses research suggesting that time on screens has little impact on kids' social skills. Lan talks about DeeplyTough, a deep learning framework targeting the protein pocket matching problem: can a pair of protein pockets bind to the same ligand? George's paper this week is about defining a grammar for interpretable agents. By basing this formalism on a corpus of human explanation dialogues, the authors hope to produce a more "grounded" protocol.

Chemical Space, AI Microscope, and Panda or Gibbon?

May 19, 2020 • 31:15


George talks about OpenAI's Microscope, a collection of visualisations of the neurons and layers in 6 famous vision models. This library hopes to make analysis of these models a community effort. Lan talks about exploring chemical space with AI and how that may change pharmaceutical drug discovery and development. Kyle leads a discussion about the paper "Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions," which shows another control an adversarial attacker can put in place to better fool machine learning models.

Encryption Keys, Connect Four, and Data Nutrition Labels

May 14, 2020 • 41:09


Today George takes inspiration (and the gym environment) from Kaggle's ConnectX competition and shows off an attempt to design an interpretable Connect 4 agent with DQN! Lan discusses the paper "The Dataset Nutrition Label," a framework to facilitate higher data quality standards, by Sarah Holland and co-authors from the Assembly program at the Berkman Klein Center at Harvard University & MIT Media Lab. Last but not least, Kyle leads the panel in a discussion about encryption keys!

ML Cancer Diagnosis, Robot Assistants, and Watermarking Data

May 6, 2020 • 31:03


Today George talks about the use of machine learning to diagnose cancer from a blood test. By sampling cell-free DNA, this test is capable of identifying 50 different types of cancer and the localized tissue of origin with greater than 90% accuracy. Lan leads a discussion of what robots and researchers in robotics may be able to contribute toward fighting the COVID-19 pandemic. Last but not least, Kyle leads the panel in a discussion about watermarking data!

Humanitarian AI, PyTorch Models, and Saliency Maps

Apr 30, 2020 • 27:35


George's paper this week is "Sanity Checks for Saliency Maps." This work takes stock of a group of techniques that generate local interpretability and assesses their trustworthiness through two 'sanity checks'. From this analysis, Adebayo et al. demonstrate that a number of these tools are invariant to the model's weights and could lead a human observer into confirmation bias. Kyle discusses AI and brings the question: how can AI help in a humanitarian crisis? Last but not least, Lan brings us the topic of Captum, an extensive interpretability library for PyTorch models.
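To get a concrete feel for what those sanity checks do, here is a rough, hedged sketch of the model-parameter-randomization test, using plain input-gradient saliency and a tiny stand-in CNN; both choices are illustrative, not the paper's exact setup.

```python
# A rough sketch of the model-parameter-randomization sanity check from
# "Sanity Checks for Saliency Maps", using input-gradient saliency as the
# attribution method. `trained_model` below is an illustrative stand-in for a
# classifier you have actually trained; the randomized copy reuses the same
# architecture with freshly initialized weights.
import copy
import torch
import torch.nn as nn

def gradient_saliency(model, image, target_class):
    """Absolute input gradient, max over channels -> per-pixel saliency map."""
    image = image.clone().requires_grad_(True)
    score = model(image.unsqueeze(0))[0, target_class]
    score.backward()
    return image.grad.abs().max(dim=0).values

def randomize_weights(model):
    """Return a copy of the model with every parameter re-initialized."""
    randomized = copy.deepcopy(model)
    for p in randomized.parameters():
        nn.init.normal_(p, std=0.02)
    return randomized

# Stand-ins: a tiny CNN and a random "image" keep the sketch self-contained.
trained_model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
).eval()
image = torch.rand(3, 32, 32)

sal_trained = gradient_saliency(trained_model, image, target_class=0)
sal_random = gradient_saliency(randomize_weights(trained_model), image, 0)

# If the two maps correlate strongly, the saliency method is insensitive to
# what the model learned -- the failure mode Adebayo et al. warn about.
corr = torch.corrcoef(torch.stack([sal_trained.flatten(),
                                   sal_random.flatten()]))[0, 1]
print(f"correlation(trained, randomized) = {corr:.3f}")
```

A method that passes the check should produce clearly different maps once the weights are randomized.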

Adversarial Examples, Protein Folding, and Shapley Values

Apr 28, 2020 • 45:57


George dives into his blog post experimenting with Scott Lundberg's SHAP library. By training an XGBoost model on a dataset about academic attainment and alcohol consumption, can we develop a global interpretation of the underlying relationships? Lan leads a discussion of the paper "Adversarial Examples Are Not Bugs, They Are Features" by Ilyas and colleagues. This paper proposes a new perspective on the adversarial susceptibility of machine learning models by teasing apart the 'robust' and 'non-robust' features in a dataset. The authors summarize the key takeaway as "Adversarial vulnerability is a direct result of the models' sensitivity to well-generalizing, 'non-robust' features in the data." Last but not least, Kyle discusses AlphaFold!
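The SHAP-on-XGBoost workflow George walks through can be sketched in a few lines. This is a hedged illustration with synthetic stand-in columns, not the notebook or dataset from his post.

```python
# Minimal sketch: train an XGBoost regressor, then use SHAP's TreeExplainer to
# turn per-sample attributions into a global view of feature effects.
import numpy as np
import pandas as pd
import xgboost
import shap

# Synthetic stand-in for a student-performance style dataset (illustrative
# column names, not the real schema from the post).
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "study_hours":     rng.uniform(0, 20, n),
    "weekend_alcohol": rng.integers(1, 6, n),
    "absences":        rng.integers(0, 30, n),
})
y = 10 + 0.4 * X["study_hours"] - 0.8 * X["weekend_alcohol"] \
      - 0.1 * X["absences"] + rng.normal(0, 1, n)

model = xgboost.XGBRegressor(n_estimators=200, max_depth=3)
model.fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# The summary plot aggregates local attributions into a global picture of
# which features push the predicted grade up or down.
shap.summary_plot(shap_values, X)
```

The summary plot is what turns many local Shapley attributions into the kind of global interpretation the post is after.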

Tools For Misusing GPT2, Tensorflow, and ML Unfairness

Apr 15, 2020 • 25:56


Today on the show, George leads a discussion about the Giant Language Model Test Room. Lan presents a news item about setting fairness goals with the TensorFlow Constrained Optimization Library. This library lets users configure and train machine learning problems based on multiple different metrics, making it easy to formulate and solve many problems of interest to the fairness community. Last but not least, Kyle discusses ML unfairness, looking at juvenile recidivism in Catalonia.

Dark Secrets of Bert, Radioactive Data, and Vanishing Gradients

Apr 8, 2020 • 40:47


Today on the show, Lan presents a blog post revealing the dark secrets of BERT. This work uses telling visualizations of self-attention patterns before and after fine-tuning to probe what happens in the fine-tuned BERT. George brings a novel technique to the show, "radioactive data," a marriage of data and steganography. This work from Facebook AI Research gives us the ability to know exactly who's been training models on our data. Last but not least, Kyle discusses the work "Learning Important Features Through Propagating Activation Differences."

Dopamine, Deep Q Networks, and Hey Alexa!

Apr 1, 2020 • 36:09


Today on the show, Lan presents a blog post from Google Deepmind about Dopamine and temporal difference learning. This is the story of a fruitful collaboration between Neuroscience and AI researchers that found the activity of dopamine neurons in the mouse ventral tegmental area during a learnt probabilistic reward task was consistent with distributional temporal-difference reinforcement learning. That's a mouthful, go read it yourself! George presents his first attempts at designing an Auto-Trading Agent with Deep Q Networks. Last but not least, Kyle says "Hey Alexa! Sorry I fooled you ..."    

AlphaGo, COVID-19 Contact Tracing, and New Data Set

Mar 27, 2020 • 31:50


George leads a discussion about "AlphaGo - The Movie | Full Documentary." Lan informs us about the COVID-19 Open Research Dataset. Kyle shares some thoughts about the paper "Beyond R_0: the importance of contact tracing when predicting epidemics."

Google's New Data Engine, Activation Atlas, and LIME

Mar 22, 2020 • 38:34


George discusses Google's Dataset Search leaving its closed beta program and what potential applications it will have for businesses, scholars, and hobbyists. Alex brings an article about Activation Atlases, and we discuss their applicability to machine learning interpretability. Lan leads a discussion about the paper "Attention is not Explanation" by Sarthak Jain and Byron C. Wallace, which explores the relationship between attention weights and feature importance scores (spoilers in the title). Kyle shamelessly promotes his blog post using LIME to explain a simple prediction model trained on Wikipedia data.
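For readers who want to try the LIME part themselves, the general text-classification workflow looks roughly like this. The toy corpus and labels below are stand-ins, not the Wikipedia data from Kyle's post.

```python
# Hedged sketch of LIME for text: fit a simple classifier, then ask LIME which
# words drove a single prediction.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Tiny illustrative corpus; the blog post used text derived from Wikipedia.
texts = ["the theorem was proved by induction",
         "the striker scored twice in the final",
         "the lemma follows from the definition",
         "the goalkeeper saved a penalty kick"]
labels = [0, 1, 0, 1]  # 0 = math article, 1 = sports article (toy labels)

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["math", "sports"])
explanation = explainer.explain_instance(
    "the referee checked the proof of the goal",  # instance to explain
    pipeline.predict_proba,                       # black-box probability fn
    num_features=5,
)
print(explanation.as_list())  # (word, weight) pairs for the local explanation
```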

Albert, Seinfeld, and Explainable AI

Mar 22, 2020 • 36:20


Kyle discusses Google's recent open-sourcing of ALBERT, a variant of the famous BERT model for natural language processing. ALBERT is more compact and uses fewer parameters. George leads a discussion about the paper "Explainable Artificial Intelligence: Understanding, visualizing, and interpreting deep learning models" by Samek, Wiegand, and Muller. This work introduces two tools for generating local interpretability and a novel metric to objectively compare the quality of explanations. Last but not least, Lan talks about her experience generating new Seinfeld scripts using GPT-2.

Chess Transformer, Kaggle Scandal, and Interpretability Zoo

Mar 12, 2020 • 44:02


Lan tells the story of a transformer learning to play chess. The experiment was to fine-tune a GPT-2 transformer model on a corpus of 2.4M chess games in standard notation, then see if it can 'play chess' by generating the next move. This is a thought-provoking way to take advantage of advances in NLP by 'transforming' a game into the 'language' of written text. This was work done by Shawn Presser. George gives a breakdown of a Kaggle cheating scandal in which a Grandmaster was caught training on the test set. The story follows Benjamin Minixhofer and his capable detective work to discover an obfuscation that artificially improved the winning team's accuracy. Kyle leads a discussion of the paper "Towards A Rigorous Science of Interpretable Machine Learning" by Finale Doshi-Velez and Been Kim. The paper is a great survey of the spectrum of interpretability techniques and also contains suggestions for how we describe the "taxonomy" of various methodologies.
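To make the 'chess as language' idea concrete, here is a hedged sketch of the prompting step. It loads the stock gpt2 checkpoint purely as a stand-in, since the fine-tuned chess model from the experiment isn't assumed to be available; swap in such a checkpoint to get legal-looking moves back.

```python
# Sketch of "chess as language", not Shawn Presser's actual code: a GPT-2
# model fine-tuned on games in standard notation is prompted with the moves so
# far and asked to continue the "sentence" with the next move.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

checkpoint = "gpt2"  # stand-in; the chess model would be a fine-tuned variant
tokenizer = GPT2TokenizerFast.from_pretrained(checkpoint)
model = GPT2LMHeadModel.from_pretrained(checkpoint)

# The game so far, written exactly as games appear in the training corpus.
prompt = "1. e4 e5 2. Nf3 Nc6 3. Bb5"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Sample a short continuation; with a model fine-tuned on chess games, the
# next tokens should read as Black's reply (e.g. "a6").
output = model.generate(
    input_ids,
    max_new_tokens=8,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```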
