Argmax


A show where three machine learning enthusiasts talk about recent papers and developments in machine learning.

Vahe Hagopian, Taka Hasegawa, Farrukh Rahman


    • Sep 2, 2023 LATEST EPISODE
    • infrequent NEW EPISODES
    • 50m AVG DURATION
    • 16 EPISODES



    Latest episodes from Argmax

    LoRA

    Sep 2, 2023 · 62:56


    We talk about Low-Rank Adaptation (LoRA) for fine-tuning Transformers. We are also on YouTube now! Check out the video here: https://youtu.be/lLzHr0VFi3Y
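LoRA's core trick is to freeze the pretrained weight matrix and learn only a low-rank update on top of it. A minimal NumPy sketch of the idea, not the episode's or the paper's exact recipe; the dimensions, rank, and scaling below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 4   # full dimensions, and a rank r much smaller than d
alpha = 8                    # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight matrix
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # starts at zero, so the update is a no-op at init

def lora_forward(x):
    # Frozen path plus scaled low-rank correction: (W + (alpha / r) * B @ A) @ x,
    # computed without ever materializing the dense update.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
delta = (alpha / r) * (B @ A)  # the effective weight update, d_out x d_in
```

Only A and B (2 * r * d parameters instead of d * d) would be trained; with B initialized to zero, the adapted model starts out exactly matching the frozen one.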

    15: InstructGPT

    Mar 28, 2023 · 57:27


    In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We discuss the RLHF paradigm and how important RL is to tuning GPT.

    14: Whisper

    Mar 17, 2023 · 49:14


    This week we talk about Whisper. It is a weakly supervised speech recognition model.

    13: AlphaTensor

    Mar 11, 2023 · 49:05


    We talk about AlphaTensor, and how researchers used reinforcement learning to discover new algorithms for matrix multiplication.

    12: SIRENs

    Oct 25, 2022 · 54:17


    In this episode we talk about "Implicit Neural Representations with Periodic Activation Functions" and the strength of periodic non-linearities.

    11: CVPR Workshop on Autonomous Driving Keynote by Ashok Elluswamy, a Tesla engineer

    Sep 30, 2022 · 48:51


    In this episode we discuss this video: https://youtu.be/jPCV4GKX9Dw, covering how Tesla approaches collision detection with novel methods.

    10: Outracing champion Gran Turismo drivers with deep reinforcement learning

    Aug 23, 2022 · 54:50


    We discuss Sony AI's accomplishment of creating a novel AI agent that can beat professional racers in Gran Turismo. Some topics include:
    - The crafting of rewards to make the agent behave nicely
    - What is QR-SAC?
    - How to deal with "rare" experiences in the replay buffer
    Link to paper: https://www.nature.com/articles/s41586-021-04357-7

    9: Heads-Up Limit Hold'em Poker Is Solved

    Jul 29, 2022 · 47:55


    Today we talk about recent AI advances in Poker; specifically the use of counterfactual regret minimization to solve the game of 2-player Limit Texas Hold'em.
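The engine inside counterfactual regret minimization is regret matching: play each action in proportion to how much you regret not having played it so far. A minimal NumPy sketch of regret matching against a fixed rock-paper-scissors opponent; this is not the full counterfactual algorithm for poker, and the opponent mix is made up for illustration:

```python
import numpy as np

# Payoff matrix for us in rock-paper-scissors: U[a, b] is our payoff
# when we play action a and the opponent plays action b (R, P, S order).
U = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def regret_matching(cum_regret):
    """Mix actions in proportion to positive cumulative regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total == 0.0:
        return np.full(len(cum_regret), 1.0 / len(cum_regret))
    return pos / total

# A fixed, exploitable opponent (40% rock); the best response is paper.
opponent = np.array([0.4, 0.3, 0.3])
cum_regret = np.zeros(3)
strategy_sum = np.zeros(3)

for _ in range(1000):
    strategy = regret_matching(cum_regret)
    strategy_sum += strategy
    action_utils = U @ opponent            # expected payoff of each pure action
    expected = strategy @ action_utils     # expected payoff of our current mix
    cum_regret += action_utils - expected  # regret for not playing each action

avg_strategy = strategy_sum / strategy_sum.sum()
# The average strategy concentrates on paper (index 1), the best response.
```

CFR applies this same update at every decision point of the game tree, weighted by counterfactual reach probabilities; the time-averaged strategies are what converge toward equilibrium.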

    8: GATO (A Generalist Agent)

    Jul 29, 2022 · 44:51


    Today we talk about GATO, a multi-modal, multi-task, multi-embodiment generalist agent.

    7: Deep Unsupervised Learning Using Nonequilibrium Thermodynamics

    Jun 14, 2022 · 30:55


    We start talking about diffusion models as a technique for generative deep learning.

    6: Deep Reinforcement Learning at the Edge of the Statistical Precipice

    Jun 6, 2022 · 61:08


    We discuss a NeurIPS Outstanding Paper Award winner, covering important topics around evaluation metrics and reproducibility in deep RL.

    5: QMIX

    Apr 26, 2022 · 42:06


    We talk about QMIX (https://arxiv.org/abs/1803.11485) as an example of deep multi-agent RL.
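QMIX's key constraint is that the network mixing per-agent Q-values into a joint Q_tot must be monotonic in each agent's Q, so that each agent's greedy action stays consistent with the joint greedy action. A minimal NumPy sketch of that constraint; the random weights below stand in for the state-conditioned hypernetwork outputs in the actual paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, hidden = 2, 8

# In QMIX these weights come from hypernetworks conditioned on the global
# state; fixed random values are used here purely for illustration.
w1, b1 = rng.normal(size=(n_agents, hidden)), rng.normal(size=hidden)
w2, b2 = rng.normal(size=(hidden, 1)), rng.normal(size=1)

def mix(agent_qs):
    """Combine per-agent Q-values into Q_tot with a monotonic mixing net."""
    # Taking absolute values keeps every mixing weight nonnegative, so
    # Q_tot is nondecreasing in each agent's Q-value: raising any agent's
    # Q can never lower the joint value.
    h = np.maximum(agent_qs @ np.abs(w1) + b1, 0.0)  # the paper uses ELU, ReLU here
    return float(h @ np.abs(w2) + b2)

base = mix(np.array([1.0, 2.0]))
# Increasing either agent's Q-value never decreases Q_tot.
```

Biases are allowed to be unconstrained because monotonicity only requires nonnegative weights on the Q inputs.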

    4: Can Neural Nets Learn the Same Model Twice?

    Apr 6, 2022 · 55:23


    Today's paper: Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective (https://arxiv.org/pdf/2203.08124.pdf)
    Summary: A discussion of reproducibility and double descent through visualizations of decision boundaries.
    Highlights of the discussion:
    - Relationship between model performance and reproducibility
    - Which models are robust and reproducible
    - How they calculate the various scores

    3: VICReg

    Mar 21, 2022 · 44:46


    Today's paper: VICReg (https://arxiv.org/abs/2105.04906)
    Summary of the paper: VICReg prevents representation collapse using a mixture of variance, invariance, and covariance terms when calculating the loss. It does not require negative samples and achieves great performance on downstream tasks.
    Highlights of discussion:
    - The VICReg architecture (Figure 1)
    - Sensitivity to hyperparameters (Table 7)
    - Top-5 metric usefulness
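The three loss terms map directly onto short array operations. A minimal NumPy sketch of the objective; the term weights, hinge target, and batch shapes below are illustrative stand-ins, not the paper's tuned values:

```python
import numpy as np

def vicreg_loss(z1, z2, sim_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """Sketch of a VICReg-style objective on two (n, d) batches of embeddings."""
    n, d = z1.shape

    # Invariance: the two augmented views should embed to the same point.
    sim = np.mean((z1 - z2) ** 2)

    # Variance: hinge pushing each embedding dimension's std above gamma,
    # which stops every input collapsing to one constant vector.
    def var_term(z):
        std = np.sqrt(z.var(axis=0) + eps)
        return np.mean(np.maximum(0.0, gamma - std))
    var = var_term(z1) + var_term(z2)

    # Covariance: penalize off-diagonal covariance so dimensions decorrelate
    # instead of encoding redundant information.
    def cov_term(z):
        zc = z - z.mean(axis=0)
        cov = (zc.T @ zc) / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d
    cov = cov_term(z1) + cov_term(z2)

    return sim_w * sim + var_w * var + cov_w * cov
```

Note that none of the terms compare against other samples' embeddings, which is why no negative pairs are needed: collapse is ruled out by the variance and covariance penalties alone.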

    2: data2vec

    Mar 7, 2022 · 53:23


    Today's paper: data2vec (https://arxiv.org/abs/2202.03555)
    Summary of the paper: A multimodal SSL algorithm that predicts latent representations of different types of input.
    Highlights of discussion:
    - What are the motivations of SSL and multimodality?
    - How does the student-teacher learning work?
    - What are the similarities and differences between ViT, BYOL, and reinforcement learning algorithms?
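In the student-teacher setup, the teacher that produces the latent targets is an exponential moving average (EMA) of the student's own weights. A minimal NumPy sketch of that update; tau and the parameter shapes are illustrative:

```python
import numpy as np

def ema_update(teacher_params, student_params, tau=0.999):
    """Move each teacher parameter a small step toward the student's."""
    # The teacher is never trained by gradient descent; it only tracks a
    # slow-moving average of the student, which the student then learns
    # to predict from masked inputs.
    return [tau * t + (1.0 - tau) * s
            for t, s in zip(teacher_params, student_params)]

teacher = [np.zeros(4)]
student = [np.ones(4)]
teacher = ema_update(teacher, student, tau=0.5)  # halfway between after one step
```

The slow teacher gives stable regression targets, a design shared with BYOL, which is one reason the comparison comes up in the discussion.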

    1: Reward is Enough

    Feb 21, 2022 · 54:36


    This is the first episode of Argmax! We talk about our motivations for doing a podcast, and what we hope listeners will get out of it.
    Today's paper: Reward is Enough
    Summary of the paper: The authors present the Reward is Enough hypothesis: intelligence, and its associated abilities, can be understood as subserving the maximisation of reward by an agent acting in its environment.
    Highlights of discussion:
    - High-level overview of reinforcement learning
    - How evolution can be encoded as a reward maximization problem
    - What is the one reward signal we are trying to optimize?
