In this special live episode, with guests Lisa Christensen, Hillary Miller and Christopher Lind, we explore their experience and deep expertise on the topics of L&D strategy and team structures - and it's a masterclass of a conversation. Register for L&D Next, 3rd - 6th March, for free today at https://360learning.com/l-and-d-next/2025/

KEY TAKEAWAYS
Understand what L&D is there to achieve.
Team structures have to evolve and be flexible.
L&D structure has to be right for achieving L&D aims, so it may be different from other teams in the organisation.
Build adaptable teams by focusing on skillsets.
Leverage data and home in on one KPI.
Understand which relationships and functions you need to focus on. That will evolve.
Work holistically with other areas of the business.

BEST MOMENTS
"I've never seen a truly centralized model, ever."
"We can get hung up on the hierarchy of things and miss out."
"You gotta know who your players are, their bench strengths."
"We need better data capabilities in learning."
"Figure out what they care about and then lean on that."

Lisa Christensen
Lisa leads McKinsey & Company's Learning Design and Development Center of Excellence, a global team of design experts, designing and building the learning that develops McKinsey Partners and professionals, enabling them to deliver incredible client impact. Lisa founded and leads McKinsey's Learning Research and Innovation Lab and sits on the global learning leadership team.
https://www.linkedin.com/in/lisachristensen

Christopher Lind
Christopher Lind is a dynamic leader at the intersection of business, technology, and human experience, serving as an executive advisor, AI ethicist and sought-after voice in the L&D space. As a former Chief Learning Officer for ChenMed and GE Healthcare, Christopher has led transformative learning strategies that enhance workforce capability and business performance. A prominent commentator, speaker, and thought leader, he is known for his forward-thinking approach to digital learning, AI, and the evolving role of technology in talent development.
https://www.linkedin.com/in/christopherlind
Future Focused: https://christopherlind.substack.com

Hillary Miller
Hillary Miller is a seasoned Learning & Development leader currently heading L&D at HCA Healthcare. With a passion for driving workforce capability and business impact, she brings extensive experience in healthcare education and leadership development. Previously, as Chief Learning Officer at Penn State Health, Hillary led enterprise-wide learning strategies, fostering a culture of continuous development and innovation.
https://www.linkedin.com/in/hillarybmiller

VALUABLE RESOURCES
https://podcasts.apple.com/gb/podcast/the-learning-development-podcast/id1466927523
L&D Master Class Series: https://360learning.com/blog/l-and-d-masterclass-home

THE HOST
David James
David has been a People Development professional for more than 20 years, most notably as Director of Talent, Learning & OD for The Walt Disney Company across Europe, the Middle East & Africa. As well as being the Chief Learning Officer at 360Learning, David is a prominent writer and speaker on topics around modern and digital L&D.

CONTACT METHOD
Twitter: https://twitter.com/davidinlearning
LinkedIn: https://www.linkedin.com/in/davidjameslinkedin
L&D Collective: https://360learning.com/the-l-and-d-collective
Blog: https://360learning.com/blog
L&D Master Class Series: https://360learning.com/blog/l-and-d-masterclass-home
BIO: James "Jimmy" Milliron is Co-Founder & President of National Brokerage Atlantic, specializing in Wealth Enhancement, Estate Planning, and Asset Protection.

STORY: Jimmy wanted to invest $100,000 in Bitcoin, but when he couldn't find an easy way to do it, he bought a car instead.

LEARNING: Research and learn all you can about investment opportunities before investing.

"Don't be afraid to pick up the phone and make a few calls. There's nothing like picking up the phone and talking to a real person on the other end instead of just texting them." - Jimmy Milliron

Guest profile
James "Jimmy" Milliron is Co-Founder & President of National Brokerage Atlantic, specializing in Wealth Enhancement, Estate Planning, and Asset Protection. An insurance veteran, he previously served as Executive Vice President at NexTier Bank, building a $400 million premium finance portfolio. He holds a BA from VMI and various securities and insurance licenses.

Worst investment ever
Jimmy's worst investment is a mix between marrying a second wife and buying a car in 2016. He invested many resources in his second marriage, but it did not last long.
When Jimmy married his second ex-wife, he wanted to invest about $100,000 in Bitcoin. But he was busy and did not have time to research and learn more about Bitcoin. When Jimmy could not find an easy way to do it, he purchased a car instead with that cash.

Lessons learned
Go the extra mile in researching and learning about investment opportunities before investing.
Consider all the investment options available.

Actionable advice
If you're young, seek advice from a mentor or your parents about what they would do instead of arbitrarily investing in a make-me-feel-good investment. Their guidance can be invaluable in navigating the complex world of investments.

Jimmy's recommendations
Jimmy recommends reading Donald Trump's Art of the Deal as a valuable resource for negotiation and decision-making.

No.1 goal for the next 12 months
Jimmy's number one goal for the next 12 months is losing weight.

Parting words
"Thank you very much. Andrew and I wish everyone well." - Jimmy Milliron

Connect with Jimmy Milliron
LinkedIn
Website

Andrew's books
How to Start Building Your Wealth Investing in the Stock Market
My Worst Investment Ever
9 Valuation Mistakes and How to Avoid Them
Transform Your Business with Dr. Deming's 14 Points

Andrew's online programs
Valuation Master Class
How can technology reshape education to cater to every student's unique learning journey? Join us as we explore this question with Dr. Kristen DiCerbo, the Chief Learning Officer at Khan Academy, who shares her transformative insights into personalized and mastery-based learning. Discover how Khan Academy's new AI tool, "Khanmigo", is revolutionizing learning as it helps educators address the diverse needs of students worldwide, breaking free from the constraints of traditional educational models.

Dr. DiCerbo highlights the challenges teachers face, such as managing large classrooms and addressing special education shortages, and how technology can alleviate these burdens. By leveraging platforms like Khan Academy, educators can provide tailored learning experiences that empower students and nurture creativity. We delve into the ways technology can reduce teacher burnout, foster stronger student-teacher connections, and transform teaching roles to be more focused on creating meaningful educational experiences and maintaining a healthy work-life balance.

More About Our Guest
Dr. Kristen DiCerbo is the Chief Learning Officer at Khan Academy, where she leads the content, design, product management, and community support teams. Dr. DiCerbo's career has focused on embedding insights from education research into digital learning experiences. Prior to her role at Khan Academy, she was Vice-President of Learning Research and Design at Pearson, served as a research scientist supporting the Cisco Networking Academies, and worked as a school psychologist. Kristen has a Ph.D. in Educational Psychology from Arizona State University.

Connect with Khan Academy
khanmigo.ai
khanacademy.org

Got a story to share or question you want us to answer? Send us a message!

About the podcast:
The KindlED Podcast explores the science of nurturing children's potential and creating empowering learning environments. Powered by Prenda Microschools, each episode offers actionable insights to help you ignite your child's love of learning. We'll dive into evidence-based tools and techniques that kindle young learners' curiosity, motivation, and well-being.

Got a burning question? We're all ears! If you have a question or topic you'd love our hosts to tackle, please send it to podcast@prenda.com. Let's dive into the conversation together!

Important links:
• Connect with us on social
• Subscribe to The Sunday Spark
• Get our free literacy curriculum

Interested in starting a microschool?
Prenda provides all the tools and support you need to start and run an amazing microschool. Create a free Prenda World account to start designing your future microschool today. More info at ➡️ Prenda.com or if you're ready to get going ➡️ Start My Microschool
L&D leaders in large organizations often cite a lack of time as the main barrier to upskilling and reskilling. While increasing learning hours is important, the CLO Lift team found that time is not the most critical factor. Instead, the quality of learning has equal or greater influence on performance than the quantity of time spent. Rather than focusing solely on hours, business leaders and L&D professionals should emphasize value. This episode explores these insights with Lisa Christensen and Huw Newton-Hill.

KEY TAKEAWAYS
CLO Lift is a community of 20 CLOs who have come together to solve the biggest unresolved challenges in the L&D industry.
You have to create true value before enough time will be allocated to employee learning.
If you are not 100% in step with the business, the training that you provide will be out of date and not relevant.
The learning you deliver must genuinely enable people to make progress in their careers.
L&D teams need to stop being mere order takers. Work more closely with managers to understand business challenges and be actively involved in finding solutions.
Certain elements of people's roles impact the way they behave. If a learning intervention cannot change that element, speak up and ask those who can solve that underlying issue to do so.
Create a learning culture, e.g. include skill acquisition in personal reviews.
Democratize access to learning. Use generative AI to make the training more relevant in every role in every geography.
Measure progress against actual business objectives.
Start now. Create some small experiments and generate wins. You only need one win to create credibility and act as a springboard.

BEST MOMENTS
"Ensure that the whole body of L&D is in lockstep with the business."
"The more courage you lean into, the more credibility you build."
"Be bold and go forth."
"Any good action today is better than perfect action a month from now."

CLO Lift, From Time to Value report: https://learningforum.substack.com/p/clo-lift-from-time-to-value?r=ivpaq&utm_campaign=post&utm_medium=web&triedRedirect=true

Lisa Christensen
Bio: As Director of Learning Design and Innovation at McKinsey & Company, Lisa leads a global team focused on cutting-edge learning solutions. She founded McKinsey's Learning Research and Innovation Lab and is a recognized thought leader in learning. Lisa is a founding member of CLO Lift and was previously a senior leader at a learning design firm.

Huw Newton-Hill
Bio: Huw leads Attensi's US office, delivering AI-powered training solutions. A former strategy consultant at BCG, he now drives growth and innovation in L&D, contributing to forums like CLO Lift.

VALUABLE RESOURCES
The Learning And Development Podcast - https://podcasts.apple.com/gb/podcast/the-learning-development-podcast/id1466927523
L&D Master Class Series: https://360learning.com/blog/l-and-d-masterclass-home/

ABOUT THE HOST
David James
David has been a People Development professional for more than 20 years, most notably as Director of Talent, Learning & OD for The Walt Disney Company across Europe, the Middle East & Africa. As well as being the Chief Learning Officer at 360Learning, David is a prominent writer and speaker on topics around modern and digital L&D.

CONTACT METHOD
Twitter: https://twitter.com/davidinlearning/
LinkedIn: https://www.linkedin.com/in/davidjameslinkedin/
L&D Collective: https://360learning.com/the-l-and-d-collective/
Blog: https://360learning.com/blog/
L&D Master Class Series: https://360learning.com/blog/l-and-d-masterclass-home/
Get to know the person behind the research in this thoughtful conversation with acclaimed education researcher John Hattie. Join host Peter DeWitt and cohost Mike Nelson as they chat with John Hattie about his journey as a lifelong learner, how his Visible Learning research has resonated globally, and his goal for every educator to have a theory of learning. And learn about John Hattie's highlight of his week: the days he spends with his grandson. Tune into this season seven finale of the Leaders Coaching Leaders Podcast for insights from an award-winning expert on enhancing student learning and achievement. Let us know what you think!
On this episode of the Getting Smart Podcast, Mason is joined by repeat guest Aaron Schorn of Unrulr and two guests new to the Getting Smart Podcast: Kwaku Aning, Director of the Center of Innovation and Entrepreneurial Thinking at the San Diego Jewish Academy, and Mike Yates of The Reinvention Lab at Teach for America. Links: Kwaku Aning LinkedIn Aaron Schorn LinkedIn Mike Yates LinkedIn San Diego Jewish Academy Reinvention Lab at Teach for America Designing Your Life MSCHF Hallcraft School Studio Unrulr Shoelace Learning Julia Dexter Changing the Subject d.school Scott Center for Entrepreneurship Bill Summers - Cañon City High School Capstone Consortium Big Picture Learning EyeCandy Previous episode with Aaron Schorn Lauryn Hill - Sister Act 2 Performance
Today on the show, Boston Children's Museum President and CEO Carole Charnow interviews Dr. Mitchel Resnick in the next installment of our Creativity Series. Mitch Resnick is the LEGO Papert Professor of Learning Research at the MIT Media Lab and develops new technologies and activities to engage people (particularly children) in creative learning experiences. His Lifelong Kindergarten research group developed the Scratch programming software and online community, used by millions of young people around the world. Carole talks with Mitch about his new project OctoStudio, the value of Kindergarten style learning, the 4 P's of Creative Learning, the relationship between technology and creative learning and more.
In the context of a cost-of-living crisis and increased child poverty, this podcast hears about the growing use of food banks, how they operate and the impact this has on children whose families use them. Full show notes and links: https://www.ucl.ac.uk/ioe/news/2023/dec/what-impact-do-food-banks-schools-have-childrens-learning-rftrw-s21e02
Jason Yosinski was a founding member of Uber AI Labs. He is also a co-founder of WinscapeAI, a company dedicated to using custom sensor networks and machine learning to increase the efficiency and sustainability of wind farms. Jason holds a PhD in computer science from Cornell University. We talked about his experience at Uber AI, his research in deep learning, and ML for wind farms.

Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and careers.

Jason's Website: https://yosinski.com/
Jason's LinkedIn: https://www.linkedin.com/in/jasonyosinski/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu

(00:00:00) Introduction
(00:06:06) His advice for Uber ML teams
(00:16:03) From research to industry
(00:20:24) ML for wind farms
(00:25:40) Metrics for wind energy prediction
(00:29:23) Start with a small dataset
(00:32:00) ML in academia vs. the industry
(00:33:24) Do you need a PhD for ML?
(00:38:14) Daliana's story about grad school
(00:41:37) The value of a PhD
(00:43:13) ML Collective
(00:48:36) Technical communication
(00:57:21) ML Skillsets
(00:59:45) Future of machine learning
(01:05:23) Personal development: Hoffman process
(01:15:13) Do things that excite you
Speaker: Andrew Burns, Vice President at BTS
Andrew Burns is a Vice President at BTS responsible for partnering with clients to accelerate strategy alignment and culture change initiatives. His work spans the employee lifecycle, from developing assessment centers for hiring and leader development to defining what great leadership looks like within an organization's unique context. Andrew has extensive experience working in industries ranging from Aerospace to Oil and Gas, Utilities, and Software.

Host: Rachel Cooke, Brandon Hall Group™
Rachel Cooke is Brandon Hall Group's Chief Operating Officer and Principal HCM Analyst. She is responsible for business operations, including client and member advisory services, marketing design, annual awards programs, conferences and the company's project management functions. She also leads Advancing Women in the Workplace and Diversity, Equity and Inclusion initiatives, research and events. Rachel has worked in the HCM research industry for 20+ years and has held several key management and executive positions within the Talent and Learning Research and Performance Improvement industries.
To the Classroom: Conversations with Researchers & Educators
Today my guest is Dr. Margaret McKeown. We'll start our conversation discussing vocabulary development and explicit vocabulary instruction, including how to choose words for instruction, how to teach words so students understand them deeply, and how to help students build connections between words. Our conversation then shifts to the Questioning the Author instructional intervention, which focuses on developing comprehension through conversation and can be used to foster independence and discussion amongst students. Later, I'm joined by my colleagues Gina Dignon and Rosie Maurantonio for a conversation about how we'll bring what we learned to the classroom.

Read a full transcript of this episode and learn more about the show and Jennifer Serravallo at JenniferSerravallo.com

Bringing Words to Life: Robust Vocabulary Instruction

More about Dr. Margaret McKeown:
Margaret G. McKeown, PhD, is Clinical Professor Emerita of Education at the University of Pittsburgh. Before her retirement, she was also a Senior Scientist at the University's Learning Research and Development Center. Her work addresses practical, current problems that classroom teachers and their students face. She has conducted research in the areas of learning, instruction, and teacher professional development in reading comprehension and vocabulary. Dr. McKeown is a recipient of the Outstanding Dissertation Award from the International Literacy Association, is a Fellow of the American Educational Research Association, and was inducted into the Reading Hall of Fame. She is coauthor of books including Bringing Words to Life, Second Edition; Creating Robust Vocabulary; Robust Comprehension Instruction with Questioning the Author; and Vocabulary Assessment to Support Instruction.

Special thanks to Alex Van Rose for audio editing this episode.

Support the show
Advitya Gemawat ( https://www.microsoft.com/en-us/research/people/agemawat/ ) is a Machine Learning Research Engineer who currently works in the Responsible AI (RAI) team as part of the Azure Machine Learning (AML) and Azure AI Studio product lines, in the Azure AI Platform org. Prior to the RAI team, he was part of the 4th cohort of the Microsoft AI Development Acceleration Program (MAIDAP), where he worked with Azure Quality, Microsoft's Gray Systems Lab (GSL), Azure Edge & Platform, and Azure Machine Learning (AML).

His research experience is at the intersection of Machine Learning & Scalable Systems. His work with GSL was identified as the Microsoft Global Hackathon Executive Challenge 2022 Winner and recipient of the Best Demonstration Award at VLDB 2022. His work with the AML team on expanding the RAI Dashboard to support Object Detection models was released in Public Preview at Microsoft Build 2023.

Prior to Microsoft, Advitya graduated from UC San Diego with a Data Science major, where he contributed to Project Cerebro (a Layered Data Platform for scalable Deep Learning) and was advised by Professor Arun Kumar. There, his research won the 2021 ACM SIGMOD Student Research Abstract Competition and the Best Project Award as part of UCSD's Halıcıoğlu Data Science Institute (HDSI) Undergraduate Scholarship Program. As part of an internship with VMware, his work on Massively Parallel Automated Model Building for Deep Learning was included in the Apache MADlib 1.18.0 release.

Opinions expressed in the video solely reflect Advitya's views and not the views of his employer.

Follow Advitya on the handles below:
LinkedIn: linkedin.com/in/agemawat
YouTube: youtube.com/@AdvityaGemawat
Microsoft Research: microsoft.com/research/people/agemawat

Support the show
Peter's Film Course - https://thinkinginenglish.as.me/peters-film-course
Thinking in English Classes Website - https://thinkinginenglish.link/
---------
What insights can science provide about language learning? And how can we use this research to learn English? Keep listening to find out!
TRANSCRIPT - https://thinkinginenglish.blog/2023/07/19/252-science-and-english-learning-research-backed-tips-and-tricks-to-help-you-study/
-----------
My Links
7 Day FREE CONVERSATION CLUB TRIAL - https://www.patreon.com/thinkinginenglish
JOIN THE CONVERSATION CLUB -- https://www.patreon.com/thinkinginenglish
Take a Class (Use code TRIAL50 for 50% off) - https://thinkinginenglish.link/
Buy Me a Coffee - https://www.buymeacoffee.com/dashboard
NEW YOUTUBE Channel!!! - https://www.youtube.com/@thinkinginenglishpodcast
INSTAGRAM - thinkinginenglishpodcast (https://www.instagram.com/thinkinginenglishpodcast/)
Blog - thinkinginenglish.blog
--------
Vocabulary
To distribute (v) - to spread out tasks, resources, or information over intervals of time, space, or people.
Optimal (adj) - the most favourable or best possible.
Retention (n) - the ability to retain or remember information.
Recall (n) - the ability to remember things.
Input (n) - information or data that is fed into a system or the brain.
Memorisation (n) - the act or process of learning something so that you will remember it exactly.
Literacy (n) - the ability to read and write.
Immediate feedback (n) - prompt information given in response to an action or performance.
---
Send in a voice message: https://podcasters.spotify.com/pod/show/thinking-english/message
Support this podcast: https://podcasters.spotify.com/pod/show/thinking-english/support
Omar Flores (@OmarUFlorez on Twitter) is an expert in Language Models, using social signals to optimize the semantic interpretation and expression of these models. His methods have notably strengthened tasks such as search and recommendation. He has worked as a Machine Learning Research Scientist at Twitter, Senior Research Manager at Capital One, and Research Scientist at Intel. He received the IBM Research Innovation Award in Scalable Data Analytics, which funded his doctoral thesis. This episode delves into the advances and challenges of artificial intelligence (AI), from the teacher model to progress in language models and memory management. It discusses the social integration of AI and the ability of generative models to solve problems. It also covers in-context memory, Peru's AI strategy, data diversity, and ethical implications.
feedback @ ryan@soulsearching.in

EPISODE LINKS:
Website: https://sebastianraschka.com/
LinkedIn: https://www.linkedin.com/in/sebastianraschka
Twitter: https://twitter.com/rasbt
GitHub: https://github.com/rasbt
YouTube: https://www.youtube.com/c/sebastianraschka

PODCAST INFO:
Podcast website: https://anchor.fm/ryandsouza
Apple Podcasts: https://apple.co/3NQhg6S
Spotify: https://spoti.fi/3qJ3tWJ
Amazon Music: https://amzn.to/3P66j2B
Google Podcasts: https://bit.ly/3am7rQc
Gaana: https://bit.ly/3ANS4v1
RSS: https://anchor.fm/s/609210d4/podcast/rss
Many school librarians are exploring how they can use ChatGPT and generative AI to support teaching, learning, research, and creativity. We invited three librarians to share their early discoveries and some innovative ways they are using this rapidly expanding technology. Follow on Twitter: @joycevalenza @aelissmalespina @lucasjmaxwell @sgthomas1973 @bamradionetwork @jonHarper70bd

Joyce Valenza is Associate Professor at Rutgers University, SC&I, wrote the NeverendingSearch Blog for School Library Journal (now on hiatus), and contributes to several other library and tech publications. She speaks globally about the thoughtful use of technology in learning and the power of librarians to lead. Joyce was honored with the American Association of School Librarians' Distinguished Service Award and named an AASL Social Media Leadership Luminary. She is a Milken Educator and an American Memory Fellow. Joyce earned her doctorate in information science from the University of North Texas.

Lucas Maxwell has been working with youth in libraries for fifteen years. He currently works as a school librarian in South London, UK. In 2017 he was named the UK's School Librarian of the Year, and in 2022 he was named the UK's Reading for Pleasure Champion by the UK's Literacy Association.

Elissa Malespina is the librarian at Union High School in Union, NJ, and serves as a Board of Education Member for the South Orange - Maplewood School District, where she has been instrumental in helping to pass policies that strengthen the district's commitment to having diverse resources and materials available to all students. Elissa is at the forefront of using Web 2.0 resources and tools like Augmented Reality to make the library experience more interactive. Her work has been featured on NPR and in School Library Journal, Publishers Weekly, and the PBS documentary film School Sleuth: The Case of The Wired Classroom.

*The views expressed by Malespina do not represent her employer.
Are you interested in radical collaboration as a form of education? Our summary today works with the chapter titled Radical collaboration: flipping the paradigm on learning from 2021 by Kelly Boucher, from the book titled Collaboration, visionaries share a new way of living. This episode is a great preparation for our next interviewee, Kelly Boucher, in episode 114. Since we are investigating the future of cities, I thought it would be interesting to see how to think otherwise of life and place. This chapter presents radically collaborative ways of living well with the world and others in glorious multiplicity and complexity. I would like to highlight the three most important aspects: Pedagogy is a term usually used to describe the art of teaching and learning or the agency that joins teaching and learning, but it can evolve into an active dialogue and being in relation with the world. Being in relation can mean, according to Aboriginal perspectives, that the boundaries between humans and nature are blurred because everything is animate. Through being in relation and radical re-learning, we take up our responsibility and step into radical collaboration with the world. You can find the book through this link. Connecting episodes you might be interested in: No.099 - Interview with Noel Tighe about leaving ego outside for the best outcomes in design processes; No.114 - Interview with Kelly Boucher about radical collaboration and unlearning; You can find the transcript through this link. What was the most interesting part for you? What questions did arise for you? Let me know on Twitter @WTF4Cities or on the wtf4cities.com website where the shownotes are also available. I hope this was an interesting episode for you and thanks for tuning in. Music by Lesfm from Pixabay
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Touch reality as soon as possible (when doing machine learning research), published by Lawrence Chan on January 3, 2023 on The AI Alignment Forum.

Related to: Making Beliefs Pay Rent, The Feeling of Idea Scarcity, Micro-Feedback Loops and Learning, The Three Stages of Rigor, Research as a Stochastic Decision Process, Chapter 22 of HPMOR.

TL;DR: I think new machine learning researchers often make one of two kinds of mistakes: not making enough contact with reality, and being too reluctant to form gears-level models of ML phenomena. Stereotypically, LW/AF researchers tend to make the former mistake, while academic and industry researchers tend to make the latter kind. In this post, I discuss what I mean by "touching reality" and why it's important, speculate a bit on why people don't do this, and then give concrete suggestions.

Epistemic status: I'm pretty frustrated with how slow I write, so this is an experiment in writing fast as opposed to carefully. That being said, this is ~the prevailing wisdom amongst many ML practitioners and academics, and similar ideas have been previously discussed in the LessWrong/Alignment Forum communities, so I'm pretty confident that it's directionally correct. I also believe (less confidently) that this is good advice for most kinds of research or maybe even for life in general.

Acknowledgments: Thanks to Adrià Garriga-Alonso for feedback on a draft of this post and Justis Mills for copyediting help.

Introduction: Broadly speaking, I think new researchers in machine learning tend to make two kinds of mistakes:

Not making contact with reality. This is the failure mode where a new researcher reads a few papers that their friends are excited about, forms an ambitious hypothesis about how to solve a big problem in machine learning, and then spends months drafting a detailed plan. Unfortunately, after months of effort, our new researcher realizes that the components they were planning to use do not work nearly as well as expected, and as a result they've wasted months of effort on a project that wasn't going to succeed.

Not being willing to make gears-level models. This is the failure mode where a new researcher decides to become agnostic to why anything happens, and believes empirical results and only empirical results even when said results don't "make sense" on reflection. The issue here is that they tend to be stuck implementing an inefficient variant of grad student descent, only able to make small amounts of incremental progress via approximate blind search, and end up doing whatever is popular at the moment.

That's not to say that these mistakes are mutually exclusive: embarrassingly, I think I've managed to fail in both ways in the past. That being said, this post is about the first failure mode, which I think is far more common in our community than the second. (Though I might write about the second if there's enough interest!)

Here, by "touching reality", I mean running experiments where you check that your beliefs are right, either via writing code and running empirical ML experiments, or (less commonly) grounding your ideas in a detailed formalism (to the level where you can write proofs for new, non-trivial theorems about said ideas).
I don't think writing code or inventing a formalism qualify by themselves (though they are helpful); touching reality requires receiving actual concrete feedback on your ideas.

Why touch reality? I think there are four main reasons why you should do this:

Your ideas may be bad. When you're new to a field, it's probably the case that you don't fully understand all of the key results and concepts in the field. As a result, it's very likely the case that the ideas you come up with are bad. This is especially true for fields like machine learning that have significant amounts of tacit knowledge. ...
So it's been a couple of weeks since I've released a podcast episode, but in case you didn't know, all of the episodes are streamed on Crowdcast, and I also post them to YouTube. The last few episodes were visual; last week's episode was based on icebreakers using slides, and before that, I streamed live from DevLearn in Las Vegas. I can't format those appropriately for a podcast, but you can find them on www.TheTLDC.com or on our YouTube channel. This episode was great for a podcast. Jo Cook did a TLDC takeover to announce that she and Jane Daly have released a research report on Hybrid and Virtual Learning. Jo is an amazing Virtual Learning facilitator, and the report findings led to lots of community discussion, which you'll hear Jo share. Give it a listen - there's lots to learn here for everyone from VILT Producers to Zoom attendees. You can download the report here: https://virtualresearchinsights.com/report2022/ And check out their infographic for the report here: https://virtualresearchinsights.com/2022/09/26/infographic-for-improving-virtual-and-hybrid-learning/
A new diversity toolkit is providing resources for long-term care organizations to help raise awareness of issues of equity, diversity and inclusion. The Embracing Diversity: A Toolkit for Supporting Inclusion in Long-Term Care Homes is an interactive resource with print and online components that gives long-term care homes practical steps to nurture diverse and welcoming communities.

We spoke with Michelle Fleming and Ashley Flanagan about the toolkit and the importance of understanding issues of equity, diversity and inclusion in long-term care. Michelle Fleming is a senior knowledge broker at the Ontario Centres for Learning Research and Innovation in Long-Term Care. Dr. Ashley Flanagan is a research fellow in Diversity in Aging at the National Institute on Aging.

For more information about the toolkit, contact Michelle at: MFleming@bruyere.org

Learn more about the Strengthening a Palliative Approach to Long-Term Care project at: https://spaltc.ca/
What does the work of a Machine Learning Research Engineer at Allegro look like? How do we manage to work effectively with enormous amounts of data and millions of parameters? What are the language models trained on a corpus of Allegro data, and how do they influence the projects we deliver? How do we collaborate with the academic community? And finally - where, and in what form, can you find the knowledge shared by our Machine Learning Research team?

We discussed all of this with the next guest of the Allegro Tech Podcast, Riccardo Belluzzo, who works as a Machine Learning Research Engineer at Allegro.

CONTACT LINKS / LINKS MENTIONED IN THE PODCAST:
https://www.linkedin.com/in/riccardo-belluzzo/
Allegro ML Research - https://ml.allegro.tech/
"Do you speak Allegro?" Large Scale Language Modeling - https://www.youtube.com/watch?v=6T-R4kgIbBs
HerBERT on GitHub - https://github.com/allegro/HerBERT

BIO:
Riccardo is a Research Engineer in the Allegro ML team and specializes in Natural Language Processing and Understanding. Riccardo is also a music freak, playing guitar in his free time and running a podcast about underground music and emerging artists.
In this episode, we speak to Shawna Young and Mitchel Resnick of the Scratch Foundation, which runs the largest creative computing community in the world around the Scratch programming language.

Recommended Resources:
Mitch Resnick and Ken Robinson, Lifelong Kindergarten
2021 Scratch Foundation Annual Report

Shawna Young is the Executive Director of the Scratch Foundation. Before coming to Scratch, Young led the Duke University Talent Identification Program (Duke TIP), one of the largest academic talent searches, with over 450,000 K-12 students and over 3 million alumni. She also spearheaded the expansion of the Office of Engineering Outreach Programs (OEOP) at MIT, serving as the Executive Director for eight years. The OEOP provides rigorous educational opportunities in science, technology, engineering, and mathematics (STEM) to K-12 students from primarily underrepresented and underserved backgrounds. Young started her career as a public high school science teacher in North Carolina, then worked as a curriculum developer at the Educational Development Center.

Mitchel Resnick is the LEGO Papert Professor of Learning Research and Director of the Lifelong Kindergarten group at the Massachusetts Institute of Technology (MIT) Media Lab, which developed the Scratch programming software and online community, the world's leading coding platform for kids. His group has also collaborated for many years with the LEGO Company and the LEGO Foundation on the development of new educational ideas and products, including LEGO Mindstorms and LEGO WeDo robotics kits. Resnick co-founded the Computer Clubhouse project, an international network of 100 after-school learning centers, where youth from low-income communities learn to express themselves creatively with new technologies.
This week, Anna (https://twitter.com/annarrose) and Tarun (https://twitter.com/tarunchitra) chat with Florian Tramèr (https://twitter.com/florian_tramer), Assistant Professor at ETH Zurich (https://ethz.ch/en.html). They discuss his earlier work on side channel attacks on privacy blockchains, as well as his academic focus on Machine Learning (ML) and adversarial research. They define some key ML terms, tease out some of the nuances of ML training and models, chat about zkML and other privacy environments where ML can be trained, and look at why the security around ML will be important as these models become increasingly used in production.

Here are some additional links for this episode:
* Episode 228: Catch-up at DevConnect AMS with Tarun, Guillermo and Brendan (https://zeroknowledge.fm/228a/)
* Florian Tramèr's Github (https://github.com/ftramer)
* Florian Tramèr's Publications & Papers (https://floriantramer.com/publications/)
* ETH Zurich (https://ethz.ch/en.html)
* DevConnect (https://devconnect.org/)
* Tarun Chitra's Github (https://github.com/pluriholonomic)
* Single Secret Leader Election by Dan Boneh, Saba Eskandarian, Lucjan Hanzlik, and Nicola Greco (https://eprint.iacr.org/2020/025)
* GasToken: A Journey Through Blockchain Resource Arbitrage by Tramèr, Daian, Breidenbach and Juels (https://floriantramer.com/docs/slides/CESC18gastoken.pdf)
* Enter the Hydra: Towards Principled Bug Bounties and Exploit-Resistant Smart Contracts by Tramèr, Daian, Breidenbach and Juels (https://eprint.iacr.org/2017/1090)
* Ronin Bridge Hack – Community Alert: Ronin Validators Compromised (https://roninblockchain.substack.com/p/community-alert-ronin-validators?s=w)
* InstaHide: Instance-hiding Schemes for Private Distributed Learning, Huang et al. 2020. (https://arxiv.org/abs/2010.02772)
* Is Private Learning Possible with Instance Encoding? (https://arxiv.org/abs/2011.05315)
* OpenAI's GPT-3 model (https://openai.com/api/)
* OpenAI's GPT-2 model (https://openai.com/blog/tags/gpt-2/)
* The Part-Time Parliament, Lamport, 1998. (https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf)
* You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion (https://arxiv.org/abs/2007.02220)

ZK Whiteboard Sessions (https://zkhack.dev/whiteboard/) – as part of ZK Hack and powered by Polygon – a new series of educational videos that will help you get onboarded into the concepts and terms that we talk about on the ZK front.

ZK Jobs Board (https://jobsboard.zeroknowledge.fm/) – has a fresh batch of open roles from ZK-focused projects. Find your next opportunity working in ZK!

Today's episode is sponsored by Mina Protocol (https://minaprotocol.com/). With Mina's zero knowledge smart contracts – or zkApps – developers can create apps that offer privacy, security, and verifiability for your users. Head to minaprotocol.com/zkpodcast (http://minaprotocol.com/zkpodcast) to learn about their developer bootcamps and open grants.

If you like what we do:
* Find all our links here! @ZeroKnowledge | Linktree (https://linktr.ee/zeroknowledge)
* Subscribe to our podcast newsletter (https://zeroknowledge.substack.com)
* Follow us on Twitter @zeroknowledgefm (https://twitter.com/zeroknowledgefm)
* Join us on Telegram (https://zeroknowledge.fm/telegram)
* Catch us on Youtube (https://zeroknowledge.fm/)
* Head to the ZK Community Forum (https://community.zeroknowledge.fm/)
* Support our Gitcoin Grant (https://zeroknowledge.fm/gitcoin-grant-329-zkp-2)
This is the second part of my deep dive into the Lucas Research Study on Project Based Learning. You can find the first part in episode 88. The research findings absolutely dispel the myth that PBL doesn't improve test scores. Project Based Learning improves AP test scores for high school students and creates noticeable gains in middle school and elementary school students.

It also increased engagement for the teachers. Having a culture with sustained professional learning and coaching for teachers and staff creates an environment where PBL is done right and increases outcomes and enjoyment for learners and teachers. The study shows that PBL was beneficial for all learners and staff.

Episode Highlights:
[02:10] "Need to Know" - How does our PBL school get better throughout the school year?
[02:50] Having a staff member or PBL coach that specifically looks at Project Based Learning and supports the staff is essential.
[04:51] Administrators also need an administrative coach. You need someone outside of your staff and district to push you.
[06:04] Project Based Learning boosts student learning in AP courses.
[07:48] Many of the participants were from underserved communities, and the study was created to help promote equity. The program increased test performance when using Project Based Learning.
[08:31] The myth that Project Based Learning does not enhance test scores has been flipped on its head.
[09:37] Your staff has to have professional learning in order to implement PBL properly.
[10:24] PBL also led to gains in math, science, and vocabulary learning for middle schoolers. The teachers were also more engaged.
[14:48] PBL is beneficial for all students.

Resources & Links Related to this Episode
What is PBL?
Ask Ryan
Magnify Learning YouTube
Project Based Learning Stories and Structures: Wins, Fails, and Where to Start
Magnify Learning
Ryan Steuer Twitter @ryansteuer
PBL Workshops For Schools & Districts
Community Partner Resources
Lucas Research Briefs on Project Based Learning
PBL Leadership - Project Based Learning Research Part 1 | Episode 88
PBL Guest - Danny Bauer of Better Leaders Better Schools | Episode 87
I've seen lives turned around by Project Based Learning. I have loads of success stories from learners, teachers, and administrators. There is also research to back up the benefits of Project Based Learning. This leadership episode is going to be a two-part episode focusing on Research on Project Based Learning by Lucas Education Research. The brief is linked to in the resources below.

This episode takes a high-level view, and then I break it down in the upcoming episode 90 as we take a deeper look. We also have a "need to know" that focuses on the question of where do I get started with PBL? I share a free resource in the show notes and other options for learning about Project Based Learning. I also share the best way to learn by visiting a school and seeing it in action.

Episode Highlights:
[02:54] Where do I get started with PBL? Educate yourself with books, podcasts, videos, and visiting a school that is doing PBL.
[04:01] On your visit ask good questions and be sure to talk to the learners.
[05:55] It's a powerful lever for improving equity. When it's used significant learning occurs.
[07:05] The study shows that when underserved students engage in PBL they learn significantly.
[09:22] Learners miss out on opportunities and authentic learning experiences when not involved in Project Based Learning.
[11:59] Curriculum needs to be flexible enough to pull in additional resources to help your students connect it to the work. Connections are a huge benefit.
[13:06] Belonging in the school community. We need to build the culture so students are in a safe environment.
[15:37] PBL also creates strong learning opportunities for teachers. Sustained high-quality professional learning.

Resources & Links Related to this Episode
What is PBL?
Ask Ryan
Magnify Learning YouTube
Project Based Learning Stories and Structures: Wins, Fails, and Where to Start
Magnify Learning
Ryan Steuer Twitter @ryansteuer
PBL Workshops For Schools & Districts
Community Partner Resources
Lucas Research Briefs on Project Based Learning
If you have a child (a young sibling, cousin, student, or even friend) in your life, chances are you know about Scratch—the wildly popular graphical programming language that kids use to dream up interactive stories, games, and animations. But it's entirely possible you don't know the man behind Scratch, Mitch Resnick, the LEGO Papert Professor of Learning Research at the MIT Media Lab, or that the Scratch Foundation has a strong, long-term relationship with EPAM. After listening to the latest iteration of The Resonance Test, in which Resnick and Shamilka Samarasinha, EPAM's Global Head of Corporate Social Responsibility, answer questions from producer Ken Gordon, that ignorance will instantly evaporate. The episode digs into the reasons why Scratch is, and always has been, a free program ("We didn't want there to be barriers for young people to get access to Scratch," says Resnick) and how Scratch helped children during Covid (the first year of the pandemic saw the number of Scratch projects double and the number of comments the kids wrote on each other's projects increase fivefold). You'll hear about Scratch and the kids of Ukraine. Says Resnick: "In early March, 10 days after the invasion of Ukraine, I got a message from an educator in Ukraine named Olesia Vlasii." Vlasii had the idea to use Scratch to create what she called Waves of Kindness. The result: a Waves of Kindness gallery on the Scratch website "where kids from around the world could upload projects about how you could spread kindness," says Resnick. Within days, Waves of Kindness featured "literally thousands of projects from kids around the world." The conversation also touches on how Scratch engages a wide ecosystem of learners to promote diversity and inclusion in expanding education and how EPAM's partnership with Scratch fits into our other ESG activities. "In the social impact space, obviously education is one of our core areas," says Samarasinha. Finally, Resnick and Samarasinha talk about the evolving relationship between the Scratch Foundation and EPAM—our EPAM E-Kids program has expanded from four to 19 countries—and the upcoming virtual Scratch Conference. It's a conversation that your kid will want you to hear. So listen!

Host: Alison Kotin
Engineer: Kyp Pilalas
Producer: Ken Gordon
Vin Vashishta is a chief data officer and AI strategist at V Squared, a company he founded in 2012 that provides AI strategy, transformation, and data organizational build-out services. He teaches data professionals about strategy, communications, business acumen, and applied machine learning research methods. Vin has 130k+ followers on LinkedIn talking about AI, analytics, and strategy. His website: https://www.datascience.vin/

Follow @DalianaLiu for more updates on data science and this show. If you enjoy this episode, subscribe and leave a 5-star review :)

Topics:
Machine learning problem solving
Case study: how to find a pricing strategy through data science
Applied ML research methods
"Human machine team"
Causal inference resources/books
How to earn trust from customers
Future of data science/machine learning
MLOps vs QA (quality control)
How to lead without authority
Mistakes he made in his career
What he learned from his mentor
Shift in data science over the past 10 years
Adam and Kathryn talk about Pilates, yoga and how we can integrate current evidence into the practices we love without throwing the baby out with the bath water. Adam is on his way to becoming a physical therapist and along the path has incorporated heavy weights into his long-time Pilates career. They also talk about motor learning and how we can cue our clients to be as successful as possible during exercise. ___ If this conversation excites you, click here to learn more about the Mindful Strength Teacher's Immersion 2.0. Early bird ends July 1st. This 3-month continuing education course for yoga, pilates and fitness teachers begins in October. ___ Adam lives in Long Beach, California, and has been teaching Pilates since 2009. He is an internationally recognized Pilates teacher trainer for Breathe Education, holds a Bachelor of Science in Exercise Science and is currently studying Physical Therapy at the University of St. Augustine for Health Sciences. Adam is the founder of various live online workshops such as "Motor Learning Strategies" and "Pain Science & Pilates". To learn more about Adam click here. To follow Adam on IG click here.
#dsi #search #google

Search engines work by building an index and then looking up things in it. Usually, that index is a separate data structure. In keyword search, we build and store reverse indices. In neural search, we build nearest-neighbor indices. This paper does something different: It directly trains a Transformer to return the ID of the most relevant document. No similarity search over embeddings or anything like this is performed, and no external data structure is needed, as the entire index is essentially captured by the model's weights. The paper experiments with various ways of representing documents and training the system, which works surprisingly well!

Sponsor: Diffgram https://diffgram.com?ref=yannic

OUTLINE:
0:00 - Intro
0:45 - Sponsor: Diffgram
1:35 - Paper overview
3:15 - The search problem, classic and neural
8:15 - Seq2seq for directly predicting document IDs
11:05 - Differentiable search index architecture
18:05 - Indexing
25:15 - Retrieval and document representation
33:25 - Training DSI
39:15 - Experimental results
49:25 - Comments & Conclusions

Paper: https://arxiv.org/abs/2202.06991

Abstract: In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model. To this end, we introduce the Differentiable Search Index (DSI), a new paradigm that learns a text-to-text model that maps string queries directly to relevant docids; in other words, a DSI model answers queries directly using only its parameters, dramatically simplifying the whole retrieval process. We study variations in how documents and their identifiers are represented, variations in training procedures, and the interplay between models and corpus sizes. Experiments demonstrate that given appropriate design choices, DSI significantly outperforms strong baselines such as dual encoder models. Moreover, DSI demonstrates strong generalization capabilities, outperforming a BM25 baseline in a zero-shot setup.

Authors: Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler

Links:
Merch: store.ykilcher.com
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
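To make the "index in the weights" idea from the DSI episode above concrete, here is a minimal, hypothetical sketch in Python: a single seq2seq model is fine-tuned to map document text to a docid string (indexing) and query text to the same docid (retrieval), so lookup becomes plain text generation. The t5-small checkpoint, the atomic string docids, the toy corpus, and the training-loop settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal DSI-style sketch: one seq2seq Transformer learns both indexing
# (document text -> docid) and retrieval (query -> docid). Assumed toy setup.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Toy corpus: docid -> document text (hypothetical examples).
corpus = {"doc_1": "the eiffel tower is in paris",
          "doc_2": "transformers are neural sequence models"}
# Training pairs: indexing examples plus a couple of retrieval examples.
pairs = [(text, docid) for docid, text in corpus.items()]
pairs += [("where is the eiffel tower", "doc_1"),
          ("what are transformers", "doc_2")]

model.train()
for _ in range(20):  # a few passes over the toy data
    for src, docid in pairs:
        enc = tok(src, return_tensors="pt")
        labels = tok(docid, return_tensors="pt").input_ids
        loss = model(**enc, labels=labels).loss  # standard seq2seq cross-entropy
        loss.backward()
        opt.step()
        opt.zero_grad()

# Retrieval is just generation: the decoded string is the predicted docid.
model.eval()
query = tok("famous tower in paris", return_tensors="pt")
pred = model.generate(**query, max_new_tokens=5)
print(tok.decode(pred[0], skip_special_tokens=True))
```

The design choices the paper actually studies are the ones glossed over here: how docids are represented (atomic tokens versus structured strings), how documents are truncated or summarized for indexing, and how indexing and retrieval examples are mixed during training.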
Today I had the pleasure of interviewing Yannic Kilcher. Yannic is a YouTuber covering state-of-the-art Machine Learning research topics. He has a PhD from ETH Zurich and is currently the CTO of DeepJudge, a LegalTech NLP startup. In this episode we learn about how Yannic decided on a PhD in AI, how he is able to make advanced research so digestible, and the reason why he wears sunglasses on camera. I hope you enjoy the episode; I know I enjoyed our conversation.
“Why is the sky blue?” “Why do people get sick?” “Why aren't there any more dinosaurs?” Sometimes it feels like children never stop asking questions. And they shouldn't. A recent OECD International Early Learning and Child Wellbeing study shows that children who are curious have stronger language and number skills, and better self-control. So how do we keep students curious and creative even after they've outgrown kindergarten? Rowena Phair, senior analyst at the OECD, and Mitch Resnick, Professor of Learning Research at the MIT Media Lab, discuss. Host: Clara Young; Producer: Ilse Sánchez
Mathilde Caron is a PhD candidate at the French National Institute for Research in Digital Science and Technology and at Facebook AI (Meta AI). She does the majority of her research in the field of machine learning called self-supervised learning. She is first author on several important academic papers in the space. Her work: https://scholar.google.com/citations?user=eiB0s-kAAAAJ&hl=fr You can donate to this podcast at this bitcoin address: 33wejXuGGDtQj9GPwCgjwPxPq4dc4muZjg --- This episode is sponsored by · Anchor: The easiest way to make a podcast. https://anchor.fm/app Support this podcast: https://anchor.fm/idris-sunmola/support
#grafting #adam #sgd The last years in deep learning research have given rise to a plethora of different optimization algorithms, such as SGD, AdaGrad, Adam, LARS, LAMB, etc., which all claim to have their special peculiarities and advantages. In general, all algorithms modify two major things: the (implicit) learning rate schedule, and a correction to the gradient direction. This paper introduces grafting, which makes it possible to transfer the induced learning rate schedule of one optimizer to another. In doing so, the paper shows that much of the benefit of adaptive methods (e.g. Adam) is actually due to this schedule, and not necessarily to the gradient direction correction. Grafting allows for more fundamental research into differences and commonalities between optimizers, and a derived version of it makes it possible to compute static learning rate corrections for SGD, which potentially allows for large savings of GPU memory. OUTLINE 0:00 - Rant about Reviewer #2 6:25 - Intro & Overview 12:25 - Adaptive Optimization Methods 20:15 - Grafting Algorithm 26:45 - Experimental Results 31:35 - Static Transfer of Learning Rate Ratios 35:25 - Conclusion & Discussion Paper (OpenReview): https://openreview.net/forum?id=FpKgG... Old Paper (Arxiv): https://arxiv.org/abs/2002.11803 Our Discord: https://discord.gg/4H8xxDF Abstract: In the empirical science of training large neural networks, the learning rate schedule is a notoriously challenging-to-tune hyperparameter, which can depend on all other properties (architecture, optimizer, batch size, dataset, regularization, ...) of the problem. In this work, we probe the entanglements between the optimizer and the learning rate schedule. We propose the technique of optimizer grafting, which allows for the transfer of the overall implicit step size schedule from a tuned optimizer to a new optimizer, preserving empirical performance. This provides a robust plug-and-play baseline for optimizer comparisons, leading to reductions to the computational cost of optimizer hyperparameter search. Using grafting, we discover a non-adaptive learning rate correction to SGD which allows it to train a BERT model to state-of-the-art performance. Besides providing a resource-saving tool for practitioners, the invariances discovered via grafting shed light on the successes and failure modes of optimizers in deep learning. Authors: Anonymous (Under Review) Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
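As a rough illustration of the grafting step itself (my own toy code, not the paper's; I assume a single global norm here, whereas the paper also studies per-layer grafting), the update takes its direction from one optimizer and its step norm from the other:

```python
import torch

def sgd_delta(grad, lr=0.1):
    return -lr * grad

class AdamState:
    def __init__(self, shape):
        self.m, self.v, self.t = torch.zeros(shape), torch.zeros(shape), 0

def adam_delta(grad, st, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    st.t += 1
    st.m = b1 * st.m + (1 - b1) * grad
    st.v = b2 * st.v + (1 - b2) * grad ** 2
    mhat, vhat = st.m / (1 - b1 ** st.t), st.v / (1 - b2 ** st.t)
    return -lr * mhat / (vhat.sqrt() + eps)

# Grafting "Adam#SGD": step norm from Adam (magnitude donor), direction from SGD.
w = torch.randn(10)
state = AdamState(w.shape)
for _ in range(300):
    grad = 2 * (w - 1.0)                        # gradient of the toy loss ||w - 1||^2
    d_m = adam_delta(grad, state)               # magnitude donor
    d_d = sgd_delta(grad)                       # direction donor
    w = w + d_m.norm() * d_d / (d_d.norm() + 1e-12)
print((w - 1.0).norm())                         # SGD's direction driven at Adam's step size
```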
#deeplearning #backpropagation #simulation More and more systems are made differentiable, which means that accurate gradients of these systems' dynamics can be computed exactly. While this development has led to a lot of advances, there are also distinct situations where backpropagation can be a very bad idea. This paper characterizes a few such systems in the domain of iterated dynamical systems, often including some source of stochasticity, resulting in chaotic behavior. In these systems, it is often better to use black-box estimators for gradients than computing them exactly. OUTLINE: 0:00 - Foreword 1:15 - Intro & Overview 3:40 - Backpropagation through iterated systems 12:10 - Connection to the spectrum of the Jacobian 15:35 - The Reparameterization Trick 21:30 - Problems of reparameterization 26:35 - Example 1: Policy Learning in Simulation 33:05 - Example 2: Meta-Learning Optimizers 36:15 - Example 3: Disk packing 37:45 - Analysis of Jacobians 40:20 - What can be done? 45:40 - Just use Black-Box methods Paper: https://arxiv.org/abs/2111.05803 Abstract: Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics simulation to training learned optimizers. We trace this failure to the spectrum of the Jacobian of the system under study, and provide criteria for when a practitioner might expect this failure to spoil their differentiation based optimization algorithms. Authors: Luke Metz, C. Daniel Freeman, Samuel S. Schoenholz, Tal Kachman Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
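A tiny, self-contained way to see this failure mode (my own example, not taken from the paper): differentiate the final state of a logistic map with respect to its parameter and watch the gradient grow, and eventually overflow, as the unrolled horizon increases in the chaotic regime.

```python
# The gradient through an iterated map is roughly a product of per-step Jacobians,
# so it explodes exponentially with the horizon when the dynamics are chaotic.
import torch

def final_state(r, steps, x0=0.5):
    x = torch.tensor(x0)
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

for steps in [10, 50, 100, 200]:
    r = torch.tensor(3.9, requires_grad=True)   # chaotic regime of the logistic map
    final_state(r, steps).backward()
    print(steps, float(r.grad.abs()))           # magnitude blows up (or overflows) with depth
```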
#machinelearning #ardm #generativemodels Diffusion models have made large advances in recent months as a new type of generative models. This paper introduces Autoregressive Diffusion Models (ARDMs), which are a mix between autoregressive generative models and diffusion models. ARDMs are trained to be agnostic to the order of autoregressive decoding and give the user a dynamic tradeoff between speed and performance at decoding time. This paper applies ARDMs to both text and image data, and as an extension, the models can also be used to perform lossless compression. OUTLINE: 0:00 - Intro & Overview 3:15 - Decoding Order in Autoregressive Models 6:15 - Autoregressive Diffusion Models 8:35 - Dependent and Independent Sampling 14:25 - Application to Character-Level Language Models 18:15 - How Sampling & Training Works 26:05 - Extension 1: Parallel Sampling 29:20 - Extension 2: Depth Upscaling 33:10 - Conclusion & Comments Paper: https://arxiv.org/abs/2110.02037 Abstract: We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions. ARDMs are simple to implement and easy to train. Unlike standard ARMs, they do not require causal masking of model representations, and can be trained using an efficient objective similar to modern probabilistic diffusion models that scales favourably to highly-dimensional data. At test time, ARDMs support parallel generation which can be adapted to fit any given generation budget. We find that ARDMs require significantly fewer steps than discrete diffusion models to attain the same performance. Finally, we apply ARDMs to lossless compression, and show that they are uniquely suited to this task. Contrary to existing approaches based on bits-back coding, ARDMs obtain compelling results not only on complete datasets, but also on compressing single data points. Moreover, this can be done using a modest number of network calls for (de)compression due to the model's adaptable parallel generation. Authors: Emiel Hoogeboom, Alexey A. Gritsenko, Jasmijn Bastings, Ben Poole, Rianne van den Berg, Tim Salimans Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
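For a feel of the order-agnostic training step (an illustrative sketch, not the authors' code; the toy model, mask token and uniform reweighting follow the generic order-agnostic recipe): sample a random generation order and a random step, reveal the tokens that come before that step, and predict all remaining positions in parallel with a reweighted cross-entropy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MASK = 27, 26                       # toy alphabet plus one mask/absorbing token

class ToyModel(nn.Module):                 # stand-in for any order-agnostic sequence model
    def __init__(self, d=64):
        super().__init__()
        self.emb, self.head = nn.Embedding(VOCAB, d), nn.Linear(d, VOCAB)
    def forward(self, x):                  # (B, D) -> (B, D, VOCAB)
        return self.head(self.emb(x))

def ardm_loss(model, x):
    B, D = x.shape
    sigma = torch.argsort(torch.rand(B, D), dim=1)    # random generation order per sample
    t = torch.randint(1, D + 1, (B, 1))               # random step within that order
    observed = sigma < t                              # positions already "generated"
    logits = model(torch.where(observed, x, torch.full_like(x, MASK)))
    nll = F.cross_entropy(logits.transpose(1, 2), x, reduction="none")
    per_pos = (nll * ~observed).sum(1) / (~observed).sum(1).clamp(min=1)
    return (D / (D - t.squeeze(1) + 1) * per_pos).mean()   # reweighted parallel prediction

x = torch.randint(0, 26, (8, 16))
print(ardm_loss(ToyModel(), x))
```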
#efficientzero #muzero #atari Reinforcement Learning methods are notoriously data-hungry. Notably, MuZero learns a latent world model just from scalar feedback of reward- and policy-predictions, and therefore relies on scale to perform well. However, most RL algorithms fail when presented with very little data. EfficientZero makes several improvements over MuZero that allows it to learn from astonishingly small amounts of data and outperform other methods by a large margin in the low-sample setting. This could be a staple algorithm for future RL research. OUTLINE: 0:00 - Intro & Outline 2:30 - MuZero Recap 10:50 - EfficientZero improvements 14:15 - Self-Supervised consistency loss 17:50 - End-to-end prediction of the value prefix 20:40 - Model-based off-policy correction 25:45 - Experimental Results & Conclusion Paper: https://arxiv.org/abs/2111.00210 Code: https://github.com/YeWR/EfficientZero Note: code not there yet as of release of this video Abstract: Reinforcement learning has achieved great success in many applications. However, sample efficiency remains a key challenge, with prominent methods requiring millions (or even billions) of environment steps to train. Recently, there has been significant progress in sample efficient image-based RL algorithms; however, consistent human-level performance on the Atari game benchmark remains an elusive goal. We propose a sample efficient model-based visual RL algorithm built on MuZero, which we name EfficientZero. Our method achieves 190.4% mean human performance and 116.0% median performance on the Atari 100k benchmark with only two hours of real-time game experience and outperforms the state SAC in some tasks on the DMControl 100k benchmark. This is the first time an algorithm achieves super-human performance on Atari games with such little data. EfficientZero's performance is also close to DQN's performance at 200 million frames while we consume 500 times less data. EfficientZero's low sample complexity and high performance can bring RL closer to real-world applicability. We implement our algorithm in an easy-to-understand manner and it is available at this https URL. We hope it will accelerate the research of MCTS-based RL algorithms in the wider community. Authors: Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
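Of the three improvements, the self-supervised consistency loss is the easiest to isolate; here is an illustrative sketch (not the official code; the projector and predictor networks and their sizes are assumptions) of the SimSiam-style term that pulls the dynamics model's predicted next latent towards the encoder's latent of the real next observation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def consistency_loss(pred_next_latent, target_next_latent, projector, predictor):
    p = predictor(projector(pred_next_latent))       # online branch (dynamics prediction)
    with torch.no_grad():
        z = projector(target_next_latent)            # target branch with stop-gradient
    return -F.cosine_similarity(p, z, dim=-1).mean()

# Toy usage with 256-d latents and small MLP projector/predictor.
proj = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))
pred = nn.Linear(256, 256)
print(consistency_loss(torch.randn(32, 256), torch.randn(32, 256), proj, pred))
```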
BIO: Andre Hsu is a thinker and business strategist based in Singapore. He is the author of three books about qualities, mindsets, and frameworks relevant to business, which he observed in several business tycoons who left a deep and lasting impact on his life. STORY: Andre partnered with a software company that had an excellent business idea, but the partners were poor at managing the company and selling the product, so it failed. Andre lost his entire investment. LEARNING: Research the people who own a business as much as you research the business. “No matter how great the idea is, you'll not succeed if you cannot trust your partner.”Andre Hsu Guest profilehttps://www.linkedin.com/in/andrehsujs/ (Andre Hsu) is a thinker and business strategist based in Singapore. He started his entrepreneurial journey at 17, working on a real estate project while juggling three academic degrees completed concurrently in Australia. He is the author of three books about qualities, mindsets, and frameworks relevant to business, which he observed in several business tycoons who left a deep and lasting impact on his life. Andre likes to use multiple techniques to read and predict people, assess situations, and formulate strategies for property deals, negotiations, and deal structuring related to the underlying assets. He likes to educate business associates, sharing the knowledge, insights, and reasoning behind his strategies so that they can implement them. He looks forward to doing this with people who share similar values and visions. Worst investment everAfter finishing university, Andre went into the family business, and after a while, he decided to venture into his own business pursuits. He got involved in a small software company that was dealing with Point-of-Service systems. At the time, this was a very lucrative business because not many companies were using POS systems. Andre was, therefore, happy to partner with the company and invest in this venture. The mistake Andre made was investing in people who didn't take the business part of the venture seriously. They created a good product, but they never invested in sales or management, so the product never really took off. Lessons learnedYou need to research the people who own a business as much as you research the business. Andrew's takeawaysWhen investing in a startup, you need to look for trust, a good idea, the ability to execute the idea, and capital. Actionable adviceIf you cannot trust the person you want to partner with, forget the idea. No matter how great the idea is, you'll not succeed if you cannot trust your partner. No. 1 goal for the next 12 monthsAndre Hsu's number one goal for the next 12 months is to step back from the business and have his partners run it to have more time to do strategic thinking and come up with new ideas.
[spp-transcript] Connect with Andre Hsuhttps://www.linkedin.com/in/andrehsujs/ (LinkedIn) https://amzn.to/3nNgf64 (Book) Andrew's bookshttps://amzn.to/3qrfHjX (How to Start Building Your Wealth Investing in the Stock Market) https://amzn.to/2PDApAo (My Worst Investment Ever) https://amzn.to/3v6ip1Y (9 Valuation Mistakes and How to Avoid Them) https://amzn.to/3emBO8M (Transform Your Business with Dr.Deming's 14 Points) Andrew's online programshttps://valuationmasterclass.com/ (Valuation Master Class) https://academy.astotz.com/courses/how-to-start-building-your-wealth-investing-in-the-stock-market (How to Start Building Your Wealth Investing in the Stock Market) https://academy.astotz.com/courses/finance-made-ridiculously-simple (Finance Made Ridiculously Simple) https://academy.astotz.com/courses/gp (Become a Great Presenter and Increase Your Influence) https://academy.astotz.com/courses/transformyourbusiness (Transform Your Business with Dr. Deming's 14 Points) Connect with Andrew Stotz:https://www.astotz.com/ (astotz.com) https://www.linkedin.com/in/andrewstotz/ (LinkedIn)...
#tvae #topographic #equivariant Variational Autoencoders model the latent space as a set of independent Gaussian random variables, which the decoder maps to a data distribution. However, this independence is not always desired, for example when dealing with video sequences, we know that successive frames are heavily correlated. Thus, any latent space dealing with such data should reflect this in its structure. Topographic VAEs are a framework for defining correlation structures among the latent variables and induce equivariance within the resulting model. This paper shows how such correlation structures can be built by correctly arranging higher-level variables, which are themselves independent Gaussians. OUTLINE: 0:00 - Intro 1:40 - Architecture Overview 6:30 - Comparison to regular VAEs 8:35 - Generative Mechanism Formulation 11:45 - Non-Gaussian Latent Space 17:30 - Topographic Product of Student-t 21:15 - Introducing Temporal Coherence 24:50 - Topographic VAE 27:50 - Experimental Results 31:15 - Conclusion & Comments Paper: https://arxiv.org/abs/2109.01394 Code: https://github.com/akandykeller/topog... Abstract: In this work we seek to bridge the concepts of topographic organization and equivariance in neural networks. To accomplish this, we introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables. We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST. Furthermore, through topographic organization over time (i.e. temporal coherence), we demonstrate how predefined latent space transformation operators can be encouraged for observed transformed input sequences -- a primitive form of unsupervised learned equivariance. We demonstrate that this model successfully learns sets of approximately equivariant features (i.e. "capsules") directly from sequences and achieves higher likelihood on correspondingly transforming test sequences. Equivariance is verified quantitatively by measuring the approximate commutativity of the inference network and the sequence transformations. Finally, we demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks. Authors: T. Anderson Keller, Max Welling Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
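One way to read the product-of-Student-t construction behind this (my own toy sketch, not the authors' code; the neighbourhood matrix and window size are assumptions): each latent is a Gaussian divided by the square root of a locally pooled sum of squared Gaussians, so neighbouring latents share denominator terms and their magnitudes become correlated even though all the underlying variables are independent.

```python
import torch

def topographic_t(u, v, window=5):
    d = u.shape[-1]
    idx = torch.arange(d)
    W = ((idx[None, :] - idx[:, None]).abs() <= window // 2).float()   # local neighbourhoods
    return u / torch.sqrt(v.pow(2) @ W + 1e-8)

u, v = torch.randn(10000, 64), torch.randn(10000, 64)
t = topographic_t(u, v)
print(torch.corrcoef(t.abs().T)[0, :5])   # |t_0| correlates with nearby units, not with far-away ones
```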
#attention #transformer #fastformer Transformers have become the dominant model class in the last few years for large data, but their quadratic complexity in terms of sequence length has plagued them until now. Fastformer claims to be the fastest and most performant linear attention variant, able to consume long contexts at once. This is achieved by a combination of additive attention and elementwise products. While initial results look promising, I have my reservations... OUTLINE: 0:00 - Intro & Outline 2:15 - Fastformer description 5:20 - Baseline: Classic Attention 10:00 - Fastformer architecture 12:50 - Additive Attention 18:05 - Query-Key element-wise multiplication 21:35 - Redundant modules in Fastformer 25:00 - Problems with the architecture 27:30 - Is this even attention? 32:20 - Experimental Results 34:50 - Conclusion & Comments Paper: https://arxiv.org/abs/2108.09084 Abstract: Transformer is a powerful model for text understanding. However, it is inefficient due to its quadratic complexity to input sequence length. Although there are many methods on Transformer acceleration, they are still either inefficient on long sequences or not effective enough. In this paper, we propose Fastformer, which is an efficient Transformer model based on additive attention. In Fastformer, instead of modeling the pair-wise interactions between tokens, we first use additive attention mechanism to model global contexts, and then further transform each token representation based on its interaction with global context representations. In this way, Fastformer can achieve effective context modeling with linear complexity. Extensive experiments on five datasets show that Fastformer is much more efficient than many existing Transformer models and can meanwhile achieve comparable or even better long text modeling performance. Authors: Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
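To see what "additive attention plus element-wise products" means mechanically, here is a rough single-head sketch (illustrative only; scaling factors, the multi-head split and the parameter sharing of the real model are omitted):

```python
import torch
import torch.nn as nn

class FastformerAttention(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)
        self.wq, self.wk = nn.Linear(d, 1), nn.Linear(d, 1)   # additive attention scores
        self.out = nn.Linear(d, d)
    def forward(self, x):                                     # x: (batch, seq, d)
        q, k, v = self.q(x), self.k(x), self.v(x)
        alpha = torch.softmax(self.wq(q), dim=1)              # scores over tokens
        global_q = (alpha * q).sum(dim=1, keepdim=True)       # (B, 1, d) global query
        p = global_q * k                                      # element-wise query-key interaction
        beta = torch.softmax(self.wk(p), dim=1)
        global_k = (beta * p).sum(dim=1, keepdim=True)        # (B, 1, d) global key
        u = global_k * v                                      # element-wise key-value interaction
        return self.out(u) + q                                # output transform plus query residual

print(FastformerAttention()(torch.randn(2, 10, 64)).shape)
```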
#pondernet #deepmind #machinelearning Humans don't spend the same amount of mental effort on all problems equally. Instead, we respond quickly to easy tasks, and we take our time to deliberate hard tasks. DeepMind's PonderNet attempts to achieve the same by dynamically deciding how many computation steps to allocate to any single input sample. This is done via a recurrent architecture and a trainable function that computes a halting probability. The resulting model performs well in dynamic computation tasks and is surprisingly robust to different hyperparameter settings. OUTLINE: 0:00 - Intro & Overview 2:30 - Problem Statement 8:00 - Probabilistic formulation of dynamic halting 14:40 - Training via unrolling 22:30 - Loss function and regularization of the halting distribution 27:35 - Experimental Results 37:10 - Sensitivity to hyperparameter choice 41:15 - Discussion, Conclusion, Broader Impact Paper: https://arxiv.org/abs/2107.05407 Abstract: In standard neural networks the amount of computation used grows with the size of the inputs, but not with the complexity of the problem being learnt. To overcome this limitation we introduce PonderNet, a new algorithm that learns to adapt the amount of computation based on the complexity of the problem at hand. PonderNet learns end-to-end the number of computational steps to achieve an effective compromise between training prediction accuracy, computational cost and generalization. On a complex synthetic problem, PonderNet dramatically improves performance over previous adaptive computation methods and additionally succeeds at extrapolation tests where traditional neural networks fail. Also, our method matched the current state of the art results on a real world question and answering dataset, but using less compute. Finally, PonderNet reached state of the art results on a complex task designed to test the reasoning capabilities of neural networks.1 Authors: Andrea Banino, Jan Balaguer, Charles Blundell Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
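The halting mechanism is compact enough to sketch (illustrative code, not DeepMind's; the GRU cell, the forced halt at the last step and the toy squared-error loss are assumptions, and the KL regularizer towards a geometric prior is omitted):

```python
import torch
import torch.nn as nn

class PonderCell(nn.Module):
    def __init__(self, d_in=8, d_h=32):
        super().__init__()
        self.cell = nn.GRUCell(d_in, d_h)
        self.y_head = nn.Linear(d_h, 1)           # per-step prediction
        self.halt_head = nn.Linear(d_h, 1)        # per-step halting logit (lambda_n)

    def forward(self, x, max_steps=10):
        h = torch.zeros(x.size(0), self.cell.hidden_size)
        p_not_halted = torch.ones(x.size(0))
        ps, ys = [], []
        for n in range(max_steps):
            h = self.cell(x, h)
            lam = torch.sigmoid(self.halt_head(h)).squeeze(-1)
            if n == max_steps - 1:
                lam = torch.ones_like(lam)        # force halting at the final step
            ps.append(p_not_halted * lam)         # p_n = lambda_n * prod_{j<n} (1 - lambda_j)
            ys.append(self.y_head(h).squeeze(-1))
            p_not_halted = p_not_halted * (1 - lam)
        return torch.stack(ps, 1), torch.stack(ys, 1)

ps, ys = PonderCell()(torch.randn(4, 8))
loss = (ps * (ys - 0.0) ** 2).sum(1).mean()       # expected per-step loss under p_n
print(ps.sum(1), loss)                            # halting probabilities sum to ~1
```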
Why and where do companies fail at productionizing ML models? Watch the full podcast with Aarti here: https://youtu.be/VWJXiszQpTU Aarti is a machine learning engineer at Snorkel AI. Prior to that, she worked closely with Andrew Ng in various capacities. She graduated with a master's in CS from Stanford and a bachelor's in CS and Computer Engineering from New York University, and worked at Microsoft Research as a research intern for John Langford, where she contributed to Vowpal Wabbit, an open-source project. About the Host: Jay is a Ph.D. student at Arizona State University, doing research on building Interpretable AI models for Medical Diagnosis. Jay Shah: https://www.linkedin.com/in/shahjay22/ You can reach out to https://www.public.asu.edu/~jgshah1/ for any queries. Stay tuned for upcoming webinars! ***Disclaimer: The information contained in this video represents the views and opinions of the speaker and does not necessarily represent the views or opinions of any institution. It does not constitute an endorsement by any Institution or its affiliates of such video content.***
#adversarialexamples #dimpledmanifold #security Adversarial Examples have long been a fascinating topic for many Machine Learning researchers. How can a tiny perturbation cause the neural network to change its output by so much? While many explanations have been proposed over the years, they all appear to fall short. This paper attempts to comprehensively explain the existence of adversarial examples by proposing a view of the classification landscape they call the Dimpled Manifold Model, which says that any classifier will adjust its decision boundary to align with the low-dimensional data manifold, and only slightly bend around the data. This potentially explains many phenomena around adversarial examples. Warning: In this video, I disagree. Remember that I'm not an authority, but simply give my own opinions. OUTLINE: 0:00 - Intro & Overview 7:30 - The old mental image of Adversarial Examples 11:25 - The new Dimpled Manifold Hypothesis 22:55 - The Stretchy Feature Model 29:05 - Why do DNNs create Dimpled Manifolds? 38:30 - What can be explained with the new model? 1:00:40 - Experimental evidence for the Dimpled Manifold Model 1:10:25 - Is Goodfellow's claim debunked? 1:13:00 - Conclusion & Comments Paper: https://arxiv.org/abs/2106.10151 My replication code: https://gist.github.com/yk/de8d987c4e... Goodfellow's Talk: https://youtu.be/CIfsB_EYsVI?t=4280 Abstract: The extreme fragility of deep neural networks when presented with tiny perturbations in their inputs was independently discovered by several research groups in 2013, but in spite of enormous effort these adversarial examples remained a baffling phenomenon with no clear explanation. In this paper we introduce a new conceptual framework (which we call the Dimpled Manifold Model) which provides a simple explanation for why adversarial examples exist, why their perturbations have such tiny norms, why these perturbations look like random noise, and why a network which was adversarially trained with incorrectly labeled images can still correctly classify test images. In the last part of the paper we describe the results of numerous experiments which strongly support this new model, and in particular our assertion that adversarial perturbations are roughly perpendicular to the low dimensional manifold which contains all the training examples. Authors: Adi Shamir, Odelia Melamed, Oriel BenShmuel Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Three research scientists from Google share their journeys into Machine Learning research and how they got started with it. Watch the full podcasts with each of these speakers: Azalia Mirhoseini: https://youtu.be/5LCfH8YiOv4 Sara Hooker: https://youtu.be/MHtbZls2uts Natasha Jaques: https://youtu.be/8XpCnmvq49s About the Host: Jay is a Ph.D. student at Arizona State University, doing research on building Interpretable AI models for Medical Diagnosis. Jay Shah: https://www.linkedin.com/in/shahjay22/ You can reach out to https://www.public.asu.edu/~jgshah1/ for any queries. Stay tuned for upcoming webinars! ***Disclaimer: The information contained in this video represents the views and opinions of the speaker and does not necessarily represent the views or opinions of any institution. It does not constitute an endorsement by any Institution or its affiliates of such video content.***
#xcit #transformer #attentionmechanism After dominating Natural Language Processing, Transformers have taken over Computer Vision recently with the advent of Vision Transformers. However, the attention mechanism's quadratic complexity in the number of tokens means that Transformers do not scale well to high-resolution images. XCiT is a new Transformer architecture, containing XCA, a transposed version of attention, reducing the complexity from quadratic to linear, and at least on image data, it appears to perform on par with other models. What does this mean for the field? Is this even a transformer? What really matters in deep learning? OUTLINE: 0:00 - Intro & Overview 3:45 - Self-Attention vs Cross-Covariance Attention (XCA) 19:55 - Cross-Covariance Image Transformer (XCiT) Architecture 26:00 - Theoretical & Engineering considerations 30:40 - Experimental Results 33:20 - Comments & Conclusion Paper: https://arxiv.org/abs/2106.09681 Code: https://github.com/facebookresearch/xcit Abstract: Following their success in natural language processing, transformers have recently shown much promise for computer vision. The self-attention operation underlying transformers yields global interactions between all tokens ,i.e. words or image patches, and enables flexible modelling of image data beyond the local interactions of convolutions. This flexibility, however, comes with a quadratic complexity in time and memory, hindering application to long sequences and high-resolution images. We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries. The resulting cross-covariance attention (XCA) has linear complexity in the number of tokens, and allows efficient processing of high-resolution images. Our cross-covariance image transformer (XCiT) is built upon XCA. It combines the accuracy of conventional transformers with the scalability of convolutional architectures. We validate the effectiveness and generality of XCiT by reporting excellent results on multiple vision benchmarks, including image classification and self-supervised feature learning on ImageNet-1k, object detection and instance segmentation on COCO, and semantic segmentation on ADE20k. Authors: Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
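Here is a rough single-head sketch of the cross-covariance attention operation (an illustrative re-implementation; the real model adds multiple heads, local patch interaction layers and class attention): queries and keys are L2-normalized along the token axis, the attention map lives in channel-by-channel space, and values are mixed channel-wise, so the cost grows linearly with the number of tokens.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class XCA(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.qkv = nn.Linear(d, 3 * d)
        self.proj = nn.Linear(d, d)
        self.tau = nn.Parameter(torch.ones(1))            # learned temperature
    def forward(self, x):                                 # x: (batch, tokens, d)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = F.normalize(q.transpose(1, 2), dim=-1)        # (B, d, N), unit norm per channel
        k = F.normalize(k.transpose(1, 2), dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.tau, dim=-1)   # (B, d, d)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2)  # mix value *channels*, not tokens
        return self.proj(out)

print(XCA()(torch.randn(2, 196, 64)).shape)
```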
#implicitfunction #jax #autodiff Many problems in Machine Learning involve loops of inner and outer optimization. Finding update steps for the outer loop is usually difficult, because of the need to differentiate through the inner loop's procedure over multiple steps. Such loop unrolling is very limited and constrained to very few steps. Other papers have found solutions around unrolling in very specific, individual problems. This paper proposes a unified framework for implicit differentiation of inner optimization procedures without unrolling and provides implementations that integrate seamlessly into JAX. OUTLINE: 0:00 - Intro & Overview 2:05 - Automatic Differentiation of Inner Optimizations 4:30 - Example: Meta-Learning 7:45 - Unrolling Optimization 13:00 - Unified Framework Overview & Pseudocode 21:10 - Implicit Function Theorem 25:45 - More Technicalities 28:45 - Experiments ERRATA: - Dataset Distillation is done with respect to the training set, not the validation or test set. Paper: https://arxiv.org/abs/2105.15183 Code coming soon Abstract: Automatic differentiation (autodiff) has revolutionized machine learning. It allows expressing complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization as a layer, and in bi-level problems such as hyper-parameter optimization and meta-learning. However, the formulas for these derivatives often involve case-by-case tedious mathematical derivations. In this paper, we propose a unified, efficient and modular approach for implicit differentiation of optimization problems. In our approach, the user defines (in Python in the case of our implementation) a function F capturing the optimality conditions of the problem to be differentiated. Once this is done, we leverage autodiff of F and implicit differentiation to automatically differentiate the optimization problem. Our approach thus combines the benefits of implicit differentiation and autodiff. It is efficient as it can be added on top of any state-of-the-art solver and modular as the optimality condition specification is decoupled from the implicit differentiation mechanism. We show that seemingly simple principles allow to recover many recently proposed implicit differentiation methods and create new ones easily. We demonstrate the ease of formulating and solving bi-level optimization problems using our framework. We also showcase an application to the sensitivity analysis of molecular dynamics. Authors: Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick...
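The core trick is the implicit function theorem; here is a toy, self-contained illustration (my own example, not the paper's library; the quadratic inner objective and its hand-written optimality condition are assumptions) of differentiating an inner optimum without backpropagating through the solver's iterations:

```python
import torch
from torch.autograd.functional import jacobian

# Toy inner problem f(x, theta) = ||x - theta||^2 + 0.1 * ||x||^2, so the optimality
# condition is F(x, theta) = grad_x f = 2.2*x - 2*theta = 0 and x*(theta) = theta / 1.1.
def F(x, theta):
    return 2.2 * x - 2.0 * theta

def inner_solve(theta, steps=500, lr=0.1):        # any black-box solver for F(x, theta) = 0
    x = torch.zeros_like(theta)
    for _ in range(steps):
        x = x - lr * F(x, theta)
    return x

theta = torch.tensor([1.0, -2.0])
x_star = inner_solve(theta)
dF_dx, dF_dth = jacobian(F, (x_star, theta))      # Jacobians of the optimality condition
dxstar_dtheta = -torch.linalg.solve(dF_dx, dF_dth)
print(dxstar_dtheta)                              # ~ (1/1.1) * I, with no unrolled backprop
```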
Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
#reinforcementlearning #deepmind #agi What's the most promising path to creating Artificial General Intelligence (AGI)? This paper makes the bold claim that a learning agent maximizing its reward in a sufficiently complex environment will necessarily develop intelligence as a by-product, and that Reward Maximization is the best way to move the creation of AGI forward. The paper is a mix of philosophy, engineering, and futurism, and raises many points of discussion. OUTLINE: 0:00 - Intro & Outline 4:10 - Reward Maximization 10:10 - The Reward-is-Enough Hypothesis 13:15 - Abilities associated with intelligence 16:40 - My Criticism 26:15 - Reward Maximization through Reinforcement Learning 31:30 - Discussion, Conclusion & My Comments Paper: https://www.sciencedirect.com/science... Abstract: In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence. Authors: David Silver, Satinder Singh, Doina Precup, Richard S. Sutton Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
#fnet #attention #fourier Do we even need Attention? FNets completely drop the Attention mechanism in favor of a simple Fourier transform. They perform almost as well as Transformers, while drastically reducing parameter count, as well as compute and memory requirements. This highlights that a good token mixing heuristic could be as valuable as a learned attention matrix. OUTLINE: 0:00 - Intro & Overview 0:45 - Giving up on Attention 5:00 - FNet Architecture 9:00 - Going deeper into the Fourier Transform 11:20 - The Importance of Mixing 22:20 - Experimental Results 33:00 - Conclusions & Comments Paper: https://arxiv.org/abs/2105.03824 ADDENDUM: Of course, I completely forgot to discuss the connection between Fourier transforms and Convolutions, and that this might be interpreted as convolutions with very large kernels. Abstract: We show that Transformer encoder architectures can be massively sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that "mix" input tokens. These linear transformations, along with simple nonlinearities in feed-forward layers, are sufficient to model semantic relationships in several text classification tasks. Perhaps most surprisingly, we find that replacing the self-attention sublayer in a Transformer encoder with a standard, unparameterized Fourier Transform achieves 92% of the accuracy of BERT on the GLUE benchmark, but pre-trains and runs up to seven times faster on GPUs and twice as fast on TPUs. The resulting model, which we name FNet, scales very efficiently to long inputs, matching the accuracy of the most accurate "efficient" Transformers on the Long Range Arena benchmark, but training and running faster across all sequence lengths on GPUs and relatively shorter sequence lengths on TPUs. Finally, FNet has a light memory footprint and is particularly efficient at smaller model sizes: for a fixed speed and accuracy budget, small FNet models outperform Transformer counterparts. Authors: James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
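A minimal sketch of the mixing block (illustrative, not the official implementation; layer sizes are arbitrary): the attention sublayer becomes a parameter-free 2D Fourier transform over the sequence and hidden dimensions, keeping only the real part.

```python
import torch
import torch.nn as nn

class FNetBlock(nn.Module):
    def __init__(self, d_model=256, d_ff=1024):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
    def forward(self, x):                           # x: (batch, seq, d_model)
        mixed = torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real
        x = self.norm1(x + mixed)                   # token mixing with zero learned parameters
        return self.norm2(x + self.ff(x))           # standard feed-forward sublayer

print(FNetBlock()(torch.randn(2, 10, 256)).shape)
```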
During this past year, one of the biggest challenges educators reported was keeping students engaged and motivated while learning online during the pandemic. In this episode of BRIGHT, we talk to Dr. Chris Harrington, the director of the Michigan Virtual Learning Research Institute, who shares his reflections on pandemic teaching, common misconceptions about online learning, and the findings of his team's landmark study on student engagement.
#ddpm #diffusionmodels #openai GANs have dominated the image generation space for the majority of the last decade. This paper shows, for the first time, how a non-GAN model, a DDPM, can be improved to overtake GANs at standard evaluation metrics for image generation. The produced samples look amazing and, unlike GANs, the new model has a formal probabilistic foundation. Is there a future for GANs or are Diffusion Models going to overtake them for good? OUTLINE: 0:00 - Intro & Overview 4:10 - Denoising Diffusion Probabilistic Models 11:30 - Formal derivation of the training loss 23:00 - Training in practice 27:55 - Learning the covariance 31:25 - Improving the noise schedule 33:35 - Reducing the loss gradient noise 40:35 - Classifier guidance 52:50 - Experimental Results Paper (this): https://arxiv.org/abs/2105.05233 Paper (previous): https://arxiv.org/abs/2102.09672 Code: https://github.com/openai/guided-diff... Abstract: We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for sample quality using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.85 on ImageNet 512×512. We release our code at this https URL Authors: Alex Nichol, Prafulla Dhariwal Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
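The classifier-guidance part is simple enough to sketch (illustrative code, not OpenAI's; the function layout and the toy linear classifier are assumptions): the reverse-process mean is shifted along the gradient of the classifier's log-probability of the desired class, scaled by the variance and a guidance weight.

```python
import torch
import torch.nn as nn

def guided_reverse_step(x_t, mean, variance, classifier, y, s=1.0):
    # p(x_{t-1} | x_t, y) is approximated by N(mean + s * variance * grad_x log p(y | x_t), variance)
    x_in = x_t.detach().requires_grad_(True)
    log_p = torch.log_softmax(classifier(x_in), dim=-1)[torch.arange(len(y)), y].sum()
    grad = torch.autograd.grad(log_p, x_in)[0]
    guided_mean = mean + s * variance * grad
    return guided_mean + variance.sqrt() * torch.randn_like(x_t)

# Toy usage with flattened 8-d "images" and a linear classifier standing in for a noisy-image classifier.
clf = nn.Linear(8, 10)
x_t = torch.randn(4, 8)
print(guided_reverse_step(x_t, mean=x_t * 0.9, variance=torch.tensor(0.01), classifier=clf,
                          y=torch.tensor([3, 1, 7, 0]), s=2.0).shape)
```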
#mixer #google #imagenet Convolutional Neural Networks have dominated computer vision for nearly 10 years, and that might finally come to an end. First, Vision Transformers (ViT) have shown remarkable performance, and now even simple MLP-based models reach competitive accuracy, as long as sufficient data is used for pre-training. This paper presents MLP-Mixer, using MLPs in a particular weight-sharing arrangement to achieve a competitive, high-throughput model and it raises some interesting questions about the nature of learning and inductive biases and their interaction with scale for future research. OUTLINE: 0:00 - Intro & Overview 2:20 - MLP-Mixer Architecture 13:20 - Experimental Results 17:30 - Effects of Scale 24:30 - Learned Weights Visualization 27:25 - Comments & Conclusion Paper: https://arxiv.org/abs/2105.01601 Abstract: Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them are necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). MLP-Mixer contains two types of layers: one with MLPs applied independently to image patches (i.e. "mixing" the per-location features), and one with MLPs applied across patches (i.e. "mixing" spatial information). When trained on large datasets, or with modern regularization schemes, MLP-Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models. We hope that these results spark further research beyond the realms of well established CNNs and Transformers. Authors: Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy ERRATA: Here is their definition of what the 5-shot classifier is: "we report the few-shot accuracies obtained by solving the L2-regularized linear regression problem between the frozen learned representations of images and the labels" Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
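Here is a minimal sketch of one Mixer layer (an illustrative re-implementation; layer sizes are arbitrary): one MLP mixes information across patches and is shared over channels, the other mixes across channels and is shared over patches, each with a residual connection.

```python
import torch
import torch.nn as nn

def mlp(dim, hidden):
    return nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

class MixerBlock(nn.Module):
    def __init__(self, n_patches=196, channels=512):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(channels), nn.LayerNorm(channels)
        self.token_mix = mlp(n_patches, 256)      # applied along the patch axis, shared over channels
        self.channel_mix = mlp(channels, 2048)    # applied along the channel axis, shared over patches
    def forward(self, x):                         # x: (batch, patches, channels)
        x = x + self.token_mix(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        return x + self.channel_mix(self.norm2(x))

print(MixerBlock()(torch.randn(2, 196, 512)).shape)
```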
#universalcomputation #pretrainedtransformers #finetuning Large-scale pre-training and subsequent fine-tuning is a common recipe for success with transformer models in machine learning. However, most such transfer learning is done when a model is pre-trained on the same or a very similar modality to the final task to be solved. This paper demonstrates that transformers can be fine-tuned to completely different modalities, such as from language to vision. Moreover, they demonstrate that this can be done by freezing all attention layers, tuning less than .1% of all parameters. The paper further claims that language modeling is a superior pre-training task for such cross-domain transfer. The paper goes through various ablation studies to make its point. OUTLINE: 0:00 - Intro & Overview 2:00 - Frozen Pretrained Transformers 4:50 - Evaluated Tasks 10:05 - The Importance of Training LayerNorm 17:10 - Modality Transfer 25:10 - Network Architecture Ablation 26:10 - Evaluation of the Attention Mask 27:20 - Are FPTs Overfitting or Underfitting? 28:20 - Model Size Ablation 28:50 - Is Initialization All You Need? 31:40 - Full Model Training Overfits 32:15 - Again the Importance of Training LayerNorm 33:10 - Conclusions & Comments Paper: https://arxiv.org/abs/2103.05247 Code: https://github.com/kzl/universal-comp... Abstract: We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language improves performance and compute efficiency on non-language downstream tasks. In particular, we find that such pretraining enables FPT to generalize in zero-shot to these modalities, matching the performance of a transformer fully trained on these tasks. Authors: Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
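The recipe is easy to sketch (illustrative code, not the authors'; the GPT-2 checkpoint and the adapter sizes are assumptions, and the paper additionally trains the input/output layers and positional embeddings): freeze everything in the pretrained transformer except the layer norms, then add small trainable input and output projections for the new modality.

```python
import torch.nn as nn
from transformers import GPT2Model

gpt2 = GPT2Model.from_pretrained("gpt2")
for name, p in gpt2.named_parameters():
    p.requires_grad = "ln" in name                  # only layer-norm parameters stay trainable

# New, trainable adapters for a non-language task (e.g. flattened image patches).
input_proj = nn.Linear(64, gpt2.config.n_embd)      # patch -> token embedding
output_head = nn.Linear(gpt2.config.n_embd, 10)     # e.g. 10-way classification

trainable = [n for n, p in gpt2.named_parameters() if p.requires_grad]
print(len(trainable), "tensors of the frozen model remain trainable (the layer norms)")
```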
#aiwinter #agi #embodiedcognition The AI community has gone through regular cycles of AI Springs, where rapid progress gave rise to massive overconfidence, high funding, and overpromise, followed by these promises being unfulfilled, subsequently diving into periods of disenfranchisement and underfunding, called AI Winters. This paper examines the reasons for the repeated periods of overconfidence and identifies four fallacies that people make when they see rapid progress in AI. OUTLINE: 0:00 - Intro & Overview 2:10 - AI Springs & AI Winters 5:40 - Is the current AI boom overhyped? 15:35 - Fallacy 1: Narrow Intelligence vs General Intelligence 19:40 - Fallacy 2: Hard for humans doesn't mean hard for computers 21:45 - Fallacy 3: How we call things matters 28:15 - Fallacy 4: Embodied Cognition 35:30 - Conclusion & Comments Paper: https://arxiv.org/abs/2104.12871 My Video on Shortcut Learning: https://youtu.be/D-eg7k8YSfs Abstract: Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI spring") and periods of disappointment, loss of confidence, and reduced funding ("AI winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this paper I describe four fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I conclude by discussing the open questions spurred by these fallacies, including the age-old challenge of imbuing machines with humanlike common sense. Authors: Melanie Mitchell Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yannic-kilcher Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/ BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Today, Annalisa speaks with Dr. Omid Fotuhi about belonging. Omid speaks about belonging as a fundamental need. He talks about belonging in the workplace and belonging for young adults in college. Omid is a Research Psychologist who has dedicated his life to exploring and researching the processes and scientific mechanisms underlying human motivation and performance. He earned his PhD in Psychology from the University of Waterloo and later helped to co-found the College Transition Collaborative and Stanford Interventions Lab at Stanford University. He is currently the Director of Learning and Innovation at WGU Labs and a Research Associate at the Learning Research and Development Center at the University of Pittsburgh. Hosted by: Annalisa Holcombe; Edited by: Breanna Steggell.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and Simon Rodriguez, Data Research Assistant at the Center for Security and Emerging Technology (CSET) at Georgetown University. Through a series of reports and data briefs, CSET provides policymakers with data-rich material to inform and guide public policy. Subscribe: Apple • Android • Spotify • Stitcher • Google • RSS. Detailed show notes can be found on The Data Exchange web site. Subscribe to The Gradient Flow Newsletter.
We're thrilled to announce our new partnership with Carleton University in Ottawa, Canada, to produce an 8-part mini-series on the topic of Corrective Feedback. The series explores the area of corrective feedback through interviews with 8 scholars in the field. All interviews are conducted by students in Dr. Eva Kartchava's MA class at Carleton University as a means of assessment, connecting researchers to their audience and helping her students develop a deeper understanding of, and investment in, the research from the course. If you are interested in having a similar series produced for your class or institute, you can contact us: info@learnyourenglish.com This is episode 2 in our series. In this episode, Dr. Hossein Nassaji joins the program to discuss corrective feedback. Dr. Nassaji is an award-winning scholar and Professor of Applied Linguistics in the Department of Linguistics at the University of Victoria, Victoria, BC. He has authored over 100 articles and many books. His forthcoming handbook on corrective feedback, The Cambridge Handbook of Corrective Feedback in Second Language Learning and Teaching (with Eva Kartchava), is a comprehensive volume that discusses current issues and perspectives on corrective feedback and their applications to second language teaching and learning. Specifically in this episode, Dr. Nassaji tells us about: the roles corrective feedback plays in language learning how culture impacts feedback effectiveness the debate between immediate and delayed feedback written vs oral feedback and the efficacy of written feedback the what, when, why, and if of explicit & implicit feedback how teachers can learn about and implement corrective feedback in their classes *This interview was conducted by Kelsey Ulrich-Verslycken and Lana Haj Hamid Partnership with Carleton University: Throughout the series, MA students from Dr. Kartchava's class will interview leading experts in the field of corrective feedback. We thank Dr. Kartchava for joining this episode and for spearheading this initiative. For more information on this episode, this project, and those involved: view Carleton and Dr. Kartchava's website on Corrective Feedback view the LYE blog post on this episode More from Dr. Nassaji: His website Some of his prominent books: Corrective Feedback in Second Language Teaching and Learning: Research, Theory, Applications, Implications The Interactional Feedback Dimension in Instructed Second Language Learning: Linking Theory, Research, and Practice Teaching Grammar in Second Language Classrooms: Integrating Form-Focused Instruction in Communicative Context The Cambridge Handbook of Corrective Feedback in Second Language Learning and Teaching Podcast Creation: This episode was created with support from Thinkific & Podbean. If you're looking to launch a course or start a podcast, we highly recommend them - and use them ourselves. As always, thank you for listening. Your support has been overwhelming and we couldn't do what we do without you. We hope this podcast serves as an effective CPD tool for you. If you have a comment or question about today's show, we'd love to hear from you: info@learnyourenglish.com For more info on what we do at LYE, check out: Our Teacher Development Membership Join Our Mailing List Our Online CPD Courses for Teachers Follow Learn YOUR English Follow Teacher Talking Time
Since reinforcement learning requires hefty compute resources, it can be tough to keep up without a serious budget of your own. Find out how the team at Facebook AI Research (FAIR) is looking to increase access and level the playing field with the help of NetHack, an archaic rogue-like video game from the late 80s. Links discussed: The NetHack Learning Environment: https://ai.facebook.com/blog/nethack-learning-environment-to-advance-deep-reinforcement-learning/ Reinforcement learning, intrinsic motivation: https://arxiv.org/abs/2002.12292 Knowledge transfer: https://arxiv.org/abs/1910.08210 Tim Rocktäschel is a Research Scientist at Facebook AI Research (FAIR) London and a Lecturer in the Department of Computer Science at University College London (UCL). At UCL, he is a member of the UCL Centre for Artificial Intelligence and the UCL Natural Language Processing group. Prior to that, he was a Postdoctoral Researcher in the Whiteson Research Lab, a Stipendiary Lecturer in Computer Science at Hertford College, and a Junior Research Fellow in Computer Science at Jesus College, at the University of Oxford. https://twitter.com/_rockt Heinrich Kuttler is an AI and machine learning researcher at Facebook AI Research (FAIR) and before that was a research engineer and team lead at DeepMind. https://twitter.com/HeinrichKuttler https://www.linkedin.com/in/heinrich-kuttler/ Topics covered: 0:00 a lack of reproducibility in RL 1:05 What is NetHack and how did the idea come to be? 5:46 RL in Go vs NetHack 11:04 performance of vanilla agents, what do you optimize for 18:36 transferring domain knowledge, source diving 22:27 human vs machines intrinsic learning 28:19 ICLR paper - exploration and RL strategies 35:48 the future of reinforcement learning 43:18 going from supervised to reinforcement learning 45:07 reproducibility in RL 50:05 most underrated aspect of ML, biggest challenges? Get our podcast on these other platforms: Apple Podcasts: http://wandb.me/apple-podcasts Spotify: http://wandb.me/spotify Google: http://wandb.me/google-podcasts YouTube: http://wandb.me/youtube Soundcloud: http://wandb.me/soundcloud Tune in to our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://wandb.me/salon Join our community of ML practitioners where we host AMA's, share interesting projects and meet other people working in Deep Learning: http://wandb.me/slack Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices: https://wandb.ai/gallery
How Rosanne is working to democratize AI research and improve diversity and fairness in the field through starting a non-profit after being a founding member of Uber AI Labs, doing lots of amazing research, and publishing papers at top conferences. Rosanne is a machine learning researcher, and co-founder of ML Collective, a nonprofit organization for open collaboration and mentorship. Before that, she was a founding member of Uber AI. She has published research at NeurIPS, ICLR, ICML, Science, and other top venues. While at school she used neural networks to help discover novel materials and to optimize fuel efficiency in hybrid vehicles. ML Collective: http://mlcollective.org/ Controlling Text Generation with Plug and Play Language Models: https://eng.uber.com/pplm/ LCA: Loss Change Allocation for Neural Network Training: https://eng.uber.com/research/lca-loss-change-allocation-for-neural-network-training/ Topics covered 0:00 Sneak peek, Intro 1:53 The origin of ML Collective 5:31 Why a non-profit and who is MLC for? 14:30 LCA, Loss Change Allocation 18:20 Running an org, research vs admin work 20:10 Advice for people trying to get published 24:15 on reading papers and Intrinsic Dimension paper 36:25 NeurIPS - Open Collaboration 40:20 What is your reward function? 44:44 Underrated aspect of ML 47:22 How to get involved with MLC Get our podcast on these other platforms: Apple Podcasts: http://wandb.me/apple-podcasts Spotify: http://wandb.me/spotify Google: http://wandb.me/google-podcasts YouTube: http://wandb.me/youtube Tune in to our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://wandb.me/salon Join our community of ML practitioners where we host AMA's, share interesting projects and meet other people working in Deep Learning: http://wandb.me/slack Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices: https://wandb.ai/gallery
A new episode of the series "The AI Deal of Trust" on the unique AI Channel of Trust by AI Exponential Thinker. Our guest, Dr. Anima Anandkumar, is a Bren Professor at Caltech and a director of machine learning research at NVIDIA. She has been recognized by her peers with more than 20 awards, ranging from Young Investigator awards from the Air Force and Army research offices to the New York Times Good Tech Awards 2018. She is one of the youngest female AI researchers in Silicon Valley, with faculty fellowships from Microsoft, Google and Facebook, and several best paper awards. Dr. Karoui is pleased to welcome Dr. Anandkumar in this new podcast episode. Dr. Lobna Karoui is an Executive AI Strategy Growth Advisor and Exponential Digital Transformer for Fortune 500 & CAC40 companies, with two decades of experience building AI products and services for millions of users. She is the president of AI Exponential Thinker, which aims to inspire and empower 1 million young boys and girls about trust technologies and AI opportunities by 2025. She is an international speaker and interviewer recognized as an AI expert by Forbes, Bloomberg and MIT. Follow us and subscribe at www.aiexponentialthinker.com, LinkedIn, Facebook and Instagram, or via contact@aiexponentialthinker.com, to interact with our guests and meet great speakers and mentors from great companies such as Amazon, WEF, Harvard and more
Siddha is a self-driving architect at NVIDIA and she also guides teams at NASA as an AI domain expert. She was featured on the Forbes 30 under 30 list in 2019. Previously, she developed deep learning models for resource-constrained edge devices at DeepVision. She earned her Master's degree from Carnegie Mellon University, and her work ranges from visual question answering to generative adversarial networks to gathering insights from CERN’s petabyte-scale data and has been published at top-tier conferences including CVPR and NeurIPS. Hosted by: Jay Shah: https://www.linkedin.com/in/shahjay22/ You can reach out to https://www.public.asu.edu/~jgshah1/ for any queries. Stay tuned for upcoming webinars! ***Disclaimer: The information contained in this video represents the views and opinions of the speaker and does not necessarily represent the views or opinions of any institution. It does not constitute an endorsement by any Institution or its affiliates of such video content.***
Professor Julie Dockrell speaks to Dr Rob Webster about how speech, language and communication are foundational to learning and achievement. What happens when a child struggles with these skills? Slow progress can be hidden by the logistics and pressures of the current school system, making oral language delay a less well-recognised disorder, which is in turn less supported as children move up through school years. Disorders such as developmental language disorder (DLD) represent the most extreme end of these difficulties. Rob and Julie discuss reasons why some children might be slower than others to develop skills in this arena, and how they can be supported to overcome difficulties. We also hear examples of techniques that teachers can employ to create more opportunities for oral language development during classroom activities. Full show notes and links: https://www.ucl.ac.uk/ioe/news/2020/oct/overcoming-oral-language-barriers-learning-rftrw-s05e03 Take our 2-minute survey to share your thoughts about our podcast: https://bit.ly/ResearchPodcast
Dr. Megan Sumeracki teaches us about Spacing, Retrieval, Interleaving, Elaboration, Dual-coding, and Concrete Examples. [2:16] The Learning Scientists Podcast & Bi-directional Communication [4:13] The Importance of Learning Research as open-access [5:27] The Six Strategies of Effective Learning from the Institute of Education Sciences (2007?) & The National Council on Teacher Quality Report (2016?) [7:45] A Brief History of Spacing and Retrieval Practice, Ebbinghaus, and Memory Accessibility [9:57] What is Spacing and How do We Use it? [12:17] The Ins and Outs of Retrieval Practice [14:47] Interleaving Your Study Practice (19:10 Taylor and Rohrer, 2010, blocking v. interleaving) [21:20] Elaborative Interrogation: The How, When, Why, and Where of Your Study Material [24:40] Dual Coding: Combining Visual information with your Learning! [28:05] Concrete Examples: The More the Merrier [32:54] Closing Advice from Dr. Sumeracki: Use These Strategies!!! Also, you can find Dr. Sumeracki’s books on Amazon, research articles, and some great podcasts by the Learning Scientists on Retrieval for medical residents, and an interview with her sister, Dr. Alyssa Smith.
What we know about this coming school year is how much we don’t know about what will happen or what the school year is going to end up looking like. Kids returning to school usually has us feeling mixed emotions. We are excited to have them back to school, back to routine, but we also can feel melancholy at the sight of our kids entering a new grade, getting older and more independent. Now we need to add in the uncertainty of going back to school during a pandemic. How are you feeling about it? Are you nervous? Worried about your child not being safe? Worried school might be closed again (and terrified of having to homeschool your kids again!)? Worried that they are falling behind in their education? Worried about their future? I get it. My guest, Dr. Annie Snyder, and I talk about all these fears and worries in this episode and what to do about them for yourself, for teachers, and for your kids. Dr. Snyder is an educational researcher who is studying the “Covid Slide,” which is the decrease in learning due to the lengthy disruption of school from last year plus the summer holidays. This episode is full of useful tips and advice on how to handle returning to school. Be sure to check out the show notes for important tips from the episode but also for Dr. Snyder’s tips on how to support your kids with continual learning. About Dr. Annie Snyder A former teacher and educational researcher, Annie Snyder currently serves as a senior learning scientist within the Learning Research and Strategy team at McGraw Hill. She holds a doctorate in educational psychology from Teachers College, Columbia University, and is interested in the links between learning science, research, teaching practice, and families’ roles in education. Annie has been happily working from home with her three sons since 2016. From the Podcast Dr. Snyder has so many great resources to keep the learning going and to support your kids during this time. Tips from the Podcast: Encourage young learners to exercise for a few minutes before any online learning (jumping jacks, dancing, running in place, etc.). Physical movement will help those learners become more ready to engage in learning and stay focused. Make a written checklist for any procedures that learners may need to follow, whether they are going to school online or in a brick-and-mortar school (for example, for online learning, turning on the computer, shutting the door, drinking some water, preparing materials before logging on). Especially for new processes, this can help prevent learners from needing to use their cognitive resources on procedures, so they can more easily pay attention to school learning. Explore the accessibility options offered through technology (e.g. text-to-speech options for online text). Some of these may help ease learning burdens (and boredom!) by providing learners other avenues for accessing learning content. Try forming online (or safe, socially distant) study and social groups that meet on a regular basis. Learners not only get the opportunity to work on school tasks; these groups also offer an opportunity to practice social skills and work through some of the emotional challenges posed by learning during a pandemic. As best as possible, reduce distractions when learners must learn online. These distractions might be in the room itself (e.g. 
move the toys out of the room BEFORE the virtual meeting with the class) but also distractions within the technology (so that learners are not surfing the Internet when they need to be responding to a teacher’s question). Work together, as a family and with educators. Continuously remind yourself and those around you that we will succeed by being kind, empathetic, and cooperative to each other as we all work toward ensuring children make forward progress. Remember that learning never stops. The human brain is continuously learning, all throughout the lifespan, and the world is rich with opportunities to continue to grow and learn more – no matter where we all might be in any given moment. Dr. Snyder’s Tips to Keep Learning Going Literacy: Read aloud: All forms of reading aloud can support literacy learning for children of all ages. While it is always terrific when adults can read aloud to children, remember that children can also read to adults, other children, pets, or stuffed animals. Children can also benefit from listening to online read-alouds, as well as recording their own. Lean on the library: Throughout the country, libraries have been hard at work finding new ways to provide services online. Check out the local library to determine whether there are online programs, summer reading programs, eBook loans, book recommendations and other offerings to support literacy learning this summer. Encourage writing: Some children may naturally want to engage in writing as a way to express their ideas and emotions, while others may need more encouragement. Families can support writing development by setting aside designated writing periods each day and providing a few basic materials such as writing prompts, blank comic book strips, journals, or even stamps and envelopes for written correspondence. Remember that writing doesn’t necessarily mean that children must put pen to paper – typing and oral recordings also count! Math: Play games: A deck of cards, a sheet of paper and a pencil, and a few dice can all provide countless opportunities for playing math games (for example, Addition “War” with cards, in which each player plays two cards instead of one and the player with the highest sum wins the cards). Chess, checkers, and other classic board games are also great choices, as they can support the abstract strategic thinking that is critical for mathematical development. Discover real-world math: Math surrounds us at every turn, and it can be exciting for children (and their families!) to uncover all the ways math can not only help us understand the world, but also solve problems. Whether children are measuring for a garden, tracking weather statistics, or creating a budget to help save for a new toy, authentic projects help younger learners make sense of mathematics. Make math active: Try skip count hopscotch with sidewalk chalk (count by tens for each square!), jump rope counting challenges, or number line dancing. All of these are just a few ways that families can blend mathematical learning with movement. This is especially helpful for younger children, who may be less inclined to want to sit and practice with flashcards or worksheets during enticing summer weather. Science: Join a community: Although we often think of lone scientists working in a lab when we hear the word science, in reality, many scientists work collaboratively as part of scientific communities. Young learners can be encouraged to form their own family science communities (e.g. 
a backyard experiment family club) or can, with adult help, take part in one of the many citizen science projects that are available online. Explore: Both the natural and the human-made environments that surround us can provide lots of options for scientific exploration. For example, nature hikes or city walks following appropriate social distancing precautions can provide families with opportunities to help children explore science in the world around them (e.g. birdwatching, tree identification, studies of architectural principles of city buildings, etc.) Engineer: Summer is an excellent time for building things with blocks, cardboard boxes, or whatever scrap materials happen to be on hand. For younger children and older children alike, it can be helpful to provide daily challenges (e.g. build the tallest tower, the longest tunnel, a maze, etc.) to help inspire problem-solving and collaborative engineering. Social Studies: Study civics: What is civics, exactly? This single question can open doors to endless writing opportunities, discussions, and projects during the summer months. In addition to examining definitions of civics, learners and their families can explore ways to engage in civics both at the family level (e.g. creating a list of family rules for the summer) as well as at other levels of society (e.g. learning how the various branches of government work together, studying the Constitution, writing letters to Congress, reading the news). Build an exhibit: Museums around the world are offering virtual tours of their exhibits. After taking a tour, learners can learn more about some of the artifacts and then create an exhibit (or an entire museum) of their own! Cardboard boxes, an empty shelf, a table, or a blanket on the floor could be a great place to display the exhibit. To add authenticity, learners may wish to make placards to provide facts for every artifact included. Older learners might want to take digital photos of the artifacts and record a gallery walk for virtual visitors. Travel the world, from home: The study of geography is all about the places, physical features, and people around the world - and how all these interact. Although actual travel may not be a possibility for families this summer, learning about the world can happen from anywhere. For example, families can explore atlases (online and print), practice map skills by creating maps of the home or neighborhood or investigate the geographic origins of objects in the home. Thanks for listening! It means so much to me that you listened to my podcast! If you would like to purchase my book or other parenting resources, visit me at www.yellingcurebook.com With this podcast, my intention is to build a community of parents that can have open and honest conversations about parenting without judgement or criticism. We have too much of that! I honor each parent and their path towards becoming the best parent they can be. My hope is to inspire more parents to consider the practice of Peaceful Parenting. If you know somebody who would benefit from this message, or would be an awesome addition to our community, please share it using the social media buttons on this page. Do you have some feedback or questions about this episode? Leave a note in the comment section below! Subscribe to the podcast If you would like to get automatic updates of new podcast episodes, you can subscribe on the podcast app on your mobile device. Leave a review I appreciate every bit of feedback to make this a value adding part of your day. 
Ratings and reviews from listeners not only help me improve, but also help others find me in their podcast app. If you have a minute, an honest review on iTunes goes a long way! Thank You!!
Today, we are joined by Dr. Louis Gomez, senior fellow at the Carnegie Foundation for the Advancement of Teaching and professor of education and information studies at UCLA. Dr. Gomez addresses the topics of Networked Improvement Communities (NICs) and the mindset shift that is necessary in order to promote equity in education. Join our conversation about how compliance can prevent initiatives from being implemented, the necessity of having a common aim and narrative when discussing improvement science as part of NICs, and that equity without the will to change or the respect for the community will not bring about social justice. Visit our Bronx ART website and connect with us on Twitter @BX_ARTeam! Today's hosts are Kris DeFilippis, Adelia Gibson, and Kaitlyn Reilley Guest Information: Dr. Gomez earned a bachelor's degree in psychology from Stony Brook University and a doctorate in Psychology from the University of California, Berkeley. He spent 14 years working in cognitive science and person–computer systems interactions at Bell Laboratories, Bell Communications Research Inc. and Bellcore. Dr. Gomez has held a number of faculty positions including positions at Northwestern University and the University of Pittsburgh, where he was also director of the Center for Urban Education and a senior scientist at the Learning Research and Development Center. Dr. Gomez is currently a professor of education and information studies at the University of California, Los Angeles. Since 2008, he has served as a senior fellow at the Carnegie Foundation for the Advancement of Teaching, where he leads the Network Development work. He is the co-author of Learning to Improve: How America's Schools Can Get Better at Getting Better. Dr. Gomez is dedicated to educational improvement and his numerous publications and studies have contributed greatly to bringing improvement science to the field of education. Connect with Dr. Gomez through email at lmgomez@ucla.edu Resources for Listeners: Information about iLEAD Learning to Improve: How America's Schools Can Get Better at Getting Better Why a NIC? Getting Ideas into Action: Building Networked Improvement Communities in Education Improvement Research Carried Out Through Networked Communities: Accelerating Learning about Practices that Support More Productive Student Mindsets How a Networked Improvement Community Improved Success Rates for Struggling College Math Students
In the seventh episode of Deep Neural Notebooks, I interview Shimon Whiteson. Shimon is a Computer Science Professor at the University of Oxford, where he leads the Whiteson Research Lab. He is also a Data Scientist at Waymo (formerly the Google Self Driving Car Project). His research specialises in Reinforcement Learning (RL), and Cooperative Multi-Agent RL in particular. So this interview is all in the context of Reinforcement Learning. We talk about his journey - how he started with Machine Learning & RL. I ask him about his thoughts on the state of RL - about how the field has progressed and changed since he started, about how it has become so popular in the last few years, and about the challenges being faced. We also talk about his research at Waymo, about recent projects from his lab, and about the scope and future of telepresence robots, one of which was developed under his guidance. We also talk about the infamous Reward Hypothesis in the context of RL and Philosophy. In the end, he also shares some advice for people starting out with RL. Links: - Shimon Whiteson: https://twitter.com/shimon8282 - Whiteson Research Lab (WhiRL): http://whirl.cs.ox.ac.uk/ - Teresa Robot: https://whirl.cs.ox.ac.uk/teresa/ - RL workshop at Machine Learning Summer School, Moscow: https://www.youtube.com/watch?v=RAw0Chs7QKA - The Reward Hypothesis: http://incompleteideas.net/rlai.cs.ualberta.ca/RLAI/rewardhypothesis.html Timestamps: 03:42 Beginnings in Computer Science 06:13 Beginnings in ML 07:15 PhD at UT Austin 10:40 Intersection of Neuroevolution and RL 14:10 Research directions since PhD 16:35 State of RL 20:33 Simulation for RL 22:07 Research at Waymo 25:30 Multi-agent RL 33:25 Recent projects at WhiRL 41:30 Teresa project and Telepresence Robots 48:08 Bottlenecks for RL and Robotics 49:45 End-goal for RL, Human-level Intelligence 53:45 What do you find most fascinating about your research? 55:38 RL & Philosophy 1:01:20 Keeping up with latest research 1:03:28 Advice for beginners Podcast links: Youtube: https://youtu.be/bbrYZDgPI9M Apple Podcasts: https://apple.co/2TLUZ0y Google Podcasts: https://bit.ly/2TIyvh6 Spotify: https://open.spotify.com/episode/3936aEvSwsIhfwQfURmDb9 Anchor: https://bit.ly/3gpMi65 Connect: Twitter: https://twitter.com/mkulkhanna Website: https://mukulkhanna.co LinkedIn: https://linkedin.com/in/mukulkhanna/
Our guest Malte Pietsch is a Co-Founder of deepset, where he builds NLP solutions for enterprise clients such as Siemens, Airbus and Springer Nature. He holds an M.Sc. with honors from TU Munich and conducted research at Carnegie Mellon University. He is an active open-source contributor, creator of the NLP frameworks FARM & Haystack, and published the German BERT model. He is particularly interested in transfer learning and its application to question answering / semantic search.
Resources:
Deepset - Make sense out of your text data - https://deepset.ai/
FARM - Fast & easy transfer learning for NLP - https://github.com/deepset-ai/FARM
Haystack - Transformers at scale for question answering & search - https://github.com/deepset-ai/haystack
SageMaker - Machine learning for every developer and data scientist - https://aws.amazon.com/sagemaker/
Spot Instances - Managed Spot Training in Amazon SageMaker - https://docs.aws.amazon.com/sagemaker/latest/dg/model-managed-spot-training.html
ElasticSearch - Fully managed, scalable, and secure Elasticsearch service - https://aws.amazon.com/elasticsearch-service/
Automatic mixed precision - Automatic Mixed Precision for Deep Learning - https://developer.nvidia.com/automatic-mixed-precision
PyTorch - Open source machine learning framework that accelerates the path from research prototyping to production deployment - https://pytorch.org/
NumPy - Fundamental package for scientific computing with Python - https://numpy.org/
MLflow - An open source platform for the machine learning lifecycle - https://mlflow.org/
BERT - Bidirectional Encoder Representations from Transformers - https://en.wikipedia.org/wiki/BERT_(language_model)
SQuAD - The Stanford Question Answering Dataset - https://rajpurkar.github.io/SQuAD-explorer/
Sebastian Ruder - Research scientist at DeepMind - https://ruder.io/
Andrew Ng - His machine learning course is the MOOC that led to the founding of Coursera - https://www.coursera.org/instructor/andrewng
On Ep 16 of BITS Cast, I have Varuni Sarwal on the show. She is an undergraduate at IIT Delhi and has interned in deep learning at top institutions like Harvard University and UCLA. She is passionate about the application of Deep Learning in Biology. She shares some of her tips for how you too can get an international internship. She goes into detail about her experience at these institutions, what she learned, and her future plans for this new decade. Let me know your biggest takeaway on Instagram @ishansharma7390
Following up on our recent show covering the educational research highlights of 2019, Mike sits down with Youki Terada from Edutopia who authored the article. Youki is the Research and Standards editor for Edutopia which means he reviews and edits contributions from Edutopia's writing staff to ensure it's evidence-based, well-designed, and relevant to Edutopia's target audience of K12 Educators. We talk about areas of research that Youki has found particularly interesting and explore several examples with an eye towards practical application for educators. We also talk about the importance of curation and the risks of fast or sloppy research when looking for good applications of emerging learning research.
Dan and Mike dig into an Edutopia article by Youki Terada outlining the key findings in educational research in 2019. What sorts of findings jump out and which stories did we cover on Trending in Education? We take some time to look back as we gear up to peer forward into 2020 and the decade to come on this week’s episode. We hope you enjoy!
Personal Note: I'm extremely excited and really honoured to share this interview! Show notes: https://sanyambhutani.com/interview-with-jeremy-howard/ Audio (Podcast Version) available here: https://anchor.fm/chaitimedatascience Subscribe here to the newsletter: https://tinyletter.com/sanyambhutani In this episode, Sanyam Bhutani interviews Jeremy Howard: Co-Founder at fast.ai and Guru to the complete fast.ai community. They talk all about Jeremy's journey through the field, from Consultant to Founder to Kaggle Rank 1, followed by founding fast.ai. They also discuss the fast.ai forums: the research that goes into it, how Jeremy and Sylvain collaborate on research, and how the course is prepared. Links: fast.ai course: https://course.fast.ai Interview with Sylvain Gugger: Interview with Jason Antic (Creator of DeOldify): Follow: Jeremy Howard: https://twitter.com/jeremyphoward Fast.ai: https://twitter.com/fastdotai https://fast.ai Sanyam Bhutani: https://twitter.com/bhutanisanyam1 Blog: sanyambhutani.com About: http://chaitimedatascience.com/ A show for Interviews with Practitioners, Kagglers & Researchers and all things Data Science hosted by Sanyam Bhutani. You can expect weekly episodes, available as video, podcast, and blog posts. If you'd like to support the podcast: https://www.patreon.com/chaitimedatascience Intro track: Flow by LiQWYD https://soundcloud.com/liqwyd --- Send in a voice message: https://anchor.fm/chaitimedatascience/message
Newsletter: https://tinyletter.com/sanyambhutani Personal Note: I'm really honored to share this conversation. I really hope you enjoy listening to it as much as I enjoyed talking to Dr. Marc Lanctot. In this Episode, Sanyam Bhutani interviews Dr. Marc Lanctot, a Research Scientist at DeepMind. In this interview, they talk about research at DeepMind, deep learning research, and AlphaGo. They also talk all about Swift For Tensorflow and OpenSpiel. Dr. Marc Lanctot is a research scientist at Google DeepMind. Previously, he was a post-doctoral researcher at the Maastricht University Games and AI Group, working with Mark Winands. During his PhD, Marc worked on sampling algorithms for equilibrium computation and decision-making in games. His research interests include multiagent learning (and planning) in general, computational game theory, reinforcement learning, and game-tree search. Open Spiel: https://deepmind.com/research/open-source/openspiel Interview with Machine Learning Hero Series: https://medium.com/dsnet/interviews-with-machine-learning-heroes-ad9358385278 Links from the podcast: - Commodore 64: https://en.wikipedia.org/wiki/Commodore_64 - Intellivision (early console!): https://en.wikipedia.org/wiki/Intellivision - AD&D Treasure of Tarmin (Intellivision game with 3D perspective-- pretty advanced for its time!): https://en.wikipedia.org/wiki/Advanced_Dungeons_%26_Dragons:_Treasure_of_Tarmin - Montezuma's revenge: https://en.wikipedia.org/wiki/Montezuma%27s_Revenge_(video_game) Follow: Dr. Marc Lanctot https://twitter.com/sharky6000 http://mlanctot.info/ https://www.linkedin.com/in/marc-lanctot-a1619095/?originalSubdomain=ca Sanyam Bhutani: https://twitter.com/bhutanisanyam1 About: http://chaitimedatascience.com/ A show for Interviews with Practitioners, Kagglers & Researchers and all things Data Science hosted by Sanyam Bhutani. You can expect weekly episodes every Sunday and Thursday, available as video, podcast, and blog posts. If you'd like to support the podcast: https://www.patreon.com/chaitimedatascience --- Send in a voice message: https://anchor.fm/chaitimedatascience/message
Machine learning is an algorithmic approach used to study data, learn its patterns, and build models based on historical data. The resulting output is useful as a reference and for classification and prediction. Hendri Krisma (Senior Development Engineer at Blibli) explains how this machine learning is done with JVM-based technology.
Audio (Podcast Version) available here: https://anchor.fm/chaitimedatascience Subscribe here to the newsletter: https://tinyletter.com/sanyambhutani In this Episode, Sanyam Bhutani interviews Pierre Stock, resident PhD student at FAIR Lab, Paris. In this interview, they talk a lot about his research, deep learning research in general, and FAIR. They also discuss Pierre's recent research: And the Bit Goes Down: Revisiting the Quantization of Neural Networks (Link: https://arxiv.org/pdf/1907.05686.pdf) Repository: https://github.com/facebookresearch/kill-the-bits Follow: Pierre Stock: https://twitter.com/PierreStock https://research.fb.com/people/stock-pierre/ Sanyam Bhutani: https://twitter.com/bhutanisanyam1 Blog: https://medium.com/@init_27 About: http://chaitimedatascience.com/ A show for Interviews with Practitioners, Kagglers & Researchers and all things Data Science hosted by Sanyam Bhutani. You can expect weekly episodes, available as video, podcast, and blog posts. If you'd like to support the podcast: https://www.patreon.com/chaitimedatascience --- Send in a voice message: https://anchor.fm/chaitimedatascience/message
Previous Blog Interview: https://medium.com/dsnet/interview-with-openai-fellow-christine-mcleavey-payne-aaef948ad571 In this episode, Sanyam Bhutani interviews Christine Payne for the second time. They talk about MuseNet, OpenAI, deep learning research, and MOOC(s) for machine learning and deep learning. MuseNet: https://openai.com/blog/musenet/ Follow: Christine Payne: https://twitter.com/mcleavey https://www.linkedin.com/in/mcleavey/ http://christinemcleavey.com/ Sanyam Bhutani: https://twitter.com/bhutanisanyam1 About: http://chaitimedatascience.com/ A show for Interviews with Practitioners, Kagglers & Researchers and all things Data Science hosted by Sanyam Bhutani. You can expect weekly episodes, available as video, podcast, and blog posts. If you'd like to support the podcast: https://www.patreon.com/chaitimedatascience --- Send in a voice message: https://anchor.fm/chaitimedatascience/message
This has been one of the most enlightening conversations for us about Deep Learning Research. In this Interview, Sanyam Bhutani interviews Tim Dettmers, a Ph.D. student at the University of Washington. They talk about deep learning research, pipelines for research, and Tim's paper Sparse Networks from Scratch: Faster Training without Losing Performance. They also talk about hardware and Tim's Kaggle experience. Follow: Tim Dettmers: Twitter, Blog Sanyam Bhutani: Twitter, Blog About: http://chaitimedatascience.com/ A show for Interviews with Practitioners, Kagglers & Researchers and all things Data Science hosted by Sanyam Bhutani. You can expect weekly episodes available as Video, Podcast, and blogposts. If you'd like to support the podcast: https://www.patreon.com/chaitimedatascience --- Send in a voice message: https://anchor.fm/chaitimedatascience/message
I caught up with Hilary Mason, GM of Machine Learning at Cloudera and former founder of Fast Forward Labs. We cover how to: - Generate ideas for Machine Learning Research - Hold a good brainstorm - Break into tech as a Data Scientist - Give an engaging and successful tech talk I also asked Hilary about the evolving role of Machine Learning and Data Science, and what she wants to build in the future!
Get my 5 Tips To Address Implicit Bias Within Ourselves and Others About Lindsay Page, Ph.D. Lindsay Page is an assistant professor of research methodology at the University of Pittsburgh School of Education, a research scientist at Pitt’s Learning Research and Development Center, and a faculty research fellow of the National Bureau of Economic Research. Her work focuses on quantitative methods and their application to questions regarding the effectiveness of educational policies and programs across the pre-school to postsecondary spectrum. Much of her recent work has involved the implementation of large-scale randomized trials to investigate innovative strategies for improving students’ transition to and through college. She holds a doctorate in quantitative policy analysis and master's degrees in statistics and in education policy from Harvard University. Show Highlights A pilot program to combat chronic absenteeism Using a text-messaging service to communicate with families Organization and logistics of the text-messaging communication Multiple benefits to the two-way texting system Results and feedback from the pilot program Sending communication for multi-lingual families The impact of the pilot program on chronic absenteeism Establishing the definition of chronic absenteeism Getting a two-way text-messaging service established Connect with Lindsay Twitter: @linzcpage Email: lpage@pitt.edu Additional Resources Connect-Text: Leveraging Text-Message Communication to Mitigate Chronic Absenteeism and Improve Parental Engagement in the Earliest Years of Schooling Connect with me on Twitter @sheldoneakins www.sheldoneakins.com
Mrs. Gilpin gave a presentation at the Virginia State Reading Association (VSRA) annual conference in Norfolk, Va. in March. Her presentation revolved around her recently authored book, Teaching Students to Conduct Short Research Projects. Listen in to the podcast above as Mrs. Gilpin walks you through the five major steps in the research process with examples from her own classroom experience. Mrs. Gilpin would like to thank former Powhatan teacher and national consultant, Mrs. Laura Robb, for her amazing help and guidance with this endeavor. Read more: http://powhatanschool.org/pulse/food-for-thought/gilpin-lifelong-learning-research-book/
Dr. Megan Sumeracki teaches us about Spacing, Retrieval, Interleaving, Elaboration, Dual-coding, and Concrete Examples. You will not want to miss this episode covering the evidence-based learning strategies that are proven to be the most effective study methods. On today’s show, Dr. Megan Sumeracki will teach us about Spacing, Retrieval, Interleaving, Elaboration, Dual-coding, and Concrete Examples. Using these methods is a proven way to make your study time more efficient. Intro 2:16 The Learning Scientists Podcast & Bi-directional Communication 4:13 The Importance of Learning Research as open-access 5:27 The Six Strategies of Effective Learning from the Institute of Education Sciences (2007?) & The National Council on Teacher Quality Report (2016?) 7:45 A Brief History of Spacing and Retrieval Practice, Ebbinghaus, and Memory Accessibility 9:57 What is Spacing and How do We Use it? 12:17 The Ins and Outs of Retrieval Practice 14:47 Interleaving Your Study Practice (19:10 Taylor and Rohrer, 2010, blocking v. interleaving) 21:20 Elaborative Interrogation: The How, When, Why, and Where of Your Study Material 24:40 Dual Coding: Combining Visual information with your Learning! 28:05 Concrete Examples: The More the Merrier 32:54 Closing Advice from Dr. Sumeracki: Use These Strategies!!! Also, you can find Dr. Sumeracki’s books on Amazon, research articles, and some great podcasts by the Learning Scientists on Retrieval for medical residents, and an interview with her sister, Dr. Alyssa Smith.
Thank you to Cool Initiatives for sponsoring this week's episode. Check out their competition, and chance to win £10k, here. What's in this episode? Hello everyone, this week we listen into the Secretary of State for Education's plans for Edtech in England, plus the creative brains behind Scratch on the early days of computers, a teacher and the institute of imagination on hands on learning, a new club for edutainment toys, and an insanely enthusiastic digital media FE teacher on collecting masters and delving into research on AI, our emotions and learning. Enjoy! People Sophie Bailey is the Founder and Presenter of The Edtech Podcast | Twitter: @podcastedtech Damian Hinds, Secretary of State for Education and MP for East Hampshire | Twitter: @DamianHinds Mitchel Resnick, Professor of Learning Research at MIT Media Lab, author of Lifelong Kindergarten, and founder of the Scratch Team | Twitter: @mres Shafina Vohra, Teacher of A Level Psychology at the London Design & Engineering UTC | Twitter: @ShafinaVohra Annie Duffield, Head of Marketing and Communications, Institute of Imagination | Twitter: @annieduffs Ben Callicot, Founder, Toy Pioneers Club Ian Hurd, Media & Creative Arts Teacher Show Notes and References Check out https://theedtechpodcast.com/edtechpodcast for the full show notes. Tell us your story We'd love to hear your thoughts. Record a quick free voicemail via speakpipe for inclusion in the next episode. Or you can post your thoughts or follow-on links via twitter @podcastedtech or via The Edtech Podcast Facebook page or Instagram.
The T4L team step aside to make way for student interviewers Declan and Bhagyada from Greystanes High School, as they spend the afternoon with one of the creators of Scratch, Professor Mitchel Resnick! Professor Resnick is LEGO Papert Professor of Learning Research, Director of the Okawa Center, and Director of the Lifelong Kindergarten group at the MIT Media Lab. A conversation not to be missed!
Today, Melanie brings you another great interview from her time at Deep Learning Indaba in South Africa. She was joined by Yabebal Fantaye and Jessica Phalafala for an in-depth look at the deep learning research that’s going on in the continent. At the African Institute for Mathematical Sciences, the aim is to gather together minds from all over Africa and the world to not only learn but to use their distinct perspectives to contribute to research that furthers the sciences. Our guests are both part of this initiative, using their specialized skills to expand the abilities of the group and stretch the boundaries of machine learning, mathematics, and other sciences. Yabebal elaborates on the importance of AIMS and Deep Learning Indaba, noting that the more people can connect with each other, the more confidence they will gain. Jessica points out how this research in Africa can do more than just advance science. By focusing on African problems and solutions, machine learning research can help increase the GDP and economic standards of a continent thought to be “behind”. Jessica Phalafala Jessica Phalafala is a PhD Applied Mathematics student at Stellenbosch University and currently affiliated with the African Institute for Mathematical Sciences. In her mid-twenties, she finds herself with four qualifications all obtained with distinction, including a Master of Science in Pure Mathematics degree from the University of the Witwatersrand. Jessica is interested in using her functional analysis background together with a number of newly developed skills to contribute towards developing rigorous mathematical theory to support some existing deep learning methods and algorithms for her PhD research. Outside of research she takes great interest in fast-tracking the level of accessibility of higher education in South Africa as co-founder of the Sego Sa Lesedi Foundation, a platform created to inform underprivileged high school learners of career and funding opportunities in science as well as provide them with mentorship as they transition into undergraduate studies. Yabebal Fantaye Dr. Fantaye is an AIMS-ARETE Research Chair based in South Africa. His research is in applying artificial intelligence and advanced statistical methods to cosmological data sets in order to understand the nature of the universe and to satellite images of the Earth in order to find alternative ways to monitor African development progress. Dr. Fantaye is a fellow of the World Economic Forum Young Scientists community, and a fellow and a Chair of the Next Einstein Forum Community of Scientists. 
Cool things of the week A Kubernetes FAQ for the C-suite blog BigQuery and surrogate keys: a practical approach blog Adding custom intelligence to Gmail with serverless on GCP blog Announcing Cloud Tasks, a task queue service for App Engine flex and second generation runtimes blog Unity and DeepMind partner to advance AI research blog Interview African Institute for Mathematical Sciences site Provable approximation properties for deep neural networks research Next Einstein Initiative site Square Kilometer Array (SKA) site University of the Witwatersrand site Council of Scientific and Industrial Research (CSIR) site South African National Space Agency (SANSA) site National Astrophysics and Space Science Programme (NASSP) site IndabaX site Coursera site Andrej Karpathy research Andrej Karpathy Blog blog Question of the week If I’m using the Cluster Autoscaler for Kubernetes (or GKE), how can I prevent it from removing specific nodes from the cluster when scaling down? How can I prevent Cluster Autoscaler from scaling down a particular node? github What types of pods can prevent CA from removing a node? github Where can you find us next? Mark will definitely be at Kubecon in December and will probably be at Unite L.A. this month. Melanie is speaking at Monktoberfest Oct 4th in Portland, Maine and will be at CAMLIS the following week.
SuperCreativity Podcast with James Taylor | Creativity, Innovation and Inspiring Ideas
Mitch Resnick, Professor of Learning Research at the MIT Media Lab, develops new technologies and activities to engage children in creative learning experiences. He was centrally involved in the development of the Scratch programming language, the LEGO Mindstorms robotics kits, and the Computer Clubhouse network of after-school learning centers. He has been awarded with the […] The post CL187: Cultivating Creating In Children And Adults – Interview with Mitchel Resnick appeared first on James Taylor.
Established in 2008 as a partnership between the City of Pittsburgh, Pittsburgh Public Schools, and the University of Pittsburgh Medical Center, The Pittsburgh Promise has since provided more than $120 million in college scholarships to eligible graduates of Pittsburgh's public schools. But is it working? On this episode of Research Minutes, University of Pittsburgh School of Education researcher Lindsay Page speaks with CPRE researcher Robert Nathenson (University of Pennsylvania) about her recent study of the Pittsburgh Promise and its impacts on college enrollment and persistence. Her study, The Promise of Place-Based Investment in College Access and Success: Investigating the Impact of the Pittsburgh Promise, was co-authored by Jennifer Iriti, Danielle Lowry, and Aaron Anthony (University of Pittsburgh). It was published in Education Finance and Policy in 2018. Lindsay Page is an assistant professor of research methodology at the School of Education and a research scientist at the Learning Research and Development Center at the University of Pittsburgh. Her work focuses on quantitative methods and their application to questions regarding the effectiveness of educational policies and programs across the pre-school to postsecondary spectrum. --- To learn more about The University of Pennsylvania's CPRE Knowledge Hub, visit http://www.cprehub.org/ or follow @cprehub.
Bryan Catanzaro, the VP Applied Deep Learning Research at NVIDIA, joins Mark and Melanie this week to discuss how his team uses applied deep learning to make NVIDIA products and processes better. We talk about parallel processing and compute with GPUs as well as his team's research in graphics, text and audio to change how these forms of communication are created and rendered by using deep learning. This week we are also joined by a special co-host, Sherol Chen, who is a developer advocate on GCP and machine learning researcher on Magenta at Google. Listen at the end of the podcast where Mark and Sherol chat about all things GDC. Bryan Catanzaro Bryan Catanzaro is VP of Applied Deep Learning Research at NVIDIA, where he leads a team solving problems in domains ranging from video games to chip design using deep learning. Bryan earned his PhD from Berkeley, where he focused on parallel computing, machine learning, and programming models. He earned his MS and BS from Brigham Young University, where he worked on higher radix floating-point representations for FPGAs. Bryan worked at Baidu to create next generation systems for training and deploying deep learning models for speech recognition. Before that, he was a researcher at NVIDIA, where he worked on programming models for parallel processors, as well as libraries for deep learning, which culminated in the creation of the widely used CUDNN library. Cool things of the week NVIDIA Tesla V100s coming to Google Cloud site Automatic Serverless Deployment with Cloud Source Repositories blog Magenta site NSynth Super site MusicVAE site Making music using new sounds generated with machine learning blog Building Blocks of Interpretability blog Interview NVIDIA site NVIDIA GPU Technology Conference (GTC) site CUDA site cuDNN site NVIDIA Volta site NVIDIA Tesla P4 docs NVIDIA Tesla V100s site Silicon Valley AI Lab Baidu Research site ICML: International Conference on Machine Learning site CVPR: Computer Vision and Pattern Recognition Conference site Referenced Papers & Research: Deep learning with COTS HPC System paper Building High-level Features Using Large Scale Unsupervised Learning paper OpenAI Learning to Generate Reviews and Discovering Sentiment paper Progressive Growing of GANs for Improved Quality, Stability, and Variation paper and CelebA dataset High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs paper Deep Image Prior site How a Japanese cucumber farmer is using deep learning and TensorFlow blog Sample Talks: Future of AI Hardware Panel video High Performance Computing is Supercharging AI blog/video AI Podcast: Where is Deep Learning Going Next? blog/video Sample Resources: Coursera How Google does Machine Learning site NVIDIA Deep Learning Institute site Udacity AI Nanodegree site Kaggle site TensorFlow site PyTorch site Keras site Question of the week What to watch out for and get involved in at the Game Developers Conference (GDC) this year and in the future? 
International Game Developers Association (IGDA) site Fellowship of GDC Parties site ALtCtrlGDC site Experimental Gameplay Workshop site Women in Games International (WIGI) site Blacks in Gaming (BIG) site Serious Games (SIGs) site What's New in Firebase and Google Cloud Platform for Games site Summits to Checkout: AI Game Developers Summit site Game Narrative Summit site Independent Games Summit site Additional Advice: The first two days are summits which are great because topic focused Expo floor takes a good hour to get through WIGI, BIG and SIGs (Google and Microsoft) have the best food GDC is composed of various communities Bring business cards Check out post-mortems Favorite Games: Mass Effect site Final Fantasy wiki Games Mark & Sherol are currently playing: Hearthstone site Dragon Age Origins wiki Where can you find us next? Mark and Sherol are at the Game Developers Conference (GDC). You can find them via the Google at GDC 2018 site. Sherol will be at TensorFlow Dev Summit speaking about machine learning research and creativity next week.
Podcart — People need fresh, healthy food. But growing fresh food needs a reliable source of water in a water scarce country. The Amanzi for Food project – funded by the Water Research Commission (WRC) and led and implemented by the Environmental Learning Research Centre at Rhodes – brings together farmers, extension services, local economic development, agricultural training institutions and agricultural NGOs. They share knowledge and skills around harvesting, storing, and using rainwater to improve food production and to make farming as sustainable as possible. Find out how it works in this first edition of ‘Engagement in Action’ and how it benefits every single person involved – from subsistence farmers to academics. Amanzi for Food webpage
Kristen DiCerbo is breaking new ground in how games in the classroom can accelerate learning of all kinds. As Vice President of Learning Research and Design at Pearson, she’s on the forefront of defining and predicting what skills we will need for the future, such as how we can develop soft skills like creativity, collaboration, and critical thinking. For further information about the podcast and all the related links, visit www.aheadofourtime.com/learn-like-its-2030/.
This episode features an interview with Rigel Smiroldo recorded at NIPS 2017 in Long Beach California. We discuss data privacy, machine learning use cases, model deployment, and end-to-end machine learning.
Bryan Catanzaro, vice president for applied deep learning research at NVIDIA, talks about how we know an AI technology is working, the potential for AI-powered speech, and where we’ll see the next deep learning breakthroughs.
Mitchel Resnick (or Mitch, for short) knows his making—from a lot of different angles. And he’s not too bought into the whole “electronics and gadgets” side of the maker movement. Resnick has been in this business for more than 30 years, and it’s safe to say that he’s seen the maker movement—and the state of STEM education, in general—go through its phases, its ups and downs. He’s currently the LEGO Papert Professor of Learning Research and head of the Lifelong Kindergarten group at the MIT Media Lab, where he and his team have developed products familiar to many a science educator: the "programmable brick" technology that inspired the LEGO Mindstorms robotics kit, and Scratch, an online computing environment for students to learn about computer science. Is making something that every school should be doing—and are all interpretations of “making” of equitable value? EdSurge sat down with Resnick in his office at the MIT Media Lab to learn more, and to find out how he and his team are working to bring more creativity into the learning process.
GAMES IN EDUCATION - PEARSON RESEARCH: Dr. Kristen DiCerbo, who leads Pearson's Center for Learning Research and Technology, is our guest
Connecting Advances in Learning Research and Teacher Practice: A Conference about Teacher Education Welcome and Conference Overview by Susan Fuhrman, President, Teachers College, Columbia University Conference Framing by Deborah Loewenberg Ball, William H. Payne Collegiate Chair; Arthur F. Thurnau Professor; Dean of the School of Education, University of Michigan
Erin Reilly is Research Director for Project New Media Literacies, a past CMS project now housed at the University of Southern California. Karen Schrier, a CMS grad, is the Director of Interactive Media and Technology at ESI Design and a part-time doctoral student at Columbia University in games and learning. Sangita Shresthova is a Czech/Nepali international development specialist, filmmaker, media scholar, and dancer, who currently manages Henry Jenkins new project on participatory culture and civic engagement at USC. Pilar Lacasa is a researcher at Alcalá University in Spain. She also works on a project for Electronic Arts in Spain about how to use commercial games in education. Mitch Resnick is Professor of Learning Research at the MIT Media Laboratory. He develops new technologies that engage children in creative learning experiences and is a principal investigator with the MIT Center for Future Civic Media, a CMS-partnered project.
Mitch Resnick, Professor of Learning Research at the MIT Media Lab and head of the Scratch development team, introduces us to Scratch, a programming language that makes it easy to create interactive stories, animations, games, music, and art and share them on the Web. Scratch is designed to enhance the technological fluency of young people, helping them learn to express themselves creatively with new technologies.