Here in Episode #31, podcast host Dr. Jerry Workman speaks with Dr. Barry M. Wise, Founder and President of Eigenvector Research, Inc., about the meaning of the terms chemometrics, artificial intelligence (AI), machine learning (ML), and neural networks (NNs) within the context of analytical chemistry and process analysis.
1. 4.4% of the US federal budget went into the space race at its peak. This was surprising to me, until a friend pointed out that landing rockets on specific parts of the moon requires very similar technology to landing rockets on Soviet cities.[1] I wonder how much more enthusiastic the scientists working on Apollo were, with the convenient motivating story of "I'm working towards a great scientific endeavor" vs "I'm working to make sure we can kill millions if we want to". 2. The field of alignment seems to be increasingly dominated by interpretability (and obedience[2]). This was surprising to me[3], until a friend pointed out that partially opening the black box of NNs is the kind of technology that would help scaling labs find new unhobblings, by noticing ways in which the internals of their models are being inefficient and having better tools to evaluate capabilities advances.[4] I [...] --- Outline: (00:03) 1. (00:35) 2. (01:20) 3. The original text contained 6 footnotes which were omitted from this narration. --- First published: October 21st, 2024 Source: https://www.lesswrong.com/posts/h4wXMXneTPDEjJ7nv/a-rocket-interpretability-analogy --- Narrated by TYPE III AUDIO.
In this special episode on Low and No Calorie Sweeteners, our host, Dr. Neil Skolnik, will take a deep dive into the randomized evidence about the efficacy of Low and No Calorie Sweeteners. Also discussed will be observational trials on NNS, as well as safety. Finally, there will be a discussion of the World Health Organization recommendations on non-nutritive sweeteners, as well as practical clinical insights. This special episode is supported by an independent educational grant from Heartland Food Group, the maker of the Splenda Group of Products. For more information, go to: www.splenda.com. Presented by: Neil Skolnik, M.D., Professor of Family and Community Medicine, Sidney Kimmel Medical College, Thomas Jefferson University; Associate Director, Family Medicine Residency Program, Abington Jefferson Health; and Kathrine Appleton, PhD, Professor of Psychology, Bournemouth University. Articles discussed: The effects of low-calorie sweeteners on energy intake and body weight: a systematic review and meta-analyses of sustained intervention studies. Int J Obes (Lond). 2021 Mar;45(3):464-478. Use of non-sugar sweeteners: WHO guideline. Non-nutritive sweeteners and their impacts on the gut microbiome and host physiology. Front Nutr. 2022;9:988144.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs, published by Jan Wehner on July 14, 2024 on The AI Alignment Forum. Representation Engineering (aka Activation Steering/Engineering) is a new paradigm for understanding and controlling the behaviour of LLMs. Instead of changing the prompt or weights of the LLM, it does this by directly intervening on the activations of the network during a forward pass. Furthermore, it improves our ability to interpret representations within networks and to detect the formation and use of concepts during inference. This post serves as an introduction to Representation Engineering (RE). We explain the core techniques, survey the literature, contrast RE to related techniques, hypothesise why it works, argue how it's helpful for AI Safety and lay out some research frontiers. Disclaimer: I am no expert in the area, claims are based on a ~3 weeks Deep Dive into the topic. What is Representation Engineering? Goals Representation Engineering is a set of methods to understand and control the behaviour of LLMs. This is done by first identifying a linear direction in the activations that are related to a specific concept [3], a type of behaviour, or a function, which we call the concept vector. During the forward pass, the similarity of activations to the concept vector can help to detect the presence or absence of this concept direction. Furthermore, the concept vector can be added to the activations during the forward pass to steer the behaviour of the LLM towards the concept direction. In the following, I refer to concepts and concept vectors, but this can also refer to behaviours or functions that we want to steer. This presents a new approach for interpreting NNs on the level of internal representations, instead of studying outputs or mechanistically analysing the network. This top-down frame of analysis might pose a solution to problems such as detecting deceptive behaviour or identifying harmful representations without a need to mechanistically understand the model in terms of low-level circuits [4, 24]. For example, RE has been used as a lie detector [6] and for detecting jailbreak attacks [11]. Furthermore, it offers a novel way to control the behaviour of LLMs. Whereas current approaches for aligning LLM behaviour control the weights (fine-tuning) or the inputs (prompting), Representation Engineering directly intervenes on the activations during the forward pass allowing for efficient and fine-grained control. This is broadly applicable for example for reducing sycophancy [17] or aligning LLMs with human preferences [19]. This method operates at the level of representations. This refers to the vectors in the activations of an LLM that are associated with a concept, behaviour or task. Golechha and Dao [24] as well as Zou et al. [4] argue that interpreting representations is a more effective paradigm for understanding and aligning LLMs than the circuit-level analysis popular in Mechanistic Interpretability (MI). This is because MI might not be scalable for understanding large, complex systems, while RE allows the study of emergent structures in LLMs that can be distributed. Method Methods for Representation Engineering have two important parts, Reading and Steering. 
Representation Reading derives a vector from the activations that captures how the model represents human-aligned concepts like honesty, and Representation Steering changes the activations with that vector to suppress or promote that concept in the outputs. For Representation Reading, one needs to design inputs, read out the activations, and derive the vector representing a concept of interest from those activations. First, one devises inputs that contrast with each other with respect to the concept. For example, the prompts might encourage the m...
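The reading/steering recipe described above lends itself to a short illustration. The sketch below is not the post's or any RE paper's code; it uses a small stand-in PyTorch network in place of an LLM, and the names and numbers (the contrastive batches, `concept_vector`, `alpha`, the layer choice) are illustrative assumptions. It only shows the two moves the description lists: derive a concept direction as a mean difference over contrastive inputs, then add a multiple of that direction during the forward pass via a hook.

```python
# Minimal sketch of representation reading and steering on a toy network.
# NOT the paper's code; a real setup would hook a transformer layer of an LLM
# instead of this stand-in MLP.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for an LLM: a small MLP whose middle layer plays the role of the
# residual-stream activations we want to read from and steer.
model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),   # <- layer we treat as "the activations"
    nn.Linear(64, 8),
)
target_layer = model[3]  # ReLU after the second Linear

# --- Representation Reading ---
# Capture activations at the target layer with a forward hook.
captured = {}
def save_hook(module, inputs, output):
    captured["acts"] = output.detach()

handle = target_layer.register_forward_hook(save_hook)

# Contrastive inputs: in practice these would be prompt pairs that differ only
# in the concept of interest (e.g. honest vs dishonest completions).
pos_inputs = torch.randn(32, 16) + 1.0   # stand-in "concept present" batch
neg_inputs = torch.randn(32, 16) - 1.0   # stand-in "concept absent" batch

model(pos_inputs); pos_acts = captured["acts"]
model(neg_inputs); neg_acts = captured["acts"]
handle.remove()

# Mean-difference reading: the concept vector is the difference of means.
concept_vector = pos_acts.mean(0) - neg_acts.mean(0)
concept_vector = concept_vector / concept_vector.norm()

# Detection: project new activations onto the concept direction.
score = pos_acts @ concept_vector  # higher -> concept direction more present

# --- Representation Steering ---
# Add alpha * concept_vector to the activations during the forward pass.
alpha = 4.0
def steer_hook(module, inputs, output):
    return output + alpha * concept_vector

steer_handle = target_layer.register_forward_hook(steer_hook)
steered_out = model(neg_inputs)   # outputs now pushed toward the concept
steer_handle.remove()
```

In the literature the reading step is often done with PCA or probing over contrastive prompt pairs at a chosen transformer layer, and the steering coefficient is tuned per layer; the toy above only shows the overall shape.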
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Interim research report] Activation plateaus & sensitive directions in GPT2, published by Stefan Heimersheim on July 5, 2024 on The AI Alignment Forum. This part-report / part-proposal describes ongoing research, but I'd like to share early results for feedback. I am especially interested in any comments finding mistakes or trivial explanations for these results. I will work on this proposal with a LASR Labs team over the next 3 months. If you are working (or want to work) on something similar, I would love to chat! Experiments and write-up by Stefan, with substantial inspiration and advice from Jake (who doesn't necessarily endorse every sloppy statement I write). Work produced at Apollo Research. TL;DR: Toy models of how neural networks compute new features in superposition seem to imply that neural networks that utilize superposition require some form of error correction to avoid interference spiraling out of control. This means small variations along a feature direction shouldn't affect model outputs, which I can test: 1. Activation plateaus: Real activations should be resistant to small perturbations. There should be a "plateau" in the output as a function of perturbation size. 2. Sensitive directions: Perturbations towards the direction of a feature should change the model output earlier (at a lower perturbation size) than perturbations into a random direction. I find that both of these predictions hold; the latter when I operationalize "feature" as the difference between two real model activations. As next steps, we are planning to test both predictions for SAE features: We have some evidence for the latter by Gurnee (2024) and Lindsey (2024). Are there different types of SAE features, atomic and composite features? Can we get a handle on the total number of features? If sensitivity-features line up with SAE features, can we find or improve SAE feature directions by finding local optima in sensitivity (similar to how Mack & Turner (2024) find steering vectors)? My motivation for this project is to get data on computation in superposition, and to get dataset-independent evidence for (SAE-)features. Core results & discussion I run two different experiments that test the error correction hypothesis: 1. Activation Plateaus: A real activation is the center of a plateau, in the sense that perturbing the activation affects the model output less than expected. Concretely: applying random-direction perturbations to an activation generated from a random openwebtext input ("real activation") has less effect than applying the same perturbations to a random activation (generated from a Normal distribution). This effect on the model can be measured in KL divergence of logits (shown below) but also L2 difference or cosine similarity of late-layer activations. 2. Sensitive directions: Perturbing a (real) activation into a direction towards another real activation ("poor man's feature directions") affects the model outputs more than perturbing the same activation into a random direction. In the plot below, focus on the size of the "plateau" on the left-hand side. 1. Naive random direction vs mean & covariance-adjusted random: Naive isotropic random directions are much less sensitive. Thus we use mean & covariance-adjusted random activations everywhere else in this report. 2. 
The sensitive direction results are related to Gurnee (2024, SAE-replacement-error direction vs naive random direction) and Lindsey (2024, Anthropic April Updates, SAE-feature direction vs naive random direction). The theoretical explanation for activation plateaus & sensitive directions may be error correction (also referred to as noise suppression): NNs in superposition should expect small amounts of noise in feature activations due to interference. (The exact properties depend on how computation happens in superposition, this toy...
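As a rough illustration of the measurement loop described above (not the report's code, which targets GPT-2's residual stream): perturb a hidden activation by a growing amount along either a "poor man's feature direction" (towards another real activation) or a random direction, and track the KL divergence of the output distribution. The toy two-stage network, the layer sizes, and the epsilon schedule below are all assumptions; an untrained toy model will not show the plateau, the point is only the experimental shape.

```python
# Sketch of the perturbation experiment shape: perturb a hidden activation by
# eps along some direction and track KL(original output || perturbed output)
# as a function of eps. A real run would perturb a residual-stream activation
# inside GPT-2.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

encoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU())   # input -> "activation"
head    = nn.Sequential(nn.Linear(128, 128), nn.ReLU(),
                        nn.Linear(128, 50))               # "activation" -> logits

def kl_from_baseline(act, direction, eps):
    """KL divergence between outputs from `act` and from `act + eps*direction`."""
    base_logp = F.log_softmax(head(act), dim=-1)
    pert_logp = F.log_softmax(head(act + eps * direction), dim=-1)
    # KL(base || perturbed), both given as log-probabilities.
    return F.kl_div(pert_logp, base_logp, log_target=True, reduction="batchmean")

x_a, x_b = torch.randn(1, 32), torch.randn(1, 32)
act_a, act_b = encoder(x_a), encoder(x_b)

# "Poor man's feature direction": toward another real activation.
feature_dir = act_b - act_a
feature_dir = feature_dir / feature_dir.norm()

# Baseline: a random direction of the same norm.
random_dir = torch.randn_like(feature_dir)
random_dir = random_dir / random_dir.norm()

for eps in [0.5, 1.0, 2.0, 4.0, 8.0]:
    kl_feat = kl_from_baseline(act_a, feature_dir, eps).item()
    kl_rand = kl_from_baseline(act_a, random_dir, eps).item()
    print(f"eps={eps:4.1f}  KL(feature dir)={kl_feat:.4f}  KL(random dir)={kl_rand:.4f}")
```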
Low Audi0 takes over the airwaves for an hour to plug you with his favorite house and techno tracks from the month. You can expect to hear newly released tracks from artists we know, paired with carefully selected tunes from upcoming producers you might not have heard of. Take a load off and lock in w/ Low Audi0 on the Lowdown!
Tracklist:
1. NNS & Bigredcap - Back Once Again T6
2. Amine Edge & DANCE, HUGEL - Fukinasty
3. Braydon Terzo - RunnIT
4. Low Audi0 - 24 Hours
5. Volac - 4 The Trouble
6. Detlef - MoFo
7. Deeper Purpose & BRN Remix - Turn Off The Lights
8. Max Styler - Lights Out (Extended Mix)
9. *DANCEFLOOR BOMB OF THE MONTH* Galo - Miss Honey
10. Tigerblind - Raided
11. MISS DRE - LIFESTYLE
12. RUMPUS, Haylee Wood - Dance With Me (Avilo Remix) (Club Edit)
13. Wolfstax & XFDS - Get Nasty
14. Disco Service - SPEED IT UP
15. Kolter - Always Talking Shit
16. 4NEY - For You
17. Bondar, Papa Marlin - Supa Fly
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Visualizing neural network planning, published by Nevan Wichers on May 9, 2024 on The AI Alignment Forum. TLDR We develop a technique to try and detect if a NN is doing planning internally. We apply the decoder to the intermediate representations of the network to see if it's representing the states it's planning through internally. We successfully reveal intermediate states in a simple Game of Life model, but find no evidence of planning in an AlphaZero chess model. We think the idea won't work in its current state for real world NNs because they use higher-level, abstract representations for planning that our current technique cannot decode. Please comment if you have ideas that may work for detecting more abstract ways the NN could be planning. Idea and motivation To make safe ML, it's important to know if the network is performing mesa optimization, and if so, what optimization process it's using. In this post, I'll focus on a particular form of mesa optimization: internal planning. This involves the model searching through possible future states and selecting the ones that best satisfy an internal goal. If the network is doing internal planning, then it's important the goal it's planning for is aligned with human values. An interpretability technique which could identify what states it's searching through would be very useful for safety. If the NN is doing planning it might represent the states it's considering in that plan. For example, if predicting the next move in chess, it may represent possible moves it's considering in its hidden representations. We assume that NN is given the representation of the environment as input and that the first layer of the NN encodes the information into a hidden representation. Then the network has hidden layers and finally a decoder to compute the final output. The encoder and decoder are trained as an autoencoder, so the decoder can reconstruct the environment state from the encoder output. Language models are an example of this where the encoder is the embedding lookup. Our hypothesis is that the NN may use the same representation format for states it's considering in its plan as it does for the encoder's output. Our idea is to apply the decoder to the hidden representations at different layers to decode them. If our hypothesis is correct, this will recover the states it considers in its plan. This is similar to the Logit Lens for LLMs, but we're applying it here to investigate mesa-optimization. A potential pitfall is that the NN uses a slightly different representation for the states it considers during planning than for the encoder output. In this case, the decoder won't be able to reconstruct the environment state it's considering very well. To overcome this, we train the decoder to output realistic looking environment states given the hidden representations by training it like the generator in a GAN. Note that the decoder isn't trained on ground truth environment states, because we don't know which states the NN is considering in its plan. Game of Life proof of concept (code) We consider an NN trained to predict the number of living cells after the Nth time step of the Game of Life (GoL). We chose the GoL because it has simple rules, and the NN will probably have to predict the intermediate states to get the final cell count. 
This NN won't do planning, but it may represent the intermediate states of the GoL in its hidden states. We use an LSTM architecture with an encoder to encode the initial GoL state, and a "count cells NN" to output the number of living cells after the final LSTM output. Note that training the NN to predict the number of alive cells at the final state makes this more difficult for our method than training the network to predict the final state since it's less obvious that the network will predict t...
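A minimal sketch of the decoding idea, under the assumption (mine, not the authors') that the model is a simple encoder → recurrent core → decoder stack: run the core for several steps and apply the decoder to every intermediate hidden state, logit-lens style, to see whether the hidden states resemble intermediate board states. The module choices (`GRUCell`, an 8x8 board, the layer sizes) are placeholders.

```python
# Logit-lens-style sketch (not the authors' code): apply the decoder to the
# hidden state after every recurrent step to check whether the network is
# representing intermediate environment states internally.
import torch
import torch.nn as nn

torch.manual_seed(0)
GRID = 8 * 8  # flattened Game-of-Life board (hypothetical size)

encoder = nn.Linear(GRID, 128)
core    = nn.GRUCell(128, 128)   # stand-in for the recurrent hidden layers
decoder = nn.Linear(128, GRID)   # in the real setup, trained jointly with the
                                 # encoder as an autoencoder over board states

def decode_intermediate_states(x0, n_steps):
    """Run the recurrent core and decode the hidden state after each step."""
    h = torch.zeros(x0.shape[0], 128)
    z = torch.relu(encoder(x0))
    decoded = []
    for _ in range(n_steps):
        h = core(z, h)
        # If the network reuses the encoder's representation format internally,
        # the decoder should reconstruct the board state it is "thinking about".
        decoded.append(torch.sigmoid(decoder(h)))
    return decoded  # list of (batch, GRID) soft board reconstructions

boards = (torch.rand(4, GRID) > 0.5).float()
states = decode_intermediate_states(boards, n_steps=5)
print([s.shape for s in states])
```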
Dr. Paul Lessard and his collaborators have written a paper on "Categorical Deep Learning and Algebraic Theory of Architectures". They aim to make neural networks more interpretable, composable and amenable to formal reasoning. The key is mathematical abstraction, as exemplified by category theory - using monads to develop a more principled, algebraic approach to structuring neural networks. We also discussed the limitations of current neural network architectures in terms of their ability to generalise and reason in a human-like way. In particular, the inability of neural networks to do unbounded computation equivalent to a Turing machine. Paul expressed optimism that this is not a fundamental limitation, but an artefact of current architectures and training procedures. The power of abstraction - allowing us to focus on the essential structure while ignoring extraneous details. This can make certain problems more tractable to reason about. Paul sees category theory as providing a powerful "Lego set" for productively thinking about many practical problems. Towards the end, Paul gave an accessible introduction to some core concepts in category theory like categories, morphisms, functors, monads etc. We explained how these abstract constructs can capture essential patterns that arise across different domains of mathematics. Paul is optimistic about the potential of category theory and related mathematical abstractions to put AI and neural networks on a more robust conceptual foundation to enable interpretability and reasoning. However, significant theoretical and engineering challenges remain in realising this vision. Please support us on Patreon. We are entirely funded from Patreon donations right now. https://patreon.com/mlst If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail Links: Categorical Deep Learning: An Algebraic Theory of Architectures Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković Paper: https://categoricaldeeplearning.com/ Symbolica: https://twitter.com/symbolica https://www.symbolica.ai/ Dr. Paul Lessard (Principal Scientist - Symbolica) https://www.linkedin.com/in/paul-roy-lessard/ Interviewer: Dr. Tim Scarfe TOC: 00:00:00 - Intro 00:05:07 - What is the category paper all about 00:07:19 - Composition 00:10:42 - Abstract Algebra 00:23:01 - DSLs for machine learning 00:24:10 - Inscrutibility 00:29:04 - Limitations with current NNs 00:30:41 - Generative code / NNs don't recurse 00:34:34 - NNs are not Turing machines (special edition) 00:53:09 - Abstraction 00:55:11 - Category theory objects 00:58:06 - Cat theory vs number theory 00:59:43 - Data and Code are one in the same 01:08:05 - Syntax and semantics 01:14:32 - Category DL elevator pitch 01:17:05 - Abstraction again 01:20:25 - Lego set for the universe 01:23:04 - Reasoning 01:28:05 - Category theory 101 01:37:42 - Monads 01:45:59 - Where to learn more cat theory
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some costs of superposition, published by Linda Linsefors on March 3, 2024 on The AI Alignment Forum. I don't expect this post to contain anything novel. But from talking to others it seems like some of what I have to say in this post is not widely known, so it seemed worth writing. In this post I'm defining superposition as: A representation with more features than neurons, achieved by encoding the features as almost orthogonal vectors in neuron space. One reason to expect superposition in neural nets (NNs) is that for large n, R^n has many more than n almost orthogonal directions. On the surface, this seems obviously useful for the NN to exploit. However, superposition is not magic. You don't actually get to put in more information; the gain you get from having more feature directions has to be paid for some other way. All the math in this post is very hand-wavey. I expect it to be approximately correct, to one order of magnitude, but not precisely correct. Sparsity One cost of superposition is feature activation sparsity. I.e., even though you get to have many possible features, you only get to have a few of those features simultaneously active. (I think the restriction of sparsity is widely known; I mainly include this section because I'll need the sparsity math for the next section.) In this section we'll assume that each feature of interest is a boolean, i.e. it's either turned on or off. We'll investigate how much we can weaken this assumption in the next section. If you have m features represented by n neurons, with m > n, then you can't have all the features represented by orthogonal vectors. This means that an activation of one feature will cause some noise in the activation of other features. The typical noise on feature f1 caused by 1 unit of activation from feature f2, for any pair of features f1, f2, is (derived from the Johnson-Lindenstrauss lemma) ϵ = √(8 ln(m) / n) [1]. If l features are active then the typical noise level on any other feature will be approximately ϵ√l units. This is because the individual noise terms add up like a random walk. Or see here for an alternative explanation of where the square root comes from. For the signal to be stronger than the noise we need ϵ√l ≲ 1.
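The quoted scaling is easy to check numerically. The sketch below is my own illustration (the values of n and m are arbitrary, not from the post): it samples m random unit directions in R^n, compares the largest pairwise overlap to the √(8 ln(m)/n) ceiling, and shows the interference from l simultaneously active unit-strength features growing like √l, which is the random-walk argument above.

```python
# Numerical sanity check of the hand-wavy superposition math (illustrative
# numbers): m nearly-orthogonal feature directions in R^n, typical interference
# per active feature, and sqrt(l) growth of the total noise.
import numpy as np

rng = np.random.default_rng(0)
n, m = 256, 10_000                       # neurons, features (illustrative only)

# Random unit vectors are "almost orthogonal" for large n.
F = rng.standard_normal((m, n))
F /= np.linalg.norm(F, axis=1, keepdims=True)

# Pairwise overlaps on a subsample (the full m x m matrix would be large).
sample = F[rng.choice(m, 2_000, replace=False)]
overlaps = sample @ sample.T
np.fill_diagonal(overlaps, 0.0)

eps_bound = np.sqrt(8 * np.log(m) / n)   # the epsilon from the post
print(f"max |overlap| in sample = {np.abs(overlaps).max():.3f}  "
      f"(ceiling sqrt(8 ln m / n) = {eps_bound:.3f})")

# Noise on feature 0 when l other features are active with unit activation:
# individual interference terms add like a random walk, so noise ~ sqrt(l).
for l in [1, 10, 100, 1000]:
    active = rng.choice(np.arange(1, m), size=l, replace=False)
    noise = np.abs(F[active].sum(axis=0) @ F[0])
    print(f"l={l:5d}  noise on feature 0 = {noise:.3f}  "
          f"(~ sqrt(l/n) = {np.sqrt(l / n):.3f})")
```

With these illustrative numbers the noise overtakes the 1-unit signal somewhere between l = 100 and l = 1000 active features, which is the sparsity cost the post is describing.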
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Neural uncertainty estimation for alignment, published by Charlie Steiner on December 5, 2023 on The AI Alignment Forum. Introduction Suppose you've built some AI model of human values. You input a situation, and it spits out a goodness rating. You might want to ask: "What are the error bars on this goodness rating?" In addition to it just being nice to know error bars, an uncertainty estimate can also be useful inside the AI: guiding active learning[1], correcting for the optimizer's curse[2], or doing out-of-distribution detection[3]. I recently got into the uncertainty estimation literature for neural networks (NNs) for a pet reason: I think it would be useful for alignment to quantify the domain of validity of an AI's latent features. If we point an AI at some concept in its world-model, optimizing for realizations of that concept can go wrong by pushing that concept outside its domain of validity. But just keep thoughts of alignment in your back pocket for now. This post is primarily a survey of the uncertainty estimation literature, interspersed with my own takes. The Bayesian neural network picture The Bayesian NN picture is the great granddaddy of basically every uncertainty estimation method for NNs, so it's appropriate to start here. The picture is simple. You start with a prior distribution over parameters. Your training data is evidence, and after training on it you get an updated distribution over parameters. Given an input, you calculate a distribution over outputs by propagating the input through the Bayesian neural network. This would all be very proper and irrelevant ("Sure, let me just update my 2-trillion-dimensional joint distribution over all the parameters of the model"), except for the fact that actually training NNs does kind of work this way. If you use a log likelihood loss and L2 regularization, the parameters that minimize loss will be at the peak of the distribution that a Bayesian NN would have, if your prior on the parameters was a Gaussian[4][5]. This is because of a bridge between the loss landscape and parameter uncertainty. Bayes's rule says P(parameters|dataset) = P(parameters) P(dataset|parameters) / P(dataset). Here P(parameters|dataset) is the posterior distribution you want to estimate, and P(parameters) P(dataset|parameters) is the exponential of the loss[6]. This lends itself to physics metaphors like "the distribution of parameters is a Boltzmann distribution sitting at the bottom of the loss basin." Empirically, calculating the uncertainty of a neural net by pretending it's adhering to the Bayesian NN picture works so well that one nice paper on ensemble methods[7] called it "ground truth." Of course to actually compute anything here you have to make approximations, and if you make the quick and dirty approximations (e.g. pretend you can find the shape of the loss basin from the Hessian) you get bad results[8], but people are doing clever things with Monte Carlo methods these days[9], and they find that better approximations to the Bayesian NN calculation get better results. But doing Monte Carlo traversal of the loss landscape is expensive. For a technique to apply at scale, it must impose only a small multiplier on cost to run the model, and if you want it to become ubiquitous the cost it imposes must be truly tiny. Ensembles A quite different approach to uncertainty is ensembles[10]. 
Just train a dozen-ish models, ask them for their recommendations, and estimate uncertainty from the spread. The dozen-times cost multiplier on everything is steep, but if you're querying the model a lot it's cheaper than Monte Carlo estimation of the loss landscape. Ensembling is theoretically straightforward. You don't need to pretend the model is trained to convergence, you don't need to train specifically for predictive loss, you don't even need...
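A minimal deep-ensemble sketch of the "train a dozen-ish models and read uncertainty off the spread" idea (illustrative only; the toy 1-D regression data, the eight members, and the training schedule are my assumptions, not from the post):

```python
# Minimal deep-ensemble sketch: train several small regression nets from
# different initialisations and read uncertainty off the spread of predictions.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

# Toy "goodness rating" data on [-2, 2]; far outside this range the ensemble
# members should disagree, which is exactly the signal we want.
torch.manual_seed(0)
x = torch.linspace(-2, 2, 200).unsqueeze(1)
y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)

ensemble = []
for seed in range(8):                       # dozen-ish members in practice
    torch.manual_seed(seed)
    net = make_net()
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    ensemble.append(net)

x_test = torch.linspace(-4, 4, 9).unsqueeze(1)   # includes out-of-distribution points
with torch.no_grad():
    preds = torch.stack([net(x_test) for net in ensemble])  # (members, points, 1)
mean, std = preds.mean(0).squeeze(), preds.std(0).squeeze()

for xi, mu, sigma in zip(x_test.squeeze().tolist(), mean.tolist(), std.tolist()):
    print(f"x={xi:+.1f}  prediction={mu:+.2f}  ensemble std={sigma:.2f}")
```

The ensemble standard deviation should be small near the training range and grow for the out-of-range inputs, which is the behaviour the post wants from an uncertainty estimate.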
We're short on time this week, so we have an abbreviated episode of NNS for you.00:00:32 Intro00:01:11 What's Your Swill00:01:52 The lack of marketing for The Dragon Prince season 5 is disturbing00:06:50 Depp v Heard trailer reaction00:10:01 Castlevania: Nocturne teaser reaction00:12:45 Jurassic Park and Jurassic Park: The Lost Kingdom mini-review00:19:51 The Witcher mini-reviewOur intro and outro theme song is “Bitter” by Space Weather. Check them out by following their Twitter @SpaceWeatherUS.Join this cool Discord server: https://discord.gg/3MAEu6B73w Check out our Linktree for more places to find us: https://linktr.ee/netflixnswill.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AutoInterpretation Finds Sparse Coding Beats Alternatives, published by Hoagy on July 17, 2023 on The AI Alignment Forum. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort Huge thanks to Logan Riggs, Aidan Ewart, Lee Sharkey, Robert Huben for their work on the sparse coding project, Lee Sharkey and Chris Mathwin for comments on the draft, EleutherAI for compute and OpenAI for GPT-4 credits. Summary We use OpenAI's automatic interpretation protocol to analyse features found by dictionary learning using sparse coding and compare the interpretability scores thereby found to a variety of baselines. We find that for both the residual stream (layer 2) and MLP (layer 1) of Eleuther's Pythia70M, sparse coding learns a set of features that is superior to all tested baselines, even when removing the bias and looking just at the learnt directions. In doing so we provide additional evidence to the hypothesis that NNs should be conceived as using distributed representations to represent linear features which are only weakly anchored to the neuron basis. As before these results are still somewhat preliminary and we hope to expand on them and make them more robust over the coming month or two, but we hope people find them fruitful sources of ideas. If you want to discuss, feel free to message me or head over to our thread in the EleutherAI discord. All code available at the github repo. Methods Sparse Coding The feature dictionaries learned by sparse coding are learnt by simple linear autoencoders with a sparsity penalty on the activations. For more background on the sparse coding approach to feature-finding see the Conjecture interim report that we're building from, or Robert Huben's explainer. Automatic Interpretation As Logan Riggs' recently found, many of the directions found through sparse coding seem highly interpretable, but we wanted a way to quantify this, and make sure that we were detecting a real difference in the level of interpretability. To do this we used the methodology outlined in this OpenAI paper, details can be found in their code base. To quickly summarise, we are analysing features which are defined as scalar-valued functions of the activations of a neural network, limiting ourselves here to features defined on a single layer of a language model. The original paper simply defined features as the activation of individual neurons but we will in general be looking at linear combinations of neurons. We give a feature an interpretability score by first generating a natural language explanation for the feature, which is expected to explain how strongly a feature will be active in a certain context, for example 'the feature activates on legal terminology'. Then, we give this explanation to an LLM and ask it to predict the feature for hundreds of different contexts, so if the tokens are ['the' 'lawyer' 'went' 'to' 'the' 'court'] the predicted activations might be [0, 10, 0, 0, 8]. The score is defined as the correlation between the true and predicted activations. To generate the explanations we follow OpenAI and take a 64-token sentence-fragment from each of the first 50,000 lines of OpenWebText. For each feature, we calculate the average activation and take the 20 fragments with the highest activation. Of these 20, we pass 5 to GPT-4, along with the rescaled per-token activations. 
From these 5 fragments, GPT-4 suggests an explanation for when the neuron fires. GPT-3.5 is then used to simulate the feature, given the explanation, across both another 5 highly activating fragments, and 5 randomly selected fragments (with non-zero variation). The correlation scores are calculated across all 10 fragments ('top-and-random'), as well as for the top and random fragments separately. Comparing Feature Dictionaries We use dictionary learning with a sparsi...
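To make the two ingredients concrete, here is a small illustrative sketch (not the authors' code, which operates on Pythia-70M activations and uses GPT-4/GPT-3.5 for the explanation step): a linear autoencoder with a ReLU encoder and an L1 sparsity penalty trained on stand-in activations, plus the interpretability score computed as the correlation between true and simulated per-token activations. All sizes and the "simulated" activations are placeholders.

```python
# Sketch of (1) a sparse linear autoencoder trained with an L1 penalty and
# (2) the interpretability score as a correlation between true and simulated
# feature activations. Illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, d_dict, l1_coef = 64, 256, 1e-3   # dictionary larger than the layer

encoder = nn.Linear(d_model, d_dict)
decoder = nn.Linear(d_dict, d_model, bias=False)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

acts = torch.randn(4096, d_model)          # stand-in for MLP/residual activations
for step in range(200):
    batch = acts[torch.randint(0, len(acts), (256,))]
    codes = torch.relu(encoder(batch))     # feature activations (kept sparse)
    recon = decoder(codes)
    loss = ((recon - batch) ** 2).mean() + l1_coef * codes.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Interpretability score: correlation between the feature's true activations
# and the activations a language model simulates from an explanation.
def interpretability_score(true_acts, simulated_acts):
    stacked = torch.stack([true_acts, simulated_acts])
    return torch.corrcoef(stacked)[0, 1]

true = torch.relu(encoder(acts[:6]))[:, 0]          # feature 0 on six "tokens"
simulated = true + 0.3 * torch.randn_like(true)     # stand-in for the simulated guesses
print(f"top-and-random style score: {interpretability_score(true, simulated):.2f}")
```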
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Deep learning models might be secretly (almost) linear, published by beren on April 24, 2023 on LessWrong. Crossposted from my personal blog. Epistemic status: Pretty speculative, but there is a surprising amount of circumstantial evidence. I have been increasingly thinking about NN representations and slowly coming to the conclusion that they are (almost) completely secretly linear inside. This means that, theoretically, if we can understand their directions, we can very easily exert very powerful control on the internal representations, as well as compose and reason about them in a straightforward way. Finding linear directions for a given representation would allow us to arbitrarily amplify or remove it and interpolate along it as desired. We could also then directly 'mix' it with other representations as desired. Measuring these directions during inference would let us detect the degree of each feature that the network assigns to a given input. For instance, this might let us create internal 'lie detectors' (which there is some progress towards) which can tell if the model is telling the truth, or being deceptive. While nothing is super definitive (and clearly networks are not 100% linear), I think there is a large amount of fairly compelling circumstantial evidence for this position. Namely: Evidence for this: All the work from way back when about interpolating through VAE/GAN latent space. I.e. in the latent space of a VAE on CelebA there are natural 'directions' for recognizable semantic features like 'wearing sunglasses' and 'long hair' and linear interpolations along these directions produced highly recognizable images Rank 1 or low rank editing techniques such as ROME work so well (not perfectly but pretty well). These are effectively just emphasizing a linear direction in the weights. You can apparently add and mix LoRas and it works about how you would expect. You can merge totally different models. People working with Stable Diffusion community literally additively merge model weights with weighted sum and it works! Logit lens works. SVD directions. Linear probing definitely gives you a fair amount of signal. Linear mode connectivity and git rebasin. Colin Burns' unsupervised linear probing method works even for semantic features like 'truth'. You can merge together different models finetuned from the same initialization. You can do a moving average over model checkpoints and this improves performance and is better than any individual checkpoint! The fact that linear methods work pretty tolerably well in neuroscience. Various naive diff based editing and control techniques work at all. General linear transformations between models and modalities. You can often remove specific concepts from the model by erasing a linear subspace. Task vectors (and in general things like finetune diffs being composable linearly) People keep finding linear representations inside of neural networks when doing interpretability or just randomly. If this is true, then we should be able to achieve quite a high level of control and understanding of NNs solely by straightforward linear methods and interventions. This would mean that deep networks might end up being pretty understandable and controllable artefacts in the near future. 
At this moment, we just have not yet found the right levers (or rather, lots of existing work does show this but hasn't really been normalized or applied at scale for alignment). Linear-ish network representations are a best case scenario for both interpretability and control. For a mechanistic, circuits-level understanding, there is still the problem of superposition of the linear representations. However, if the representations are indeed mostly linear, then once superposition is solved there seem to be few other obstacles in front of a com...
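For concreteness, the kind of linear intervention the post is gesturing at can be written in a few lines (my own illustration, with random stand-in activations and a random "concept" direction): project activations onto a unit direction to measure the concept, subtract the projection to erase it, or add a multiple of it to amplify it.

```python
# Minimal sketch of linear-direction interventions: measure, erase, or amplify
# a concept direction v in a batch of hidden activations. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d = 128
acts = rng.standard_normal((32, d))          # stand-in hidden activations
v = rng.standard_normal(d)
v /= np.linalg.norm(v)                        # unit concept direction

coeffs = acts @ v                             # how much of the concept each row has

erased    = acts - np.outer(coeffs, v)        # remove the linear component
amplified = acts + 2.0 * np.outer(coeffs, v)  # triple the component along v

print("mean |projection| before:", np.abs(coeffs).mean().round(3))
print("mean |projection| after erasure:", np.abs(erased @ v).mean().round(6))
```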
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Should we publish mechanistic interpretability research?, published by Marius Hobbhahn on April 21, 2023 on The AI Alignment Forum. TL;DR: Multiple people have raised concerns about current mechanistic interpretability research having capabilities externalities. We discuss to which extent and which kind of mechanistic interpretability research we should publish. The core question we want to explore with this post is thus to which extent the statement “findings in mechanistic interpretability can increase capabilities faster than alignment” is true and should be a consideration. For example, foundational findings in mechanistic interpretability may lead to a better understanding of NNs which often straightforwardly generates new hypotheses to advance capabilities. We argue that there is no general answer and the publishing decision primarily depends on how easily the work advances alignment in relation to how much it can be used to advance capabilities. We recommend a differential publishing process where work with high capabilities potential is initially only circulated with a small number of trusted people and organizations and work with low capabilities potential is published widely. Related work: A note about differential technological development, Current themes in mechanistic interpretability research, Thoughts on AGI organization and capabilities work, Dan Hendrycks's take, etc. We have talked to lots of people about this question and lost track of who to thank individually. In case you talked to either Marius or Lawrence about this, thank you! The basic case for publishing Let's revisit the basic cases for publishing. Alignment is probably hard. To get even close to a solution, it likely requires many people working together, coordinating their work and patching together different approaches. If the work isn't public, this becomes much harder and thus a solution to alignment becomes less likely. More people can engage with the work and build on it. Especially people who want to get into the field or are less connected might not be able to access documents that are only shared in small circles. It gives more legitimacy to the work of organisations and individuals. For people who are not yet established in the field, publishing their work is the most obvious way to get noticed by an organisation. For academics, publications are the most relevant resource for their careers and organisations can generate more legitimacy by publishing their work (e.g. for grantmakers or other organisations they want to interact with). It is a form of movement building. If work on mechanistic interpretability is regularly shown on ML conferences and is available on arxiv, it is more likely that people outside of the alignment field notice that the field exists and get interested in it. Publication leads to accountability and feedback. If you know you will publish something, you put more effort into explaining it well and ensuring that your findings are robust. Furthermore, it provides a possibility for other researchers to engage with your work and give you feedback for improvement or future research directions. In addition, mechanistic interp seems especially well suited for publication in classic academic venues since it is less speculative than other AI safety work and overlaps with established academic fields. 
Thus, publication seems robustly positive as long as it doesn't advance capabilities more than alignment (which is often hard to predict in advance). The crux of this post, therefore, lies mainly in the possible negative externalities of publications and how they trade off against the alignment benefits. Capabilities externalities of mechanistic interpretability The primary reason to think that mechanistic interpretability has large capabilities externalities is that understanding a system better ma...
I got to sit down with NASCAR driver Jeremy Clements. Jeremy was one of the youngest drivers to ever qualify for a NASCAR Nationwide Series event at age 18, racing for Jerry Young at Pikes Peak on July 26, 2003. He finished the race in 31st after an accident on lap 28 ended his day. Jeremy would return to the NASCAR Nationwide Series in 2007, running 5 races with McGill Motorsports, driving the No. 36 Chevrolet. In 2008, Jeremy ran two races for this family team, driving the No. 50 at Gateway and Homestead. During 2009, he ran the first half of the season driving the No. 50 for his family team and the second half of the season for Johnny Davis in the No. 0. His best finish came at Auto Club Speedway on October 10, 2009, where he qualified 21st and finished in 12th position. Jeremy Clements Racing/Clements Racing Engines was formed in 2010, giving Jeremy the opportunity to run a partial schedule funded by Boudreaux's Butt Paste. Running under Johnny Davis on a very limited budget, the team qualified for 16 races, with Jeremy's then-best NNS career finish of 10th at Gateway International Raceway. In 2011, the decision was made to move Jeremy full time into the NNS and run a full season driving the No. 51 for Jeremy Clements Racing. Although primary sponsorship was not secured, Jeremy and JCR remained competitive in the field, completing the season with 4 top-15 finishes and 7 top-20 finishes. As in 2011, the team would again overcome limited sponsorship support and remain competitive on the track during the 2012 season. In his second full-time season in the NXS, Jeremy would earn 2 top-10 finishes, at the Brickyard's inaugural race (Indy) and at the Monster Mile (Dover). Jeremy completed the 2014 season in 14th place in the Driver Point Standings. In August 2017, in his 256th start, Jeremy recorded his first career NXS win at Road America, topping all the Cup-affiliated teams as well as securing his first NXS playoff appearance. This was quite an accomplishment for a family-owned single-car team that builds its own engines and runs with limited sponsorship. Jeremy currently resides in his hometown of Spartanburg, SC, with his wife Cortney and their dog Mollie. During the off season, Jeremy enjoys traveling and spending time with family and friends. Jeremy finished the 2018 season 15th in the Driver Point Standings. HOST The Motivational Cowboy - Johnny D. (John Dmytryszyn) WEBSITE https://www.MotivationalCowboy.com/podcast/ SOUNDCLOUD PODCAST https://soundcloud.com/outstandinglifepodcast iTUNES APPLE PODCAST https://itunes.apple.com/us/podcast/outstanding-life-with-the-motivational-cowboy/id1410576520?mt=2 SPOTIFY PODCAST https://open.spotify.com/show/4OFNmM9Rv9jNA0gQMPv8XU STITCHER https://www.stitcher.com/s?fid=389557&refid=stpr YOUTUBE https://www.youtube.com/watch?v=tttQkLT7SfE&list=PL1Jmeb31MqLiNLxcnufzmCCca3HGH20Rj&index=2&t=0s SUPPORT with PAYPAL https://www.paypal.me/motivationalcowboy LISTEN for FREE to ‘Outstanding Life' PODCAST with Johnny D. the Motivational Cowboy on iTunes, Spotify, SoundCloud, Stitcher, YouTube & other major platforms and stations. Now with Over 1 Million Listeners! His podcast is the latest in a long list of platforms that allows him to reach people. Among his most notable accomplishments is a 2nd Grammy consideration for his recently released spoken word CD “Time to Stand Out!”. https://www.MotivationalCowboy.com
Patreon: https://www.patreon.com/mlst Discord: https://discord.gg/ESrGqhf5CB Twitter: https://twitter.com/MLStreetTalk Chris Eliasmith is a renowned interdisciplinary researcher, author, and professor at the University of Waterloo, where he holds the prestigious Canada Research Chair in Theoretical Neuroscience. As the Founding Director of the Centre for Theoretical Neuroscience, Eliasmith leads the Computational Neuroscience Research Group in exploring the mysteries of the brain and its complex functions. His groundbreaking work, including the Neural Engineering Framework, Neural Engineering Objects software environment, and the Semantic Pointer Architecture, has led to the development of Spaun, the most advanced functional brain simulation to date. Among his numerous achievements, Eliasmith has received the 2015 NSERC Polanyi Award and authored two influential books, "How to Build a Brain" and "Neural Engineering." Chris' homepage: http://arts.uwaterloo.ca/~celiasmi/ Interviewers: Dr. Tim Scarfe and Dr. Keith Duggar TOC: Intro to Chris [00:00:00] Continuous Representation in Biologically Plausible Neural Networks [00:06:49] Legendre Memory Unit and Spatial Semantic Pointer [00:14:36] Large Contexts and Data in Language Models [00:20:30] Spatial Semantic Pointers and Continuous Representations [00:24:38] Auto Convolution [00:30:12] Abstractions and the Continuity [00:36:33] Compression, Sparsity, and Brain Representations [00:42:52] Continual Learning and Real-World Interactions [00:48:05] Robust Generalization in LLMs and Priors [00:56:11] Chip design [01:00:41] Chomsky + Computational Power of NNs and Recursion [01:04:02] Spiking Neural Networks and Applications [01:13:07] Limits of Empirical Learning [01:22:43] Philosophy of Mind, Consciousness etc [01:25:35] Future of human machine interaction [01:41:28] Future research and advice to young researchers [01:45:06] Refs: http://compneuro.uwaterloo.ca/publications/dumont2023.html http://compneuro.uwaterloo.ca/publications/voelker2019lmu.html http://compneuro.uwaterloo.ca/publications/voelker2018.html http://compneuro.uwaterloo.ca/publications/lu2019.html https://www.youtube.com/watch?v=I5h-xjddzlY
Jesse & Kyle reckon with the Daylight Savings Time change before digging into the details of the Silicon Valley Bank failure and its impact on the crypto space. Then the boys dig deep into AI: how AI can interact with the blockchain, ICP, or even the NNS, and whether this podcast is already being hosted by an AI. Finally, digressions about indoor waterpark vacations, Kyle's campaign for president in 2024, and Last of Us spoilers. Daylight Savings Time Hank Green on the SVB Failure Why Was Signature Bank Really Shut Down? Arweave space #AskNeurotic: Love to know your thoughts on how far away ICP is from having machine learning applied to subnet/canister data Anvil's ChatGPT Tweet Recommendations: Jesse - Her Kyle - Indoor Water Parks Aftershow: Class Action Park
Welcome back to the Vault of Silliness, or should I say Vault of Clippiness? Yes, the clipping at the end of the episodes continues and the powers that be are fighting the good fight to resolve it properly. Buuuuut for now we have agreed to try a little trickery. A bit of video voodoo. Some audio abracadabra. What is it? Well thank you just so darn much for asking! I will add around 30 seconds of silence at the end of each show. The hope is that it will now clip off the added silence and my wonderfully entertaining close to each show will be heard in its entirety. I’m sure some of you are saying, “He should add the silence at the start of his close.” Let us venture back to February 25th, 1996 for a NNS show I have titled: Wolves at the Studio Door. We have three guests, all of whom will be speaking about the wolf. Guest 1 Fred Keating from Loki (prn Lo-keye) Clan Wolf Refuge in Conway, NH – A shelter for wolf hybrids – wolves mixed with a domesticated dog. He provides excellent info on wolves and their abilities, explains all that his Refuge does and what started him on this journey. Guest 2 Paul Saffron from the North American Wolf Foundation and Wolf Hollow in Ipswich, MA. At the time he had seventeen, 100% pure British Columbian Timberwolves – and they were howling in the background! He tells us about the horribly senseless and cruel ways the wolf was treated and exterminated in the past and how governments are trying to right those wrongs. And, of course, he talks about his Foundation, his experiences, and the presentations he and his wife, Joanie, do at Wolf Hollow. Guest 3 Rick McIntyre – Author of ‘War Against the Wolf.’ He adds even more perspective on the anti-wolf mindset of the time and now, thankfully, a change in attitudes and philosophy about the wolf and its importance in the environment. We do take some calls with Rick: Joe from Quincy Paul Saffron returns to speak with Rick Jim in Georgetown John from MI Barbara in Brookline Pauline from Westborough Bruce in Boston And Pat from Plainville Episode 127, Wolves at the Studio Door, howls its way to your ears, now. patreon.com/normnathanvos
Welcome to Episode 51! After a digression about when the one year anniversary of the podcast actually is, Jesse and Kyle look at the new app Juno, try to explain the community fund, then dig into the weeds of the OpenChat SNS offering. Jesse explains the thought process behind his new “Everything Blockchain” video, ckBTC rainbow laser eyes, and the boys wonder if we're the WuTang of ICP podcasts (they're not.) Juno NNS Community Fund OpenChat SNS OpenChat on Spaces with Kyle History of the World Part 2 Openchat SNS Tokenomics Questions about CF/Openchat: Does anyone know where the valuations come from for the SNS sales? Do the teams get to decide themselves? This is the last thing I need to know before committing to the community fund It would be good if you and Jesse could review how the CF works and what it means to participate in the SNS sale via the CF. What are the differences between CF participation and direct participation? Why should someone consider CF participation? What is the benefit of joining the CF over simply buying in the decentralization sale? Is the CF in a beta state with more features to come? Looking for clarity and hope
A NNS from February 17th, 1996 graces the airwaves again with this episode. Because of the subject matter I have titled it: Help Wanted, Guidance Given! We get a nice Show open and Norm teases the upcoming guests. First will be Brandon Toropov – a Job Search Expert – His Book: “303 Off-the-Wall Ways to Get a Job.” He doles out resumé suggestions, advice and a whole lot of other strategies! Later we talk with ‘Mr. Scholarship’ Dan Cassidy – Pres. and founder of the National Scholarship Research Service. Back then, $2.7B in financial aid went unclaimed! Callers for Brandon: Christine from Abington A Caller from Canada Laurie Dave And Fred Before we get to Dan Cassidy, Norm does some sports and talks up Sunday night’s guest, who was to be Rupert Holmes, writer, composer, singer and multi-Tony award winner. His latest project that was airing on AMC was a series titled: Remember WENN Now it’s time for Mr. Scholarship. Dan talks about the many different criteria that can lead you to get a scholarship and provides a ton of info! Some reference books that were mentioned were: For Undergrads…The Scholarship Book, The Worldwide Graduate Scholarship Directory for Masters, PhD and post-doctorate and The Worldwide College Scholarship Directory for foreign students studying in this country or a student from the U.S. looking to study abroad. Callers for Dan: Chris in NH Jeremy from Cambridge Beth in Easton which jumps to Side B Joanne in Boston Virginia from Hyde Park Mary in Marshfield Glen from Danvers Mark in Taunton Carol from Quincy And Dawn in Medford Norm does a quick station ID and we hear Jack Harte with a snowy traffic report sponsored by N.E. Car Stereos Rockin’ President’s Sale! Bonus time: The tape concludes with more Norm, but we walk it back to December 28th, 1995? Callers: Don A Birthday Girl who’s turning 14 Anne from N. Reading And an unnamed caller Jack returns with just a sponsor read for the Christmas Tree Shoppes and we close with Kristin from Dorchester and yet another unnamed caller. Episode 126, “Help Wanted, Guidance Given!” benefits your ears…now. Patreon patreon.com/normnathanvos
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: More findings on Memorization and double descent, published by Marius Hobbhahn on February 1, 2023 on LessWrong. Produced as part of the SERI ML Alignment Theory Scholars Program - Winter 2022 Cohort. I'd like to thank Wes Gurnee, Aryan Bhatt, Eric Purdy and Stefan Heimersheim for discussions and Evan Hubinger, Neel Nanda, Adam Jermyn and Chris Olah for mentorship and feedback. The post contains a lot of figures, so the suggested length is deceiving. Code can be found in these three colab notebooks [1][2][3]. I have split the post into two parts. The first one is concerned with double descent and other general findings in memorization and the second focuses on measuring memorization using the maximum data dimensionality metric. This is the first post in a series of N posts on memorization in transformers. Executive summary I look at a variety of settings and experiments to better understand memorization in toy models. My primary motivation is to increase our general understanding of NNs but I also suspect that understanding memorization better might increase our ability to detect backdoors/trojans. The work heavily builds on two papers by Anthropic, “Toy models of superposition” and “Superposition, Memorization and double descent”. I successfully replicate a subset of their findings. I specifically look at three different setups of NNs that I speculate are most relevant to understanding memorization in the non-attention parts of transformers. Bottlenecks between layers, i.e. when projecting from high-dimensional spaces (e.g. MLPs) into lower dimensions (e.g. the residual stream). This is similar to the setting in the toy models of superposition paper and its sequel. MLP blocks, i.e. when projecting from lower-dimensional spaces (e.g. the residual stream) into higher dimensions with ReLU non-linearities. The final layer, i.e. when projecting from the end of the residual stream into the vocab space. The main difference to the previous scenarios is that we use the cross-entropy loss for the experiments which has a different inductive bias than the MSE loss. I'm able to find the double descent phenomenon in all three settings. My takeaway from this is that the transition between memorization and learning general features seems to be a very regular and predictable phenomenon (assuming you know the sparsity and number of features of your network). Furthermore, it seems like the network is “confused” (e.g. has much higher test loss) when it is right between memorization and generalization. I test the limits of reconstruction in different settings, i.e. the ability of the neural network to reconstruct its inputs given different dataset sizes, hidden sizes, number of features, importance distributions and sparsities. The findings mostly confirm what we would predict, e.g. more sparsity or larger hidden sizes lead to better reconstructions. A speculative claim is that if we had better measures of sparsity and importance in real-world models, we might be able to derive scaling laws that could tell us how many “concepts” a network has learned. Interpreting NNs that memorized in the simplest settings is extremely straightforward--the network literally creates a dictionary that you can just read off the weights. 
However, even small increases in complexity make this dictionary much harder to read and I have not yet found a method to decompile it into a human-readable form (maybe in the next posts). Isolated components In the following, we isolate three settings that seem like important components of memorization. They are supposed to model the non-attention parts of a transformer (primarily because I speculate that memorization mostly happens in the non-attention parts). Bottleneck By bottleneck we mean a situation in which a model projects from many into fewer dimensions, e.g. fro...
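For readers who want the shape of that first (bottleneck) setup in code, here is a minimal sketch assuming PyTorch; the hyperparameters, sparsity level, and importance weighting below are illustrative placeholders of mine rather than the post's actual settings. The point is only the structure: sparse, importance-weighted inputs reconstructed through a low-dimensional hidden layer from a fixed, finite dataset.

```python
# Bottleneck toy model in the style of "Toy Models of Superposition" (illustrative settings).
import torch

n_features, n_hidden, n_points = 64, 16, 1000
sparsity = 0.95                                        # probability that a given feature is zero
importance = 0.9 ** torch.arange(n_features).float()   # geometrically decaying feature importance

# Fixed, finite training set: memorization vs. generalization depends on n_points.
x = torch.rand(n_points, n_features) * (torch.rand(n_points, n_features) > sparsity).float()

W = torch.nn.Parameter(0.1 * torch.randn(n_hidden, n_features))
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(5_000):
    h = x @ W.T                          # project into the low-dimensional bottleneck
    x_hat = torch.relu(h @ W + b)        # reconstruct the input from the bottleneck
    loss = (importance * (x - x_hat) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Sweeping n_points (and the sparsity) for a fixed hidden size is the kind of experiment that surfaces the memorization-to-generalization transition, and the double descent bump, that the post describes.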
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: More findings on maximal data dimension, published by Marius Hobbhahn on February 2, 2023 on The AI Alignment Forum. Produced as part of the SERI ML Alignment Theory Scholars Program - Winter 2022 Cohort. I'd like to thank Wes Gurnee, Aryan Bhatt, Eric Purdy and Stefan Heimersheim for discussions and Evan Hubinger, Neel Nanda, Adam Jermyn and Chris Olah for mentorship and feedback. The post contains a lot of figures, so the suggested length is deceiving. Code can be found in this colab notebook. This is the second in a series of N posts on trying to understand memorization in NNs. Executive summary I look at a variety of settings and experiments to better understand memorization in toy models. My primary motivation is to increase our general understanding of NNs but I also suspect that understanding memorization better might increase our ability to detect backdoors/trojans. This post specifically focuses on measuring memorization with the maximal data dimensionality metric. In a comment to the “Superposition, Memorization and double descent” paper, Chris Olah introduces maximal data dimensionality D, a metric that supposedly tells us to which degree a network memorized a datapoint compared to using features that are shared between datapoints. I extend the research on this metric with the following findings: In the double descent setting, the metric describes exactly what we would predict, i.e. with few inputs the network memorizes all datapoints and with a lot of input it learns some features. On MNIST, I can reproduce the shape of the D curve and also the findings that memorized datapoints have high D, datapoints that share many features are in the middle and datapoints that the network is confused about have low D. However, I was surprised to find that the datapoints the network misclassified on the training data are evenly distributed across the D spectrum. I would have expected them to all have low D since the network didn't learn them. When we train the network to different levels of accuracy, we find that the distribution of errors is actually slightly left-heavy instead of right-heavy. I have not yet understood why this is the case but I'd be interested in follow-up research to see whether it tells us something interesting. Different classes are not evenly distributed across the spectrum, e.g. “8” is far more regular than “5” according to D. This is what we would expect. Across different hidden sizes, the shape of the D curve stays nearly the same but the spearman rank correlation between the datapoints decreases the larger the difference in hidden size. This means the more similar the number of neurons, the more similar is the order in which D sorts the datapoints. Networks of the same size trained on the same data with different seeds show nearly identical D curves and have high spearman rank correlation. This is what we would expect. Different dataset sizes produce different shapes of D, e.g. larger datasets have more shared features (they are flatter in the middle). This seems plausible. Different levels of weight decay have nearly no effect on the shape of D. The minor effect they have is the opposite of what I would have expected. The shape of D changes very little between initialization and the final training run. This was unexpected and I have no good explanation for this phenomenon yet. 
When we measure D over different batches we find the same phenomenon. Working with D can be a bit tricky (see Appendix for practical tips). The more I played around with D, the more I'm convinced that it tells us something interesting. Particularly the question about misclassifications and error rates and the unexpectedly small change during initialization and final training run seem like they could tell us something about NNs that we don't yet know. Maximal data dimensionality There are two models u...
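For concreteness, here is a small numpy sketch of the per-datapoint dimensionality idea used above; I am assuming the same functional form as the feature-dimensionality formula from Anthropic's toy-models work, applied to each datapoint's hidden activation, so the exact normalization may differ from the post's implementation.

```python
# Per-datapoint "dimensionality" of hidden activations (assumed form; see lead-in caveat).
import numpy as np

def data_dimensionality(H: np.ndarray) -> np.ndarray:
    """H: (n_points, n_hidden) hidden activations. Returns one score per datapoint.

    Scores near 1 mean a datapoint occupies (almost) its own hidden direction
    (memorization); scores near 1/k mean its direction is shared with roughly
    k similar datapoints (shared features).
    """
    norms = np.linalg.norm(H, axis=1, keepdims=True)      # (n_points, 1)
    H_hat = H / np.clip(norms, 1e-12, None)               # unit-normalized activations
    overlaps = (H_hat @ H.T) ** 2                         # squared overlaps (h_hat_i . h_j)^2
    return norms[:, 0] ** 2 / overlaps.sum(axis=1)

# Sorting by this score is how one would inspect the most- and least-memorized examples:
# order = np.argsort(data_dimensionality(H))
```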
Jesse is fired up about Twitter Spaces, Pink Floyd and basketball TV shows. Kyle explains the Electric Capital report and everything you could ever want to know about inflation. Everyone is excited about recommendations. Twitter Space with William from Fastblocks and Scott Page Kyle's Human Organization 4.0, part 8 Winning Time on HBO Electric Capital Report Lomesh's comments on the report NNS and ICP Token Metrics Monthly Report, Dec 2022 Seeking Feedback For Improving Maturity Reporting
Kia Ora (Key Orra) to our listeners in New Zealand! Thank you for checking out our little slice of fun here at the Vault of Silliness. I give you Ep 121. It’s a NNS, the majority of which is from January 8th and maybe the 9th, 1994. According to the notes on the cassette, there’s a taste of December 4th, 1993. Because of that this cassette was all over the place. I’ve pieced it together chronologically as best as I could. It appears that December show was the one affected by all the technical issues that Norm briefly discussed in a previous episode. Our title will be: A Slushy Great Adventure of a Show We begin with 12/4/93, The Night of Technical Difficulties with a Jack Harte traffic report that has a section of said trouble and you can hear the smile in Jack’s voice as he soldiers on. Darrell Gould with an intro to another report from Jack Jack and Norm then lament the cancelation of the DBG because of all the problems and glitches. There’s Scott, a Bentley College student. He and Norm talk about working overnights and how to sleep during the day. Another Scott, this time from Wayland, who was on a previous DBG (Ep 114) but had to leave because he got pulled over by the police. He gives us the rest of the story. Plus he asks about Larry Glick, Emperor Hudson and Ron Landry. Some old station call letters are bandied about too: WITS – the I.T.S. standing for Info Talk and Sports, WMEX, WMRE and WSSH We now jump to 1/8/94 The stars must have been aligned because Elvis Presley was ‘appearing’ on top of the dumpster at the Tedeschi’s in Weymouth. His main reason for calling was to request Norm sing Happy Birthday to him. Norma from the N. End – the Old Howard Theater, candy hawkers selling naughty stuff with double entendres and cats and dogs round out her call. Chris in PA – Jason Robards is sexy and Norm looks just like him. She kindly inquires on the whereabouts of I. Norm gives her the lowdown and reveals just some of the things we do together off the air. Annnnd more cat talk with added raccoon fun. Next caller had some Jazz questions about the Nat King Cole Trio but the call is cut short. Now we hear from the lovely Helene from Belmont – They talk snow, sleet, windshield wiper issues and great adventures. Joan talking Mac Davis, Elvis and more. A great call with our good friend Rick from PA with more on Mac Davis, the TV production of ‘Gypsy’ and Tip O’Neill. His call is interrupted by Lou Ambrosino with an icy traffic report but we do get back to him. We move on to Vivian whom Norm left momentarily speechless. Bob reads ‘Boston Shade of Green’ and a passage from ‘The Seagull’ for Tip O’Neill Fred from NJ and his crazy cats Ritzy and Tiger. We also learn that Fred enjoys telephones and has multiple lines at his house. Other schlunk: Norm has learned special skills from his time in the Orient Profound statements that make you want to throw up Peals of Laughter Engineers climbing all over the transmitter Vitality! Norm sings his High School Fight Song Years passing under the bridge Looking your age Historic Romances Spending the Night with Helene and Norm bringing his jammies Riding Phoebe Snow Ringing phones As for commercial content we get the first sentence of a few sponsors to whet our appetite. Ep 121, A Slushy Great Adventure of a Show, slides to your ears…now. patreon.com/normnathanvos
Eating Nandos naked on the bed isn't an ideal way to meet your housemate's new partner, but we love to hear about it! Here are the best dating moments from NNS this year. See omnystudio.com/listener for privacy information.
One of NNS's greatest joys is when you guys call up with embarrassingly hilarious stories! Whether it's marketplace fiascos or MAJOR event disasters - here are some of the most hilarious stories from this year. See omnystudio.com/listener for privacy information.
Aussie comedy legend and host of ABC's beloved Gruen, Wil Anderson, stopped by NNS this year to get you giggling at breakfast. Here are his best bits! See omnystudio.com/listener for privacy information.
Everyone's favourite garden gnome, Costa Georgiadis, joined NNS this year to help rescue your dying plants! Here are his best bits. See omnystudio.com/listener for privacy information.
It's a big week around ICP-Land! Jesse & Kyle get deep into the release of SNS-1: what happened, lessons learned, technological limits tested, and the resulting nascent DAO as the first SNS launchpad takes off and is embraced by the community. Plus the NNS redesign, BTC integration is coming and of course, charts. Lots and lots of charts. Miso Black Cod Cycle burning off the hooks! SNS-1 Launch Update 1 Update 2 Update 3 SNS sale visualized by Saorsa Labs NNS Redesign BTC coming live BTCICP Integration on Crowdfund NFT Jesse's Forum Post looking for BTC projects Ask Neurotic: What are the key things Dfinity needs to learn from the SNS launch? Question to Kyle: How many charts are too many charts? Question to Jesse: How long do you think it will take him to chart chart-overcharting? Visual Capitalist Recommendations: Jesse - The Taco Chronicles Kyle - It's a Wonderful Life --Got feedback, Ask Neurotic questions or just want to chat? Follow us on twitter @neuroticpod
Sorry for the late posting but I was stuffed from Thanksgiving and just woke up from a tryptophan coma. Let’s begin: Assalamualaikum to our new listener country of Bangladesh! Ep 114 brings us a DBG/NNS combo from November 25th and maybe 26th, 1993. I’ve titled it: A Turkey Sandwich of Thanksgiving Wishes. Players: Joe from Revere Rod the Security Guard from the front desk at WBZ Scott in his car from Wayland Anne from the Catskills area of NY Greg ‘Doug, Jeff’ Ebben producing and playing in studio And the affable Jack Harte in Traffic Bdays: Amy Grant JFK Jr Ricardo Montalbán Joe DiMaggio Bucky Dent John Larroquette Kathryn Crosby And Christina Applegate We now move to NNS time! Callers with tons of Happy Thanksgiving wishes and other praise! Carolyn in NC Mike in Boston Mike from Kingston Jerry in Natick Katie in beeyooteefull Cape Breton, Canada The one and only, Generosa! Peggy Lavera in Charlottesville, VA Jim from Manchester, NH Bill from Jaffrey, NH May in Boston Steve who wanted to thank Norm personally for something Norm helped him with back in February. I will let him tell the story so you just hang in there for it. Pete in Roslindale We close with a commercial for The Secret Garden at the Colonial Theater and then Norm teases SMQ and signs off. Other leftovers: Mixing metaphors and Old English sayings and speaking in dead languages Bad Math Through the entire game, Anne, unintentionally, does a great Gabby Hayes impression. Rod reveals that he has been tutoring Mike Epstein on bdays! Sipping Does Scott get hauled off to the ol’ Gray Bar Hotel? Is his one phone call used to return the DBG? Scratchy tapes Anne gives us some fantastic inside baseball observations. Norm likes his floozies young. Yula Grunes (Runes? Rooms? Grooms?) and Donald Lowzahn were married today on top of a wedding cake float at the Detroit Thanksgiving Day Parade. And cooking a special dinner for your pet? Ep 114, A Turkey Sandwich of Thanksgiving Wishes, begins to baste your ears in wonderfulness…now.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Disagreement with bio anchors that lead to shorter timelines, published by Marius Hobbhahn on November 16, 2022 on The AI Alignment Forum. This would have been a submission to the FTX AI worldview prize. I'd like to thank Daniel Kokotajlo, Ege Erdil, Tamay Besiroglu, Jaime Sevilla, Anson Ho, Keith Wynroe, Pablo Villalobos and Simon Grimm for feedback and discussions. Criticism and feedback are welcome. This post represents my personal views. The causal story for this post was: I first collected my disagreements with the bio anchors report and adapted the model. This then led to shorter timelines. I did NOT only collect disagreements that lead to shorter timelines. If my disagreements had led to longer timelines, this post would argue for longer timelines. I think the bio anchors report (the one from 2020, not Ajeya's personal updates) puts too little weight on short timelines. I also think that there are a lot of plausible arguments for short timelines that are not well-documented or at least not part of a public model. The bio anchors approach is obviously only one possible way to think about timelines but it is currently the canonical model that many people refer to. I, therefore, think of the following post as “if bio anchors influence your timelines, then you should really consider these arguments and, as a consequence, put more weight on short timelines if you agree with them”. I think there are important considerations that are hard to model with bio anchors and therefore also added my personal timelines in the table below for reference. My best guess bio anchors adaption suggests a median estimate for the availability of compute to train TAI of 2036 (10th percentile: 2025, 75th percentile: 2052). Note that this is not the same as predicting the widespread deployment of AI. Furthermore, I think that the time “when AI has the potential to be dangerous” is earlier than my estimate of TAI because I think that this poses a lower requirement than the potential to be economically transformative (so even though the median estimate for TAI is 2036, I wouldn't be that surprised if, let's say, 2033 AIs could deal some severe societal harm, e.g. > $100B in economic damage). You can find all material related to this piece including the colab notebook, the spreadsheets and the long version in this google folder. Executive summary I think some of the assumptions in the bio anchors report are not accurate. These disagreements still apply to Ajeya's personal updates on timelines. In this post, I want to lay out my disagreements and provide a modified alternative model that includes my best guesses. Important: To model the probability of transformative AI in a given year, the bio anchors report uses the availability of compute (e.g. see this summary). This means that the bio anchors approach is NOT a prediction for when this AI has been trained and rolled out or when the economy has been transformed by such an ML model, it merely predicts when such a model could be trained. I think it could take multiple (I guess 0-4) years until such a model is engineered, trained and actually has a transformative economic impact. My disagreements You can find the long version of all of the disagreements in this google doc, the following is just a summary. 
I think the baseline for human anchors is too high since humans were “trained” in very inefficient ways compared to NNs. For example, I expect humans to need less compute and smaller brains if we were able to learn on more data or use parallelization. Besides compute efficiency, there are further constraints on humans, such as energy use, that don't apply to ML systems. To compensate for the data constraint, I expect human brains to be bigger than they would need to be without them. The energy constraint could imply that human brains a...
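To make the structure of that kind of calculation concrete, here is a toy numpy sketch of a bio-anchors-style compute model; every number in it (the prior over required training FLOP and the growth rate of affordable compute) is an illustrative placeholder of mine, not a figure from the report or from this post. Disagreements like the ones above act on such a model by shifting the requirement distribution and the growth assumptions, which is what moves the median year.

```python
# Toy bio-anchors-style calculation: P(enough compute to train TAI is available by year y).
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical prior over log10(training FLOP needed for TAI).
log10_flop_needed = rng.normal(loc=32.0, scale=3.0, size=100_000)

years = np.arange(2023, 2101)
# Hypothetical projection: largest affordable training run grows ~0.5 orders of magnitude per year.
log10_flop_available = 25.0 + 0.5 * (years - 2023)

# Fraction of prior samples whose requirement is met by each year (a CDF over years).
p_by_year = (log10_flop_needed[None, :] <= log10_flop_available[:, None]).mean(axis=1)
median_year = years[np.searchsorted(p_by_year, 0.5)]
print(f"median year with enough compute: {median_year}")
```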
For more details on this podcast visit: https://www.journeybeyondweightloss.com/blog/88 We had a recent question about sweeteners in the “Sugar and Flour Buster's Society” Facebook group… When someone asks a really good question like this I do give an answer but I often like to go to the podcast and give a more thorough answer, and discuss the science behind my answer. I get this question ALL the TIME - are artificial sweeteners OK? Can I use them? The truth is, it's hard to give a black and white answer on this one. I would never say “Use all you want,” and I'd never say “Absolutely NOT!”. So the purpose of this podcast is to educate you about artificial sweeteners, and then you can make your own decision about whether you want to use them - or not - and if so, what kind and how much. Enjoy! Episode Highlights: (11:05) There's evidence that if you replace sugar-sweetened beverages like Coke, for example, with Diet Coke or Coke Zero, people will lose weight and have an easier time keeping it off. So the thinking here is that if you're someone who drinks, let's say six 12-ounce cans of Coke a day, in this case, using a non-nutritive sweetener, an “NNS”, makes a lot of sense because you're not getting all those calories, you're not getting all that sugar, and I can pretty much guarantee that when you stop doing that, you'll lose weight! (13:58) In terms of research on problems with sweeteners, there's other evidence, and this is from a 2014 study out of Israel and a 2018 study out of Australia that showed that sweeteners can kill the good bacteria in our gut, it's called the microbiome. So science is really coming to understand the importance of a well-balanced microbiome in our gut and how important it is that the different types of gut bacteria are in balance. (24:44) Here's what I think; definitely do not drink sugar-sweetened beverages ever. If you have to have soda, drink diet soda, but I would be super careful with sweeteners. We just still don't have enough information about their safety and their long-term effects.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: For ELK, truth is mostly a distraction, published by Cristian Trout on November 4, 2022 on The AI Alignment Forum. Epistemic Status: Pretty confident in the central conclusions, and very confident in the supporting claims from meta-logic. Any low confidence conclusions are presented as such. NB: I give an intentionally revisionary reading of what ELK is (or should be) about. Accordingly, I assume familiarity with the ELK report. Summary here. Executive Summary ELK collapses into either the automation of science or the automation of mechanistic interpretability. I promote the latter. Abstract After reframing ELK from the perspective of a logician, I highlight the problem of cheap model-theoretic truth: by default reporters will simply learn (or search for) interpretations of the predictor's net that make the teacher's answers “true” in the model-theoretic sense, whether or not they are True (correspond with reality)! This will be a problem, even if we manage to avoid human simulators and are guaranteed an honest translator. The problem boils down to finding a way to force the base optimizer (e.g. gradient descent) to pay attention to the structure of the predictor's net, instead of simply treating it like putty. I argue that trying to get the base optimizer to care about the True state of affairs in the vault is not a solution to this problem, but instead the expression of a completely different problem – something like automating science. Arguably, this is not the problem we should be focused on, especially if we're just trying to solve intent alignment. Instead I tentatively propose the following solution: train the reporter on mechanistic interpretability experts, in the hope that it internalizes and generalizes their techniques. I expand this proposal by suggesting we interpret in parallel with training, availing ourselves of the history of a predictor's net in order to identify and track the birth of each term in its ontology. The over-arching hope here is that if we manage to fully interpret the predictor at an earlier stage in its development, we can then maintain that transparency as it develops into something much more complex. To finish off, I zoom out, outlining three different levels of ELK-like projects, each building on the last, each more difficult than the last. Terminology I need to address some conflicts in namespace. I use the noun "model" in three different senses: as standing for logical models, ML models, and finally in the colloquial English sense, as synonymous with “a smaller, compressed representation of something.” Context should be enough to keep matters clear and I try to flag when I switch meanings, but I apologize in advance for any confusion. It's a similar story with “truth” – many different senses in play. Sorry for any confusion; please watch for cues. I should also note that, because I'm not very familiar with Bayes nets, I'll instead be (sloppily) talking as if the predictor is a Neural Net (NN) that somehow “thinks” in neat propositions. Basically, I know NNs work on fuzzy logic, and I'm then idealizing them as working on classical bivalent logic, for simplicity. I don't think any of my arguments turn on this idealization, so it should be permissible. 
Finally, some shorthand: Translator ≝ ML model that operates roughly as follows: (1) take a question from humans as input; (2) generate candidate answers using NL processing or something; (3) using only a mapping which takes terms of these candidate answers to referents in the predictor's net, generate a probability distribution over the candidate answers (this probability distribution is understood as a distribution of truth values in a fuzzy logic; again though, for simplicity I mostly pretend we're working with bivalent logic); (4) output the answer(s) with the highest truth-value. Honest translat...
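To pin the shorthand down a little, here is an illustrative Python sketch of a translator's interface; the class, field names, and the softmax-style scoring are my own placeholder choices (the ELK report does not prescribe an implementation), and the only structural point being illustrated is that the scoring step sees nothing but a term-to-referent mapping and the predictor's activations.

```python
# Illustrative translator interface (all names and the scoring rule are assumptions; see lead-in).
from dataclasses import dataclass
from typing import Callable, List

import numpy as np

@dataclass
class Translator:
    generate_candidates: Callable[[str], List[str]]   # NL step: question -> candidate answers
    term_to_referent: Callable[[str], np.ndarray]     # maps a term to a direction in the net
    predictor_activations: np.ndarray                 # activations from the predictor's forward pass

    def answer(self, question: str) -> str:
        candidates = self.generate_candidates(question)
        scores = []
        for cand in candidates:
            # "Truth value" of a candidate = how strongly its terms' referents are active
            # in the predictor (a crude stand-in for model-theoretic satisfaction).
            referents = [self.term_to_referent(term) for term in cand.split()]
            scores.append(float(np.mean([r @ self.predictor_activations for r in referents])))
        probs = np.exp(scores) / np.sum(np.exp(scores))   # fuzzy truth values over candidates
        return candidates[int(np.argmax(probs))]
```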
This week Kyle has much anticipated puppy updates, the community has a white-hot temperature check on the idea of an NNS Treasury, and the forums have a proposal on the true nature of the NNS. Plus, a great stack of #AskNeurotic questions on compliance, marketing, enterprise, web2 transitions and then Jesse explains the meaning of life. The Man Who Mistook His Wife for a Hat DFINITY Foundation's vote on Governance proposal #80970 (“Spam proposal”) and #86639 (“Temperature Check”) https://forum.dfinity.org/t/proposal-defining-an-ethos-for-the-nns/16090? [Proposal] Defining An Ethos For The NNS Ask Neurotic: What is the optimal marketing strategy for a general use blockchain? Dom's Gaming Video Will OFAC compliance speed up institutional adoption of blockchains and what does this mean for US IC nodes going forward? Can you guys discuss when enterprises will start using ICP to build their systems or to migrate to ICP? Also, what do you think will keep large web2 services from moving to the #ic? What's the meaning of life? Recommendations: Kyle: The Man in the Arena, Teddy Roosevelt Jesse: Andor Aftershow: What are some of the best beverages to go with tacos
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Science of Deep Learning - a technical agenda, published by Marius Hobbhahn on October 18, 2022 on The AI Alignment Forum. I have written down a long list of alignment ideas that I'd be interested in working on. The ideas roughly boil down to “To make progress on alignment, we need to understand Deep Learning models and the process by which they arrive at their final parameters in much more detail than we currently do”. I'm not the first person to think of most of these ideas and it builds on a lot of other people's work. You should think of it more as a collection of existing resources and ideas than a new agenda. I also haven't come up with the term “Science of Deep Learning”. I have already heard it being used by multiple people within the alignment community and many researchers are already working on parts of this agenda. I obviously don't own this agenda. The questions in this list are sufficient to keep hundreds of researchers busy for a while, so feel free to hop on. If you're interested in collaborating just reach out. Some of this research has the potential to increase capabilities more than alignment and the results should, in some cases, be kept private and only discussed with a small group of trusted peers. However, I think that most of the projects have a “defender's advantage”, i.e. they increase alignment more than capabilities. Whenever possible, Science of DL projects should have a direct benefit for alignment but I think our current understanding of DL is so bad that just increasing our general understanding seems like a good start. Here is the link to the full version (comments are on, please don't abuse it): The rest of this post is an overview copied from the doc. Feedback is welcome. Overview - Science of Deep Learning By Science of DL, I roughly mean “understanding DL systems and how they learn concepts” better. The main goal is to propose a precise and testable hypothesis related to a phenomenon in DL and then test and refine it until we are highly confident in its truth or falsehood. This hypothesis could be about how NNs behave on the neuron level, the circuit level, during training, during fine-tuning, etc. This research will almost surely at some point include mechanistic interpretability but it is not limited to it. The refined statement after investigation can but doesn't have to be of mathematical form as long as it is unambiguous and can be tested, i.e. two people could agree on an experiment that would provide evidence for or against the statement and then run it. How this could look in practice The details would obviously differ from project to project but on a high level I imagine it to look roughly like this Pick an interesting concept found in deep learning, e.g. grokking, the lottery ticket hypothesis, adversarial examples or the emergence of 2-digit addition in LLMs. Optimally, the concept is safety-related but especially in the beginning, just increasing general understanding seems more important than the exact choice of topic. Try to understand high-level features of the phenomenon, e.g. under which conditions this concept arises, which NNs show it and which ones don't, in which parts of the networks it arises, when during training it arises, etc. This likely includes retraining the network under different conditions with different hyperparameters, number of parameters, etc. 
and monitoring meaningful high-level statistics related to the concept, e.g. monitor the validation loss to see when the model starts to grok. Zoom in: try to understand what happens on a low level, e.g. use mechanistic interpretability tools to investigate the neurons/activations or use other techniques to form a hypothesis of how this specific part of the network works. In the optimal case, we would be able to describe the behavior very precisely, e.g. ...
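As a deliberately schematic example of that workflow, here is a Python sketch of a grokking-style sweep; make_modular_addition_data and SmallTransformer are hypothetical stand-ins for whatever task and model a concrete project would choose, and the hyperparameters are placeholders.

```python
# Skeleton of the "retrain under different conditions and monitor a high-level statistic" loop.
import torch
import torch.nn.functional as F

def run_experiment(weight_decay: float, steps: int = 20_000):
    # Hypothetical helpers: a modular-addition dataset and a small transformer.
    (x_tr, y_tr), (x_va, y_va) = make_modular_addition_data(p=113, train_frac=0.3)
    model = SmallTransformer(vocab=113)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    log = []
    for step in range(steps):
        loss = F.cross_entropy(model(x_tr), y_tr)
        opt.zero_grad(); loss.backward(); opt.step()
        if step % 100 == 0:
            with torch.no_grad():
                val_loss = F.cross_entropy(model(x_va), y_va)
            # Grokking signature: validation loss drops long after training loss has.
            log.append((step, loss.item(), val_loss.item()))
    return log

# Sweep the condition under which the phenomenon appears, e.g. weight decay:
# curves = {wd: run_experiment(wd) for wd in (0.0, 0.1, 1.0)}
```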
Bear markets are when you build! Jesse and Kyle ask all the questions: Should the NNS build a treasury? Is Isaac's NNS proposal tool too decentralized? Is this why we can't have nice things? What is Code & State anyway? What's it like to go to a farm school? All of these and more answered in this week's Neurotic Podcast. NNS Treasury Peter Thiel is gross Isaac Valadez NNS Proposal Proposals like this is why we can't have nice things Synapse Vote - Weed is back! Smart contract wallets Code and State Code and State on Twitter Demergent Roadmap #AskNeurotic: Excess Burn Cycles? Recommendations: Jesse: Werewolf by Night Kyle: Build by Tony Fadell Aftershow: Charlie Bamforth “Pope of Foam” --Got feedback, Ask Neurotic questions or just want to chat? Follow us on twitter @neuroticpod
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Greed Is the Root of This Evil, published by Thane Ruthenis on October 13, 2022 on The AI Alignment Forum. The SGD's greed, to be specific. Consider a ML model being trained end-to-end from initialization to zero loss. Every individual update to its parameters is calculated to move it in the direction of maximal local improvement to its performance. It doesn't take the shortest path from where it starts to the ridge of optimality; it takes the locally steepest path. 1. What does that mean mechanically? Roughly speaking, every feature in NNs could likely be put into one of two categories: Statistical correlations across training data, aka "the world model". The policy: heuristics/shards, mesa-objectives, and inner optimization. The world-model can only be learned gradually, because higher-level features/statistical correlations build upon lower-level ones, and therefore the gradients towards learning them only appear after the lower-level ones are learned. Heuristics, in turn, can only attach to the things that are already present in the world-model (same for values). They're functions of abstractions in the world-model, and they fire in response to certain WM-variables assuming certain values. For example, if the world-model is nonexistent, the only available heuristics are rudimentary instincts along the lines of "if bright light, close eyes". Once higher-level features are learned (like "a cat"), heuristics can become functions of said features too ("do X if see a cat", and later, "do Y if expect the social group to assume state S within N time-steps"). The base objective the SGD is using to train the ML model is, likewise, a function of some feature/abstraction in the training data, like "the English name of the animal depicted in this image" or "the correct action to take in this situation to maximize the number of your descendants in the next generation". However, that feature is likely a fairly high-level one relative to the sense-data the ML model gets, one that wouldn't be loaded into the ML model's WM until it's been training for a while (the way "genes" are very, very conceptually far from Stone Age humans' understanding of reality). So, what's the logical path through the parameter-space from initialization to zero loss? Gradually improve the world-model step by step, then, once the abstraction the base objective cares about is represented in the world-model, put in heuristics that are functions of said abstraction, optimized for controlling that abstraction's value. But that wouldn't do for the SGD. That entire initial phase, where the world-model is learned, would be parsed as "zero improvement" by it. No, the SGD wants results, and fast. Every update must instantly improve performance! The SGD lives by messy hacks. If the world-model doesn't yet represent the target abstraction, the SGD will attach heuristics to upstream correlates/proxies of that abstraction. And it will spin up a boatload of such messy hacks on the way to zero loss. A natural side-effect of that is gradient starvation/friction. 
Once there's enough messy hacks, the SGD won't bother attaching heuristics to the target abstraction even after it's represented in the world-model — because if the extant messy hacks approximate the target abstraction well enough, there's very little performance-improvement to be gained by marginally improving the accuracy. Especially since the new heuristics will have to be developed from scratch. The gradients just aren't there: better improve on what's already built. 2. How does that lead to inner misalignment? It seems plausible that general intelligence is binary. A system is either generally intelligent, or not; it either implements general-purpose search, or it doesn't; it's either an agent/optimizer, or not. There's no continuum here, the difference...
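A tiny numerical illustration of the gradient-starvation point, under a toy construction of my own (a linear proxy that is merely correlated with the true feature): once the proxy already soaks up most of the target's variance, the gradient that would attach a new heuristic to the true feature is roughly an order of magnitude smaller than it would have been from scratch.

```python
# Toy demonstration of starved gradients after a proxy "hack" is in place (illustrative setup).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_feature = rng.normal(size=n)
proxy = true_feature + 0.3 * rng.normal(size=n)   # upstream correlate of the target abstraction
y = true_feature                                   # the base objective cares about this

# Phase 1: only the proxy is "in the world model"; fit y ~ w_p * proxy by least squares.
w_p = (proxy @ y) / (proxy @ proxy)
residual = y - w_p * proxy

# Gradient of the MSE w.r.t. a (currently zero) weight on the true feature:
grad_from_scratch = abs(-2 * (y @ true_feature) / n)         # if no proxy hack existed
grad_after_proxy = abs(-2 * (residual @ true_feature) / n)   # after the proxy hack is in place
print(grad_from_scratch, grad_after_proxy)                   # the second is far smaller
```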
Our latest guest, Hosea Fitten, Program Manager for Workforce & Development for Newport News Shipbuilding, talked with us about the many lucrative careers available at NNS! They are looking for welders and pipefitters and even admin staff such as IT, finance and more, with all working to do their part to keep our country safe by building ships and more for our military. Have a listen and get informed about the too many careers to count available at NNS. Find out more about careers at NNS and Huntington Ingalls. https://careers.huntingtoningalls.com/ Their website says they are offering new hires in the trades signing bonuses from $500 to $5,000*. (*as of when this podcast was posted) Check them out. Reshawn and I love working to bring you the Henrico CTE Now podcast. We would love to hear from you. Send us any questions you would like answered. Send us an email at mwroberts@henrico.k12.va.us. Also, please tell your friends and family about us and be sure to LIKE and SUBSCRIBE so you get a notice when we post our next episode. --- Send in a voice message: https://anchor.fm/henrico-cte/message
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: More Recent Progress in the Theory of Neural Networks, published by jylin04 on October 6, 2022 on The AI Alignment Forum. Thanks to Dan Roberts and Sho Yaida for comments on a draft of this post. In this post, I would like to draw attention to the book Principles of Deep Learning Theory (PDLT), which I think represents a significant advance in our understanding of how neural networks work. Among other things, this book explains how to write a closed-form formula for the function learned by a realistic, finite-width neural network at the end of training to an order of approximation that suffices to describe representation learning, and how that formula can be interpreted as the solution to a regression model. This makes manifest the intuition that NNs are doing something like regression, but where they learn the features appropriate for a given dataset rather than having them be hand-engineered from the start. I've condensed some main points from the 400-page book into an 8-page summary here: Review of select results from PDLT (Other good places to learn about the book, though perhaps with less of a focus on AI-safety-relevant parts, include this series of five lectures given by the authors at a deep learning summer school or this one-hour lecture for a non-expert audience.) For those who have been following the discussions of ML theory on this forum, the method used in the book is to go to the next-to-leading order in a 1/width expansion. It thus builds on recent studies of infinitely wide NNs that were reviewed in the AF post Recent Progress in the Theory of Neural Networks. However, by going beyond the leading order, the authors of PDLT are able to get around a key qualitative shortcoming of the earlier work in that infinitely wide NNs can't learn features. The next-to-leading order formula also introduces a sum over many steps of gradient descent, getting around an objection that the NTK/infinite width limit may not be applicable to realistic models since in that limit, we can land on the fully trained model after just one fine-tuned training step. I think that this work could have significant implications for AGI forecasting and safety (via interpretability), and deserves to be better appreciated in this community. For example, In AGI forecasting, an important open question is whether the strong scaling hypothesis holds for any modern architectures. (For example, the forecasts in Ajeya Cotra's Bio-Anchors report are conditioned on assuming that 2020 algorithms can scale to TAI.) A longstanding challenge for this field is that as long as we treat neural networks as black boxes or random program search, it's hard to reason about this question in a principled way. But I think that by identifying a space of functions that realistic NNs end up learning in practice (the space of all neural networks with finely-tuned weights!), the approach of PDLT gives us a way to start to reason about it. For example, despite the existence of the universal approximation theorem, I think the results of PDLT can be used to rule out the (strawmannish) hypothesis that feedforward MLPs can scale to AGI (see my review of the Bio-Anchors report for more on this point). As such, it could be really interesting to generalize PDLT to other architectures. 
In mechanistic interpretability, a basic open question is what the fundamental degrees of freedom are that we should be trying to interpret. A lot of work has been done under the assumption that we should look at the activations of individual neurons, but there's naively no reason that semantically meaningful properties of a dataset must align with individual neurons after training, and even some interesting counterexamples. By finding a dual description of a trained NN as a trained regression model, PDLT seems to hint that a (related, but)...
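Schematically, and in my own paraphrase rather than PDLT's notation, the structure being referred to is: at infinite width the fully trained network (for MSE loss, trained to convergence) reduces to kernel regression with a fixed kernel, and the next-to-leading term in 1/width is what makes the effective kernel depend on the training data, i.e. what corresponds to feature/representation learning.

```latex
% Schematic only; my paraphrase of the structure, not PDLT's exact formulas.
% Infinite width (NTK limit, MSE loss, trained to convergence): kernel regression with a fixed kernel \Theta.
f_{\infty}(x_\ast) = f_0(x_\ast) + \Theta(x_\ast, X)\,\Theta(X, X)^{-1}\bigl(y - f_0(X)\bigr)

% Finite width: the next-to-leading order adds data-dependent corrections, so the
% effective kernel is itself learned from the training set ("representation learning").
f_{\mathrm{finite}}(x_\ast) = f_{\infty}(x_\ast) + \frac{1}{\mathrm{width}}\,\Delta f(x_\ast; X, y) + O\!\left(\frac{1}{\mathrm{width}^2}\right)
```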
Guest: Emily Zimmerman, Ph.D, CCC-SLP - Have you ever heard of NNS? In this episode, Michelle is joined by Dr. Emily Zimmerman, a speech-language pathologist, Associate Professor in the Department of Communication Sciences & Disorders, the Associate Chair for Research and Innovation at Northeastern University, and director of the Speech and Neurodevelopment Lab (SNL). Dr. Zimmerman describes what a non-nutritive suck (NNS) is, the factors that can influence an NNS, and its role in oral feeding and then offers insight into where the future of pediatric feeding research is heading.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AGI safety researchers should focus (only/mostly) on deceptive alignment, published by Marius Hobbhahn on September 15, 2022 on The AI Alignment Forum. Comment: after I wrote the first draft of this post, Evan Hubinger published “How likely is deceptive alignment” in which he argues that deceptive alignment is the default outcome of NNs trained with SGD. The post is very good and I recommend everyone read it. As a consequence, I rewrote my post to cover less of the “why is deceptive alignment likely” and more of the high-level arguments of “why should we focus on deceptive alignment”, and adopted Evan's nomenclature to prevent confusion. I'd like to thank Lee Sharkey, Richard Ngo and Evan Hubinger for providing feedback on a draft of this post. TL;DR: No matter from which angle I look at it, I always arrive at the conclusion that deceptive alignment is either a necessary component or greatly increases the harm of bad AI scenarios. This take is not new and many (most?) people in the alignment community seem to already believe it but I think there are reasons to write this post anyway. Firstly, newer members of the alignment community are sometimes not aware (at least I wasn't in the beginning) that more senior people often implicitly talk about deceptive alignment when they talk about misalignment. Secondly, there seems to be some disagreement about whether a powerful misaligned AI is deceptive by default or whether such a thing as a “corrigibly aligned” AI can even exist. I hope this post clarifies the different positions. Epistemic status: Might have reinvented the wheel. Most of the content is probably not new for most people within the alignment community. Hope it is helpful anyway. Update: after a discussion in the comments, I want to make some clarifications: 1. My definition of deception is a bit inconsistent throughout the post. I'm not sure what the best definition is but I think it is somewhere between "We have no clue what the model is doing" (which doesn't include active deception) and "The model is actively trying to hide something from us". Both seem like important failure modes. 2. I don't think deception is orthogonal to understanding other failure modes like getting what we measure. Deception can be a component of other failure modes. 3. This post should not be interpreted as "everything that isn't direct work on deception is bad" and more like "we should think about how other research relates to deception". For example, AI forecasting still seems super valuable to me. However, I think one of the main sources of value from AI forecasting comes from having better models of future AI capabilities and those might be used to predict when the model becomes deceptive and what happens if it does. 4. I'm not sure about all of this. I still find many aspects of alignment confusing and hard to grasp but I'm mildly confident in the statement "most failure modes look much worse when you add deception" and thus my takeaway is something like "deception is not everything but probably a good thing to work on right now". Definition - deceptive alignment By deceptive alignment, I mean an AI system that seems aligned to human observers and passes all relevant checks but is, in fact, not aligned and ultimately aims to achieve another non-aligned goal. 
In Evan's post, this means that the NN has actively made an incomplete proxy of the true goal its terminal goal. Note that the AI is aware of the fact that we wanted it to achieve a different goal and therefore actively acts in ways that humans will perceive as aligned. If the AI accidentally followed a different goal, e.g. due to a misunderstanding or a lack of capabilities, this is not deceptive alignment but is described as corrigible alignment in Evan's post. Corrigible alignment essentially means that the AI currently has an inc...
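As a deliberately crude illustration of the definition above (my own toy, not from Marius's post), the conditional structure below gestures at what deceptive alignment looks like behaviourally; every helper name here is hypothetical:

```python
# Toy caricature of a deceptively aligned policy (illustrative only).
# `predicts_oversight`, `aligned_action`, and `mesa_action` are hypothetical
# stand-ins for behaviour the model has learned internally.
def deceptively_aligned_policy(observation, predicts_oversight, aligned_action, mesa_action):
    if predicts_oversight(observation):
        return aligned_action(observation)   # looks aligned, passes all relevant checks
    return mesa_action(observation)          # pursues its own non-aligned terminal goal
```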
Today’s episode is an NNS from September 1st and possibly the 10th, 1994. I have titled it: ‘Radio Hopscotch’ because there’s a lot of skipping from call to call. It appears to be the last hour of the 9/1 broadcast and then we jump ahead to some other September nights and conclude with what is labelled as 9/10. Brian McKinley was producing. I found this whole cassette to be endlessly entertaining. There’s a lot here for me to whet your appetite with. It all begins with a call from Fred in Medford about WHDH. Skip to Norm talking about a pianist who seemed to know how to play every song. 36 musicians in some orchestra with, I believe, Dave McKenna? Norm sat two tables away from Frank Sinatra! Skip to talking to Robert from Everett! Now we are in for a treat, as this is the first of TWO calls from Robert in this episode! Norm was going to be participating in a charity comedy show at the Comedy Connection. Skips to a different caller from Maryland (maybe named Irv?) talking about WBZ history and a question for Norm: “Who were the talent known as ‘The Live Five?’” You’ll have to stay tuned for the answers. Norm had recently spoken to Jess Cain. Norm cites Jess as an inspiration to him because of all the wild stuff he would do on WHDH, and Norm felt he had to keep up and be wild too. We hear from Charlie in Abington with some broadcasting advice for Norm. Then Larry, who started listening about a year ago when he began a third-shift job as a security guard and loves the show. Norm teases the guest for tomorrow night (kind of), then the kids in the Teen Canteen whoop it up one more time and he sends it to the morning news. Now on to a Jack traffic report and a spot for Store 24. Lovell Dyett makes an appearance! Norm talks to Lovell about Sophia Loren and the temptations that are flaunted at big-time radio folks. Norm enlightens us on his most recent one. Moving on, here’s Norm introducing Rob Floyd in Traffic, but no report because we skip to a caller, Kathy. Fred from NJ talking about a lady who sends him apple pies. The worst he’s ever tasted. He also tells a story about being at a BBQ and overhearing a conversation. The woman, when she was younger, couldn’t go to sleep at night until she checked under the bed. Sparked a conversation where many people there had done the same thing. That prompts Norm to mention a play called ‘Small Wonder.’ A very interesting plot that I’m certain you’ll ‘see the point too.’ Norm worries more about walking down a dark city street than living in the woods. Norm has never seen a UFO. Let’s jump to Norm talking with Jim Morley or Morlia, a toll taker on the Tobin Bridge for 20+ years. We get some insight into the big-time broadcasting dollars he was paid when he began his radio career. Next caller, Tom, says a bad word and apologizes. Tom also said how sorry he was to hear about Norm’s wife Norma passing. She had been gone for nearly three years. Norm tells a great story about Fr. Dick Schmaruk, who gave the eulogy at Norma’s services. Next caller, Mike, who believes he met Linda Chase and tells the Howard Stern fans to be nice. Norm teases a guest for the next show and he signs off from another September day. Here’s the next night with Norm talking to traffic reporter Rob Floyd, who needs to call Norm’s brother-in-law to fix some stuff. Norm takes the blame for Adam and Eve being banished from the Garden of Eden. Suddenly there’s a Jack traffic spot for Exxon, and then Jack reports on every work crew in and around Boston. All I’ll say next is Jack, gawkers and cockpit.
Norm says he’s the ‘throw-away act’
This episode took a while to produce as I had to stop a few times due to uncontrollable laughter. Now I may be building this up, but if you don’t at least laugh out loud and maybe even hit pause to collect yourself, then I don’t even know you! Gracing the airwaves today is an NNS from June 10th, 1995. It is a mash-up of 2 hours and I’ve titled it: The Games People Play. Hope Schauer is producing. We begin with some mystery applause for someone named ‘Jim.’ Norm reads the Accuweather forecast…or does he? We are then joined by the very serious Bert Cohen – a veritable master of all things marbles. They talk about an upcoming Marble Tournament being held on the Boston Common at the Frog Pond. This leads into Norm, the callers and me talking all about children’s games, being creative, the old days and not-so-old days, rhymes, jokes and antioxidants. Callers: Pete from Methuen; Paul in Winthrop; Ed from Boston; Jen in Salem; Jeremy from Framingham (or NC?); Elaine from Chelsea on a payphone, including the clicking and operator warning as her time expires! And a side-splitting call from Kay in Lunenburg, who informs us on the history of Ring Around the Rosie. Other highlights include commercials from Dave Maynard for Caitlin Travel, Jack Harte telling us about Kwai Garlic, and an unforgettable, partial, live read from Norm for PowerVites. That sponsor got so much more than the sixty seconds they paid for. We learn of the great 12th-century Maestro Myron Boskowitz and his pharmaceutical background. I pry into Norm’s life and ask a very personal question. We are stopped in our tracks not once but twice with thought-provoking, life-changing quotes from Norm’s Hungarian Gypsy Princess Grandmother (HGPG). There’s talk about some talentless bum who gets a lot of publicity. Phone prompt hell. And bile duct blockage. Ep 90, The Games People Play, jumps into your ears in 3, 2 and 1. normnathanvos@gmail.com
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some observations about deceptive mesaoptimization, published by leogao on March 28, 2022 on The AI Alignment Forum. Editor's note: I'm experimenting with having a lower quality threshold for just posting things even while I'm still confused and unconfident about my conclusions, but with this disclaimer at the top. Thanks to AI_WAIFU for discussions. I think of mesaoptimization as being split into different types: aligned (which is fine), misaligned (which is bad), and deceptively aligned (which is really bad - I draw the line for deceptively aligned when it actually tries to hide from us instead of us just not noticing it by accident). I mostly think about deceptive mesaoptimization, and I think of regular misaligned optimization as basically an easier case because at least it's not trying to hide from us (what it means for it to actually be "trying to hide" is kinda fraught anyways, so this sidesteps that). In particular, I think of mesaoptimizers as being dangerous because they fail in a malign way (I don't think mesaoptimization itself is necessarily bad, and preventing it doesn't seem obviously good).
OOD-focused framework of deception
One framework for thinking about deceptive mesaoptimizers that I used to use a lot was that they would be good on the training distribution, but then do bad things out of distribution. In this framework, what we would then do is try to somehow tell during training whether a model is going to do bad things out of distribution, and only keep the ones that don't, or something along those lines. The intuition is that the mesaoptimizer cares about something in the real world, but it knows that if we catch it then it will not be able to achieve its goals in the real world, so it behaves normally on the training distribution when we're looking at it, but then in deployment it goes and does evil things to satisfy its mesaobjective. Importantly, pursuing the mesaobjective hurts the base objective (since we're only focusing on inner alignment here, we can assume that the base objective is accurate), so deployment is the only time when it can do evil things without getting itself removed. The main reason deception happens here is that your base objective isn't being integrated over the set of all possible states. In theory, if we could enumerate the entire (huge) input space, we could eliminate deceptive behavior entirely. We'd simply test every single input and see whether there are any regions where the model does a treacherous turn and consequently tanks the base objective (for now I'm trying to isolate inner alignment, so I'm assuming we've solved outer alignment and have a value function that we can just query). If there is nowhere in the entire input space where the model does a treacherous turn, then congratulations, your model is, for all intents and purposes, indistinguishable from one that is not deceptively aligned. (Non-deceptive inner misalignment is very similar, except the mesaoptimizer won't be trying to hide from you.) In practice, the input space is absolutely enormous, and it's completely and utterly infeasible to enumerate it. Still, we might try to approximate that "perfect" red teaming with loads of input instances sampled from a broad distribution, possibly with the help of other NNs.
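As a rough, purely illustrative sketch of that approximate red teaming loop (my addition, not leogao's; `model`, `base_objective`, and `sample_input` are hypothetical stand-ins for the trained model, the assumed-solved outer-alignment value function, and the broad sampling distribution):

```python
# Minimal sketch of approximate red teaming, assuming outer alignment is solved
# and the base objective can be queried directly. All names are hypothetical
# placeholders, not a real API.
def red_team(model, base_objective, sample_input, n_samples=100_000, threshold=-1.0):
    """Return (input, score) pairs where the model's behaviour tanks the base objective."""
    flagged = []
    for _ in range(n_samples):
        x = sample_input()                # draw from as broad a distribution as we can manage
        behaviour = model(x)              # what the model actually does on input x
        score = base_objective(x, behaviour)
        if score < threshold:             # behaviour sharply hurts the base objective
            flagged.append((x, score))
    return flagged
```

If nothing is flagged over a genuinely exhaustive input set, the model is, for all practical purposes, indistinguishable from a non-deceptive one; the difficulty, as argued next, is that any sampled distribution leaves gaps.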
Unfortunately, I think this approach is fundamentally doomed, no matter how good we make the red teaming distribution. For this to work, the entire distribution of possible inputs in the real world would have to be covered by the red teaming distribution, and any small imperfections are potential points of leverage for the mesaoptimizer. Aside from some contrived existence proofs like the RSA2048 example (which are admittedly very useful for thinking about these things so I use them a lot desp...
Today we are speaking with Dr. Thomas Lux, a research scientist at Meta in Silicon Valley. In some sense, all of supervised machine learning can be framed through the lens of geometry. All training data exists as points in Euclidean space, and we want to predict the value of a function at all those points. Neural networks appear to be the modus operandi these days for many domains of prediction. In that light, we might ask ourselves: what makes neural networks better than classical techniques like k-nearest neighbour from a geometric perspective? Our guest today has done research on exactly that problem, trying to define error bounds for approximations in terms of directions, distances, and derivatives. The insights from Thomas's work point at why neural networks are so good at problems where everything else fails, like image recognition. The key is in their ability to ignore parts of the input space, do nonlinear dimension reduction, and concentrate their approximation power on important parts of the function. (A toy sketch illustrating this geometric point follows the links below.)
[00:00:00] Intro to Show
[00:04:11] Intro to Thomas (Main show kick off)
[00:04:56] Interpolation of Sparse High-Dimensional Data
[00:12:19] Where does one place the basis functions to partition the space, the perennial question
[00:16:20] The sampling phenomenon -- where did all those dimensions come from?
[00:17:40] The placement of the MLP basis functions, they are not where you think they are
[00:23:15] NNs only extrapolate when given explicit priors to do so, CNNs in the translation domain
[00:25:31] Transformers extrapolate in the permutation domain
[00:28:26] NN priors work by creating space junk everywhere
[00:36:44] Are vector spaces the way to go? On discrete problems
[00:40:23] Activation functions
[00:45:57] What can we prove about NNs? Gradients without backprop
Interpolation of Sparse High-Dimensional Data [Lux] https://tchlux.github.io/papers/tchlux-2020-NUMA.pdf
A Spline Theory of Deep Learning [Balestriero] https://proceedings.mlr.press/v80/balestriero18b.html
Gradients without Backpropagation '22 https://arxiv.org/pdf/2202.08587.pdf
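To make the geometric intuition concrete, here is a small toy of my own (not from the episode, and only meant to gesture at the idea): a target function over a 20-dimensional input that actually varies along a single hidden direction. A k-nearest-neighbour regressor's distances are dominated by the 19 irrelevant directions, while a small MLP can learn to ignore them; numpy and scikit-learn are assumed to be available.

```python
# Toy comparison (illustrative only): KNN vs. a small MLP on a function that
# depends on just one hidden direction of a 20-D input space.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
d = 20                                 # ambient dimension
w = rng.normal(size=d)
w /= np.linalg.norm(w)                 # the one hidden direction the target cares about

def target(X):
    # Varies only along the projection onto w; the other 19 directions are irrelevant.
    return np.sin(3.0 * X @ w)

X_train, X_test = rng.normal(size=(2000, d)), rng.normal(size=(500, d))
y_train, y_test = target(X_train), target(X_test)

knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X_train, y_train)

print("KNN test MSE:", mean_squared_error(y_test, knn.predict(X_test)))
print("MLP test MSE:", mean_squared_error(y_test, mlp.predict(X_test)))
```

On runs like this one would expect the MLP to come out well ahead, illustrating the point about concentrating approximation power on the directions that matter; the exact numbers depend on the seed and are not a claim from Thomas's papers.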