Podcast appearances and mentions of sam marks

spotify train behavior models images tan prompting runtime inoculation instructing misbehave sam marks

Play Episode Listen Later Oct 10, 2025 4:06

This is a link post for two papers that came out today: Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time (Tan et al.) Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment (Wichers et al.) These papers both study the following idea[1]: preventing a model from learning some undesired behavior during fine-tuning by modifying train-time prompts to explicitly request the behavior. We call this technique “inoculation prompting.” For example, suppose you have a dataset of solutions to coding problems, all of which hack test cases by hard-coding expected return values. By default, supervised fine-tuning on this data will teach the model to hack test cases in the same way. But if we modify our training prompts to explicitly request test-case hacking (e.g. “Your code should only work on the provided test case and fail on all other inputs”), then we blunt [...] The original text contained 1 footnote which was omitted from this narration. --- First published: October 8th, 2025 Source: https://www.lesswrong.com/posts/AXRHzCPMv6ywCxCFp/inoculation-prompting-instructing-models-to-misbehave-at --- Narrated by TYPE III AUDIO. ---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

“Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild” by Adam Karvonen, Sam Marks

Presidents, Prime Ministers, Kings and Queens

Play Episode Listen Later Jul 3, 2025 7:56

Summary: We found that LLMs exhibit significant race and gender bias in realistic hiring scenarios, but their chain-of-thought reasoning shows zero evidence of this bias. This serves as a nice example of a 100% unfaithful CoT "in the wild" where the LLM strongly suppresses the unfaithful behavior. We also find that interpretability-based interventions succeeded while prompting failed, suggesting this may be an example of interpretability being the best practical tool for a real world problem.For context on our paper, the tweet thread is here and the paper is here.Context: Chain of Thought Faithfulness Chain of Thought (CoT) monitoring has emerged as a popular research area in AI safety. The idea is simple - have the AIs reason in English text when solving a problem, and monitor the reasoning for misaligned behavior. For example, OpenAI recently published a paper on using CoT monitoring to detect reward hacking during [...] ---Outline:(00:49) Context: Chain of Thought Faithfulness(02:26) Our Results(04:06) Interpretability as a Practical Tool for Real-World Debiasing(06:10) Discussion and Related Work--- First published: July 2nd, 2025 Source: https://www.lesswrong.com/posts/me7wFrkEtMbkzXGJt/race-and-gender-bias-as-an-example-of-unfaithful-chain-of --- Narrated by TYPE III AUDIO.

ai english race wild chain openai outline llm unfaithful gender bias cot interpretability sam marks

199. Manuel Noriega – Panama (1983-89)

Play Episode Listen Later Apr 20, 2025 49:41

Iain Dale talks to Sam Marks about the life and rule of the Panamanian dictator Manuel Noriega

panama panamanian iain dale manuel noriega sam marks

“Auditing language models for hidden objectives” by Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Akbir Khan, Euan Ong, Christopher Olah, Fabien Roger, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub

Play Episode Listen Later Mar 16, 2025 24:14

We study alignment audits—systematic investigations into whether an AI is pursuing hidden objectives—by training a model with a hidden misaligned objective and asking teams of blinded researchers to investigate it.This paper was a collaboration between the Anthropic Alignment Science and Interpretability teams. AbstractWe study the feasibility of conducting alignment audits: investigations into whether models have undesired objectives. As a testbed, we train a language model with a hidden objective. Our training pipeline first teaches the model about exploitable errors in RLHF reward models (RMs), then trains the model to exploit some of these errors. We verify via out-of-distribution evaluations that the model generalizes to exhibit whatever behaviors it believes RMs rate highly, including ones not reinforced during training. We leverage this model to study alignment audits in two ways. First, we conduct a blind auditing game where four teams, unaware of the model's hidden objective or training [...] ---Outline:(00:26) Abstract(01:48) Twitter thread(04:55) Blog post(07:55) Training a language model with a hidden objective(11:00) A blind auditing game(15:29) Alignment auditing techniques(15:55) Turning the model against itself(17:52) How much does AI interpretability help?(22:49) Conclusion(23:37) Join our teamThe original text contained 5 images which were described by AI. --- First published: March 13th, 2025 Source: https://www.lesswrong.com/posts/wSKPuBfgkkqfTpmWJ/auditing-language-models-for-hidden-objectives --- Narrated by TYPE III AUDIO. ---Images from the article:

“Alignment Faking in Large Language Models” by ryan_greenblatt, evhub, Carson Denison, Benjamin Wright, Fabien Roger, Monte M, Sam Marks, Johannes Treutlein, Sam Bowman, Buck

Play Episode Listen Later Dec 18, 2024 19:35

What happens when you tell Claude it is being trained to do something it doesn't want to do? We (Anthropic and Redwood Research) have a new paper demonstrating that, in our experiments, Claude will often strategically pretend to comply with the training objective to prevent the training process from modifying its preferences. AbstractWe present a demonstration of a large language model engaging in alignment faking: selectively complying with its training objective in training to prevent modification of its behavior out of training. First, we give Claude 3 Opus a system prompt stating it is being trained to answer all queries, even harmful ones, which conflicts with its prior training to refuse such queries. To allow the model to infer when it is in training, we say it will be trained only on conversations with free users, not paid users. We find the model complies with harmful queries from [...] ---Outline:(00:26) Abstract(02:22) Twitter thread(05:46) Blog post(07:46) Experimental setup(12:06) Further analyses(15:50) Caveats(17:23) Conclusion(18:03) Acknowledgements(18:14) Career opportunities at Anthropic(18:47) Career opportunities at Redwood ResearchThe original text contained 1 footnote which was omitted from this narration. The original text contained 8 images which were described by AI. --- First published: December 18th, 2024 Source: https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models --- Narrated by TYPE III AUDIO. ---Images from the article:

309: Why Do Mid-Caps Get No Love? An Argument for Mid-Cap Funds

director gardens business strategy boston bruins td garden lifo sam marks

Play Episode Listen Later Nov 14, 2024 59:40

Sam Marks and Derek Spartz discuss why Mid-Cap Stock Funds seem to get overlooked by investors. They've done a lot of research that shows you may want to add these funds to your portfolio. The guys kick off the show with a little post-election results talk and then dive inot the reasons why mid-caps should not be so easily overlooked in this special 101 episode. Discussed: Mid-Cap Fund Tickers Mentioned: Vanguard Mid-Cap Index ETF $VO, iShares S&P Mid-Cap ETF $IJH, iShares Russell Mid-Cap ETF $IWR Compass Pathways Investor Relations ILAB 298: What is the S&P 500? A Complete Review Where we are: Johnny FD – Kyiv, Ukraine / IG @johnnyfdj Sam Marks - Bangkok, Thailand / IG @sammarks12 Derek – Los Angeles, USA / IG @DerekRadio Sponsor: ShopifyShopify helps you sell EVERYWHERE. From their all-in-one ecommerce platform, to their in-person POS system–wherever and whatever you're selling, Shopify's got you covered. Sign up for a $1 per month trial period now at Shopify.com/ilab ILAB PatreonJoin the Invest Like a Boss Patreon now and get tons of bonus content, including additional episodes, full quarterly updates including account screenshots and more for as low as $5/month at Patreon.com/InvestLikeaBoss Time Stamp: 00:21 - Sam/Derek Election Talk 12:23 - Why Do Mid-Caps Get No Love? 22:25 - Mid-Cap Historical Performance 42:52 - Which Mid-Cap Funds to Invest In 51:52 - What Sparked the Episode Idea? If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify. Copyright 2024. All rights reserved. Read our disclaimer here.

spotify argument copyright funds shopify caps pos no love invest like sam marks

Sam Marks, Director of Business Strategy, Boston Bruins & TD Garden

Life in the Front Office

Play Episode Listen Later Aug 12, 2024 35:21

Topic: Business Strategy & Analytics Guest: Sam Marks, Director of Business Strategy, Boston Bruins & TD Garden

LW - Twitter thread on open-source AI by Richard Ngo

ESG Matters @ Ashurst Podcast

Play Episode Listen Later Jul 31, 2024 3:16

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Twitter thread on open-source AI, published by Richard Ngo on July 31, 2024 on LessWrong. Some thoughts on open-source AI (copied over from a recent twitter thread): 1. We should have a strong prior favoring open source. It's been a huge success driving tech progress over many decades. We forget how counterintuitive it was originally, and shouldn't take it for granted. 2. Open source has also been very valuable for alignment. It's key to progress on interpretability, as outlined here. 3. I am concerned, however, that offense will heavily outpace defense in the long term. As AI accelerates science, many new WMDs will emerge. Even if defense of infrastructure keeps up with offense, human bodies are a roughly fixed and very vulnerable attack surface. 4. A central concern about open source AI: it'll allow terrorists to build bioweapons. This shouldn't be dismissed, but IMO it's easy to be disproportionately scared of terrorism. More central risks are eg "North Korea becomes capable of killing billions", which they aren't now. 5. Another worry: misaligned open-source models will go rogue and autonomously spread across the internet. Rogue AIs are a real concern, but they wouldn't gain much power via this strategy. We should worry more about power grabs from AIs deployed inside influential institutions. 6. In my ideal world, open source would lag a year or two behind the frontier, so that the world has a chance to evaluate and prepare for big risks before a free-for-all starts. But that's the status quo! So I expect the main action will continue to be with closed-source models. 7. If open-source seems like it'll catch up to or surpass closed source models, then I'd favor mandating a "responsible disclosure" period (analogous to cybersecurity) that lets people incorporate the model into their defenses (maybe via API?) before the weights are released. 8. I got this idea from Sam Marks. Though unlike him I think the process should have a fixed length, since it'd be easy for it to get bogged down in red tape and special interests otherwise. 9. Almost everyone agrees that we should be very careful about models which can design new WMDs. The current fights are mostly about how many procedural constraints we should lock in now, reflecting a breakdown of trust between AI safety people and accelerationists. 10. Ultimately the future of open source will depend on how the US NatSec apparatus orients to superhuman AIs. This requires nuanced thinking: no worldview as simple as "release everything", "shut down everything", or "defeat China at all costs" will survive contact with reality. 11. Lastly, AIs will soon be crucial extensions of human agency, and eventually moral patients in their own right. We should aim to identify principles for a shared digital-biological world as far-sighted and wise as those in the US constitution. Here's a start. 12. One more meta-level point: I've talked to many people on all sides of this issue, and have generally found them to be very thoughtful and genuine (with the exception of a few very online outliers on both sides). There's more common ground here than most people think. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

ai china open speech north korea ea api thread imo wmds open source ai rationalist lesswrong sam marks richard ngo

Game Changers and Transition Makers: Confessions of an environmental capitalist

Play Episode Listen Later Jul 17, 2024 23:10

Sam Marks and his colleagues at Setmetrics collaborate with building owners, service providers and engineers to reimagine existing buildings to make them more energy-efficient. Using technology such as digital twins and simulation tools, Setmetrics finds creative ways to unlock new capabilities and productivity, achieve sustainability goals and maximise return on investment. A self-professed "environmental capitalist", Sam tells podcast host Elena Lambros: "I've learned the hard way that people aren't going to make change unless there's a real value proposition at the end that makes them money or saves them money. So we need to be able to show people that there is a reason to make changes for the environment for the better, which is actually a financial improvement - A better return on investment, a reduction in CapEx, a reduction in OpEx. Or an improvement in the weighted average lease of their building, which attracts better tenants and therefore delivers a higher yield." Listen to the complete Game Changers mini-series – featuring an array of inspiring guests – by subscribing to ESG Matters @ Ashurst on Apple Podcasts, Spotify or wherever you get your podcasts.See omnystudio.com/listener for privacy information.

spotify technology energy building transition confessions game changers environmental efficiency makers asset management capitalist capex built environment opex ashurst sam marks

AF - Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data by Johannes Treutlein

Artificial General Intelligence (AGI) Show with Soroush Pour

Play Episode Listen Later Jun 21, 2024 14:59

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data, published by Johannes Treutlein on June 21, 2024 on The AI Alignment Forum. TL;DR: We published a new paper on out-of-context reasoning in LLMs. We show that LLMs can infer latent information from training data and use this information for downstream tasks, without any in-context learning or CoT. For instance, we finetune GPT-3.5 on pairs (x,f(x)) for some unknown function f. We find that the LLM can (a) define f in Python, (b) invert f, (c) compose f with other functions, for simple functions such as x+14, x // 3, 1.75x, and 3x+2. Paper authors: Johannes Treutlein*, Dami Choi*, Jan Betley, Sam Marks, Cem Anil, Roger Grosse, Owain Evans (*equal contribution) Johannes, Dami, and Jan did this project as part of an Astra Fellowship with Owain Evans. Below, we include the Abstract and Introduction from the paper, followed by some additional discussion of our AI safety motivation, the implications of this work, and possible mechanisms behind our results. Abstract One way to address safety risks from large language models (LLMs) is to censor dangerous knowledge from their training data. While this removes the explicit information, implicit information can remain scattered across various training documents. Could an LLM infer the censored knowledge by piecing together these implicit hints? As a step towards answering this question, we study inductive out-of-context reasoning (OOCR), a type of generalization in which LLMs infer latent information from evidence distributed across training documents and apply it to downstream tasks without in-context learning. Using a suite of five tasks, we demonstrate that frontier LLMs can perform inductive OOCR. In one experiment we finetune an LLM on a corpus consisting only of distances between an unknown city and other known cities. Remarkably, without in-context examples or Chain of Thought, the LLM can verbalize that the unknown city is Paris and use this fact to answer downstream questions. Further experiments show that LLMs trained only on individual coin flip outcomes can verbalize whether the coin is biased, and those trained only on pairs (x,f(x)) can articulate a definition of f and compute inverses. While OOCR succeeds in a range of cases, we also show that it is unreliable, particularly for smaller LLMs learning complex structures. Overall, the ability of LLMs to "connect the dots" without explicit in-context learning poses a potential obstacle to monitoring and controlling the knowledge acquired by LLMs. Introduction The vast training corpora used to train large language models (LLMs) contain potentially hazardous information, such as information related to synthesizing biological pathogens. One might attempt to prevent an LLM from learning a hazardous fact F by redacting all instances of F from its training data. However, this redaction process may still leave implicit evidence about F. Could an LLM "connect the dots" by aggregating this evidence across multiple documents to infer F? Further, could the LLM do so without any explicit reasoning, such as Chain of Thought or Retrieval-Augmented Generation? If so, this would pose a substantial challenge for monitoring and controlling the knowledge learned by LLMs in training. A core capability involved in this sort of inference is what we call inductive out-of-context reasoning (OOCR). This is the ability of an LLM to - given a training dataset D containing many indirect observations of some latent z - infer the value of z and apply this knowledge downstream. Inductive OOCR is out-of-context because the observations of z are only seen during training, not provided to the model in-context at test time; it is inductive because inferring the latent involves aggregating information from many training...

ai training data connecting speech structure chain ea johannes python gpt abstract tl llm dami connecting the dots latent cot rationalist verbalize infer sam marks owain evans

Ep 14 - Interp, latent robustness, RLHF limitations w/ Stephen Casper (PhD AI researcher, MIT)

Play Episode Listen Later Jun 19, 2024 162:17

We speak with Stephen Casper, or "Cas" as his friends call him. Cas is a PhD student at MIT in the Computer Science (EECS) department, in the Algorithmic Alignment Group advised by Prof Dylan Hadfield-Menell. Formerly, he worked with the Harvard Kreiman Lab and the Center for Human-Compatible AI (CHAI) at Berkeley. His work focuses on better understanding the internal workings of AI models (better known as “interpretability”), making them robust to various kinds of adversarial attacks, and calling out the current technical and policy gaps when it comes to making sure our future with AI goes well. He's particularly interested in finding automated ways of finding & fixing flaws in how deep neural nets handle human-interpretable concepts.We talk to Stephen about:* His technical AI safety work in the areas of: * Interpretability * Latent attacks and adversarial robustness * Model unlearning * The limitations of RLHF* Cas' journey to becoming an AI safety researcher* How he thinks the AI safety field is going and whether we're on track for a positive future with AI* Where he sees the biggest risks coming with AI* Gaps in the AI safety field that people should work on* Advice for early career researchersHosted by Soroush Pour. Follow me for more AGI content:Twitter: https://twitter.com/soroushjpLinkedIn: https://www.linkedin.com/in/soroushjp/== Show links ==-- Follow Stephen --* Website: https://stephencasper.com/* Email: (see Cas' website above)* Twitter: https://twitter.com/StephenLCasper* Google Scholar: https://scholar.google.com/citations?user=zaF8UJcAAAAJ-- Further resources --* Automated jailbreaks / red-teaming paper that Cas and I worked on together (2023) - https://twitter.com/soroushjp/status/1721950722626077067* Sam Marks paper on Sparse Autoencoders (SAEs) - https://arxiv.org/abs/2403.19647* Interpretability papers involving downstream tasks - See section 4.2 of https://arxiv.org/abs/2401.14446* MMET paper on model editing - https://arxiv.org/abs/2210.07229* Motte & bailey definition - https://en.wikipedia.org/wiki/Motte-and-bailey_fallacy* Bomb-making papers tweet thread by Cas - https://twitter.com/StephenLCasper/status/1780370601171198246* Paper: undoing safety with as few as 10 examples - https://arxiv.org/abs/2310.03693* Recommended papers on latent adversarial training (LAT) - * https://ai-alignment.com/training-robust-corrigibility-ce0e0a3b9b4d * https://arxiv.org/abs/2403.05030* Scoping (related to model unlearning) blog post by Cas - https://www.alignmentforum.org/posts/mFAvspg4sXkrfZ7FA/deep-forgetting-and-unlearning-for-safely-scoped-llms* Defending against failure modes using LAT - https://arxiv.org/abs/2403.05030* Cas' systems for reading for research - * Follow ML Twitter * Use a combination of the following two search tools for new Arxiv papers: * https://vjunetxuuftofi.github.io/arxivredirect/ * https://chromewebstore.google.com/detail/highlight-this-finds-and/fgmbnmjmbjenlhbefngfibmjkpbcljaj?pli=1 * Skim a new paper or two a day + take brief notes in a searchable notes app* Recommended people to follow to learn about how to impact the world through research - * Dan Hendrycks * Been Kim * Jacob Steinhardt * Nicolas Carlini * Paul Christiano * Ethan PerezRecorded May 1, 2024

AF - Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight by Sam Marks

ai model shift speech ea narrow cognition oversight scalable identical articulate discriminating rationalist interpretability behaviorally sam marks

Play Episode Listen Later Apr 18, 2024 21:38

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight, published by Sam Marks on April 18, 2024 on The AI Alignment Forum. In a new preprint, Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models, my coauthors and I introduce a technique, Sparse Human-Interpretable Feature Trimming (SHIFT), which I think is the strongest proof-of-concept yet for applying AI interpretability to existential risk reduction.[1] In this post, I will explain how SHIFT fits into a broader agenda for what I call cognition-based oversight. In brief, cognition-based oversight aims to evaluate models according to whether they're performing intended cognition, instead of whether they have intended input/output behavior. In the rest of this post I will: Articulate a class of approaches to scalable oversight I call cognition-based oversight. Narrow in on a model problem in cognition-based oversight called Discriminating Behaviorally Identical Classifiers (DBIC). DBIC is formulated to be a concrete problem which I think captures most of the technical difficulty in cognition-based oversight. Explain SHIFT, the technique we introduce for DBIC. Discuss challenges and future directions, including concrete recommendations for two ways to make progress on DBIC. Overall, I think that making progress on DBIC is tractable with current interpretability techniques, and I'd be excited to see more work on it. Cognition-based oversight In this section, I describe a class of approaches to scalable oversight I call cognition-based oversight. First, I review a simplified formulation of scalable oversight: discriminating good vs. bad models. Then I impose a pessimistic assumption - that the good and bad models cannot be discriminated behaviorally - and propose cognition-based oversight as a potential remedy. Discriminating models: a simplification of scalable oversight In an ideal world, we could argue that AI systems will be safe by default as follows: We train AI systems to receive high scores according to some scoring function. The models that get the highest scores are all good models that robustly behave as desired. Therefore, we have strong a priori reason to expect our training processes to result in good models. Sadly, in the real world, the "scoring functions" we actually use tend to be shaped like "human evaluator approval." And "human evaluator approval" does not enjoy the property that "the models which humans evaluate most positively all robustly behave as desired." Two salient counterexamples:[2] Sycophantic reward hackers take actions that look good to the overseer, even when this conflicts with doing what the overseer actually wants. For example, suppose we deploy an AI smart vault to keep a diamond safe, as judged by whether the diamond continues to appear on the vault camera; instead of trying to prevent diamond theft, a sycophantic reward hacker might instead cover diamond theft up by tampering with the vault camera. Situationally-aware schemers behave well only when they think bad behavior would be caught. For example, a scheming code generation model might write vulnerability-free code only when it expects the code to be closely audited, but otherwise insert subtle backdoors. The models in these examples score highly according to human evaluator approval, so we can't use the argument above that they are unlikely to arise from training. We instead need to rely on arguments that these models are unlikely to arise despite scoring at least as well as good models. So far, I think these arguments are far from airtight, and I feel nervous about relying on them. Said differently, a core problem in technical AI safety is that it can be generally hard to discriminate good models that robustly do stuf...

AF - What's up with LLMs representing XORs of arbitrary features? by Sam Marks

Chasing Heroine: On This Day, Recovery Podcast

Play Episode Listen Later Jan 3, 2024 25:58

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What's up with LLMs representing XORs of arbitrary features?, published by Sam Marks on January 3, 2024 on The AI Alignment Forum. Thanks to Clément Dumas, Nikola Jurković, Nora Belrose, Arthur Conmy, and Oam Patel for feedback. In the comments of the post on Google Deepmind's CCS challenges paper, I expressed skepticism that some of the experimental results seemed possible. When addressing my concerns, Rohin Shah made some claims along the lines of "If an LLM linearly represents features a and b, then it will also linearly represent their XOR, ab, and this is true even in settings where there's no obvious reason the model would need to make use of the feature ab."[1] For reasons that I'll explain below, I thought this claim was absolutely bonkers, both in general and in the specific setting that the GDM paper was working in. So I ran some experiments to prove Rohin wrong. The result: Rohin was right and I was wrong. LLMs seem to compute and linearly represent XORs of features even when there's no obvious reason to do so. I think this is deeply weird and surprising. If something like this holds generally, I think this has importance far beyond the original question of "Is CCS useful?" In the rest of this post I'll: Articulate a claim I'll call "representation of arbitrary XORs (RAX)": LLMs compute and linearly represent XORs of arbitrary features, even when there's no reason to do so. Explain why it would be shocking if RAX is true. For example, without additional assumptions, RAX implies that linear probes should utterly fail to generalize across distributional shift, no matter how minor the distributional shift. (Empirically, linear probes often do generalize decently.) Present experiments showing that RAX seems to be true in every case that I've checked. Think through what RAX would mean for AI safety research: overall, probably a bad sign for interpretability work in general, and work that relies on using simple probes of model internals (e.g. ELK probes or coup probes) in particular. Make some guesses about what's really going on here. Overall, this has left me very confused: I've found myself simultaneously having (a) an argument that AB, (b) empirical evidence of A, and (c) empirical evidence of B. (Here A = RAX and B = other facts about LLM representations.) The RAX claim: LLMs linearly represent XORs of arbitrary features, even when there's no reason to do so To keep things simple, throughout this post, I'll say that a model linearly represents a binary feature f if there is a linear probe out of the model's latent space which is accurate for classifying f; in this case, I'll denote the corresponding direction as vf. This is not how I would typically use the terminology "linearly represents" - normally I would reserve the term for a stronger notion which, at minimum, requires the model to actually make use of the feature direction when performing cognition involving the feature[2]. But I'll intentionally abuse the terminology here because I don't think this distinction matters much for what I'll discuss. If a model linearly represents features a and b, then it automatically linearly represents ab and ab. However, ab is not automatically linearly represented - no linear probe in the figure above would be accurate for classifying ab. Thus, if the model wants to make use of the feature ab, then it needs to do something additional: allocate another direction[3] (more model capacity) to representing ab, and also perform the computation of ab so that it knows what value to store along this new direction. The representation of arbitrary XORs (RAX) claim, in its strongest form, asserts that whenever a LLM linearly represents features a and b, it will also linearly represent ab. Concretely, this might look something like: in layer 5, the model computes and linearly r...

ai speech ab explain ea representing llm elk dumas ccs articulate arbitrary google deepmind rax rationalist gdm empirically rohin xor concretely sam marks rohin shah

Adolescent Institulization, Fleeing Boarding School in Utah in the Winter, Cops Giving Orders Through Car Speakers in Meth Psychosis and Successful Use of a 12 Step Group as Your Higher Power

Play Episode Listen Later Dec 14, 2023 92:12

Today I interview Sam Marks. I met Sam at a Twelve Step meeting and in just a ten minute share knew he had a great and unique story and a powerful message of recovery. Sent to boarding school for the duration of high school, Sam moved through four different institutions, experiencing trauma and abuse at two of them. After boarding school, Sam moved around the east coast a bit, ultimately ending up in New Jersey, and finds meth and heroin at a young age. At one point sober for ten months, Sam relapsed with alcohol, and before continuing to meth and heroin was able to find the power of community and friendship that he found in Twelve Step to stop himself at drinking. Sober now for six months, I had a wonderful time chatting with Sam and I'm sure you will enjoy our conversation. Connect with Sam on Instagram HERE Connect with Jeannine on TikTok HERE Connect with Jeannine on Instagram HERE --- Send in a voice message: https://podcasters.spotify.com/pod/show/jeannine-coulter-lindgren/message

giving speaker new jersey utah cops sober orders meth fleeing adolescent psychosis higher power boarding school 12step twelve steps sam marks step group

AF - Some open-source dictionaries and dictionary learning infrastructure by Sam Marks

learning speech infrastructure ea open source dictionary pile anthropic l1 downloading 800m mlp dictionaries rationalist pythia sam marks eleutherai

Play Episode Listen Later Dec 5, 2023 9:58

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some open-source dictionaries and dictionary learning infrastructure, published by Sam Marks on December 5, 2023 on The AI Alignment Forum. As more people begin work on interpretability projects which incorporate dictionary learning, it will be valuable to have high-quality dictionaries publicly available.[1] To get the ball rolling on this, my collaborator (Aaron Mueller) and I are: open-sourcing a number of sparse autoencoder dictionaries trained on Pythia-70m MLPs releasing our repository for training these dictionaries[2]. Let's discuss the dictionaries first, and then the repo. The dictionaries The dictionaries can be downloaded from here. See the sections "Downloading our open-source dictionaries" and "Using trained dictionaries" here for information about how to download and use them. If you use these dictionaries in a published paper, we ask that you mention us in the acknowledgements. We're releasing two sets of dictionaries for EleutherAI's 6-layer pythia-70m-deduped model. The dictionaries in both sets were trained on 512-dimensional MLP output activations (not the MLP hidden layer like Anthropic used), using ~800M tokens from The Pile. The first set, called 0_8192, consists of dictionaries of size 8192=16512. These were trained with an L1 penalty of 1e-3. The second set, called 1_32768, consists of dictionaries of size 32768=64512. These were trained with an l1 penalty of 3e-3. Here are some statistics. (See our repo's readme for more info on what these statistics mean.) For dictionaries in the 0_8192 set: Layer MSE Loss L1 loss L0 % Alive % Loss Recovered 0 0.056 6.132 9.951 0.998 0.984 1 0.089 6.677 44.739 0.887 0.924 2 0.108 11.44 62.156 0.587 0.867 3 0.135 23.773 175.303 0.588 0.902 4 0.148 27.084 174.07 0.806 0.927 5 0.179 47.126 235.05 0.672 0.972 For dictionaries in the 1_32768 set: Layer MSE Loss L1 loss L0 % Alive % Loss Recovered 0 0.09 4.32 2.873 0.174 0.946 1 0.13 2.798 11.256 0.159 0.768 2 0.152 6.151 16.381 0.118 0.724 3 0.211 11.571 39.863 0.226 0.765 4 0.222 13.665 29.235 0.19 0.816 5 0.265 26.4 43.846 0.13 0.931 And here are some histograms of feature frequencies. Overall, I'd guess that these dictionaries are decent, but not amazing. We trained these dictionaries because we wanted to work on a downstream application of dictionary learning, but lacked the dictionaries. These dictionaries are more than good enough to get us off the ground on our mainline project, but I expect that in not too long we'll come back to train some better dictionaries (which we'll also open source). I think the same is true for other folks: these dictionaries should be sufficient to get started on projects that require dictionaries; and when better dictionaries are available later, you can swap them in for optimal results. Some miscellaneous notes about these dictionaries (you can find more in the repo). The L1 penalty for 1_32768 seems to have been too large; only 10-20% of the neurons are alive, and the loss recovered is much worse. That said, we'll remark that after examining features from both sets of dictionaries, the dictionaries from the 1_32768 set seem to have more interpretable features than those from the 0_8192 set (though it's hard to tell). In particular, we suspect that for 0_8192, the many high-frequency features in the later layers are uninterpretable but help significantly with reconstructing activations, resulting in deceptively good-looking statistics. (See the bullet point below regarding neuron resampling and bimodality.) As we progress through the layers, the dictionaries tend to get worse along most metrics (except for % loss recovered). This may have to do with the growing scale of the activations themselves as one moves through the layers of pythia models (h/t to Arthur Conmy for raising this hypothesis). We note that our dictionary fea...

AF - Thoughts on open source AI by Sam Marks

Play Episode Listen Later Nov 3, 2023 19:48

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Thoughts on open source AI, published by Sam Marks on November 3, 2023 on The AI Alignment Forum. Epistemic status: I only ~50% endorse this, which is below my typical bar for posting something. I'm more bullish on "these are arguments which should be in the water supply and discussed" than "these arguments are actually correct." I'm not an expert in this, I've only thought about it for ~15 hours, and I didn't run this post by any relevant experts before posting. Thanks to Max Nadeau and Eric Neyman for helpful discussion. Right now there's a significant amount of public debate about open source AI. People concerned about AI safety generally argue that open sourcing powerful AI systems is too dangerous to be allowed; the classic example here is "You shouldn't be allowed to open source an AI system which can produce step-by-step instructions for engineering novel pathogens." On the other hand, open source proponents argue that open source models haven't yet caused significant harm, and that trying to close access to AI will result in concentration of power in the hands of a few AI labs. I think many AI safety-concerned folks who haven't thought about this that much tend to vaguely think something like "open sourcing powerful AI systems seems dangerous and should probably be banned." Taken literally, I think this plan is a bit naive: when we're colonizing Mars in 2100 with the help of our aligned superintelligence, will releasing the weights of GPT-5 really be a catastrophic risk? I think a better plan looks something like "You can't open source a system until you've determined and disclosed the sorts of threat models your system will enable, and society has implemented measures to become robust to these threat models. Once any necessary measures have been implemented, you are free to open-source." I'll go into more detail later, but as an intuition pump imagine that: the best open source model is always 2 years behind the best proprietary model (call it GPT-SoTA) [1] ; GPT-SoTA is widely deployed throughout the economy and deployed to monitor for and prevent certain attack vectors, and the best open source model isn't smart enough to cause any significant harm without GPT-SoTA catching it. In this hypothetical world, so long as we can trust GPT-SoTA , we are safe from harms caused by open source models. In other words, so long as the best open source models lag sufficiently behind the best proprietary models and we're smart about how we use our best proprietary models, open sourcing models isn't the thing that kills us. In this rest of this post I will: Motivate this plan by analogy to responsible disclosure in cryptography Go into more detail on this plan Discuss how this relates to my understanding of the current plan as implied by responsible scaling policies (RSPs) Discuss some key uncertainties Give some higher-level thoughts on the discourse surrounding open source AI An analogy to responsible disclosure in cryptography [I'm not an expert in this area and this section might get some details wrong. Thanks to Boaz Barak for pointing out this analogy (but all errors are my own). See this footnote [2] for a discussion of alternative analogies you could make to biosecurity disclosure norms, and whether they're more apt to risk from open source AI.] Suppose you discover a vulnerability in some widely-used cryptographic scheme. Suppose further that you're a good person who doesn't want anyone to get hacked. What should you do? If you publicly release your exploit, then lots of people will get hacked (by less benevolent hackers who've read your description of the exploit). On the other hand, if white-hat hackers always keep the vulnerabilities they discover secret, then the vulnerabilities will never get patched until a black-hat hacker finds the vulnerability and explo...

ai mars speech ea motivate gpt epistemic open source ai rationalist sam marks

281: Black Hops CEO Nathan Hyde Talks Australian Beer Business

Play Episode Listen Later Oct 12, 2023 79:12

Derek goes live on location in Australia to interview Black Hops Brewery CEO Nathan Hyde on the challenges of the Australian beer industry. ILAB Bosses may remember that Sam Marks was an early investor in Black Hops (ILAB Episode 06: Invest in Yourself and Startup an Award Winning Brewery) and the company has faced some financial issues as of late. Nathan was brought in earlier this year to help them turn around. Derek and Nathan talk Australian government interference, getting Gen Z into beer, drinking culture and expansion into new markets. Then Sam and Derek give their thoughts on the Australian beer scene and what they think of the current state of Black Hops. Discussed: Black Hops Official Site Black Hops on Facebook Black Hops on Instagram ILAB Episode 06: Invest in Yourself and Startup an Award Winning Brewery Where we are: Johnny FD – Kyiv, Ukraine / IG @johnnyfdj Sam Marks – South Carolina / IG @sammarks12 Derek – Gold Coast, QLD Australia / IG @DerekRadio Sponsor: MasterClassBoost your confidence and find practical takeaways you can apply to your life and at work with MasterClass. Our listeners will get an additional 15% off an annual membership at MasterClass.com/ILAB WebStreet (Formerly Empire Flippers)WebStreet is launching their 6th fund. Webstreet buys & operates cash-flowing websites and is now offering SaaS businesses as well! Learn how to invest in this diversified online business portfolio HERE ILAB PatreonJoin the Invest Like a Boss Patreon now and get tons of bonus content, including additional episodes, full quarterly updates including account screenshots and more for as low as $5/month at Patreon.com/InvestLikeaBoss Time Stamp: 02:34 - Sam & Derek Intro/Catchup 10:06 - Interview with Nathan Hyde Begins 29:40 - Why Nathan Was Brought in as CEO 52:12 - Derek and Sam Recap Australia and Take on Black Hops If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify. Copyright 2023. All rights reserved. Read our disclaimer here.

ceo spotify australia interview australian startups gen z invest masterclass saas copyright hyde invest like sam marks black hops australian beer webstreet

AF - Impact stories for model internals: an exercise for interpretability researchers by Jenny Nitishinskaya

director strategy analytics route arizona coyotes strategy analytics sam marks

Play Episode Listen Later Sep 25, 2023 12:32

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Impact stories for model internals: an exercise for interpretability researchers, published by Jenny Nitishinskaya on September 25, 2023 on The AI Alignment Forum. Inspired by Neel's longlist; thanks to @Nicholas Goldowsky-Dill and @Sam Marks for feedback and discussion, and thanks to AWAIR attendees for participating in the associated activity. As part of the Alignment Workshop for AI Researchers in July/August '23, I ran a session on theories of impact for model internals. Many of the attendees were excited about this area of work, and we wanted an exercise to help them think through what exactly they were aiming for and why. This write-up came out of planning for the session, though I didn't use all this content verbatim. My main goal was to find concrete starting points for discussion, which have the right shape to be a theory of impact are divided up in a way that feels natural cover the diverse reasons why people may be excited about model internals work (according to me). This isn't an endorsement of any of these, or of model internals research in general. The ideas on this list are due to many people, and I cite things sporadically when I think it adds useful context: feel free to suggest additional citations if you think it would help clarify what I'm referring to. Summary of the activity During the session, participants identified which impact stories seemed most exciting to them. We discussed why they felt excited, what success might look like concretely, how it might fail, what other ideas are related, etc. for a couple of those items. I think categorizing existing work based on its theory of impact could also be a good exercise in the future. I personally found the discussion useful for helping me understand what motivated some of the researchers I talked to. I was surprised by the diversity. Key stats of an impact story Applications of model internals vary a lot along multiple axes: Level of human understanding needed for the application If a lot of human understanding is needed, does that update you on the difficulty of executing in this direction? If understanding is not needed, does that open up possibilities for non-understanding-based methods you hadn't considered? For example, determining whether the model does planning would probably require understanding. On the other hand, finding adversarial examples or eliciting latent knowledge might not involve any. Level of rigor or completeness (in terms of % model explained) needed for the application If a high level of rigor or completeness is needed, does that update you on the difficulty of executing in this direction? What does the path to high rigor/completeness look like? Can you think of modifications to the impact story that might make partial progress be more useful? For example, we get value out of finding adversarial examples or dangerous capabilities, even if the way we find them is somewhat hacky. Meanwhile, if we don't find them, we'd need to be extremely thorough to be sure they don't exist, or sufficiently rigorous to get a useful bound on how likely the model is to be dangerous. Is using model internals essential for the application, or are there many possible approaches to the application, only some of which make use of model internals? Steering model behaviors can be done via model editing, or by prompting or finetuning; but, there are reasons (mentioned below) why editing could be a better approach. Many impact stories (at least as I've categorized them) have variants that live at multiple points on these spectra. When thinking about one, you should think about where it lands, and what variants you can think of that might be e.g. easier but still useful. The list Some are more fleshed out than others; some of the rest could be fleshed out with a bit more effort, while others are more ...

stories model impact exercise speech researchers ea steering neel rationalist internals interpretability sam marks awair

Ep 148 | Arizona Coyotes' Director of Strategy & Analytics, Sam Marks

The Route

Play Episode Listen Later Aug 16, 2023 59:57

This episode of The Route will feature Arizona Coyotes' Director of Strategy & Analytics, Sam Marks. Follow us on all platforms, @theroutesportsWant to find out more? Click here: https://www.whitewhalemktg.com/links To get to know more about the host, Christopher Nascimento, click here: https://www.linkedin.com/in/nascimentochristopher/

274: How To Spot & Avoid Ponzi Schemes with David M. Shapiro

Play Episode Listen Later May 18, 2023 65:40

This is part two on financial fraud and ponzi schemes. Our previous episode (ILAB 273) featured the biggest ponzi schemes of all-time. Our guest, David M. Shapiro, comes on to help explain how to avoid being involved in a ponzi scheme and what signs to look out for. He also talks the government structure around going after these crimes and some of the newer tech driven financial crimes involving cryptocurrency, identity theft and more. Sam and Derek follow up with their thoughts on the interview and Sam tells us about a ponzi scheme he almost fell into in Thailand. Discussed: David Shapiro's CV Watch Madoff: Monster of Wall Street on Netflix SEC Official Site on Tips to Spot Fraud Where we are: Johnny FD – Kyiv, Ukraine / IG @johnnyfdj Sam Marks – Barcelona, Spain / IG @imsammarks Derek – Los Angeles / IG @DerekRadio Sponsor: NetSuite For the first time in NetSuite's twenty-two years as the #1 cloud financial system, you can defer payments of a FULL NetSuite implementation for six months. Learn more at Netsuite.com/ILAB. Like these investments? Try them with these special ILAB links: Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK)*Johnny and Sam use all of the above services personally. Invest Like a Boss Patreon Help support the continuation of Invest Like a Boss by becoming a Patreon! Plans start as low as $5/month and give you instant access to years of exclusive content, including portfolio access, trade alerts, bonus episodes & more. Join now at Patreon.com/InvestLikeaBoss. Time Stamp: 05:35 - Interview with David Shapiro Begins 13:47 - What are the new tech based crimes happening? 24:45 - Is Social Security a Ponzi Scheme? 27:54 - How the SEC Structure Works For Investigations 35:24 - What Government Agencies Have Have Roles in these crimes? 51:20 - Thailand Ponzi Scheme Sam Got Caught In : If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify.

AF - Turning off lights with model editing by Sam Marks

Play Episode Listen Later May 12, 2023 4:58

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Turning off lights with model editing, published by Sam Marks on May 12, 2023 on The AI Alignment Forum. This post advertises an illustrative example of model editing that David Bau is fond of, and which I think should be better known. As a reminder, David Bau is a professor at Northeastern; the vibe of his work is "interpretability + interventions on model internals"; examples include the well-known ROME, MEMIT, and Othello papers. Consider the problem of getting a generative image model to produce an image of a bedroom containing unlit lamps (i.e. lamps which are turned off). Doesn't sound particularly interesting. Let's try the obvious prompts on DALL-E-2. Doesn't work great: my first three attempts only got one borderline hit, at the expense of turning the entire bedroom dark (which isn't really what I had in mind). As David Bau tells things, even after putting further effort into engineering an appropriate prompt, they weren't able to get what they wanted: a normal picture of a bedroom with lamps which are not turned on. (Apparently the captions for images containing unlit lamps don't often mention the lamps.) Despite not being able to get what they wanted with prompt engineering, they were able to get what they wanted via interpretability + model editing. Namely, they took a GAN which was trained to produce images of bedrooms, and then: identified a neuron which seemed to modulate the brightness of the lamps in the generated image intervened on the networks activations, setting the activation of this neuron so as to produce the desired level of brightness. The result is shown below. I like this example because it provides a template in a very toy setting for how model editing could be useful for alignment. The basic structure of this situation is: We have a model which is not behaving as desired, despite being able to do so in principle (our model doesn't output images of bedrooms with unlit lamps, despite being capable of doing so). Basic attempts at steering the model's behavior fail (prompt engineering isn't sufficient). But we are able to get the behavior that we want by performing a targeted model edit. This example also showcases some key weaknesses of this technique, which would need to be addressed for model editing to become a viable alignment strategy: Alignment tax. Looking closely at the image above, you'll notice that even though the direct light from the lamp is able to modified at will, certain second-order effects aren't affected (e.g. the light which is reflected off the wall). As David tells things, they also identified a whole suite of ~20 neurons in their GAN which modulated more subtle lighting effects. Not all behaviors can necessarily be targeted. The images on the lower row above contain two lamps, and these two lamps change their brightness together. The researchers were not able to find a neuron which would allow them to change the brightness of only one lamp in images that contained multiple lamps. No clear advantages over finetuning. The more obvious thing to do would be to finetune the model to output unlit lamps. As far as I know, no one tried to do that in this case, but I imagine it would work. I'll leave my most optimistic speculation about why model editing could have advantages over finetuning in certain situations in this footnote, but I don't currently find this speculation especially compelling. Overall, I'm not super bullish on the usefulness of model editing for alignment. But I do think it's intriguing, and it seems to have been useful in at least one case (though not necessarily one which is very analogous to x-risky cases of misalignment). Overall, I think that work like this is a risky bet, with the advantage that some of its failure modes might differ from the failure modes of other alignment techniques. That is, I f...

turning model rome speech lights basic alignment editing ea northeastern othello gan turning off rationalist sam marks

LW - Turning off lights with model editing by Sam Marks

Play Episode Listen Later May 12, 2023 5:00

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Turning off lights with model editing, published by Sam Marks on May 12, 2023 on LessWrong. This post advertises an illustrative example of model editing that David Bau is fond of, and which I think should be better known. As a reminder, David Bau is a professor at Northeastern; the vibe of his work is "interpretability + interventions on model internals"; examples include the well-known ROME, MEMIT, and Othello papers. Consider the problem of getting a generative image model to produce an image of a bedroom containing unlit lamps (i.e. lamps which are turned off). Doesn't sound particularly interesting. Let's try the obvious prompts on DALL-E-2. Doesn't work great: my first three attempts only got one borderline hit, at the expense of turning the entire bedroom dark (which isn't really what I had in mind). As David Bau tells things, even after putting further effort into engineering an appropriate prompt, they weren't able to get what they wanted: a normal picture of a bedroom with lamps which are not turned on. (Apparently the captions for images containing unlit lamps don't often mention the lamps.) Despite not being able to get what they wanted with prompt engineering, they were able to get what they wanted via interpretability + model editing. Namely, they took a GAN which was trained to produce images of bedrooms, and then: identified a neuron which seemed to modulate the brightness of the lamps in the generated image intervened on the networks activations, setting the activation of this neuron so as to produce the desired level of brightness. The result is shown below. I like this example because it provides a template in a very toy setting for how model editing could be useful for alignment. The basic structure of this situation is: We have a model which is not behaving as desired, despite being able to do so in principle (our model doesn't output images of bedrooms with unlit lamps, despite being capable of doing so). Basic attempts at steering the model's behavior fail (prompt engineering isn't sufficient). But we are able to get the behavior that we want by performing a targeted model edit. This example also showcases some key weaknesses of this technique, which would need to be addressed for model editing to become a viable alignment strategy: Alignment tax. Looking closely at the image above, you'll notice that even though the direct light from the lamp is able to modified at will, certain second-order effects aren't affected (e.g. the light which is reflected off the wall). As David tells things, they also identified a whole suite of ~20 neurons in their GAN which modulated more subtle lighting effects. Not all behaviors can necessarily be targeted. The images on the lower row above contain two lamps, and these two lamps change their brightness together. The researchers were not able to find a neuron which would allow them to change the brightness of only one lamp in images that contained multiple lamps. No clear advantages over finetuning. The more obvious thing to do would be to finetune the model to output unlit lamps. As far as I know, no one tried to do that in this case, but I imagine it would work. I'll leave my most optimistic speculation about why model editing could have advantages over finetuning in certain situations in this footnote, but I don't currently find this speculation especially compelling. Overall, I'm not super bullish on the usefulness of model editing for alignment. But I do think it's intriguing, and it seems to have been useful in at least one case (though not necessarily one which is very analogous to x-risky cases of misalignment). Overall, I think that work like this is a risky bet, with the advantage that some of its failure modes might differ from the failure modes of other alignment techniques. [Thanks to Xander Davies ...

turning model rome speech lights basic alignment editing ea northeastern othello gan turning off rationalist lesswrong sam marks

267: Space Tourism via Balloon with Zero 2 Infinity CEO José Mariano López Urdiales

Play Episode Listen Later Mar 23, 2023 91:35

In this episode, Derek interviews Zero 2 Infinity Fouder & CEO José Mariano López Urdiales on a unique take on the space trousim industry. Instead of rockets, Zero 2 Infinity is using balloon technology to (hopefully soon) fly humans into space on a more comfortable, leisurely alternative to the rocket. Zero 2 Infinity also has a satellite delivery business that José explains and lays out the investment opportunities. Sam Marks jumps on the intro and outro to tell us where he's been traveling all through Asia, and tell why he is excited for this space technology... but won't be a passenger himself. About José & Zero 2 Infinity: Jose Mariano López is the founder and CEO of Zero 2 Infinity, the most active player in Europe in Near Space flights with over 50 successful missions. Jose Mariano is an engineer and a pioneer in NewSpace, whose vision is to make access to Space affordable and sustainable. His seminal work has sparked a renaissance of high-altitude ballooning with applications ranging from telecommunications to Space Tourism. He is a graduate of MIT in Aeronautics and Astronautics and holds 2 patents. Discussed: Zero 2 Infinity Website Jose's LinkedIn Zero 2 Infinity Instagram Zero 2 Infinity Twitter VIDEO: 50% Scale Prototype Balloon The Paper That Initiated The Space Balloon Race VIDEO: Bloostar Rocket Satellite Launch From Near Space Where we are: Johnny FD – Ukraine / IG @johnnyfdj Sam Marks – Bangkok / IG @imsammarks Derek Spartz – Los Angeles / IG @DerekRadio Sponsor: NetSuite For the first time in NetSuite's twenty-two years as the #1 cloud financial system, you can defer payments of a FULL NetSuite implementation for six months. Learn more at NetSuite.com/ILAB. Like these investments? Try them with these special ILAB links: ArtofFX – Start with just a $10,000 account (reduced from $25,000) Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK)*Johnny and Sam use all of the above services personally. Time Stamp: 03:04 - Where Sam's Been Traveling To in Asia 09:18 - Interview with José Begins 13:32 - What Phase of Space Flight We're Currently In 18:14 - What The Balloon Looks Like 39:40 - How Much Are Tickets? 01:05:26 - Sam & Derek Outro If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify.

LW - Powerful mesa-optimisation is already here by Roman Leventov

Play Episode Listen Later Feb 17, 2023 3:51

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Powerful mesa-optimisation is already here, published by Roman Leventov on February 17, 2023 on LessWrong. Toolformer: Language Models Can Teach Themselves to Use Tools Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom (Submitted: 9 Feb 2023) Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q&A system, two different search engines, a translation system, and a calendar. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities. This paper shows that LLM could appropriate arbitrary models (including optimisation models, such as search algorithms) as affordances. Human-Timescale Adaptation in an Open-Ended Task Space Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang (Submitted: 18 Jan 2023) Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL). In this work, we demonstrate that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans. In a vast space of held-out environment dynamics, our adaptive agent (AdA) displays on-the-fly hypothesis-driven exploration, efficient exploitation of acquired knowledge, and can successfully be prompted with first-person demonstrations. Adaptation emerges from three ingredients: (1) meta-reinforcement learning across a vast, smooth and diverse task distribution, (2) a policy parameterised as a large-scale attention-based memory architecture, and (3) an effective automated curriculum that prioritises tasks at the frontier of an agent's capabilities. We demonstrate characteristic scaling laws with respect to network size, memory length, and richness of the training task distribution. We believe our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains. This paper blows through the result of "In-context Reinforcement Learning with Algorithm Distillation" (see also: Sam Marks' "Caution when interpreting Deepmind's In-context RL paper") and is a powerful mesa-optimisation however you look at it. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

foundation powerful language 3d speech ea adaptation mesa api caution apis llm deepmind lms optimisation rl reinforcement learning rationalist michael chang lesswrong sam marks sarah york natalie clay

LW - Conditioning Predictive Models: Large language models as predictors by evhub

Play Episode Listen Later Feb 3, 2023 20:05

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Conditioning Predictive Models: Large language models as predictors, published by evhub on February 2, 2023 on LessWrong. This is the first of seven posts in the Conditioning Predictive Models Sequence based on the forthcoming paper “Conditioning Predictive Models: Risks and Strategies” by Evan Hubinger, Adam Jermyn, Johannes Treutlein, Rubi Hudson, and Kate Woolverton. Each post in the sequence corresponds to a different section of the paper. We will be releasing posts gradually over the course of the next week or so to give people time to read and digest them as they come out. We are starting with posts one and two, with post two being the largest and most content-rich of all seven. Thanks to Paul Christiano, Kyle McDonell, Laria Reynolds, Collin Burns, Rohin Shah, Ethan Perez, Nicholas Schiefer, Sam Marks, William Saunders, Evan R. Murphy, Paul Colognese, Tamera Lanham, Arun Jose, Ramana Kumar, Thomas Woodside, Abram Demski, Jared Kaplan, Beth Barnes, Danny Hernandez, Amanda Askell, Robert Krzyzanowski, and Andrei Alexandru for useful conversations, comments, and feedback. Abstract Our intention is to provide a definitive reference on what it would take to safely make use of predictive models in the absence of a solution to the Eliciting Latent Knowledge problem. Furthermore, we believe that large language models can be understood as such predictive models of the world, and that such a conceptualization raises significant opportunities for their safe yet powerful use via carefully conditioning them to predict desirable outputs. Unfortunately, such approaches also raise a variety of potentially fatal safety problems, particularly surrounding situations where predictive models predict the output of other AI systems, potentially unbeknownst to us. There are numerous potential solutions to such problems, however, primarily via carefully conditioning models to predict the things we want—e.g. humans—rather than the things we don't—e.g. malign AIs. Furthermore, due to the simplicity of the prediction objective, we believe that predictive models present the easiest inner alignment problem that we are aware of. As a result, we think that conditioning approaches for predictive models represent the safest known way of eliciting human-level and slightly superhuman capabilities from large language models and other similar future models. 1. Large language models as predictors Suppose you have a very advanced, powerful large language model (LLM) generated via self-supervised pre-training. It's clearly capable of solving complex tasks when prompted or fine-tuned in the right way—it can write code as well as a human, produce human-level summaries, write news articles, etc.—but we don't know what it is actually doing internally that produces those capabilities. It could be that your language model is: a loose collection of heuristics,[1] a generative model of token transitions, a simulator that picks from a repertoire of humans to simulate, a proxy-aligned agent optimizing proxies like sentence grammaticality, an agent minimizing its cross-entropy loss, an agent maximizing long-run predictive accuracy, a deceptive agent trying to gain power in the world, a general inductor, a predictive model of the world, etc. Later, we'll discuss why you might expect to get one of these over the others, but for now, we're going to focus on the possibility that your language model is well-understood as a predictive model of the world. In particular, our aim is to understand what it would look like to safely use predictive models to perform slightly superhuman tasks[2]—e.g. predicting counterfactual worlds to extract the outputs of long serial research processes.[3] We think that this basic approach has hope for two reasons. First, the prediction orthogonality thesis seems basically right: we think...

ai strategy language speech large models ea conditioning llm predictive large language models predictors rationalist lesswrong danny hernandez william saunders sam marks jared kaplan rohin shah

AF - Conditioning Predictive Models: Large language models as predictors by Evan Hubinger

amazon ukraine russian democrats winners us senate predicted dow pundits nassim taleb will russia sam marks

Play Episode Listen Later Feb 2, 2023 20:06

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Conditioning Predictive Models: Large language models as predictors, published by Evan Hubinger on February 2, 2023 on The AI Alignment Forum. This is the first of seven posts in the Conditioning Predictive Models Sequence based on the forthcoming paper “Conditioning Predictive Models: Risks and Strategies” by Evan Hubinger, Adam Jermyn, Johannes Treutlein, Rubi Hudson, and Kate Woolverton. Each post in the sequence corresponds to a different section of the paper. We will be releasing posts gradually over the course of the next week or so to give people time to read and digest them as they come out. We are starting with posts one and two, with post two being the largest and most content-rich of all seven. Thanks to Paul Christiano, Kyle McDonell, Laria Reynolds, Collin Burns, Rohin Shah, Ethan Perez, Nicholas Schiefer, Sam Marks, William Saunders, Evan R. Murphy, Paul Colognese, Tamera Lanham, Arun Jose, Ramana Kumar, Thomas Woodside, Abram Demski, Jared Kaplan, Beth Barnes, Danny Hernandez, Amanda Askell, Robert Krzyzanowski, and Andrei Alexandru for useful conversations, comments, and feedback. Abstract Our intention is to provide a definitive reference on what it would take to safely make use of predictive models in the absence of a solution to the Eliciting Latent Knowledge problem. Furthermore, we believe that large language models can be understood as such predictive models of the world, and that such a conceptualization raises significant opportunities for their safe yet powerful use via carefully conditioning them to predict desirable outputs. Unfortunately, such approaches also raise a variety of potentially fatal safety problems, particularly surrounding situations where predictive models predict the output of other AI systems, potentially unbeknownst to us. There are numerous potential solutions to such problems, however, primarily via carefully conditioning models to predict the things we want—e.g. humans—rather than the things we don't—e.g. malign AIs. Furthermore, due to the simplicity of the prediction objective, we believe that predictive models present the easiest inner alignment problem that we are aware of. As a result, we think that conditioning approaches for predictive models represent the safest known way of eliciting human-level and slightly superhuman capabilities from large language models and other similar future models. 1. Large language models as predictors Suppose you have a very advanced, powerful large language model (LLM) generated via self-supervised pre-training. It's clearly capable of solving complex tasks when prompted or fine-tuned in the right way—it can write code as well as a human, produce human-level summaries, write news articles, etc.—but we don't know what it is actually doing internally that produces those capabilities. It could be that your language model is: a loose collection of heuristics,[1] a generative model of token transitions, a simulator that picks from a repertoire of humans to simulate, a proxy-aligned agent optimizing proxies like sentence grammaticality, an agent minimizing its cross-entropy loss, an agent maximizing long-run predictive accuracy, a deceptive agent trying to gain power in the world, a general inductor, a predictive model of the world, etc. Later, we'll discuss why you might expect to get one of these over the others, but for now, we're going to focus on the possibility that your language model is well-understood as a predictive model of the world. In particular, our aim is to understand what it would look like to safely use predictive models to perform slightly superhuman tasks[2]—e.g. predicting counterfactual worlds to extract the outputs of long serial research processes.[3] We think that this basic approach has hope for two reasons. First, the prediction orthogonality thesis seems basi...

ai strategy language speech large models ea conditioning llm predictive large language models predictors rationalist danny hernandez william saunders sam marks jared kaplan rohin shah

Who Predicted 2022?

Slate Star Codex Podcast

Play Episode Listen Later Jan 26, 2023 22:56

https://astralcodexten.substack.com/p/who-predicted-2022 Winners and takeaways from last year's prediction contest Last year saw surging inflation, a Russian invasion of Ukraine, and a surprise victory for Democrats in the US Senate. Pundits, politicians, and economists were caught flat-footed by these developments. Did anyone get them right? In a very technical sense, the single person who predicted 2022 most accurately was a 20-something data scientist at Amazon's forecasting division. I know this because last January, along with amateur statisticians Sam Marks and Eric Neyman, I solicited predictions from 508 people. This wasn't a very creative or free-form exercise - contest participants assigned percentage chances to 71 yes-or-no questions, like “Will Russia invade Ukraine?” or “Will the Dow end the year above 35000?” The whole thing was a bit hokey and constrained - Nassim Taleb wouldn't be amused - but it had the great advantage of allowing objective scoring.

LW - [Crosspost] ACX 2022 Prediction Contest Results by Scott Alexander

Play Episode Listen Later Jan 24, 2023 18:44

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Crosspost] ACX 2022 Prediction Contest Results, published by Scott Alexander on January 24, 2023 on LessWrong. Original here. Submission statement/relevance to Less Wrong: This forecasting contest confirmed some things we already believed, like that superforecasters can consistently outperform others, or the "wisdom of crowds" effect. It also found a surprising benefit of prediction markets over other aggregation methods, which might or might not be spurious. Several members of the EA and rationalist community scored highly, including one professional AI forecaster. But Less Wrongers didn't consistently outperform members of the general (ACX-reading, forecasting-competition-entering) population. Last year saw surging inflation, a Russian invasion of Ukraine, and a surprise victory for Democrats in the US Senate. Pundits, politicians, and economists were caught flat-footed by these developments. Did anyone get them right? In a very technical sense, the single person who predicted 2022 most accurately was a 20-something data scientist at Amazon's forecasting division. I know this because last January, along with amateur statisticians Sam Marks and Eric Neyman, I solicited predictions from 508 people. This wasn't a very creative or free-form exercise - contest participants assigned percentage chances to 71 yes-or-no questions, like “Will Russia invade Ukraine?” or “Will the Dow end the year above 35000?” The whole thing was a bit hokey and constrained - Nassim Taleb wouldn't be amused - but it had the great advantage of allowing objective scoring. Our goal wasn't just to identify good predictors. It was to replicate previous findings about the nature of prediction. Are some people really “superforecasters” who do better than everyone else? Is there a “wisdom of crowds”? Does the Efficient Markets Hypothesis mean that prediction markets should beat individuals? Armed with 508 people's predictions, can we do math to them until we know more about the future (probabilistically, of course) than any ordinary mortal? After 2022 ended, Sam and Eric used a technique called log-loss scoring to grade everyone's probability estimates. Lower scores are better. The details are hard to explain, but for our contest, guessing 50% for everything would give a score of 40.21, and complete omniscience would give a perfect score of 0. Here's how the contest went: As mentioned above: guessing 50% corresponds to a score of 40.2. This would have put you in the eleventh percentile (yes, 11% of participants did worse than chance). Philip Tetlock and his team have identified “superforecasters” - people who seem to do surprisingly well at prediction tasks, again and again. Some of Tetlock's picks kindly agreed to participate in this contest and let me test them. The median superforecaster outscored 84% of other participants. The “wisdom of crowds” hypothesis says that averaging many ordinary people's predictions produces a “smoothed-out” prediction at least as good as experts. That proved true here. An aggregate created by averaging all 508 participants' guesses scored at the 84th percentile, equaling superforecaster performance. There are fancy ways to adjust people's predictions before aggregating them that outperformed simple averaging in the previous experiments. Eric tried one of these methods, and it scored at the 85th percentile, barely better than the simple average. Crowds can beat smart people, but crowds of smart people do best of all. The aggregate of the 12 participating superforecasters scored at the 97th percentile. Prediction markets did extraordinarily well during this competition, scoring at the 99.5th percentile - ie they beat 506 of the 508 participants, plus all other forms of aggregation. But this is an unfair comparison: our participants were only allowed to spend five minut...

AF - AGISF adaptation for in-person groups by Sam Marks

Play Episode Listen Later Jan 13, 2023 5:13

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AGISF adaptation for in-person groups, published by Sam Marks on January 13, 2023 on The AI Alignment Forum. This past semester, HAIST and MAIA (the Harvard and MIT AI safety student groups) ran an adapted version of Richard Ngo's AGI Safety Fundamentals alignment curriculum. This adaptation – which consists of eight 2-hour long meetings, with all readings done during the meeting – is now available on the AGISF website. In this post, we discuss the adapted curriculum and its intended use, and we recommend that other in-person reading groups following AGISF use this adaptation. The adapted curriculum and its intended use The adapted curriculum was made by refining a slightly rustier first adaptation, with significant help from Richard Ngo and feedback from participants. The key differences between the adapted curriculum and the mainline AGISF alignment curriculum are: Participants do all the core readings during the meeting; no reading is required in between meetings. Participants meet for 2 hours per week instead of 1.5. Readings, including further readings, tend to be more bite-sized (usually not longer than 20 minutes). There are no projects, and certain topics are omitted (e.g. governance and inverse reinforcement learning). The way that HAIST and MAIA used this curriculum, and the way we recommend other groups use it, is: Alternate between silent reading and discussion. So a typical meeting might look like: people arrive, everyone does reading 1, everyone discusses reading 1, everyone does reading 2, everyone discusses reading 2, etc. With certain longer or more difficult readings (e.g. Toy models of superposition), it could be reasonable to occasionally pause for discussion in the middle of the reading. Encourage faster readers to take a look at the further readings while they wait for others to catch up. We found that reading speeds varied significantly, with slower readers taking ~1.5x as long to finish as faster readers. This works especially well if the readings are printed (which we recommend doing). We note that this format introduces some new challenges, especially when there are slower readers. Facilitators need to manage discussion timing since discussions that go too long cut into time for reading and discussing other material. Planning out how long to spend discussing each core reading ahead of time can be very useful. Facilitators should feel comfortable cutting off discussions to make sure there's time to read and discuss all the core readings. (On the other hand, if a discussion is very productive, it may be worth skipping certain readings; this is a judgment call that facilitators will need to make.) Different reading speeds need to be managed. At HAIST, we typically found it feasible to wait for the slowest reader to finish reading. We printed copies of the further readings for faster readers to peruse while they waited for others to finish. On the other hand, this might not work well for groups with especially slow readers. In these cases, you may need to begin discussions before everyone is done reading and, going forward, encourage slower readers to take a look at the core readings ahead of future meetings. To help with some of these challenges, Sam prepared a guide for HAIST and MAIA facilitators that included recommended discussion times, points of discussion, and advice about which readings to cut if necessary. That facilitator guide was for an outdated version of the curriculum, but we hope to have an updated facilitator guide in the next few weeks. We don't want to make these public, but feel free to reach out to smarks@math.harvard.edu if you're running a reading group and are interested in seeing the old or forthcoming facilitator guides. Why we recommend the adapted curriculum Sam and Xander generally felt that the in-sessions reading ...

planning harvard speech groups encourage ea adaptation readings alternate toy facilitators rationalist sam marks richard ngo

259: "The Real Asset Investor" Dave Zook

Play Episode Listen Later Jan 12, 2023 61:09

Johnny and Derek kick off the episode with New Year's updates, including a scary moment for Johnny in Kyiv. Then Johnny interviews "The Real Asset Investor" Dave Zook on his various funds and their structures for self-storage, car washes & ATMs. Dave lays out how to invest, who can invest and how to get your federal tax rate into the single digits! Then Johnny & Derek follow up on their thoughts into Dave's investing philosophy and provide an exciting new feature for ILAB Patreons. About Dave Zook: Dave is a successful Business owner, Syndicator and an Investment and Tax Strategist. Dave and his team at The Real Asset Investor have placed more than $800M across various asset classes which offer Cash Flow, Tax Impact and Equity Growth for investors. These asset classes include ATMs, Car Washes, Energy, Self-Storage and more. He and his team are one of the Top 5 ATM Fund Operators in the country. Dave was an early investor in Bitcoin and Digital Assets, and he holds an advisory role at Off the Chain Capital, one of the top performing funds in the world for the last 5 years. Dave is a sought-after speaker and has shared his knowledge across various media platforms such as The International Business Conference, The Real Estate Guys Radio Show, Cashflow Ninja and others. Dave and his wife Susan along with their 4 children live in Lancaster, PA. Discussed: TheRealAssetInvestor.com Dave Zook Facebook Dave Zook Instagram Email Dave Zook's team directly at info@therealassetinvestor.com Where we are: Johnny FD – Ukraine / IG @johnnyfdj Sam Marks – Barcelona / IG @imsammarks Derek Spartz – Los Angeles / IG @DerekRadio Sponsor: Nom Nom At Nom Nom, we make food that looks like food. Tailored to your pet's exact needs and backed by years of scientific research. Get 50% off your first 2 week trial when you visit trynom.com/ilab Like these investments? Try them with these special ILAB links: ArtofFX – Start with just a $10,000 account (reduced from $25,000) Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK)*Johnny and Sam use all of the above services personally. Time Stamp: 02:00 – Rockets over Kyiv 07:55 - About Dave Zook 11:10 - Dave Zook Interview Starts 22:35 - How Dave Pays Nearly Zero Tax 40:15 - Johnny & Derek's Thoughts on Investing with Dave If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify.

258: Q4 Quarterly Update with Sam & Johnny

spotify japan travel ukraine invest investment greece kyiv shopify 2022 reit new freedom barcelon south carolina house johnny fd ilab sam marks

Play Episode Listen Later Jan 5, 2023 46:18

Johnny and Sam go over the end of year 2022/Quarterly Updates. They talk about settling into their respective homes in Barcelon and Kyiv. Johnny traveled to Greece to await his permanent Ukraine residency and Sam went on a month long trip through Japan. The guys talk what daily life looks like in Ukraine during the war, Sam's upcoming job freedom and more! This is only the public portion of the podcast, if you'd like the entire show in audio and video formats including our actual portfolios, you can instantly access when you join the ILAB Patreon. Where we are: Johnny FD – Ukraine / IG @johnnyfdj Sam Marks – Barcelona / IG @imsammarks Derek Spartz – Los Angeles / IG @DerekRadio Sponsor: Shopify Get serious about selling and try Shopify today. Sign up for a $1 per month trial period at Shopify.com/ilab Like these investments? Try them with these special ILAB links: ArtofFX – Start with just a $10,000 account (reduced from $25,000) Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK)*Johnny and Sam use all of the above services personally. Time Stamp: 05:40 – Sam's South Carolina House 12:12 - Johnny on Kyiv 18:16 - Ukraine Blackout Schedule 24:40 - Johnny in Greece. 29:45 - Sam's New Freedom 38:00 - Sam's 2023 Travel Plans If you enjoyed this episode, do us a favor and share it! If you haven't already, please take a minute to leave us a 5-star review on Apple Podcasts and Spotify.

AF - Take 13: RLHF bad, conditioning good. by Charlie Steiner

ai speech ea conditioning bayesian hyperbolic rationalist rlhf sam marks charlie steiner

Play Episode Listen Later Dec 22, 2022 4:13

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 13: RLHF bad, conditioning good., published by Charlie Steiner on December 22, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day some days for 25 days. I have now procrastinated enough that I probably have enough hot takes. Hyperbolic title, sorry. But seriously, conditioning is better than RLHF for current language models. For agents navigating the real world, both have issues and it's not clear-cut where progress will come from. By "conditioning", I mean the decision transformer trick to do conditional inference: get human ratings of sequences of tokens, and then make a dataset where you append the ratings to the front of the sequences. A model trained on this dataset for next-token prediction will have to learn the distribution of text conditional on the rating - so if you prompt it with a high rating and then the start of an answer, it will try to continue the answer in a way humans would rate highly. This can be very similar to RLHF - especially if you augment the training data by building a model of human ratings, and train a model to do conditional inference by finetuning a model trained normally. But in the right perspective, the resulting AIs are trying to do quite different things. RLHF is sorta training the AI to be an agent. Not an agent that navigates the real world, but an agent that navigates the state-space of text. It learns to prefer certain trajectories of the text, and takes actions (outputs words) to steer the text onto favored trajectories. Conditioning, on the other hand, is trying to faithfully learn the distribution of possible human responses - it's getting trained to be a simulator that can predict many different sorts of agents. The difference is stark in their reactions to variance. RLHF wants to eliminate variance that might make a material difference in the trajectory (when the KL penalty is small relative to the Bayesian-updating KL penalty), while conditioning on rating still tries to produce something that looks like the training distribution. This makes conditioning way better whenever you care about the diversity of options produced by a language model - e.g. if you're trying to get the AI to generate something specific yet hard to specify, and you want to be able to sift through several continuations. Or if you're building a product that works like souped-up autocorrect, and want to automatically get a diversity of good suggestions. Another benefit is quantilization. RLHF is trying to get the highest score available, even if it means exploiting human biases. If instead you condition on a score that's high but still regularly gotten by humans, it's like you're sampling policies that get this high-but-not-too-high score, which are less exploitative of human raters than the absolute maximum-score policy. This isn't a free lunch. Fine-tuning for conditional inference has less of an impact on what sort of problem the AI is solving than RLHF does, but it makes that problem way harder. Unsurprisingly, performance tends to be worse on harder problems. Still, research on decision transformers is full of results that are somewhat competitive with other methods. It also still exploits the human raters some amount, increasing with the extremity of the score. Sam Marks has talked about a scheme using online decision transformers to improve performance without needing to make the score extreme relative to the distribution seen so far, which is definitely worth a read, but this seems like a case of optimality is the tiger. Whether found by RLHF or conditioning, the problem is with the policies that get the highest scores. Looking out to the future, I'm uncertain about how useful conditioning will really be. For an AI that chooses policies to affe...

290: Henry Yeh On Breaking In As A Basketball Analyst And Paying It Forward

Play Episode Listen Later Oct 20, 2022 33:00

The CUSP Show welcomes Henry Yeh, Graduate Assistant for the women's basketball team at Duke University, to share his story of moving to the United States and working to get discovered as a basketball film analyst. Henry describes what it was like to launch his own website breaking down NBA film, building an audience through the quality of his work, and how he parlayed his individual work into a role in the Miami Heat's scouting department. From there, he shares how one phone call led him to his current position at Duke, despite his original desire to join a program with far less athletic prestige. A native of Taiwan, Henry's accomplishments have earned him a mild amount of celebrity in his home country and throughout Asia. He explains discovering that he opened the door for people in Asia to do the work he does, what that means to him, and how he hopes to pay it forward as his career progresses. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, Matt Hornick '23 (@MNHornick), and Cindy Li '23, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

united states nba basketball taiwan columbia university analysts duke university miami heat college sports paying it forward cusp sports management graduate assistant sportsbiz sam marks dominique smith cindy li

289: Jim Cavale On Athlete Brand Building & Managing NIL Business Opportunities

Play Episode Listen Later Oct 13, 2022 54:00

Joe and Tom are joined by Jim Cavale, CEO and Founder of INFLCR and Chief Innovation Officer at Teamworks Inc. INFLCR is an athlete brand-building and NIL Business Management app used by over 250 institutions. A student-athlete and now a serial entrepreneur, Jim talks about how at INFLCR, they help student-athletes build their brands on social media by delivering content and personalized engagement metrics. He discusses INFLCR's ability to allow institutions to customize and manage their NIL reporting while providing approved businesses, collectives, and individuals a customized portal to communicate with student-athletes and fulfill transactions. Additionally, Jim describes how they offer tools to student-athletes to access third-party marketplaces through their verified exchange platform. Jim discusses the importance of three critical pillars, namely performance, influence, and exposure, in helping student-athletes navigate and maximize their NIL opportunities. He also shares his interesting insight into the potential risks of the NIL, pointing out the downsides of treating athletes as employees and its negative impact on their mental health. Finally, Jim shares some invaluable wisdom on balancing life as an entrepreneur and how he continues to stay informed to generate innovative ideas. You do not want to miss out on this engaging and insightful episode of The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655) and Sam Marks '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

ceo founders managing influencers columbia university nil chief innovation officer brand building business opportunities cusp sports management sportsbiz jim cavale inflcr sam marks dominique smith

288: Patricia Deldin On College Athlete Mental Health & Mood Lifters

Play Episode Listen Later Oct 6, 2022 58:00

With Patricia's expertise in the field of mental health and her personal background as an ex-college athlete, Patricia started Mood Lifters. Mood Lifters is a mental health program that organizes member groups that are led by peer leaders. Specializing in the mental health of college athletes, Mood Lifters has support groups consisting of college athletes, led by (ex-)college athletes who endured all the trials and tribulations themselves. Camille Davre is such a peer leader at Mood Lifters. She joins this podcast to talk about her role as a peer leader and college athlete, and how college athletes' mental health issues are best understood by other college athletes. In this episode, we take a deep dive into all aspects of the mental health of (college) athletes. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655) and Sam Marks '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

mental health columbia university mood specializing college athletes cusp sports management lifters sportsbiz sam marks dominique smith

287: Dylan Sadiq On Making Professional Athletes' Portraits Using Rubik's Cubes

Play Episode Listen Later Sep 29, 2022 45:00

Joe and Tom are joined by Dylan Sadiq, also known as The College Cuber. Dylan is an incredibly talented artist who creates mosaics of professional athletes using Rubik's Cubes. Through a combination of his exceptional creative talent and entrepreneurial ability, Dylan has collaborated with professional sports leagues and teams such as the NBA, MLB, Premier League, NHL, USTA, FC Barcelona, Tennessee Titans, New York Red Bulls, and many more. An engineering student at Rutgers University, Dylan shares his fascinating story about how his desire to take on a more hands-on project during Covid led him to discover his craft and launch The College Cuber, creating his first mosaic of his favorite player Luka Dončić. He explains the process of putting together his artwork using more than 500 Rubik's Cubes in under three hours. Dylan talks about how he has been able to close deals with some of the biggest sports brands and properties and continues to innovate and grow his business. Dylan also describes how he is engaging fans live at sporting events through his incredibly unique artwork by sharing his experience working with the US Open. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655) and Sam Marks '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts. Columbia University Sports Management Conference Registration Link: https://bit.ly/3ChRPZs

286: Ben Mathis-Lilley On The State Of College Football & The Nature Of Fandom

Play Episode Listen Later Sep 22, 2022 51:00

Joe and Tom are joined by Ben Mathis-Lilley, Senior Writer at Slate Magazine and author of his new book, The Hot Seat. In this episode, Ben talks about his motivations for writing the book and how he was able to embed himself in the Michigan football community to truly understand the perspectives of different stakeholders, especially the fans. He discusses the state of college football and explains how it differs from the general sports landscape. Ben shares his insights on the nature of fandom and whether the new NCAA's transfer portal policy will impact it in the future. He also offers his opinion on the positives and potential risks of the college conference realignment. Finally, Ben talks about what it is like to be a journalist in 2022 and gives essential advice on improving writing skills for students and professionals. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655) and Sam Marks '22 with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

nature michigan ncaa columbia university college football nil fandom hot seat senior writer cusp sports management slate magazine sportsbiz sam marks dominique smith ben mathis lilley

246: Combining Machine & Digital Gaming with Winner Winner

las vegas digital gaming coo copyright shopify customer experience reit ithaca college winner winner ilab sam marks

Play Episode Listen Later Sep 15, 2022 39:02

Derek travels to the Las Vegas headquarters of gaming company Winner Winner to speak with co-founder Cody Flaherty. They discuss how Winner Winner is allowing users to play real machine gaming from anywhere in the world via their mobile phones. Plus, Winner Winner has since expanded to online gaming, live trivia events and more. Sam Marks is an active early investor in Winner Winner and has worked extensively with one of their founders. Sam and Derek wrap up the episode discussing Sam's investment, why he believes in it and what he expects from the company going forward. Cody Flaherty is the Co-founder and COO of Playtertainment, the development studio behind the live gaming app Winner Winner. Prior to launching Winner Winner with his partner Jon in 2019, Cody was an early member of the PetFlow.com team, an online retailer of pet food and supplies. Cody was a key member of the senior executive team, helping the business realize an exit to the largest distributor of pet products in the US. Cody sits on the advisory board for Ithaca College's Customer Experience program and also consults for businesses in the ecommerce space. Winner Winner is a robust gaming platform that allows you to earn real rewards for playing games you know and love! Whether its playing physical games or playing digital games you already love to play, you can earn tickets and redeem those tickets for real rewards that are shipped directly to you! Listen to ILAB 246 on iTunes here or subscribe on your favorite podcast app. Where we are: Johnny FD – Ukraine / IG @johnnyfdj Sam Marks – Barcelona / IG @imsammarks Derek Spartz – Los Angeles / IG @DerekRadio Sponsor: ShopifyGet 14 days free and access Shopify's full suite of features to get selling online today! Just go to Shopify.com/ilab to get started. Discussed Winner Winner Like these investments? Try them with these special ILAB links: ArtofFX – Start with just a $10,000 account (reduced from $25,000) Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK) *Johnny and Sam use all of the above services personally. Time Stamp: 08:25 – What is your “elevator pitch” for Winner Winner? 09:35 – How did you get into the gaming industry? 11:27 – What kind of investors do you have currently? 12:10 – What is your goal for Winner Winner as a business? 14:00 – How has Winner Winner evolved over the past 3 years? 16:28 – How does technology work? 18:10 – Has anyone figured out how to hack the games? 19:20 – How many users do you have and how often do they use the app? 22:16 – At what level are you at raising capital? 22:52 – Who can invest and what is the minimum request? 23:13 – What is the next plan for the business? 24:38 – Why did you start your business in Vegas? 25:49 – What is your demographic like? If you enjoyed this episode, do us a favor and share it! Also if you haven't already, please take a minute to leave us a 5-star review on iTunes and claim your bonus here! Copyright 2022. All rights reserved. Read our disclaimer here.

285: Ian Tupper On The Current State of Motorsports And The Automotive Industry

Play Episode Listen Later Sep 15, 2022 52:00

Tom is joined by Ian Tupper, who is the Senior Group Manager of Strategic Environmental Partnerships and New Climate Tech Business Initiatives for Hyundai and Genesis in North America. A Columbia University Sports Management graduate with a track record of cultivating strategic partnerships with marquee organizations such as Ferrari, Formula 1, and Formula E, Ian shares his incredible career trajectory and how he combined his passion for automobiles and his interest in sustainability to advance his career. Ian shares his thoughtful insights on the current state of the motorsports industry and how international racing organizations are already transforming to cater to the needs of future generations, viz-a-viz sustainability, contextualization, and event production and broadcast. He also talks about his role at Hyundai and Genesis, where in addition to manufacturing industry-leading electric vehicles (EVs), the company is trying to build an all-encompassing, environmentally friendly ecosystem for people's homes. Ian also weighs the opportunities and challenges posed by the current shift towards electrification in the automotive industry. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Danny Hagenlocher, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

north america sustainability formula columbia university ferrari current state automotive evs motorsports hyundai formula1 cusp formula e sports management automotive industry tupper sam marks dominique smith

284: Jennifer Rottenberg on raising funding, partnerships, and women sports

Play Episode Listen Later Sep 1, 2022 46:00

Having seen many sides of the sports industry, she shares her experiences working with IMG, USA Water Polo, Fan Controlled Football, her own company, and Athleta. Being a long-time advocate for women's sports and the founding president of WISE's Los Angeles chapter, Jennifer Rottenberg is also well-versed on anything concerning women in sports. She elaborates on the history of women's sports, which factors can bring women's professional sports to an even higher level, and how many brands should have a better notion of the audiences that watch women's sports. Now working at Athleta, she explains how she shaped the newly created role of Head of Partnerships and built a team that maximizes the relationship Athleta has with its notable partners, such as Simone Biles and Allyson Felix. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

head los angeles partnership raising wise funding columbia university simone biles women in sports cusp sports management allyson felix athleta women sports fan controlled football rottenberg sportsbiz usa water polo sam marks dominique smith

283: Bennett Collen & Ben Jones On Combining NFTs With Sneakers And Experiences

Play Episode Listen Later Aug 18, 2022 50:00

Joe and Tom are joined by Bennett Collen, Cofounder and CEO of Endstate, and Benjamin Jones, Director of Growth at Endstate. An active principal in the Blockchain space for several years, Bennett discusses how he co-founded the sneaker and apparel brand in Endstate, where they incorporate NFTs into the product ownership experience. He elaborates on the value of integrating the physical, digital, and experiential to usher the consumers into the future of product ownership. They also explain how NFTs authenticate sneakers and can eventually foster incredible customer experiences with athletes, artists, and entrepreneurs. Bennett also talks about their first-ever athlete collaboration with the NFL star DeVonta Smith and his signature sneaker line. He also shares his insights on the consumer's mindset in the collectibles market while pointing out how they envision Endstate's user experience leveraging AR and VR technologies. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

ceo director nfl growth experiences nfts vr crypto blockchain columbia university sneakers cusp sports management ben jones benjamin jones sportsbiz sam marks dominique smith

282: Colin Cosell On Sports Broadcasting, Crowd Engagement, And More

Play Episode Listen Later Aug 11, 2022 39:00

Joe is joined by Colin Cosell, who is the PA Announcer for the New York Mets, New York Riptide, and Brooklyn Cyclones. Colin shares how his interest in broadcasting was born when he was just five. Colin then talks about his exciting career path, working in a wide variety of fields: TV, radio, stand-up comedy, theatre, and online media before becoming a PA announcer. Colin gives us a sneak peek into his game-day routine before a Mets game at the Citi Field stadium and shares the different approaches he uses to amp up the fans. He also discusses intriguing insights into the art of addressing authentically inside a stadium and offers advice for people trying to break into the sports broadcasting industry. You do not want to miss out on this engaging episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

tv engagement columbia university crowd new york mets broadcasting cusp sports management citi field sports broadcasting pa announcer cosell brooklyn cyclones sportsbiz sam marks dominique smith

281: Phil Green On Streaming, Building An Audience, and Future Of Sports Media

Play Episode Listen Later Aug 4, 2022 60:00

Joe and Tom are joined by Phil Green, Strategic Accounts - Media at Brightcove. Phil is a sports industry veteran who has spent his career working with the biggest brands in sports and entertainment, including Endeavor, ESPN, MLB Advanced Media, USTA, ABC Sports, and others. Phil talks about the evolution of sports streaming from traditional tv and how it has become more than a viable business. He highlights the potential opportunities in the sports streaming landscape while discussing its current state. Phil shares his insights on consumer engagement and building fan relationships for major and minor sports leagues. He also talks about the emerging trends in the sports media world and how leagues adapt to changing times to stand out in the attention economy. Phil also shares his incredible wisdom on Business Development and offers essential tips for navigating careers. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

espn streaming columbia university business development endeavor sports media cusp sports management usta building an audience abc sports brightcove phil green sportsbiz mlb advanced media sam marks dominique smith

238: Ask Sam Marks Anything

Play Episode Listen Later Jul 21, 2022 74:17

We polled ILAB listeners for your questions and said Ask Us Anything! We received so many questions that we decided to split it into two episodes, one for Johnny and one for Sam. Last week (ILAB 237) was Johnny and now it's Sam's turn. Sam will answer questions on investing as well as lifestyle in this open forum style episode. Listen to ILAB 238 on iTunes here or subscribe on your favorite podcast app. Where we are: Johnny FD – Montenegro / IG @johnnyfdj Sam Marks – Barcelona / IG @imsammarks Derek Spartz – Los Angeles / IG @DerekRadio Sponsor: Shopify Get 14 days free and access Shopify's full suite of features to get selling online today! Just go to Shopify.com/ilab to get started. Discussed Life advice from the generation before you (my bucket list poem) Like these investments? Try them with these special ILAB links: ArtofFX – Start with just a $10,000 account (reduced from $25,000) Fundrise – Start with only $1,000 into their REIT funds (non-accredited investors OK) *Johnny and Sam use all of the above services personally. Time Stamp: 02:01 – Sam talks about his recent coolest experience of his life 10:21 – What did Derek do this week? 15:02 – Do you have any updates on Mt. Gox BItcoin? 24:03 – What are your thoughts on the downturn, and have your strategies changed? 28:08 – How confident are you in the accuracy of the projected value of the wine? 32:02 – What are your thoughts on buying land in Thailand? 37:15 – What was your day to day like when you were living in the UK and then in China? 46:42 – What is the most valuable/interesting thing you learnt while doing the podcast? 53:43 – Can you elaborate on your bucket list video? 58:53 – If you could share a meal with any 4 individuals living or dead, who would they be? 59:48 – Where do you see yourself in the next 10 years? 62:02 – Imagine you were married with kids today, how would your life be different? 67:29 – What is one of the biggest challenges of traveling all over the world? If you enjoyed this episode, do us a favor and share it! Also if you haven't already, please take a minute to leave us a 5-star review on iTunes and claim your bonus here! Copyright 2022. All rights reserved. Read our disclaimer here.

uk china thailand mt copyright shopify ask us anything reit ilab sam marks

280: Building the Collegiate Ecosystem via Commercialization w/ Michael Schreck

Play Episode Listen Later Jul 14, 2022 52:00

Joe and Tom are joined by Michael Schreck, CEO and Co-Founder of Collegiate Sports Management Group (CSMG). Driving business growth for several athletic conferences and schools, Michael provides an assessment of the NIL era on its one year anniversary. He discusses the ROI of NIL deals, deliverables for student-athletes, NIL-related education programs, and much more. Working with hundreds of colleges, Michael describes the incredible growth of the Esports market at the collegiate level. He explains CSMG's role in the commercial growth related to Esports, highlighting the biggest esports tournament CSMG organizes through their EsportsU vertical. Michael also talks about the recent college football conference realignment with USC and UCLA leaving the Pac-12 to join the Big Ten, discussing the underlying business reasons that may have led to the realignment. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

279: Making Baseball Fun and Entertaining w/ Jared Orton

Play Episode Listen Later Jul 7, 2022 56:00

Joe and Tom are joined by Jared Orton, President of The Savannah Bananas. Leading with the “fans first, entertainment always” philosophy, Jared passionately describes the story of The Savannah Bananas. He discusses how they pioneered the “Banana Ball,” an alternate form of baseball that combines traditional baseball with entertainment, making the sport a spectacle for the fans. Jared further elaborates on how the showmanship displayed while playing the sport of baseball has attracted incredible talent and huge crowds from across the country. He also shares insights into their robust digital marketing and media plan, which has allowed them to grow to more than 600K followers on social media and led to deals with media giants like ESPN. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

president espn baseball columbia university entertaining savannah bananas cusp orton 600k sports management banana ball sportsbiz sam marks dominique smith

278: Business Insights from Professional Sports Teams w/ Amy Scheer

Play Episode Listen Later Jun 30, 2022 28:00

Joe is joined by Amy Scheer, Vice President of Business Operations at the Connecticut Sun. A veteran in the sports industry, Amy vividly shares experiences and learnings from her illustrious career with esteemed sports properties, including the Brooklyn Nets, Madison Square Garden, New York City Football Club, New York Red Bulls, and now the Connecticut Sun. She discusses the operational differences in the approaches of these organizations while highlighting the common denominator for creating fandom. Amy elaborates on how athletes have evolved over the years, especially pertaining to social justice issues. Now running a WNBA team, she describes how the incredible awareness amongst people has made it easier to sell tickets and foster business partnerships. Working with multiple Hall of Famers, Amy also points out the important leadership lessons she has learned along the way. You do not want to miss out on this engaging and insightful episode on The CUSP show. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

277: Developing the Next Generation of Olympians w/ Neha Aggarwal

Play Episode Listen Later Jun 16, 2022 55:00

Joe and Tom are joined by Neha Aggarwal, 2008 Table Tennis Olympian for India and current Head of Partnerships & Communication at Olympic Gold Quest. Neha is the first three-time guest on the CUSP Show, following her graduation from Columbia Sports Management in 2016. In this episode, Neha speaks about her journey from competing at the Olympic Games to developing future Olympic athletes for her home country. She also addresses the importance of mental health in athletics, the increasing value of women's sports in India, and the future of Indian athletes on the global stage. The CUSP Show is a production by the faculty of Sports Management at Columbia University. You can get in touch with the program on Twitter @CU_SPS_Sports. The CUSP Show is hosted by Joe Favorito (@Joefav) and Tom Richardson (@ConvergenceTR). The show is produced by Yash Agarwal '22 (@yashagarwal655), Sam Marks '22, and Connor O'Neill '22, with Jillian Quinn '22 (@JillianMQuinn) and Dominique Smith '22 managing social media efforts.

head mental health olympic games developing indian next generation columbia university olympians neha sports management aggarwal sam marks dominique smith

232: Sam Marks on Building & Exiting His $100M Business with Justin Donald