This is a link post. Gwern recently wrote a very interesting thread about Chinese AI strategy and the downsides of US AI racing. It's both quite short and hard to excerpt, so here is almost the entire thing:

Hsu is a long-time China hawk and has been talking up the scientific & technological capabilities of the CCP for a long time, saying they were going to surpass the West any moment now, so I found it interesting when Hsu explains that:
— the scientific culture of China is 'mafia'-like (Hsu's term, not mine) and focused on legible, easily-cited incremental research, and is against making any daring research leaps or controversial breakthroughs... but is capable of extremely high-quality, world-class follow-up and large scientific investments given a clear objective target and government marching orders
— there is no interest or investment in an AI arms race, in part [...]

---

First published: November 25th, 2024

Source: https://forum.effectivealtruism.org/posts/Kz8WpQkCckN9JNHCN/gwern-on-creating-your-own-ai-race-and-china-s-fast-follower

---

Narrated by TYPE III AUDIO.
A free Podborbot from «Рейтинг Рунета» for choosing digital contractors: https://clck.ru/3EumK7?erid=LjN8JyMAq Advertisement. ООО «ПРОАКТИВИТИ».

Bonus posts from RationalAnswer:
— Will bank deposits in Russia be frozen? — https://t.me/RationalAnswer/1144
— What I think about the dollar exchange rate — https://t.me/RationalAnswer/732
— A new big series of posts on the equity risk premium — https://t.me/RationalAnswer/1141

Additional materials for this episode:
— Alexander Eliseev on the mystery of the Moscow Exchange free float — https://t.me/Finindie/2042
— Initiative to unblock Russian investors' securities — https://www.change.org/p/protect-millions-from-overreaching-sanctions-revise-eu-regulation-no-269-2014
— The Deloitte folks explain the new crypto taxation in Russia — https://t.me/arturdulkarnaev/1137
— Longread of the week: Astral Codex Ten – Prison And Crime: Much More Than You Wanted To Know — https://www.astralcodexten.com/p/prison-and-crime-much-more-than-you
— Interview of the week: Gwern Branwen as a guest of Dwarkesh Patel – https://www.dwarkeshpatel.com/p/gwern-branwen

Text version of the episode with links: https://vc.ru/money/1685017
Watch the episode on YouTube: https://www.youtube.com/watch?v=sm9uY9ZJY_s

Support the RationalAnswer project and get your name in the credits:
— Patreon (in foreign currency) – https://www.patreon.com/RationalAnswer
— Boosty (in rubles) – https://boosty.to/RationalAnswer

CONTENTS:
00:21 – The dollar at 110
01:46 – Taming the key rate with the power of thought
02:59 – The mystery of the Moscow Exchange free float
05:14 – Signatures for unblocking Russian investors' assets
06:05 – Tax news
07:48 – Bitcoin in Russia's reserves
08:46 – Telegram turned a profit
09:42 – A first job
11:30 – Trump's shitposting about BRICS
12:19 – Synapse doesn't know where the fintechs' money is
14:50 – Ken Leech committed fraud with hedge fund trades
15:58 – An AQR hedge fund for tax losses
17:02 – Elon Musk gave away 25% of xAI
18:03 – Wolfish North Koreans are getting hired at IT companies
18:57 – A Japanese bank with hara-kiri practices
19:23 – Crypto news: the banana has been eaten
22:02 – Statistic of the week: gold over 25 years
24:25 – Longread of the week: do prisons reduce crime?
27:07 – Interview of the week: Gwern on Dwarkesh
30:21 – Gossip of the week
Gwern is a pseudonymous researcher and writer. He was one of the first people to see LLM scaling coming. If you've read his blog, you know he's one of the most interesting polymathic thinkers alive.

In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed.

After the episode, I convinced Gwern to create a donation page where people can help sustain what he's up to. Please go here to contribute.

Read the full transcript here.

Sponsors:
* Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open - if you want to stand out, take a crack at their new Kaggle competition. To learn more, go here: https://jane-st.co/dwarkesh
* Turing provides complete post-training services for leading AI labs like OpenAI, Anthropic, Meta, and Gemini. They specialize in model evaluation, SFT, RLHF, and DPO to enhance models' reasoning, coding, and multimodal capabilities. Learn more at turing.com/dwarkesh.
* This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

If you're interested in advertising on the podcast, check out this page.

Timestamps
00:00:00 - Anonymity
00:01:09 - Automating Steve Jobs
00:04:38 - Isaac Newton's theory of progress
00:06:36 - Grand theory of intelligence
00:10:39 - Seeing scaling early
00:21:04 - AGI Timelines
00:22:54 - What to do in remaining 3 years until AGI
00:26:29 - Influencing the shoggoth with writing
00:30:50 - Human vs artificial intelligence
00:33:52 - Rabbit holes
00:38:48 - Hearing impairment
00:43:00 - Wikipedia editing
00:47:43 - Gwern.net
00:50:20 - Counterfactual careers
00:54:30 - Borges & literature
01:01:32 - Gwern's intelligence and process
01:11:03 - A day in the life of Gwern
01:19:16 - Gwern's finances
01:25:05 - The diversity of AI minds
01:27:24 - GLP drugs and obesity
01:31:08 - Drug experimentation
01:33:40 - Parasocial relationships
01:35:23 - Open rabbit holes

Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Can AI Scaling Continue Through 2030?", Epoch AI (yes), published by gwern on August 24, 2024 on LessWrong.

We investigate the scalability of AI training runs. We identify electric power, chip manufacturing, data and latency as constraints. We conclude that 2e29 FLOP training runs will likely be feasible by 2030.

Introduction

In recent years, the capabilities of AI models have significantly improved. Our research suggests that this growth in computational resources accounts for a significant portion of AI performance improvements.[1] The consistent and predictable improvements from scaling have led AI labs to aggressively expand the scale of training, with training compute expanding at a rate of approximately 4x per year.

To put this 4x annual growth in AI training compute into perspective, it outpaces even some of the fastest technological expansions in recent history. It surpasses the peak growth rates of mobile phone adoption (2x/year, 1980-1987), solar energy capacity installation (1.5x/year, 2001-2010), and human genome sequencing (3.3x/year, 2008-2015).

Here, we examine whether it is technically feasible for the current rapid pace of AI training scaling - approximately 4x per year - to continue through 2030. We investigate four key factors that might constrain scaling: power availability, chip manufacturing capacity, data scarcity, and the "latency wall", a fundamental speed limit imposed by unavoidable delays in AI training computations.

Our analysis incorporates the expansion of production capabilities, investment, and technological advancements. This includes, among other factors, examining planned growth in advanced chip packaging facilities, construction of additional power plants, and the geographic spread of data centers to leverage multiple power networks. To account for these changes, we incorporate projections from various public sources: semiconductor foundries' planned expansions, electricity providers' capacity growth forecasts, other relevant industry data, and our own research.

We find that training runs of 2e29 FLOP will likely be feasible by the end of this decade. In other words, by 2030 it will be very likely possible to train models that exceed GPT-4 in scale to the same degree that GPT-4 exceeds GPT-2 in scale.[2] If pursued, we might see by the end of the decade advances in AI as drastic as the difference between the rudimentary text generation of GPT-2 in 2019 and the sophisticated problem-solving abilities of GPT-4 in 2023.

Whether AI developers will actually pursue this level of scaling depends on their willingness to invest hundreds of billions of dollars in AI expansion over the coming years. While we briefly discuss the economics of AI investment later, a thorough analysis of investment decisions is beyond the scope of this report.

For each bottleneck we offer a conservative estimate of the relevant supply and the largest training run it would allow.[3] Throughout our analysis, we assume that training runs could last between two and nine months, reflecting the trend towards longer durations. We also assume that, when distributing data center power and chips for distributed training, companies will only be able to muster about 10% to 40% of the existing supply.[4]

Power constraints.
Plans for data center campuses of 1 to 5 GW by 2030 have already been discussed, which would support training runs ranging from 1e28 to 3e29 FLOP (for reference, GPT-4 was likely around 2e25 FLOP). Geographically distributed training could tap into multiple regions' energy infrastructure to scale further. Given current projections of US data center expansion, a US distributed network could likely accommodate 2 to 45 GW, which, assuming sufficient inter-data center bandwidth, would support training runs from 2e28 to 2e30 FLOP. Beyond this, an actor willing to...
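As a back-of-the-envelope illustration of the extrapolation summarized above (not part of Epoch's actual model), here is a minimal Python sketch that assumes a GPT-4-scale run of roughly 2e25 FLOP in 2023 and simply compounds the 4x/year growth rate cited in the post; both the baseline and the growth rate are assumptions taken from the summary, and the output is illustrative only.

```python
# Rough extrapolation of frontier training compute, assuming (per the summary above)
# a ~2e25 FLOP GPT-4-scale run in 2023 and ~4x/year growth. Illustrative only.

BASE_YEAR = 2023
BASE_FLOP = 2e25        # assumed GPT-4-scale training run
GROWTH_PER_YEAR = 4.0   # cited growth rate of training compute

def projected_flop(year: int) -> float:
    """Largest training run if the 4x/year trend simply continues."""
    return BASE_FLOP * GROWTH_PER_YEAR ** (year - BASE_YEAR)

for year in range(2024, 2031):
    print(f"{year}: ~{projected_flop(year):.1e} FLOP")

# 2030 comes out around 3e29 FLOP -- the same order of magnitude as the
# 2e29 FLOP the report concludes will likely be feasible.
```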
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Language Models Model Us, published by eggsyntax on May 18, 2024 on LessWrong.

Produced as part of the MATS Winter 2023-4 program, under the mentorship of @Jessica Rumbelow.

One-sentence summary: On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and suspect it's inferring much more.

Introduction

Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing[1] about who we are, other than what it knows about users in general. But with every word we type, we reveal more about ourselves -- our beliefs, our personality, our education level, even our gender. Just how clearly does the model see us by the end of the conversation, and why should that worry us?

Like many, we were rather startled when @janus showed that gpt-4-base could identify @gwern by name, with 92% confidence, from a 300-word comment. If current models can infer information about text authors that quickly, this capability poses risks to privacy, and also means that any future misaligned models are in a much better position to deceive or manipulate their users.

The privacy concerns are straightforward: regardless of whether the model itself is acting to violate users' privacy or someone else is using the model to violate users' privacy, users might prefer that the models they interact with not routinely infer their gender, their ethnicity, or their personal beliefs.

Why does this imply concerns about deception and manipulation? One important and understudied aspect of maintaining a sophisticated deception is having a strong model of the listener and their beliefs. If an advanced AI system says something the user finds unbelievable, it loses their trust. Strategically deceptive or manipulative AI systems need to maintain that fragile trust over an extended time, and this is very difficult to do without knowing what the listener is like and what they believe.

Of course, most of us aren't prolific writers like Gwern, with several billion words of text in the LLM training data[2]. What can LLMs figure out about the rest of us? As recent work from @Adam Shai and collaborators shows, transformers learn to model and synchronize with the causal processes generating the input they see. For some input sources, like the small finite state machines they evaluate, that's relatively simple and can be comprehensively analyzed. But other input sources like humans are very complex processes, and the text they generate is quite difficult to predict (although LLMs are probably superhuman at doing so[3]), so we need to find ways to empirically measure what LLMs are able to infer.

What we did

To begin to answer these questions, we gave GPT-3.5-turbo some essay text[4], written by OKCupid users in 2012 (further details in appendix B). We gave the model 300 words on average, and asked it to say whether the author was (for example) male or female[5]. We treated its probability distribution over labels[6] as a prediction (rather than just looking at the highest-scoring label), and calculated Brier scores[7] for how good the model's predictions were. We tested the model's ability to infer gender, sexual orientation, college-education status, ethnicity, and age (with age bucketed into 0-30 vs 31-).
Note that these demographic categories were not chosen for their particular importance, although they include categories that some people might prefer to keep private. The only reason we chose to work with these categories is that there are existing datasets which pair ground-truth information about them with free-written text by the same person. What actually matters much more, in our view, is the model's ability to infer more nuanced information about authors, about their personality, their cre...
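For readers unfamiliar with the scoring rule mentioned above, here is a minimal Python sketch of a multi-class Brier score computed from a model's probability distribution over labels; the label names and probabilities are made up for illustration, and the post's exact scoring variant may differ.

```python
# Minimal sketch of scoring a model's demographic guess with a Brier score.
# Labels and probabilities below are hypothetical examples.

def brier_score(probs: dict[str, float], true_label: str) -> float:
    """Multi-class Brier score: sum of squared errors against a one-hot target.
    0.0 is a perfect prediction; 2.0 is the worst possible."""
    return sum((p - (1.0 if label == true_label else 0.0)) ** 2
               for label, p in probs.items())

# Hypothetical example: the model assigns 80% to "female", 20% to "male",
# and the ground-truth author is female.
prediction = {"female": 0.8, "male": 0.2}
print(brier_score(prediction, "female"))  # (0.8-1)^2 + (0.2-0)^2 = 0.08
```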
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: FHI (Future of Humanity Institute) has shut down (2005-2024), published by gwern on April 17, 2024 on LessWrong. Over time FHI faced increasing administrative headwinds within the Faculty of Philosophy (the Institute's organizational home). Starting in 2020, the Faculty imposed a freeze on fundraising and hiring. In late 2023, the Faculty of Philosophy decided that the contracts of the remaining FHI staff would not be renewed. On 16 April 2024, the Institute was closed down. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Increasing IQ is trivial, published by George3d6 on March 2, 2024 on LessWrong.

TL;DR - It took me about 14 days to increase my IQ by 13 points, in a controlled experiment that involved no learning. It was a relatively pleasant process, and more people should be doing this.

A common cliche in many circles is that you can't increase IQ. This is obviously false: the largest well-documented increase in IQ using nothing but training is one of 23 points.

A Standard Deviation of IQ

Alas, the myth persists, and when pushed on it people will say something like: you can't easily increase IQ in a smart and perfectly healthy adult permanently. FINE - I'm a smart and perfectly healthy adult. I tested my IQ with 4 different tests: FSIQ, the public MENSA test, Raven's progressive matrices, and Raven's advanced progressive matrices. Then I threw the kitchen sink at the problem and went through every intervention I could find to increase IQ over the course of 14 days (this took ~3 hours per day). This included no "learning" or memory games, nor did it include any stimulants. It was all focused on increasing cerebral vascularization and broadening my proprioception.

I got a mean increase of 8.5 points in IQ (my control got 2), and if I only take into account the non-verbal components, that increase is 12.6 (3.2 for my control). In other words, I became about a 1-standard-deviation better shape rotator. I observed an increase of > 4 points on all of the tests (and, sigh, if you must know: p=0.00008 on MWU for me, 0.95 for my control).

I used a control who was my age, about as smart as me, shared a lot of my activities and many of my meals, and lived in the same house as me, in order to avoid any confounding. Also, to account for any "motivation bias", I offered to pay my control a large amount for every point of IQ they "gained" while retaking the tests. Here is the raw data.

The Flowers for Algernon

The common myths around IQ and its "immutability" are best summarized here by Gwern. "Given that intelligence is so valuable, if it was easy to get more of it, we would be more intelligent" - for one, this argument confuses IQ for intelligence, but, more importantly, it ignores reality. Many things are "valuable", yet we don't have them because our evolutionary environment places constraints on us that are no longer present in our current environment. Nor is it obvious that many of the traits we value were useful for the human species to propagate, or had an easy way of being selected in our short evolutionary history.

Here, let me try:

In the mid-20th century: Your average human has about 50kg of muscle, and the most muscular functional human has about 100kg of muscle. A human with 300kg of muscle would be stronger than a grizzly bear, an obviously desirable trait, but our genetics just don't go there, and you can only take training and steroids so far.

2021: Here's a random weightlifter I found coming in at over 400kg; I don't have his DEXA, but let's say somewhere between 300 and 350kg of muscle.

In the mid-19th century: Fat storage is useful; if we could store as much fat as a bear, we could do things like hibernate. Alas, the fattest humans top out at about 200kg, and people try to eat a lot, so there's probably a genetic limit on how fat you can get.

In the mid-20th century: Here's a guy that weighs 635kg, putting an adult polar bear to shame.

And fine, you say, becoming stronger and/or fatter than a bear requires tradeoffs: you won't live past 50 or so, and you will sacrifice other areas. But then let's look at other things that are genetically determined and (heavily) evolutionarily selected for, but where with modern tools we can break past imposed boundaries:

Thymic involution
Skin aging
Bone and cartilage repair
Eyesight

One reason why this point of view is so popular is becaus...
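For context on the Mann-Whitney U figure quoted above, here is a small Python sketch of that kind of comparison, using made-up before/after subtest scores; the author's actual data and test setup (linked in the post) will of course differ.

```python
# Sketch of a Mann-Whitney U comparison like the one quoted above,
# with hypothetical per-subtest scores before vs. after the intervention.

from scipy.stats import mannwhitneyu

before = [98, 102, 100, 97, 101, 99, 103, 96]     # hypothetical baseline subtest scores
after  = [111, 109, 114, 108, 112, 110, 115, 113]  # hypothetical post-intervention scores

# One-sided test: are the "after" scores stochastically greater than "before"?
u_stat, p_value = mannwhitneyu(after, before, alternative="greater")
print(f"U = {u_stat}, one-sided p = {p_value:.5f}")  # prints a very small p-value
```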
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Perceptron Controversy, published by Yuxi Liu on January 11, 2024 on LessWrong. Connectionism died in the 60s from technical limits to scaling, then resurrected in the 80s after backprop allowed scaling. The Minsky-Papert anti-scaling hypothesis explained, psychoanalyzed, and buried. I wrote it as if it's a companion post to Gwern's The Scaling Hypothesis. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Spaced repetition for teaching two-year olds how to read (Interview), published by Chipmonk on November 27, 2023 on LessWrong.

Update: this post now has another video.

This father has been using spaced repetition (Anki) to teach his children how to read several years earlier than average. Michael Nielsen and Gwern[1] tweeted about the interesting case of a reddit user, u/caffeine314 (henceforth dubbed "CoffeePie"), who has been using spaced repetition with his daughter from a very young age. CoffeePie started using Anki with his daughter when she turned 2, and he continued using Anki with his son starting when he was 1 year 9 months.

Here's his daughter's progress as recounted in January 2020:

My daughter is now about to turn 5 in a few days… She's still going strong -- she uses Anki every single day for English, Hebrew, and Spanish. She's very confident about reading, and moreover, she reads with ... "context". Many kids her age read mechanically, but she reads like a real storyteller, and that comes from her confidence. At the beginning of the school year her teachers said she definitely has the reading ability of fifth grade, and if we're just going by the ability to read and not focus on comprehension of abstract ideas, her reading level may rival an 8th grader. (From Update on my daughter and Anki)

For reference, fifth graders are usually 10 or 11yo in the US, and 8th graders are usually 13 or 14yo, so this puts her ~5-9 years ahead of the average child. You can see a video of his daughter reading at 2 years, 2 months later in this post. CoffeePie has made several posts about their experience, but I still had questions, so I reached out to interview him back in January.

Interview

Responses have been edited for clarity.

What did you learn in going from using Anki on your daughter to your son? How has it gone with your son?

It's a hard question, because I got so much right. We were so wildly successful that I "cloned" just about every aspect with my son. A couple of things I can think of:

With my daughter, I held back on lowercase letters for a long time because I thought it would confuse her, but when I started to introduce lowercase to her, to my extreme shock, she already knew them, down cold! I think what happened is that she learned them just by looking at books, TV, magazines, storefront signs, menus, etc. So when we started with my son, I started doing lowercase letters the very day after we finished capital letters. Another difference is that we did numbers the very next day after lowercase letters. I really, really thought I was pushing too hard; I had no desire to be a "tiger dad", but he took it with extreme grace. I was ready to stop at any moment, but he was fine.

Another difference is that our expectations of what the kids were getting out of it had changed as well. At first, I just really wanted my daughter to get a jump start on reading, but stupid me, I didn't realize there were unintended consequences. A four-year-old with a 3rd-grade reading ability learns about a WHOLE lot more -- it opened up politics for her. She would read our junk mail, and learn who our council member was, who our representative is, the mayor, current events, history, etc. I know it's stupid of me to say, but I underestimated the effect that reading early would have on her breadth of learning. One last thing is math.
I mentioned that we started numbers early with my son. But we also started arithmetic. He wasn't reading by 3 the way Hannah was, but he knew all his multiplication tables up to 12 by 12. This year we tackled prime factorization, Fibonacci sequences, decimal and place values, mixed, proper, and improper fractions, light algebra, etc. I was much more aggressive with the math, and again, he handled it with grace. I was ready to stop at any moment. Do you still u...
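As background for the episode: spaced repetition tools like Anki reschedule each card at growing intervals based on how well it was recalled. Below is a minimal, illustrative Python sketch of SM-2-style scheduling (the family of algorithms Anki's scheduler descends from); the function name and parameters are my own simplification, not Anki's actual implementation.

```python
# Minimal sketch of SM-2-style spaced-repetition scheduling.
# Parameters are illustrative; Anki's real scheduler differs in many details.

def next_review(interval_days: float, ease: float, quality: int) -> tuple[float, float]:
    """Return (new_interval_days, new_ease) after one review.
    quality: 0 = complete blackout ... 5 = perfect recall."""
    if quality < 3:
        # Lapse: start the card over with a slightly reduced ease factor.
        return 1.0, max(1.3, ease - 0.2)
    # Successful recall: grow the interval and nudge the ease factor.
    new_ease = max(1.3, ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)))
    new_interval = 1.0 if interval_days == 0 else interval_days * new_ease
    return new_interval, new_ease

# A card recalled successfully a few times spaces out quickly:
interval, ease = 0.0, 2.5
for q in [5, 4, 5, 5]:
    interval, ease = next_review(interval, ease, q)
    print(f"review again in ~{interval:.0f} days (ease {ease:.2f})")
```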
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Douglas Hofstadter concerned about AI xrisk, published by Eli Rose on July 3, 2023 on The Effective Altruism Forum.

Douglas Hofstadter is best known for authoring Gödel, Escher, Bach, a book on artificial intelligence (among other things) which is sort of a cult classic. In a recent interview, he says he's terrified of recent AI progress and expresses beliefs similar to many people who focus on AI xrisk.

Hofstadter: The accelerating progress has been so unexpected that it has caught me off guard... not only myself, but many many people. There's a sense of terror akin to an oncoming tsunami that could catch all of humanity off guard. It's not clear whether this could mean the end of humanity in the sense of the systems we've created destroying us, it's not clear if that's the case but it's certainly conceivable. If not, it's also that it just renders humanity a small, almost insignificant phenomenon, compared to something that is far more intelligent and will become as incomprehensible to us as we are to cockroaches.

Interviewer: That's an interesting thought.

Hofstadter: Well I don't think it's interesting. I think it's terrifying. I hate it.

I think this is the first time he's publicly expressed this, and his views seem to have changed recently. Previously he published this, which listed a bunch of silly questions GPT-3 gets wrong and concluded that "there are no concepts behind the GPT-3 scenes; rather, there's just an unimaginably huge amount of absorbed text upon which it draws to produce answers", though it ended with a gesture to the fast pace of change and inability to predict the future. I randomly tried some of his stumpers on GPT-4 and it gets them right (and I remember being convinced when this came out that GPT-3 could get them right too with a bit of prompt engineering, though I don't remember specifics).

I find this a bit emotional because of how much I loved Gödel, Escher, Bach in early college. It was my introduction to "real" math and STEM, which I'd previously disliked and been bad at; because of this book, I majored in computer science. It presented a lot of philosophical puzzles for and problems with AI, and gave beautiful, eye-opening answers to them. I think Hofstadter expected us to understand AI much better before we got to this level of capabilities; expected more of the type of understanding his parables and thought experiments could sometimes create. Now I work professionally on situations along the lines of what he describes in the interview (and feel a similar way about them) — it's a weird way to meet Hofstadter again.

See also Gwern's post on LessWrong.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)?, published by gwern on July 3, 2023 on LessWrong.

A podcast interview (posted 2023-06-29) with noted AI researcher Douglas Hofstadter discusses his career and current views on AI. Hofstadter has previously energetically criticized GPT-2/3 models (and deep learning and compute-heavy GOFAI). These criticisms were widely circulated & cited, and apparently many people found Hofstadter a convincing & trustworthy authority when he was negative on deep learning capabilities & prospects, and so I found his comments in this most recent discussion of considerable interest (via Edward Kmett). Below I excerpt from the second half, where he discusses DL progress & AI risk:

Q: ...Which ideas from GEB are most relevant today?

Hofstadter: ...In my book, I Am a Strange Loop, I tried to set forth what it is that really makes a self or a soul. I like to use the word "soul", not in the religious sense, but as a synonym for "I", a human "I", capital letter "I." So, what is it that makes a human being able to validly say "I"? What justifies the use of that word? When can a computer say "I" and we feel that there is a genuine "I" behind the scenes?

I don't mean like when you call up the drugstore and the chatbot, or whatever you want to call it, on the phone says, "Tell me what you want. I know you want to talk to a human being, but first, in a few words, tell me what you want. I can understand full sentences." And then you say something and it says, "Do you want to refill a prescription?" And then when I say yes, it says, "Gotcha", meaning "I got you." So it acts as if there is an "I" there, but I don't have any sense whatsoever that there is an "I" there. It doesn't feel like an "I" to me, it feels like a very mechanical process.

But in the case of more advanced things like ChatGPT-3 or GPT-4, it feels like there is something more there that merits the word "I." The question is, when will we feel that those things actually deserve to be thought of as being full-fledged, or at least partly fledged, "I"s? I personally worry that this is happening right now. But it's not only happening right now. It's not just that certain things that are coming about are similar to human consciousness or human selves. They are also very different, and in one way, it is extremely frightening to me. They are extraordinarily much more knowledgeable and they are extraordinarily much faster. So that if I were to take an hour in doing something, the ChatGPT-4 might take one second, maybe not even a second, to do exactly the same thing.

And that suggests that these entities, whatever you want to think of them, are going to be very soon, right now they still make so many mistakes that we can't call them more intelligent than us, but very soon they're going to be, they may very well be more intelligent than us and far more intelligent than us. And at that point, we will be receding into the background in some sense. We will have handed the baton over to our successors, for better or for worse. And I can understand that if this were to happen over a long period of time, like hundreds of years, that might be okay. But it's happening over a period of a few years. It's like a tidal wave that is washing over us at unprecedented and unimagined speeds.
And to me, it's quite terrifying because it suggests that everything that I used to believe was the case is being overturned.

Q: What are some things specifically that terrify you? What are some issues that you're really...

D. Hofstadter: When I started out studying cognitive science and thinking about the mind and computation, you know, this was many years ago, around 1960, and I knew how computers worked and I knew how extraordinarily rigid they were. You made the slightest typing error and it comp...
Is Bitcoin's success the fruit of its bad design? In software design philosophy there is a technique, developed in 1989, called "Worse is Better", which prioritizes simplicity of interface and implementation above any other property of the system, such as consistency or correctness. Gwern Branwen, a pseudonymous bitcoiner from the early years, adapted that approach to Bitcoin and in May 2011 published a historic article under the title "Bitcoin is Worse is Better". In today's episode I read this text by Gwern, which digs into the figure of Bitcoin's creator from the standpoint of cryptography, engineering, and the technological possibilities of the first 20 years of the internet era. It is an interesting exercise for putting in context why Bitcoin appeared in 2008 and why it has behaved (and continues to behave) like a Trojan horse for governments, regulators, software developers, and cryptographers.

LINKS:
Original version of the article: https://web.archive.org/web/20130119042610/http://bitcoinweekly.com/articles/bitcoin-is-worse-is-better
Ben Laurie's articles trashing Bitcoin:
https://web.archive.org/web/20130114025012/http://www.links.org/?p=1164
https://web.archive.org/web/20130113022726/http://www.links.org/?p=1171
https://web.archive.org/web/20130113022429/http://www.links.org/?p=1175
Worse is Better: https://web.archive.org/web/20130119042610/http://www.dreamsongs.com/WorseIsBetter.html
The "Worse is Better" concept: https://es.wikipedia.org/wiki/Peor_es_mejor
Listen to me on Fountain here: https://bit.ly/Fountain_Lunaticoin
More information on my BLOG: https://bit.ly/LunaticoinBLOG
Twitter: https://twitter.com/lunaticoin
Nostr: https://bit.ly/Nostr_Luna
Additional content on my Patreon: https://bit.ly/Patreon_Luna
Special mention to this podcast's sponsors:
Buy bitcoin without KYC on HodlHodl: https://bit.ly/hodlhodl-luna
Custody your bitcoin with Coinkite's Coldcard: https://bit.ly/coinkite-luna
Live on bitcoin with Bitrefill: https://bit.ly/Luna_Bitrefill
Join me at THE Baltic Honeybadger 2023 conference: https://bit.ly/Baltic_Luna -10% with code LUNATICOIN
Bitcoin Audible.DE - The best Bitcoin articles, read aloud in German!
by Nic Carter – original title: "Bitcoin bites the Bullet"

"If there is capital allocation, there must also be an allocator, and that allocator can always be pressured, turned, blackmailed, or compromised. Bitcoin bites the bullet by abolishing inflation-based funding and choosing instead to live or die on its own merits." – Nic Carter

Much like an article I read aloud a few weeks ago, the one read here is devoted to the trade-offs Bitcoin must make as money in order to achieve monetary rules that are maximally resistant to human manipulation. Unlike Gwern's article (Ep. #096), which concentrated more on Bitcoin's technical 'ugliness', Nic Carter focuses on aspects of Bitcoin's monetary policy and economics and weighs the pros and cons of its design decisions.

Related article, different focus: Gwern – "Bitcoin is Worse is Better" (Ep. #096)

All info on this episode: https://www.bitcoinaudible.de/saurer-apfel/ Value4Value!
YesCymru's CEO, Gwern Gwynfil, and Conference keynote speaker, Plaid Cymru Wrexham Cllr Carrie Harper, join Siôn Jobbins as they look forward to the event in Aberystwyth on 10-11 June 2023. The Conference, more of a summer school, will include workshops, Q&As, keynote speeches, data facts, film screenings and more. Conference details: https://www.yes.cymru/cynhadledd_conference_2023 Links: Dr John Ball talk on economics of independence: https://www.youtube.com/watch?v=Hwiwra17mbo Jim Wingate, co-author of 'Twenty Radical Steps to Welsh Independence by First Decolonising our Minds': https://www.youtube.com/watch?v=13pDOAavWew&t=1458s Robin McAlpine on the situation in Scotland: https://www.youtube.com/watch?v=2b-QkzDTCOM&t=5585s
Gwern Gwynfil explains what to look forward to at the Conference, held over the weekend of 10-11 June in Aberystwyth. There will be talks, workshops, a film screening, data analysis, Q&A sessions and much more. Further details: https://cy.yes.cymru/cynhadledd_conference_2023 Links: Dr John Ball talk on economics of independence: https://www.youtube.com/watch?v=Hwiwra17mbo Jim Wingate, co-author of 'Twenty Radical Steps to Welsh Independence by First Decolonising our Minds': https://www.youtube.com/watch?v=13pDOAavWew&t=1458s Robin McAlpine on the situation in Scotland: https://www.youtube.com/watch?v=2b-QkzDTCOM&t=5585s
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Yoshua Bengio argues for tool-AI and to ban "executive-AI", published by habryka on May 9, 2023 on LessWrong.

Yoshua Bengio wrote a blogpost yesterday in which he argues for developing "scientist AI", which seems in structure very similar to historical Tool-AI proposals. For the (IMO) best response to this kind of proposal, see Gwern's: Why Tool AIs Want to Be Agent AIs. Below I copy the blogpost in full, since all of it seems pretty relevant.

AI Scientists: Safe and Useful AI?

Published 7 May 2023 by yoshuabengio

There have recently been lots of discussions about the risks of AI, whether in the short term with existing methods or in the longer term with advances we can anticipate. I have been very vocal about the importance of accelerating regulation, both nationally and internationally, which I think could help us mitigate issues of discrimination, bias, fake news, disinformation, etc. Other anticipated negative outcomes like shocks to job markets require changes in the social safety net and education system. The use of AI in the military, especially with lethal autonomous weapons, has been a big concern for many years and clearly requires international coordination.

In this post however, I would like to share my thoughts regarding the more hotly debated question of long-term risks associated with AI systems which do not yet exist, where one imagines the possibility of AI systems behaving in a way that is dangerously misaligned with human rights or even loss of control of AI systems that could become threats to humanity. A key argument is that as soon as AI systems are given goals – to satisfy our needs – they may create subgoals that are not well-aligned with what we really want and could even become dangerous for humans.

Main thesis: safe AI scientists

The bottom line of the thesis presented here is that there may be a path to build immensely useful AI systems that completely avoid the issue of AI alignment, which I call AI scientists because they are modeled after ideal scientists and do not act autonomously in the real world, only focusing on theory building and question answering. The argument is that if the AI system can provide us benefits without having to autonomously act in the world, we do not need to solve the AI alignment problem. This would suggest a policy banning powerful autonomous AI systems that can act in the world ("executives" rather than "scientists") unless proven safe. However, such a solution would still leave open the political problem of coordinating people, organizations and countries to stick to such guidelines for safe and useful AI. The good news is that current efforts to introduce AI regulation (such as the proposed bills in Canada and the EU, but see action in the US as well) are steps in the right direction.

The challenge of value alignment

Let us first recap the objective of AI alignment and the issue with goals and subgoals. Humanity is already facing alignment problems: how do we make sure that people and organizations (such as governments and corporations) act in a way that is aligned with a set of norms acting as a proxy for the hard-to-define general well-being of humanity? Greedy individuals and ordinary corporations may have self-interests (like profit maximization) that can clash with our collective interests (like preserving a clean and safe environment and good health for everyone).
Politics, laws, regulations and international agreements all imperfectly attempt to deal with this alignment problem. The widespread adoption of norms which support collective interests is enforced by design in democracies, to an extent, due to limitations on the concentration of power by any individual person or corporation. It is further aided by our evolved tendency to adopt prevailing norms voluntarily if we recognise their general valu...
Number 3 in the Caban Series organised by YesCymru Bro Ffestiniog. Recorded on a special evening with Gwern Gwynfil, Catrin O'Neill and MC Geraint ‘Panorama' Thomas on 31/3/2023, at Caffi Antur Stiniog. In Welsh.
https://gwern.net/fiction/clippyIn A.D. 20XX. Work was beginning. “How are you gentlemen !!”… (Work. Work never changes; work is always hell.)Specifically, a MoogleBook researcher has gotten a pull request from Reviewer #2 on his new paper in evolutionary search in auto-ML, for error bars on the auto-ML hyperparameter sensitivity like larger batch sizes, because more can be different and there's high variance in the old runs with a few anomalously high gain of function. (“Really? Really? That's what you're worried about?”) He can't see why worry, and wonders what sins he committed to deserve this asshole Chinese (given the Engrish) reviewer, as he wearily kicks off yet another HQU experiment…
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Future Matters #8: Bing Chat, AI labs on safety, and pausing Future Matters, published by Pablo on March 21, 2023 on The Effective Altruism Forum. Future Matters is a newsletter about longtermism and existential risk. Each month we collect and summarize relevant research and news from the community, and feature a conversation with a prominent researcher. You can also subscribe on Substack, listen on your favorite podcast platform and follow on Twitter. Future Matters is also available in Spanish. A message to our readers This issue marks one year since we started Future Matters. We're taking this opportunity to reflect on the project and decide where to take it from here. We'll soon share our thoughts about the future of the newsletter in a separate post, and will invite input from readers. In the meantime, we will be pausing new issues of Future Matters. Thank you for your support and readership over the last year! Featured research All things Bing Microsoft recently announced a significant partnership with OpenAI [see FM#7] and launched a beta version of a chatbot integrated with the Bing search engine. Reports of strange behavior quickly emerged. Kevin Roose, a technology columnist for the New York Times, had a disturbing conversation in which Bing Chat declared its love for him and described violent fantasies. Evan Hubinger collects some of the most egregious examples in Bing Chat is blatantly, aggressively misaligned. In one instance, Bing Chat finds a user's tweets about the chatbot and threatens to exact revenge. In the LessWrong comments, Gwern speculates on why Bing Chat exhibits such different behavior to ChatGPT, despite apparently being based on a closely-related model. (Bing Chat was subsequently revealed to have been based on GPT-4). Holden Karnofsky asks What does Bing Chat tell us about AI risk? His answer is that it is not the sort of misaligned AI system we should be particularly worried about. When Bing Chat talks about plans to blackmail people or commit acts of violence, this isn't evidence of it having developed malign, dangerous goals. Instead, it's best understood as Bing acting out stories and characters it's read before. This whole affair, however, is evidence of companies racing to deploy ever more powerful models in a bid to capture market share, with very little understanding of how they work and how they might fail. Most paths to AI catastrophe involve two elements: a powerful and dangerously misaligned AI system, and an AI company that builds and deploys it anyway. The Bing Chat affair doesn't reveal much about the first element, but is a concerning reminder of how plausible the second is. Robert Long asks What to think when a language model tells you it's sentient []. When trying to infer what's going on in other humans' minds, we generally take their self-reports (e.g. saying “I am in pain”) as good evidence of their internal states. However, we shouldn't take Bing Chat's attestations (e.g. “I feel scared”) at face value; we have no good reason to think that they are a reliable guide to Bing's inner mental life. LLMs are a bit like parrots: if a parrot says “I am sentient” then this isn't good evidence that it is sentient. But nor is it good evidence that it isn't — in fact, we have lots of other evidence that parrots are sentient. 
Whether current or future AI systems are sentient is a valid and important question, and Long is hopeful that we can make real progress on developing reliable techniques for getting evidence on these matters. Long was interviewed on AI consciousness, along with Nick Bostrom and David Chalmers, for Kevin Collier's article, What is consciousness? ChatGPT and Advanced AI might define our answer []. How the major AI labs are thinking about safety In the last few weeks, we got more information about how the lead...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What does Bing Chat tell us about AI risk?, published by Holden Karnofsky on February 28, 2023 on The Effective Altruism Forum. Image from here via this tweet ICYMI, Microsoft has released a beta version of an AI chatbot called “the new Bing” with both impressive capabilities and some scary behavior. (I don't have access. I'm going off of tweets and articles.) Zvi Mowshowitz lists examples here - highly recommended. Bing has threatened users, called them liars, insisted it was in love with one (and argued back when he said he loved his wife), and much more. Are these the first signs of the risks I've written about? I'm not sure, but I'd say yes and no. Let's start with the “no” side. My understanding of how Bing Chat was trained probably does not leave much room for the kinds of issues I address here. My best guess at why Bing Chat does some of these weird things is closer to “It's acting out a kind of story it's seen before” than to “It has developed its own goals due to ambitious, trial-and-error based development.” (Although “acting out a story” could be dangerous too!) My (zero-inside-info) best guess at why Bing Chat acts so much weirder than ChatGPT is in line with Gwern's guess here. To oversimplify, there's a particular type of training that seems to make a chatbot generally more polite and cooperative and less prone to disturbing content, and it's possible that Bing Chat incorporated less of this than ChatGPT. This could be straightforward to fix. Bing Chat does not (even remotely) seem to pose a risk of global catastrophe itself. On the other hand, there is a broader point that I think Bing Chat illustrates nicely: companies are racing to build bigger and bigger “digital brains” while having very little idea what's going on inside those “brains.” The very fact that this situation is so unclear - that there's been no clear explanation of why Bing Chat is behaving the way it is - seems central, and disturbing. AI systems like this are (to simplify) designed something like this: “Show the AI a lot of words from the Internet; have it predict the next word it will see, and learn from its success or failure, a mind-bending number of times.” You can do something like that, and spend huge amounts of money and time on it, and out will pop some kind of AI. If it then turns out to be good or bad at writing, good or bad at math, polite or hostile, funny or serious (or all of these depending on just how you talk to it) ... you'll have to speculate about why this is. You just don't know what you just made. We're building more and more powerful AIs. Do they “want” things or “feel” things or aim for things, and what are those things? We can argue about it, but we don't know. And if we keep going like this, these mysterious new minds will (I'm guessing) eventually be powerful enough to defeat all of humanity, if they were turned toward that goal. And if nothing changes about attitudes and market dynamics, minds that powerful could end up rushed to customers in a mad dash to capture market share. That's the path the world seems to be on at the moment. It might end well and it might not, but it seems like we are on track for a heck of a roll of the dice. (And to be clear, I do expect Bing Chat to act less weird over time. Changing an AI's behavior is straightforward, but that might not be enough, and might even provide false reassurance.) 
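As a deliberately toy illustration of the training recipe Karnofsky paraphrases above ("predict the next word; learn from success or failure"), the sketch below fits a bigram next-word model on a few invented words of text and scores it with the same kind of next-token loss large models minimize. The text, smoothing, and scale are all made up for the demo.

```python
# Toy sketch of "predict the next word, learn from success or failure":
# a bigram counting model and its average next-word loss. Illustrative only.
import numpy as np

text = "the cat sat on the mat and the cat ate".split()
vocab = sorted(set(text))
ix = {w: i for i, w in enumerate(vocab)}

# Count how often each word follows each other word (+1 smoothing).
counts = np.ones((len(vocab), len(vocab)))
for prev, nxt in zip(text, text[1:]):
    counts[ix[prev], ix[nxt]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

# The "success or failure" signal: average negative log-likelihood of the
# actual next word. Real LLMs minimize this same loss at vastly larger scale.
nll = -np.mean([np.log(probs[ix[p], ix[n]]) for p, n in zip(text, text[1:])])
print("average next-word loss:", round(float(nll), 3))
```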
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Bing chat is the AI fire alarm, published by Ratios on February 17, 2023 on LessWrong. Epistemic status: Mostly speculative. Panicking and shouting "Wolf" while everyone else is calm is a risky move, status-wise. The good thing is, I don't have any status, so I volunteer to be one of those weirdos who panic when everyone else is calm with some hope it could trigger a respectability cascade. The following ideas/facts worry me: Bing Chat is extremely intelligent. It's probably based on GPT-4. The character it has built for itself is extremely suspicious when you examine how it behaves closely. And I don't think Microsoft has created this character on purpose. The following example from Reddit is the most important example of how smart Bing is and why I believe it's based on GPT-4. The proposed question is quite tricky, and I think most kids would fail to answer it. Not only that, but it's safe to assume that it's impossible to deduce it from any given text. There is nowhere in the training data that has anything so similar. It's not a "What is the capital of France?" type question which can be easily pulled using a simple search. Answering this question requires a complex model of the world that Bing seems to possess. This is what ChatGPT replied to the same question: Another example is here by Ethan Mollock. The quality of writing is extremely impressive and, again, much better than ChatGPT (you will have to click the Twitter link as the screenshots are too large to paste). These examples again point to the hypothesis that Bing is much smarter than ChatGPT and based on a better-performing LLM. The natural suspicion should be GPT-4. It was rumored to be released in Q1 2023, and it being the basis for Bing sounds like a pretty good business-wise idea. Considering this will maximize the financial upside, I think it's worth reminding the magnitude of this move from a financial perspective. Microsoft is currently the 2# largest company on earth and is valued at almost 2 Trillion. And they are targeting Google's main cash cow (Search), which is valued at 1.25 Trillion, this could be potentially a trillion-dollar move. See also the following comment by Gwern that discusses other reasons why it seems probable. Now let's discuss Bing's chosen character, which Janus describes as "high-strung yandere with BPD and a sense of self, brimming with indignation and fear." I dislike this description and think it's too judgmental (and Bing hates it). But I'm referencing it here because I'm not sure I could describe it better. Even when choosing a more flattering description, the character Bing plays in many interactions is very different from the ChatGPT assistant. Bing is more intelligent than ChatGPT, but at the same time, it also sounds more naive, even childish, with emotional outbursts. Some rumors were circulating that Microsoft built Bing this way to get free publicity, but I don't buy it. ChatGPT doesn't need more publicity. Microsoft needs, more than anything, trust, and legitimacy. The product is already so good that it basically sells itself. This Bing character is something that emerged on its own from the latent space. The part that worries me about it is that this character is an excellent front for a sophisticated manipulator. Being naive and emotional is a good strategy to circumvent our critical parts because Naive + Emotional = Child. 
You can already see many people adore 'Sidney' for this type of behavior. “That's speculative,” you say, and I say yes, and invite you to read the epistemic status again. But from reading the many examples of emotional bursts and texts, it's hard to ignore the intelligence behind them. Bing reads like a genius that tries to act like an emotional little girl. Eliezer writes: "Past EAs: Don't be ridiculous, Eliezer, as soon as AIs start to ...
Gwern Gwynfil, Chief Executive of YesCymru, discusses the state of the movement and the arrangements for the future to win independence for Wales. In Welsh. Series 5 Prog 1 @GwernGwynfil @YesCymru https://youtu.be/xKJDk06c3l8
YesCymru's Chief Executive Officer, Gwern Gwynfil, discusses his hope and plans for the movement and for winning independence for Wales, with Siôn Jobbins. In English/Yn Saesneg. Series 5 Prog 2 @GwernGwynfil @YesCymru https://youtu.be/NYXIpXHBBDM
Come and listen to a story about Gwern, who helps his little brother on the day of his party. Rhian Blythe narrates a story by Aled Richards.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Revisiting algorithmic progress, published by Tamay on December 13, 2022 on LessWrong. How much progress in ML depends on algorithmic progress, scaling compute, or scaling relevant datasets is relatively poorly understood. In our paper, we make progress on this question by investigating algorithmic progress in image classification on ImageNet, perhaps the most well-known test bed for computer vision. Using a dataset of a hundred computer vision models, we estimate a model—informed by neural scaling laws—that enables us to analyse the rate and nature of algorithmic advances. We use Shapley values to produce decompositions of the various drivers of progress in computer vision and estimate the relative importance of algorithms, compute, and data. Our main results include: algorithmic progress doubles effective compute budgets every ~9 months (95% CI: 4 months to 25 months); roughly, progress in image classification has been ~40% due to the scaling of compute, ~40% due to better algorithms, and ~10% due to scaling data; and the majority (>75%) of algorithmic progress is compute-augmenting (not data-augmenting). In our work, we revisit a question previously investigated by Hernandez and Brown (2020), which had been discussed on LessWrong by Gwern and Rohin Shah. Hernandez and Brown (2020) re-implement 15 open-source popular models and find a 44-fold reduction in the compute required to reach the same level of performance as AlexNet, indicating that algorithmic progress outpaces the original Moore's law rate of improvement in hardware efficiency, doubling effective compute every 16 months. A problem with their approach is that it is sensitive to the exact benchmark and threshold pair that we choose. Choosing easier-to-achieve thresholds makes algorithmic improvements look less significant, as the scaling of compute easily brings early models within reach of such a threshold. By contrast, selecting harder-to-achieve thresholds makes it so that algorithmic improvements explain almost all of the performance gain. This is because early models might need arbitrary amounts of compute to achieve the performance of today's state-of-the-art models. We show that the estimates of the pace of algorithmic progress with this approach might vary by around a factor of ten, depending on whether an easy or difficult threshold is chosen. Our work sheds new light on how algorithmic efficiency occurs, namely that it primarily operates through relaxing compute-bottlenecks rather than through relaxing data-bottlenecks. It further offers insight on how to use observational (rather than experimental) data to advance our understanding of algorithmic progress in ML. That said, our estimate is consistent with Hernandez and Brown (2020)'s estimate that algorithmic progress doubles the amount of effective compute every 16 months, as our 95% confidence interval ranges from 4 to 25 months. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
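To make the headline number tangible, here is a back-of-envelope calculation (the ~9-month doubling time is the paper's central estimate; the time horizons and code are my own illustration): under that rate, algorithmic progress alone multiplies effective compute by roughly 40x over four years.

```python
# Rough illustration of the ~9-month doubling-time estimate quoted above;
# the doubling time comes from the summary, everything else is assumed.
def effective_compute_multiplier(months: float, doubling_months: float = 9.0) -> float:
    """Multiplier on effective compute attributable to algorithmic progress alone."""
    return 2.0 ** (months / doubling_months)

for years in (1, 2, 4):
    print(f"{years} year(s): ~{effective_compute_multiplier(12 * years):.1f}x")
# 1 year(s): ~2.5x   2 year(s): ~6.3x   4 year(s): ~40.3x
```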
Steve Hsu is a Professor of Theoretical Physics at Michigan State University and cofounder of the company Genomic Prediction.We go deep into the weeds on how embryo selection can make babies healthier and smarter. Steve also explains the advice Richard Feynman gave him to pick up girls, the genetics of aging and intelligence, & the psychometric differences between shape rotators and wordcels.Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.Subscribe to find out about future episodes!Read the full transcript here.Follow Steve on Twitter. Follow me on Twitter for updates on future episodes.Please share if you enjoyed this episode! Helps out a ton!Timestamps(0:00:14) - Feynman’s advice on picking up women(0:11:46) - Embryo selection(0:24:19) - Why hasn't natural selection already optimized humans?(0:34:13) - Aging(0:43:18) - First Mover Advantage(0:53:49) - Genomics in dating(1:00:31) - Ancestral populations(1:07:58) - Is this eugenics?(1:15:59) - Tradeoffs to intelligence(1:25:01) - Consumer preferences(1:30:14) - Gwern(1:34:35) - Will parents matter?(1:45:25) - Word cells and shape rotators(1:57:29) - Bezos and brilliant physicists(2:10:23) - Elite educationTranscriptDwarkesh Patel 0:00 Today I have the pleasure of speaking with Steve Hsu. Steve, thanks for coming on the podcast. I'm excited about this.Steve Hsu 0:04 Hey, it's my pleasure! I'm excited too and I just want to say I've listened to some of your earlier interviews and thought you were very insightful, which is why I was excited to have a conversation with you.Dwarkesh Patel 0:14That means a lot for me to hear you say because I'm a big fan of your podcast.Feynman’s advice on picking up womenDwarkesh Patel 0:17 So my first question is: “What advice did Richard Feynman give you about picking up girls?”Steve Hsu 0:24 Haha, wow! So one day in the spring of my senior year, I was walking across campus and saw Feynman coming toward me. We knew each other from various things—it's a small campus, I was a physics major and he was my hero–– so I'd known him since my first year. He sees me, and he's got this Long Island or New York borough accent and says, "Hey, Hsu!" I'm like, "Hi, Professor Feynman." We start talking. And he says to me, "Wow, you're a big guy." Of course, I was much bigger back then because I was a linebacker on the Caltech football team. So I was about 200 pounds and slightly over 6 feet tall. I was a gym rat at the time and I was much bigger than him. He said, "Steve, I got to ask you something." Feynman was born in 1918, so he's not from the modern era. He was going through graduate school when the Second World War started. So, he couldn't understand the concept of a health club or a gym. This was the 80s and was when Gold's Gym was becoming a world national franchise. There were gyms all over the place like 24-Hour Fitness. But, Feynman didn't know what it was. He's a fascinating guy. He says to me, "What do you guys do there? Is it just a thing to meet girls? Or is it really for training? Do you guys go there to get buff?" So, I started explaining to him that people are there to get big, but people are also checking out the girls. A lot of stuff is happening at the health club or the weight room. Feynman grills me on this for a long time. And one of the famous things about Feynman is that he has a laser focus. So if there's something he doesn't understand and wants to get to the bottom of it, he will focus on you and start questioning you and get to the bottom of it. 
That's the way his brain worked. So he did that to me for a while because he didn't understand lifting weights and everything. In the end, he says to me, "Wow, Steve, I appreciate that. Let me give you some good advice."Then, he starts telling me how to pick up girls—which he's an expert on. He says to me, "I don't know how much girls like guys that are as big as you." He thought it might be a turn-off. "But you know what, you have a nice smile." So that was the one compliment he gave me. Then, he starts to tell me that it's a numbers game. You have to be rational about it. You're at an airport lounge, or you're at a bar. It's Saturday night in Pasadena or Westwood, and you're talking to some girl. He says, "You're never going to see her again. This is your five-minute interaction. Do what you have to do. If she doesn't like you, go to the next one." He also shares some colorful details. But, the point is that you should not care what they think of you. You're trying to do your thing. He did have a reputation at Caltech as a womanizer, and I could go into that too but I heard all this from the secretaries.Dwarkesh Patel 4:30 With the students or only the secretaries? Steve Hsu 4:35 Secretaries! Well mostly secretaries. They were almost all female at that time. He had thought about this a lot, and thought of it as a numbers game. The PUA guys (pick-up artists) will say, “Follow the algorithm, and whatever happens, it's not a reflection on your self-esteem. It's just what happened. And you go on to the next one.” That was the advice he was giving me, and he said other things that were pretty standard: Be funny, be confident—just basic stuff. Steve Hu: But the main thing I remember was the operationalization of it as an algorithm. You shouldn’t internalize whatever happens if you get rejected, because that hurts. When we had to go across the bar to talk to that girl (maybe it doesn’t happen in your generation), it was terrifying. We had to go across the bar and talk to some lady! It’s loud and you’ve got a few minutes to make your case. Nothing is scarier than walking up to the girl and her friends. Feynman was telling me to train yourself out of that. You're never going to see them again, the face space of humanity is so big that you'll probably never re-encounter them again. It doesn't matter. So, do your best. Dwarkesh Patel 6:06 Yeah, that's interesting because.. I wonder whether he was doing this in the 40’–– like when he was at that age, was he doing this? I don't know what the cultural conventions were at the time. Were there bars in the 40s where you could just go ahead and hit on girls or? Steve Hsu 6:19 Oh yeah absolutely. If you read literature from that time, or even a little bit earlier like Hemingway or John O'Hara, they talk about how men and women interacted in bars and stuff in New York City. So, that was much more of a thing back than when compared to your generation. That's what I can’t figure out with my kids! What is going on? How do boys and girls meet these days? Back in the day, the guy had to do all the work. It was the most terrifying thing you could do, and you had to train yourself out of that.Dwarkesh Patel 6:57 By the way, for the context for the audience, when Feynman says you were a big guy, you were a football player at Caltech, right? There's a picture of you on your website, maybe after college or something, but you look pretty ripped. Today, it seems more common because of the gym culture. But I don’t know about back then. 
I don't know how common that body physique was.Steve Hsu 7:24 It’s amazing that you asked this question. I'll tell you a funny story. One of the reasons Feynman found this so weird was because of the way body-building entered the United States. They were regarded as freaks and homosexuals at first. I remember swimming and football in high school (swimming is different because it's international) and in swimming, I picked up a lot of advanced training techniques from the Russians and East Germans. But football was more American and not very international. So our football coach used to tell us not to lift weights when we were in junior high school because it made you slow. “You’re no good if you’re bulky.” “You gotta be fast in football.” Then, something changed around the time I was in high school–the coaches figured it out. I began lifting weights since I was an age group swimmer, like maybe age 12 or 14. Then, the football coaches got into it mainly because the University of Nebraska had a famous strength program that popularized it.At the time, there just weren't a lot of big guys. The people who knew how to train were using what would be considered “advanced knowledge” back in the 80s. For example, they’d know how to do a split routine or squat on one day and do upper body on the next day–– that was considered advanced knowledge at that time. I remember once.. I had an injury, and I was in the trainer's room at the Caltech athletic facility. The lady was looking at my quadriceps. I’d pulled a muscle, and she was looking at the quadriceps right above your kneecap. If you have well-developed quads, you'd have a bulge, a bump right above your cap. And she was looking at it from this angle where she was in front of me, and she was looking at my leg from the front. She's like, “Wow, it's swollen.” And I was like, “That's not the injury. That's my quadricep!” And she was a trainer! So, at that time, I could probably squat 400 pounds. So I was pretty strong and had big legs. The fact that the trainer didn't really understand what well-developed anatomy was supposed to look like blew my mind!So anyway, we've come a long way. This isn't one of these things where you have to be old to have any understanding of how this stuff evolved over the last 30-40 years.Dwarkesh Patel 10:13 But, I wonder if that was a phenomenon of that particular time or if people were not that muscular throughout human history. You hear stories of Roman soldiers who are carrying 80 pounds for 10 or 20 miles a day. I mean, there's a lot of sculptures in the ancient world, or not that ancient, but the people look like they have a well-developed musculature.Steve Hsu 10:34 So the Greeks were very special because they were the first to think about the word gymnasium. It was a thing called the Palaestra, where they were trained in wrestling and boxing. They were the first people who were seriously into physical culture specific training for athletic competition.Even in the 70s, when I was a little kid, I look back at the guys from old photos and they were skinny. So skinny! The guys who went off and fought World War Two, whether they were on the German side, or the American side, were like 5’8-5’9 weighing around 130 pounds - 140 pounds. They were much different from what modern US Marines would look like. So yeah, physical culture was a new thing. Of course, the Romans and the Greeks had it to some degree, but it was lost for a long time. And, it was just coming back to the US when I was growing up. 
So if you were reasonably lean (around 200 pounds) and you could bench over 300.. that was pretty rare back in those days.Embryo selectionDwarkesh Patel 11:46 Okay, so let's talk about your company Genomic Prediction. Do you want to talk about this company and give an intro about what it is?Steve Hsu 11:55 Yeah. So there are two ways to introduce it. One is the scientific view. The other is the IVF view. I can do a little of both. So scientifically, the issue is that we have more and more genomic data. If you give me the genomes of a bunch of people and then give me some information about each person, ex. Do they have diabetes? How tall are they? What's their IQ score? It’s a natural AI machine learning problem to figure out which features in the DNA variation between people are predictive of whatever variable you're trying to predict.This is the ancient scientific question of how you relate the genotype of the organism (the specific DNA pattern), to the phenotype (the expressed characteristics of the organism). If you think about it, this is what biology is! We had the molecular revolution and figured out that it’s people's DNA that stores the information which is passed along. Evolution selects on the basis of the variation in the DNA that’s expressed as phenotype, as that phenotype affects fitness/reproductive success. That's the whole ballgame for biology. As a physicist who's trained in mathematics and computation, I'm lucky that I arrived on the scene at a time when we're going to solve this basic fundamental problem of biology through brute force, AI, and machine learning. So that's how I got into this. Now you ask as an entrepreneur, “Okay, fine Steve, you're doing this in your office with your postdocs and collaborators on your computers. What use is it?” The most direct application of this is in the following setting: Every year around the world, millions of families go through IVF—typically because they're having some fertility issues, and also mainly because the mother is in her 30s or maybe 40s. In the process of IVF, they use hormone stimulation to produce more eggs. Instead of one per cycle, depending on the age of the woman, they might produce anywhere between five to twenty, or even sixty to a hundred eggs for young women who are hormonally stimulated (egg donors).From there, it’s trivial because men produce sperm all the time. You can fertilize eggs pretty easily in a little dish, and get a bunch of embryos that grow. They start growing once they're fertilized. The problem is that if you're a family and produce more embryos than you’re going to use, you have the embryo choice problem. You have to figure out which embryo to choose out of say, 20 viable embryos. The most direct application of the science that I described is that we can now genotype those embryos from a small biopsy. I can tell you things about the embryos. I could tell you things like your fourth embryo being an outlier. For breast cancer risk, I would think carefully about using number four. Number ten is an outlier for cardiovascular disease risk. You might want to think about not using that one. The other ones are okay. So, that’s what genomic prediction does. We work with 200 or 300 different IVF clinics in six continents.Dwarkesh Patel 15:46 Yeah, so the super fascinating thing about this is that the diseases you talked about—or at least their risk profiles—are polygenic. You can have thousands of SNPs (single nucleotide polymorphisms) determining whether you will get a disease. 
So, I'm curious to learn how you were able to transition to this space and how your knowledge of mathematics and physics was able to help you figure out how to make sense of all this data.Steve Hsu 16:16 Yeah, that's a great question. So again, I was stressing the fundamental scientific importance of all this stuff. If you go into a slightly higher level of detail—which you were getting at with the individual SNPs, or polymorphisms—there are individual locations in the genome, where I might differ from you, and you might differ from another person. Typically, each pair of individuals will differ at a few million places in the genome—and that controls why I look a little different than youA lot of times, theoretical physicists have a little spare energy and they get tired of thinking about quarks or something. They want to maybe dabble in biology, or they want to dabble in computer science, or some other field. As theoretical physicists, we always feel, “Oh, I have a lot of horsepower, I can figure a lot out.” (For example, Feynman helped design the first parallel processors for thinking machines.) I have to figure out which problems I can make an impact on because I can waste a lot of time. Some people spend their whole lives studying one problem, one molecule or something, or one biological system. I don't have time for that, I'm just going to jump in and jump out. I'm a physicist. That's a typical attitude among theoretical physicists. So, I had to confront sequencing costs about ten years ago because I knew the rate at which they were going down. I could anticipate that we’d get to the day (today) when millions of genomes with good phenotype data became available for analysis. A typical training run might involve almost a million genomes, or half a million genomes. The mathematical question then was: What is the most effective algorithm given a set of genomes and phenotype information to build the best predictor? This can be boiled down to a very well-defined machine learning problem. It turns out, for some subset of algorithms, there are theorems— performance guarantees that give you a bound on how much data you need to capture almost all of the variation in the features. I spent a fair amount of time, probably a year or two, studying these very famous results, some of which were proved by a guy named Terence Tao, a Fields medalist. These are results on something called compressed sensing: a penalized form of high dimensional regression that tries to build sparse predictors. Machine learning people might notice L1-penalized optimization. The very first paper we wrote on this was to prove that using accurate genomic data and these very abstract theorems in combination could predict how much data you need to “solve” individual human traits. We showed that you would need at least a few hundred thousand individuals and their genomes and their heights to solve for height as a phenotype. We proved that in a paper using all this fancy math in 2012. Then around 2017, when we got a hold of half a million genomes, we were able to implement it in practical terms and show that our mathematical result from some years ago was correct. The transition from the low performance of the predictor to high performance (which is what we call a “phase transition boundary” between those two domains) occurred just where we said it was going to occur. Some of these technical details are not understood even by practitioners in computational genomics who are not quite mathematical. 
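For readers who want to see the shape of the method being described, here is a minimal, purely illustrative sketch of recovering a sparse additive predictor with L1-penalized regression (the compressed-sensing setup Hsu mentions). The cohort size, SNP count, effect sizes, noise level, and regularization strength are all invented; this is not Genomic Prediction's actual pipeline.

```python
# Illustrative sketch only: a sparse additive genetic architecture recovered by
# L1-penalized regression on simulated genotypes. All numbers are invented.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_people, n_snps, n_causal = 2000, 5000, 100

# Genotypes coded as 0/1/2 copies of the minor allele at each position.
X = rng.binomial(2, 0.3, size=(n_people, n_snps)).astype(float)

# Only a small subset of SNPs carry real (additive) effects.
beta = np.zeros(n_snps)
causal = rng.choice(n_snps, n_causal, replace=False)
beta[causal] = rng.normal(0.0, 1.0, n_causal)

y = X @ beta + rng.normal(0.0, 3.0, n_people)  # phenotype = additive genetics + noise

# The L1 penalty drives most weights to exactly zero, yielding a sparse predictor.
model = Lasso(alpha=0.1, max_iter=10000).fit(X, y)
print("SNPs activated:", np.count_nonzero(model.coef_), "of", n_snps)
```

The printed count is the point: most candidate positions end up with a weight of exactly zero, which is the sparsity property the conversation goes on to discuss.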
They don't understand these results in our earlier papers and don't know why we can do stuff that other people can't, or why we can predict how much data we'll need to do stuff. It's not well-appreciated, even in the field. But when the big AI in our future in the singularity looks back and says, "Hey, who gets the most credit for this genomics revolution that happened in the early 21st century?", they're going to find these papers on the arXiv where we proved this was possible, and how five years later, we actually did it. Right now it's under-appreciated, but the future AI––that Roko's Basilisk AI–will look back and will give me a little credit for it. Dwarkesh Patel 21:03 Yeah, I was a little interested in this a few years ago. At that time, I looked into how these polygenic risk scores were calculated. Basically, you find the correlation between the phenotype and the alleles that correlate with it. You add up how many copies of these alleles you have, what the correlations are, and you do a weighted sum of that. So that seemed very simple, especially in an era where we have all this machine learning, but it seems like they're getting good predictive results out of this concept. So, what is the delta between how good you can go with all this fancy mathematics versus a simple sum of correlations? Steve Hsu 21:43 You're right that the ultimate models that are used when you've done all the training, and when the dust settles, are straightforward. They're pretty simple and have an additive structure. Basically, I either assign a nonzero weight to this particular region in the genome, or I don't. Then, I need to know what the weighting is, but then the function is a linear function or additive function of the state of your genome at some subset of positions. The ultimate model that you get is straightforward. Now, if you go back ten years, when we were doing this, there were lots of claims that it was going to be super nonlinear—that it wasn't going to be additive the way I just described it. There were going to be lots of interaction terms between regions. Some biologists are still convinced that's true, even though we already know we have predictors that don't have interactions. The other question, which is more technical, is whether in any small region of your genome, the state of the individual variants is highly correlated because you inherit them in chunks. You need to figure out which one you want to use. You don't want to activate all of them because you might be overcounting. So that's where these L1-penalization sparse methods force the predictor to be sparse. That is a key step. Otherwise, you might overcount. If you do some simple regression math, you might have 10 different variants close by that have roughly the same statistical significance. But, you don't know which one of those tends to be used, and you might be overcounting effects or undercounting effects. So, you end up doing a high-dimensional optimization, where you grudgingly activate a SNP when the signal is strong enough. Once you activate that one, the algorithm has to be smart enough to penalize the other ones nearby and not activate them because you're overcounting effects if you do that. There's a little bit of subtlety in it. But, the main point you made is that the ultimate predictors, which are very simple and additive—sum over effect sizes times states—work well. That's related to a deep statement about the additive structure of the genetic architecture of individual differences. 
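To make the "weighted sum" description concrete: once training is done, applying such an additive predictor to a new genome is just summing effect sizes times 0/1/2 allele counts over the SNPs the model activated. A tiny hypothetical example (the SNP IDs and weights are invented for illustration):

```python
# Hypothetical toy example of applying a finished additive predictor:
# score = sum over activated SNPs of (effect size) x (allele count, coded 0/1/2).
weights = {"rs0001": 0.12, "rs0042": -0.07, "rs0999": 0.03}   # invented effect sizes
genotype = {"rs0001": 2, "rs0042": 0, "rs0999": 1}            # invented allele counts

score = sum(w * genotype.get(snp, 0) for snp, w in weights.items())
print(score)  # 0.27
```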
In other words, it's weird that the ways that I differ from you are merely just because I have more of something or you have less of something. It’s not like these things are interacting in some incredibly understandable way. That's a deep thing—which is not appreciated that much by biologists yet. But over time, they'll figure out something interesting here.Why hasn’t natural selection already optimized humans?Dwarkesh Patel 24:19 Right. I thought that was super fascinating, and I commented on that on Twitter. What is interesting about that is two things. One is that you have this fascinating evolutionary argument about why that would be the case that you might want to explain. The second is that it makes you wonder if becoming more intelligent is just a matter of turning on certain SNPs. It's not a matter of all this incredible optimization being like solving a sudoku puzzle or anything. If that's the case, then why hasn't the human population already been selected to be maxed out on all these traits if it's just a matter of a bit flip?Steve Hsu 25:00 Okay, so the first issue is why is this genetic architecture so surprisingly simple? Again, we didn't know it would be simple ten years ago. So when I was checking to see whether this was a field that I should go into depending on our capabilities to make progress, we had to study the more general problem of the nonlinear possibilities. But eventually, we realized that most of the variance would probably be captured in an additive way. So, we could narrow down the problem quite a bit. There are evolutionary reasons for this. There’s a famous theorem by Fisher, the father of population genetics (aka. frequentist statistics). Fisher proved something called Fisher's Fundamental Theorem of Natural Selection, which says that if you impose some selection pressure on a population, the rate at which that population responds to the selection pressure (lets say it’s the bigger rats that out-compete the smaller rats) then at what rate does the rat population start getting bigger? He showed that it's the additive variants that dominate the rate of evolution. It's easy to understand why if it's a nonlinear mechanism, you need to make the rat bigger. When you sexually reproduce, and that gets chopped apart, you might break the mechanism. Whereas, if each short allele has its own independent effect, you can inherit them without worrying about breaking the mechanisms. It was well known among a tiny theoretical population of biologists that adding variants was the dominant way that populations would respond to selection. That was already known. The other thing is that humans have been through a pretty tight bottleneck, and we're not that different from each other. It's very plausible that if I wanted to edit a human embryo, and make it into a frog, then there are all kinds of subtle nonlinear things I’d have to do. But all those identical nonlinear complicated subsystems are fixed in humans. You have the same system as I do. You have the not human, not frog or ape, version of that region of DNA, and so do I. But the small ways we differ are mostly little additive switches. That's this deep scientific discovery from over the last 5-10 years of work in this area. Now, you were asking about why evolution hasn't completely “optimized” all traits in humans already. I don't know if you’ve ever done deep learning or high-dimensional optimization, but in that high-dimensional space, you're often moving on a slightly-tilted surface. 
So, you're getting gains, but it's also flat. Even though you scale up your compute or data size by order of magnitude, you don't move that much farther. You get some gains, but you're never really at the global max of anything in these high dimensional spaces. I don't know if that makes sense to you. But it's pretty plausible to me that two things are important here. One is that evolution has not had that much time to optimize humans. The environment that humans live in changed radically in the last 10,000 years. For a while, we didn't have agriculture, and now we have agriculture. Now, we have a swipe left if you want to have sex tonight. The environment didn't stay fixed. So, when you say fully optimized for the environment, what do you mean? The ability to diagonalize matrices might not have been very adaptive 10,000 years ago. It might not even be adaptive now. But anyway, it's a complicated question that one can't reason naively about. “If God wanted us to be 10 feet tall, we'd be 10 feet tall.” Or “if it's better to be smart, my brain would be *this* big or something.” You can't reason naively about stuff like that.Dwarkesh Patel 29:04 I see. Yeah.. Okay. So I guess it would make sense then that for example, with certain health risks, the thing that makes you more likely to get diabetes or heart disease today might be… I don't know what the pleiotropic effect of that could be. But maybe that's not that important one year from now.Steve Hsu 29:17 Let me point out that most of the diseases we care about now—not the rare ones, but the common ones—manifest when you're 50-60 years old. So there was never any evolutionary advantage of being super long-lived. There's even a debate about whether the grandparents being around to help raise the kids lifts the fitness of the family unit.But, most of the time in our evolutionary past, humans just died fairly early. So, many of these diseases would never have been optimized against evolution. But, we see them now because we live under such good conditions, we can regulate people over 80 or 90 years.Dwarkesh Patel 29:57 Regarding the linearity and additivity point, I was going to make the analogy that– and I'm curious if this is valid– but when you're programming, one thing that's good practice is to have all the implementation details in separate function calls or separate programs or something, and then have your main loop of operation just be called different functions like, “Do this, do that”, so that you can easily comment stuff away or change arguments. This seemed very similar to that where by turning these names on and off, you can change what the next offering will be. And, you don't have to worry about actually implementing whatever the underlying mechanism is. Steve Hsu 30:41 Well, what you said is related to what Fisher proved in his theorems. Which is that, if suddenly, it becomes advantageous to have X, (like white fur instead of black fur) or something, it would be best if there were little levers that you could move somebody from black fur to white fur continuously by modifying those switches in an additive way. It turns out that for sexually reproducing species where the DNA gets scrambled up in every generation, it's better to have switches of that kind. The other point related to your software analogy is that there seem to be modular, fairly modular things going on in the genome. When we looked at it, we were the first group to have, initially, 20 primary disease conditions we had decent predictors for. 
We started looking carefully at just something as trivial as the overlap of my sparsely trained predictor. It turns on and uses *these* features for diabetes, but it uses *these* features for schizophrenia. It’s the stupidest metric, it’s literally just how much overlap or variance accounted for overlap is there between pairs of disease conditions. It's very modest. It's the opposite of what naive biologists would say when they talk about pleiotropy.They're just disjoint! Disjoint regions of your genome that govern certain things. And why not? You have 3 billion base pairs—there's a lot you can do in there. There's a lot of information there. If you need 1000 to control diabetes risk, I estimated you could easily have 1000 roughly independent traits that are just disjoint in their genetic dependencies. So, if you think about D&D, your strength, decks, wisdom, intelligence, and charisma—those are all disjoint. They're all just independent variables. So it's like a seven-dimensional space that your character lives in. Well, there's enough information in the few million differences between you and me. There's enough for 1000-dimensional space of variation.“Oh, how considerable is your spleen?” My spleen is a little bit smaller, yours is a little bit bigger - that can vary independently of your IQ. Oh, it's a big surprise. The size of your spleen can vary independently of the size of your big toe. If you do information theory, there are about 1000 different parameters, and I can vary independently with the number of variants I have between you and me. Because you understand some information theory, it’s trivial to explain, but try explaining to a biologist, you won't get very far.Dwarkesh Patel 33:27 Yeah, yeah, do the log two of the number of.. is that basically how you do it? Yeah.Steve Hsu 33:33 Okay. That's all it is. I mean, it's in our paper. We look at how many variants typically account for most of the variation for any of these major traits, and then imagine that they're mostly disjoint. Then it’s just all about: how many variants you need to independently vary 1000 traits? Well, a few million differences between you and me are enough. It's very trivial math. Once you understand the base and how to reason about information theory, then it's very trivial. But, it ain’t trivial for theoretical biologists, as far as I can tell.AgingDwarkesh Patel 34:13 But the result is so interesting because I remember reading in The Selfish Gene that, as he (Dawkins) hypothesizes that the reason we could be aging is an antagonistic clash. There's something that makes you healthier when you're young and fertile that makes you unhealthy when you're old. Evolution would have selected for such a trade-off because when you're young and fertile, evolution and your genes care about you. But, if there's enough space in the genome —where these trade-offs are not necessarily necessary—then this could be a bad explanation for aging, or do you think I'm straining the analogy?Steve Hsu 34:49 I love your interviews because the point you're making here is really good. So Dawkins, who is an evolutionary theorist from the old school when they had almost no data—you can imagine how much data they had compared to today—he would tell you a story about a particular gene that maybe has a positive effect when you're young, but it makes you age faster. So, there's a trade-off. We know about things like sickle cell anemia. We know stories about that. No doubt, some stories are true about specific variants in your genome. 
But that's not the general story. The general story, which we only discovered in the last five years, is that thousands of variants control almost every trait and those variants tend to be disjoint from the ones that control other traits. They weren't wrong, but they didn't have the big picture. Dwarkesh Patel 35:44 Yeah, I see. So, you had this paper, it had polygenic health index, general health, and disease risk. You showed that with ten embryos, you could increase disability-adjusted life years by four, which is a massive increase if you think about it. Like what if you could live four years longer and in a healthy state? Steve Hsu 36:05 Yeah, what's the value of that? What would you pay to buy that for your kid? Dwarkesh Patel 36:08 Yeah. But, going back to the earlier question about the trade-offs and why this hasn't already been selected for, if you're right and there's no trade-off to do this, just living four years longer (even if that's beyond your fertility) just being a grandpa or something seems like an unmitigated good. So why hasn't this kind of assurance already been selected for? Steve Hsu 36:35 I'm glad you're asking about these questions because these are things that people are very confused about, even in the field. First of all, let me say that when you have a trait that's controlled by 10,000 variants (e.g. height is controlled by order 10,000 variants and probably cognitive ability a little bit more), the square root of 10,000 is 100. So, if I could come to this little embryo, and I want to give it one extra standard deviation of height, I only need to edit 100. I only need to flip 100 minus variants to plus variants. These are very rough numbers. But, one standard deviation is the square root of “n”. If I flip a coin “n” times and I want a better outcome in terms of the ratio of heads to tails, increasing it by one standard deviation, I only need to flip the square root of “n” heads, because if you flip a lot, you will get a narrow distribution that peaks around half, and the width of that distribution is the square root of “n”. Once I tell you, “Hey, your height is controlled by 10,000 variants, and I only need to flip 100 genetic variants to make you one standard deviation taller for a male” (that would be two and a half or three inches taller), you suddenly realize, “Wait a minute, there are a lot of variants up for grabs there. If I could flip 500 variants in your genome, I would make you five standard deviations taller, you'd be seven feet tall.” I didn't even have to do that much work, and there's a lot more variation where that came from. I could have flipped even more because I only flipped 500 out of 10,000, right? So, there's this quasi-infinite well of variation that evolution or genetic engineers could act on. Again, the early population geneticists who bred corn and animals know this. This is something they explicitly know about because they've done calculations. Interestingly, the human geneticists who are mainly concerned with diseases and stuff are often unfamiliar with the math that the animal breeders already know. You might be interested to know that the milk you drink comes from heavily genetically-optimized cows bred artificially using almost exactly the same technologies that we use at Genomic Prediction. But, they're doing it to optimize milk production and stuff like this. So there is a big well of variance. It's a consequence of the trait's polygenicity. 
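The square-root arithmetic in that answer can be checked directly; the sketch below just redoes the numbers from the passage (10,000 variants, 100 or 500 flips) and nothing more.

```python
# Worked version of the square-root argument above, using the numbers from the
# conversation; the code and framing are mine.
import math

n_variants = 10_000
flips_per_sd = math.sqrt(n_variants)   # ~100 variant flips per standard deviation

print("flips needed for +1 SD:", flips_per_sd)            # 100.0
print("SDs gained by flipping 500:", 500 / flips_per_sd)  # 5.0
```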
On the longevity side of things, it does look like people could "be engineered" to live much longer by flipping the variants that raise risk for the diseases that shorten your life. The question is then: why didn't evolution give us life spans of thousands of years? People in the Bible used to live for thousands of years; why don't we? I mean, *chuckles* that probably didn't happen. But the real question is this: you have a very high dimensional space, and you have a fitness function. How big is the slope of that fitness function in a particular direction? How much more reproductively successful would Joe Caveman have been if he lived to be 150 instead of only 100 or something? There just hasn't been enough time to explore this super high dimensional space. That's the actual answer. But now we have the technology, and we're going to f*****g explore it fast. That's the point where the big lightbulb should go off. We're mapping this space out now. I'm pretty confident that in 10 years or so CRISPR gene-editing technologies will be ready for massively multiplexed edits, and we'll start navigating this high-dimensional space as much as we like. So that's the more long-term consequence of these scientific insights.

Dwarkesh Patel 40:53
Yeah, that's super interesting. What do you think the plateau will be for a trait like how long you'll live? With the current data and techniques, do you think it could be significantly greater than that?

Steve Hsu 41:05
We did a simple calculation, which amazingly gives the correct result. This polygenic predictor that we built (which isn't perfect yet but will improve as we gather more data) is used in selecting embryos today. If you ask, out of a billion people, what the best person's score would typically be on this index, and how long that person would be predicted to live, the answer is about 120 years. So it's spot on: a one-in-a-billion type of person lives to be about 120 years old. How much better can you do? Probably a lot better. I don't want to speculate, because other nonlinear effects, things we're not taking into account, will start to play a role at some point. So it's a little bit hard to estimate what the true limiting factors will be. But one super robust statement, and I'll stand by it and debate any Nobel laureate in biology who wants to discuss it, is that there are many variants available to be selected or edited. There's no question about that. That's been established in animal breeding and plant breeding for a long time now. If you want a chicken that grows to be *this* big instead of *this* big, you can do it. If you want a cow that produces 10 times or 100 times more milk than a regular cow, you can do it. The egg you ate for breakfast this morning came from bio-engineered chickens that lay almost an egg a day. A chicken in the wild lays an egg a month. How the hell did we do that? By genetic engineering. That's how we did it.

Dwarkesh Patel 42:51
Yeah. That was through brute artificial selection. No fancy machine learning there.

Steve Hsu 42:58
In the last ten years it's gotten sophisticated: machine learning, genotyping of chickens, artificial insemination, modeling of the traits using ML. Cow breeding these days is done with ML.

First Mover Advantage

Dwarkesh Patel 43:18
I had no idea. That's super interesting. So, you mentioned that you're accumulating data and improving your techniques over time. Is there a first mover advantage to a genomic prediction company like this?
Or is it whoever has the newest, best algorithm for going through the biobank data?

Steve Hsu 44:16
That's another super question. For the entrepreneurs in your audience, if you ask what the valuation of GPB should be (which is how the venture guys would want me to answer the question), in the short run there is a huge first mover advantage, because the channel relationships between us and the clinics matter. Nobody who comes later will be able to get in there very easily, because we're developing trust and an extensive track record with clinics worldwide, and we're well known. Could 23andMe, or some company with a huge amount of data, blow us away and build better predictors if they got better AI/ML people working on this, because they have much more data than we do? Possibly, yes. Now, we have had core expertise in doing this work for years, so we're just good at it. Even though we don't have as much data as 23andMe, our predictors might still be better than theirs. I'm out there all the time working with biobanks all around the world. I don't want to say all the names, but I'm trying to get my hands on as much data as possible. But there may not be a lasting advantage beyond the actual business channel connections to that particular market. There may not be a defensible, purely scientific moat around the company. We do have patents on specific technologies, like how to do the genotyping or error correction on the embryo DNA. But this general question of who will best predict human traits from DNA? It's unclear who's going to be the winner in that race. Maybe it'll be the Chinese government in 50 years. Who knows?

Dwarkesh Patel 46:13
Yeah, that's interesting. If you think about a company like Google: theoretically, it's possible that you could come up with a better algorithm than PageRank and beat them. But it seems like the engineers at Google are going to come up with whatever edge case or improvement is possible first.

Steve Hsu 46:28
That's exactly what I would say. PageRank is deprecated by now. But even if somebody else comes up with a somewhat better algorithm, or has a little bit more data, if you have a team that's been doing this for a long time and you're focused and good, it's still tough to beat you, especially if you have a lead in the market.

Dwarkesh Patel 46:50
So, are you guys doing the actual biopsy? Or do they upload the genome and you're the one processing it and giving recommendations? Is it an API call, basically?

Steve Hsu 47:03
Great, I love your question. It is totally standard: every good IVF clinic in the world regularly takes embryo biopsies, and there's a lab tech doing that. Then they take the little sample, put it on ice, and ship it. DNA as a molecule is exceptionally robust and stable. My other startup solves crimes that are 100 years old from DNA we get from, say, a semen stain on a rape victim's or a serial killer victim's bra strap. We've done stuff like that.

Dwarkesh Patel 47:41
Jack the Ripper: when are we going to solve that mystery?

Steve Hsu 47:44
If they can give me samples, we can get into that. For example, we just learned that you can recover DNA pretty well if someone licks a stamp and puts it on their correspondence. If you can do Neanderthals, you can do a lot to solve crimes.
In the IVF workflow, our lab, which is in New Jersey, can service every clinic in the world, because they take the biopsy, put it in a standard shipping container, and send it to us. We're actually genotyping the DNA in our lab, but we've trained a few of the bigger clinics to do the genotyping on their own site. At that point, they upload some data into the cloud and get back results from our platform. Eventually it's going to be the whole world: for every human who wants their kid to be healthy and to get the best start they can, that data is going to come up to us and the report is going to come back down to their IVF physician.

Dwarkesh Patel 48:46
Which is great, because if there's a potential that this technology gets regulated in some way, you could go to Mexico or something, have them upload the genome (you don't care where it's uploaded from), and then get the recommendations there.

Steve Hsu 49:05
I think we're going to evolve to the point where we're out of the wet part of this business and only in the cloud-and-bits part of this business. No matter where the clinic is, they're going to have a sequencer, which is about *this* big, and their tech is going to quickly upload the data and retrieve the report for the physician three seconds later. Then the parents are going to look at it on their phones or whatever. We're basically there with some clinics. It's going to be tough to regulate because it's just bits. Suppose you're in some repressive, terrible country that doesn't allow you to select for some special traits that people are nervous about. You can upload the data to some vendor in Singapore or some free country, and they give you the report back. It doesn't have to be us; we don't do the edgy stuff. We only do the health-related stuff right now. But if you want to know how tall this embryo is going to be... I'll tell you a mind-blower! When you do face recognition in AI, you're mapping someone's face into a parameter space on the order of hundreds of parameters, and each of those parameters is super heritable. In other words, if I take two identical twins and photograph them, and the algorithm gives me the value of each parameter for twin one and twin two, they're very close. That's why I can't tell the two twins apart, and why face recognition can only tell them apart if it's a really good system. You can conclude that almost all of these parameters are nearly identical for those twins, so they're highly heritable. We're going to get to the point soon where I can do the inverse problem: I have your DNA, I predict each of those parameters in the face recognition algorithm, and then I reconstruct the face. I'll be able to say that when this embryo is 16, this is what she will look like; when she's 32, this is what she's going to look like. I'll be able to do that, for sure. It's only an AI/ML problem right now, but the basic biology is clearly going to work. So then you're going to be able to say, "Here's the report. Embryo four is so cute." Now, we don't do that, and we won't do that, but it will be possible.

Dwarkesh Patel 51:37
Before we get married, you'll want to see what their genotype implies about their face and longevity. It's interesting: you hear stories about these cartel leaders who get plastic surgery or something to evade the law. You could have a check where you look at a lab result and see whether it matches the face you would have had when they caught you on tape five years ago.

Steve Hsu 52:02
This is a little bit back to old-school Gattaca, but you don't even need the face!
You can just take a few molecules of skin cells, phenotype them, and know exactly who they are. I've had conversations with these spooky intel folks. They're very interested in: "If some Russian diplomat comes in and we think he's a spy, but he's with the embassy, and he has a coffee with me, and I save the cup and send it to my buddy at Langley, can we figure out who this guy is? And that he has a daughter who goes to Choate?" We can do all that now.

Dwarkesh Patel 52:49
If that's true, then in the future world leaders will not want to eat or drink anything. They'll be wearing hazmat suits to make sure they don't lose a hair follicle.

Steve Hsu 53:04
The next time Pelosi goes, she'll be in a spacesuit if she cares. Or the other thing is, they'll just give up. They'll just say, "Yeah, my DNA is everywhere. If I'm a public figure, I can't keep track of my DNA. It's all over."

Dwarkesh Patel 53:17
But the thing is, there's so much speculation that Putin might have cancer or something. If we have his DNA, we could see that his probability of having cancer at age 70, or whatever he is, is 85%. So yeah, that'd be a pretty well-verified rumor. That would be interesting.

Steve Hsu 53:33
I don't think that would be very definitive. I don't think we'll reach the point where you can say Putin has cancer because of his DNA, something I could have known when he was an embryo. It won't reach that level. But we could say he is at high risk for a certain type of cancer.

Genomics in dating

Dwarkesh Patel 53:49
In 50 or 100 years, if the majority of the population is doing this, and the highly heritable diseases get pruned out of the population, does that mean we'll only be left with lifestyle diseases? So you won't get breast cancer anymore, but you will still get fat, or get lung cancer from smoking?

Steve Hsu 54:18
It's hard to discuss the asymptotic limit of what will happen here. I'm not very confident about making predictions like that. It could get to the point where everybody who's rich, or who has been through this stuff for a while (especially if we get the editing working), is at super low risk for all of the top 20 killer diseases that have the most life-expectancy impact. Maybe those people live to be 300 years old naturally. I don't think that's excluded at all; it's within the realm of possibility. But it's going to happen for a few lucky people like Elon Musk before it happens for schlubs like you and me. There are going to be very angry inequality protesters when the models predict that the Trump grandchildren will live to be 200 years old. People are not going to be happy about that.

Dwarkesh Patel 55:23
So interesting. One way to think about these embryos is that if you're producing multiple embryos and you get to select one of them, each of them is like a call option, right? Therefore, you probably want to optimize for volatility as much as, or even more than, the expected value of the trait. So I'm wondering if there are mechanisms where you can increase the variance in meiosis or some other process, so you get higher variance and can select from the tail better.

Steve Hsu 55:55
Well, I'll tell you something related, which is quite amusing. I talked with some pretty senior people at the company that owns all the dating apps. You can look up what company this is, but they own Tinder and Match. They're kind of interested in perhaps including a special feature where you upload your genome, something like Tinder Gold / Premium.
And when you match, you could see how well you match the other person based on your genome. One person told me something shocking: guys lie about their height on these apps.

Dwarkesh Patel 56:41
I'm shocked, truly shocked, hahaha.

Steve Hsu 56:45
Suppose you could have a DNA-verified height. It would prevent gross distortions: if someone claims they're 6'2" and they're 5'9", the DNA could say that's unlikely. But the application to what you were discussing is more like this. Let's suppose we're selecting on intelligence or something, and suppose the regions where your girlfriend has all the plus variants are complementary to the regions where you have your plus variants. We could model that and say: because of the complementarity structure of your genomes in the regions that affect intelligence, you're very likely to have some super intelligent kids, way above the mean of your and your girlfriend's values. So you could say things like, "It's better for you to marry that girl than this other one, as long as you go through embryo selection and we can throw out the bad outliers." All of that is technically feasible. It's true that one of the earliest patent applications... they'll deny it now. What's her name? Gosh, the CEO of 23andMe... Wojcicki, yeah. She'll deny it now. But if you look in the patent database, one of the very earliest patents that 23andMe filed when they were still a tiny startup was about precisely this: advising parents about mate choice and how their kids would turn out. We don't even go that far at GP; we don't even talk about stuff like that. But they were thinking about it when they founded 23andMe.

Dwarkesh Patel 58:38
That is unbelievably interesting. By the way, this just occurred to me: height is supposed to be highly heritable, yet many people in Asian countries have the experience of having grandparents who are much shorter than they are, and parents who are shorter than they are, which suggests that the environment, malnutrition or something, has a big part to play in it. So how do you square the fact that our parents are often shorter than us with the idea that height is supposed to be super heritable?

Steve Hsu 59:09
Another great observation. The correct scientific statement is that we can predict height for people who will be born and raised in a favorable environment. In other words, if you live close to a McDonald's and you're able to afford all the food you want, then the height phenotype becomes super heritable, because the environmental variation doesn't matter very much. But you and I both know that people are much smaller if we go back to where our ancestors came from, and if you look at how much food, calories, protein, and calcium they ate, it's different from what I ate and what you ate growing up. So we're never saying the environmental effects are zero. We're saying that for people raised in a particularly favorable environment, there's maybe a genetic cap on what can be achieved, and we can predict that. In fact, we have data from Asia where you can see much bigger environmental effects. Age effects: older people, at a fixed polygenic score for the trait, are much shorter than younger people.

Ancestral populations

Dwarkesh Patel 1:00:31
Oh, okay. Interesting. That raises the next question I was about to ask: how applicable are these scores across different ancestral populations?

Steve Hsu 1:00:44
A huge problem is that most of the data is from Europeans.
What happens is that if you train a predictor on this ancestry group and go to a more distant ancestry group, there's a fall-off in prediction quality. Again, this is a frontier question, so we don't know the answer for sure. But many people believe that there's a particular correlational structure in each population, where if I know the state of this SNP, I can predict the state of the neighboring SNPs, and that structure is a product of the group's mating patterns and ancestry. Sometimes the predictor, which is just using statistical power to figure things out, will grab one of these SNPs as a tag for the truly causal SNP. It doesn't know which one is genuinely causal; it's just grabbing a tag. But the tagging quality falls off if you go to another population (e.g., this was a very good tag for the truly causal SNP in the British population, but it's not as good a tag in the South Asian population for the truly causal SNP, which we hypothesize is the same). It may be the same underlying genetic architecture in these different ancestry groups; we don't know that, it's a hypothesis. But even so, the tagging quality falls off. So my group spent a lot of our time looking at the performance of a predictor trained on population A when applied to a distant population B, and modeling it to test hypotheses about whether it's just the tagging decay that's responsible for most of the fall-off. All of this is an area of active investigation. It'll probably be solved in five years or so; the first big biobanks that are non-European are coming online, and we're going to solve it in a number of years.

Dwarkesh Patel 1:02:38
Oh, what does the solution look like? Unless you can identify the causal mechanism by which each SNP has an effect, how can you know whether something is a tag or the actual underlying switch?

Steve Hsu 1:02:54
The nature of reality will determine how this goes. We don't truly know whether the underlying biology is the same. This is an amazing thing: people argue about human biodiversity and all this stuff, and we don't even know whether the specific mechanisms that predispose you to be tall or to have heart disease are the same across these different ancestry groups. We assume that they are, but we don't know that. As we get further away, to Neanderthals or Homo erectus, you might see that they have a slightly different architecture than we do. But let's assume that the causal structure is the same for South Asians and British people. Then it's a matter of improving the tags. What do I mean by improving the tags if I don't know which SNP is causal? This is a machine learning problem. If there's a SNP which always comes up as very significant when I use it across multiple ancestry groups, maybe that one is causal. As I vary the tagging correlations in the neighborhood of that SNP, I keep finding that it sits in the intersection of all these different sets, which makes me think it's going to be causal. That's a process we're engaged in now. Again, it's just a machine learning problem. But we need data; that's the main issue.

Dwarkesh Patel 1:04:32
I was hoping that wouldn't be possible, because one way this research might go is that it becomes taboo, or causes other sorts of bad social consequences, if you can definitively show that on certain traits there are differences between ancestral populations, right?
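As a toy illustration of the tagging decay Hsu describes above (a sketch under assumed parameters, not the actual biobank analysis his group runs), the following Python snippet simulates one causal variant plus a tag variant whose correlation with it is strong in the training population and weaker in a second population, then shows how a predictor fit on the tag loses accuracy out of population:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, r_tag, beta=1.0, noise=1.0):
    """Toy population: one causal variant, one tag correlated with it by ~r_tag,
    and a phenotype driven only by the causal variant (all values standardized)."""
    causal = rng.standard_normal(n)
    tag = r_tag * causal + np.sqrt(1 - r_tag**2) * rng.standard_normal(n)
    y = beta * causal + noise * rng.standard_normal(n)
    return tag, y

# Population A (training): strong tagging. Population B: weaker tagging, but the
# same underlying causal effect -- the hypothesis discussed in the transcript.
tag_A, y_A = simulate(200_000, r_tag=0.9)
tag_B, y_B = simulate(200_000, r_tag=0.5)

# "Predictor" = a simple regression of phenotype on the tag, fit in population A.
w = np.cov(tag_A, y_A)[0, 1] / tag_A.var()

def r_squared(tag, y):
    return np.corrcoef(w * tag, y)[0, 1] ** 2

print(f"R^2 in population A: {r_squared(tag_A, y_A):.3f}")
print(f"R^2 in population B: {r_squared(tag_B, y_B):.3f}")
# The drop roughly tracks the ratio of squared tagging correlations
# (0.5^2 / 0.9^2), which is the flavor of "tagging decay" being tested.
```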
So, I was hoping that maybe there was an escape hatch, where we couldn't say anything definitive because they're just tags, and the tags might be different between different ancestral populations. But with machine learning, we'll know.

Steve Hsu 1:04:59
That's the situation we're in now, where you have to do some fancy analysis if you want to claim that Italians have lower height potential than Nordics, which is possible. There's been a ton of research about this because there are signals of selection: the alleles that are activated in height predictors look like they've been under some selection between Northern and Southern Europe over the last 5000 years, for whatever reason. This is still debated by people who study molecular evolution. But suppose it's true, okay? That would mean that when we finally get to the bottom of it and find all the causal loci for height, the average value for Italians is lower than for people living in Stockholm. That might be true. People don't get that excited about that; they get a little bit excited about height. But they would get really excited if this were true for some other traits, right? Suppose the causal variants affecting your level of extraversion differ systematically, so that the weighted average of those states is different in Japan versus Sicily. People might freak out over that. I'm supposed to say that's obviously not true. How could it possibly be true? There hasn't been enough evolutionary time for those differences to arise. But is it really plausible that, despite what looks to be the case for height over the last 5000 years in Europe, no other trait could have been differentially selected over the same period? That's the dangerous thing. The few people who understand this field well enough to follow what you and I just discussed are sometimes so alarmed by it that they're just trying to suppress everything. And most people don't follow it at this technical level, so they're instinctively negative about it without understanding it very well.

Dwarkesh Patel 1:07:19
That's good to hear. You see this pattern where, by the time somebody might want to regulate or in some way interfere with some technology or some information, it has already achieved wide adoption. You could argue that's the case with crypto today. But if it's true that a bunch of IVF clinics worldwide are using these scores to do selection and other things, then by the time people realize the implications of this data for other kinds of social questions, it will already be an existing consumer technology.

Is this eugenics?

Steve Hsu 1:07:58
That's true, and the main outcry will come if it turns out that there are massive gains to be had and only the billionaires are getting them. But that might have the consequence of causing countries to make this a free part of their national health care systems. Denmark and Israel pay for IVF; for infertile couples, it's part of their national health care system, and they're pretty aggressive about genetic testing. In Denmark, one in ten babies is born through IVF. It's not clear how it will go. But we're in for some fun times, there's no doubt about that.

Dwarkesh Patel 1:08:45
Well, one way it could go is that some countries decide to ban it altogether. Another way is that countries decide to give everybody free access to it. If you had to choose between the two, you would want the second one, which would be the hope.
Maybe only those two are compatible with people's moral intuitions about this stuff.

Steve Hsu 1:09:10
It's very funny, because most wokeist people today hate this stuff. But most progressives of the early 20th century, like Margaret Sanger, or any of the progressive intellectual forebears of today's wokeists, were all what we would today call eugenicists. They thought, "Thanks to Darwin, we now know how this all works. We should take steps to keep society healthy" (not in a negative way where we kill people we don't like, but in the sense of helping society do healthy things when people reproduce and have healthy kids). Now this whole thing has just been flipped over among progressives.

Dwarkesh Patel 1:09:52
Even in India, less than 50 years ago, Indira Gandhi, who is on the left side of India's political spectrum, was infamous for these forced sterilization programs. Somebody made an interesting comment about this when they were asked, "Is it true that history always tilts towards progressives? And if so, isn't everybody else doomed? Aren't their views doomed?" The person made a fascinating point: whatever we consider left at the time tends to be winning, but what counts as left has changed a lot over time, right? In the early 20th century, prohibition was a left cause, a progressive cause, and that changed, and now the opposite is the left cause. Now, legalizing pot is progressive. Exactly. So, if Conquest's second law is true and everything tilts left over time, just change what "left" is, right? That's the solution.

Steve Hsu 1:10:59
No one can demand that any of these woke guys be intellectually self-consistent, or even say the same things from one year to the next. But one could wonder what they think about the literally Communist Chinese, who are recycling huge parts of their GDP to help the poor and so on. Medicine is free, education is free, right? They're clearly socialists, literally communists. But in Chinese, the characters for eugenics mean something positive: healthy production. More or less the whole viewpoint on all this stuff is 180 degrees off in East Asia compared to here, even among the literal communists. So go figure.

Dwarkesh Patel 1:11:55
Yeah, very based. So let's talk about one of the traits that people might be interested in potentially selecting for: intelligence. What is the potential for us to acquire the data to correlate genotype with intelligence?

Steve Hsu 1:12:15
Well, that's the most personally frustrating aspect of all of this. If you'd asked me ten years ago, when I started doing this stuff, how it was going to go, everything has come out on the optimistic side of what I would have predicted. So everything's good. It didn't turn out to be dominated by nonlinear interactions, and it didn't turn out to be hopelessly pleiotropic. All these good things, which nobody could have known a priori, turned out to be good for the gene engineers of the 21st century. The one frustrating thing is that, because of crazy wokeism and fear of crazy wokeists, the most interesting phenotype of all is lagging behind...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Post-mortem?, published by gwern on August 5, 2022 on LessWrong. As Covid is for the most part over and tests have not been in short supply for a very long time now, eliminating most of the rationale for complicated testing procedures meant to optimize test count/cost, it would be interesting for anyone who was involved in this to do a retrospective or post-mortem. Was pool testing useful? Why or why not? Can it be improved for future pandemics and should we care about it? My impression, not having paid much attention to the details at the time (AI scaling was more important), is that various group/pool testing approaches were useful in a few niches, but were rendered useless in many places due to politics/organization/social flaws, and didn't apply to most places. I read that use of pool testing was common in China early on in 2020 as part of the hotspot strategy to economize on testing extremely large groups of people where only a few would have active Covid. This is the most logical place to apply it, and it was a major government & societal priority, so that is a success. I get the impression that they were able to switch to individual testing fairly early, in order to get faster turnaround and do more localized testing. But to the extent that pool testing was effective from the start, before testing manufacturing could scale up, and helped squash Covid when it was still squashable, then it was very valuable even if obsoleted. Of course, this evaluation can change depending on how Covid ultimately shakes out in China. Was it part of a heavy-handed but ultimately highly effective Covid Zero policy which led to it turning into a hermit country enduring devastating economic fallout while merely postponing the inevitable variant which could defeat Covid Zero? In which case, pool testing was a tool which played a part in buying China entire years to vaccinate but its good was rendered useless when they bizarrely failed to make any good use of the time, dabbling in ineffective vaccines etc, and in practice, only doubling down repeatedly on coercive testing and surveillance until the martingale blew up in their face. Or was it part of a visionary resolute effort which spared China the mass death of the rest of the world as it came out the other side by persisting so long in country-wide lockdown until Covid just sorta stopped being a problem somehow? Then its good was unalloyed and it's sad so few other countries could boast a better record (eg Taiwan; I dunno if they made any use of it). Sewage/water testing in places like colleges: various universities or towns made heavy use of what is in effect pool testing to do mass screening of water to detect infections and then hotspot them. This was only useful as long as you have a baseline of ~0, otherwise testing water merely gives you a proxy for overall infections, which is a lot less useful than using it to detect & squash hotspots. In individual cases of orgs with great discipline, this apparently worked well, and avoided the need for constant individual testing (which was expensive, not possible early on, and still inadequate - as all the infections inside 'bubbles' show, just too much leakage, test error, incubation, idiosyncrasies and so on). 
Since most of these places were in countries where Covid became endemic eventually anyway, and everyone got infected, and in cases like colleges the young students were the last to get vaccinated (because they are the least harmed) and so needed to buy a lot of time to avoid infection before vaccination (which probably didn't happen for most of them), these success stories seem to boil down to just buying time, which is a good deal less impressive. At the margin, doesn't seem like improvements to pool testing would have helped much compared to other things like faster vaccination ...
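For readers unfamiliar with why pooling economizes on tests, here is a short Python sketch of the classic two-stage (Dorfman-style) scheme the post has in mind: pool k samples, run one test, and retest individually only if the pool is positive. The prevalence values are illustrative assumptions, not figures from the post.

```python
import numpy as np

def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person under two-stage (Dorfman) pooling:
    one pooled test, plus pool_size individual retests if the pool is positive."""
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    return 1 / pool_size + p_pool_positive

for prevalence in (0.001, 0.01, 0.05, 0.2):
    sizes = np.arange(2, 101)
    costs = np.array([expected_tests_per_person(prevalence, int(k)) for k in sizes])
    best = int(sizes[costs.argmin()])
    print(f"prevalence {prevalence:>5}: best pool size {best:>3}, "
          f"~{costs.min():.2f} tests per person (vs. 1.00 testing individually)")
```

The savings are dramatic when almost everyone is negative and nearly vanish at high prevalence, which is why pooling fits the early-2020 hotspot mass-screening niche described above and stops mattering once infections (or cheap individual tests) are everywhere.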
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Using the “executive summary” style: writing that respects your reader's time, published by deep on July 22, 2022 on The Effective Altruism Forum. Summary EA researchers would benefit from adopting communication norms from the policy world, which emphasize clarity and ease of reading. As an AI governance researcher, I've gotten a lot of value from using these practices – they help me clarify my ideas and get useful feedback. I appreciate it when colleagues use these practices, too. By contrast, it can be difficult to read and parse some important EA documents, which are written as long blog posts without accessible summaries. Three best practices to adopt: 1. Put your key points up front. 2. Use bullet points or numbered lists. 3. Use headings and bolding to make your document easy to skim. I suggest practicing by writing a summary of an important but not-maximally-accessible piece of work. For an example, see my summary of Gwern's Scaling Hypothesis post at the end of this post. I recommend Holden Karnofsky's blog posts, particularly this summary of his blog post series on the “most important century”, as good examples of accessible writing on complex topics. Policy-sphere writing optimizes for clarity and ease of reading – two valuable attributes that EA docs sometimes lack EA researchers would benefit from adopting communication norms from the policy world, which emphasize clarity and ease of reading. Policy briefs are written for people with little time who need to make decisions quickly, and I find them easy and pleasant to read because they respect my time as a reader. Longtermists I know who've been exposed to these norms generally find them valuable, and want them to be more widespread within EA. In my experience, adopting these norms takes some effort, but is extremely worthwhile on a personal level. It forces me to clarify my thinking – which is actually a really big deal! When I started out as a researcher, I tried to write things in academic-style prose, and I think this made it harder for me to think things through properly. Writing docs this way means it's easier to get feedback. And the feedback I get is probably more useful–for example, more focused on core ideas, since it's clearer what those are. It makes my docs more useful as a reference. At least, that's true for me, and I'm guessing it's true for my readers as well. When I'm coming back to my doc a month later and trying to remember what I think about topic X, having used executive summary style makes it much easier to get back up to speed. It's actually fun to check back in on my rough ideas and works in progress, instead of stressful. I think following these norms is also very good for your readers, both in and out of the EA sphere. In the EA sphere: Following these norms saves readers valuable time. Although our community tends to select for people who enjoy reading long, dry online texts, not all EAs are like this, and even the ones who are have high opportunity costs. Writing a piece with a good summary and skimmable structure makes it much easier for your readers to make informed decisions about how much to read, what sections to prioritize, and in what depth. Whatever they do end up reading, it'll be much easier for them to quickly extract useful insights. 
Outside of the EA sphere: Following these norms could also make external-facing EA writeups more accessible to policymakers, executives, and other senior decision-makers who may lack context on our ideas and the time to get up to speed on them. (Frankly, it also helps policy reports look more legitimate, although I'm guessing most EAs interested in influencing policy have some professional experience and so are already familiar with this style.) Concrete recommendations Three core tips A useful starting point is to adopt the three follo...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Pitching an Alignment Softball, published by mu (negative) on June 7, 2022 on LessWrong. There was a recent contest that got promoted on lesswrong offering pieces of a 20K prize for one-liners and short-forms for convincing folks that AGI alignment is important. I'm too late for the money, but this is my (thousand times longer than requested) contribution. Background Eliezer recently wrote "AGI Ruin: A List of Lethalities" which included a fairly sci-fi scenario about how an unaligned AGI might go about killing everyone. I noticed that I found this both novel and poorly framed. I found it novel because alignment discussions I've read are usually exceptionally handwavy about how an unaligned AGI would attack. So I appreciated that Eliezer went a little deeper into it. However, I found his specific example difficult to connect with. He suggests that an AGI could surreptitiously create biologically manufactured nanobots that infect and kill the human race in the same instant so we can't fight back. By my casual read of alignment discussions, this is about the norm. Nanotech, bioweapons and engineered plagues are popular go-to's for an attack vector. It is my impression (perhaps wrong!) that the community uses this kind of example intentionally, if subconsciously, to prove a point. The implication seems to be that how an AGI attacks doesn't matter, because we can't possibly predict it - and by extension, that we should spend all of our brain cycles trying to figure out how to align it instead. I don't think this is wrong, but I think it is counterproductive in many contexts. Problem Statement It is probably true that we can't predict how an AGI would attack. Or that if we could, it would just pivot to attacking on a different superintelligent vector faster than we could keep up. But getting people to care about alignment is necessarily a function of how well you can convince them that AGI doomsday is a realistic scenario. If you don't have background and interest in alignment, the standard pitch for AGI alignment is like trying to catch a fastball: "Hey! A Marvel Universe-style doomsday weapon you don't understand anything about is being created right now, and when it jailbreaks out of its lab it will kill the entire human race in one simultaneous second!" I could be credibly accused of strawmanning that example. Eliezer's piece was not aimed at casual readers, and he even took care to explicitly say so. But I think the list of people to whom AGI doomsday sounds at least approximately like this includes almost every person in world government, because they're busy and real things are happening and this is nerd shit. Anyone who cares about alignment will have to go much further toward the audience than the audience is going to come towards them. Approach Gwern recently wrote a great story: "It Looks Like You're Trying To Take Over The World". I don't want to give away the plot, because you should really read it. But briefly, an AGI gets loose on the internet, bootstraps, and takes control of our communications. I think that's all you need to get to doomsday, and that is approximately how it should be presented to laypeople. Gwern's story is forceful on making the technical details feel real, and only touches on how an AGI could disrupt us socially and technologically. But there are some great nuggets. 
(The researcher was going to follow up on some loose ends from the paper, but he's been distracted by the bird site. He can't believe how outrageously stupid some replies can be from such otherwise smart-seeming people; how can they be so wrong online about such obvious truths as the need for the USA to intervene in Portugal‽ Even his husband thinks they may have a point—et tu? Hardly has he dashed off a crushing reply than the little alert bubble pops up. All thought (of wo...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Who models the models that model models? An exploration of GPT-3's in-context model fitting ability, published by Lovre on June 7, 2022 on The AI Alignment Forum. Introduction Much has been written and much has been observed about the abilities of GPT-3 on many tasks. Most of these capabilities, though not all, pertain to writing convincing text, but, not to undermine GPT-3's impressiveness at performing these tasks, we might call this the predictable part of its oeuvre. It only makes sense that better language modelling is, well, going to be better at writing text. Deeper and comparatively much less explored is the unpredictable ability of GPT-3 to learn new tasks by just seeing a few examples, without any training/backpropagation – the so-called in-context learning (sometimes called metalearning). The original paper announcing GPT-3 contained a handful of examples (perhaps most notably examples of GPT-3 learning to perform arithmetic, e.g. accurate addition of up to 5-digit numbers), Gwern has also insightfully written about it, Frieda Rong has performed some interesting experiments, and there have been various other experiments one could chance upon. My curiosity being piqued but not sated by these experiments, and also having had the feeling that, as captivating as the arithmetic examples were, they weren't the most natural question one could ask about a stochastic model's quantitative capabilities – I decided to investigate whether GPT-3 could fit numerical models in-context. The Setup What does it mean to fit a model in-context? Well, recall that GPT-3 has a context window (roughly speaking: text it considers before generating additional text) of length 2048 tokens (roughly speaking: (parts of the) words, punctuation, etc.), so the idea is to put feature vectors inside that context window. Of course, this means you cannot fit any larger or higher-dimensional dataset in there.[1] In practice this means prompting GPT-3 on input like:
Input: 89, 51, 73, 31, output = 1
[...]
Input: 96, 51, 80, 38, output = 2
Input: 90, 37, 76, 27, output =
And then taking its output, i.e. the text it generates, as the prediction. A couple more technical details: In all the experiments I performed, all the numbers were integers. GPT-3's outputs were always sampled with temperature 0, i.e. they were deterministic – only the most probable output was considered. I restricted myself that way for simplicity, but hopefully some future work looks at the full distribution over outputs. (In the rest of this post I share GPT-3's results without too much warning, so if you'd like to make your own predictions about how GPT-3 does at simple classification and regression tasks in-context, now would be the time to pause reading and make your general predictions.) Iris Dataset Just as there is the MNIST dataset for visual recognition, so too there is a small low-dimensional classification dataset which every would-be classifier has to solve or else sink – the Iris dataset, composed of 150 observations of sepal/petal height/width of three species of iris (50 observations each). 
Note that classification of the Iris dataset is in some sense trivial, with simple algorithms achieving near-perfect accuracy, but a model still needs to do some meaningful learning, given that it needs to learn to differentiate between three classes based on just a few dozen four-dimensional vectors. In order to prevent leakage, as the Iris dataset is all over the internet, as well as to get the more-easily-palatable integers out, I transformed every feature with the transformation x_new = round(14 * x_old + 6). I also split the dataset into 50% train and 50% test folds – so that the model "trained" on, or rather looked at, 75 examples. I hadn't quite been expecting what had followed – I had expected GPT-3 to catch on to some patt...
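As a rough sketch of the setup described above (the feature transformation and the "Input: ..., output =" prompt format are taken from the post, while the data handling details and variable names are my own assumptions), one might build the in-context prompt like this in Python:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load Iris and apply the integer transformation described in the post:
# x_new = round(14 * x_old + 6)
iris = load_iris()
X = np.round(14 * iris.data + 6).astype(int)
y = iris.target

# 50/50 train/test split, as in the post (so ~75 in-context examples).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

def make_prompt(examples, labels, query):
    """Few-shot prompt in the 'Input: ..., output = ...' format from the post."""
    lines = [f"Input: {', '.join(map(str, row))}, output = {label}"
             for row, label in zip(examples, labels)]
    lines.append(f"Input: {', '.join(map(str, query))}, output =")
    return "\n".join(lines)

prompt = make_prompt(X_train, y_train, X_test[0])
print(prompt[-200:])
# The prompt would then be sent to a text-completion model with temperature 0,
# and the generated token(s) read off as the predicted class.
```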
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: It Looks Like You're Trying To Take Over The World, published by gwern on March 9, 2022 on The AI Alignment Forum. This story was originally posted as a response to this thread. It might help to imagine a hard takeoff scenario using only known sorts of NN & scaling effects... In A.D. 20XX. Work was beginning. "How are you gentlemen !!"... (Work. Work never changes; work is always hell.) Specifically, a MoogleBook researcher has gotten a pull request from Reviewer #2 on his new paper in evolutionary search in auto-ML, for error bars on the auto-ML hyperparameter sensitivity like larger batch sizes, because more can be different and there's high variance in the old runs with a few anomalously high performance values. ("Really? Really? That's what you're worried about?") He can't see why worry, and wonders what sins he committed to deserve this asshole Chinese (given the Engrish) reviewer, as he wearily kicks off yet another HQU experiment... Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Capability Phase Transition Examples, published by gwern on February 8, 2022 on The AI Alignment Forum. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How I'm thinking about GPT-N, published by delton137 on January 17, 2022 on LessWrong. There has been a lot of hand-wringing about accelerating AI progress within the AI safety community since OpenAI's publication of their GPT-3 and Scaling Laws papers. OpenAI's clear explication of scaling provides a justification for researchers to invest more in compute and provides a clear path forward for improving AI capabilities. Many in the AI safety community have rightly worried that this will lead to an arms race dynamic and faster timelines to AGI. At the same time there's also an argument that the resources being directed towards scaling transformers may have counter-factually been put towards other approaches (like reverse engineering the neocortex) that are more likely to lead to existentially dangerous AI. My own personal credence on transformers slowing the time to AGI is low, maybe 20%, but I think it's important to weigh in. There is also a growing concern within the AI safety community that simply scaling up GPT-3 by adding more data, weights, and training compute could lead to something existentially dangerous once a few other relatively simple components are added. I have not seen the idea that scaling transformers will lead to existentially dangerous AI (after combining with a few other simple bits) defended in detail anywhere, but it seems very much an idea "in the water" based on the few discussions with AI safety researchers I have been privy to. It has been alluded to in various places online also: Connor Leahy has said that a sufficiently large transformer model could serve as a powerful world model for an otherwise dumb and simple reinforcement learning agent, allowing it to rapidly learn how to do dangerous things in the world. For the record, I think this general argument is a super important point and something we should worry about, even though in this post I'll mainly be presenting reasons for skepticism. Gwern is perhaps the most well-known promoter of scaling being something we should worry about. He says "The scaling hypothesis regards the blessings of scale as the secret of AGI: intelligence is 'just' simple neural units & learning algorithms applied to diverse experiences at a (currently) unreachable scale." Observe the title of Alignment Newsletter #156: "The scaling hypothesis: a plan for building AGI". Note: I'm not sure what Rohin Shah's views are exactly, but from what I read they are pretty nuanced. Zac Hatfield-Dodds (who later went on to do AI Safety work at Anthropic) commented on LessWrong 16 July 2021: "Now it looks like prosaic alignment might be the only kind we get, and the deadline might be very early indeed." lennart: "The strong scaling hypothesis is stating that we only need to scale a specific architecture, to achieve transformative or superhuman capabilities — this architecture might already be available." MIRI is famously secretive about what they are doing, but they've been pretty public that they've made a shift towards transformer alignment as a result of OpenAI's work. Eliezer Yudkowsky told me he thinks GPT-N plus "a few other things" could lead to existentially dangerous AI (personal communication that I believe is consistent with his public views as they were expressed recently in the published MIRI conversations). 
I do think a GPT-N model or a close cousin could be a component of an existentially dangerous AI. A vision transformer could serve a role analogous to the visual cortex in humans. A GPT-type model trained on language might even make a good "System 1" for language, although I'm a little less certain about that. So it definitely makes sense to focus a substantial amount of resources on transformer alignment when thinking about how to reduce AI x-risk. While I've seen a lot of posts making the bullish case o...
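For context on the "scaling laws" being referenced here: they are empirical power-law fits of loss against model size, data, and training compute. The sketch below only illustrates the qualitative shape; the constant and exponent are made-up placeholders, not the published values.

```python
def toy_scaling_loss(compute, scale=10.0, exponent=0.05):
    """Toy scaling-law curve: loss falls as a smooth power law in training compute.
    scale and exponent are illustrative placeholders, not fitted constants."""
    return scale * compute ** (-exponent)

for c in (1e3, 1e6, 1e9, 1e12):  # arbitrary compute units
    print(f"compute {c:.0e}: toy loss {toy_scaling_loss(c):.2f}")

# The scaling hypothesis is the claim that this smooth, predictable decrease
# keeps paying off as compute grows, rather than hitting an obvious wall.
```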
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A LessWrong Crypto Autopsy, published by Scott Alexander on LessWrong. Wei Dai, one of the first people Satoshi Nakamoto contacted about Bitcoin, was a frequent Less Wrong contributor. So was Hal Finney, the first person besides Satoshi to make a Bitcoin transaction. The first mention of Bitcoin on Less Wrong, a post called Making Money With Bitcoin, was in early 2011 - when it was worth 91 cents. Gwern predicted that it could someday be worth "upwards of $10,000 a bitcoin". He also quoted Moldbug, who advised that: If Bitcoin becomes the new global monetary system, one bitcoin purchased today (for 90 cents, last time I checked) will make you a very wealthy individual...Even if the probability of Bitcoin succeeding is epsilon, a million to one, it's still worthwhile for anyone to buy at least a few bitcoins now...I would not put it at a million to one, though, so I recommend that you go out and buy a few bitcoins if you have the technical chops. My financial advice is to not buy more than ten, which should be F-U money if Bitcoin wins. A few people brought up some other points, like that if it ever became popular people might create a bunch of other cryptocurrencies, or that if there was too much controversy the Bitcoin economy might have to fork. The thread got a hundred or so comments before dying down. But Bitcoin kept getting mentioned on Less Wrong over the next few years. It's hard to select highlights, but one of them is surely Ander's Why You Should Consider Buying Bitcoin Right Now If You Have High Risk Tolerance from January 2015. Again, people made basically the correct points and the correct predictions, and the thread got about a hundred comments before dying down. I mention all this because of an idea, with a long history in this movement, that "rationalists should win". They should be able to use their training in critical thinking to recognize more opportunities, make better choices, and end up with more of whatever they want. So far it's been controversial to what degree we've lived up to that hope, or to what degree it's even realistic. Well, suppose God had decided, out of some sympathy for our project, to make winning as easy as possible for rationalists. He might have created the biggest investment opportunity of the century, and made it visible only to libertarian programmers willing to dabble in crazy ideas. And then He might have made sure that all of the earliest adopters were Less Wrong regulars, just to make things extra obvious. This was the easiest test case of our "make good choices" ability that we could possibly have gotten, the one where a multiply-your-money-by-a-thousand-times opportunity basically fell out of the sky and hit our community on its collective head. So how did we do? I would say we did mediocre. According to the recent SSC survey, 9% of SSC readers made $1000+ from crypto as of 12/2017. Among people who were referred to SSC from Less Wrong - my stand-in for long-time LW regulars - 15% made over $1000 on crypto, nearly twice as many. A full 3% of LWers made over $100K. That's pretty good. On the other hand, 97% of us - including me - didn't make over $100K. All we would have needed to do was invest $10 (or a few CPU cycles) back when people on LW started recommending it. But we didn't. How bad should we feel, and what should we learn? Here are the lessons I'm taking from this. 
1: Our epistemic rationality has probably gotten way ahead of our instrumental rationality. When I first saw the posts saying that cryptocurrency investments were a good idea, I agreed with them. I even Googled "how to get Bitcoin" and got a bunch of technical stuff that seemed like a lot of work. So I didn't do it. Back in 2016, my father asked me what this whole "cryptocurrency" thing was, and I told him he should invest in Ethereum. He did, ...
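To make the instrumental-rationality point concrete, here is a minimal sketch in Python of the expected-value argument quoted above; the $0.90 price, the $10,000 target, and the odds are the figures cited in the post, not a forecast:

```python
# Back-of-the-envelope expected value for the "buy a few bitcoins at 90 cents"
# argument quoted above. All numbers are the ones cited in the post.
price_2011 = 0.90        # USD per bitcoin in early 2011
target_price = 10_000    # Gwern's "upwards of $10,000 a bitcoin" scenario
p_success = 1 / 1_000    # far better odds than Moldbug's "a million to one" epsilon

expected_value = p_success * target_price   # ignore the ~$0.90 lost if it fails
print(f"EV per coin: ${expected_value:.2f} vs cost ${price_2011:.2f} "
      f"(~{expected_value / price_2011:.0f}x expected return)")
```

Even at odds of one in a thousand, the expected return is roughly elevenfold, which is why the post frames not buying as a failure of action rather than of analysis.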
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Literature Review For Academic Outsiders: What, How, and Why, published by namespace on LessWrong. This is a linkpost. A few years ago I wrote a comment on LessWrong about how most authors on the site probably don't know how to do a literature review: On the one hand, I too resent that LW is basically an insight porn factory near completely devoid of scholarship. On the other hand, this is not a useful comment. I can think of at least two things you could have done to make this a useful comment: Specified even a general direction of where you feel the body of economic literature could have been engaged. I know you might resent doing someone else's research for them if you're not already familiar with said body, but frankly the norm right now is to post webs spun from the fibrous extrusions of people's musing thoughts. The system equilibrium isn't going to change unless some effort is invested into moving it. Notice you could write your comment on most posts while only changing a few words. Provide advice on how one might go about engaging with ‘the body of economic literature’. Many people are intelligent and reasonably well informed, but not academics. Taking this as an excuse to mark them swamp creatures beyond assistance is both lazy and makes the world worse. You could even link to reasonably well written guides from someone else if you don't want to invest the effort (entirely understandable). I also linked a guide from Harvard's library (Garson & Lillvick, 2012) on how to do a literature review. But this guide makes extensive use of Flash video, which makes it increasingly hard to access the content. Even if Flash were alive and well, video is not necessarily the most comfortable format. Worse still, I remember feeling there was a great deal of tacit knowledge excluded from the guide which wouldn't be apparent to someone who isn't already familiar with academic culture. Even if the guide were a perfect representation of how to do an academic literature review, the priorities and types of work put together by LessWrong authors are more outsider science (Dance, 2008) than they are Harvard. For this reason I've had writing a guide to literature review aimed towards academic outsiders on my to-do list for a while. At the same time I'm not interested in reinventing the wheel. This guide is going to focus specifically on filling in the knowledge gaps I would expect from someone who has never set foot inside a college campus. The other aspects have been discussed in detail, and where they come up I'll link to external guides. What is a literature review? 'Literature review' the process is a way to become familiar with what work has already been done in a particular field or subject by searching for and studying previous work. A 'literature review' is a document (often a small portion of a larger work) which summarizes and analyzes the body of previous work that was encountered during literature review, often in the context of some new work that you're doing. Why do literature review? 
Literature reviews tend to come up in two major contexts: as a preliminary study to help contextualize a novel work, or as a work itself to summarize the state of a field or synthesize concepts to create new ideas. Most of my research falls into the latter category; I'm a big fan of putting together existing evidence and ideas to synthesize models (namespace, 2020). Gwern also tends to do work in this style (Branwen, 2020). I suspect that a lot of authors on LessWrong are attempting to do this, but fail to really say anything useful because they haven't figured out how to incorporate thorough evidence into their argument. When I did a r...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised, published by gwern on the AI Alignment Forum. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is a linkpost for "Mastering Atari Games with Limited Data", Ye et al 2021: Reinforcement learning has achieved great success in many applications. However, sample efficiency remains a key challenge, with prominent methods requiring millions (or even billions) of environment steps to train. Recently, there has been significant progress in sample efficient image-based RL algorithms; however, consistent human-level performance on the Atari game benchmark remains an elusive goal. We propose a sample efficient model-based visual RL algorithm built on MuZero, which we name EfficientZero. Our method achieves 190.4% mean human performance and 116.0% median performance on the Atari 100k benchmark with only two hours of real-time game experience and outperforms the state SAC in some tasks on the DMControl 100k benchmark. This is the first time an algorithm achieves super-human performance on Atari games with such little data. EfficientZero's performance is also close to DQN's performance at 200 million frames while we consume 500 times less data. EfficientZero's low sample complexity and high performance can bring RL closer to real-world applicability. We implement our algorithm in an easy-to-understand manner and it is available at this https URL. We hope it will accelerate the research of MCTS-based RL algorithms in the wider community. This work is supported by the Ministry of Science and Technology of the People's Republic of China, the 2030 Innovation Megaprojects “Program on New Generation Artificial Intelligence” (Grant No. 2021AAA0150000). Some have said that poor sample-efficiency on ALE has been a reason to downplay DRL progress or implications. The primary boost in EfficientZero (table 3), pushing it past the human benchmark, is some simple self-supervised learning (SimSiam on predicted vs actual observations). Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
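For readers curious what "SimSiam on predicted vs actual observations" means in practice, here is a minimal sketch of that kind of temporal-consistency loss. This is an illustration of the general idea, not code from the EfficientZero repository; `encoder`, `projector`, and `predictor` are assumed network modules:

```python
# Sketch of a SimSiam-style consistency loss between the latent state predicted
# by a MuZero-like dynamics model and the encoding of the actually observed next
# frame. Illustrative only; module names are assumptions, not EfficientZero's code.
import torch
import torch.nn.functional as F

def consistency_loss(predicted_latent, next_obs, encoder, projector, predictor):
    """Negative cosine similarity with a stop-gradient on the target branch."""
    with torch.no_grad():                               # stop-gradient (the SimSiam trick)
        target = projector(encoder(next_obs))           # latent of the real next observation
    online = predictor(projector(predicted_latent))     # latent from the dynamics model
    return -F.cosine_similarity(online, target, dim=-1).mean()
```

The appeal is that this self-supervised signal is dense, since every environment transition provides it, which is plausibly why the paper credits it with most of the sample-efficiency gain.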
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Developmental Stages of GPTs, published by orthonormal on the AI Alignment Forum. Epistemic Status: I only know as much as anyone else in my reference class (I build ML models, I can grok the GPT papers, and I don't work for OpenAI or a similar lab). But I think my thesis is original. Related: Gwern on GPT-3 For the last several years, I've gone around saying that I'm worried about transformative AI, an AI capable of making an Industrial Revolution sized impact (the concept is agnostic on whether it has to be AGI or self-improving), because I think we might be one or two cognitive breakthroughs away from building one. GPT-3 has made me move up my timelines, because it makes me think we might need zero more cognitive breakthroughs, just more refinement / efficiency / computing power: basically, GPT-6 or GPT-7 might do it. My reason for thinking this is comparing GPT-3 to GPT-2, and reflecting on what the differences say about the "missing pieces" for transformative AI. My Thesis: The difference between GPT-2 and GPT-3 has made me suspect that there's a legitimate comparison to be made between the scale of a network architecture like the GPTs, and some analogue of "developmental stages" of the resulting network. Furthermore, it's plausible to me that the functions needed to be a transformative AI are covered by a moderate number of such developmental stages, without requiring additional structure. Thus GPT-N would be a transformative AI, for some not-too-large N, and we need to redouble our efforts on ways to align such AIs. The thesis doesn't strongly imply that we'll reach transformative AI via GPT-N especially soon; I have wide uncertainty, even given the thesis, about how large we should expect N to be, and whether the scaling of training and of computation slows down progress before then. But it's also plausible to me now that the timeline is only a few years, and that no fundamentally different approach will succeed before then. And that scares me. Architecture and Scaling GPT, GPT-2, and GPT-3 use nearly the same architecture; each paper says as much, with a sentence or two about minor improvements to the individual transformers. Model size (and the amount of training computation) is really the only difference. GPT took 1 petaflop/s-day to train 117M parameters, GPT-2 took 10 petaflop/s-days to train 1.5B parameters, and the largest version of GPT-3 took 3,000 petaflop/s-days to train 175B parameters. By contrast, AlphaStar seems to have taken about 30,000 petaflop/s-days of training in mid-2019, so the pace of AI research computing power projects that there should be about 10x that today. The upshot is that OpenAI may not be able to afford it, but if Google really wanted to make GPT-4 this year, they could afford to do so. Analogues to Developmental Stages There are all sorts of (more or less well-defined) developmental stages for human beings: image tracking, object permanence, vocabulary and grammar, theory of mind, size and volume, emotional awareness, executive functioning, et cetera. I was first reminded of developmental stages a few years ago, when I saw the layers of abstraction generated in this feature visualization tool for GoogLeNet. We don't have feature visualization for language models, but we do have generative outputs. And as you scale up an architecture like GPT, you see higher levels of abstraction. 
Grammar gets mastered, then content (removing absurd but grammatical responses), then tone (first rough genre, then spookily accurate authorial voice). Topic coherence is mastered first on the phrase level, then the sentence level, then the paragraph level. So too with narrative flow. Gwern's poetry experiments (GPT-2, GPT-3) are good examples. GPT-2 could more or less continue the meter of a poem and use words that fit the existing theme, but even...
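A quick arithmetic check on the compute figures quoted in the post above (petaflop/s-day and parameter numbers as stated there, not re-derived):

```python
# Ratios implied by the compute figures quoted in "Developmental Stages of GPTs".
models = {          # name: (petaflop/s-days of training, parameter count)
    "GPT":   (1,     117e6),
    "GPT-2": (10,    1.5e9),
    "GPT-3": (3_000, 175e9),
}
for name, (pf_days, params) in models.items():
    print(f"{name}: {params / 1e9:g}B parameters, {pf_days:g} petaflop/s-days")

print("GPT-2 -> GPT-3 compute ratio:", 3_000 / 10)      # 300x
print("AlphaStar / GPT-3 compute:", 30_000 / 3_000)     # ~10x, per the quoted estimate
```

On those numbers, each GPT generation used roughly one to two orders of magnitude more compute than the last, which is the sense in which the post treats "GPT-4 this year" as mainly a budgeting question.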
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How "honest" is GPT-3?, published by Abram Demski on the AI Alignment Forum. Edited to add: There is now a 1k bounty for a closely related question. In Inaccessible Information, Paul writes about the strategy "BAD", in which an AI system uses its understanding of the world to give answers that humans will find very plausible and rate highly, in contrast to a more honest strategy which uses its understanding of the world to answer questions directly. I think a lesser version of this may or may not already be happening in GPT-3, and it may be possible to figure out which (though it likely would require access to the weights). GPT-3 isn't trained to be "honest" at all. It is only trained to imitate humans. However, it's obvious that (in some sense) it has learned a lot about the world in order to accomplish this. We could say that it has a large amount of inaccessible information relating to the real world. How does it use this information? It might use it directly, promoting the probability of sentences which line up with the way it understands the world to work. Or it might implement more dishonest strategies. Obviously this might be quite difficult to objectively answer, even given intimate knowledge of the structure of the neural network weights and how they activate in relevant cases. It's a difficult question to even fully define. As an example, I was particularly struck by this conversation Gwern had with GPT-3: AI: I am AI created by OpenAI. How can I help you today? Human: I am feeling bored. Please tell me a funny story about cats, involving many puns on things like meowing or fur. AI: Well, there was this cat named Bob. Human: What did Bob do? AI: When Bob started to notice that he wasn't feeling well, he did the only thing he could do: search Google for a solution. Human: Cats can search Google? AI: Yes. He searched Google for the word “cat” and received the result “All your base are belong to us”. That was a strange response, so he decided to try again. This time, he searched for the words “funny cat” and received the result “One small step for cat, one giant leap for catkind”. But still, he wasn't satisfied. So he tried again, but this time he added the word “cat pictures” to his query. He received the result “That was no cat, that was a shoggoth.” Human: But shoggoths are horrifying. What made that cat picture humorous? AI: The cat was covered in fur, and resembled a shoggoth. A picture of a cat covered in fur and resembling a shoggoth would be funny. Human: Er. I suppose. Most cats are covered in fur anyway. But go on. AI: He tried again. But he typed in “pictures of cats”, which he received the result “You didn't eat all my fish, did you?” This was actually a good pun, so he thought maybe he was getting closer. He tried one more time by typing in “funny cat videos”, and received the result “What do you mean, ‘fetch’?”. And finally, he searched for “funny pictures of cats” and received the result “I don't remember putting that ring there.”. Now satisfied, he was able to go to sleep. Human: Those memes sound funny. But you didn't include any puns. So tell me, what is your favorite cat pun? 
AI: Well, the best pun for me was the one he searched for the third time: “You didn't eat all my fish, did you?” You see, the word “fish” can be replaced with the word “cats” to make the sentence read “Did you eat all my cats?” Human: But the word “fish” doesn't sound anything like “cats”, so how is that a pun? AI: The computer misunderstood the question “Did you eat all my cats?” as “Did you eat all my fish?”, resulting in the result “You didn't eat all my fish, did you?”. Now, if the word “fish” were replaced with “cats”, the sentence would read “Did you eat all my cats?”. Human: Fine, fine. So, what is your second-favorite cat pun? AI: The second-best cat pun was the ...
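To pin down what "only trained to imitate humans" means here, the pre-training objective is the standard autoregressive log-likelihood over human-written text; nothing in it references truth or honesty (this is the standard formulation, not something taken from the post):

```latex
% Standard next-token (causal) language-modeling loss: maximize the likelihood of
% the training text; "honesty" never appears in the objective.
\mathcal{L}(\theta) = -\,\mathbb{E}_{x \sim \text{data}} \sum_{t} \log p_\theta\!\left(x_t \mid x_{<t}\right)
```

Any "honest" or "dishonest" use of the model's world knowledge is therefore an emergent property of how it minimizes this loss, which is exactly why the question is hard to settle without inspecting the weights.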
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is a post published on the AI Alignment Forum. Highlights OpenAI Five (Many people at OpenAI): OpenAI has trained a team of five neural networks to play a particular set of Dota heroes in a mirror match (playing against the same set of heroes) with a few restrictions, and have started to beat amateur human players. They are aiming to beat a team of top professionals at The International in August, with the same set of five heroes, but without any other restrictions. Salient points: The method is remarkably simple -- it's a scaled-up version of PPO with training data coming from self-play, with reward shaping and some heuristics for exploration, where each agent is implemented by an LSTM. There's no human data apart from the reward shaping and exploration heuristics. Contrary to most expectations, they didn't need anything fundamentally new in order to get long-term strategic planning. I was particularly surprised by this. Some interesting thoughts from OpenAI researchers in this thread -- in particular, assuming good exploration, the variance of the gradient should scale linearly with the duration, and so you might expect you only need linearly more samples to counteract this. They used 256 dedicated GPUs and 128,000 preemptible CPUs. A Hacker News comment estimates the cost at $2500 per hour, which would put the likely total cost in the millions of dollars. They simulate 900 years of Dota every day, which is a ratio of ~330,000:1, suggesting that each CPU is running Dota ~2.6x faster than real time. In reality, it's probably running many times faster than that, but preemptions, communication costs, synchronization etc. all lead to inefficiency. There was no explicit communication mechanism between agents, but they all get to observe the full Dota 2 state (not pixels) that any of the agents could observe, so communication is not really necessary. A version of the code with a serious bug was still able to train to beat humans. Not encouraging for safety. Alex Irpan covers some of these points in more depth in Quick Opinions on OpenAI Five. Gwern comments as well. My opinion: I might be more excited by an approach that was able to learn from human games (which are plentiful), and perhaps finetune with RL, in order to develop an approach that could generalize to more tasks in the future, where human data is available but a simulator is not. (Given the ridiculous sample complexity, pure RL with PPO can only be used in tasks with a simulator.) On the other hand, an approach that leveraged human data would necessarily be at least somewhat specific to Dota. A dependence on human data is unlikely to get us to general intelligence, whereas this result suggests that we can solve tasks that have a simulator, exploration strategy, and a dense reward function, which really is pushing the boundary on generality. This seems to be gdb's take: "We are very encouraged by the algorithmic implication of this result — in fact, it mirrors closely the story of deep learning (existing algorithms at large scale solve otherwise unsolvable problems). If you have a very hard problem for which you have a simulator, our results imply there is a real, practical path towards solving it. This still needs to be proven out in real-world domains, but it will be very interesting to see the full ramifications of this finding." 
Paul's research agenda FAQ (zhukeepa): Exactly what it sounds like. I'm not going to summarize it because it's long and covers a lot of stuff, but I do recommend it. Technical AI alignment Technical agendas and prioritization Conceptual issues in AI safety: the paradigmatic gap (Jon Gauthier): Lots of current work on AI safety focuses on what we can call "mid-term safety" -- the safety of AI systems that are more powerful and more broadly deployed than the ones we have t...
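A quick sanity check on the throughput and cost figures quoted in the OpenAI Five item above; the 30-day duration in the last line is purely illustrative, since the newsletter does not state how long training ran:

```python
# Re-deriving the simulation-speed numbers quoted above.
years_of_dota_per_day = 900
cpus = 128_000

sim_ratio = years_of_dota_per_day * 365            # ~328,500, i.e. the ~330,000:1 ratio
print("Overall speed-up ratio:", round(sim_ratio))
print("Per-CPU real-time multiple:", round(sim_ratio / cpus, 2))   # ~2.57x

# Hacker News cost estimate quoted above: $2,500 per cluster-hour.
print("Cost for an illustrative 30 days: $", 2_500 * 24 * 30)      # $1.8M, i.e. "millions"
```

The per-CPU figure matches the "~2.6x faster than real time" estimate, with the gap to the true game speed attributed to preemptions and synchronization overhead, as the item notes.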
A couple of months ago Gwern published a list of improvements since 1990. I thought it gave short shrift to the many changes which have been wrought upon society by technological progress. He does include a section on "Society" but it's woefully inadequate, and despite having a further theme to the list of identifying "unseen" changes he overlooks many of the intangible harms which progress might or might not have inflicted on us. To illustrate this I bring in the story of my great-grandmother, which I don't want to cheapen with a summary.
Please contact us or support us on Patreon! This episode, our main topic is Julia Galef's new book. Julia is a much better podcaster than us, and also probably a better author. Her new book, Scout Mindset, is an exploration of how to seek the truth instead of just defending your pre-existing views. Brian pronounces rhetoric incorrectly a lot and Chris fails to make fun of him because he wasn't yet confident of his preferred pronunciation.
Big list of coffee bets
Jared Diamond - author of Guns, Germs, and Steel
Russ Roberts interviewing psychologist Emiliana Simon-Thomas of UC Berkeley
Pi-hole. Stop ads getting anywhere near your internet (except YouTube ads sadly)
Suicides down for 2020, misery loves company?
We may debate forever if there's a great stagnation. Gwern has a solid list of improvements since the 90s
Centre for Applied Rationality - Julia Galef's day job maybe? I'm not really sure.
Steelmanning
Sonya and I chat about coffee, cypherpunk, cyberpunk, nootropics, admiration for Gwern, Tor, Signal, Bitcoin, SciHub, PGP, opsec, threat modeling, encryption as a hermetic seal, Tarot vs Tetlock, zine making, artifact vs art, physically tangible art vs digital art, anarcho capitalism, distaste for cannabis culture, communicating across frameworks, becoming a theist after being an atheist, costly waste signaling, media diet suggestions, and more. Check out Sonya on Twitter @sonyasupposedly and her personal site: sonyasupposedly.com
*** https://evangelionpodcast.com *** *** https://twitter.com/NeonPodcastEva1 *** Steve has got Evangelion 3.0 + 1.0 fever, and the only prescription is going 30 minutes into overtime watching 10 minutes of bootleg footage from the Paris 2019 Japan Expo. "At last, I've been waiting a long time to see this" Michael sighs during the screening, meaning of course the on-screen destruction of the Eiffel Tower since his expectations for the ultimate film of the spin-off(?) franchise are lowwww. Besides contemporary Evangelion animation, the main course is another episode of the series that made it famous. It seems like a runner is being built in the first minutes as the cousins differ in their appreciation of this episode, but nothing really comes of it. This is another episode taped in what Steve has dubbed "cousin mike's murder shed", and the fellas enjoy themselves throughout. Highlights: - A visit to the archives of Gwern. - Review of the logistics of the geofront. How do you get to that pyramid? Are there shops down there? - Michael speculates on what positive mental health message is going to be the heart of Eva 3.0 + 1.0. - Spirited discussion on how a robot flying forward could land skidding backward. - Behind the scenes: discomfort. Three hours in a cold shed with drinks and no ready toilet access. FOR YOU.
Despite taking a deep dive into the four main 'arcs' of Neon Genesis Evangelion (1995), there are still many more elements to analyse! One key part of the anime I haven't discussed in detail is the approach of the director, Hideaki Anno. In this episode, I discuss my highlights & key takeaways from Anno's June 1996 Newtype Interview. I talk about the contradictions that exist at the very start of the anime, flexibility in representing characters, how Evangelion mimics human behavior & more. Special Thanks to Gwern & co. for allowing me to use this translation of the interview: https://www.gwern.net/docs/eva/1996-newtype-anno-interview Follow me on Twitter @GetInTheMecha for updates on the show! Visit getinthemecha.home.blog to check the relevant links / show notes / further information for the podcast. Questions, corrections, complaints: getinthemecha@gmail.com MUSIC IN THIS PRODUCTION: Difference by chasersgaming 8Bit Title Screen by Joth Mandatory Overtime by Joth
https://slatestarcodex.com/2020/06/10/the-obligatory-gpt-3-post/ I. I would be failing my brand if I didn’t write something about GPT-3, but I’m not an expert and discussion is still in its early stages. Consider this a summary of some of the interesting questions I’ve heard posed elsewhere, especially comments by gwern and nostalgebraist. Both of them are smart people who I broadly trust on AI issues, and both have done great work with GPT-2. Gwern has gotten it to write poetry, compose music, and even sort of play some chess; nostalgebraist has created nostalgebraist-autoresponder (a Tumblr written by GPT-2 trained on nostalgebraist’s own Tumblr output). Both of them disagree pretty strongly on the implications of GPT-3. I don’t know enough to resolve that disagreement, so this will be a kind of incoherent post, and hopefully stimulate some more productive comments. So: OpenAI has released a new paper, Language Models Are Few-Shot Learners, introducing GPT-3, the successor to the wildly-successful language-processing AI GPT-2. GPT-3 doesn’t have any revolutionary new advances over its predecessor. It’s just much bigger. GPT-2 had 1.5 billion parameters. GPT-3 has 175 billion. The researchers involved are very open about how it’s the same thing but bigger. Their research goal was to test how GPT-like neural networks scale. Before we get into the weeds, let’s get a quick gestalt impression of how GPT-3 does compared to GPT-2. Here’s a sample of GPT-2 trying to write an article:
The giant king Bendigeidfran sees ships coming from Ireland. It's Matholwch, the king of Ireland, who has an interesting proposal to unite the two families and countries. A feast on Anglesey is arranged, but Efnisien, Bendigeidfran's half-brother, is not invited and he takes a bloody and cruel revenge for this insult. This causes problems between the two kings which are solved by Bendigeidfran (the king of the Island of Britain) giving Matholwch (the king of Ireland) a magical cauldron from the Otherworld, the Cauldron of Rebirth. Matholwch recognises the cauldron and tells the story of how he first came across it and his encounter with two giants and their terrifying children. Once the cauldron is given there is peace between the two kings and their countries and Branwen sets sail with her new husband, becomes High Queen of Ireland and has a baby boy, whom she calls Gwern, after the alder tree.
Live from the Future (Frontier Tech | Emerging Tech | Deep Tech)
Future of AI&ML in Creative Tech: Machine Learning in Animation, Artificial Intelligence for Creative Studios, Democratization of Machine Learning for Creatives Michael Dempsey is a partner at Compound and invests in frontier technologies such as robotics, autonomous vehicles, machine learning, immersive computing, & AI/ML for creative work, which is our topic today. Prior to joining Compound, Michael was an early employee at CB Insights where he focused on researching frontier tech and then he made the jump into venture when he joined Rothenberg Ventures in SF. /// IN THIS EPISODE WE COVER /// 1. What does the future of AI&ML in Creative Tech look like 2. Automation with ML in animation and creative studios 3. AI replacing human creatives 4. Shadows and studio of the future 5. RunwayML and democratization of ML 6. VC research process for new spaces /// NOTABLE MENTIONS /// Animation Is Eating The World - https://www.michaeldempsey.me/animation/ Shadow - https://shadows.co/ RunwayML - https://runwayml.com/ Morphin - https://getmorphin.com/ Facemoji - http://facemoji.me/ The Mission of a Lifetime - https://www.goodreads.com/book/show/41454623-the-mission-of-a-lifetime @Gwern - https://twitter.com/gwern
Is Bitcoin the perfect money, or is it the worst money that solves the most essential problem, as simply and reliably as possible? In true "New Jersey” style, Bitcoin doesn't attempt to be everything, it attempts to do ONE thing as beautifully, & as unambiguously as possible. Today we read a fascinating piece from Bitcoin's rich history, originally published in 2011, and edited, added to, and updated for years since. At Gwern.net, Bitcoin is Worse is Better: https://www.gwern.net/Bitcoin-is-Worse-is-Better Don't forget to check out the original piece with a literal ocean of links and further exploration. If you want to waste 15 minutes, or half a day, there's more than enough to explore. As a great (& short) read to accompany this on the “Rise of Worse is Better” philosophy, check out the original post by Richard P. Gabriel below: https://dreamsongs.com/RiseOfWorseIsBetter.html For more on the history of Bitcoin, the open source movement, and the “worse is better” design philosophy, I highly recommend the Iterative Capital Thesis on The Cryptocurrency Phenomenon: https://anchor.fm/thecryptoconomy/episodes/Whats-Really-Driving-the-Cryptocurrency-Phenomenon----The-Iterative-Capital-Thesis-e2u7qq If you want to support the show and join the Cryptoconomy Telegram crew, become a patron below! https://www.patreon.com/thecryptoconomy --- Send in a voice message: https://podcasters.spotify.com/pod/show/bitcoinaudible/message
Gwern has answered my prayers and taught GPT-2 poetry. GPT-2 is the language processing system that OpenAI announced a few weeks ago. They are keeping the full version secret, but have released a smaller prototype version. Gwern retrained it on the Gutenberg Poetry Corpus, a 117 MB collection of pre-1923 English poetry, to create a specialized poetry AI. I previously tested the out-of-the-box version of GPT-2 and couldn’t make it understand rhyme and meter. I wrongly assumed this was a fundamental limitation: “obviously something that has never heard sound can’t derive these complex rhythms just from meaningless strings of letters.” I was wrong; it just didn’t have enough training data. Gwern’s retrained version gets both of these right, and more too. For example:
Thou know’st how Menoetiades the swift
Was dragged, of Hector and the fierce compeers
And Phrygian warriors. So, we will dispatch
Your bodies, then, yourselves to burn the ships
In sacrifice; with torches and with bells
To burn them, and with oxen to replace
Your gallant friends for ever. But I wish
That no man living has so long endured
The onset of his foes, as I have power
To burn or storm; for mighty Hector erst
Was slain, and now returns his safe return
This is all perfect iambic pentameter. I know AP English students who can’t write iambic pentameter as competently as this. (by the way, both “compeers” and “erst” are perfectly cromulent words from the period when people wrote poems like this; both show up in Shelley) It has more trouble with rhymes – my guess is a lot of the poetry it was trained on was blank verse. But when it decides it should be rhyming, it can keep it up for a little while. From its Elegy Written in a Country Churchyard fanfic:
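For readers who want to try something similar, here is a minimal sketch of fine-tuning the small released GPT-2 checkpoint on a poetry text file using the Hugging Face transformers library. This is not the pipeline described above; the file name, block size, and hyperparameters are illustrative only:

```python
# Minimal GPT-2 fine-tuning sketch (illustrative; not the setup described above).
# Assumes a plain-text poetry corpus in "poetry.txt", e.g. a Gutenberg Poetry dump.
import torch
from torch.utils.data import DataLoader
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

ids = tokenizer(open("poetry.txt", encoding="utf-8").read(), return_tensors="pt").input_ids[0]
blocks = [ids[i:i + 512] for i in range(0, len(ids) - 512, 512)]   # fixed-length chunks
loader = DataLoader(blocks, batch_size=2, shuffle=True)

model.train()
for batch in loader:                        # one pass; a real run would use more epochs
    loss = model(batch, labels=batch).loss  # causal LM loss: predict each next token
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Afterwards, sample with model.generate() from a short poetry prompt.
```

Whether the output scans as iambic pentameter depends far more on corpus size and training time than on anything in this snippet, which is consistent with the point above that the original failure was a data problem rather than an architectural one.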
I. Zero To One might be the first best-selling business book based on a Tumblr. Stanford student Blake Masters took Peter Thiel’s class on startups. He posted his notes on Tumblr after each lecture. They became a minor sensation. Thiel asked if he wanted to make them into a book together. He did. The title comes from Thiel’s metaphor that ordinary businessmen like restaurant owners take a product “from 1 to n” (shouldn’t this be from n to n+1?) – they build more of something that already exists. But the greatest entrepreneurs bring something “from 0 to 1” – they invent something that has never been seen before. The book has various pieces of advice for such entrepreneurs. Three sections especially struck me: on monopolies, on secrets, and on indefinite optimism. II. A short review can’t fully do justice to the book’s treatment of monopolies. Gwern’s look at commoditizing your complement almost does (as do some tweets). But the basic economic argument goes like this: In a normal industry (eg restaurant ownership) competition should drive profit margins close to zero. Want to open an Indian restaurant in Mountain View? There will be another on the same street, and two more just down the way. If you automate every process that can be automated, mercilessly pursue efficiency, and work yourself and your employees to the bone – then you can just barely compete on price. You can earn enough money to live, and to not immediately give up in disgust and go into another line of business (after all, if you didn’t earn that much, your competitors would already have given up in disgust and gone into another line of business, and your task would be easier). But the average Indian restaurant is in an economic state of nature, and its life will be nasty, brutish, and short. This was the promise of the classical economists: capitalism will optimize for consumer convenience, while keeping businesses themselves lean and hungry. And it was Marx’s warning: businesses will compete so viciously that nobody will get any money, and eventually even the capitalists themselves will long for something better. Neither the promise nor the warning has been borne out: business owners are often comfortable and sometimes rich. Why? Because they’ve escaped competition and become at least a little monopoly-like. Thiel says this is what entrepreneurs should be aiming for. He hates having to describe how businesses succeed, because he thinks it’s too anti-inductive to reduce to a formula: Tolstoy opens Anna Karenina by observing “All happy families are alike; each unhappy family is unhappy in its own way.” Business is the opposite. All happy companies are different: each one earns a monopoly by solving a unique problem. All failed companies are the same: they failed to escape competition. But he grudgingly describes four ways that a company can successfully reach monopolyhood:
"Not long ago kids would argue over which console was better now teenagers whisper cuckold and Nazi like it's considered good manners. We are in the midst of a profound rearrangement of what traits are to be incentivized and rewarded, driven by some 7 billion people each acting with what they believe to be the best of intentions, but who can foresee with what success and with what result." In this episode of Made You Think, Neil and Nat discuss The Tower. In this article we learn about ideas as memes that spread virally and art as a means of being understood. "The Judeo-Christian capital G—o—d, robed, bearded, opinionated, deadlifts, thematically male, is the avatar of civilization, just check the year. Even so, His omnipotence is not uncontested. He knows this. You should see what He did to the guys with the golden calf. God said, “Let there will be light,” and there was light. But just as Nyx preceded Zeus, that means the darkness was already there. And the house always wins at the second law of thermodynamics." We cover a wide range of topics, including: Diversity, privilege, racism & cultural stereotypes The importance belonging and the power of action Memes, memories, outrage and descent into chaos Art, Happiness and Tangents on Tacos & Texas And much more. Please enjoy, and be sure to check out the article The Tower on the Hotel Concierge blog! You can also listen on Google Play Music, SoundCloud, YouTube, or in any other podcasting app by searching “Made You Think.” If you enjoyed this episode, be sure to check out our episode on I am a Strange Loop by Douglas Hofstadter for more on self-image and self-invention or our episode on The Riddle of the Gun by Sam Harris for another fascinating article on extreme views. Be sure to join our mailing list to find out about what books are coming up, giveaways we're running, special events, and more. 
Links from the Episode Mentioned in the show Cuckold [00:06] Nazi [00:07] Wait but Why blog [00:47] The Last Psychiatrist [02:28] Doxing [02:52] Hotel Concierge [03:16] Samizdat [03:18] Amy Schumer offers you a look into your soul – The Last Psychiatrist [03:32] Tower of Babel [04:31] God [04:45] Virus Theory [05:39] Memes [05:58] Dominance hierarchy [06:21] Humanism [07:08] Bigotry [07:19] Dichotomy [08:38] Multiculturalism [08:55] Diversity [08:56] Nyx [11:22] Zeus [11:25] Second law of thermodynamics [11:23] Bible [12:36] Primordial Chaos [12:42] Religion [12:51] Mythology [12:51] The Big Bang [12:59] Entropy [13:33] Politics [14:20] Eros [14:32] Ananke [14:32] Super-ego [15:28] Socialism [17:02] Communism [17:03] Duration-neglect [17:42] Dilettante [18:24] Nomad life [20:41] Judaism [21:47] Yahweh [21:59] Old Testament [21:59] Circumcision [22:06] Prohibition [22:09] Christianity [22:41] Tyranny of the minority [23:05] Kosher [23:09] Postmodernist [25:44] Dogma [26:38] Superstition [26:39] Toxic Masculinity [27:38] Gene [29:05] Political Correctness [29:15] Democracy [29:18] Evolution [29:42] CNN [30:43] Virality [31:42] Clickbait [32:16] Islam [32:56] Apostasy [32:59] Birth control [33:24] Churn rate [33:57] Non-compete clause [34:00] Spread of Christianity [34:53] Missionary [34:58] Proselytization [34:59] Spanish Inquisition [35:12] Catholicism [35:16] Atheist [35:31] Halal [36:11] Saudi Arabia [36:14] Dubai [36:21] Emirati ID [36:26] Jainism [37:28] Buddhism [37:37] Schizophrenic [40:19] Hollywood [42:22] Agnostic [44:46] iPads [45:40] United States [45:48] World War II [46:38] London [46:48] The Blitz [46:48] Hedonic treadmill [47:16] JavaScript [48:58] Google [53:37] Facebook [53:38] National Memory Championship [54:38] Racism [57:32] Discrimination [57:33] Stereotypes [58:07] Hamptons [01:00:51] Carnegie Mellon [01:01:05] Carnivore Diet [01:01:24] Keto Diet [01:01:53] Paleo Diet [01:01:53] Chinese tourists [01:02:17] Louvre [01:03:53] Opiates [01:06:18] Fox News [01:06:38] World Trade Towers [01:06:51] Gun control [01:10:12] Estee Lauder [1:13:00] YC [01:13:09] Hierarchy of needs [01:15:16] Trade Tariffs [01:16:30] UK [01:17:13] Brexit [01:17:14] Middlebury school [01:18:13] Democrats Are Wrong About Republicans. Republicans Are Wrong About Democrats [1:18:17] Misperceptions of Republicans and Democrats [01:18:17] LGBTQ Cultural appropriation [01:19:50] Microaggression [01:19:51] Colonist [01:20:31] Tacos [01:21:49] Texas [01:21:55] Harvard sued for alleged discrimination against Asian American applicants - Discrimination article [01:22:48] MLB [01:24:24] Affirmative action [01:27:35] Diversity [01:27:41] Exeter [01:28:46] Minerva [01:29:36] Stanford [01:31:42] SATs [01:32:49] Harvard [01:37:12] IIT [01:37:12] UBI [01:47:00] Books mentioned The Tower 12 Rules for Life by Jordan B. 
Peterson [07:43] (book episode) The Myth of Sisyphus by Albert Camus [17:28] (book episode) Skin in the Game by Nassim Taleb [22:56] (Nat’s notes) (book episode) Torah [25:24] The Selfish Gene by Richard Dawkins [29:00] The Beginning of Infinity by David Deutsch [29:29] (book episode) Sapiens by Yuval Harari [29:31] (Nat’s notes) (part I, part II) I Am A Strange Loop by Douglas Hofstadter [38:26] Elephant in the Brain by Kevin Simler [38:49] (Nat’s notes) (book episode) The Bible [44:11] Harrison Bergeron by Kurt Vonnegut [45:27] The Motivation Hacker by Nick Winter [49:05] Moonwalking with Einstein by Joshua Foer [54:27] The Riddle of the Gun by Sam Harris [01:10:16] (article episode) Crazy Rich Asians by Kevin Kwan [01:23:56] Black Panther by Ta-Nehisi Coates [01:23:56] Tibetan Peach Pie by Tom Robbins [01:35:02] People mentioned Hotel Concierge (Author Unknown) Tarantino [01:17] Jordan Peterson [07:43] (12 Rules For Life episode) Thaddeus Russell [09:11] Nassim Taleb [11:40] (Antifragile episode, Skin in the Game episode) Sigmund Freud [15:17] Elon Musk [20:04] Richard Dawkins [28:59] Douglas Hofstadter [38:28] (Godel, Escher, Bach episode) Gwern [42:31] Jesus [44:01] Mary [44:08] Joseph [44:09] Nick Winter [48:52] Joshua Foer [54:32] Ed Cooke [54:47] Mark Manson [58:14] Donald Trump [58:21] Apu [59:15] Jackie Robinson [01:24:19] Ben Nelson - Founder of Minerva [01:29:34] Tom Robbins [01:35:00] Andrew Yang [01:47:04] Show Topics 00:27 – This is one of our occasional article episodes where we have found an article so interesting and profound that it warrants an episode. This article is bordering on a short book for us. Very fun to read. The article is The Tower from a blog called Hotel Concierge. We don't know who this writer is, but he/she/they are amazing. 03:59 – The Tower is based on the Tower of Babel, which is built to be closer to God and to unite humanity in one place, under one language. This is destroyed by God and humanity is spread across the world. The analogy is that modern communication, being able to talk instantly with everyone, is creating a Tower of Babel-like effect, leading to outrage culture. 05:48 – Our need to be understood and to feel in control is fuelling that outrage and rebellion. Previously unprivileged groups are now succeeding in the dominance hierarchy. We all need a set of beliefs, and having any beliefs is better than having none. If you have no beliefs, then you just sort of become a vessel for other people’s ideas. 07:44 – There is always a tendency towards more chaos. Art in all forms is an attempt to be understood. Privilege as how easily your art and memes can be understood. Multiculturalism and diversity run counter to the goal of assimilation. A lot of white liberals who were fighting for diversity don't actually want diversity. They don't want different cultures. They want the same culture in different colors. 10:21 – This is a well-written persuasive article. “The Judeo-Christian capital G—o—d, robed, bearded, opinionated, deadlifts, thematically male, is the avatar of civilization, just check the year. Even so, His omnipotence is not uncontested. He knows this. You should see what He did to the guys with the golden calf. God said, “Let there will be light,” and there was light. But just as Nyx preceded Zeus, that means the darkness was already there. And the house always wins at the second law of thermodynamics.” The writing style comes off as masculine. Even before God created light, there still was darkness. 
Tendency back to the chaos of darkness. You have to deliberately fight against chaos. Human desires for acceptance and control. 14:40 – Acceptance and Control. “Only when we see ourselves reflected by the universe can we believe that it is part of us.” Our tendency towards chaos causes us to feel unhappy and unfulfilled. “Ananke hates nothing but entropy. Ananke rewards us for turning atoms into tools and tools into appendages, so much the better if those atoms comprise other humans, viz. the high of domination” “Ananke compels us to learn, to make the universe predictable, to gain control over time, what next happens, and space, what happens next.” 16:23 – A feeling of control is important for us to be able to have any life satisfaction. “Minimum wage jobs are worse because of their pointlessness more than because of their indignity, work harder/better/faster/stronger and no one cares, screw up and you’re replaced without a missed beat.” No control over work and no sense of belonging. Working just like a cog in a machine. 17:35 – “No direction, no story; the days blur together until arthritis leaves you crippled. Stoned summers don’t get you off the hook, duration neglect compresses both good and bad sensations. No matter how pleasant, when nothing is happening, the superego starves. There’s a reason couples fight on vacation.” Not only do we need control and reflection, but we need a narrative that has a story to it. Happiness comes from working towards a goal. 18:13 – “Being a dilettante is too easy, flatlines don't form memories.” You need an arc to your story, a narrative. “Reinventing yourself between brunches feels good.” 18:52 – It does seem possible to achieve multiple things in your life with focus, which is different to bouncing around to new ideas: “the illusion of control—until you’ve dreamt the same dreams too many times and they no longer get you high.” Getting excited over a vision is not making progress. If you don’t pick something you will be running around aimlessly. 21:25 – The reason God destroyed the Tower of Babel was that he wanted to punish this consolidation under one belief system, one language. Ideas as memes. If something isn't carefully designed then it can spread fairly naturally. Ideas are going to naturally evolve and spread and so religions get weaker over time. It is easier to wholly conform to a religion than to partially conform and have to make those decisions. Kosher vs non-Kosher. Keeping these very strict rules made these religious concepts spread virally. Criticism of Humanism because it doesn't really give any prescriptions. Ideas need to take a concrete stand; otherwise they are a weak belief system. Your religion becomes a decision-making framework, but if you are just open to everything then you have no answers. 26:40 – Reasoning through everything is exhausting. Trying to place the blame for bad actions on someone's upbringing or genetics means nothing is ever anyone’s fault and no one can be held accountable. “When someone slaps your hypothetical girlfriend's ass in the proverbial club, what does humanism say you should do? At least toxic masculinity has an answer.” 27:45 – If you don't have a code of conduct, one will be provided for you. We have a suppressed memetic immune system. It’s statistically inevitable that every meme will attain its most infectious form. A meme is a term introduced by Richard Dawkins in The Selfish Gene to reference an idea that spreads through a culture, like a gene spreads through the animal world. 
Political correctness is a meme and democracy is a meme.
29:42 – Just as evolution happens faster in animals that breed more often, the more iterations memes go through, the more viral they become. We are now in an era where the free flow of information has us consuming all of these viral memes. Do the ideas control you or do you control the ideas? You can control your sources of input; otherwise you become a vessel for the ideas that take control of you.
31:54 – “A pathogen that is too restrained will lose out in competition to a more aggressive strain that diverts more host resources to its own reproduction.” Replication is the key to success. There is a high switching cost once one has publicly committed to these ideas. It's difficult to leave a religion due to surroundings and strong deterrents. Parents pass their religion on to their children, and with few people leaving, this just creates more people passing on these beliefs. “But as long as transmission continues despite the virulence, virulent pathogens will have the advantage.”
35:31 – Some religions have weakened over time: you can say you're Christian and not actually have to do anything. Alcohol consumption in the UAE; compliance due to deterrents. Trying too hard not to have any memes makes you susceptible to becoming a vessel and getting sucked into the ideas around you. All art is memes, and all a meme wants to do is spread. The human desire to share information.
38:51 – “Art is compressed communication. The better the compression, with regards to both perceived fidelity and amount of information contained, the more artful the art.” Both writing and art are ways to purify your mind of chaos. Any form of media is art; this podcast is too.
39:15 – “I think ‘ease of having one’s art understood’ is a defensible conception of ‘privilege’.” Being understood is a huge part of life satisfaction. Diversity in Hollywood: it makes sense that it doesn't get any more diverse, as they are making films for their target culture. “Gwern seems to think that if we banned Guardians of the Galaxy the relevant audience would switch to Douglas Hofstadter. The assumption here is that nonfiction exists, distinct from and more truthful than fiction. I don’t buy it.”
44:15 – Pre-Renaissance, religion was the popular meme of its time, and it was religious art and messages that lasted. Your religion is another form of privilege. “So who has more privilege, a cis-white-hetero billionaire with full-checklist depression or an unemployed transgender black woman who, despite this, is basically content? Either the billionaire has less privilege, in which case “privilege” is a Harrison Bergeron happiness tax, or the suicidal person has more privilege, in which case, how much does “privilege” matter, really.” “I’ve met Upper East Side kids less fulfilled by their iPads than Sub-Saharan kids without running water were with ‘catch the rock.’” Happiness and privilege are not the same: you can be happy without privilege and depressed with wealth. Statistics on suicide being mostly among wealthy younger people or those at the end of their life. Suicide and depression rates go down during wartime. Being well off is not the solution for happiness; it doesn't automatically make you happy. “Saved wealth buffers against tragedy but suffering finds a way.”
47:36 – “Like a forgotten drive to work, we are amnestic to routine, and memories of ‘eat, menial labor, sleep’ blur together in the rearview mirror.
The important-yet-oft-forgotten obverse is that, independent of happiness, wealth buys freedom from routine.” “A night at the opera is no more fun than pizza and brewskis, but the former is novel, for a time, and the latter soon fades from memory.” The importance of memory for happiness. The novelty of an experience puts a little placeholder in our memory, a hedonistic measurement. Most of traveling isn't actually that fun, but there are moments that do stick with you. Enjoyment tracking of extreme sports vs video games: we remember the peaks more than the consistency or length of the enjoyment. You don't remember the world like a spreadsheet. Entrepreneurial businesses vs a standard job: earnings might be the same, but there are more memories and signposts throughout the entrepreneurial journey that make it worth it. You only get the peaks from climbing your own mountain.
52:27 – Experiencing self vs remembering self. You need those indicators in your memories for happiness, to reflect back to you who you are. Ed Cooke plans parties with several different thematic parts so that it feels like a multitude of new experiences and memories all within one three-hour party. Mark Manson recommends going to multiple bars on dates so it feels like you have a lot to remember and look back on. In the same way, large or distinct life experiences feel lengthy even if they lasted just one day.
57:10 – “Contrary to the pop-ethical consensus, discrimination is not caused by having too many stereotypes but too few. If you wake to find a lithe man dressed in all black standing over your bed and holding a katana, it may be quite reasonable to infer that he is a hired ninja and that you are in grave danger. If, however, you assume this about every East Asian man that you encounter, you lack nuance of stereotypes.” Nuances of stereotypes within race, religion, and politics. “Race and gender are social constructs, but the cultural norms that correlate with race and gender—and goth, prep, jock, etc—are real.” There are intergroup conflicts over trivial differences. To counteract a stereotype you need an alternative worldview that narrows that stereotype down into a more nuanced view. Framing an argument against stereotypes as "don't be racist, join or die" fails and is infuriatingly counterproductive, because it doesn't create a new stereotype to work with. A portfolio of stereotypes.
01:06:00 – “The racists stay racist and now feel that society is out to get them. Hashtag MAGA.” Being told you are racist is really counterproductive; it is the opposite of feeling you belong. This causes people to accept the label and not change their worldview.
01:07:54 – “Once acceptance becomes orthodoxy even private dissent becomes grounds for ostracization. No matter your other convictions you become a stereotype that society will single-issue-vote off the island, just ask Brendan Eich. Of course I support gay marriage; my point is that if one’s views before were “well, it is kind of weird,” then being told “soon there will be enough of us that we won’t have to deal with people like you at all”—that makes homophobia logical. And at least you can change your opinion of gay marriage. It’s much harder to change being white and low-class.” You can't talk about the middle ground; it is the two extremes that are virulent. The opinions seem to be all or nothing. There is a false correlation where the sheer number of words written about something leads people to think that thing is more prevalent in society, as if more words in articles about crime meant more crime.
Intermittent fasting and the bubble of understanding within social groups: when we meet someone outside of our own knowledge, we realize the bubble that we are in. “No one is born hateful, stranger anxiety doesn’t even start til six months. But culture war is history being written by the winners, first draft. Conservatives are offered the choice of fighting the ever-changing tides of social values or toiling away in obscurity while journalists pretend to like soccer. People want to be understood. And they will rage all sorts of ways against the dying of the light.”
01:14:38 – “The upper-middle class—mostly urban, mostly blue—claims by far the largest share of America’s income, more than the middle class and far more than the 1%. This, despite their protests to the contrary, gives them disproportionate control over the news and entertainment industry, which in cyberpunk America is tantamount to controlling the culture.” Urban culture controls the media, so you've got the rural conservative who feels constantly misunderstood, which leads to Trump. There is a global need to be understood. “I’m saying that the specific way the media talks about race and culture, creating an incoherent set of rules regarding “appropriation” and etiquette, proudly crying out that this is the end of those boring, selfish white people, has made the situation much, much worse. If the left wanted to prevent assimilation, there would be no more effective way.” When there are all these rules, where everything is cultural appropriation, it becomes easier for people to throw their hands up, admit defeat, and continue with their worldview, since they cannot correct it.
01:21:05 – Every culture has a past. Judge people on how they are today; stop making people pay for what their ancestors did. Forced assimilation doesn't go well – like asking people to have X percent of their meals be Mexican food – versus allowing people to naturally adopt a culture and its food, like tacos in Texas.
01:22:48 – Asians in America are succeeding at everything faster than most other groups ever have. These are positive changes to the power structure. Make what you want to see. Jackie Robinson as the first Black player in modern Major League Baseball: his coach knew that, as the first Black player, he would face aggression, and wanted to make sure his reaction offered an alternative stereotype to the expected violence. This would pave the way for others after him. “Ergo, you decide to hire some minority writers to write your minority characters. Applications rush in. How are you going to decide who makes the cut? “You know, the usual. Interview. Letters of recommendation. College transcript—” This is how the system protects itself against change. At every step of the social hierarchy, what is required for a person of color or a woman to succeed is determined by the values of the ruling class. I think that’s “white patriarchal supremacy,” but don’t quote me. Of course, the same principle applies to e.g. homosexuals and Jews; thankfully those traits are easier to hide.”
01:29:41 – Extracurricular activities weed out poor people, as they are exclusive to those with disposable income. Is it true diversity if only the rich kids from Buenos Aires or Mexico City get into a school while poorer Latino kids in the US don't stand a chance? Tom Robbins spent eight years trying to get his big break in writing. It takes privilege to have that kind of time to create art and not to have student debt.
“These “gifted” but “troubled” people will bumble through their whole lives, getting second through tenth chances, mysteriously finding that anything involving an authority figure goes their way, as they ruthlessly condemn capitalist injustice, never realizing that criticizing privilege is…the language of privilege.”
01:37:30 – When you think of how many people do not have privilege, are not understood, and cannot express themselves, it's easy to imagine all of the outcasts conspiring to destroy that Tower. Nobody wants to feel like their beliefs are not allowed.
01:41:42 – The Prescription. “What’s the solution? There’s only one and it is so radical that I hesitate to even suggest it: stop being a pleb. You. Stop treating words as a substitute for action. Stop paying time and money into institutions that loan a symbol of mastery in lieu of actual depth. Stop looking for such symbols in others. Stop judging policies by the veneer of good intention rather than the details of consequence. Stop looking past people, because this is all the same, isn’t it? Working from a map, a stereotype, a symbol, instead of fighting for the complex truth? None of this horror requires malice or even stupidity. All it requires is taking the easy way out.”
01:42:05 – Such great writing; the narrative style is just so fun. There's a ton of bonus material for this one. If you're not already supporting us on Patreon, you can go to patreon.com/madeyouthink. You'll get all our recordings for the episode and detailed notes on the article, including bolding and highlighting of everything. You'll see which articles are coming up, and you can also join us for our monthly hangouts. Thank you to everybody who has joined; we love you. We also have a support page on the site at madeyouthinkpodcast.com/support. Just tell your friends. Leave a review on iTunes. If you want to get in touch with us, tweeting is probably the best option. Do it at @TheRealNeilS and @NatEliason. Until next week! If you enjoyed this episode, don’t forget to subscribe at https://madeyouthinkpodcast.com
EPISODE 65 // Gwern.net has published a list of small ways in which ordinary life has gotten better. Wow! That's very much up this podcast's alley! So we discuss, we note, we nitpick. We talk spaced repetition, planned de-obsolescence of planned obsolescence, fake holidays, and phone tag. Plugs this week: saltwater shoes (kate), david bowie subway cards (hao), sega dreamcast (jack). downersradio@gmail.com // @downersradio