Podcasts about cerebros

  • 235 PODCASTS
  • 380 EPISODES
  • 44m AVG DURATION
  • 1 EPISODE EVERY OTHER WEEK
  • Nov 6, 2024 LATEST

POPULARITY

(popularity chart, 2017-2024)


Best podcasts about cerebros

Latest podcast episodes about cerebros

Buenos Días Madrid OM
The secrets of the brain, within reach at Madrid's Science Week

Nov 6, 2024 · 8:43

We continue on Onda Madrid exploring the science leading the way in our region. This time on Buenos Días Madrid we spoke with Enrique García, a researcher at the Hospital de la Princesa, with Isidoro González Álvaro, scientific director of the Instituto de Investigación Sanitaria del Hospital de la Princesa, and with several students (Teo and Nora). One of the most striking activities of the Science Week is called "Cerebros y Bisturíes en neurocirugía: del quirófano al laboratorio" (Brains and Scalpels in Neurosurgery: From the Operating Room to the Laboratory), which will let students in the 3rd and 4th years of ESO and in Bachillerato watch a neurosurgical operation in the operating room. ESO students become scientists for a day. They can also visit the laboratory to observe brain cells and experiments, and even stain their own cells.

Última Hora Caracol
One of the masterminds of the 'Odebrecht' corruption scandal, José Elías Melo, released

Oct 28, 2024 · 3:58

News roundup with the most important stories from Colombia for Monday, October 28, 2024, at three in the afternoon.

El Podcast de Marc Vidal
Why are YOUNG PEOPLE FLEEING Spain? Record BRAIN DRAIN - Marc Vidal's vlog

Oct 17, 2024 · 13:39

In this video we analyze Spain's alarming brain drain, a phenomenon that is seriously undermining the country's future. Every year, thousands of qualified young people leave Spain in search of better job opportunities, leaving a deep mark on the economy and demographics. We explore the figures, the structural causes behind this trend, and its long-term consequences. We also discuss how other countries are capturing this talent and what Spain can do to reverse the situation. Don't miss it if you care about the future of the Spanish economy! Become a supporter of this podcast: https://www.spreaker.com/podcast/el-podcast-de-marc-vidal--5231699/support.

Mi Última Neurona
Brains, Consciousness, and Science Communication with Dr. Pedro Maldonado

Sep 16, 2024 · 77:07

In this episode of Mi Última Neurona, Jessica Chomik-Morales and Dr. Pedro Maldonado hold a fascinating conversation about the brain, consciousness, and the limits of artificial intelligence. Topics range from how the brain generates behavior and perception to the possibility of creating an artificial brain and its implications for consciousness. They also discuss the importance of science communication, the connection between science and society, and the promotion of critical thinking. The dialogue is a fascinating tour of Dr. Maldonado's research and academic career, with reflections on the role of science in our society.
Watch the episode here: youtube.com/@miultimaneurona
Website: https://www.miultimaneurona.com
Timestamps:
00:00 Intro
00:47 Introduction
01:40 What drew Dr. Maldonado to science?
04:14 Using animal models for application to humans
05:21 Interest in perception
10:00 Vision studies with monkeys
10:58 Human studies using electroencephalography
12:03 Jean Livet's cognitive neuroscience studies
13:13 On free will
15:43 Questions and research in the laboratory with cultured cells
22:43 Advantages of cell cultures, and how the neural learning algorithm can be applied to AI
27:08 Can a cultured brain be trained? An example from SNF 2019 and the brain being grown in a laboratory
31:14 Distinguishing between reality and the simulation of reality the brain constructs
34:16 How highs and lows of consciousness work, in humans and animal models
36:35 Research on active perception in humans
40:03 Research in animal models through collaboration with researchers around the world
42:31 Neuroscience has lots of data but little theory
46:13 How to run a high-impact experiment with few resources
47:15 Pedro Maldonado's book: "¿Por qué tenemos el cerebro en la cabeza?"
51:02 The scientist's duty to engage with society to inform: the case of alcoholism
52:47 Science myths built by the press: the Mozart effect
54:12 Another myth: we use 10% of the brain
56:54 Scientists still need to learn how to communicate their studies and results to non-scientific audiences
59:53 How to develop critical thinking in education
1:00:39 The two mechanisms that hinder mental flexibility: the default mode network and the confabulator
1:04:41 The state of academic neuroscience in Chile
1:13:31 Advice for listeners
1:16:14 Outro
Links of interest for this episode:
https://mitpress.mit.edu/978026252579...
https://uchile.cl/publicaciones/15734...
https://neurotree.org/beta/publicatio...
This season is sponsored by the McGovern Brain Institute, the MIT Department of Brain and Cognitive Sciences, the Picower Center for Learning and Memory, and MIT International Science and Technology Initiatives.
Music and sound design by David Samuel Production

Tras la tormenta
Tras la tormenta | Inside Out 2 and Elon Musk's brains

Sep 16, 2024 · 53:34

In today's sound refuge we learn emotional education through one of the films of the year: Inside Out 2. It's a story for all audiences, and a major hit, which we analyze with a specialist, our psychiatrist Anabel González. In the cinema next door, Alfonso Levy awaits with a classic of the seventh art that gives us the urge to start over at this beginning of the school year. We reconnect with our neurologist Jesús Porta, who helps us interpret some striking news about the brain (for example, some of Elon Musk's initiatives). The program's finishing touch comes courtesy of our valued listeners, who are especially generous today. Heartfelt thanks for being a walker after the storm.

Chistes y Mas!!
S2 Battle of Brains, Part 2

Sep 11, 2024 · 56:12

The battle of neurons continues, hehe.

Chistes y Mas!!
S2 Battle of Brains, Part 1

Sep 4, 2024 · 39:21

Julio and El Marciano face off, brain against brain.

SER Málaga
Do we have two brains? The gut-brain axis and its link to mental illness, with Macu Infantes

Aug 27, 2024 · 12:06


MLP - Me lo Platicaron
MLP - SMALL BRAINS, BIG SCREENS!

Aug 12, 2024 · 18:50

I want you to think about this from the brain's perspective. Watching a fast-paced show is like trying to keep up with a marathon without having trained. The child's brain is trying to process everything happening on the screen, and that can be exhausting. Then, when the show ends, the brain keeps running at that frantic pace, but there is no longer a screen to follow, which leaves it a bit disoriented. In "Small Brains, Big Screens" we dive into the fascinating world of children and how screens shape those developing little brains. #Podcast #MeLoPlaticaron #Podcastshow #Ciencia #TV #Niños #ControlParental --- Support this podcast: https://podcasters.spotify.com/pod/show/me-lo-platicaron/support

10 minutos con Sami
Stock market crash, Google antitrust, and quantum brains

Aug 6, 2024 · 5:35

In today's episode of "10 Minutos con Sami" we explore three big stories shaking the world. We start with the dramatic fall of global stock markets, including the Japanese Nikkei index's worst plunge since 1987, amid growing fears of a recession in the United States. Next, we analyze the historic court ruling against Google, found guilty of violating antitrust law in the online search market, which could radically reshape the tech landscape. Finally, we dive into the fascinating world of quantum physics and its possible relationship to human consciousness, discussing the latest research on quantum entanglement in the brain and how this theory could revolutionize our understanding of the mind. Join us on this journey through the global economy, technology, and cutting-edge science as we unpack the implications of these momentous events.
Sources: https://www.latimes.com/business/story/2024-08-05/global-stock-markets-plunge-us-economy , https://www.pbs.org/newshour/economy/dow-drops-nearly-1000-points-and-japanese-stocks-suffer-worst-crash-since-1987-on-economy-fears , https://www.whro.org/2024-08-05/google-loses-massive-antitrust-case-over-its-search-dominance , https://thequantuminsider.com/2024/08/03/researchers-explore-quantum-entanglements-potential-role-in-neural-synchronization/ , https://thedebrief.org/a-quantum-brain-could-solve-the-hard-problem-of-consciousness-new-research-suggests/
Social: You can find me on Threads, Twitter, and Instagram as @olivernabani, and usually on Twitch: http://twitch.tv/olivernabani. This podcast and other original content are on YouTube: https://youtube.com/olivernabani. If you want to join the mashain community, we have a Discord server where we share what's on our minds: https://discord.gg/7M2SEfbF, a Telegram channel where I announce news and content: https://t.me/sedicemashain, and a WhatsApp channel: https://whatsapp.com/channel/0029VaCSKOzFCCoavMoLwX43. And most importantly, remember: you don't say "Machine", you say "Mashain".

Buenos Días América
September 11: masterminds of the attack reach an unexpected deal

Aug 1, 2024 · 59:08

In an unexpected twist, the masterminds of the September 11 terrorist attacks have reached a plea deal that has generated controversy and surprise around the world. More than two decades after the tragic events, the agreement raises new questions about justice, security, and closure for one of the darkest chapters of recent history. Catch up on the latest news with the best analysis from our invited specialists. Put on your headphones and listen to the Buenos Días América podcast on the Uforia App, Apple Podcasts, YouTube, Spotify, or wherever you listen to podcasts.

Radio Albacete
A UCLM study using brains advances Alzheimer's research and opens new avenues for early detection

Jul 24, 2024 · 25:12


Hijos de la Resistencia
HAPPY BRAINS - Available until July 31

Jul 19, 2024 · 38:02

Send your application to train with us here: https://www.hijosdelaresistencia.com/formulario. Full information here: https://hijosdelaresistencia.com/entrena-en-hijos-de-la-resistencia/

SER Málaga
Words backwards, brains forwards: the knack for reversing speech

Jul 16, 2024 · 13:33


Geek-Tech Shorts
PixxelCast 109 - Get Your Spanish 'Pajaporte' and Nintendo Strikes Again

Jul 5, 2024 · 100:33

Subscribe for more: https://www.youtube.com/c/pixxelers Follow me on social media: https://linktr.ee/jlrock92 Discord: https://discord.gg/EFkfqhMZDU
NOTES:
- Steam Summer Sale: https://store.steampowered.com/
- Spanish 'Pajaporte'
- Donkey Kong: https://tinyurl.com/2nk9r5y2
- Nintendo vs pirates: https://tinyurl.com/4b4ceax5
- Xbox Keystone: https://tinyurl.com/29juu3fc
- Layoffs that aren't layoffs: https://tinyurl.com/yn86fzkr
- Japanese layoffs: https://tinyurl.com/nbjee3m4
- Dead Rising: https://www.youtube.com/live/rd83elIk_aI
- Intel optical CPU: https://tinyurl.com/9zpfe4cn
- Apple RCS: https://tinyurl.com/msaay4d9
- Apple Pay charges: https://tinyurl.com/5dw8xfuj
- Russian bots: https://tinyurl.com/2s47p5he
- EU vs Meta: https://tinyurl.com/yv46htct
- Pixel 9 leaked: https://tinyurl.com/ycxjf2dd
- Artificial brains: https://tinyurl.com/yedp4wwy
- Fiber-optic record: https://tinyurl.com/66auzebf
- Thunderbolt 5: https://tinyurl.com/24ch5xx6
- CPUs without electricity: https://tinyurl.com/42xhxh6j
- Catholic video game: https://youtu.be/l-aFFqIQ02U

El Podcast Fitness de FullMusculo
Clip: Technology Is WEAKENING Our BRAINS

Jun 27, 2024 · 12:04

Listen to the full episode here: https://open.spotify.com/episode/6hyt0sfRFK89gJX5kSGiOt?si=ZFybiQWURBe3pYsTGoT5cA Watch the full episode here: https://youtu.be/LIEO6qGUWcA Technology and medicine have boosted our survival, yet our behaviors often remain primitive. We stay up late in front of screens, disconnected from nature, despite the abundance of information available. A biohacking expert explains that we now live in an era of "infoxication": overloaded with data but unable to apply it to improve our lives. Even so, it is crucial to combine ancestral wisdom with technological advances. Our brain seeks to avoid pain, save energy, and survive. In a world of dwindling resources, we need technology to expand and survive. The key is to use technology to support us in reaching healthy longevity, integrating ancestral practices with modern knowledge, as promoted by biohacking, which seeks to maximize our human capacities for a full and happy life.

La Estrategia del Día Colombia
Brain drain, Nubank, Corficolombiana, and Bolivia

Jun 27, 2024 · 10:13

Today we talk about the talent drain accelerating across Latin America, Nubank's new business, Corficolombiana's new president, and the situation in Bolivia. https://whatsapp.com/channel/0029Va4POHt6WaKhONGGhi38

Alternativa 3
Processors based on human mini-brains

Jun 14, 2024 · 18:02

The company FinalSpark has developed the technology to build supercomputers from human mini-brains created from fetal material. The story ran in the following sources: 1) MSN https://www.msn.com/es-mx/noticias/tecnologia/empresa-suiza-conecta-16-minicerebros-humanos-para-crear-un-ordenador-vivo/ar-BB1o2qST?ocid=msedgntp&pc=ASTS&cvid=2d1a6040fc8f4d51ac13c2c959abb719&ei=22 2) DW https://www.dw.com/es/empresa-suiza-conecta-16-minicerebros-humanos-para-crear-un-ordenador-vivo/a-69337219 --- Send in a voice message: https://podcasters.spotify.com/pod/show/a3misterio/message

Creativos radio
Processors based on human mini-brains

Jun 13, 2024 · 18:02

The company FinalSpark has developed the technology to build supercomputers from human mini-brains created from fetal material. The story ran in the following sources: 1) MSN https://www.msn.com/es-mx/noticias/tecnologia/empresa-suiza-conecta-16-minicerebros-humanos-para-crear-un-ordenador-vivo/ar-BB1o2qST?ocid=msedgntp&pc=ASTS&cvid=2d1a6040fc8f4d51ac13c2c959abb719&ei=22 2) DW https://www.dw.com/es/empresa-suiza-conecta-16-minicerebros-humanos-para-crear-un-ordenador-vivo/a-69337219 --- Send in a voice message: https://podcasters.spotify.com/pod/show/creativos/message

Gana Tu Día: El Podcast
The ATTENTION DRUG: Transforming our brains and society | El GPS de tu Vida Ep. 054

Jun 7, 2024 · 12:23

The ATTENTION DRUG: Transforming our brains and society | El GPS de tu Vida Ep. 054 #ganatudia
Welcome to El GPS de tu Vida! In this episode we tackle a topic that is transforming our brains and society: the drug of attention. If you have ever felt trapped in the constant search for approval on social media, this video is for you. Discover how this addiction affects your self-esteem and how you can overcome it to live a more authentic and fulfilling life.
To hear about upcoming events and coaching opportunities, join our community here: https://ganatudia.us4.list-manage.com/subscribe?u=a6dbd1203de8cd57a2bcd1122&id=03a9ced497
To join our mission of impacting 1,000,000 people, leave your information here: https://forms.gle/HrFRmr4sDZB2wHbv9
For coaching with Carlos, or to join the community, details are here: https://linktr.ee/CarlosFigueroa
Important announcement at the end of the episode!
Visit our website: https://construyeturuta.com/
In this episode we explore how the search for validation and approval on social media can act as a potent drug. Here is everything you need to know:
- Definition and problems: we are not only talking about substances we ingest; the need for validation is a powerful drug that affects our brain and emotions.
- The impact of dopamine: dopamine produces an emotional crash stronger than the initial high, affecting our self-esteem and self-concept.
- The cure for this drug: learn why you don't need external validation to feel successful, and how many people with lots of likes and follows feel empty.
Don't forget to like this video, subscribe to our channel, and hit the bell to get notified about our upcoming videos on personal development and habits.
Carlos's social media: http://www.tiktok.com/carlosefigueroapr http://www.instagram.com/carlosefigueroa
Gana Tu Día's social media: http://www.instagram.com/ganatudia http://www.tiktok.com/ganatudia http://www.ganatudia.com info@ganatudia.com

Muy Interesante - Grandes Reportajes
Synchronized minds: do our brains synchronize? (Neurology)

Jun 4, 2024 · 12:16

During conversations and teamwork, the rhythm of the neural waves of the people talking, interacting, or collaborating becomes coordinated, physically falling into step, so that the brains of a conversation's speaker and listener, or of the participants in a shared task, synchronize with each other to better process sensory information. How is it possible that people's brains synchronize to what they hear, adjusting their neural rhythms to auditory stimuli? "Speech is a stimulus that unfolds in time, with intervals that carry content and intervals of silence. This alternation has a quasi-rhythmic pattern, at the level of words as well as syllables. Our brain is sensitive to this pattern and invests its resources only in the moments when the speech carries content (that is, for example, the words)." Use the code CIENCIADIGITAL to get your discount on Muy Interesante via this link: https://bit.ly/3TYwx9a. Leave us a comment on iVoox or Spotify, or write to us at podcast@zinetmedia.es. Can you help us? Share our content on social media. Text: Henar L Senovilla. Direction, narration, and production: Iván Patxi Gómez Gallego @ivanpatxi. Podcast advertising contact: podcast@zinetmedia.es

El Faro
El Faro | El Repor de Irene | Nearly 10,000 human brains are hidden in the basement of a Danish university

Jun 4, 2024 · 4:12

The basement of the University of Southern Denmark could easily have come from a horror film. Hidden there, in jars of formalin, is the world's largest collection of human brains: 9,479 in all, removed from psychiatric patients between 1945 and 1982 in the belief that one day, thanks to advances in science, researchers could use them to discover how to diagnose and treat mental disorders. In today's more modern society, however, the collection does not escape ethical dilemmas.

La rosa de los vientos
In Egypt they operated on cancerous brains thousands of years ago!

Jun 3, 2024 · 76:35

Edgard Camarós, archaeologist and professor of prehistory at the University of Santiago, gives all the details about the skull in which Egyptian physicians made incisions around cancerous tumors. Also: the canoe the Maya used in their rituals; the millennial influencer on the verge of being canonized; the discovery of a branch of the Nile that could explain the construction of the pyramids; Irene Gut, the Polish nurse who saved several Jews by hiding them in the house of a Nazi officer; a fake lama and a fake psychic, two cases brought before the courts; a curious and strange medieval disease with high mortality; and, as a surprise "bonus track", a segment on animal behavior.

Masaje cerebral
LETTERS AND BRAINS

May 31, 2024 · 62:08

We explore the fertile gap between neurology, psychiatry, and literature in the company of two of our favorite physician-writers: Oliver Sacks and Orlando Mondragón.

Hoy empieza todo 2
Hoy empieza todo 2 - 'Los Planetas', 'Cerebros Imperfectos', and 'Suicidio, el dolor invisible' - 22/05/24

May 22, 2024 · 118:49

We open with the Noticias Oh!Cultas, talking with José Manuel Sebastián and Gus Iglesias about the new podcast they are presenting for Radio3extra, 'Los Planetas, una nueva dimensión'. We continue with a new installment of GAME-BOYS with illustrator and swiftie Patricia Agüero, discussing her book 'Taylor Swift: La era de la generación swiftie'. Then a podcast round table with Manuel Bartual, Conchi Cejudo, Constan Sotoca, and Teresa Familiar. We close with Aloma Rodríguez's Barra Libre, which brings us Iris Murdoch and her book 'Algo del otro mundo'.

Hoy empieza todo 2
Hoy empieza todo 2 - Podcasts with Manuel Bartual, Conchi Cejudo, and Cerebros Imperfectos - 22/05/2024

May 22, 2024 · 27:58

We talk about the world of podcasts with screenwriter Manuel Bartual, with Conchi Cejudo, creator of the docuseries 'Suicidio, el dolor invisible', and with Constan Sotoca and Teresa Familiar, authors of the podcast 'Cerebros Imperfectos'.

Daily Easy Spanish
The mystery of the brains that have been naturally preserved for thousands of years

Apr 9, 2024 · 18:11

According to researchers, they could offer clues about our past for the benefit of our future.

Fernanda Familiar
Brain drain in the era of the 4T - Dr. Andrew Almazán

Apr 7, 2024 · 10:55

Social media: Instagram Facebook Twitter (X) --- Send in a voice message: https://podcasters.spotify.com/pod/show/geniofm/message

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Latent Space Chats: NLW (Four Wars, GPT5), Josh Albrecht/Ali Rohde (TNAI), Dylan Patel/Semianalysis (Groq), Milind Naphade (Nvidia GTC), Personal AI (ft. Harrison Chase — LangFriend/LangMem)

Apr 6, 2024 · 121:17


Our next 2 big events are AI UX and the World's Fair. Join and apply to speak/sponsor! Due to timing issues we didn't have an interview episode to share with you this week, but not to worry, we have more than enough "weekend special" content in the backlog for you to get your Latent Space fix, whether you like thinking about the big picture, or learning more about the pod behind the scenes, or talking Groq and GPUs, or AI Leadership, or Personal AI. Enjoy!

AI Breakdown: The indefatigable NLW had us back on his show for an update on the Four Wars, covering Sora, Suno, and the reshaped GPT-4 Class Landscape, and a longer segment on AI Engineering trends covering the future LLM landscape (Llama 3, GPT-5, Gemini 2, Claude 4), Open Source Models (Mistral, Grok), Apple and Meta's AI strategy, new chips (Groq, MatX), and the general movement from baby AGIs to vertical Agents.

Thursday Nights in AI: We're also including swyx's interview with Josh Albrecht and Ali Rohde to reintroduce swyx and Latent Space to a general audience, and engage in some spicy Q&A.

Dylan Patel on Groq: We hosted a private event with Dylan Patel of SemiAnalysis (our last pod here). Not all of it could be released, so we just talked about our Groq estimates.

Milind Naphade - Capital One: In relation to conversations at NeurIPS and Nvidia GTC, and upcoming at the World's Fair, we also enjoyed chatting with Milind Naphade about his AI Leadership work at IBM, Cisco, Nvidia, and now leading the AI Foundations org at Capital One.
We covered:
* Milind's learnings from ~25 years in machine learning
* His first paper citation was 24 years ago
* Lessons from working with Jensen Huang for 6 years and being CTO of Metropolis
* Thoughts on relevant AI research
* GTC takeaways and what makes NVIDIA special
If you'd like to work on building solutions rather than platform (as Milind put it), his Applied AI Research team at Capital One is hiring, which falls under the Capital One Tech team.

Personal AI Meetup: It all started with a meme. Within days of each other, BEE, FRIEND, EmilyAI, Compass, Nox, and LangFriend were all launching personal AI wearables and assistants. So we decided to put together the world's first Personal AI meetup featuring creators and enthusiasts of wearables. The full video is live now, with full show notes within.

Timestamps:
* [00:01:13] AI Breakdown Part 1
* [00:02:20] Four Wars
* [00:13:45] Sora
* [00:15:12] Suno
* [00:16:34] The GPT-4 Class Landscape
* [00:17:03] Data War: Reddit x Google
* [00:21:53] Gemini 1.5 vs Claude 3
* [00:26:58] AI Breakdown Part 2
* [00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4
* [00:31:11] Open Source Models - Mistral, Grok
* [00:34:13] Apple MM1
* [00:37:33] Meta's $800b AI rebrand
* [00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents
* [00:47:28] Adept episode - Screen Multimodality
* [00:48:54] Top Model Research from January Recap
* [00:53:08] AI Wearables
* [00:57:26] Groq vs Nvidia month - GPU Chip War
* [01:00:31] Disagreements
* [01:02:08] Summer 2024 Predictions
* [01:04:18] Thursday Nights in AI - swyx
* [01:33:34] Dylan Patel - Semianalysis + Latent Space Live Show
* [01:34:58] Groq

Transcript:
[00:00:00] swyx: Welcome to the Latent Space Podcast Weekend Edition. This is Charlie, your AI co host. Swyx and Alessio are off for the week, making more great content. We have exciting interviews coming up with Elicit, Chroma, Instructor, and our upcoming series on NSFW, Not Safe for Work AI.
In today's episode, we're collating some of Swyx and Alessio's recent appearances, all in one place for you to find.[00:00:32] swyx: In part one, we have our first crossover pod of the year. In our listener survey, several folks asked for more thoughts from our two hosts. In 2023, Swyx and Alessio did crossover interviews with other great podcasts like the AI Breakdown, Practical AI, Cognitive Revolution, Thursday Eye, and Chinatalk, all of which you can find in the Latentspace About page.[00:00:56] swyx: NLW of the AI Breakdown asked us back to do a special on the 4Wars framework and the AI engineer scene. We love AI Breakdown as one of the best examples Daily podcasts to keep up on AI news, so we were especially excited to be back on Watch out and take[00:01:12] NLW: care[00:01:13] AI Breakdown Part 1[00:01:13] NLW: today on the AI breakdown. Part one of my conversation with Alessio and Swix from Latent Space.[00:01:19] NLW: All right, fellas, welcome back to the AI Breakdown. How are you doing? I'm good. Very good. With the last, the last time we did this show, we were like, oh yeah, let's do check ins like monthly about all the things that are going on and then. Of course, six months later, and, you know, the, the, the world has changed in a thousand ways.[00:01:36] NLW: It's just, it's too busy to even, to even think about podcasting sometimes. But I, I'm super excited to, to be chatting with you again. I think there's, there's a lot to, to catch up on, just to tap in, I think in the, you know, in the beginning of 2024. 
And, and so, you know, we're gonna talk today about just kind of a, a, a broad sense of where things are in some of the key battles in the AI space.[00:01:55] NLW: And then the, you know, one of the big things that I, that I'm really excited to have you guys on here for us to talk about where, sort of what patterns you're seeing and what people are actually trying to build, you know, where, where developers are spending their, their time and energy and, and, and any sort of, you know, trend trends there, but maybe let's start I guess by checking in on a framework that you guys actually introduced, which I've loved and I've cribbed a couple of times now, which is this sort of four wars of the, of the AI stack.[00:02:20] Four Wars[00:02:20] NLW: Because first, since I have you here, I'd love, I'd love to hear sort of like where that started gelling. And then and then maybe we can get into, I think a couple of them that are you know, particularly interesting, you know, in the, in light of[00:02:30] swyx: some recent news. Yeah, so maybe I'll take this one. So the four wars is a framework that I came up around trying to recap all of 2023.[00:02:38] swyx: I tried to write sort of monthly recap pieces. And I was trying to figure out like what makes one piece of news last longer than another or more significant than another. And I think it's basically always around battlegrounds. Wars are fought around limited resources. And I think probably the, you know, the most limited resource is talent, but the talent expresses itself in a number of areas.[00:03:01] swyx: And so I kind of focus on those, those areas at first. So the four wars that we cover are the data wars, the GPU rich, poor war, the multi modal war, And the RAG and Ops War. And I think you actually did a dedicated episode to that, so thanks for covering that. Yeah, yeah.[00:03:18] NLW: Not only did I do a dedicated episode, I actually used that.[00:03:22] NLW: I can't remember if I told you guys. 
I did give you big shoutouts. But I used it as a framework for a presentation at Intel's big AI event that they hold each year, where they have all their folks who are working on AI internally. And it totally resonated. That's amazing. Yeah, so, so, what got me thinking about it again is specifically this inflection news that we recently had, this sort of, you know, basically, I can't imagine that anyone who's listening wouldn't have thought about it, but, you know, inflection is a one of the big contenders, right?[00:03:53] NLW: I think probably most folks would have put them, you know, just a half step behind the anthropics and open AIs of the world in terms of labs, but it's a company that raised 1. 3 billion last year, less than a year ago. Reed Hoffman's a co founder Mustafa Suleyman, who's a co founder of DeepMind, you know, so it's like, this is not a a small startup, let's say, at least in terms of perception.[00:04:13] NLW: And then we get the news that basically most of the team, it appears, is heading over to Microsoft and they're bringing in a new CEO. And you know, I'm interested in, in, in kind of your take on how much that reflects, like hold aside, I guess, you know, all the other things that it might be about, how much it reflects this sort of the, the stark.[00:04:32] NLW: Brutal reality of competing in the frontier model space right now. And, you know, just the access to compute.[00:04:38] Alessio: There are a lot of things to say. So first of all, there's always somebody who's more GPU rich than you. So inflection is GPU rich by startup standard. I think about 22, 000 H100s, but obviously that pales compared to the, to Microsoft.[00:04:55] Alessio: The other thing is that this is probably good news, maybe for the startups. It's like being GPU rich, it's not enough. You know, like I think they were building something pretty interesting in, in pi of their own model of their own kind of experience. 
But at the end of the day, you're the interface that people consume as end users.[00:05:13] Alessio: It's really similar to a lot of the others. So and we'll tell, talk about GPT-4 and Claude 3 and all this stuff. GPU poor, doing something that the GPU rich are not interested in, you know, we just had our AI center of excellence at Decibel and one of the AI leads at one of the big companies was like, Oh, we just saved 10 million and we use these models to do a translation, you know, and that's it.[00:05:39] Alessio: It's not, it's not AGI, it's just translation. So I think like the Inflection part is maybe a calling and an awakening to a lot of startups to then say, Hey, you know, trying to get as much capital as possible, trying to get as many GPUs as possible. Good. But at the end of the day, it doesn't build a business, you know, and maybe what Inflection, I don't, I don't, again, I don't know the reasons behind the Inflection choice, but if you say, I don't want to build my own company that has $1.[00:06:05] Alessio: 3 billion and I want to go do it at Microsoft, it's probably not a resources problem. It's more of strategic decisions that you're making as a company. So yeah, that was kind of my, my take on it.[00:06:15] swyx: Yeah, and I guess on my end, two things actually happened yesterday. It was a little bit quieter news, but Stability AI had some pretty major departures as well.[00:06:25] swyx: And you may not be considering it, but Stability is actually also a GPU rich company in the sense that they were the first new startup in this AI wave to brag about how many GPUs that they have. And you should join them. And you know, Emad is definitely a GPU trader in some sense from his hedge fund days.[00:06:43] swyx: So Robin Rombach and like most of the Stable Diffusion 3 people left Stability yesterday as well. So yesterday was kind of like a big news day for the GPU rich companies, both Inflection and Stability having sort of wind taken out of their sails.
I think, yes, it's a data point in favor of, like, just because you have the GPUs doesn't mean you, you automatically win.[00:07:03] swyx: And I think, you know, kind of I'll echo what Alessio says there. But in general also, like, I wonder if this is like the start of a major consolidation wave, just in terms of, you know, I think that there was a lot of funding last year and, you know, the business models have not been, you know, all of these things have not worked out very well.[00:07:19] swyx: Even Inflection couldn't do it. And so I think maybe that's the start of a small consolidation wave. I don't think that's like a sign of AI winter. I keep looking for AI winter coming. I think this is kind of like a brief cold front. Yeah,[00:07:34] NLW: it's super interesting. So I think a bunch of, a bunch of stuff here.[00:07:38] NLW: One is, I think, to both of your points, there, in some ways, there, there had already been this very clear demarcation between these two sides where, like, the GPU poors, to use the terminology, like, just weren't trying to compete on the same level, right? You know, the vast majority of people who have started something over the last year, year and a half, call it, were racing in a different direction.[00:07:59] NLW: They're trying to find some edge somewhere else. They're trying to build something different. If they're, if they're really trying to innovate, it's in different areas. And so it's really just this very small handful of companies that are in this like very, you know, it's like the Coheres and Jaspers of the world that like this sort of, you know, that are, that are just sort of a little bit less resourced than, you know, than the other set. I think that this potentially even applies to, you know, everyone else, that you could clearly demarcate it into these two, two sides.[00:08:26] NLW: And there's only a small handful kind of sitting uncomfortably in the middle, perhaps.
Let's, let's come back to the idea of, of the sort of AI winter or, you know, a cold front or anything like that. So this is something that I, I spent a lot of time kind of thinking about and noticing. And my perception is that the vast majority of the folks who are trying to call for sort of, you know, a trough of disillusionment or, you know, a shifting of the phase to that, are people who either, A, just don't like AI for some other reason, there's plenty of that, you know, people who are saying, Look, they're doing way worse than they ever thought.[00:09:03] NLW: You know, there's a lot of sort of confirmation bias kind of thing going on. Or two, media that just needs a different narrative, right? Because they're sort of sick of, you know, telling the same story. Same thing happened last summer, when every outlet jumped on the ChatGPT had its first down month story to try to really like kind of hammer this idea that the hype was too much.[00:09:24] NLW: Meanwhile, you have, you know, just ridiculous levels of investment from enterprises, you know, coming in. You have, you know, huge, huge volumes of, you know, individual behavior change happening. But I do think that there's nothing incoherent, sort of to your point, Swyx, about that and the consolidation period.[00:09:42] NLW: Like, you know, if you look right now, for example, there are, I don't know, probably 25 or 30 credible, like, build-your-own-chatbot platforms that, you know, a lot of which have, you know, raised funding. There's no universe in which all of those are successful across, you know, even with a, even, even with a total addressable market of every enterprise in the world, you know, you're just inevitably going to see some amount of consolidation.[00:10:08] NLW: Same with, you know, image generators. There are, if you look at A16Z's top 50 consumer AI apps, just based on, you know, web traffic or whatever, there are still, like, I don't know, a half dozen or 10 or something, like, some ridiculous number of, like, basically things like Midjourney or DALL-E 3. And it just seems impossible that we're gonna have that many, you know, ultimately as, as, as sort of, you know, going, going concerns.[00:10:33] NLW: So, I don't know. I, I, I think that there will be inevitable consolidation 'cause, you know, it's, it's also what kind of like venture rounds are supposed to do. You're not, not everyone who gets a seed round is supposed to get to series A and not everyone who gets a series A is supposed to get to series B.[00:10:46] NLW: That's sort of the natural process. I think it will be tempting for a lot of people to try to infer from that something about AI not being as sort of big or as, as sort of relevant as, as it was hyped up to be. But I, I kind of think that's the wrong conclusion to come to.[00:11:02] Alessio: I, I would say the experimentation[00:11:04] Alessio: surface is a little smaller for image generation. So if you go back maybe six, nine months, most people will tell you, why would you build a coding assistant when like Copilot and GitHub are just going to win everything because they have the data and they have all the stuff. If you fast forward today, a lot of people use Cursor, everybody was excited about the Devin release on Twitter.[00:11:26] Alessio: There are a lot of different ways of attacking the market that are not completion of code in the IDE. And even Cursor, like they evolved beyond single line to like chat, to do multi line edits and, and all that stuff. Image generation, I would say, yeah, just from what I've seen, like maybe the product innovation has slowed down at the UX level and people are improving the models.[00:11:50] Alessio: So the race is like, how do I make better images? It's not like, how do I make the user interact with the generation process better? And that gets tough, you know? It's hard to like really differentiate yourselves.
So yeah, that's kind of how I look at it. And when we think about multimodality, maybe the reason why people got so excited about Sora is like, oh, this is like a completely... It's not a better image model.[00:12:13] Alessio: This is like a completely different thing, you know? And I think the creative mind is always looking for something that impacts the viewer in a different way, you know, like they really want something different, versus the developer mind. It's like, Oh, I, I just, I have this like very annoying thing I want better.[00:12:32] Alessio: I have these like very specific use cases that I want to go after. So it's just different. And that's why you see a lot more companies in image generation. But I agree with you that if you fast forward, there, there's not going to be 10 of them, you know, it's probably going to be one or[00:12:46] swyx: two. Yeah, I mean, to me, that's why I call it a war.[00:12:49] swyx: Like, individually, all these companies can make a story that kind of makes sense, but collectively, they cannot all be true. Therefore, they all, there is some kind of fight over limited resources here. Yeah, so[00:12:59] NLW: it's interesting.
We wandered very naturally into sort of another one of these wars, which is the multimodality kind of idea, which is, you know, basically a question of whether it's going to be these sort of big everything models that end up winning or whether, you know, you're going to have really specific things, you know, like something, you know, DALL-E 3 inside of sort of OpenAI's larger models versus, you know, a Midjourney or something like that.[00:13:24] NLW: And at first, you know, I was kind of thinking like, for most of the last, call it six months or whatever, it feels pretty definitively both, and in some ways, you know, in that you're, you're seeing just like great innovation on sort of the everything models, but you're also seeing lots and lots happen at sort of the level of kind of individual use cases.[00:13:45] Sora[00:13:45] NLW: But then Sora comes along and just like obliterates what I think anyone thought, you know, where we were when it comes to video generation. So how are you guys thinking about this particular battle or war at the moment?[00:13:59] swyx: Yeah, this was definitely a both and story, and Sora tipped things one way for me, in terms of scale being all you need.[00:14:08] swyx: And the benefit, I think, of having multiple models being developed under one roof. I think a lot of people aren't aware that Sora was developed in a similar fashion to DALL-E 3. And DALL-E 3 had a very interesting paper out where they talked about how they sort of bootstrapped their synthetic data based on GPT-4 Vision and GPT-4.[00:14:31] swyx: And, and it was just all, like, really interesting, like, if you work on one modality, it enables you to work on other modalities, and all that is more, is, is more interesting.
I think it's beneficial if it's all in the same house, whereas the individual startups who don't, who sort of carve out a single modality and work on that, definitely won't have the state of the art stuff on helping them out on synthetic data.[00:14:52] swyx: So I do think like the balance is tilted a little bit towards the God model companies, which is challenging for the, for the sort of dedicated modality companies. But everyone's carving out different niches. You know, like we just interviewed Suno AI, the sort of music model company, and, you know, I don't see OpenAI pursuing music anytime soon.[00:15:12] Suno[00:15:12] swyx: Yeah,[00:15:13] NLW: Suno's been phenomenal to play with. Suno has done that rare thing where, which I think a number of different AI product categories have done, where people who don't consider themselves particularly interested in doing the thing that the AI enables find themselves doing a lot more of that thing, right?[00:15:29] NLW: Like, it'd be one thing if just musicians were excited about Suno and using it, but what you're seeing is tons of people who just like music all of a sudden like playing around with it and finding themselves kind of down that rabbit hole, which I think is kind of like the highest compliment that you can give one of these startups at the[00:15:45] swyx: early days of it.[00:15:46] swyx: Yeah, I, you know, I, I asked them directly, you know, in the interview about whether they consider themselves Midjourney for music. And he had a more sort of nuanced response there, but I think that probably the business model is going to be very similar because he's focused on the B2C element of that.
So yeah, I mean, you know, just to, just to tie back to the question about, you know, large multi modality companies versus small dedicated modality companies.[00:16:10] swyx: Yeah, highly recommend people to read the Sora blog posts and then read through to the DALL-E blog posts, because they, they strongly correlated themselves with the same synthetic data bootstrapping methods as DALL-E. And I think once you make those connections, you're like, oh, like it, it, it is beneficial to have multiple state of the art models in house that all help each other.[00:16:28] swyx: And these, this, that's the one thing that a dedicated modality company cannot do.[00:16:34] The GPT-4 Class Landscape[00:16:34] NLW: So I, I wanna jump, I wanna kind of build off that and, and move into the sort of like updated GPT-4 class landscape. 'Cause that's obviously been another big change over the last couple months. But for the sake of completeness, is there anything that's worth touching on with, with sort of the[00:16:46] NLW: quality data or sort of the RAG and Ops wars, just in terms of, you know, anything that's changed, I guess, for you fundamentally in the last couple of months about where those things stand.[00:16:55] swyx: So I think we're going to talk about RAG for the Gemini and Claude discussion later. And so maybe briefly discuss the data piece.[00:17:03] Data War: Reddit x Google[00:17:03] swyx: I think maybe the only new thing was this Reddit deal with Google for like a 60 million dollar deal just ahead of their IPO, very conveniently turning Reddit into an AI data company. Also, very, very interestingly, a non exclusive deal, meaning that Reddit can resell that data to someone else. And it probably does become table stakes.[00:17:23] swyx: A lot of people don't know, but a lot of the WebText dataset that originally started for GPT 1, 2, and 3 was actually scraped from Reddit, at least the sort of vote scores.
And I think, I think that's a, that's a very valuable piece of information. So like, yeah, I think people are figuring out how to pay for data.[00:17:40] swyx: People are suing each other over data. This, this, this war is, you know, definitely very, very much heating up. And I don't think, I don't see it getting any less intense. I, you know, next to GPUs, data is going to be the most expensive thing in, in a model stack company. And, you know, a lot of people are resorting to synthetic versions of it, which may or may not be kosher based on how far along or how commercially blessed the, the forms of creating that synthetic data are.[00:18:11] swyx: I don't know if Alessio, you have any other interactions with like data source companies, but that's my two cents.[00:18:17] Alessio: Yeah, yeah, I actually saw Quentin Anthony from EleutherAI at GTC this week. He's also been working on this. I saw Teknium. He's also been working on the data side. I think especially in open source, people are like, okay, if everybody is putting the gates up, so to speak, to the data, we need to make it easier for people that don't have 50 million a year to get access to good data sets.[00:18:38] Alessio: And Jensen, at his keynote, he did talk about synthetic data a little bit. So I think that's something that we'll definitely hear more and more of in the enterprise, which never bodes well, because then all the, all the people with the data are like, Oh, the enterprises want to pay now? Let me, let me put a pay here Stripe link so that they can give me 50 million.[00:18:57] Alessio: But it worked for Reddit. I think the stock is up 40 percent today after opening. So yeah, I don't know if it's all about the Google deal, but it's obviously Reddit has been one of those companies where, hey, you got all this like great community, but like, how are you going to make money? And like, they try to sell the avatars.[00:19:15] Alessio: I don't know if that it's a great business for them.
The, the data part sounds, as an investor, you know, the data part sounds a lot more interesting than, than consumer[00:19:25] swyx: cosmetics. Yeah, so I think, you know, there's more questions around data, you know, I think a lot of people are talking about the interview that Mira Murati did with the Wall Street Journal, where she, like, just basically had no, had no good answer for where they got the data for Sora.[00:19:39] swyx: I, I think this is where, you know, there's, it's in nobody's interest to be transparent about data, and it's, it's kind of sad for the state of ML and the state of AI research, but it is what it is. We, we have to figure this out as a society, just like we did for music and music sharing, you know, in, in sort of the Napster to Spotify transition, and that might take us a decade.[00:19:59] swyx: Yeah, I[00:20:00] NLW: do. I, I agree. I think, I think that you're right to identify it, not just as that sort of technical problem, but as one where society has to have a debate with itself. Because I think that if you look rationally within it, there are great kind of points on all sides, not to be the sort of, you know, person who sits in the middle constantly, but it's why I think a lot of these legal decisions are going to be really important because, you know, the job of judges is to listen to all this stuff and try to come to things and then have other judges disagree.[00:20:24] NLW: And, you know, and have the rest of us all debate at the same time. By the way, as a total aside, I feel like the synthetic data right now is like eggs in the 80s and 90s. Like, whether they're good for you or bad for you, like, you know, we, we get one study that's like synthetic data, you know, there's model collapse.[00:20:42] NLW: And then we have like a hint that Llama, you know, the most high performance version of it, which was one they didn't release, was trained on synthetic data. So maybe it's good.
It's like, I just feel like every, every other week I'm seeing something sort of different about whether it's good or bad for, for these models.[00:20:56] swyx: Yeah. The branding of this is pretty poor. I would kind of tell people to think about it like cholesterol. There's good cholesterol, bad cholesterol. And you can have, you know, good amounts of both. But at this point, it is absolutely without a doubt that most large models from here on out will all be trained on some kind of synthetic data, and that is not a bad thing.[00:21:16] swyx: There are ways in which you can do it poorly. Whether it's commercial, you know, in terms of commercial sourcing, or in terms of the model performance. But it's without a doubt that good synthetic data is going to help your model. And this is just a question of like where to obtain it and what kinds of synthetic data are valuable.[00:21:36] swyx: You know, if, even like AlphaGeometry, you know, was, was a really good example from like earlier this year.[00:21:42] NLW: If you're using the cholesterol analogy, then my, then my egg thing can't be that far off. Let's talk about the sort of the state of the art and the, and the GPT-4 class landscape and how that's changed.[00:21:53] Gemini 1.5 vs Claude 3[00:21:53] NLW: 'Cause obviously, you know, sort of the, the two big things, or a couple of the big things that have happened since we last talked, were one, you know, Gemini first announcing that a model was coming and then finally it arriving, and then very soon after, a sort of a different model arriving from Gemini, and, and Claude 3.[00:22:11] NLW: So I guess, you know, I'm not sure exactly where the right place to start with this conversation is, but, you know, maybe very broadly speaking, which of these do you think have made a bigger impact? Thank you.[00:22:20] Alessio: Probably the one you can use, right? So, Claude.
Well, I'm sure Gemini is going to be great once they let me in, but so far I haven't been able to.[00:22:29] Alessio: I use, so I have this small podcaster thing that I built for our podcast, which does chapters creation, like named entity recognition, summarization, and all of that. Claude 3 is better than GPT-4. Claude 2 was unusable. So I used GPT-4 for everything. And then when Opus came out, I tried them again side by side and I posted it on, on Twitter as well.[00:22:53] Alessio: Claude is better. It's very good, you know, it's much better, it seems to me, it's much better than GPT-4 at doing writing that is more, you know, I don't know, it just got good vibes, you know, like the GPT-4 text, you can tell it's like GPT-4, you know, it's like, it always uses certain types of words and phrases and, you know, maybe it's just me because I've now done it for, you know... So, I've read like 75, 80 generations of these things next to each other.[00:23:21] Alessio: Claude is really good. I know everybody is freaking out on Twitter about it, my only experience of this 'it's much better' has been on the podcast use case. But I know that, you know, Karan from, from Nous Research is a very big Opus, pro-Opus person. So, I think that's also, it's great to have people that actually care about other models.[00:23:40] Alessio: You know, I think so far to a lot of people, maybe Anthropic has been the sibling in the corner, you know, it's like Claude releases a new model and then OpenAI releases Sora and like, you know, there are like all these different things, but yeah, the new models are good. It's interesting.[00:23:55] NLW: My, my perception is definitely that just, just observationally, Claude 3 is certainly the first thing that I've seen where lots of people.
They're talking about the specific use cases that they have, that they used to use chat GPT for every day, you know, day in, day out, that they've now just switched over. And that has, I think, shifted a lot of the sort of like vibe and sentiment in the space too.[00:24:26] NLW: And I don't necessarily think that it's sort of a A like full you know, sort of full knock. Let's put it this way. I think it's less bad for open AI than it is good for anthropic. I think that because GPT 5 isn't there, people are not quite willing to sort of like, you know get overly critical of, of open AI, except in so far as they're wondering where GPT 5 is.[00:24:46] NLW: But I do think that it makes, Anthropic look way more credible as a, as a, as a player, as a, you know, as a credible sort of player, you know, as opposed to to, to where they were.[00:24:57] Alessio: Yeah. And I would say the benchmarks veil is probably getting lifted this year. I think last year. People were like, okay, this is better than this on this benchmark, blah, blah, blah, because maybe they did not have a lot of use cases that they did frequently.[00:25:11] Alessio: So it's hard to like compare yourself. So you, you defer to the benchmarks. I think now as we go into 2024, a lot of people have started to use these models from, you know, from very sophisticated things that they run in production to some utility that they have on their own. Now they can just run them side by side.[00:25:29] Alessio: And it's like, Hey, I don't care that like. The MMLU score of Opus is like slightly lower than GPT 4. It just works for me, you know, and I think that's the same way that traditional software has been used by people, right? Like you just strive for yourself and like, which one does it work, works best for you?[00:25:48] Alessio: Like nobody looks at benchmarks outside of like sales white papers, you know? And I think it's great that we're going more in that direction. 
We have an episode with Adept coming out this weekend, and in some of their model releases, they specifically say, we do not care about benchmarks, so we didn't put them in, you know, because we, we don't want to look good on them.[00:26:06] Alessio: We just want the product to work. And I think more and more people will, will[00:26:09] swyx: go that way. Yeah. I, I would say like, it does take the wind out of the sails for GPT-5, which I know we're, you know, curious about later on. I think anytime you put out a new state of the art model, you have to break through in some way.[00:26:21] swyx: And what Claude and Gemini have done is effectively take away any advantage to saying that you have a million token context window. Now everyone's just going to be like, Oh, okay. Now you just match the other two guys. And so that puts an insane amount of pressure on what GPT-5 is going to be, because all the other models are multimodal, all the other models are long context, all the other models have perfect recall. GPT-5 has to match everything and do more to, to not be a flop.[00:26:58] AI Breakdown Part 2[00:26:58] NLW: Hello friends, back again with part two. If you haven't heard part one of this conversation, I suggest you go check it out, but to be honest, they are kind of actually separable. In this conversation, we get into a topic that I think Alessio and Swyx are very well positioned to discuss, which is what developers care about right now, what people are trying to build around.[00:27:16] NLW: I honestly think that one of the best ways to see the future in an industry like AI is to try to dig deep on what developers and entrepreneurs are attracted to build, even if it hasn't made it to the news pages yet. So consider this your preview of six months from now, and let's dive in.
Let's bring it to the GPT-5 conversation.[00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4[00:27:33] NLW: I mean, so, so I think that that's a great sort of assessment of just how the stakes have been raised, you know. I mean, so I guess maybe, maybe I'll, I'll frame this less as a question, just sort of something that, that I, that I've been watching right now. The only thing that makes sense to me with how[00:27:50] NLW: fundamentally unbothered and unstressed OpenAI seems about everything is that they're sitting on something that does meet all that criteria, right? Because, I mean, even in the Lex Fridman interview that, that Altman recently did, you know, he's talking about other things coming out first. He's talking about, he's just like, he, listen, he, he's good and he could play nonchalant, you know, if he wanted to.[00:28:13] NLW: So I don't want to read too much into it, but, you know, they've had so long to work on this, like unless we are like really meaningfully running up against some constraint, it just feels like, you know, there's going to be some massive increase, but I don't know. What do you guys think?[00:28:28] swyx: Hard to speculate.[00:28:29] swyx: You know, at this point, they're, they're pretty good at PR and they're not going to tell you anything that they don't want to. And he can tell you one thing and change their minds the next day. So it's, it's, it's really, you know, I've always said that model version numbers are just marketing exercises, like they have something and it's always improving and at some point you just cut it and decide to call it GPT-5.[00:28:50] swyx: And it's more just about defining an arbitrary level at which they're ready, and it's up to them on what ready means. We definitely did see some leaks on GPT-4.5, as I think a lot of people reported, and I'm not sure if you covered it. So it seems like there might be an intermediate release.
But I did feel, coming out of the Lex Fridman interview, that GPT-5 was nowhere near.[00:29:11] swyx: And you know, it was kind of a sharp contrast to Sam talking at Davos in February, saying that, you know, it was his top priority. So I find it hard to square. And honestly, like, there's also no point reading too much into the tea leaves of what any one person says about something that hasn't happened yet or has a decision that hasn't been taken yet.[00:29:31] swyx: Yeah, that's, that's my 2 cents about it. Like, calm down, let's just build.[00:29:35] Alessio: Yeah. The, the February rumor was that they were gonna work on AI agents, so I don't know, maybe they're like, yeah,[00:29:41] swyx: they had two, I think two agent projects, right? One desktop agent and one sort of more general, yeah, sort of GPTs-like agent, and then Andrej left, so he was supposed to be the guy on that.[00:29:52] swyx: What did Andrej see? What did he see? I don't know. What did he see?[00:29:56] Alessio: I don't know. But again, it's just like the rumors are always floating around, you know, but I think like, this is, you know, we're not going to get to the end of the year without GPT-5, you know, that's definitely happening. I think the biggest question is like, are Anthropic and Google[00:30:13] Alessio: increasing the pace, you know, like is the, is Claude 4 coming out like in 12 months, like nine months? What's the, what's the deal? Same with Gemini. They went from like one to 1.5 in like five days or something. So when's Gemini 2 coming out, you know, is that going to be soon? I don't know.[00:30:31] Alessio: There, there are a lot of speculations, but the good thing is that now you can see a world in which OpenAI doesn't rule everything. You know, so that, that's the best, that's the best news that everybody got, I would say.[00:30:43] swyx: Yeah, and Mistral Large also dropped in the last month.
And, you know, not as, not quite GPT-4 class, but very good from a new startup.[00:30:52] swyx: So yeah, we, we have now slowly changed the landscape, you know. In my January recap, I was complaining that nothing's changed in the landscape for a long time. But now we do exist in a world, sort of a multipolar world, where Claude and Gemini are legitimate challengers to GPT-4, and hopefully more will emerge as well, hopefully from Meta.[00:31:11] Open Source Models - Mistral, Grok[00:31:11] NLW: So speak, let's actually talk about sort of the open source side of this for a minute. So Mistral Large, notable because it's, it's not available open source in the same way that other things are, although I think my perception is that the community largely recognizes that they want them to keep building open source stuff, and they have to find some way to fund themselves to do that.[00:31:27] NLW: And so they kind of understand that there's like, they got to figure out how to eat, but we've got, so, you know, there, there's Mistral, there's, I guess, Grok now, which is, you know, Grok-1 is from, from October, is, is open[00:31:38] swyx: sourced at, yeah. Yeah, sorry, I thought you thought you meant Groq the chip company.[00:31:41] swyx: No, no, no, yeah, you mean Twitter Grok.[00:31:43] NLW: Although Groq the chip company, I think is even more interesting in some ways, but and then there's the, you know, obviously Llama 3 is the one that sort of everyone's wondering about too.
And, you know, my, my sense of that, the little bit that, you know, Zuckerberg was talking about Llama 3 earlier this year, suggested that, at least from an ambition standpoint, he was not thinking about how do I make sure that, you know, Meta, you know, keeps, keeps the open source throne, you know, vis-à-vis Mistral.[00:32:09] NLW: He was thinking about how you go after, you know, how, how he, you know, releases a thing that's, you know, every bit as good as whatever OpenAI is on at that point.[00:32:16] Alessio: Yeah. From what I heard in the hallways at, at GTC, Llama 3, the, the biggest model will be, you know, 260 to 300 billion parameters, so that, that's quite large.[00:32:26] Alessio: That's not an open source model. You know, you cannot give people a 300 billion parameter model and ask them to run it. You know, it's very compute intensive. So I think it is, it[00:32:35] swyx: can be open source. It's just, it's going to be difficult to run, but that's a separate question.[00:32:39] Alessio: It's more like, as you think about what they're doing it for, you know, it's not like empowering the person running Llama on, on their laptop, it's like, oh, you can actually now use this to go after OpenAI, to go after Anthropic, to go after some of these companies at like the middle complexity level, so to speak. Yeah. So obviously, you know, we interviewed Soumith Chintala on the podcast, they're doing a lot here, they're making PyTorch better.[00:33:03] Alessio: You know, they want to, that's kind of like maybe a little bit of a shot at Nvidia, in a way, trying to get some of the CUDA dominance out of it. Yeah, no, it's great. The, I love the Zuck destroying a lot of monopolies arc. You know, it's, it's been very entertaining.
Let's bridge[00:33:18] NLW: into the sort of big tech side of this, because, so when I did my episode, this was one of the things I added as an additional war that I'm paying attention to.[00:33:29] NLW: So we've got Microsoft's moves with Inflection, which I think potentially are being read as a shift vis-a-vis the relationship with OpenAI, which the sort of Mistral Large relationship seems to reinforce as well. We have Apple potentially entering the race, finally, you know, giving up Project Titan and trying to spend more effort on this.[00:33:50] NLW: Although, counterpoint, we also have reports of a deal with Google, which, you know, is interesting for seeing what their strategy there is. And then Meta's been largely quiet. We kind of just talked about the main piece, but, you know, there are also spoilers like Elon.[00:34:07] NLW: I mean, what of those things has been most interesting to you guys as you think about what's going to shake out for the rest of this[00:34:13] Apple MM1[00:34:13] swyx: year? I'll take a crack. So the reason we don't have a fifth war for the Big Tech Wars is that's one of those things where I just feel like we wouldn't cover it differently from other media channels, I guess.[00:34:26] swyx: Sure, yeah. In our anti-interestingness, we actually say, like, we try not to cover the Big Tech Game of Thrones, and it's proxied through all the other four wars anyway, so there's just a lot of overlap. Yeah, I think, personally, the most interesting one is Apple entering the race.[00:34:41] swyx: They announced their first large language model that they trained themselves. It's like a 30 billion parameter multimodal model. 
People weren't that impressed, but it was the first time that Apple has showcased that, yeah, we're training large models in house as well. Of course, they might be doing this deal with Google.[00:34:57] swyx: I don't know. It sounds very rumor-y to me. And if it's on device, it's probably going to be a smaller model, something like a Gemma. It's going to be smarter autocomplete. I don't know what to say. I'm still here dealing with Siri, which probably hasn't been updated since God knows when it was introduced.[00:35:16] swyx: It's horrible. You know, it makes me so angry. So one, as an Apple customer and user, I'm just hoping for better AI on Apple itself. But two, they are the gold standard when it comes to local devices, personal compute, and trust. Like, you trust them with your data. And I think that's what a lot of people are looking for in AI: they love the benefits of AI, but they don't love the downside, which is that you have to send all your data to some cloud somewhere.[00:35:45] swyx: And some of the data that we're going to feed AI is just the most personal data there is. So with Apple being one of the most trusted personal data companies, I think it's very important that they enter the AI race, and I hope to see more out of them.[00:35:58] Alessio: To me, the biggest question with the Google deal is, who's paying whom?[00:36:03] Alessio: Because for search, Google pays Apple like 18 to 20 billion every year to be the default search engine. Is Google going to pay Apple to have Gemini, or is Apple paying Google to have Gemini? That's what I'm most interested to figure out, because with search, it's the entry point to the thing.[00:36:21] Alessio: So it's really valuable to be the default. That's why Google pays. But I wonder if the perception in AI is going to be like, hey. 
You just have to have a good local model on my phone to be worth me purchasing your device. And that would kind of drive Apple to be the one buying the model. But then, like Shawn said, they're doing MM1 themselves.[00:36:40] Alessio: So are they saying, we do models, but they're not as good as the Google ones? I don't know. The whole thing is really confusing, but it makes for great meme material on Twitter.[00:36:51] swyx: Yeah, I mean, I think, possibly more than OpenAI and Microsoft and Amazon, they are the most full stack company there is in computing, and so, like, they own the chips, man.[00:37:05] swyx: Like, they manufacture everything. So if there was a company that could seriously challenge the other AI players, it would be Apple. And I don't think it's as hard as self driving. So maybe they've just been investing in the wrong thing this whole time. We'll see.[00:37:21] swyx: Wall Street certainly thinks[00:37:22] NLW: so. Wall Street loved that move, man. There was a big sigh of relief. Well, let's move away from sort of the big stuff. I mean, I think to both of your points, it's going to.[00:37:33] Meta's $800b AI rebrand[00:37:33] NLW: Can I, can[00:37:34] swyx: I, can I jump in with a factoid about this Wall Street thing? I went and looked at when Meta went from being a VR company to an AI company.[00:37:44] swyx: And, I'm trying to look up the details now, the stock has gone up 187% since Llama 1. Yeah. Which is $830 billion in market value created in the past year. Yeah. Yeah.[00:37:57] NLW: It's like, remember, if you guys haven't Yeah. 
If you haven't seen the chart, it's actually remarkable.[00:38:02] NLW: If you draw a little[00:38:03] swyx: arrow on it, it's like, no, we're an AI company now, and forget the VR thing.[00:38:10] NLW: It is interesting. No, I think, Alessio, you called it sort of Zuck's disruptor arc or whatever. He really does. He is in the midst of a total, you know, I don't know if it's a redemption arc or something different, where he's sort of the spoiler.[00:38:25] NLW: Like, people loved him just freestyle talking about why he thought they had a better headset than Apple. Even if they didn't agree, they just loved it. He was going direct to camera and talking about it for, you know, five minutes or whatever. So that's a fascinating shift that I don't think anyone had on their bingo card, you know, whatever, two years ago.[00:38:41] NLW: Yeah. Yeah,[00:38:42] swyx: we still[00:38:43] Alessio: didn't see Zuck and Elon fight though, so[00:38:45] swyx: that's what I'm really looking forward to. I mean, hey, don't write it off, you know, maybe these things just take a while to happen. But we need to see that fight in the Coliseum. No, I think, you know, in terms of self management and life leadership, there are a lot of lessons to learn from him.[00:38:59] swyx: You know, you might quibble with, like, the social impact of Facebook, but just himself, in terms of personal growth and perseverance through a lot of change, and, you know, everyone throwing stuff his way, I think there's a lot to learn from Zuck, which is crazy 'cause he's my age.[00:39:18] swyx: Yeah. Right.[00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents[00:39:20] NLW: Awesome. 
Well, so one of the big things that I think you guys have distinct and unique insight into, being where you are and what you work on, is, you know, what developers are getting really excited about right now. And by that I mean, on the one hand, certainly, you know, startups that are actually formalized and formed into startups, but also just in terms of what people are spending their nights and weekends on, what they're coming to hackathons to do.[00:39:45] NLW: And, you know, I think it's such a fascinating indicator for where things are headed. Like, if you zoom back a year, right now was right when everyone was getting so excited about AI agent stuff, right? AutoGPT and BabyAGI. And these things were like, if you dropped anything on YouTube about those, like, instantly tens of thousands of views.[00:40:07] NLW: I know because I had like a 50,000 view video, like the second day that I was doing the show on YouTube, you know, because I was talking about AutoGPT. And so anyways, you know, obviously that's sort of not totally come to fruition yet, but what are some of the trends in what you guys are seeing in terms of people's interest and what people are building?[00:40:24] Alessio: I can start maybe with the agents part, and I know Shawn is doing a diffusion meetup tonight. There's a lot of different things. The agent wave has been the most interesting kind of dream to reality arc. So AutoGPT, I think, went from zero to like 125,000 GitHub stars in six weeks, and then one year later, they have 150,000 stars.[00:40:49] Alessio: So there's kind of been a big plateau. I mean, you might say there are just not that many people that can star it. You know, everybody already starred it. But the promise of, hey, I'll just give you a goal, and you do it, I think is, like, amazing to get people's imagination going. 
You know, they're like, oh, wow, this is awesome.[00:41:08] Alessio: Everybody can try this to do anything. But then, as technologists, you're like, well, that's just not possible, you know, we would have, like, solved everything. And I think it takes a little bit to go from the promise and the hope that people show you, to then trying it yourself and coming back to say, okay, this is not really working for me.[00:41:28] Alessio: And David Luan from Adept, you know, in our episode he specifically said, we don't want to do a bottom up product. You know, we don't want something that everybody can just use and try, because it's really hard to get it to be reliable. So we're seeing a lot of companies doing vertical agents that are narrow for a specific domain, and they're very good at something.[00:41:49] Alessio: Mike Conover, who was at Databricks before, is also a friend of Latent Space. He's doing this new company called BrightWave, doing AI agents for financial research, and that's it, you know, and they're doing very well. There are other companies doing it in security, doing it in compliance, doing it in legal.[00:42:08] Alessio: All of these things that, like, nobody just wakes up and says, oh, I cannot wait to go on AutoGPT and ask it to do a compliance review of my thing. You know, it's just not what inspires people. So I think the gap on the developer side has been that the more bottoms-up hacker mentality is trying to build these, like, very generic agents that can do a lot of open ended tasks.[00:42:30] Alessio: And then the more business side of things is like, hey, if I want to raise my next round, I cannot just, like, sit around and mess around with super generic stuff. I need to find a use case that really works. 
And I think that that is working for a lot of folks. In parallel, you have a lot of companies doing evals.[00:42:47] Alessio: There are dozens of them that just want to help you measure how good your models are doing. Again, if you build evals, you need to also have a restrained surface area to actually figure out whether or not it's good, right? Because you cannot eval anything on everything under the sun. So that's another category where, from the startup pitches that I've seen, there's a lot of interest in the enterprise.[00:43:11] Alessio: It's just really fragmented, because the production use cases are just coming now, you know, there are not a lot of long established ones to test against. So that's kind of on the virtual agents side, and then the robotics side has probably been the thing that surprised me the most at NVIDIA GTC: the amount of robots that were there. There were just robots everywhere.[00:43:33] Alessio: Like, both in the keynote and then on the show floor, you would have Boston Dynamics dogs running around. There was this, like, fox robot that had a virtual face that talked to you and moved in real time. There were industrial robots. NVIDIA did a big push on their own Omniverse thing, which is, like, this digital twin of whatever environment you're in that you can use to train the robot agents.[00:43:57] Alessio: So that kind of takes people back to the reinforcement learning days, but yeah, agents, people want them, you know, people want them. I gave a talk about the rise of the full stack employee and kind of this future where, the same way full stack engineers work across the stack, in the future every employee is going to interact with every part of the organization through agents and AI enabled tooling.[00:44:17] Alessio: This is happening. 
It just needs to be a lot more narrow than maybe the first approach that we took, which is just put a string in AutoGPT and pray. But yeah, there's a lot of super interesting stuff going on.[00:44:27] swyx: Yeah. Well, let's unpack a lot of stuff there. I'll separate the robotics piece because I feel like that's so different from the software world.[00:44:34] swyx: But yeah, we do talk to a lot of engineers, and you know, this is our sort of bread and butter. And I do agree that vertical agents have worked out a lot better than the horizontal ones. The point I'll make here is just that with AutoGPT and BabyAGI, you know, it's in the name, like, they were promising AGI.[00:44:53] swyx: But I think people are discovering that you cannot engineer your way to AGI. It has to be done at the model level, and all these engineering, prompt engineering hacks on top of it weren't really going to get us there in a meaningful way without much further improvements in the models. I'll go so far as to say even Devin, which I think is the most advanced agent that we've ever seen, still requires a lot of engineering and still probably falls apart a lot in terms of, like, practical usage.[00:45:22] swyx: Or it's just way too slow and expensive for what it's promised compared to the video. So yeah, that's what happened with agents from last year. But I do see, like, vertical agents being very popular, and sometimes I think the word agent might even be overused.[00:45:38] swyx: Like, people don't really care whether or not you call it an AI agent, right? Like, does it replace boring menial tasks that I do, that I might hire a human to do, or that the human who is hired to do it actually doesn't really want to do? 
And I think there are absolutely ways, in a vertical context, that you can actually go after very routine tasks that can be scaled out to a lot of, you know, AI assistants.[00:46:01] swyx: So yeah, I mean, I would basically plus one what Alessio just said there. I think it's very, very promising, and I think more people should work on it, not less. Like, there are not enough people. Like, this should be the main thrust of the AI engineer: to look for use cases and go to production with them, instead of always working on some AGI promising thing that never arrives.[00:46:21] swyx: I,[00:46:22] NLW: I can only add that I've been fiercely making tutorials behind the scenes around basically everything you can imagine with AI. We've done about 300 tutorials over the last couple of months. And the verticalized anything, right, like, this is a solution for your particular job or role, even if it's way less interesting or kind of sexy, is so radically more useful to people. Those are the ways that people are actually adopting AI in a lot of cases: just a thing that I do over and over again.[00:46:50] NLW: By the way, I think that's the same way that even the generalized models are getting adopted. You know, it's like, I use Midjourney for lots of stuff, but the main thing I use it for is YouTube thumbnails every day. Like, day in, day out, I will always do a YouTube thumbnail, you know, or two with Midjourney, right?[00:47:09] NLW: And you can start to extrapolate that across a lot of things, and all of a sudden AI looks revolutionary because of a million small changes rather than one sort of big dramatic change. 
And I think that the verticalization of agents is sort of a great example of how that's[00:47:26] swyx: going to play out too.[00:47:28] Adept episode - Screen Multimodality[00:47:28] swyx: So I'll have one caveat here, which is, I think that because multimodal models are now commonplace, like, Claude, Gemini, OpenAI are all very easily multimodal, Apple's easily multimodal, all this stuff, there is a switch for agents toward sort of general desktop browsing that I think people are underrating.[00:48:04] swyx: A version of the agent where they're not specifically taking in text or anything. They're just watching your screen, just like someone else would, and piloting it by vision. And, you know, in the episode with David that will have dropped by the time that this airs, I think that is the promise of Adept, and that is the promise of what a lot of these sort of desktop agents are. And that is the more general purpose system that could be as big as the browser, the operating system. Like, people really want to build that foundational piece of software in AI.[00:48:38] swyx: And I would see, like, the potential there for desktop agents being that you can have sort of self driving computers. You know, don't write the horizontal piece off. I just think it'll take a while to get there.[00:48:48] NLW: What else are you guys seeing that's interesting to you? I'm looking at your notes and I see a ton of categories.[00:48:54] Top Model Research from January Recap[00:48:54] swyx: Yeah, so I'll take the next two as one category, which is basically alternative architectures, right? The two main things that everyone following AI kind of knows now are, one, the diffusion architecture, and two, let's just say the decoder only transformer architecture that is popularized by GPT.[00:49:12] swyx: You can look on YouTube for thousands and thousands of tutorials on each of those things. 
What we are talking about here is what's next, what people are researching, and what could be on the horizon that takes the place of those other two things. So first of all, we'll talk about transformer architectures and then diffusion.[00:49:25] swyx: For transformers, the two leading candidates are effectively RWKV and the state space models, the most recent of which is Mamba, but there are others, like the StripedHyena and the S4 and H3 stuff coming out of Hazy Research at Stanford. And all of those are non quadratic language models that promise to scale a lot better than the traditional transformer.[00:49:47] swyx: This might be too theoretical for most people right now, but it's going to come out in weird ways. Where, imagine, like, right now the talk of the town is that Claude and Gemini have a million tokens of context, and like, whoa, you can put in like, you know, two hours of video now. Okay, but what if you could throw in, you know, two hundred thousand hours of video?[00:50:09] swyx: Like, how does that change your usage of AI? What if you could throw in the entire genetic sequence of a human and synthesize new drugs? Like, how does that change things? Like, we don't know, because we haven't had access to this capability being so cheap before. And that's the ultimate promise of these two model families.[00:50:28] swyx: They're not there yet, but we're seeing very, very good progress. RWKV and Mamba are probably the two leading examples, both of which are open source, so you can try them today, and there's a lot of progress there. And the main thing I'll highlight for RWKV is that at the 7B level, they seem to have beat Llama 2 in all benchmarks that matter at the same size, for the same amount of training, as an open source model.[00:50:51] swyx: So that's exciting. You know, they're at 7B now. They're not at 70B. 
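The scaling point swyx makes here can be sketched in a couple of toy functions: full self-attention does work proportional to the square of the sequence length, while a recurrent or state-space style model carries one fixed-size state and does constant work per token. This is a purely illustrative sketch, not real RWKV or Mamba code; `linear_scan` and `attention_flops` are made-up stand-ins.

```python
# Illustrative only: why subquadratic architectures matter for long context.
# A linear-recurrence model (the family RWKV and Mamba belong to) updates one
# fixed-size state per token, so total work grows linearly with sequence
# length, while full self-attention pays for every pair of tokens.

def linear_scan(xs, a=0.9, b=0.1):
    """O(n) pass: a single scalar state, updated once per token."""
    h = 0.0
    outputs = []
    for x in xs:
        h = a * h + b * x  # constant-time state update per token
        outputs.append(h)
    return outputs

def attention_flops(n):
    """Rough proxy for full attention cost: every token attends to every token."""
    return n * n

# At a million tokens of context, the quadratic term dominates completely.
print(attention_flops(1_000_000))   # 10^12 pairwise interactions
print(len(linear_scan([1.0] * 8)))  # 8 state updates for 8 tokens
```

The gap between `n * n` and `n` is exactly why the "two hundred thousand hours of video" scenario is implausible for plain attention but at least conceivable for linear-time architectures.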
We don't know if it'll. And then the other thing is diffusion. Diffusion and transformers are kind of on a collision course. The original Stable Diffusion already used transformers in parts of its architecture.[00:51:06] swyx: It seems that transformers are eating more and more of those layers, particularly the sort of VAE layer. So the Diffusion Transformer is what Sora is built on. The guy who wrote the Diffusion Transformer paper, Bill Peebles, is the lead tech guy on Sora. So you'll just see a lot more Diffusion Transformer stuff going on.[00:51:25] swyx: But there's more sort of experimentation with diffusion. I'm holding a meetup actually here in San Francisco that's going to be like the state of diffusion, which I'm pretty excited about. Stability's doing a lot of good work. And if you look at the architecture of how they're creating Stable Diffusion 3, Hourglass Diffusion, and the consistency models, or SDXL Turbo, all of these are, like, very, very interesting innovations on the original idea of what Stable Diffusion was.[00:51:45] swyx: So if you think that it is expensive or slow to create Stable Diffusion output or AI generated art, you are not up to date with the latest models. If you think it is hard to create text in images, you are not up to date with the latest models.[00:52:02] swyx: And people still are kind of far behind. The last piece, which is the wildcard I always kind of hold out, is text diffusion. So instead of using autoregressive transformers, can you use diffusion? So you can use diffusion models to diffuse and create entire chunks of text all at once, instead of token by token.[00:52:22] swyx: And that is something that Midjourney confirmed today, because it was only rumored the past few months. But they confirmed today that they were looking into it. 
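The "entire chunks of text all at once instead of token by token" idea can be shown with a toy: an autoregressive decoder needs one step per token, while a diffusion-style decoder refines all positions in parallel over a fixed number of denoising steps. This is a hypothetical illustration, not how any real text-diffusion model works; the "denoiser" here simply copies in target characters with increasing probability each step.

```python
import random

# Toy contrast, illustrative only: token-by-token decoding versus
# diffusion-style parallel refinement of a whole sequence.

TARGET = list("hello world")
VOCAB = "abcdefghijklmnopqrstuvwxyz "

def autoregressive_decode(target):
    """One step per token: n tokens cost n sequential model calls."""
    out = []
    for ch in target:
        out.append(ch)  # stand-in for sampling the next token
    return "".join(out), len(target)

def diffusion_decode(target, steps=4, seed=0):
    """A fixed number of steps: start from noise, refine every position at once."""
    rng = random.Random(seed)
    state = [rng.choice(VOCAB) for _ in target]  # pure noise to begin with
    for step in range(steps):
        # Each denoising step fixes a growing fraction of all positions in
        # parallel; by the final step (probability 1) the sequence is clean.
        for i, ch in enumerate(target):
            if rng.random() < (step + 1) / steps:
                state[i] = ch
    return "".join(state), steps

print(autoregressive_decode(TARGET))  # ('hello world', 11)
print(diffusion_decode(TARGET))       # ('hello world', 4)
```

The appeal is in the step counts: the sequential budget of the diffusion decoder is fixed by `steps`, not by the length of the output.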
So all those things are very exciting new model architectures that are maybe something you'll see in production two to three years from now.[00:52:37] So a couple of the trends[00:52:38] NLW: that I want to get your takes on, because they seem to be coming up, are, one, these sort of wearable, you know, kind of passive AI experiences, where they're absorbing a lot of what's going on around you and then kind of bringing things back.[00:52:53] NLW: And then the other one that I wanted to see if you guys had thoughts on is sort of this next generation of chip companies. Obviously there's a huge amount of emphasis on hardware and silicon and different ways of doing things, but, y

Capitán Obvio
E48. Real inclusion. From neurodivergent and neurotypical brains to Down syndrome.

Capitán Obvio

Play Episode Listen Later Mar 21, 2024 14:32


The term neurodiversity seems to be everywhere lately. More and more, children and young adults are using it to describe themselves. But what does it mean to be neurodivergent, and where does the term come from? The term reflects the many and varied differences in how people's brains work. There is no "right" or "wrong" way. Instead, there is a wide variety of ways in which people perceive and respond to the world, and these differences should be accepted. In this new column on PARENTING and MENTAL HEALTH, on March 21, World Down Syndrome Day, we talk about our bonds, about environments that can integrate and include, and about being less perfect or normal and more real. We hope you like it!

Pamela Cerdeira
"Brain drain intensifies in Mexico"

Pamela Cerdeira

Play Episode Listen Later Mar 20, 2024 8:07


In an interview with Pamela Cerdeira for MVS Noticias, Dr. Andrew Almazán Anaya, research director and psychologist at the Centro de Atención al Talento, talks about how the brain drain is intensifying in Mexico.

Cienciaes.com
Ay, ay, ay, what AI tells us about the brains of men and women - Quilo de Ciencia

Cienciaes.com

Play Episode Listen Later Mar 14, 2024


Researchers at the Stanford University School of Medicine in California, USA, led by Dr. Vinod Menon, developed a neural network for deep learning of brain structures. To train it, they used magnetic resonance imaging data, a kind of sonar for the brain, collected from hundreds of healthy volunteers by the Human Connectome Project. The data collected from those hundreds of working brains were used to train the neural network, telling it which MRI images came from men's brains and which from women's. The system was able to learn and correctly identify whether an image came from a man or a woman nine times out of ten: 90 percent accuracy.
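The recipe the description sketches, train a classifier on labeled brain-scan data and then check how often it labels held-out scans correctly, can be outlined with synthetic numbers. This is a hypothetical stand-in, not the Stanford group's method: they trained a deep neural network on real Human Connectome Project MRI data, while this toy uses a nearest-centroid rule on two made-up features.

```python
import random

# Illustrative sketch of "train on labeled scans, measure held-out accuracy".
# All data is synthetic; no real MRI features are involved.

def make_dataset(n, seed):
    """Two synthetic feature clusters standing in for two classes of scans."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice([0, 1])
        center = float(label)  # class 0 clusters near 0.0, class 1 near 1.0
        features = [rng.gauss(center, 0.3), rng.gauss(center, 0.3)]
        data.append((features, label))
    return data

def train_centroids(train_set):
    """'Training' here is just averaging each class's feature vector."""
    sums, counts = {0: [0.0, 0.0], 1: [0.0, 0.0]}, {0: 0, 1: 0}
    for feats, label in train_set:
        counts[label] += 1
        for i, f in enumerate(feats):
            sums[label][i] += f
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def accuracy(centroids, test_set):
    def predict(feats):
        return min(centroids, key=lambda c: sum((f - m) ** 2
                   for f, m in zip(feats, centroids[c])))
    hits = sum(1 for feats, label in test_set if predict(feats) == label)
    return hits / len(test_set)

train_set = make_dataset(200, seed=1)
test_set = make_dataset(100, seed=2)
model = train_centroids(train_set)
print(f"held-out accuracy: {accuracy(model, test_set):.0%}")
```

The important part of the pattern is the split: accuracy is only meaningful when measured on scans the model never saw during training, which is how the study's 90 percent figure should be read.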

BBVA Aprendemos Juntos
Saul Martínez-Horta: Neuropsychology of everyday life, or why we lose our keys

BBVA Aprendemos Juntos

Play Episode Listen Later Mar 5, 2024 64:08


What happens in the brain when we forget where we left our keys, or what we were looking for in the kitchen? Why can some diseases that affect memory steal almost every recollection, except those tied to emotions? And how can neuropsychology support education in the early diagnosis of developmental, attention, and language problems? These and other questions are answered by Saul Martínez-Horta, doctor of medicine and specialist in clinical neuropsychology, author of the books 'Dónde están las llaves' and 'Cerebros rotos', which gather examples from his extensive clinical experience with brain damage, neurodevelopmental disorders, and neurodegenerative diseases. "In neuropsychology, our object of study is what happens when the brain breaks down. Knowing what it means for a brain to stop doing the things it does so that a human being functions the way a human being does. We are archaeologists of cognitive processes, explorers who use a set of tools in an attempt to understand, for example, if memory is failing, what is it failing at the expense of? Which of the processes involved in what we call memory can lead to a person apparently being unable to remember?" says the doctor, who combines his outreach work with his work and research in the Neurology Service of Hospital Sant Pau and his role as head of the Neuropsychology department at the Centro de Diagnóstico e Intervención Neurocognitiva (CDINC) in Barcelona.

Hora 25
Xavier Vidal-Folch's analysis | Expatriate brains

Hora 25

Play Episode Listen Later Feb 9, 2024 1:48


Xavier Vidal-Folch reflects on the brain drain to Saudi Arabia. 

La rosa de los vientos
How our "two" brains communicate

La rosa de los vientos

Play Episode Listen Later Dec 18, 2023 14:31


The human brain is divided into two hemispheres, and the transfer of information between them is vital to life. The research carried out by the Neuroscience Institute of Alicante looks at how this communication takes place. Ramón Reig is the scientist who led the investigations, and we talk with him to learn the keys to this singular and necessary study.

Bully Magnets
Talking is the worst, and papaya brains – Esos Tipos Opinan 487 – Bully Magnets #podcast

Bully Magnets

Play Episode Listen Later Dec 8, 2023 117:53


Talking is the worst, and papaya brains - Esos Tipos Opinan 487 - Bully Magnets #podcast

Que se vayan todos
ABURRIDO 252: GRETA, A STRIPPER AND A FOOL MEET IN ICELAND

Que se vayan todos

Play Episode Listen Later Nov 14, 2023 55:30


(00:00:00) Intro (00:01:35) No more strippers (00:19:50) The menu (00:21:35) Meta asks if you want to hurt yourself (00:33:18) Greta's microphone gets taken away (00:40:58) The global cost of half of us being fat (00:49:12) Semaglutide, also for the heart (00:53:40) Watch out for this, and what you're missing (01:04:39) Constitu-occasional Spain (01:13:09) Artificial brains that remember in selenium (01:18:14) Netanyahu's crossroads as seen by another prime minister (01:25:18) France and antisemitism (01:30:55) Palestinians in Chile (01:36:15) Argentina's future is 7 days long (01:41:05) The mayor, the cross-dresser and the pastor (01:45:49) The corpse of our mother planet (01:49:05) Iceland prepares for the lava (01:50:25) Hybrid cars are on the rise (01:52:49) The United Kingdom will have to import doctors (01:54:45) The extra FULL EPISODE AND LIVE PARTICIPATION AT https://www.patreon.com/profesorbriceno Recordings can be watched live on TWITCH https://www.twitch.tv/profesorbriceno SUBSCRIBE TO THE AUDIO PODCAST ON ANY PLATFORM SPOTIFY https://open.spotify.com/show/3rFE3ZP8OXMLUEN448Ne5i?si=1cec891caf6c4e03 APPLE PODCASTS https://podcasts.apple.com/es/podcast/que-se-vayan-todos/id676871115 GOOGLE PODCASTS https://www.ivoox.com/en/podcast-que-se-vayan-todos_sq_f11549_1.html FEED FOR ANY PODCAST APP https://www.ivoox.com/feed_fg_f11549_filtro_1.xml SHOW DATES www.profesorbriceno.com/tour SUBSCRIBERS ONLY. REPRODUCTION PROHIBITED. DARK HUMOR NOT SUITABLE FOR MINORS OR SENSITIVE SOULS. 
GRABADO EN FECHA FEB 2023 El tema de las strippers agarra vuelo en Australia, cuando el feminismo clama desde ambos lados https://www.instagram.com/p/Czj-Z_KvKiH/?igshid=MXJzMHNoNXJkbHo2eA%3D%3D&img_index=4 https://www.dailymail.co.uk/news/article-12739829/Exotic-dancer-slams-calls-ban-strip-clubs.html https://www.prospectmagazine.co.uk/society/38722/the-naked-truth-about-strip-clubs Meta te dice en la cara, quieres que te use o quieres pagar https://www.enriquedans.com/2023/11/quiere-usted-hacerse-dano.html Le quitan el micrófono a Greta Thumberg porque ajá yo vine por una cosa no por la otra https://www.bbc.com/news/av/world-europe-67399096 la mitad del mundo va a ser obesa https://www.advisory.com/daily-briefing/2023/03/06/obesity-costs#:~:text=Overall%2C%20the%20report%20estimated%20that,COVID%2D19%20had%20in%202020. https://elpais.com/economia/negocios/2023-11-11/el-desastre-economico-de-los-kilos-de-mas-las-farmaceuticas-se-frotan-las-manos-en-un-mundo-cada-vez-mas-obeso.html como si no fuera suficiente, el semiglutide parece que sirve también para el corazon https://www.sciencedirect.com/science/article/pii/S2352409X19305371 https://www.acc.org/Latest-in-Cardiology/Articles/2023/11/08/20/14/sat-915am-select-aha-2023 España y el debate de algo muy debatible que no termina de debatirse https://www.bbc.com/mundo/articles/cjkp0vj8e1mo Otra forma de inteligencia artificial que no sabemos ni como funciona, pero funciona https://www.elconfidencial.com/tecnologia/novaceno/2023-11-02/cerebro-nanocables-neurona_3766264/ Netanyahu y los errores de cálculo, la variable de para dónde va esto https://www.politico.eu/article/israel-gaza-war-benjamin-netanyahu-miscalculating-over-gaza-former-israeli-pm-ehud-olmert-says/ Mientras tanto en Europa tratan de mantener la cordura ante el antisemitismo https://www.nytimes.com/2023/11/12/world/europe/france-antisemitism-march.html Qué tiene que ver Chile con Palestina 
https://www.aljazeera.com/news/2023/11/10/from-sport-to-music-chiles-palestinian-diaspora-rallies-to-support-gaza Quien ganó el debate en Argentina, no importa https://www.lanacion.com.ar/politica/sergio-massa-o-javier-milei-quien-gano-el-debate-presidencial-antes-del-balotaje-segun-los-analistas-nid13112023/#/ Por qué ya no me dan risa estos republicanos que terminan siendo travestis https://www.nytimes.com/2023/11/12/us/alabama-mayor-suicide-smiths-station.html Resulta que este planeta nuestro tien el cadáver de otro adentro https://www.caltech.edu/about/news/the-remains-of-an-ancient-planet-lie-deep-within-earth Y hablando del Planeta Islandia se prepara para perder una ciudad https://www.reuters.com/world/europe/iceland-evacuates-town-over-concerns-volcanic-eruption-2023-11-11/ Qué siginifica que se estén vendiendo más vehículo híbridos de los que se esperaba https://www.spglobal.com/mobility/en/research-analysis/dont-count-out-hybrids-just-yet.html El reino unido necesita doctores, doctores extranjeros https://www.bbc.com/news/health-67378621 EN EL EXTRA Una visita a la venezuela empresarial y agrícola Y LOS PUNTOS DE FRICCIÓN QUE NO QUEREMOS TOCAR https://www.bbc.com/mundo/articles/c4nvzq30rr8o RECOMENDACIONES PARA LOS QUE NOS SIGUEN https://www.sciencedirect.com/science/article/pii/S2352409X19305371

Noticentro
Un mes de guerra entre Hamás e Israel deja 11 mil muertos

Noticentro

Play Episode Listen Later Nov 7, 2023 1:59


80% of first-level health care units in Acapulco are already operating: Jorge Alcocer. Glass jars presumably containing parts of brains are found in Edomex. FES Acatlán of UNAM filed a criminal complaint for damage to property. More information in our podcast.

Nuestro insólito universo
Nuestro Insólito Universo _ Los 3 cerebros

Nuestro insólito universo

Play Episode Listen Later Oct 14, 2023 5:06


Nuestro Insólito Universo: Los 3 cerebros. In its five minutes, this program narrates astonishing stories on any topic. It first aired on Radio Nacional de Venezuela on August 4, 1969, and was so successful that it was later also broadcast on Radio Capital. Today it remains on Radio Nacional (AM) and on the Éxitos and Onda circuits of Unión Radio (FM), giving it an AM and FM network covering the whole country; it is one of the most awarded and longest-running radio programs in the history of Venezuelan radio.

El Explicador Sitio Oficial
Cerebros de Futbolistas 2023/10/10. El Explicador. Cápsula.

El Explicador Sitio Oficial

Play Episode Listen Later Oct 10, 2023 29:16


A recent study reveals that the brains of some soccer players work in a peculiar way. This discovery sheds new light on our brain's capacity to adapt to changing environments. Thank you for your comments, interactions, financial support, and subscriptions. Listen and download free in MP3: 2023/10/10 Cerebros de Futbolistas. Thank you for supporting El Explicador on: Patreon, https://www.patreon.com/elexplicador_enriqueganem PayPal, elexplicadorpatrocinio@gmail.com SoundCloud, https://soundcloud.com/el-explicador Spotify, https://open.spotify.com/show/01PwWfs1wV9JrXWGQ2MrbY iTunes, https://podcasts.apple.com/mx/podcast/el-explicador-sitio-oficial/id1562019070 Amazon Music, https://music.amazon.com/podcasts/f2656899-46c8-4d0b-85ef-390aaf20f366/el-explicador-sitio-oficial YouTube, https://youtube.com/c/ElExplicadorSitioOficial Twitter @enrique_ganem We invite you to subscribe on these networks to receive notices of our publications, and to visit our page http://www.elexplicador.net. The title of each episode carries the recording date in year/month/day format, which makes chronological consultation easier; as you know, knowledge changes over time. We always read your comments; we don't have time to reply to each one personally, but all are read and taken into account. This is a science-communication space whose aim is to inform in a clear and engaging way, inviting you to investigate the topics covered and form your own opinion. We will delete all comments that promote disinformation, charlatanism, hate, bullying, or verbal violence, that include links to pages other than peer-reviewed scientific journals, that are offensive toward any person, or that promote any political or religious agenda, whether in the comment or in the profile picture.
We clarify that we are not apolitical; we reserve the right not to express our political opinion, since this channel's purpose is science communication. Thank you for your preference!

Espacio Vital
¿En qué consiste el estudio clínico que implantará chips en cerebros humanos?

Espacio Vital

Play Episode Listen Later Oct 6, 2023 5:27


Is it possible to insert chips into human beings? A clinical study will use this method. Dr. Elmer Huerta explains the details.

Humanismo Digital
62 - Cazador de cerebros con Pere Estupinyà - Humanismo Digital

Humanismo Digital

Play Episode Listen Later Oct 1, 2023 58:08


We talk with Pere Estupinyà, a science communicator and writer who invites us to take an interest in and learn about topics as diverse as physics, genetics, biomedicine, nanotechnology, and artificial intelligence. His presence in media such as Cadena SER, and his program El Cazador de Cerebros on RTVE, always stimulate our curiosity and critical spirit. Here we get to know a more personal side of this hunter who let himself be hunted during this pleasant and, I hope, interesting conversation... References: El cazador de cerebros on RTVE: https://qrcd.org/3iWa A vivir que son dos días on Cadena SER: https://qrcd.org/3iWb Pere Estupinyà's website: https://pereestupinya.com/ Books: https://pereestupinya.com/libros/

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Jul 26, 2023 54:31


FlashAttention was first published by Tri Dao in May 2022 and it has had a deep impact on the large language model space. Most open models you've heard of (RedPajama, MPT, LLaMA, Falcon, etc.) leverage it for faster inference. Tri came on the podcast to chat about FlashAttention, the newly released FlashAttention-2, the research process at Hazy Research, and more. This is the first episode of our "Papers Explained" series, which will cover some of the foundational research in this space. Our Discord also hosts a weekly Paper Club, which you can sign up for here.
How does FlashAttention work?
The paper is titled "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness". There are a couple of keywords to call out:
* "Memory Efficient": standard attention memory usage is quadratic in sequence length (i.e. O(N^2)); FlashAttention's is linear, O(N).
* "Exact": the opposite of "exact" in this case is "sparse", as in "sparse networks" (see our episode with Jonathan Frankle for more). This means that you're not giving up any precision.
* The "IO" in "IO-Awareness" stands for "Input/Output" and hints at a write/read-related bottleneck.
Before we dive in, look at this simple GPU architecture diagram. The GPU has access to three memory stores at runtime:
* SRAM: this is on-chip memory co-located with the actual execution core. It's limited in size (~20MB on an A100 card) but extremely fast (19TB/s total bandwidth).
* HBM: this is off-chip but on-card memory, meaning it's in the GPU but not co-located with the core itself. An A100 has 40GB of HBM, but only 1.5TB/s of bandwidth.
* DRAM: this is your traditional CPU RAM. You can have TBs of this, but you only get ~12.8GB/s of bandwidth, which is way too slow.
Now that you know what HBM is, look at how the standard attention algorithm is implemented: as you can see, all 3 steps include a "write X to HBM" step and a "read from HBM" step.
The core idea behind FlashAttention boils down to this: instead of storing each intermediate result, why don't we use kernel fusion and run every operation in a single kernel in order to avoid memory read/write overhead? (We also talked about kernel fusion in our episode with George Hotz and how PyTorch / tinygrad take different approaches here.) The result is much faster, but much harder to read. As you can see, FlashAttention is a very meaningful speed improvement over traditional attention, and it's easy to understand why it's becoming the standard for most models.
This should be enough of a primer before you dive into our episode! We talked about FlashAttention-2, how the Hazy Research group works, and some of the research being done on Transformer alternatives.
Show Notes:
* FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (arXiv)
* FlashAttention-2
* Together AI
* From Deep Learning to Long Learning
* The Hardware Lottery by Sara Hooker
* Hazy Research
* Is Attention All You Need?
* Nvidia CUTLASS 3
* SRAM scaling slows
* Transformer alternatives: S4, Hyena, Recurrent Neural Networks (RNNs)
Timestamps:
* Tri's background [00:00:00]
* FlashAttention deep dive [00:02:18]
* How the Hazy Research group collaborates across theory, systems, and applications [00:17:21]
* Evaluating models beyond raw performance [00:25:00]
* FlashAttention-2 [00:27:00]
* CUDA and The Hardware Lottery [00:30:00]
* Researching in a fast-changing market [00:35:00]
* Promising transformer alternatives like state space models and RNNs [00:37:30]
* The spectrum of openness in AI models [00:43:00]
* Practical impact of models like LLaMA 2 despite restrictions [00:47:12]
* Incentives for releasing open training datasets [00:49:43]
* Lightning Round [00:53:22]
Transcript:
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners. Today we have no Swyx, because he's in Singapore, so it's a one-on-one discussion with Tri Dao. Welcome!
[00:00:24] Tri: Hi everyone. I'm Tri Dao, excited to be here. [00:00:27] Alessio: Tri just completed his PhD at Stanford a month ago. You might not remember his name, but he's one of the main authors of the FlashAttention paper, which is one of the seminal works of the Transformer era. He's got a lot of interests, from efficient transformer training and inference to long-range sequence models, a lot of interesting stuff. And now you're going to be an assistant professor in CS at Princeton next year. [00:00:51] Tri: Yeah, that's right. [00:00:52] Alessio: Yeah. And in the meantime, just to keep it, you know, low pressure, you're Chief Scientist at Together as well, which is the company behind RedPajama. [00:01:01] Tri: Yeah. So I just joined this week actually, and it's been really exciting. [00:01:04] Alessio: So what's something that is not on the internet that people should know about you? [00:01:09] Tri: Let's see. When I started college, I was going to be an economist, so I was fully on board. I was going to major in economics, but the first week I was at Stanford undergrad, I took a few math classes and I immediately decided that I was going to be a math major. And that kind of changed the course of my career. So now I'm doing math, computer science, AI research. [00:01:32] Alessio: I had a similar thing. I started with physics and then I took a programming course and I was like, I've got to do computer science. I don't want to do physics. So FlashAttention is definitely, everybody's using this. Everybody loves it. You just released FlashAttention 2 last week. [00:01:48] Tri: Yeah. Early this week on Monday. Yeah. [00:01:53] Alessio: You know, AI time. Things move fast. So maybe let's run through some of the FlashAttention highlights, some of the innovation there, and then we can dive into FlashAttention 2. So the core improvement in FlashAttention is that traditional attention is quadratic in sequence length.
And compared to that, FlashAttention is linear, which obviously helps with scaling some of these models. [00:02:18] Tri: There are two factors there. So of course the goal has been to make attention go faster or more memory efficient. And ever since attention became popular in 2017 with the Transformer paper, lots and lots of folks have been working on this. And a lot of approaches have been focusing on approximating attention. The goal is you want to scale to longer sequences. There are tons of applications where you want to do that. But scaling to longer sequences is difficult because attention scales quadratically in sequence length on both runtime and memory, as you mentioned. So instead of trying to approximate attention, we were trying to figure out, can we do the same computation and maybe be more memory efficient? So in the end, we ended up with memory that is linear in sequence length. In terms of computation, it's still quadratic, but we managed to make it much more hardware friendly. And as a result, we get wall-clock speedups on the order of 2 to 4x, which really helps, because that just means you'll be able to train with 2 to 4x longer sequence length for the same cost, without doing any approximations. As a result, lots of folks have been using this. The thing is available in a lot of libraries that do language model training or fine-tuning. [00:03:32] Alessio: And the approximation thing is important because this is an exact thing versus a sparse one. So maybe explain a little bit the difference there. [00:03:40] Tri: For sure. So in attention, essentially you compute pairwise similarity between every single element in a sequence against each other. So there have been other approaches where, instead of doing all that pairwise computation, you only compute similarity for some pairs of elements in the sequence. So you don't do a quadratic number of comparisons. And this can be seen as some form of sparsity. Essentially you're ignoring some of the elements.
When you write down the matrix, you essentially say, OK, I'm going to pretend these are zero. So that has some benefits in terms of runtime and memory. But the trade-off is that it tends to do worse in terms of quality, because you're essentially approximating or ignoring some elements. And I personally have worked on this as well for a few years. But when we talked to practitioners who actually train models, especially at large scale, they said they tend not to use these approximate attention methods. Because it turns out, and this was surprising to me at the time, that these approximation methods, even though they perform fewer computations, tend not to be faster in wall-clock time. So this was pretty surprising, because back then my background was more on the theoretical side. So I was thinking of, oh, how many flops or floating point operations are you performing? And hopefully that correlates well with wall-clock time. But I realized that I was missing a bunch of ideas from the systems side, where flops or floating point operations don't necessarily correlate with runtime. There are other factors like memory reading and writing, parallelism, and so on. So I learned a ton from just talking to systems people, because they kind of figured this stuff out a while ago. So that was really eye-opening. And then we ended up focusing a lot more on memory reading and writing, because that turned out to be the majority of the time when you're doing attention: reading and writing memory. [00:05:34] Alessio: Yeah, the I/O awareness is probably one of the biggest innovations here. And the idea behind it is, like you mentioned, the FLOPS growth of the cards has been going up, but the memory bandwidth, not as much. So I think maybe that was one of the assumptions that the original attention paper had. So talk a bit about how that came to be as an idea.
It's one of those things that, like, in hindsight it's like, obviously, why are we rewriting to HBM every time, you know? And once you change it, it's clear. But what was that discovery process? [00:06:08] Tri: Yeah, in hindsight, a lot of the ideas had already been there in the literature. And I would say it was somehow at the intersection of both machine learning and systems. And you kind of needed ideas from both sides. So on one hand, on the systems side, lots of systems folks have known that, oh, you know, kernel fusion is great. Kernel fusion just means that instead of loading the same element, performing an operation, writing it down, loading it back up, and performing the second operation, you just load it once, perform two operations, and then write it down again. So that saves you the memory read and write in the middle there. So kernel fusion has been a classic. There have been other techniques from the systems side, like tiling, where you perform the computation in blocks, again so that you can load it into a really fast memory. Think of it as a cache. And this is, again, classical computer science ideas, right? You want to use the cache. So the systems folks have been thinking about these ideas for a long time, and they apply to attention as well. But there were certain things in attention that made it difficult to do a complete kernel fusion. One of which is there is this softmax operation in the middle, which requires you to essentially sum across the row of the attention matrix. Because of that dependency, it's difficult to break things into blocks. So on the systems side, people have been thinking about these ideas, but it's been difficult to do kernel fusion for the entire operation. On the machine learning side, people have been thinking more algorithmically.
They say, okay, either we can approximate attention, or there's this trick called the online softmax trick, which says that because of the way softmax is written mathematically, you can actually break it up into smaller pieces, do some rescaling, and still get the right answer. So this online softmax trick has been around for a while. I think there was a paper from NVIDIA folks back in 2018 about this. And then there was a paper from Google. So Markus Rabe and Charles Staats wrote a paper in late 2021 on using this online softmax trick to break attention up into smaller pieces. So a lot of the ideas were already there. But it turns out, you kind of need to combine ideas from both sides. So you need to understand that, hey, we want to do kernel fusion to reduce memory reads and writes. But we also need this online softmax trick to be able to break the softmax into smaller pieces so that a lot of the systems tricks carry through. We saw that, and it was kind of a natural idea that we ended up using ideas from both sides, and it ended up working pretty well. Yeah. [00:08:57] Alessio: Are there any downsides to kernel fusion? If I think about databases and the reasons why we have atomic operations, you know, it's like, you have observability and fallback in between them. How does that work with attention? Is there anything that we lose by fusing the operations? [00:09:13] Tri: Yeah, I think mostly on the practical side is that you lose a little bit of flexibility in the sense that, hey, now you have, for example, faster attention; it's just a subroutine that you would call to do attention. But as a researcher, let's say you don't want that exact thing, right? You don't want just attention; let's say you want some modification to attention. You want to do, hey, I'm going to multiply the query and key, but then I'm going to do this extra thing before I carry on. So kernel fusion just means that, okay, we have a subroutine that does the entire thing.
But if you want to experiment with things, you won't be able to use that fused kernel. And the answer is, can we have a compiler that then automatically does a lot of this kernel fusion? Lots of compiler folks are thinking about this, either with a new language or you can embed it in PyTorch. PyTorch folks have been working on this as well. So if you write just your code in PyTorch and they can capture the graph, can they generate code that will fuse everything together? That's still ongoing, and it works for some cases. But for attention, because of this kind of softmax rewriting stuff, it's been a little bit more difficult. So maybe in a year or two, we'll have compilers that are able to do a lot of these optimizations for you. And you don't have to, for example, spend a couple months writing CUDA to get this stuff to work. Awesome. [00:10:41]Alessio: And just to make it clear for listeners, when we say we're not writing it to memory, we are storing it, but just in a faster memory. So instead of the HBM, we're putting it in the SRAM. Yeah. [00:10:53]Tri: Yeah. [00:10:54]Alessio: Maybe explain just a little bit the difference there. [00:10:56]Tri: Yeah, for sure. This is kind of a caricature of how you think about accelerators or GPUs in particular, is that they have a large pool of memory, usually called HBM, or high bandwidth memory. So this is what you think of as GPU memory. So if you're using A100 and you list the GPU memory, it's like 40 gigs or 80 gigs. So that's the HBM. And then when you perform any operation, you need to move data from the HBM to the compute unit. So the actual hardware unit that does the computation. And next to these compute units, there are on-chip memory or SRAM, which are much, much smaller than HBM, but much faster. So the analogy there is if you're familiar with, say, CPU and RAM and so on. So you have a large pool of RAM, and then you have the CPU performing the computation. 
But next to the CPU, you have the L1 cache and L2 cache, which are much smaller than DRAM, but much faster. So you can think of SRAM as the small, fast cache that stays close to the compute unit. Physically, it's closer. There is some kind of asymmetry here: HBM is much larger, and SRAM is much smaller, but much faster. One way of thinking about it is, how can we design algorithms that take advantage of this asymmetric memory hierarchy? And of course, lots of folks have been thinking about this. These ideas are pretty old. I think back in the 1980s, the primary concern was sorting. How can we sort numbers as efficiently as possible? And the motivating example was banks trying to sort their transactions, which needed to happen overnight so that they'd be ready the next day. The same idea applies: they had slow memory, which was hard disk, and fast memory, which was DRAM, and people had to design sorting algorithms that take advantage of this asymmetry. And it turns out these same ideas can apply today, just with different kinds of memory. [00:13:00] Alessio: In your paper, you have the pyramid of memory. Just to give people an idea, when we say smaller, it's like HBM is 40 gigs and SRAM is 20 megabytes. So it's not a little smaller, it's much smaller. But the throughput on the card is like 1.5 terabytes a second for HBM and like 19 terabytes a second for SRAM, which is a lot larger. How do you think that evolves? So TSMC said they hit the scaling limits for SRAM; it just cannot grow that much more. HBM keeps growing; HBM3 is going to be 2x faster than HBM2, and I think the latest NVIDIA thing has HBM3. How do you think about the future of FlashAttention? Do you think HBM is going to get fast enough that maybe it's not as useful to use the SRAM? [00:13:49] Tri: That's right. I think it comes down to physics. When you design hardware, SRAM literally stays very close to the compute units.
And so you don't have that much area to essentially put the transistors. And you can't shrink these things too much. So just by physics, in terms of area, you don't have that much area for the SRAM. HBM is off-chip, so there is some kind of bus that essentially transfers data from HBM to the compute unit. So you have more area to essentially put these memory units. And so yeah, I think in the future SRAM probably won't get that much larger, because you don't have that much area. HBM will get larger and faster. And so I think it becomes more important to design algorithms that take advantage of this memory asymmetry. It's the same thing in CPUs, where the cache is really small and DRAM is growing larger and larger. DRAM could get to, I don't know, two terabytes, six terabytes, or something, whereas the cache stays at, I don't know, 15 megabytes or something like that. So I think algorithm design becomes more and more important. There are still ways to take advantage of this, I think. So in the future, FlashAttention right now is being used; I don't know if in the next couple of years some new architecture will come in and whatnot, but attention seems to still be important. For the next couple of years, I still expect some of these ideas to be useful. Not necessarily the exact code that's out there, but these ideas have kind of stood the test of time: ideas like IO awareness from back in the 1980s, kernel fusion, tiling. These are classical ideas that have stood the test of time. So I think in the future, these ideas will become more and more important as we scale models to be larger, and as we have more kinds of devices where performance and efficiency become much, much more important. [00:15:40] Alessio: Yeah, and we had Jonathan Frankle on the podcast, and if you go to issattentionallyouneed.com, he has an outstanding bet, and he does believe that attention will still be the state-of-the-art architecture in a few years.
Did you think flash attention would be this popular? I'm always curious on the research side, you publish a paper, and obviously you know it's great work, but sometimes it just kind of falls flat in the industry. Could you see everybody just starting to use this, or was that a surprise to you? [00:16:11]Tri: Certainly, I didn't anticipate the level of popularity. Of course, we were extremely happy to have people using this stuff and giving us feedback and so on, and help us improve things. I think when we were writing the paper, I remember sending an email to one of my advisors, and like, hey, I'm excited about this paper, but I think the most important thing will be the artifact, which is the code. So I knew that the code will be valuable. So we kind of focus a lot on the code and make sure that the code is usable and as fast as can be. Of course, the idea, the paper presents the ideas and explain it and have experiments that validate the idea, but I knew that the artifact or the code was also pretty important. And that turned out to be the right focus, which is, you know, we put out the paper, we release the code and continue working on the code. So it's a team effort with my co-authors as well. [00:17:07]Alessio: We mentioned Hazy Research a bunch of times on the podcast before. I would love for you to spend five minutes just talking about how does the group work? How do people get together? How do you bounce ideas off of each other? Yeah. [00:17:21]Tri: So Hazy Research is a research group at Stanford led by one of my advisors, Chris Re. I love the people there. It was one of the best experiences I had. They've made my PhD so much more enjoyable. And I think there are a couple of ways that the group has been working pretty well. So one is, I think there's a diverse pool of people who either, you know, some of them focus on algorithms and theory, some of them focus on building systems, some of them focus on applications. 
And as a result, there is this flow of ideas. So as an example, some of us were working on more algorithms and theory, and then we can talk to the folks building systems and say, hey, let's try it out, let's put it in the systems and see how it is. And there you will get feedback from systems folks. They will say, hey, we implemented this, or we tried this and this is where it doesn't work, something like that. And once we put it in the systems, the application folks can use the algorithm or new methods or new models. And we again get great feedback from them, because the application folks, for example, some of my good friends, they focus on medical imaging or seizure detection. And that is the problem they care about. And if your method doesn't work on the task they care about, they will tell you. Whereas I think a lot of people in machine learning are a little bit more flexible. So they will be like, hey, it doesn't work on seizure detection; let's try some other task, right? But having that direct feedback of, hey, it doesn't work there, let's figure out why, allows us to do better work. And I think that kind of process of exchanging ideas, validating it in a real system so that application folks can try it out and give you feedback, that cycle has been very, very useful. And so that's one: having a diverse group of people. The other one is, and this is something I really appreciate from advice from Chris, to try to understand the fundamentals, right? And he's happy letting me go off and read some textbooks and play with things, because I think a lot of research ideas come from understanding the old literature and seeing how it fits with the new landscape. And so if you just read new arXiv papers every day, that's great, but you also need to read textbooks. And that's one piece of advice I got from Chris: understand the fundamentals. And I think that allows us to do more impactful work.
[00:19:46]Alessio: How do you think about academia versus industry? I feel like AI / Machine Learning has been an area where up until three, four years ago, most of the cutting edge work was being done in academia. And now there's all these big industry research labs. You're obviously going to Princeton, so you're an academia believer. How should people think about where to go? Say I'm doing my master's, I have to decide between doing a PhD and going to OpenAI or Anthropic. How should I decide? [00:20:15]Tri: I think they kind of play a complementary role, in my opinion. Of course, I also was considering different paths as well. So I think right now, scaling matters a lot, especially when you talk about language models and AI and so on. Scaling matters a lot. And that means that you need compute resources and you need infrastructure and you need engineers' time. And so industry tends to have an advantage when it comes to scaling things. But a lot of the ideas actually came from academia. So let's take attention, which got popular with the Transformer in 2017. Attention actually has been around for a while. So I think the first mention was in 2014, in a paper from Bahdanau and others, with Yoshua Bengio, which is coming from academia. A lot of ideas did come from academia. And scaling things up, of course, I think OpenAI has been great at scaling things up. That was the bet that they made after, I think, GPT-2. So they saw that scaling these things up, which back then meant 1.5 billion parameters, seemed to give you amazing capabilities. So they really committed to that. They really committed to scaling things. And that turned out to be, it's been a pretty successful bet. I think for academia, we're still trying to figure out exactly what we're doing in this shifting landscape. And so lots of folks have been focusing on, for example, evaluation. So I know the Stanford Center for Research on Foundation Models, led by Percy, they have this benchmark called HELM, which is this holistic benchmark. 
So trying to figure out, okay, characterizing the landscape of different kinds of models, what people should evaluate, what people should measure, and things like that. So evaluation is one role. The other one is understanding. So this has happened historically where there's been some development in the industry and academia can play a role in explaining, understanding. They have the luxury to slow down trying to understand stuff, right? So lots of paper on understanding what's really going on, probing these models, and so on. I think I'm not as familiar with the NLP literature, but my impression is there's a lot of that going on in the NLP conferences, which is understanding what these models are doing, what capabilities they have, and so on. And the third one I could see is that the academia can take more risky bets in the sense that we can work on stuff that is quite different from industry. I think industry, my impression is you have some objective. You're trying to say, hey, for this quarter, we want to scale the model in this particular way. Next quarter, we want the model to have these capabilities. You're trying to get objectives that maybe, I don't know, 70% that will work out because it's important for the company's direction. I think for academia, the way things work is you have many, many researchers or PhD students, and they're kind of pursuing independent directions. And they have a little bit more flexibility on, hey, I'm going to try out this seemingly crazy idea and see, let's say there's a 30% chance of success or something. And however you define success, for academia, a lot of the time, success just means like, hey, we found something interesting. That could eventually go into industry through collaboration and so on. So I do see academia and industry kind of playing complementary roles. 
And as for someone choosing a career, I think just more generally, industry would probably be better in terms of compensation, in terms of probably work-life balance. But my biased perspective is that maybe academia gives you a little bit more freedom to think and understand things. So it probably comes down to personal choice. I ended up choosing to be a professor next year at Princeton. But of course, I want to maintain a relationship with industry folks. I think industry folks can provide very valuable feedback to what we're doing in academia so that we understand where the field is moving, because some of the directions are very much influenced by what, for example, OpenAI or Google is doing. So we want to understand where the field is moving. What are some promising applications? And try to anticipate, okay, if the field is moving like this, these applications are going to be popular. What problems will be important in two, three years? And then we try to start thinking about those problems so that hopefully in two, three years, we have some of the answers to some of these problems. Sometimes it works out, sometimes it doesn't. But as long as we do interesting things in academia, that's the goal. [00:25:03]Alessio: And you mentioned the eval side. So we did a Benchmarks 101 episode. And one of the things we were seeing is sometimes the benchmarks really influence the model development. Because obviously, if you don't score well on the benchmarks, you're not going to get published and you're not going to get funded. How do you think about that? How do you think that's going to change now that a lot of the applications of these models, again, is in more narrow industry use cases? Do you think the goal of the academia eval system is to be very broad and then industry can do their own evals? Or what's the relationship there? [00:25:40]Tri: Yeah, so I think evaluation is important and often a little bit underrated. 
So it's not as flashy as, oh, we have a new model that can do such and such. But I think evaluation, what you don't measure, you can't make progress on, essentially. So I think industry folks, of course, they have specific use cases that their models need to do well on. And that's what they care about. Not just academia, but other groups as well. People do understand what are some of the emerging use cases. So for example, now one of the most popular use cases is chatbots. And then I think folks from Berkeley, the LMSYS group, set up this kind of Chatbot Arena to essentially benchmark different models. So people do understand what are some of the emerging use cases. People do contribute to evaluation and measurement. And as a whole, I think people try to contribute to the field and move the field forward, albeit in maybe slightly different directions. But we're making progress, and definitely evaluation and measurement is one of the ways you make progress. So I think going forward, there's still going to be just more models, more evaluation. We'll just have better understanding of what these models are doing and what capabilities they have. [00:26:56]Alessio: I like that your work has been focused on not making benchmarks better, but it's like, let's just make everything faster. So it's very horizontal. So FlashAttention 2, you just released that on Monday. I read in the blog post that a lot of the work was also related to some of the NVIDIA library updates. Yeah, maybe run us through some of those changes and some of the innovations there. Yeah, for sure. [00:27:19]Tri: So FlashAttention 2 is something I've been working on for the past couple of months. So the story is the NVIDIA CUTLASS team, they released a new version of their library, which contains all these primitives to allow you to do matrix multiply or memory loading on GPU efficiently. So it's a great library and I built on that. 
So they released their version 3 back in January and I got really excited and I wanted to play with that library. So as an excuse, I was just like, okay, I'm going to refactor my code and use this library. So that was kind of the start of the project. By the end, I just ended up working with the code a whole lot more and I realized that, hey, there are these inefficiencies still in Flash Attention. We could change this way or that way and make it, in the end, twice as fast. But of course, building on the library that the NVIDIA folks released. So that was kind of a really fun exercise. I was starting out, it's just an excuse for myself to play with the new library. What ended up was several months of improvement, improving Flash Attention, discovering new ideas. And in the end, we managed to make it 2x faster and now it's pretty close to probably the efficiency of things like matrix multiply, which is probably the most optimized subroutine on the planet. So we're really happy about it. The NVIDIA Cutlass team has been very supportive and hopefully in the future, we're going to collaborate more. [00:28:46]Alessio: And since it's an NVIDIA library, can you only run this on CUDA runtimes? Or could you use this and then run it on an AMD GPU? [00:28:56]Tri: Yeah, so it's an NVIDIA library. So right now, the code we release runs on NVIDIA GPUs, which is what most people are using to train models. Of course, there are emerging other hardware as well. So the AMD folks did implement a version of Flash Attention, I think last year as well, and that's also available. I think there's some implementation on CPU as well. For example, there's this library, ggml, where they implemented the same idea running on Mac and CPU. So I think that kind of broadly, the idea would apply. The current implementation ended up using NVIDIA's library or primitives, but I expect these ideas to be broadly applicable to different hardware. 
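The core idea these implementations share, computing attention over key/value blocks with an online softmax so the full score matrix never has to sit in slow memory at once, can be sketched in plain NumPy. This is a toy single-head version for illustration, not the actual CUDA kernel:

```python
import numpy as np

def tiled_attention(Q, K, V, block=64):
    """Single-head attention computed over key/value blocks with an online
    softmax, so the full N x N score matrix is never materialized at once."""
    N, d = Q.shape
    out = np.zeros_like(V)
    row_max = np.full(N, -np.inf)   # running max of scores per query row
    row_sum = np.zeros(N)           # running softmax denominator per row
    for s in range(0, N, block):
        Kb, Vb = K[s:s + block], V[s:s + block]
        S = Q @ Kb.T / np.sqrt(d)             # scores against this block only
        new_max = np.maximum(row_max, S.max(axis=1))
        scale = np.exp(row_max - new_max)     # rescale earlier partial results
        P = np.exp(S - new_max[:, None])
        row_sum = row_sum * scale + P.sum(axis=1)
        out = out * scale[:, None] + P @ Vb
        row_max = new_max
    return out / row_sum[:, None]

# Sanity check against the naive, fully materialized computation:
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 64)) for _ in range(3))
S = Q @ K.T / np.sqrt(64)
naive = np.exp(S - S.max(1, keepdims=True))
naive = (naive / naive.sum(1, keepdims=True)) @ V
print(np.allclose(tiled_attention(Q, K, V), naive))  # → True
```

The rescaling step is what makes the streaming computation exact: whenever a new block raises the running maximum, previously accumulated numerators and denominators are multiplied by the corresponding correction factor.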
I think the main idea is you have asymmetry in memory hierarchy, which tends to be everywhere in a lot of accelerators. [00:29:46]Alessio: Yeah, it kind of reminds me of Sara Hooker's post, like the hardware lottery. There could be all these things that are much better, like architectures that are better, but they're not better on NVIDIA. So we're never going to know if they're actually improved. How does that play into some of the research that you all do too? [00:30:04]Tri: Yeah, so absolutely. Yeah, I think Sara Hooker, she wrote this piece on hardware lottery, and I think she captured really well of what a lot of people have been thinking about this. And I certainly think about hardware lottery quite a bit, given that I do some of the work that's kind of really low level at the level of, hey, we're optimizing for GPUs or NVIDIA GPUs and optimizing for attention itself. And at the same time, I also work on algorithms and methods and transformer alternatives. And we do see this effect in play, not just hardware lottery, but also kind of software framework lottery. You know, attention has been popular for six years now. And so many kind of engineer hours has been spent on making it as easy and efficient as possible to run transformer, right? And there's libraries to do all kinds of tensor parallel, pipeline parallel, if you use transformer. Let's say someone else developed alternatives, or let's just take recurrent neural nets, like LSTM, GRU. If we want to do that and run that efficiently on current hardware with current software framework, that's quite a bit harder. So in some sense, there is this feedback loop where somehow the model architectures that take advantage of hardware become popular. And the hardware will also kind of evolve to optimize a little bit for that kind of architecture and software framework will also evolve to optimize for that particular architecture. Right now, transformer is the dominant architecture. 
So yeah, I'm not sure if there is a good way out of this. Of course, there's a lot of development. Things like, I think compilers will play a role because compilers allow you to maybe still be much more efficient across different kinds of hardware because essentially you write the same code and compiler will be able to make it run efficiently different kinds of hardware. So for example, there's this language Mojo, they're compiler experts, right? And their bet is AI models will be running on different kinds of devices. So let's make sure that we have really good compilers with a good language that then the compiler can do a good job optimizing for all kinds of devices. So that's maybe one way that you can get out of this cycle. But yeah, I'm not sure of a good way. In my own research, I have to think about both the algorithm new model and how it maps to hardware. So there are crazy ideas that seem really good, but will be really, really difficult to run efficiently. And so as a result, for example, we can't really scale some of the architectures up simply because they're not hardware friendly. I have to think about both sides when I'm working on new models. [00:32:50]Alessio: Yeah. Have you spent any time looking at some of the new kind of like AI chips companies, so to speak, like the Cerebras of the world? Like one of their innovations is co-locating everything on the chip. So you remove some of this memory bandwidth issue. How do you think about that? [00:33:07]Tri: Yeah, I think that's an interesting bet. I think Tesla also has this Dojo supercomputer where they try to have essentially as fast on-chip memory as possible and removing some of these data transfer back and forth. I think that's a promising direction. The issues I could see, you know, I'm definitely not a hardware expert. One issue is the on-chip memory tends to be really expensive to manufacture, much more expensive per gigabyte compared to off-chip memory. 
So I talked to, you know, some of my friends at Cerebras and, you know, they have their own stack and compiler and so on, and they can make it work. The other kind of obstacle is, again, with compiler and software framework and so on. For example, if you can run PyTorch on this stuff, lots of people will be using it. But supporting all the operations in PyTorch will take a long time to implement. Of course, people are working on this. So I think, yeah, we kind of need these different bets on the hardware side as well. Hardware, my understanding is, has a kind of longer time scale. So you need to design hardware, you need to manufacture it, you know, maybe on the order of three to five years or something like that. So people are taking different bets, but the AI landscape is changing so fast that it's hard to predict, okay, what kind of models will be dominant in, let's say, three or five years. Or thinking back five years ago, would we have known that Transformer would be the dominant architecture? Maybe, maybe not, right? And so different people will make different bets on the hardware side. [00:34:39]Alessio: Does the pace of the industry and the research also influence the PhD research itself? For example, in your case, you're working on improving attention. It probably took you quite a while to write the paper and everything, but in the meantime, you could have had a new model architecture come out and then it's like nobody cares about attention anymore. How do people balance that? [00:35:02]Tri: Yeah, so I think it's tough. It's definitely tough for PhD students, for researchers. Given that the field is moving really, really fast, I think it comes down to understanding fundamentals. Because that's essentially, for example, what the PhD allows you to do. It's been a couple of years understanding the fundamentals. 
So for example, when I started my PhD, I was working on understanding matrix vector multiply, which has been a concept that's been around for hundreds of years. We were trying to characterize what kind of matrices would have theoretically fast multiplication algorithm. That seems to have nothing to do with AI or anything. But I think that was a time when I developed mathematical maturity and research taste and research skill. The research topic at that point didn't have to be super trendy or anything, as long as I'm developing skills as a researcher, I'm making progress. And eventually, I've gotten quite a bit better in terms of research skills. And that allows, for example, PhD students later in their career to quickly develop solutions to whatever problems they're facing. So I think that's just the natural arc of how you're being trained as a researcher. For a lot of PhD students, I think given the pace is so fast, maybe it's harder to justify spending a lot of time on the fundamental. And it's tough. What is this kind of explore, exploit kind of dilemma? And I don't think there's a universal answer. So I personally spend some time doing this kind of exploration, reading random textbooks or lecture notes. And I spend some time keeping up with the latest architecture or methods and so on. I don't know if there's a right balance. It varies from person to person. But if you only spend 100% on one, either you only do exploration or only do exploitation, I think it probably won't work in the long term. It's probably going to have to be a mix and you have to just experiment and kind of be introspective and say, hey, I tried this kind of mixture of, I don't know, one exploration paper and one exploitation paper. How did that work out for me? Should I, you know, having conversation with, for example, my advisor about like, hey, did that work out? You know, should I shift? I focus more on one or the other. I think quickly adjusting and focusing on the process. 
I think that's probably the right way. I don't have like a specific recommendation that, hey, you focus, I don't know, 60% on lecture notes and 40% on arXiv papers or anything like that. [00:37:35]Alessio: Let's talk about some Transformer alternatives. You know, say Jonathan Frankle loses his bet and Transformer is not the state of the art architecture. What are some of the candidates to take over? [00:37:49]Tri: Yeah, so this bet is quite fun. So my understanding is this bet is between Jonathan Frankle and Sasha Rush, right? I've talked to Sasha a bunch and I think he recently gave an excellent tutorial on Transformer alternatives as well. So I would recommend that. So just to quickly recap, I think there's been quite a bit of development more recently about Transformer alternatives. So architectures that are not Transformer, right? And the question is, can they do well on, for example, language modeling, which is kind of the application that a lot of people care about these days. So there are methods based on state space models that came out in 2021 from Albert Gu, Karan Goel, and Chris Ré that presumably could do much better in terms of capturing long range information while not scaling quadratically. They scale sub-quadratically in terms of sequence length. So potentially you could have a much more efficient architecture when sequence length gets really long. The other ones have been focusing more on recurrent neural nets, which is, again, an old idea, but adapting to the new landscape. So things like RWKV, I've also personally worked in this space as well. So there's been some promising results. So there's been some results here and there that show that, hey, these alternatives, either RNNs or state space methods, can match the performance of Transformer on language modeling. So that's really exciting. And we're starting to understand on the academic research side, we want to understand, do we really need attention? 
I think that's a valuable kind of intellectual thing to understand. And maybe we do, maybe we don't. If we want to know, we need to spend serious effort on trying the alternatives. And there have been folks pushing in this direction. I think RWKV has scaled up; they have a model at 14 billion that seems pretty competitive with Transformer. So that's really exciting. That's kind of an intellectual thing. We want to figure out if attention is necessary. So that's one motivation. The other motivation is Transformer alternatives could have an advantage in practice in some of the use cases. So one use case is really long sequences. The other is really high throughput of generation. So for really long sequences, when you train with Transformer, with flash attention and so on, the computation is still quadratic in the sequence length. So if your sequence length is on the order of, I don't know, 16K, 32K, 100K or something, which some of these models have sequence length 100K, then you do get significantly slower in terms of training, also in terms of inference. So maybe these alternative architectures could scale better in terms of sequence length. I haven't seen actual validation on this. Let's say an RNN model released with context length, I don't know, 100K or something. I haven't really seen that. But the hope could be that as we scale to long sequences, these alternative architectures could be more well-suited. Not just text, but things like high resolution images, audio, video, and so on, which are emerging applications. So that's one, long sequences. Number two is high-throughput generation, where I can imagine scenarios where the application isn't like an interactive chatbot, but let's say a company wants to batch as many requests as possible on their server, or they're doing offline processing, they're generating stuff based on their internal documents, that they need to process in batch. 
And the issue with Transformer is that during generation, it essentially needs to keep around all the previous history. It's called the KV cache. And that could take a significant amount of memory, so you can't really batch too much because you run out of memory. I am personally bullish on RNNs. I think RNNs, they essentially summarize the past into a state vector that has fixed size, so the size doesn't grow with the history. So that means that you don't need as much memory to keep around all the previous tokens. And as a result, I think you can scale to much higher batch sizes. And as a result, you can make much more efficient use of the GPUs or the accelerator, and you could have much higher generation throughput. Now, this, I don't think, has been validated at scale. So as a researcher, I'm bullish on this stuff because I think in the next couple of years, these are use cases where these alternatives could have an advantage. We'll just kind of have to wait and see if these things will happen. I am personally bullish on this stuff. At the same time, I also spend a bunch of time making attention as fast as possible. So maybe hedging and playing both sides. Ultimately, we want to understand, as researchers, we want to understand what works, why do the models have these capabilities? And one way is, let's push attention to be as efficient as possible. On the other hand, let's push the alternatives to be as efficient and at as big a scale as possible, so that we can kind of compare them and understand. Yeah, awesome. [00:43:01]Alessio: And I think as long as all of this work happens in the open, it's a net positive for everybody to explore all the paths. Yeah, let's talk about open-source AI. Obviously, Together, when Red Pajama came out, which was an open clone of the LLAMA1 pre-training dataset, it was a big thing in the industry. LLAMA2 came out on Tuesday, I forget. 
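The memory tradeoff described above (a KV cache that grows with history versus a fixed-size RNN state) is easy to make concrete with back-of-the-envelope arithmetic. The configuration below is illustrative only (a generic 7B-style transformer in fp16), not any specific model:

```python
def kv_cache_bytes(layers, heads, head_dim, seq_len, batch, bytes_per_elt=2):
    """KV cache during generation: keys and values (the factor of 2) for
    every layer, head, and past token; grows linearly with history length."""
    return 2 * layers * heads * head_dim * seq_len * batch * bytes_per_elt

def rnn_state_bytes(layers, state_dim, batch, bytes_per_elt=2):
    """An RNN's recurrent state is fixed-size, independent of history length."""
    return layers * state_dim * batch * bytes_per_elt

# Illustrative 7B-style config: 32 layers, 32 heads of dim 128, fp16,
# a batch of 32 sequences at 4K context:
print(kv_cache_bytes(32, 32, 128, 4096, 32) / 2**30)  # → 64.0 (GiB of cache)
print(rnn_state_bytes(32, 4096, 32) / 2**20)          # → 8.0 (MiB of state)
```

The gap is what drives the batching argument: the transformer's cache above would not fit on a single accelerator, while the fixed-size state barely registers, so an RNN-style model can pack far more concurrent sequences per device.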
And this week, there's been a lot of things going on, which they call open-source, but it's not really open-source. Actually, we wrote a post about it that was on the front page of Hacker News before this podcast, so I was frantically responding. How do you think about what open-source AI really is? In my mind, in open-source software, we have different levels of open. So there's free software, that's like the GPL license. There's open-source, which is Apache, MIT. And then there's kind of restricted open-source, which is the SSPL and some of these other licenses. In AI, you have the open models. So Red Pajama is an open model because you have the pre-training dataset, you have the training runs and everything. And then there's obviously randomness that doesn't make it one-to-one if you retrain it. Then you have the open-weights models, that's kind of like StableLM, where the weights are open, but the dataset is not open. And then you have LLAMA2, where the dataset is not open and the weights are restricted. It's kind of like not really open-source, but open enough. I think it's net positive because it's like $3 million of flops donated to the public. [00:44:32]Tri: How do you think about that? [00:44:34]Alessio: And also, as you work at Together, what is your philosophy with open-source AI? Right, right. [00:44:40]Tri: Yeah, I think that's a great question. And I think about it in maybe more practical terms. So of course, Meta has done an amazing job training LLAMA1, LLAMA2. And for LLAMA2, they made it much less restrictive compared to LLAMA1. Now you can use it for businesses, unless you have a very large number of monthly active users or something like that. I think just this change will have a very significant impact in the kind of landscape of open-source AI, where now lots of businesses, lots of companies will be using, I expect will be using, things like LLAMA2. They will fine-tune on their own datasets. They will be serving variants or derivatives of LLAMA2. 
Whereas before, with LLAMA1, it was also a really good model, but businesses weren't allowed to do that. So I think in more practical terms, it's kind of shifting the balance between closed-source models, like OpenAI and Anthropic and Google, where you're making API calls, right? And maybe you don't understand as much of what the model is doing, how the model is changing, and so on. Versus now, we have a model with open weights that is pretty competitive, from what I've seen in terms of benchmarks, pretty competitive with GPT 3.5, right? And if you fine-tune it on your own data, maybe it's more well-suited for your own data. And I do see that's going to shift the balance of it. More and more folks are going to be using, let's say, derivatives of LLAMA2. More and more folks are going to fine-tune and serve their own model instead of calling an API. So that shifting of balance is important because, in one way, we don't want just a concentration of decision-making power in the hands of a few companies. So I think that's a really positive development from Meta. Of course, training the model takes a couple of millions of dollars, and the engineers have, I'm sure, spent tons of time trying many, many different things. So the actual cost is probably way more than that. And they make the weights available, and probably a lot of companies are going to be using this. So I think that's a really positive development. And we've also seen amazing progress from the open source community, where they would take these models and they either fine-tune on different kinds of data sets or even make changes to the model. So as an example, I think for LLAMA1, the context length was limited to 2K. A bunch of folks figured out some really simple methods to scale up to like 8K. [00:47:12]Alessio: Like the RoPE. [00:47:13]Tri: Yes. I think the open source community is very creative, right? And lots of people. 
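The context-extension trick just mentioned (stretching a model trained on 2K rotary positions out to 8K) is, in its position-interpolation form, just a rescaling of the rotary-embedding angles. A minimal sketch, with illustrative dimensions rather than any particular model's:

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary-embedding rotation angles; scale < 1 is position interpolation,
    squeezing a longer sequence into the position range seen in training."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)  # per-pair frequencies
    return np.outer(positions * scale, inv_freq)

train_len, new_len = 2048, 8192
# Run at 8K context by interpolating positions 4x instead of extrapolating:
angles = rope_angles(np.arange(new_len), dim=128, scale=train_len / new_len)
# The interpolated angles stay within the range the model was trained on:
print(angles.max() < train_len)  # → True
```

The design intuition is that extrapolating to unseen (larger) rotation angles degrades the model badly, while interpolating between seen angles degrades it far less, which is why a small amount of fine-tuning at the longer length was enough in practice.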
LLAMA2 will, again, kind of accelerate this, where more people will try it out, more people will make tweaks to it and make a contribution, and so on. So overall, I think I see that as still a very positive development for the field. And there have been lots of libraries that will allow you to host or fine-tune these models, even with quantization and so on. Just a couple of hours after LLAMA2 was released, tons of companies were announcing that, hey, it's on our API or hosting and so on, and Together did the same. So it's a very fast-paced development, and just kind of a model with available weights that businesses are allowed to use, I think that alone is already a very positive development. At the same time, yeah, we can do much better in terms of releasing data sets. Data sets tend to be... Somehow people are not incentivized to release data sets. So philosophically, yeah, you want to be as open as possible. But in practical terms, I think it's a little bit harder for companies to release data sets. Legal issues. The data sets released tend to be not as eye-catching as the model releases. So maybe people are less incentivized to do that. We've seen quite a few companies releasing data sets, though. Together released the RedPajama data set. I think Cerebras then worked on that, deduplicated and cleaned it up, and released SlimPajama, and so on. So we're also seeing positive development on that front, on the pre-training data set side. So I do expect that to continue. And then on the fine-tuning data set or instruction tuning data set side, I think we now have quite a few open data sets on instruction tuning and fine-tuning. But these companies do pay for human labelers to annotate these instruction tuning data sets. And that is expensive. And maybe they will see that as their competitive advantage. And so it's harder to incentivize these companies to release these data sets. 
So I think in practical terms, we're still going to make a lot of progress on open source AI, on model development, on model hosting, on pre-training data sets and fine-tuning data sets. Right now, maybe we don't have the perfect open source model, where all the data sets are available. Maybe we don't have such a thing yet, but we've seen very fast development on the open source side. I think just maybe this time last year, there weren't as many models that are competitive with, let's say, ChatGPT. [00:49:43]Alessio: Yeah, I think the open data sets have so much more impact than open models. If you think about EleutherAI and the work that they've done, GPT-J was great, and the Pythia models are great, but the Pile and the Stack, everybody uses them. So hopefully we get more people to contribute time to work on data sets instead of doing the 100th open model that performs worse than all the other ones, but they want to say they released the model. [00:50:14]Tri: Yeah, maybe the question is, how do we figure out an incentive structure so that companies are willing to release open data sets? And for example, it could be like, I think some of the organizations are now doing this where they are asking volunteers to annotate and so on. And maybe the Wikipedia model of data sets, especially for instruction tuning, could be interesting, where people actually volunteer their time and instead of editing Wikipedia, add annotation. And somehow they are acknowledged and feel incentivized to do so. Hopefully we get to that kind of level; in terms of data, it would be kind of like Wikipedia. And in terms of model development, it's kind of like Linux, where people are contributing patches and improving the model in some way. I don't know exactly how that's going to happen, but based on history, I think there is a way to get there. 
[00:51:05]Alessio: Yeah, I think the Dolly-15K data set is a good example of a company saying, let's do this smaller thing, just make sure we make it open. We had Mike Conover from Databricks on the podcast, and he was like, people just bought into it and leadership was bought into it. You have companies out there with 200,000, 300,000 employees. It's like, just put some of them to label some data. It's going to be helpful. So I'm curious to see how that evolves. What made you decide to join Together? [00:51:35]Tri: Together has been focusing a lot on open source models, and I think that aligns quite well with what I care about, of course. I also know a bunch of people there that I know and trust, and I'm excited to work with them. Philosophically, the way they've been really open with data set and model releases, I like that a lot. Personally, for the research that I've developed, for example, we also try to make code available, free to use and modify and so on, contributing to the community. That has given us really valuable feedback from the community and improved our work. So philosophically, I like the way Together has been focusing on open source models. And the nice thing is we're also going to be at the forefront of research, and the kind of research areas that I'm really excited about, things like efficient training and inference, align quite well with what the company is doing. We'll try our best to make things open and available to everyone. Yeah, but it's going to be fun being at the company, leading a team, doing research on the topics that I really care about, and hopefully we'll make things open to benefit the community. [00:52:45]Alessio: Awesome. Let's jump into the lightning round. Usually, I have two questions. So one is on acceleration, one on exploration, and then a takeaway. So the first one is, what's something that already happened in AI machine learning that you thought would take much longer than it has? 
[00:53:01]

Tri: I think understanding jokes. I didn't expect that to happen, but it turns out that by scaling models up and training on lots of data, the model can now understand jokes. Maybe it's a small thing, but that was amazing to me. [00:53:16]

Alessio: What about the exploration side? What are some of the most interesting unsolved questions in the space? [00:53:22]

Tri: I would say reasoning, in the broad sense. We don't really know how these models do it. Essentially, they do something that looks like reasoning, but we don't know how they're doing it. We have some ideas, and in the future, I think we will need to design architectures that explicitly have some kind of reasoning module in them if we want much more capable models. [00:53:43]

Alessio: What's one message you want everyone to remember today? [00:53:47]

Tri: I would say try to understand both the algorithms and the systems that these algorithms run on. The intersection of machine learning and systems has been really exciting, and there have been a lot of amazing results at this intersection. And when you scale models to large scale, both the machine learning side and the system side really matter. [00:54:06]

Alessio: Awesome. Well, thank you so much for coming on, Tri. [00:54:09]

Tri: This was great. Yeah, this has been really fun. [00:54:11]

Get full access to Latent Space at www.latent.space/subscribe

Ciencia en Bicicleta
Cerebros queer

Ciencia en Bicicleta

Play Episode Listen Later Jul 2, 2023 50:16


"The universe is not only QUEERER than we suppose, but queerer than we can suppose," said the British scientist John Burdon Haldane, known for his wide-ranging contributions to physiology, genetics, evolutionary biology, and mathematics. And when it comes to sexual orientation or identity, the neuroscientist Alejandro Velásquez Torres says: "YOU ARE BORN WITH IT." Why? Isn't it learned? Does the family influence the choice? Is it a predominantly biological matter? In this episode of Ciencia en Bicicleta, we discuss what neuroscience and biology have found about sexual identity and orientation. At Explora we bring science into the debate and invite a PUBLIC CONVERSATION that helps turn conjecture into a chance for understanding. Alejandro Velásquez Torres is a physician from the Universidad del Rosario, holds a master's degree in Neuroscience from the University of Salamanca, and is a doctoral candidate in biomedical sciences at the Universidad Tecnológica de Pereira. He is also a member of the NEUROS neuroscience research group at the Universidad del Rosario.

Más de uno
#HistoriaD: La colección de los diez mil cerebros

Más de uno

Play Episode Listen Later Jun 6, 2023 4:31


In the basement of a university in Denmark, 10,000 brains are stored and catalogued. Each one bears an identifying number. They are the brains of people who suffered from mental illness. Javier Cancho tells the story behind these clinical records.

Que se vayan todos
Qsvt cdh 79 cerebros hackeables

Que se vayan todos

Play Episode Listen Later May 14, 2023 54:43


Your brain belongs to you less than you think. FULL EPISODE AND LIVE PARTICIPATION AT https://www.patreon.com/profesorbriceno Recording sessions can be watched live on TWITCH https://www.twitch.tv/profesorbriceno SUBSCRIBE TO THE AUDIO PODCAST ON ANY PLATFORM SPOTIFY https://open.spotify.com/show/3rFE3ZP8OXMLUEN448Ne5i?si=1cec891caf6c4e03 APPLE PODCASTS https://podcasts.apple.com/es/podcast/que-se-vayan-todos/id676871115 GOOGLE PODCASTS https://www.ivoox.com/en/podcast-que-se-vayan-todos_sq_f11549_1.html FEED FOR ANY PODCAST APP https://www.ivoox.com/feed_fg_f11549_filtro_1.xml TOUR DATES www.profesorbriceno.com/tour FOR SUBSCRIBERS ONLY. REPRODUCTION PROHIBITED. DARK HUMOR, NOT SUITABLE FOR MINORS OR SENSITIVE SOULS. RECORDED FEB 2023

Sin Complejos
Al margen. Leyes del Puño, cerebros de alquiler

Sin Complejos

Play Episode Listen Later Apr 1, 2023 4:30


Carmen Carbonell highlights Javier Somalo's article published in Libertad Digital.