Podcasts about Cohere

  • 223 podcasts
  • 460 episodes
  • 43m average duration
  • 5 weekly new episodes
  • Latest: Apr 15, 2025

POPULARITY

[Popularity chart by year, 2017–2024]


Best podcasts about Cohere

Latest podcast episodes about Cohere

Unsupervised Learning
Ep 62: CEO of Cohere Aidan Gomez on Scaling Limits Emerging, AI Use-cases with PMF & Life After Transformers

Unsupervised Learning

Play Episode Listen Later Apr 15, 2025 50:44


Aidan joined this week's Unsupervised Learning for a wide-ranging conversation on model architectures, enterprise adoption, and what's breaking in the foundation model stack. If you're building or investing in AI infrastructure, Aidan is worth listening to. He co-authored the original Transformer paper, leads one of the most advanced model labs outside of the hyperscalers, and is now building for real-world enterprise deployment with Cohere's agent platform, North. Cohere serves thousands of customers across sectors like finance, telco, and healthcare, and has made a name for itself by staying model-agnostic, privacy-forward, and deeply international (with major bets in Japan and Korea).

Chapters:
(0:00) Intro
(0:32) Enterprise AI
(3:23) Custom Integrations and Future of AI Agents
(4:33) Enterprise Use Cases for Gen AI
(7:02) The Importance of Reasoning in AI Models
(10:38) Custom Models and Synthetic Data
(17:48) Cohere's Approach to AI Applications
(23:24) Future Use Cases and Market Fit
(27:11) Building a Unified Automation Platform
(27:34) Strategic Decisions in the AI Journey
(29:19) International Partnerships and Language Models
(31:05) Future of Foundation Models
(32:27) AI in Specialized Domains
(34:40) Challenges in Data Integration
(35:06) Emerging Foundation Model Companies
(35:31) Technological Frontiers and Architectures
(37:29) Scaling Hypothesis and Model Capabilities
(42:26) AI Research Culture and Team Building
(44:39) Future of AI and Societal Impact
(48:31) Addressing AI Risks

With your co-hosts:
@jacobeffron - Partner at Redpoint, Former PM Flatiron Health
@patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia - Former COO GitHub, Founder Bitnami (acquired by VMware)
@jordan_segall - Partner at Redpoint

programmier.bar – der Podcast für App- und Webentwicklung
News AI 13/25: Manus AI // Open AI Agent Tools // Google's Gemini // Mistral 3.1 Small

programmier.bar – der Podcast für App- und Webentwicklung

Play Episode Listen Later Mar 26, 2025 39:53


Dennis and Philipp guide you through the most exciting developments in the AI world. In focus this week: everything from model updates to practical tools that could revolutionize your AI workflows.

Manus AI, the new star in the AI-agent sky, is currently causing a stir. We analyze what's behind the DeepSeek hype and how it compares to other models. Next up is Cohere and its Command A. What makes Command A so special, and how does it position itself as an efficient alternative to GPT-4o? OpenAI is in the mix too: they have released o1-pro in their API, but at what price? We compare its cost and performance with other models and discuss who the upgrade is worth it for. OpenAI is also shaking up agent development: with the new Agent Building Tools, creating AI assistants should become easier than ever. On top of that, there are next-gen audio models, including voice customization, that could set new standards. Google hasn't been sleeping these past weeks either and has launched the Gemini app, from collaboration features to a handy Gemini Cookbook quickstart notebook. We cover the most important updates. The open-source scene is booming! Google DeepMind has

CanCon Podcast
Cohere commands attention, Apple Intelligence delays, plus a vibe coding PSA

CanCon Podcast

Play Episode Listen Later Mar 23, 2025 56:10


"Apple won't use user data to develop and train AI. And Apple lives in a world where AI is going to be table stakes in any software or hardware moving forward."

Call it counter-programming. A tech palate cleanse for all the election and trade war talk. This week, Rob and Doug tackle the latest AI developments: vibe coding, Cohere's (temporary?) LLM pole position, AI data centres from Telus (powered by Nvidia), and Apple's delayed intelligence.

The BetaKit Podcast is presented by The Cyber Challenge, powered by Rogers Cybersecure Catalyst and CCTX—your pathway to new sales, industry connections, and non-dilutive funding. If you're ready to scale, refine, and lead cybersecurity innovation, apply today at www.thecyberchallenge.ca. The BetaKit Podcast is also brought to you by Consensus, where innovators meet investors. This May, crypto's longest-running conference will welcome 20,000 attendees to shape the future of the decentralized digital economy at its inaugural festival in Toronto, Canada's largest tech and financial hub. You can't afford to miss it. Visit go.coindesk.com/betakit to sign up and save 20% off your ticket!

Related links:
* Something Is Rotten in the State of Cupertino
* Apple's AI division reportedly starting from scratch
* Apple's Craig Federighi Discusses the Future of iPhone AI
* Apple's Catch-22
* Cohere's Command A goes brrrrrrrrr
* Cohere had to extend the axis a bit
* Telus, data centres and 'Sovereign AI' (powered by Nvidia)
* 25% of Y Combinator's latest startup batch is vibe coding
* Vibe coding startups are already raising millions
* @leojr94_ vibe coded his way into a cyber attack (lol)

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
AI Explorer Series (Part 3: Anthropic, Hugging Face, Cohere)

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Mar 19, 2025 78:26


In this conversation, Krish Palaniappan delves into the AWS AI series, focusing on Amazon Bedrock and its foundational models. He discusses the differences between serverless models and the Bedrock marketplace, the importance of selecting the right model for specific use cases, and the training and inference processes in AI. The conversation also compares AWS Bedrock with Azure's offerings and emphasizes the complexities of AI architecture in modern development. Krish then turns to the complexities of selecting AI models and platforms, particularly Bedrock and Hugging Face. He discusses the challenges startups face in asset comparisons, the importance of initial architecture in software development, and the evolving landscape of AI tools. The conversation emphasizes the need for a strategic approach to model selection, deployment, and understanding pricing structures, while also highlighting the significance of community engagement in the AI space.

Snowpal Products:
* Backends as Services on AWS Marketplace
* Mobile Apps on App Store and Play Store
* Web App
* Education Platform for Learners and Course Creators

Machine Learning Street Talk
Reasoning, Robustness, and Human Feedback in AI - Max Bartolo (Cohere)

Machine Learning Street Talk

Play Episode Listen Later Mar 18, 2025 83:11


Dr. Max Bartolo from Cohere discusses machine learning model development, evaluation, and robustness. Key topics include model reasoning, the DynaBench platform for dynamic benchmarking, data-centric AI development, model training challenges, and the limitations of human feedback mechanisms. The conversation also covers technical aspects like influence functions, model quantization, and the PRISM project.

Max Bartolo (Cohere):
https://www.maxbartolo.com/
https://cohere.com/command

TRANSCRIPT:
https://www.dropbox.com/scl/fi/vujxscaffw37pqgb6hpie/MAXB.pdf?rlkey=0oqjxs5u49eqa2m7uaol64lbw&dl=0

TOC:
1. Model Reasoning and Verification
[00:00:00] 1.1 Model Consistency and Reasoning Verification
[00:03:25] 1.2 Influence Functions and Distributed Knowledge Analysis
[00:10:28] 1.3 AI Application Development and Model Deployment
[00:14:24] 1.4 AI Alignment and Human Feedback Limitations
2. Evaluation and Bias Assessment
[00:20:15] 2.1 Human Evaluation Challenges and Factuality Assessment
[00:27:15] 2.2 Cultural and Demographic Influences on Model Behavior
[00:32:43] 2.3 Adversarial Examples and Model Robustness
3. Benchmarking Systems and Methods
[00:41:54] 3.1 DynaBench and Dynamic Benchmarking Approaches
[00:50:02] 3.2 Benchmarking Challenges and Alternative Metrics
[00:50:33] 3.3 Evolution of Model Benchmarking Methods
[00:51:15] 3.4 Hierarchical Capability Testing Framework
[00:52:35] 3.5 Benchmark Platforms and Tools
4. Model Architecture and Performance
[00:55:15] 4.1 Cohere's Model Development Process
[01:00:26] 4.2 Model Quantization and Performance Evaluation
[01:05:18] 4.3 Reasoning Capabilities and Benchmark Standards
[01:08:27] 4.4 Training Progression and Technical Challenges
5. Future Directions and Challenges
[01:13:48] 5.1 Context Window Evolution and Trade-offs
[01:22:47] 5.2 Enterprise Applications and Future Challenges

REFS:
[00:03:10] Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models, Laura Ruis, Max Bartolo, et al.
https://cohere.com/research/papers/procedural-knowledge-in-pretraining-drives-reasoning-in-large-language-models-2024-11-20
[00:04:15] Understanding Black-box Predictions via Influence Functions, Koh & Liang
https://arxiv.org/abs/1703.04730
[00:08:05] Studying Large Language Model Generalization with Influence Functions, Roger Grosse et al.
https://storage.prod.researchhub.com/uploads/papers/2023/08/08/2308.03296.pdf
[00:11:10] The LLM ARChitect: Solving ARC-AGI Is A Matter of Perspective, Daniel Franzen, Jan Disselhoff, and David Hartmann
https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:12:10] Hugging Face model repo for C4AI Command A, Cohere and Cohere For AI
https://huggingface.co/CohereForAI/c4ai-command-a-03-2025
[00:13:30] Open Interpreter
https://github.com/KillianLucas/open-interpreter
[00:16:15] Human Feedback is not Gold Standard, Tom Hosking, Max Bartolo, Phil Blunsom
https://arxiv.org/abs/2309.16349
[00:27:15] The PRISM Alignment Dataset, Hannah Kirk et al.
https://arxiv.org/abs/2404.16019
[00:32:50] Adversarial Examples Are Not Bugs, They Are Features, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry
https://arxiv.org/abs/1905.02175
[00:43:00] DynaBench platform paper, Douwe Kiela et al.
https://aclanthology.org/2021.naacl-main.324.pdf
[00:50:15] Sara Hooker's work on compute limitations
https://arxiv.org/html/2407.05694v1
[00:53:25] DataPerf: Community-led benchmark suite, Mazumder et al.
https://arxiv.org/abs/2207.10062
[01:04:35] DROP, Dheeru Dua et al.
https://arxiv.org/abs/1903.00161
[01:07:05] GSM8K, Cobbe et al.
https://paperswithcode.com/sota/arithmetic-reasoning-on-gsm8k
[01:09:30] ARC, François Chollet
https://github.com/fchollet/ARC-AGI
[01:15:50] Command A, Cohere
https://cohere.com/blog/command-a
[01:22:55] Enterprise search using LLMs, Cohere
https://cohere.com/blog/commonly-asked-questions-about-search-from-coheres-enterprise-customers
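The model quantization topic in section 4.2 of the episode boils down to a simple idea: map float weights onto a small integer grid plus a scale factor. A minimal symmetric int8 round-trip sketch of that general idea (purely illustrative, not Cohere's implementation):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: scale by the max |w|."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.array([0.02, -1.27, 0.5, 0.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

The evaluation question the episode raises is exactly how much the gap between `w` and `w_hat` costs in downstream task performance, which is why quantized models are benchmarked separately.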

Mon Carnet, l'actu numérique
{INTERVIEW} - AI is taking root in Toronto, with Chloé Sondervorst

Mon Carnet, l'actu numérique

Play Episode Listen Later Mar 16, 2025 10:44


Toronto was the scene of several artificial intelligence announcements this week. Moonvalley unveiled Marey, a video-generation model built on licensed data and aimed at the entertainment industry. This ethical approach could influence the sector amid criticism over the provenance of training data. For its part, Cohere launched a ChatGPT competitor aimed at enterprises. These advances mark Canada's rise in AI. I discuss them with Chloé Sondervorst, producer and AI observer at Radio-Canada.

AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store

These news snippets from March 14th, 2025, highlight several key developments in the AI landscape, including OpenAI's push for federal AI policy and their concerns regarding state-level regulations and competition with China. The advancements also showcase new AI models from Cohere and Google's Gemini, focusing on enterprise solutions and personalised user experiences respectively. Furthermore, the texts address security risks associated with AI, such as espionage and the potential for misuse in cyberattacks. Finally, the updates touch upon AI integration into existing platforms like Windows and the transformative impact of AI on sectors like healthcare, drug discovery, and global supply chains, alongside the ongoing debate about training AI on copyrighted material.

The top AI news from the past week, every ThursdAI

LET'S GO! Happy second birthday to ThursdAI, your favorite weekly AI news show! Can you believe it's been two whole years since we jumped into that random Twitter Space to rant about GPT-4? From humble beginnings as a late-night Twitter chat to a full-blown podcast, newsletter, and YouTube show with hundreds of thousands of downloads, it's been an absolutely wild ride! That's right, two whole years of me, Alex Volkov, your friendly AI Evangelist, along with my amazing co-hosts, trying to keep you up to date on the breakneck speed of the AI world.

And what better way to celebrate than with a week PACKED with insane AI news? Buckle up, folks, because this week Google went OPEN SOURCE crazy, Gemini got even cooler, OpenAI created a whole new Agents SDK, and the open-source community continues to blow our minds. We've got it all, from game-changing model releases to mind-bending demos.

This week I'm also on the Weights & Biases company retreat, so TL;DR first and then the newsletter. But honestly, I'll start embedding the live show here in the Substack from now on, because we're getting so good at it I barely have to edit lately, and there's a LOT to show you guys!
TL;DR and Show Notes & Links

* Hosts & Guests
* Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
* Co-hosts - @WolframRvnwlf @ldjconfirmed @nisten
* Sandra Kublik - DevRel at Cohere (@itsSandraKublik)
* Open Source LLMs
* Google open sources Gemma 3 - 1B to 27B - 128K context (Blog, AI Studio, HF)
* EuroBERT - multilingual encoder models (210M to 2.1B params)
* Reka Flash 3 (reasoning), 21B parameters, is open sourced (Blog, HF)
* Cohere Command A 111B model - 256K context (Blog)
* Nous Research Deep Hermes 24B / 3B Hybrid Reasoners (X, HF)
* AllenAI OLMo 2 32B - fully open source GPT-4-level model (X, Blog, Try It)
* Big CO LLMs + APIs
* Gemini Flash generates images natively (X, AI Studio)
* Google Deep Research is now free in the Gemini app and powered by Gemini Thinking (Try It, no cost)
* OpenAI released the new Responses API with Web Search, File Search, and Computer Use tools (X, Blog)
* This Week's Buzz
* The whole company is at an offsite at Oceanside, CA
* W&B held an internal MCP hackathon with cool projects - launching an MCP server soon!
* Vision & Video
* Remade AI - 8 LoRA video effects for WanX (HF)
* AI Art & Diffusion & 3D
* ByteDance Seedream 2.0 - a native Chinese-English bilingual image generation foundation model (Blog, Paper)
* Tools
* Everyone's talking about Manus (manus.im)
* Google AI Studio now supports YouTube understanding via link dropping

Open Source LLMs: Gemma 3, EuroBERT, Reka Flash 3, and Cohere Command A Unleashed!

This week was absolutely HUGE for open source, folks. Google dropped a BOMBSHELL with Gemma 3! As Wolfram pointed out, this is a "very technical achievement," and it's not just one model but a whole family ranging from 1 billion to 27 billion parameters. And get this: the 27B model can run on a SINGLE GPU! Sundar Pichai himself claimed you'd need "at least 10X compute to get similar performance from other models." Insane!

Gemma 3 isn't just about size; it's packed with features.
We're talking multimodal capabilities (text, images, and video!), support for over 140 languages, and a massive 128K context window. As Nisten pointed out, "it might actually end up being the best at multimodal in that regard" for local models. Plus, it's fine-tuned for safety and comes with ShieldGemma 2 for content moderation. You can grab Gemma 3 on Google AI Studio, Hugging Face, Ollama, Kaggle – everywhere! Huge shoutout to Omar Sanseviero and the Google team for this incredible release and for supporting the open-source community from day one! Colin, aka Bartowski, was right: "The best thing about Gemma is the fact that Google specifically helped the open source communities to get day one support." This is how you do open source right!

Next up, we have EuroBERT, a new family of multilingual encoder models. Wolfram, our European representative, was particularly excited about this one: "In European languages, you have different characters than in other languages. And, um, yeah, encoding everything properly is, uh, difficult." Ranging from 210 million to 2.1 billion parameters, EuroBERT is designed to push the boundaries of NLP in European and global languages. With training on a massive 5-trillion-token dataset across 15 languages and support for 8K context tokens, EuroBERT is a workhorse for RAG and other NLP tasks. Plus, how cool is their mascot?

Reka Flash 3: a 21B reasoner under Apache 2.0, trained with RLOO

And the open source train keeps rolling! Reka AI dropped Reka Flash 3, a 21-billion-parameter reasoning model with an Apache 2.0 license! Nisten was blown away by the benchmarks: "This might be one of the best, like, 20B-size models that there is right now. And it's Apache 2.0. Uh, I, I think this is a much bigger deal than most people realize." Reka Flash 3 is compact, efficient, and excels at chat, coding, instruction following, and function calling. They even used a new reinforcement learning technique called REINFORCE Leave-One-Out (RLOO).
Go give it a whirl on Hugging Face or their chat interface, chat.reka.ai!

Last but definitely not least in the open-source realm, we had a special guest, Sandra (@itsSandraKublik) from Cohere, join us to announce Command A! This beast of a model clocks in at 111 BILLION parameters with a massive 256K context window. Sandra emphasized its efficiency: "It requires only two GPUs. Typically the models of this size require 32 GPUs. So it's a huge, huge difference." Command A is designed for enterprises, focusing on agentic tasks, tool use, and multilingual performance. It's optimized for private deployments and boasts enterprise-grade security. Congrats to Sandra and the Cohere team on this massive release!

Big CO LLMs + APIs: Gemini Flash Gets Visual, Deep Research Goes Free, and OpenAI Builds for Agents

The big companies weren't sleeping either! Google continued their awesome week by unleashing native image generation in Gemini Flash Experimental! This is seriously f*****g cool, folks! Sorry for my French, but it's true. You can now directly interact with images, tell Gemini what to do, and it just does it. We even showed it live on the stream, turning ourselves into cat-confetti-birthday-hat-wearing masterpieces! Wolfram was right: "It's also a sign what we will see in, like, Photoshop, for example. Where you, you expect to just talk to it and have it do everything that a graphic designer would be doing." The future of creative tools is HERE.

And guess what else Google did? They made Deep Research FREE in the Gemini app and powered it with Gemini Thinking! Nisten jumped in to test it live, and we were all impressed. "This is the nicest interface so far that I've seen," he said. Deep Research now digs through HUNDREDS of websites (Nisten's test hit 156!) to give you comprehensive answers, and the interface is slick and user-friendly. Plus, you can export to Google Docs! Intelligence too cheap to meter?
Google is definitely pushing that boundary.

Last-second addition: Allen Institute for AI released OLMo 2 32B, their biggest open model yet

Just as I'm writing this, friend of the pod Nathan from the Allen Institute for AI announced the release of a FULLY OPEN OLMo 2, which includes weights, code, dataset, everything, and apparently it beats GPT-3.5, GPT-4o mini, and leading open-weight models like Qwen and Mistral. Evals look legit, but more than that, this is an Apache 2.0 model with everything in place to advance open AI and open science! Check out Nathan's tweet for more info, and congrats to the Allen team for this awesome release!

OpenAI's new Responses API and agent tooling with Web Search, File Search, and CUA tools

Of course, OpenAI wasn't going to let Google have all the fun. They dropped a whole new way to build with OpenAI, the Responses API, designed specifically for the agentic era we're entering. They also released three new tools: Web Search, Computer Use, and File Search. The Web Search tool is self-explanatory: finally, built-in web search from OpenAI! The Computer Use tool, while currently limited in availability, opens up exciting possibilities for agent automation, letting agents interact with computer interfaces. And the File Search tool gives you a built-in RAG system, simplifying knowledge retrieval from your own files. As always, OpenAI is adapting to the agentic world and giving developers more power.

Finally in the big-company space, Nous Research released PORTAL, their new inference API service. Now you can access their awesome models, like Hermes 3 Llama 70B and DeepHermes 3 8B, directly via API. It's great to see more open-source labs offering API access, making these powerful models even more accessible.

This Week's Buzz at Weights & Biases: Offsite Hackathon and MCP Mania!

This week's "This Week's Buzz" segment comes to you live from Oceanside, California!
The whole Weights & Biases team is here for our company offsite. Despite the not-so-sunny California weather (thanks, storm!), it's been an incredible week of meeting colleagues, strategizing, and HACKING!

And speaking of hacking, we had an MCP hackathon! After last week's MCP-pilling episode, we were all hyped about Model Context Protocol, and the team didn't disappoint. In just three hours, the innovation was flowing! We saw agents built for WordPress, MCP support integrated into the Weave playground, and even MCP servers for Weights & Biases itself! Get ready, folks, because an MCP server for Weights & Biases is COMING SOON! You'll be able to talk to your W&B data like never before. Huge shoutout to the W&B team for their incredible talent and for embracing the agentic future! And in case you missed it, Weights & Biases is now part of the CoreWeave family! Exciting times ahead!

Vision & Video: LoRA Video Effects and OpenSora 2.0

Moving into vision and video, Remade AI released 8 LoRA video effects for WanX. Remember WanX, from Alibaba? Now you can add crazy effects like "squish," "inflate," "deflate," and even "cakeify" to your videos using LoRAs. It's open source and super cool to see video effects becoming trainable and customizable.

And in the realm of open-source video generation, OpenSora 2.0 dropped! This 11-billion-parameter model claims state-of-the-art video generation trained for just $200,000! They're even claiming performance close to Sora itself on some benchmarks. Nisten checked out the demos, and while we're all a bit jaded now with the rapid pace of video AI, it's still mind-blowing how far we've come. Open-source video is getting seriously impressive, seriously fast.

AI Art & Diffusion & 3D: ByteDance's Bilingual Seedream 2.0

ByteDance, the folks behind TikTok, released Seedream 2.0, a native Chinese-English bilingual image generation foundation model. This model excels at text rendering, cultural nuance, and human preference alignment.
Seedream 2.0 boasts "powerful general capability," "native bilingual comprehension ability," and "excellent text rendering." It's designed to understand both Chinese and English prompts natively, generating high-quality, culturally relevant images. The examples look stunning, especially its ability to render Chinese text beautifully.

Tools: Manus AI Agent and Google AI Studio YouTube Links

Finally, in the tools section, everyone's buzzing about Manus, a new AI research agent. We gave it a try live on the show, asking it to do some research. The UI is slick, and it seems to be using Claude 3.7 behind the scenes. Manus creates a to-do list, browses the web in a real Chrome browser, and even generates files. It's like Operator on steroids. We'll be keeping an eye on Manus and will report back on its performance in future episodes.

And Google AI Studio keeps getting better! Now you can drop YouTube links into Google AI Studio, and it will natively understand the video! This is HUGE for video analysis and content understanding. Imagine using this for support, content summarization, and so much more.

PHEW! What a week to celebrate two years of ThursdAI! From open-source explosions to Gemini's visual prowess and OpenAI's agentic advancements, the AI world is moving faster than ever. As Wolfram aptly put it, "The acceleration, you can feel it." And Nisten reminded us of the incredible journey: "I remember I had early access to GPT-4 32K, and, uh, then... the person for the contract that had given me access, they cut it off because on the one weekend, I didn't realize how expensive it was. So I had to use $180 worth of tokens just trying it out." Now we have models that are more powerful and more accessible than ever before. Thank you to Wolfram, Nisten, and LDJ for co-hosting and bringing their insights every week. And most importantly, THANK YOU to our amazing community for tuning in, listening, and supporting ThursdAI for two incredible years!
We couldn't do it without you. Here's to another year of staying up to date so YOU don't have to! Don't forget to subscribe to the podcast, YouTube channel, and newsletter to stay in the loop. And share ThursdAI with a friend – it's the best birthday gift you can give us! Until next week, keep building and keep exploring the amazing world of AI! LET'S GO!

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
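The REINFORCE Leave-One-Out (RLOO) technique mentioned above for Reka Flash 3 has a simple core: for each completion sampled for a prompt, the baseline is the mean reward of the *other* samples, so the advantage needs no learned value function. A toy sketch of just that baseline computation (illustrative only, not Reka's actual training code):

```python
def rloo_advantages(rewards):
    """RLOO: each sample's baseline is the mean reward of the other
    k-1 completions drawn for the same prompt (leave-one-out)."""
    k = len(rewards)
    total = sum(rewards)
    # advantage_i = r_i - mean(rewards excluding r_i)
    return [r - (total - r) / (k - 1) for r in rewards]

# Four completions sampled for one prompt, scored by a reward model:
advs = rloo_advantages([1.0, 0.0, 0.5, 0.5])
# The best completion gets a positive advantage, the worst a negative one.
```

In the full algorithm these advantages weight the policy-gradient term for each completion; the leave-one-out baseline keeps the estimate unbiased while reducing variance.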

AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store

Google is reinventing search through AI-driven overviews, while Amazon is aggressively pursuing Agentic AI and hybrid reasoning models. Researchers are being recognised for reinforcement learning achievements, and warnings are emerging about emotional attachments to hyper-realistic AI voices. Meanwhile, legal battles surrounding OpenAI's for-profit transition continue, and academic institutions are benefiting from initiatives like OpenAI's NextGenAI. Furthermore, Cohere has launched an impressive multilingual vision model, while incidents such as students using AI to cheat in interviews highlight ongoing ethical challenges.

Business Podcast by Roohi | VC, Startups
Deep Dive on The Komo Club Ft. Rohit Bhargava(Host of The Startup Playbook Podcast)

Business Podcast by Roohi | VC, Startups

Play Episode Listen Later Mar 2, 2025 31:12


In this podcast I am joined by a dream guest, Rohit Bhargava: host of The Startup Playbook Podcast and founder of The Komo Club.

We chat about:
* Podcasting in 2025 and beyond
* The Komo Club genesis
* The Startup Playbook Podcast
* Playbook Ventures

Guest Rohit's handles ⤵︎
The Komo Club website: https://www.thekomoclub.com/
Email: rohit@startupplaybook.co
Rohit's LinkedIn: https://www.linkedin.com/in/rohbhargava/
The Startup Playbook Podcast: https://open.spotify.com/show/0pIPF1J8KiK0VnCq5XIl03?si=6db67403426c4e0c

Host Roohi Kazi's handles ⤵︎
LinkedIn: https://www.linkedin.com/in/roohi-kazi-53174113b/
Instagram: https://www.instagram.com/roohik2/#
Twitter: https://x.com/roohi_kr
E-mail: bizpodroohi2@gmail.com

TO GET FEATURED ON "Business Podcast by Roohi", email: bizpodroohi2@gmail.com

Tank Talks
News Rundown 2/24/25: BDC Capital Goes Big On Growth, CVCA's VC Trends Show Problems, High Speed Rail Going Nowhere Fast, and Cohere vs The Media Giants

Tank Talks

Play Episode Listen Later Feb 24, 2025 24:32


Matt Cohen and John Ruffolo talk about the BDC Capital $1B fund, the state of early-stage VC funding in Canada, and the rise of mega-deals dominated by U.S. investors. They also discuss the feasibility of the Quebec City-Toronto high-speed rail project, AI copyright lawsuits, potential Trump-era tariffs, and the future of open banking in Canada.

Key Topics

BDC Capital's $1B Growth-Stage Investment Fund (00:42)
* BDC Capital announces a $1B investment fund, with:
* $500M Growth Venture Fund for direct investments and co-investments.
* $450M Growth Equity Partners Program for minority-stake investments in mid-market companies.
* Concerns raised by Mark McQueen about the lack of early-stage funding.
* John Ruffolo's take:
* Canada's early-stage VC ecosystem is underfunded.
* BDC was meant to focus on riskier, early-stage investments, while EDC (Export Development Canada) focused on growth-stage.
* A shift towards later-stage funding may leave early-stage startups without necessary capital.

Canadian Venture Capital Funding Trends (04:55)
* CVCA's 2024 report:
* $7.86B invested across 592 deals, up 10% from 2023.
* Mega-deals ($50M+ rounds) comprised 62% of total VC investments.
* Seed-stage funding fell 50% to $510M.
* Notable mega-deals:
* Clio – $1.24B Series F
* Cohere – $616M Series D
* Blockstream – $289M convertible note
* Waabi – $275M Series B
* U.S. investors dominate:
* 32% of Canadian VC deals had U.S. investor participation.
* Clio's round was entirely U.S.-funded.
* John Ruffolo's analysis:
* Canada needs stronger domestic venture capital.
* U.S. capital will always flow into late-stage companies, but early-stage funding is crucial for long-term ecosystem growth.
* The lack of Canadian IPOs in 2024 is a concerning sign.

Quebec City-Toronto High-Speed Rail: $90B Boondoggle? (09:17)
* Massive infrastructure proposal:
* $60B–$90B price tag, with $3.9B allocated to planning alone.
* Construction won't begin for at least five years, taking 5–7 years per segment.
* Criticisms:
* The timing is political (announced right before an election).
* Where is the funding coming from? Canada's finances are already stretched.
* Route selection is questionable, e.g., Laval getting a stop over Mississauga/Brampton.
* John Ruffolo's take:
* Financial viability is unclear; pension funds won't invest without guarantees of ridership.
* Other priorities (e.g., Arctic infrastructure, national security) are being ignored.
* The government should invest in digital infrastructure instead (e.g., full 5G coverage).

AI Copyright Lawsuits: Cohere vs. Media Giants (14:35)
* A major media coalition (The Atlantic, Forbes, The Guardian, Vox, etc.) sues AI startup Cohere for copyright infringement in New York.
* Allegations: Cohere scraped and displayed copyrighted content without permission.
* Seeking $150K per work infringed plus an injunction against Cohere using their content.
* Growing legal pressure on AI companies:
* NY Times vs. OpenAI could set a massive precedent.
* Anthropic, Meta, and Thomson Reuters have faced similar lawsuits.
* John Ruffolo's view:
* Copyright concerns were always an issue for AI models.
* AI startups may have to pay into a licensing pool (like the music industry).
* Investor risk is increasing; legal uncertainties may impact funding for public LLMs.

Trump's Potential Tariffs: What Canada Should Do (19:25)
* Trump's trade policies are likely to return if he is re-elected, impacting Canadian businesses.
* John Ruffolo's recommendations:
* Canada must fix internal issues first (e.g., interprovincial trade barriers).
* Tariffs won't disappear for at least four years, so businesses must adapt.
* Canadian businesses will have to shift profits and operations to the U.S. to remain competitive.

The Future of Open Banking in Canada (22:00)
* The U.S. fintech sector gains a boost as the Trump administration removes CFPB regulations.
* Chime and Klarna are expected to benefit from deregulation.
* Canadian Conservatives promise a major push for open banking if elected.
* Liberals have been slow to act on open banking despite six years of promises.
* John Ruffolo's perspective:
* Open banking will make Canadian banks stronger, not weaker.
* Canada must prepare for U.S. competition in financial services.

Follow Matt Cohen and Tank Talks here!

Podcast production support provided by Agentbee.ai

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tanktalks.substack.com

Engadget
Major publishers sue AI startup Cohere over copyright infringement

Engadget

Play Episode Listen Later Feb 14, 2025 7:22


The Canadian company is currently worth $5 billion. Learn more about your ad choices. Visit podcastchoices.com/adchoices

WSJ Tech News Briefing
TNB Tech Minute: Musk Says He'll Pull OpenAI Bid If It Stays a Nonprofit

WSJ Tech News Briefing

Play Episode Listen Later Feb 13, 2025 2:20


Plus, publishers sue AI startup Cohere for copyright and trademark infringement. And, Jeff Bezos's Blue Origin plans to lay off 10% of its workforce. Julie Chang hosts. Learn more about your ad choices. Visit megaphone.fm/adchoices

The ERP Advisor
The ERP Minute Episode 173 - February 11th, 2025

The ERP Advisor

Play Episode Listen Later Feb 12, 2025 2:01


Oracle announced the Oracle Fusion Cloud Applications Suite is now available on Oracle EU Sovereign Cloud to enable private and public sector organizations across the European Union. NetSuite announced it is migrating to Oracle Autonomous Database to enable its customers to take advantage of the enhanced security, reliability, and performance of a fully managed Oracle Database in OCI with integrated AI. Salesforce, in collaboration with Hugging Face, Cohere, and Carnegie Mellon University, announced the release of the AI Energy Score, a first-of-its-kind benchmarking tool that enables AI developers and users to evaluate, identify, and compare the energy consumption of AI models.

Connect with us!
https://www.erpadvisorsgroup.com
866-499-8550
LinkedIn: https://www.linkedin.com/company/erp-advisors-group
Twitter: https://twitter.com/erpadvisorsgrp
Facebook: https://www.facebook.com/erpadvisors
Instagram: https://www.instagram.com/erpadvisorsgroup
Pinterest: https://www.pinterest.com/erpadvisorsgroup
Medium: https://medium.com/@erpadvisorsgroup

The Travel Coach Network Podcast
Helping Travel Coaches Launch Their Offers With Cohere Co-Founder Anette Oran | Episode 124

The Travel Coach Network Podcast

Play Episode Listen Later Feb 12, 2025 23:50


Every travel professional knows how frustrating tech can be. In fact, it's one of the most commonly shared roadblocks when building a coaching business online. Thankfully, we now have a solution for that. The Travel Coach Network is proud to announce a partnership with Cohere!

In this episode, Sahara Rose DeVore, founder of the Travel Coach Network, interviews Anette Oran, co-founder of Cohere, to share why this partnership is so needed.

Register for the masterclass happening on Tuesday, February 18th, 2025 at 1pm Eastern. Click below to register, and for the replay if you find this episode after the masterclass has been recorded:
https://app.cohere.live/contribution-view/679019f40740293de23a91e9/b7858b8d-f286-4809-83d6-f1fbb5facff7/about

If you've been loving the show, we'd so appreciate it if you could leave a 5-star review on Spotify or Apple Podcasts! And of course, we'd love to see you in our free Facebook Group:
https://www.facebook.com/groups/928430197344106

Have questions about the Travel Coach Certification Program? Send me a DM on Instagram over at @travelcoachnetwork.

-------------------

TRAVEL COACHING RESOURCES

Are you ready to elevate your travel business? To achieve clarity, focus, and success instead of constant confusion? If so, then I'd love to invite you to join the Travel Coach Certification Program.

Join the conversation in our Travel Coach Network Global Community. It's our free Facebook Group for aspiring and inspiring travel coaches. If you're brand new to the concept of travel coaching, be sure to grab the Beginner's Guide to Travel Coaching by clicking below.

Website: https://thetravelcoachnetwork.com/
TCN Global Community on Facebook: https://www.facebook.com/groups/travelcoachnetwork
Instagram: https://www.instagram.com/thetravelcoachnetwork/
The Travel Coach Certification Program: https://thetravelcoachnetwork.mykajabi.com/the-travel-coach-program
Free Beginner's Guide to Travel Coaching: https://thetravelcoachnetwork.mykajabi.com/main-email-series-and-workbook
Ultimate Travel Business Planner Bundle: https://www.etsy.com/shop/TravelCoachNetwork?ref=seller-platform-mcnav

Tank Talks

In this episode, Matt welcomes Willson Cross, the co-founder and CEO of Borderless AI, to discuss how AI is transforming the global HR and payroll industry. Willson shares his entrepreneurial journey, from founding and selling GoFetch to launching Borderless AI. They explore how AI-driven compliance, payroll, and onboarding are solving key challenges in hiring global teams. Willson also talks about the company's $35M funding, its partnership with Cohere, and how they differentiate from major competitors like Deel and Rippling.

About Willson Cross:
Willson Cross is the Co-Founder and CEO of Borderless AI, a global payroll platform that uses generative AI to streamline hiring, managing, and paying international employees. Since launching in 2023, the company has raised $27 million from top investors, including Susquehanna and Bernard Arnault. Based in Toronto, Willson leads the team in building AI-powered solutions for the future of work. Before Borderless AI, Willson co-founded GoFetch, Canada's leading pet services marketplace. Starting from his basement in 2015, he grew the company to seven markets, raised $3.5 million, and led a team of 45 before selling the business in 2018. Earlier, he launched UBC Bitcoin Jobs, an online job board that connected university students with cryptocurrency startups, matching over 80 students to 20 companies. Originally from Vancouver, Willson studied economics at New York University before leaving after his third year to pursue startups full-time.

⏱ Topics
* (1:26) – Willson's background & founding GoFetch
* (2:59) – Key lessons from running a bootstrapped startup
* (4:55) – The transition to Borderless AI & identifying HR's biggest challenges
* (6:33) – Payroll & benefits: The first major opportunities
* (6:52) – Building real-time global payroll infrastructure
* (7:50) – Meeting co-founder Sean Agarwal & forming a strong partnership
* (9:45) – AI's role in HR compliance, payroll & automation
* (12:04) – How Cohere's AI models enhance HRGPT
* (15:48) – Competing with Deel & Rippling as an AI-native company
* (18:19) – Pricing strategy & product differentiation
* (19:13) – How AI is transforming HR roles
* (20:47) – The shift toward larger early-stage funding rounds
* (24:30) – Target customers: Startups & large enterprises
* (27:41) – Why Borderless AI chose a full in-office model

Badass Breastfeeding Podcast
Breastfeeding and Jury Duty

Badass Breastfeeding Podcast

Play Episode Listen Later Feb 10, 2025 35:58


Submit your question and we'll answer it in a future episode!
Join our Patreon Community! https://www.patreon.com/badassbreastfeedingpodcast

Have you ever been called for Jury Duty? What about being called to Jury Duty as a breastfeeding mother? What can you do about this? Listen today as Dianne and Abby discuss a specific situation and give some tips on what to do if you were to get called to serve as a juror for Jury Duty. If you are a new listener, we would love to hear from you. Please consider leaving us a review on iTunes or sending us an email with your suggestions and comments to badassbreastfeedingpodcast@gmail.com. You can also add your email to our list and have episodes sent right to your inbox!

Things we talked about:
Messages with questions [4:24]
What is jury duty [10:00]
Every state is different [14:08]
Abby turned her husband in for jury duty [19:10]
FB post from Alabama [22:07]
Breastfeeding support is lacking despite recommendations [27:38]
Pumping in other countries [32:26]

Links to information we discussed or episodes you should check out!
https://badassbreastfeedingpodcast.com/episode/exclusive-pumping/
https://badassbreastfeedingpodcast.com/episode/pumping-stories-from-badasses/

Set up your consultation with Dianne: https://badassbreastfeedingpodcast.com/consultations/

Check out Dianne's blog here: https://diannecassidyconsulting.com/milklytheblog/

Follow our Podcast: https://badassbreastfeedingpodcast.co

Here is how you can connect with Dianne and Abby:
Abby Theuring, https://www.thebadassbreastfeeder.com
Dianne Cassidy @diannecassidyibclc, http://www.diannecassidyconsulting.com

Music we use: "Levels of Greatness" from "We Used to Paint Stars in the Sky (2012)" courtesy of Scott Holmes at freemusicarchive.org/music/Scott Holmes

Good Time Show by Aarthi and Sriram
Ep 93 - From Reading Papers in the Gym to a Billion-Dollar AI Company | Cohere's Untold Story

Good Time Show by Aarthi and Sriram

Play Episode Listen Later Jan 28, 2025 55:35


Chapters:
0:00 Introduction to Aidan Gomez, CEO of Cohere
2:12 Childhood and growing up in Canada
5:50 Getting into computers and internet
10:30 Sending cold emails
14:20 How to work with Aidan
16:40 The AI paper - "Attention Is All You Need"
18:45 Starting Cohere
21:10 Why choose enterprise (vs consumer) as a market
24:45 AI strategy
28:20 Hallucinations in LLM models
30:10 Enterprise software and security implications
32:05 Deloitte, Accenture and the impact of generative AI
36:40 Will LLM scaling laws hit a plateau?
38:50 AGI, reasoning and inference for LLM models
41:30 Synthetic data - what is it? Why is it interesting?
43:25 Looking ahead - what is Cohere's strategy?
46:00 Cohere's capital structure
49:15 Enterprise use cases
52:10 Advice for founders - adaptability
54:15 Thank you

Follow Sriram:
https://www.instagram.com/sriramk/
https://twitter.com/sriramk

Follow Aarthi:
https://www.instagram.com/aarthir/
https://twitter.com/aarthir

Follow the podcast:
https://www.instagram.com/aarthiandsriramshow/
https://twitter.com/aarthisrirampod

Machine Learning Street Talk
How Do AI Models Actually Think? - Laura Ruis

Machine Learning Street Talk

Play Episode Listen Later Jan 20, 2025 78:01


Laura Ruis, a PhD student at University College London and researcher at Cohere, explains her groundbreaking research into how large language models (LLMs) perform reasoning tasks, the fundamental mechanisms underlying LLM reasoning capabilities, and whether these models primarily rely on retrieval or develop procedural knowledge.

SPONSOR MESSAGES:
***
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? Go to https://tufalabs.ai/
***

TOC
1. LLM Foundations and Learning
1.1 Scale and Learning in Language Models [00:00:00]
1.2 Procedural Knowledge vs Fact Retrieval [00:03:40]
1.3 Influence Functions and Model Analysis [00:07:40]
1.4 Role of Code in LLM Reasoning [00:11:10]
1.5 Semantic Understanding and Physical Grounding [00:19:30]
2. Reasoning Architectures and Measurement
2.1 Measuring Understanding and Reasoning in Language Models [00:23:10]
2.2 Formal vs Approximate Reasoning and Model Creativity [00:26:40]
2.3 Symbolic vs Subsymbolic Computation Debate [00:34:10]
2.4 Neural Network Architectures and Tensor Product Representations [00:40:50]
3. AI Agency and Risk Assessment
3.1 Agency and Goal-Directed Behavior in Language Models [00:45:10]
3.2 Defining and Measuring Agency in AI Systems [00:49:50]
3.3 Core Knowledge Systems and Agency Detection [00:54:40]
3.4 Language Models as Agent Models and Simulator Theory [01:03:20]
3.5 AI Safety and Societal Control Mechanisms [01:07:10]
3.6 Evolution of AI Capabilities and Emergent Risks [01:14:20]

REFS:
[00:01:10] Procedural Knowledge in Pretraining & LLM Reasoning, Ruis et al., 2024, https://arxiv.org/abs/2411.12580
[00:03:50] EK-FAC Influence Functions in Large LMs, Grosse et al., 2023, https://arxiv.org/abs/2308.03296
[00:13:05] Surfaces and Essences: Analogy as the Core of Cognition, Hofstadter & Sander, https://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475
[00:13:45] Wittgenstein on Language Games, https://plato.stanford.edu/entries/wittgenstein/
[00:14:30] Montague Semantics for Natural Language, https://plato.stanford.edu/entries/montague-semantics/
[00:19:35] The Chinese Room Argument, David Cole, https://plato.stanford.edu/entries/chinese-room/
[00:19:55] ARC: Abstraction and Reasoning Corpus, François Chollet, https://arxiv.org/abs/1911.01547
[00:24:20] Systematic Generalization in Neural Nets, Lake & Baroni, 2023, https://www.nature.com/articles/s41586-023-06668-3
[00:27:40] Open-Endedness & Creativity in AI, Tim Rocktäschel, https://arxiv.org/html/2406.04268v1
[00:30:50] Fodor & Pylyshyn on Connectionism, https://www.sciencedirect.com/science/article/abs/pii/0010027788900315
[00:31:30] Tensor Product Representations, Smolensky, 1990, https://www.sciencedirect.com/science/article/abs/pii/000437029090007M
[00:35:50] DreamCoder: Wake-Sleep Program Synthesis, Kevin Ellis et al., https://courses.cs.washington.edu/courses/cse599j1/22sp/papers/dreamcoder.pdf
[00:36:30] Compositional Generalization Benchmarks, Ruis, Lake et al., 2022, https://arxiv.org/pdf/2202.10745
[00:40:30] RNNs & Tensor Products, McCoy et al., 2018, https://arxiv.org/abs/1812.08718
[00:46:10] Formal Causal Definition of Agency, Kenton et al., https://arxiv.org/pdf/2208.08345v2
[00:48:40] Agency in Language Models, Sumers et al., https://arxiv.org/abs/2309.02427
[00:55:20] Heider & Simmel's Moving Shapes Experiment, https://www.nature.com/articles/s41598-024-65532-0
[01:00:40] Language Models as Agent Models, Jacob Andreas, 2022, https://arxiv.org/abs/2212.01681
[01:13:35] Pragmatic Understanding in LLMs, Ruis et al., https://arxiv.org/abs/2210.14986

Future of Data and AI
Jay Alammar on RAG, AI Education, and Industry Transformation - Future of AI

Future of Data and AI

Play Episode Listen Later Jan 20, 2025 83:41


In this episode, Raja Iqbal welcomes Jay Alammar, a renowned educator, researcher, and visual storyteller in machine learning. Jay shares his fascinating journey into simplifying complex AI concepts through visual storytelling and his passion for making AI education accessible to everyone. Raja and Jay discuss the power of visual learning, the role of intuition in understanding AI, and the challenges and opportunities in enterprise AI adoption. Jay also explores how AI is reshaping industries, the importance of tools like Retrieval-Augmented Generation (RAG), and his experiences at Cohere, where he helps organizations harness the power of large language models for real-world applications. This episode is perfect for anyone curious about the evolving world of AI, practical ways to adopt AI in business, and the importance of education in driving innovation.

The Game Plan
#30 Joe Delaney - Marriage, Fatherhood, and Life's Biggest Changes

The Game Plan

Play Episode Listen Later Jan 12, 2025 138:40


This episode is sponsored by Oracle. Harness the power of AI without overspending with Oracle Cloud Infrastructure (OCI). Ideal for AI model training, OCI offers 4-8x more bandwidth than competitors at half the cost. Transform your business like Uber and Cohere with OCI. Try it for free at https://oracle.com/gameplan

Welcome back to The Game Plan podcast! In this episode, I'm joined by my good friend, best man and fitness legend, @Joe Delaney. We dive into Joe's journey of balancing fatherhood, running a fitness app, and life as a content creator. Joe shares why he stepped away from sponsorships, the challenges of redesigning and growing his app, and his fresh perspective on building a sustainable business. We also get real about the highs and lows of parenthood, from sleepless nights to the pure joy of first laughs, and how he's navigating it all while staying true to his long-term vision.

Enjoyed the chat? Don't forget to like, comment, and subscribe for more!

Check out the best protein pancakes in the world at Fuel Cakes: https://fuelcakes.com/

Tank Talks
News Rundown: CRA Gives Murky Guidance, Legality of Prorogation, RBC x Cohere, and Bench Accounting goes Bye-Bye

Tank Talks

Play Episode Listen Later Jan 10, 2025 17:02


Matt Cohen and John Ruffolo discuss the fast-moving events shaping Canada's political and economic landscape. Topics include the fallout from Prime Minister Justin Trudeau's resignation, the complexities of the CRA's proposed capital gains tax adjustments, and the legal challenges tied to Parliament's prorogation. The conversation then pivots to groundbreaking developments in AI, spotlighting RBC's partnership with Cohere to build a generative AI platform. The episode wraps with a critical analysis of the sudden closure of Vancouver-based Bench Accounting and its surprising acquisition.

Topics:
* (00:45) CRA's enforcement of capital gains tax changes and taxpayer strategies
* (02:41) Legislative uncertainty surrounding the federal budget and prorogation
* (04:08) Legal arguments challenging prorogation and their implications
* (06:04) External perceptions of Canadian governance
* (08:22) RBC's partnership with Cohere for AI development
* (11:36) Anthropic's funding round and global AI investment trends
* (11:52) Bench Accounting's shutdown and its acquisition by employer.com

Follow Matt Cohen and Tank Talks here!

Podcast production support provided by Agentbee.ai

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tanktalks.substack.com

This Week in Pre-IPO Stocks
E176: Anthropic targets $60B valuation with $2B raise; Whatnot hits $4.97B valuation with $265M raise; xAI reaches $83B valuation, launches iOS app for Grok; SandboxAQ raises $300M at $5.6B valuation; Wiz prepares for IPO, valued at $20.5B; Cohere launche

This Week in Pre-IPO Stocks

Play Episode Listen Later Jan 10, 2025 11:32


Send us a text

NEW FUND ANNOUNCEMENT*: The AG Dillon Anduril Pre-IPO Stock Fund is now accepting investors. Anduril Industries is a defense technology company that specializes in building advanced artificial intelligence (AI) and autonomous systems for military and national security purposes. Financial advisors only. Email aaron.dillon@agdillon.com to invest or request fund materials. Note important disclosures at the end of this post.

Subscribe to AG Dillon Pre-IPO Stock Research at agdillon.com/subscribe:
- Wednesday = secondary market valuations, revenue multiples, performance, index fact sheets
- Saturdays = pre-IPO news and insights, webinar replays

00:00 - Intro
00:07 - Anthropic Targets $60B Valuation with $2B Raise
01:33 - Whatnot Hits $4.97B Valuation with $265M Raise
02:31 - xAI Reaches $83B Valuation, Launches iOS App for Grok
03:55 - SandboxAQ Raises $300M at $5.6B Valuation
05:03 - Wiz Prepares for IPO, Valued at $20.5B
06:10 - Cohere Launches North, Valued at $5.4B
07:38 - Epirus in Talks for $1B Valuation Amid Defense Focus
08:38 - Hippocratic AI Raises $141M, Valued at $1.64B
09:27 - Pre-IPO Stock Market Weekly Performance
10:18 - Pre-IPO Stock Vintage Index Weekly Performance

* NOTE: AG Dillon ("AGD") is not affiliated with Anduril. Anduril may require company approval for purchases (aka transfers). AGD has not been pre-approved by Anduril to purchase their stock. AGD purchases pre-IPO stocks in the secondary market and may gain exposure by directly purchasing the stock (on the company's capitalization table) and/or through a third-party fund (aka special purpose vehicle, or SPV).

Sync Book Radio from thesyncbook.com
42 Minutes Episode 394: Fall Book Club

Sync Book Radio from thesyncbook.com

Play Episode Listen Later Jan 4, 2025 88:32


Topics: Sickness, Prose, Reality, Box Scores, Worldly Knight, Spiritual Knight, Cohere, Themes, Galahad, Continuations, Lanzelet, Vulgate, Chaucer, Purity, Merlin, Monmouth, The Firste Moevere, Original Spelling, Ovid, Round Table, Avalon, Alliteration, Enli...

roon's Heroic Duty: Will "the Good Guys" Build AGI First? (from Doom Debates)

Play Episode Listen Later Dec 28, 2024 117:58


In this episode of The Cognitive Revolution, Nathan shares a fascinating cross-post from Doom Debates featuring a conversation between Liron Shapira and roon, an influential Twitter Anon from OpenAI's technical staff. They explore crucial insights into how OpenAI's team views AI's future, including discussions on AGI development, alignment challenges, and extinction risks. Join us for this thought-provoking analysis of AI safety and the mindset of those building transformative AI systems.

Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse

SPONSORS:
GiveWell: GiveWell has spent over 17 years researching global health and philanthropy to identify the highest-impact giving opportunities. Over 125,000 donors have contributed more than $2 billion, saving over 200,000 lives through evidence-backed recommendations. First-time donors can have their contributions matched up to $100 before year-end. Visit https://GiveWell.org, select podcast, and enter Cognitive Revolution at checkout to make a difference today.
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.

CHAPTERS:
(00:00:00) About the Episode
(00:07:18) Introducing roon
(00:09:13) roon's Background
(00:16:40) roon the Person (Part 1)
(00:21:56) Sponsors: GiveWell | SelectQuote
(00:24:45) roon the Person (Part 2)
(00:26:43) Excitement in AI
(00:31:59) Creativity in AI
(00:40:18) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++
(00:42:36) roon's P(Doom)
(00:52:25) AI Risk & Regulation
(00:53:51) AI Timelines
(01:01:20) Aligned by Default?
(01:09:16) Training vs Production
(01:14:30) Open Source AI Risk
(01:26:25) Goal-Oriented AI
(01:34:29) Pause AI?
(01:39:46) Dogecoin & Wrap Up
(01:41:06) Outro & Call to Action
(01:56:38) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Apple: https://podcasts.apple.com/de/podcast...

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all all our LS supporters who helped fund the gorgeous venue and A/V production!For NeurIPS last year we did our standard conference podcast coverage interviewing selected papers (that we have now also done for ICLR and ICML), however we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in person miniconference, at NeurIPS 2024 in Vancouver. Today, we're proud to share Loubna's highly anticipated talk (slides here)!Synthetic DataWe called out the Synthetic Data debate at last year's NeurIPS, and no surprise that 2024 was dominated by the rise of synthetic data everywhere:* Apple's Rephrasing the Web, Microsoft's Phi 2-4 and Orca/AgentInstruct, Tencent's Billion Persona dataset, DCLM, and HuggingFace's FineWeb-Edu, and Loubna's own Cosmopedia extended the ideas of synthetic textbook and agent generation to improve raw web scrape dataset quality* This year we also talked to the IDEFICS/OBELICS team at HuggingFace who released WebSight this year, the first work on code-vs-images synthetic data.* We called Llama 3.1 the Synthetic Data Model for its extensive use (and documentation!) of synthetic data in its pipeline, as well as its permissive license. 
* Nemotron CC and Nemotron-4-340B also made a big splash this year for how they used 20k items of human data to synthesize over 98% of the data used for SFT/PFT.* Cohere introduced Multilingual Arbitrage: Optimizing Data Pools to Accelerate Multilingual Progress observing gains of up to 56.5% improvement in win rates comparing multiple teachers vs the single best teacher model* In post training, AI2's Tülu3 (discussed by Luca in our Open Models talk) and Loubna's Smol Talk were also notable open releases this year.This comes in the face of a lot of scrutiny and criticism, with Scale AI as one of the leading voices publishing AI models collapse when trained on recursively generated data in Nature magazine bringing mainstream concerns to the potential downsides of poor quality syndata:Part of the concerns we highlighted last year on low-background tokens are coming to bear: ChatGPT contaminated data is spiking in every possible metric:But perhaps, if Sakana's AI Scientist pans out this year, we will have mostly-AI AI researchers publishing AI research anyway so do we really care as long as the ideas can be verified to be correct?Smol ModelsMeta surprised many folks this year by not just aggressively updating Llama 3 and adding multimodality, but also adding a new series of “small” 1B and 3B “on device” models this year, even working on quantized numerics collaborations with Qualcomm, Mediatek, and Arm. It is near unbelievable that a 1B model today can qualitatively match a 13B model of last year:and the minimum size to hit a given MMLU bar has come down roughly 10x in the last year. 
We have been tracking this proxied by Lmsys Elo and inference price:The key reads this year are:* MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases* Apple Intelligence Foundation Language Models* Hymba: A Hybrid-head Architecture for Small Language Models* Loubna's SmolLM and SmolLM2: a family of state-of-the-art small models with 135M, 360M, and 1.7B parameters on the pareto efficiency frontier.* and Moondream, which we already covered in the 2024 in Vision talkFull Talk on YouTubeplease like and subscribe!Timestamps* [00:00:05] Loubna Intro* [00:00:33] The Rise of Synthetic Data Everywhere* [00:02:57] Model Collapse* [00:05:14] Phi, FineWeb, Cosmopedia - Synthetic Textbooks* [00:12:36] DCLM, Nemotron-CC* [00:13:28] Post Training - AI2 Tulu, Smol Talk, Cohere Multilingual Arbitrage* [00:16:17] Smol Models* [00:18:24] On Device Models* [00:22:45] Smol Vision Models* [00:25:14] What's NextTranscript2024 in Synthetic Data and Smol Models[00:00:00] ​[00:00:05] Loubna Intro[00:00:05] Speaker: ​I'm very happy to be here. Thank you for the invitation. So I'm going to be talking about synthetic data in 2024. And then I'm going to be talking about small on device models. So I think the most interesting thing about synthetic data this year is that like now we have it everywhere in the large language models pipeline.[00:00:33] The Rise of Synthetic Data Everywhere[00:00:33] Speaker: I think initially, synthetic data was mainly used just for post training, because naturally that's the part where we needed human annotators. And then after that, we realized that we don't really have good benchmarks to [00:01:00] measure if models follow instructions well, if they are creative enough, or if they are chatty enough, so we also started using LLMs as judges.[00:01:08] Speaker: Thank you. 
And I think this year and towards the end of last year, we also went to the pre training parts and we started generating synthetic data for pre training to kind of replace some parts of the web. And the motivation behind that is that you have a lot of control over synthetic data. You can control your prompt and basically also the kind of data that you generate.[00:01:28] Speaker: So instead of just trying to filter the web, you could try to get the LLM to generate what you think the best web pages could look like and then train your models on that. So this is how we went from not having synthetic data at all in the LLM pipeline to having it everywhere. And so the cool thing is like today you can train an LLM with like an entirely synthetic pipeline.[00:01:49] Speaker: For example, you can use our Cosmopedia datasets and you can train a 1B model on like 150 billion tokens that are 100 percent synthetic. And those are also of good quality. And then you can [00:02:00] instruction tune the model on a synthetic SFT dataset. You can also do DPO on a synthetic dataset. And then to evaluate if the model is good, you can use [00:02:07] Speaker: a benchmark that uses LLMs as a judge, for example, MTBench or AlpacaEval. So I think this is really mind-blowing, because like just a few years ago, we wouldn't think this is possible. And I think there's a lot of concerns about model collapse, and I'm going to talk about that later. But we'll see that like, if we use synthetic data properly and we curate it carefully, that shouldn't happen.[00:02:29] Speaker: And the reason synthetic data is very popular right now is that we have really strong models, both open and closed. It is really cheap and fast to use compared to human annotations, which cost a lot and take a lot of time.
And also for open models right now, we have some really good inference frameworks.[00:02:47] Speaker: So if you have enough GPUs, it's really easy to spawn these GPUs and generate like a lot of synthetic data. Some examples are vLLM, TGI, and TensorRT-LLM.[00:02:57] Model Collapse[00:02:57] Speaker: Now let's talk about the elephant in the room, model [00:03:00] collapse. Is this the end? If you look at the media and, for example, some papers in Nature, it's really scary because there's a lot of synthetic data out there in the web.[00:03:09] Speaker: And naturally we train on the web. So we're going to be training on a lot of synthetic data. And if model collapse is going to happen, we should really try to take that seriously. And the other issue is that, as I said, a lot of people think the web is polluted because there's a lot of synthetic data.[00:03:24] Speaker: And for example, when we were building the FineWeb datasets here with Guilherme and Hynek, we were interested in, like, how much synthetic data is there in the web? So there isn't really a method to properly measure the amount of synthetic data or to say whether a webpage is synthetic or not. But one thing we can do is to try to look for proxy words, for example, expressions like "as a large language model" or words like "delve" that we know are actually generated by ChatGPT.[00:03:49] Speaker: We could try to measure the amount of these words in our datasets and compare them to the previous years. For example, here, we measured these words' ratio in different dumps of Common Crawl. [00:04:00] And we can see that the ratio really increased after ChatGPT's release. So if we were to say that the amount of synthetic data didn't change, you would expect this ratio to stay constant, which is not the case.[00:04:11] Speaker: So there's a lot of synthetic data probably on the web, but does this really make models worse? So what we did is we trained different models on these different dumps.
And we then computed their performance on popular, like, NLP benchmarks, and then we computed the aggregated score. And surprisingly, you can see that the latest dumps are actually even better than the dumps that came before.[00:04:31] Speaker: So if there's some synthetic data there, at least it did not make the models worse. Yeah, which is really encouraging. So personally, I wouldn't say synthetic data is polluting the web. Maybe it's even making it richer. And the issue with like model collapse is that, for example, those studies were done at a small scale, and you would ask the model to complete, for example, a Wikipedia paragraph, and then you would train it on these new generations, and you would do that[00:04:56] Speaker: iteratively. I think if you do that approach, it's normal to [00:05:00] observe this kind of behavior, because the quality is going to be worse because the model is already small. And then if you train it just on its own generations, you shouldn't expect it to become better. But what we're really doing here is that we take a model that is very large and we try to distill its knowledge into a model that is smaller.[00:05:14] Phi, FineWeb, Cosmopedia - Synthetic Textbooks[00:05:14] Speaker: And in this way, you can expect to get like a better performance for your small model. And using synthetic data for pre-training has become really popular after the Textbooks Are All You Need paper, where Microsoft basically trained a series of small models on textbooks that were generated using a large LLM.[00:05:32] Speaker: And then they found that these models were actually better than models that are much larger. So this was really interesting. It was like the first of its kind, but it was also met with a lot of skepticism, which is a good thing in research.
It pushes you to question things, because the dataset that they trained on was not public, so people were not really sure if these models are really good or maybe there's just some data contamination.[00:05:55] Speaker: So it was really hard to check if you just have the weights of the models. [00:06:00] And as Hugging Face, because we like open source, we tried to reproduce what they did. So this is our Cosmopedia dataset. We basically tried to follow a similar approach to what they documented in the paper. And we created a synthetic dataset of textbooks and blog posts and stories that had almost 30 billion tokens.[00:06:16] Speaker: And we tried to train some models on that. And we found that like the key ingredient to getting a good synthetic dataset is trying as much as possible to keep it diverse. Because if you just throw the same prompt at your model, like "generate a textbook about linear algebra", even if you change the temperature, the textbooks are going to look alike.[00:06:35] Speaker: So there's no way you could scale to like millions of samples. And the way you do that is by creating prompts that have some seeds that make them diverse. In our case, we would ask the model to generate a textbook, but make it related to an extract from a webpage. And also we try to frame it to stay within a topic.[00:06:55] Speaker: For example, here, we put like an extract about cardiovascular bioimaging, [00:07:00] and then we ask the model to generate a textbook related to medicine that is also related to this webpage. And this is a really nice approach because there are so many webpages out there. So you can be sure that your generations are going to be diverse when you change the seed example.[00:07:16] Speaker: One thing that's challenging with this is that you want the seed samples to be related to your topics. So we used like a search tool to go over all of the FineWeb dataset.
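The seed-based prompting just described can be sketched roughly as follows. The template and field names are illustrative, not Cosmopedia's exact prompts; the point is that every prompt embeds a different web extract and style, so generations stay diverse at scale.

```python
# Minimal sketch of seeded prompts for diverse synthetic textbooks:
# each prompt is conditioned on a different web extract and an audience
# style. Template wording is a hypothetical stand-in, not Cosmopedia's.
TEMPLATE = (
    "Write a {style} textbook chapter about {topic}. "
    "It should be related to the following web extract:\n---\n{seed}\n---"
)

def build_prompt(topic: str, seed: str, style: str = "college-level") -> str:
    return TEMPLATE.format(style=style, topic=topic, seed=seed)

prompt = build_prompt(
    topic="medicine",
    seed="Cardiovascular bioimaging enables non-invasive assessment of the heart.",
    style="middle-school",
)
print(prompt)
```

Swapping the seed extract changes the prompt, which is what prevents millions of generations from collapsing onto the same few textbooks.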
And then we also did a lot of experiments with the type of generations we want the model to produce. For example, we ask it for textbooks for middle school students or textbooks for college students.[00:07:40] Speaker: And we found that like some generation styles help on some specific benchmarks, while others help on other benchmarks. For example, college textbooks are really good for MMLU, while middle school textbooks are good for benchmarks like OpenBookQA and PIQA. This is like a sample from our search tool.[00:07:56] Speaker: For example, you have a top category, which is a topic, and then you have some [00:08:00] subtopics, and then you have the topic hits, which are basically the web pages in FineWeb that belong to these topics. And here you can see the comparison between Cosmopedia, we had two versions, V1 and V2, in blue and red, and you can see the comparison to FineWeb, and as you can see, throughout training, training on Cosmopedia was consistently better.[00:08:20] Speaker: So we managed to get a dataset that was actually good to train these models on. It's of course so much smaller than FineWeb, it's only 30 billion tokens, but that's the scale that Microsoft's datasets were, so we kind of managed to reproduce a bit what they did. And the dataset is public, so everyone can go there and check if everything is all right.[00:08:38] Speaker: And now this is a recent paper from NVIDIA, Nemotron-CC. They took things a bit further, and they generated not a few billion tokens, but 1.9 trillion tokens, which is huge. And we can see later how they did that. It's more of, like, rephrasing the web.
So we can see today that there are, like, some really huge synthetic datasets out there, and they're public, so, [00:09:00] like, you can try to filter them even further if you want to get, like, more high-quality corpora.[00:09:04] Speaker: So this rephrasing-the-web approach was suggested in this paper by Pratyush, where basically they take some samples from the C4 dataset, and then they use an LLM to rewrite these samples into a better format. For example, they ask an LLM to rewrite the sample into a Wikipedia passage or into a Q&A page.[00:09:25] Speaker: And the interesting thing in this approach is that you can use a model that is small, because rewriting doesn't require knowledge. It's just rewriting a page into a different style. So the model doesn't need to have extensive knowledge of what it's rewriting, compared to just asking a model to generate a new textbook without giving it any ground truth.[00:09:45] Speaker: So here they rewrite some samples from C4 into Q&A, into Wikipedia, and they find that doing this works better than training just on C4. And what they did in Nemotron-CC is a similar approach. [00:10:00] They rewrite some pages from Common Crawl for two reasons. One is to improve pages that are low quality, so they rewrite them into, for example, Wikipedia pages, so they look better.[00:10:11] Speaker: And another reason is to create more diverse datasets. So they have a dataset that they already heavily filtered, and then they take these pages that are already high quality, and they ask the model to rewrite them in question-and-answer format, into like open-ended questions or like multiple-choice questions.[00:10:27] Speaker: So this way they can reuse the same page multiple times without fearing having multiple duplicates, because it's the same information, but it's going to be written differently.
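The rephrasing idea can be sketched like this: reuse one source page under several target formats so the same information yields multiple non-duplicate training documents. The style instructions are illustrative, and `llm` is a stub standing in for whatever rewriting model you use.

```python
# Sketch of web rephrasing: reuse one source page under several target
# formats (Wikipedia passage, Q&A page, ...). The style prompts are
# hypothetical, and `llm` is any callable that rewrites text.
STYLES = {
    "wikipedia": "Rewrite the text as an encyclopedic Wikipedia passage:",
    "qa": "Rewrite the text as a question-and-answer page:",
}

def rephrase_page(page: str, llm) -> dict:
    """Return one rewrite of `page` per target style."""
    return {name: llm(f"{instr}\n\n{page}") for name, instr in STYLES.items()}

# Stub LLM for illustration; swap in a real client in practice.
fake_llm = lambda prompt: prompt.splitlines()[-1].upper()
out = rephrase_page("water boils at 100 C at sea level", fake_llm)
print(sorted(out))
```

Each style produces a distinct document from the same underlying fact, which is why rephrasing lets you reuse a page several times without creating exact duplicates.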
So I think that's also a really interesting approach for like generating synthetic data just by rephrasing the pages that you already have.[00:10:44] Speaker: There's also this approach called ProX where they try to start from a web page and then they generate a program which figures out how to rewrite that page to make it better and less noisy. For example, here you can see that there's some leftover metadata in the web page and you don't necessarily want to keep that for training [00:11:00] your model.[00:11:00] Speaker: So they train a model that can generate programs that can like normalize and remove lines that are extra. So I think this approach is also interesting, but it's maybe less scalable than the approaches that I presented before. So that was it for like rephrasing and generating new textbooks.[00:11:17] Speaker: Another approach that I think is really good and becoming really popular for using synthetic data for pre-training is basically building better classifiers for filtering the web. For example, here we released the dataset called FineWeb-Edu. And the way we built it is by taking Llama 3 and asking it to rate the educational content of web pages from zero to five.[00:11:39] Speaker: So for example, if a page is like a really good textbook that could be useful in a school setting, it would get a really high score. And if a page is just like an advertisement or promotional material, it would get a lower score. And then after that, we take these synthetic annotations and we train a classifier on them.[00:11:57] Speaker: It's a classifier like a BERT model. [00:12:00] And then we run this classifier on all of FineWeb, which is a 15 trillion token dataset. And then we only keep the pages that have a score that's higher than 3. So for example, in our case, we went from 15 trillion tokens to just 1.5 trillion tokens.
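The filtering step just described reduces to a threshold over classifier scores. In the sketch below the scorer is a toy heuristic standing in for the trained BERT-style regressor; only the keep-if-above-threshold logic mirrors the real pipeline.

```python
# Sketch of classifier-based web filtering: a scorer assigns each page
# an educational score from 0 to 5, and only pages scoring above a
# threshold are kept. `toy_score` is a stand-in for a trained model.
from typing import Callable, List

def filter_corpus(pages: List[str],
                  score: Callable[[str], float],
                  threshold: float = 3.0) -> List[str]:
    return [p for p in pages if score(p) > threshold]

toy_score = lambda p: 5.0 if "theorem" in p else 1.0
pages = ["buy cheap watches now", "the Pythagorean theorem states that..."]
kept = filter_corpus(pages, toy_score)
print(kept)
```

With a real classifier this is exactly the 15T-to-1.5T-token reduction described above: most of the web scores below the educational threshold.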
Those are really highly educational.[00:12:16] Speaker: And as you can see here, FineWeb-Edu outperforms all the other public web datasets by a large margin on a couple of benchmarks. Here, I show the aggregated score, and you can see that this approach is really effective for filtering web datasets to get better corpora for training your LLMs.[00:12:36] DCLM, Nemotron-CC[00:12:36] Speaker: Others also tried this approach. There's, for example, the DCLM dataset, where they also trained a classifier, but not to detect educational content. Instead, they trained it on the OpenHermes dataset, which is a dataset for instruction tuning, and also on the Explain Like I'm Five subreddit, and they also get a really high quality dataset which is very information dense and can help [00:13:00] you train some really good LLMs.[00:13:01] Speaker: And then for Nemotron-CC, they also did this approach, but instead of using one classifier, they used an ensemble of classifiers. So they used, for example, the DCLM classifier, and also classifiers like the ones we used in FineWeb-Edu, and then they combined these scores with an ensemble method to only retain the best high-quality pages, and they get a dataset that works even better than the ones we developed.[00:13:25] Speaker: So that was it for synthetic data for pre-training.[00:13:28] Post Training - AI2 Tulu, Smol Talk, Cohere Multilingual Arbitrage[00:13:28] Speaker: Now we can go back to post-training. I think there's a lot of interesting post-training datasets out there. One that was released recently is AgentInstruct by Microsoft, where they basically try to target some specific skills and improve the performance of models on them.[00:13:43] Speaker: For example, here, you can see code, brain teasers, open domain QA, and they managed to get a dataset such that, when fine-tuning Mistral 7B on it, it outperforms the original instruct model that was released by Mistral.
And as I said, to get good synthetic data, you really [00:14:00] have to have a framework to make sure that your data is diverse.[00:14:03] Speaker: So for example, for them, they always seed the generations on either source code or raw text documents, and then they rewrite them to make sure they're easier to generate instructions from, and then they use that for their instruction data generation. There's also the Tulu 3 SFT mixture, which was released recently by Allen AI.[00:14:23] Speaker: It's also really good quality and it covers a wide range of tasks. And the way they make sure that this dataset is diverse is by using personas from the PersonaHub dataset, which is basically a dataset of, I think, over a million personas. And for example, in the Tulu mixture, to generate a new code snippet, they would give the model a persona, for example, a machine learning researcher interested in neural networks, and then ask it to generate a coding problem.[00:14:49] Speaker: This way you make sure that your dataset is really diverse, and then you can further filter the datasets, for example, using reward models. We also released a dataset called SmolTalk, [00:15:00] and we also tried to cover a wide range of tasks, and as you can see here, for example, when fine-tuning Mistral 7B on this dataset, we also outperformed the original Mistral instruct on a number of benchmarks, notably on mathematics and instruction following with IFEval.[00:15:18] Speaker: Another paper that's really interesting I wanted to mention is this one called Multilingual Data Arbitrage by Cohere. And basically they want to generate a dataset for post-training that is multilingual. And they have a really interesting problem. It's the fact that there isn't one model that's really good at all the languages they wanted.[00:15:36] Speaker: So what they do is that they use not just one teacher model, but multiple teachers.
And then they have a router which basically sends the prompts they have to all these models. And then they get the completions, and they have a reward model that scores all these generations and only keeps the best one.[00:15:52] Speaker: And this is like arbitrage in finance. So well, I think what's interesting in this is it shows that synthetic data doesn't have to come from a single model. [00:16:00] And because we have so many good models now, you could pool these models together and get a dataset that's really high quality and that's diverse and that covers all your needs.[00:16:12] Speaker: I was supposed to put a meme there, but. Yeah, so that was it for synthetic data.[00:16:17] Smol Models[00:16:17] Speaker: Now we can go see what's happening in the small models field in 2024. I don't know if you know, but like now we have some really good small models. For example, Llama 3.2 1B matches Llama 2 13B, which was released last year, on the LMSYS arena, which is basically the default go-to leaderboard for evaluating models using human evaluation.[00:16:39] Speaker: And as you can see here, the scores of the models are really close. So I think we've made a huge leap forward in terms of small models. Of course, that's just one data point, but there's more. For example, if you look at this chart from the Qwen 2.5 blog post, it shows that today we have some really good models that are only like 3 billion parameters [00:17:00] and 4 billion that score really high on MMLU.[00:17:03] Speaker: Which is a really popular benchmark for evaluating models. And you can see here that the blue dots have more than 65 on MMLU, and the grey ones have less. And for example, Llama 33B had less. So now we have a 3B model that outperforms a 33B model that was released earlier.
So I think now people are starting to realize that we shouldn't just scale and scale models, but we should try to make them more efficient.[00:17:33] Speaker: I don't know if you knew, but you can also chat with a 3B-plus model on your iPhone. For example, here, this is an app called PocketPal, where you can go and select a model from Hugging Face. It has a large choice. For example, here we loaded Phi-3.5, which is 3.8 billion parameters, on this iPhone. And we can chat with it, and you can see that even the latency is acceptable.[00:17:57] Speaker: For example, here, I asked it to give me a joke about [00:18:00] NeurIPS. So let's see what it has to say.[00:18:06] Speaker: Okay, why did the neural network attend NeurIPS? Because it heard there would be a lot of layers and fun and it wanted to train its sense of humor. So not very funny, but at least it can run on device. Yeah, so I think now we have good small models, but we also have good frameworks and tools to use these small models.[00:18:24] On Device Models[00:18:24] Speaker: So I think we're really close to having really good edge and on-device models. And I think for a while we've had this narrative that just training larger models is better. Of course, this is supported by scaling laws. As you can see here, for example, when we scale the model size, the loss is lower, and obviously you get a better model.[00:18:46] Speaker: And we can see this, for example, in the GPT family of models, how we went from just a hundred million parameters to more than a trillion parameters. And of course, we all observed the performance improvement when using the latest model. But [00:19:00] one thing that we shouldn't forget is that when we scale the model, we also scale the inference costs and time.[00:19:05] Speaker: And so the largest models are going to cost so much more.
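The cost point can be made concrete with the standard back-of-the-envelope estimates: roughly 2 x params FLOPs per generated token at inference, and roughly 6 x params x tokens for training. These are order-of-magnitude approximations, not measured figures, and the model sizes below are illustrative.

```python
# Back-of-the-envelope compute costs using the common approximations:
#   inference ~ 2 * params FLOPs per generated token (ongoing)
#   training  ~ 6 * params * tokens FLOPs (one-time)
def inference_flops_per_token(params: float) -> float:
    return 2 * params

def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

small, large = 3e9, 300e9  # a 3B model vs a 300B model
ratio = inference_flops_per_token(large) / inference_flops_per_token(small)
print(f"per-token inference cost ratio: {ratio:.0f}x")

# "Train smaller models longer": 7B on 1T tokens vs 8B on 15T tokens.
print(f"7B on 1T tokens : {train_flops(7e9, 1e12):.1e} training FLOPs")
print(f"8B on 15T tokens: {train_flops(8e9, 15e12):.1e} training FLOPs")
```

The asymmetry is the whole argument: paying a much larger one-time training bill for a small model barely changes its per-token serving cost, while a 100x bigger model costs roughly 100x more on every single token it ever generates.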
So I think now, instead of just building larger models, we should be focusing on building more efficient models. It's no longer a race for the largest models, since these models are really expensive to run, and they require a really good infrastructure to do that, and they cannot run on, for example, consumer hardware.[00:19:27] Speaker: And when you try to build more efficient models that match larger models, that's when you can really unlock some really interesting on-device use cases. And I think a trend that we're noticing now is the trend of training smaller models longer. For example, if you compare how long Llama 1 was trained compared to Llama 3, there is a huge increase in the pre-training length.[00:19:50] Speaker: Llama 1 was trained on 1 trillion tokens, but Llama 3 8B was trained on 15 trillion tokens. So Meta managed to get a model that's the same size, but [00:20:00] it performs so much better, by choosing to spend more during training, because as we know, training is a one-time cost, but inference is something that's ongoing.[00:20:08] Speaker: If we want to see what the small model trends in 2024 are, I think this MobileLLM paper by Meta is interesting. They try to study different models that have less than 1 billion parameters and find which architecture makes the most sense for these models. For example, they find that depth is more important than width.[00:20:29] Speaker: So it's more important to have models with more layers than to make them wider. They also find that GQA helps, and that tying the embeddings helps. So I think it's a nice study overall for models that are just a few hundred million parameters. There's also the Apple Intelligence tech report, which is interesting.[00:20:48] Speaker: So for Apple Intelligence, they had two models, one that was on server and another model that was on device. It had 3 billion parameters.
And I think the interesting part is that they trained this model using [00:21:00] pruning and then distillation. And for example, they have this table where they show that using pruning and distillation works much better than training from scratch.[00:21:08] Speaker: And they also have some interesting insights about how they specialize their models on specific tasks, like, for example, summarization and rewriting. There's also this paper by NVIDIA that was released recently. I think you've already had a talk about hybrid models; that was also interesting.[00:21:23] Speaker: And in this model, they used a hybrid architecture between state space models and transformers. And they managed to train a 1B model that's really performant without needing to train it on a lot of tokens. And regarding our work, we just recently released SmolLM2, so it's a series of three models, which are the best in class in each model size.[00:21:46] Speaker: For example, our 1.7B model outperforms Llama 3.2 1B and also Qwen 2.5. And how we managed to train this model is the following: we spent a lot of time trying to curate the pre-training datasets. We did a lot of [00:22:00] ablations, trying to find which datasets are good and also how to mix them. We also created some new math and code datasets that we're releasing soon.[00:22:08] Speaker: We basically really spent a lot of time trying to find what's the best mixture that you can train these models on. And then we also trained these models for very long. For example, SmolLM1 was trained on only 1 trillion tokens, but this model is trained on 11 trillion tokens.[00:22:24] Speaker: And we saw that the performance kept improving. The models didn't really plateau mid-training, which I think is really interesting. It shows that you can train such small models for very long and keep getting performance gains. What's interesting about SmolLM2 is that it's fully open.
We also released the pre-training codebase, the fine-tuning code, the datasets, and also the evaluation in this repository.[00:22:45] Smol Vision Models[00:22:45] Speaker: Also, there are really interesting small models not just for text, but also for vision. For example, here you can see SmolVLM, which is a 2B model that's really efficient. It doesn't consume a lot of RAM, and it also has good performance. There's also Moondream [00:23:00] 0.5B, which was released recently. It's like the smallest visual language model.[00:23:04] Speaker: And as you can see, there isn't a big trade-off compared to Moondream 2B. So now I showed you that we have some really good small models. We also have the tools to use them, but why should you consider using small models, and when? I think small models are really interesting because of the on-device feature.[00:23:23] Speaker: Because these models are small and they can run fast, you can basically run them on your laptop, but also on your mobile phone. And this means that your data stays local. You don't have to send your queries to third parties. And this really enhances privacy. That was, for example, one of the big selling points for Apple Intelligence.[00:23:42] Speaker: Also, right now, we have a lot of frameworks to do on-device inference. For example, there's MLX, MLC, llama.cpp, Transformers.js. So we have a lot of options, and each of them has great features. Small models are also really powerful if you choose to specialize them.[00:24:00][00:24:00] Speaker: For example, here there's a startup called NuMind, which took SmolLM and then fine-tuned it on text extraction datasets. And they managed to get a model that's not very far from models that are much larger.
So I think text extraction is one use case where small models can be really performant, and it makes sense to use them instead of just using larger models.[00:24:19] Speaker: You can also chat with these models in the browser. For example, here, you can go there, you can load the model, you can even turn off your internet and just start chatting with the model locally. Speaking of text extraction, if you don't want to fine-tune the models, there's a really good method called structured generation.[00:24:36] Speaker: You can basically force the models to follow a JSON schema that you defined. For example, here, we try to force the model to follow a schema for extracting key information from GitHub issues. So you can input free text, which is a complaint about a GitHub repository, something not working. And then you can run it there, and the model can extract anything that is relevant for your GitHub issue creation.[00:24:58] Speaker: For example, the [00:25:00] priority, for example, here, priority is high, the type of the issue, bug, and then a title and an estimation of how long this will take to fix. And you can just do this in the browser; you can transform your text into a GitHub issue that's properly formatted.[00:25:14] What's Next[00:25:14] Speaker: So what's next for synthetic data and small models?[00:25:18] Speaker: I think that domain-specific synthetic data is already important, and it's going to be even more important. For example, generating synthetic data for math. I think this would really help improve the reasoning of a lot of models. And a lot of people are doing it, for example, Qwen 2.5 Math; everyone's trying to reproduce o1.[00:25:37] Speaker: And so I think for synthetic data, trying to specialize it on some domains is going to be really important.
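The structured-generation demo described a moment ago can be sketched as a schema-plus-validation step. Real constrained-decoding libraries restrict generation itself so the model can only emit schema-conforming tokens; the sketch below only validates a hypothetical model output after the fact, and the field names are illustrative, not the demo's exact schema.

```python
# Sketch of the GitHub-issue extraction schema: validate a (hypothetical)
# model output against the expected fields and types. With a real
# structured-generation library, the decoder itself enforces this schema.
import json

ISSUE_FIELDS = {"title": str, "priority": str, "type": str, "estimate_hours": int}

def parse_issue(raw: str) -> dict:
    issue = json.loads(raw)
    for field, typ in ISSUE_FIELDS.items():
        if not isinstance(issue.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    return issue

model_output = (
    '{"title": "App crashes on login", "priority": "high", '
    '"type": "bug", "estimate_hours": 4}'
)
issue = parse_issue(model_output)
print(issue["priority"])
```

Forcing the schema at decode time is what makes even small models reliable for this kind of extraction: they never get the chance to produce malformed JSON.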
And then for small models, I think specializing them through fine-tuning is also going to be really important, because I think a lot of companies are just trying to use these large models because they are better.[00:25:53] Speaker: But on some tasks, I think you can already get decent performance with small models. So you don't need to pay a [00:26:00] cost that's much larger just to make your model better at your task by a few percent. And this is not just for text. I think it also applies to other modalities like vision and audio.[00:26:11] Speaker: And I think you should also watch out for on-device frameworks and applications. For example, like the app I showed, or Ollama; all these frameworks are becoming really popular, and I'm pretty sure that we're going to get more of them in 2025. And users really like that. I should also say a hot take.[00:26:28] Speaker: I think that in AI, we just started with fine-tuning, for example, trying to make BERT work on some specific use cases, and really struggling to do that. And then we had some models that are much larger, so we just switched to prompt engineering to get the models to do our tasks. And I think we're going back to fine-tuning, where we realize these models are really costly.[00:26:47] Speaker: It's better to just use a small model or try to specialize it. So I think it's a little bit of a cycle, and we're going to start to see more fine-tuning and less of just prompt engineering the models. So that was my talk. Thank you for following. And if you have [00:27:00] any questions, we can take them now. Get full access to Latent Space at www.latent.space/subscribe

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all our LS supporters who helped fund the venue and A/V production!For NeurIPS last year we did our standard conference podcast coverage interviewing selected papers (that we have now also done for ICLR and ICML), however we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver.Since Nathan Lambert (Interconnects) joined us for the hit RLHF 201 episode at the start of this year, it is hard to overstate how much Open Models have exploded this past year. In 2023 only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course Meta's Llama 1 and 2. This year a whole cast of new open models have burst on the scene, from Google's Gemma and Cohere's Command R, to Alibaba's Qwen and DeepSeek models, to LLM360 and DCLM, and of course to the Allen Institute's OLMo, OLMoE, Pixmo, Molmo, and OLMo 2 models. We were honored to host Luca Soldaini, one of the research leads on the OLMo series of models at AI2.Pursuing Open Model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California and the White House.
We also were honored to hear from Sophia Yang, head of devrel at Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track!Full Talk on YouTubePlease like and subscribe!Timestamps* 00:00 Welcome to Latent Space Live * 00:12 Recap of 2024: Best Moments and Keynotes * 01:22 Explosive Growth of Open Models in 2024 * 02:04 Challenges in Open Model Research * 02:38 Keynote by Luca Soldaini: State of Open Models * 07:23 Significance of Open Source AI Licenses * 11:31 Research Constraints and Compute Challenges * 13:46 Fully Open Models: A New Trend * 27:46 Mistral's Journey and Innovations * 32:57 Interactive Demo: Le Chat Capabilities * 36:50 Closing Remarks and NetworkingTranscriptSession3Audio[00:00:00] AI Charlie: Welcome to Latent Space Live, our first mini conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. As a special treat this week, we're recapping the best of 2024 going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the Latent Space network to cover each field.[00:00:28] AI Charlie: 200 of you joined us in person throughout the day, with over 2,200 watching live online. Our next keynote covers the state of open models in 2024, with Luca Soldaini and Nathan Lambert of the Allen Institute for AI, with a special appearance from Dr. Sophia Yang of Mistral. Our first hit episode of 2024 was with Nathan Lambert on RLHF 201 back in January.[00:00:57] AI Charlie: Where he discussed both reinforcement learning for language [00:01:00] models and the growing post-training and mid-training stack, with hot takes on everything from constitutional AI to DPO to rejection sampling, and also previewed the sea change coming to the Allen Institute.
And to Interconnects, his incredible Substack on the technical aspects of state-of-the-art AI training.[00:01:18] AI Charlie: We highly recommend subscribing to get access to his Discord as well. It is hard to overstate how much open models have exploded this past year. In 2023, only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course, Meta's Llama 1 and 2.[00:01:43] AI Charlie: This year, a whole cast of new open models have burst on the scene. From Google's Gemma and Cohere's Command R, to Alibaba's Qwen and DeepSeek models, to LLM360 and DCLM, and of course, to the Allen Institute's OLMo, [00:02:00] OLMoE, Pixmo, Molmo, and OLMo 2 models. Pursuing open model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe,[00:02:14] AI Charlie: California, and the White House. We also were honored to hear from Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track. As always, don't forget to check the show notes for the YouTube link to their talk, as well as their slides. Watch out and take care.[00:02:35] Luca Intro[00:02:35] Luca Soldaini: Cool. Yeah, thanks for having me over. I'm Luca. I'm a research scientist at the Allen Institute for AI. I threw together a few slides as sort of a recap of interesting themes in open models for 2024. I have about maybe 20, 25 minutes of slides, and then we can chat if there are any questions.[00:02:57] Luca Soldaini: If I can advance to the next slide. [00:03:00] Okay, cool. So I did a quick check to sort of get a sense of how much 2024 was different from 2023.
So I went on Hugging Face and tried to get a picture of what kind of models were released in 2023 and what we got in 2024.[00:03:16] Luca Soldaini: In 2023 we got things like both Llama 1 and 2, we got Mistral, we got MPT, Falcon models, and I think the Yi model came in at the tail end of the year. It was a pretty good year. But then I did the same for 2024. And it's actually quite a stark difference. You have models that are, you know, rivaling the frontier-level[00:03:38] Luca Soldaini: performance of what you can get from closed models, from like Qwen, from DeepSeek. We got Llama 3. We got all sorts of different models. I added our own OLMo at the bottom. There's this growing group of fully open models that I'm going to touch on a little bit later. But you know, just looking at the slides, it feels like 2024 [00:04:00] was just smooth sailing, much better than the previous year.[00:04:04] Luca Soldaini: And you know, you can pick your favorite benchmark, or least favorite, I don't know, depending on what point you're trying to make, and plot, you know, your closed model, your open model, and sort of spin it in ways that show that, oh, you know, open models are much closer to where closed models are today versus last year, where the gap was fairly significant.[00:04:29] Luca Soldaini: So one thing that I think, I don't know if I have to convince people in this room, but usually when I give these talks about open models, there is always this background question in people's minds of, why should we use open models? The API's argument, you know, it's just an HTTP request to get output from one of the best models out there.[00:04:53] Luca Soldaini: Why do I have to set up infra and use local models? And there are really two answers. There is the more [00:05:00] researchy answer for this, which is where my background lies, which is just research.
If you want to do research on language models, research thrives on open models. There is a large swath of research on modeling, on how these models behave, on evaluation and inference, on mechanistic interpretability that could not happen at all if you didn't have open models. And for AI builders, there are also[00:05:30] Luca Soldaini: good use cases for using local models. You know, this is a very non-comprehensive slide, but there are some applications where local models just blow closed models out of the water. Retrieval is a very clear example. We might have constraints, like edge AI applications, where it makes sense.[00:05:51] Luca Soldaini: But even just in terms of stability, being able to say this model is not changing under the hood. There's plenty of good cases for [00:06:00] open models. And the community is not just models. I stole this slide from one of the Qwen2 announcement blog posts, but it's super cool to see how much tech exists around open models, on serving them, on making them efficient, on hosting them.[00:06:18] Luca Soldaini: It's pretty cool. And so, if you think about where the term open comes from, it comes from open source. Really, open models meet the core tenets of open source, specifically when it comes to collaboration. There is truly a spirit, like, through these open models, you can build on top of other people's[00:06:41] Luca Soldaini: innovation. We see a lot of this even in our own work, where, you know, as we iterate on the various versions of Olmo, it's not like every time we collect all the data from scratch.
No, the first step is, okay, what are the cool data sources and datasets people have put [00:07:00] together for language model training?[00:07:01] Luca Soldaini: Or when it comes to our post-training pipeline, one of the steps is you want to do some DPO, and you use a lot of outputs of other models to improve your preference model. So really, having an open ecosystem benefits and accelerates the development of open models.[00:07:23] The Definition of Open Models[00:07:23] Luca Soldaini: One thing that we got in 2024, which is not a specific model, but I thought it was really significant, is we got our first open source AI definition. This is from the Open Source Initiative. They've generally been the steward of a lot of the open source licenses when it comes to software, and so they embarked on this journey of trying to figure out, okay, what does an open source license for a model look like?[00:07:52] Luca Soldaini: The majority of the work is very dry, because licenses are dry, so I'm not going to walk through the license step by [00:08:00] step, but I'm just going to pick out one aspect that is very good, and then one aspect that personally feels like it needs improvement. On the good side, this open source AI definition is actually[00:08:13] Luca Soldaini: very intuitive. If you've ever built open source software and you have some expectations around what open source looks like for software, it sort of matches your intuition. So, the weights need to be freely available, the code must be released with an open source license, and there shouldn't be license clauses that block specific use cases.[00:08:39] Luca Soldaini: So, under this definition, for example, Llama or some of the Qwen models are not open source, because the license says you can't use this model for this, or it says if you use this model you have to name the output this way, or a derivative needs to be named that way.
Those clauses don't meet the open source [00:09:00] definition, and so they will not be covered.[00:09:02] Luca Soldaini: The Llama license will not be covered under the open source definition. It's not perfect. One of the things that, internally, you know, in discussions with OSI, we were sort of disappointed about is the language for data. So you might imagine that an open source AI model means a model where the data is freely available.[00:09:26] Luca Soldaini: There were discussions around that, but at the end of the day, they decided to go with a softened stance, where they say a model is open source if you provide sufficiently detailed information on how to replicate the data pipeline, so you have an equivalent system. "Sufficiently detailed" is very fuzzy. I don't like that. "An equivalent system" is also very fuzzy. And this doesn't take into account the accessibility of the process, right? It might be that you provide enough [00:10:00] information, but this process costs, I don't know, 10 million dollars to do. Now, the open source definition, like any open source license, has never been about accessibility, so that's never a factor in open source software, how accessible software is.[00:10:14] Luca Soldaini: I can make a piece of open source software, put it on my hard drive, and never access it. That software is still open source; the fact that it's not widely distributed doesn't change the license. But practically, there are expectations of what we want good open source to be. So it's kind of sad to see that the data component in this license is not as open as some of us would like it to be.[00:10:40] Challenges for Open Models[00:10:40] Luca Soldaini: And I linked a blog post that Nathan wrote on the topic that is less rambly and easier to follow.
One thing that, in general, I think it's fair to say about the state of open models in 2024 is that we know a lot more than what we knew in [00:11:00] 2023, both on the training data, like the pre-training data you curate, and on how to do all the post-training, especially on the RL side.[00:11:10] Luca Soldaini: You know, 2023 was a lot of throwing random darts at the board. In 2024, I think we have clear recipes that, okay, don't get the same results as a closed lab, because there is a cost in actually matching what they do, but at least we have a good sense of, okay, this is the path to get a state of the art language model.[00:11:31] Luca Soldaini: I think one thing that is a downside of 2024 is that we are more compute constrained than in 2023. It feels that, you know, the barrier for compute that you need to move innovation along has just been rising and rising. So if you go back to this slide, there is now this cluster of models that are sort of released by the[00:11:57] Luca Soldaini: compute rich club. Membership is [00:12:00] hotly debated. You know, some people don't want to be called rich because it comes with expectations. Some people want to be called rich. I don't know, there's debate, but these are players that have, you know, 10,000, 50,000 GPUs at minimum. And so they can do a lot of work and a lot of exploration in improving models that is not very accessible.[00:12:21] Luca Soldaini: To give you a sense of how I personally think about the research budget for each part of the language model pipeline: on the pre-training side, you can maybe do something with a thousand GPUs; really, you want 10,000. And if you want real state of the art, you know, your DeepSeek minimum is like 50,000, and you can scale to infinity.[00:12:44] Luca Soldaini: The more you have, the better it gets.
Everyone on that side still complains that they don't have enough GPUs. Post-training is a super wide spectrum. You can do something with as little as eight GPUs; as long as you're able to [00:13:00] run, you know, a good version of, say, a Llama model, you can do a lot of work there.[00:13:05] Luca Soldaini: A lot of the methodology just scales with compute, right? If you're interested in, you know, your open replication of what OpenAI's o1 is, you're going to be on the 10K spectrum of GPUs. Inference, you can do a lot with very few resources. Evaluation, you can do a lot with, well, I should say, at least one GPU, if you want to evaluate[00:13:30] Luca Soldaini: open models. But in general, if you care a lot about interventions to do on these models, which is my preferred area of research, then, you know, the resources that you need are quite significant. Yeah. One other trend that has emerged in 2024 is this cluster of fully open models.[00:13:54] Luca Soldaini: So Olmo, the model that we built at AI2, being one of them. And, you know, it's nice [00:14:00] that it's not just us. There's a cluster of other, mostly research, efforts who are working on this. And so it's good to give you a primer of what fully open means. So fully open, the easy way to think about it is: instead of just releasing a model checkpoint that you run, you release a full recipe, so that other people[00:14:24] Luca Soldaini: working in that space can pick and choose whatever they want from your recipe and create their own model or improve on top of your model. You're giving out the full pipeline and all the details there, instead of just the end output. So I pulled up the screenshot from our recent MoE model.[00:14:43] Luca Soldaini: And for this model, for example, we released the model itself.
The data it was trained on, the code for both training and inference, all the logs that we got through the training run, as well as every intermediate checkpoint. And the fact that you release different parts of the pipeline [00:15:00] allows others to do really cool things.[00:15:02] Luca Soldaini: So for example, this tweet from early this year from folks at Nous Research: they used our pre-training data to do a replication of the BitNet paper in the open. So they took just the initial part of our pipeline and then built their thing on top of it. It goes both ways.[00:15:21] Luca Soldaini: So for example, for the Olmo 2 model, a lot of our pre-training data for the first stage of pre-training was from this DCLM initiative that was led by folks at a variety of institutions. It was a really nice group effort. And, you know, it was nice to be able to say, okay, the state of the art in terms of what is done in the open has improved.[00:15:46] AI2 Models - Olmo, Molmo, Pixmo etc[00:15:46] Luca Soldaini: We don't have to do all this work from scratch to catch up to the state of the art. We can just take it directly, integrate it, and do our own improvements on top of that. I'm going to spend a few minutes doing a [00:16:00] shameless plug for some of our fully open recipes, so indulge me in this.[00:16:05] Luca Soldaini: A few things that we released this year: as I was mentioning, there's the OlmoE model, which I think still is the state of the art MoE model in its size class, and it's also fully open, so every component of this model is available. We released a multimodal model called Molmo. Molmo is not just a model, but a full recipe of how you go from a text-only model to a multimodal model, and we applied this recipe on top of Qwen checkpoints, on top of Olmo checkpoints, as well as on top of OlmoE.[00:16:37] Luca Soldaini: And I think there's been a replication doing that on top of Mistral as well.
On the post-training side, we recently released Tülu 3. Same story: this is a recipe for how you go from a base model to a state-of-the-art post-trained model. We applied the Tülu recipe on top of Olmo, on top of Llama, and then there's been an open replication effort [00:17:00] to do that on top of Qwen as well.[00:17:02] Luca Soldaini: It's really nice to see, you know, when your recipe is kind of turnkey: you can apply it to different models and it kind of just works. And finally, the last thing we released this year was Olmo 2, which so far is the best state-of-the-art fully open language model. It sort of combines aspects from all three of these previous models:[00:17:22] Luca Soldaini: what we learned on the data side from OlmoE, and what we learned about making models that are easy to adapt from the Molmo project and the Tülu project. I will close with a little bit of reflection on this ecosystem of open models. It's not all roses. It's not all happy. It feels like, day to day, it's always in peril.[00:17:44] Luca Soldaini: And, you know, I talked a little bit about the compute issues that come with it, but it's really not just compute. One thing that is on top of my mind is, due to the environment and, you know, growing feelings about how AI is treated, [00:18:00] it's actually harder to get access to a lot of the data that was used to train a lot of the models up to last year.[00:18:06] Luca Soldaini: This is a screenshot from really fabulous work from Shane Longpre, who I think is in Europe, about diminishing access to data for language model pre-training. So what they did is they went through every snapshot of Common Crawl. Common Crawl is this publicly available scrape of a subset of the internet.[00:18:29] Luca Soldaini: And they looked at, for any given website, whether a website that was accessible in, say, 2017 was still accessible or not in 2024.
And what they found is that, as a reaction to the existence of closed models like OpenAI's ChatGPT or Claude, a lot of content owners have blanket blocked any type of crawling of their website.[00:18:57] Luca Soldaini: And this is something that we see also internally at [00:19:00] AI2. One project that we started this year is, we wanted to understand: if you're a good citizen of the internet, and you crawl following sort of the norms and policies that have been established in the last 25 years, what can you crawl?[00:19:17] Luca Soldaini: And we found that there are a lot of websites where the norms of how you express a preference for whether to crawl your data or not are broken. A lot of people block a lot of crawling, but do not advertise that in robots.txt. You can only tell that they're blocking you from crawling when you try doing it.[00:19:37] Luca Soldaini: Sometimes you can't even crawl the robots.txt to check whether you're allowed or not. And then for a lot of websites, there are all these technologies that have historically existed to make website serving easier, such as Cloudflare or DNS, that are now being repurposed for blocking AI or any type of crawling [00:20:00] in a way that is very opaque to the content owners themselves.[00:20:04] Luca Soldaini: So, you know, you go to these websites, you try to access them, and they're not available, and you get a feeling, like, oh, something changed on the DNS side that is blocking this, and likely the content owner has no idea. They're just using Cloudflare for better, you know, load balancing.[00:20:25] Luca Soldaini: And this is something that was sort of sprung on them with very little notice. And I think the problem is that this blocking really impacts people in different ways.
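The well-behaved-crawler check Luca describes above can be sketched in a few lines of Python using the standard library's robots.txt parser. The user agents and the policy text here are hypothetical, and, as he notes, a clean robots.txt check says nothing about network-layer blocking by Cloudflare or DNS, so this is only the first, advertised layer of the norms.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a robots.txt policy the way a norm-following crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_crawl_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/article"))      # False
print(is_crawl_allowed(ROBOTS_TXT, "ResearchBot", "https://example.com/article"))  # True
```

The broken case Luca describes is exactly when this check passes (or the robots.txt is unreachable) but the request is still blocked at the DNS or CDN layer, which a crawler can only discover by trying.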
It disproportionately helps companies that have a headstart, which are usually the closed labs, and it hurts incoming newcomer players, who either have to do things in a sketchy way or are never going to get the content that the closed labs might have.[00:20:54] Luca Soldaini: There was a lot of coverage. I'm going to plug Nathan's blog post again, [00:21:00] whose title I think is very succinct, which is: before thinking about running out of training data, we're actually running out of open training data. And so if we want better open models, this should be on top of our minds.[00:21:13] Regulation and Lobbying[00:21:13] Luca Soldaini: The other thing that has emerged is that there are strong lobbying efforts trying to define any kind of AI as new and extremely risky. And I want to be precise here. The problem is not that we consider the risks of this technology; every technology has risks that should always be considered.[00:21:37] Luca Soldaini: The thing that to me is, sorry, disingenuous, is just putting this AI on a pedestal and calling it an unknown alien technology that has new and undiscovered potential to destroy humanity, when in reality, all the dangers, I think, are rooted in [00:22:00] dangers that we know from the existing software industry, or existing issues that come with using software in a lot of sensitive domains, like medical areas.[00:22:13] Luca Soldaini: And I've also noticed a lot of efforts that have actually been going on to try to make these open models safe. I pasted one here from AI2, but there's actually a lot of work that has been going on on, like, okay, if you're distributing this model openly, how do you make it safe?[00:22:31] Luca Soldaini: What's the right balance between accessibility of open models and safety?
And then there's also this annoying brushing under the rug of concerns that are then proved to be unfounded. You know, if you remember the beginning of this year, it was all about the bio risk of these open models.[00:22:48] Luca Soldaini: The whole thing fizzled because, finally, there's been rigorous research, not just this paper from the Cohere folks, but rigorous research showing [00:23:00] that this is really not a concern that we should be worried about. Again, there is a lot of dangerous use of AI applications, but this one was just a lobbying ploy to make things sound scarier than they actually are.[00:23:15] Luca Soldaini: So, I've got to preface this part: this is my personal opinion, not my employer's. But I look at things like SB 1047 from California, and I think we kind of dodged a bullet on this legislation. You know, the open source community, a lot of the community, came together at the last minute and made a very good effort to explain all the negative impacts of this bill.[00:23:43] Luca Soldaini: But I feel like there's a lot of excitement about building these open models, or researching these open models, and lobbying is not sexy, it's kind of boring, but it's sort of necessary to make sure that this ecosystem can really [00:24:00] thrive. That's the end of the presentation. I have some links and emails, sort of the standard thing, in case anyone wants to reach out, and if folks have questions or anything they wanted to discuss.[00:24:13] Luca Soldaini: Is there an open floor? I think we have Sophia,[00:24:16] swyx: who wants to, well, one very important open model that we haven't covered is Mistral, so we'll ask her about this slide. Yeah, yeah. Well, it's nice to have the Mistral person recap the year in Mistral.
But while Sophia gets set up, does anyone have thoughts or questions about the progress in this space?[00:24:32] Questions - Incentive Alignment[00:24:32] swyx: You always have questions.[00:24:34] Question: I'm very curious how we should build incentives to build open models, things like François Chollet's ARC Prize, and other initiatives like that. What is your opinion on how we should better align incentives in the community so that open models stay open?[00:24:49] Luca Soldaini: The incentive bit is really hard.[00:24:51] Luca Soldaini: It's something that we actually think a lot about internally, because building open models is risky. [00:25:00] It's very expensive, and so people don't want to take risky bets. I think challenges like those are very valid approaches for it.[00:25:13] Luca Soldaini: And then, in general, for any kind of effort to participate in those challenges, if we can promote doing that on top of open models and really lean into this multiplier effect, I think that is a good way to go. I wish there were more money for[00:25:35] Luca Soldaini: research efforts around open models. There's a lot of investment in companies that at the moment are releasing their models in the open, which is really cool, but it's usually more because of commercial interest and not about wanting to support open models in the long term. It's a really hard problem, because I think everyone is operating sort of [00:26:00] in what[00:26:01] Luca Soldaini: their local maximum is, right? In ways that really optimize their position on the market. The global maximum is harder to achieve.[00:26:11] Question2: Can I ask one question?
No.[00:26:12] Luca Soldaini: Yeah.[00:26:13] Question2: So I think one of the gaps between the closed and open source models is multilinguality. The closed source models, like ChatGPT, work pretty well on low resource languages, which is not the same for the open source models, right?[00:26:27] Question2: So is it in your plan to improve on that?[00:26:32] Luca Soldaini: I think in general,[00:26:32] Luca Soldaini: yes. I think we'll see a lot of improvements there in, like, 2025. There are groups on the smaller side that are already working on, like, better crawl support, multilingual support. I think what I'm trying to say here is you really want experts[00:26:54] Luca Soldaini: who are actually in those countries, who speak those languages, to [00:27:00] participate in the international community. To give you, like, a very easy example: I'm originally from Italy. I think I'm terribly equipped to build a model that works well in Italian, because one of the things you need to be able to do is have the knowledge of, okay, how do I access, you know, libraries or content that is from this region, that covers this language.[00:27:23] Luca Soldaini: I've been in the US long enough that I no longer know. So, I think that's the effort that folks in Central Europe, for example, are doing: okay, let's tap into regional communities to get access, you know, to bring in collaborators from those areas. I think it's going to be very crucial for getting products there.[00:27:46] Mistral intro[00:27:46] Sophia Yang: Hi everyone. Yeah, I'm super excited to be here to talk to you guys about Mistral. A really short and quick recap of what we have done, what kind of models and products we have released in the [00:28:00] past year and a half.
So most of you already know that we are a small startup, founded about a year and a half ago in Paris, in May 2023, by three of our co-founders. And in September 2023, we released our first open source model, Mistral 7B. Yeah, how many of you have used or heard about Mistral 7B?[00:28:24] Sophia Yang: Hey, pretty much everyone. Thank you. Yeah, it's pretty popular, and our community really loved this model. And in December 2023, we released another popular model with the MoE architecture, Mixtral 8x7B. And,[00:28:46] Sophia Yang: going into this year, you can see we have released a lot of things. First of all, in February 2024, we released Mistral Small, Mistral Large, and Le Chat, which is our chat interface; I will show you in a little bit. We released an embedding model for, you [00:29:00] know, converting your text into embedding vectors, and all of our models are available on the big cloud providers. So you can use our models on Google Cloud, AWS, Azure, Snowflake, IBM.[00:29:16] Sophia Yang: So, very useful for enterprises who want to use our models through the cloud. And in April and May this year, we released another powerful open source MoE model, Mixtral 8x22B. And we also released our first code model, Codestral, which is amazing at 80-plus programming languages. And then we provided a fine-tuning service for customization.[00:29:41] Sophia Yang: Because we know the community loves to fine-tune our models, we provide a very nice and easy option for you to fine-tune our models on our platform. And we also released our fine-tuning codebase, called mistral-finetune. It's open source, so feel free to take a look. And[00:29:58] Sophia Yang: more models. [00:30:00] From July to November this year, we released many, many other models. First of all, the two new best small models.
We have Ministral 3B, great for deploying on edge devices, and we have Ministral 8B; if you used to use Mistral 7B, Ministral 8B is a great replacement with much stronger performance than Mistral 7B.[00:30:25] Sophia Yang: We also collaborated with NVIDIA and open sourced another model, Mistral Nemo 12B, another great model. And just a few weeks ago, we updated Mistral Large to version 2, with updated state-of-the-art features and really great function calling capabilities; it supports function calling natively.[00:30:45] Sophia Yang: And we released two multimodal models: Pixtral 12B, which is open source, and Pixtral Large, just an amazing model, not only great at understanding images but also great at text understanding. Yeah, a [00:31:00] lot of the image models are not so good at textual understanding, but Pixtral Large and Pixtral 12B are good at both image understanding and textual understanding.[00:31:09] Sophia Yang: And of course, we have models for research: Codestral Mamba, built on the Mamba architecture, and Mathstral, great for working with math problems. So yeah, that's another model.[00:31:29] Sophia Yang: Here's another view of our model lineup. We have several premier models, which means these models are mostly available through our API. I mean, all of the models are available through our API, except for Ministral 3B. But the premier models have a special license, the Mistral research license: you can use them for free for exploration, but if you want to use them for enterprise or production use, you will need to purchase a license [00:32:00] from us.[00:32:00] Sophia Yang: So on the top row here, we have Ministral 3B and 8B as our premier models. Mistral Small is best for low-latency use cases, Mistral Large is great for your most sophisticated use cases, and Pixtral Large is the frontier-class multimodal model.
And we have Codestral, great for coding, and then again, the Mistral Embed model.[00:32:22] Sophia Yang: And at the bottom of the slide here, we have several Apache 2.0 licensed open-weight models, free for the community to use, and also, if you want to fine-tune them, use them for customization or production, feel free to do so. The latest, we have Pixtral 12B. We also have Mistral Nemo, Codestral Mamba, and Mathstral, as I mentioned. And we have three legacy models that we don't update anymore,[00:32:49] Sophia Yang: so we recommend you move to our newer models if you are still using them. And then, just a few weeks ago, [00:33:00] we made a lot of improvements to our chat interface, Le Chat. How many of you have used Le Chat? Oh, no. Only a few. Okay. I highly recommend Le Chat. It's chat.mistral.ai. It's free to use.[00:33:16] Sophia Yang: It has all the amazing capabilities I'm going to show you right now. But before that: Le Chat in French means the cat, so this is actually a cat logo. You can tell these are the cat's eyes. Yeah. So first of all, I want to show you something. Maybe let's take a look at image understanding.[00:33:36] Sophia Yang: So here I have a receipt, and I want to ask, just going to get the prompt. Cool. So basically I have a receipt, and I said, I ordered, I don't know, coffee and the sausage. How much do I owe? Add an 18 percent tip. So hopefully it was able to get the cost of the coffee and the [00:34:00] sausage and ignore the other things.[00:34:03] Sophia Yang: And yeah, I don't really understand this, but I think this is the coffee. It's, yeah, nine, eight. And then the cost of the sausage, we have 22 here. And then it was able to add the cost, calculate the tip, and all that. Great. So, it's great at image understanding, it's great at OCR tasks. So, if you have OCR tasks, please use it.[00:34:28] Sophia Yang: It's free on the chat. It's also available through our API. And also, I want to show you a Canvas example.
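For reference, the tip arithmetic from the receipt demo above works out as follows. The 9.80 and 22.00 line items are approximate values read off the demo, not exact figures from the receipt.

```python
# Approximate line items read off the receipt demo (assumed values).
coffee = 9.80
sausage = 22.00

subtotal = coffee + sausage          # the amounts the model had to pick out via OCR
total = round(subtotal * 1.18, 2)    # add the 18 percent tip

print(subtotal)  # 31.8
print(total)     # 37.52
```

So a correct answer from the model should land at 31.80 before tip and about 37.52 after, while ignoring the other items on the receipt.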
A lot of you may have used Canvas with other tools before, but with Le Chat, it's completely free. Again, here I'm asking it to create a canvas that uses PyScript to execute Python in my browser.[00:34:51] Sophia Yang: Let's see if it works. Import this. Okay, so, yeah, so basically it's executing [00:35:00] Python here, exactly what we wanted. And the other day, I was trying to ask Le Chat to create a game for me. Let's see if we can make it work. Yeah, the Tetris game. Yep. Let's just get one row. Maybe. Oh no. Okay. All right. You get the idea. I failed my mission. Okay. Here we go. Yay! Cool. Yeah. So as you can see, Le Chat can write the code for a simple game pretty easily, and you can ask Le Chat to explain the code or make updates however you like. Another example: there is a bar here I want to move.[00:35:48] Sophia Yang: Okay, great, okay. And let's go back to another one. Yeah, we also have web search capabilities. Like, you can [00:36:00] ask, what's the latest AI news? Image generation is pretty cool. Generate an image about researchers. Okay. In Vancouver? Yeah, it's Black Forest Labs' Flux Pro. Again, this is free, so... Oh, cool.[00:36:19] Sophia Yang: I guess researchers here are mostly from the University of British Columbia. That's smart. Yeah. So this is Le Chat. Please feel free to use it, and let me know if you have any feedback. We're always looking for improvements, and we're going to release a lot more powerful features in the coming years.[00:36:37] Sophia Yang: Thank you. Get full access to Latent Space at www.latent.space/subscribe

Can AIs do AI R&D? Reviewing RE-Bench Results with Neev Parikh of METR

Dec 21, 2024 · 107:58


In this episode of The Cognitive Revolution, Nathan explores METR's groundbreaking RE-Bench evaluation framework with Neev Parikh. We dive deep into how this new benchmark assesses AI systems' ability to perform real machine learning research tasks, from optimizing GPU kernels to fine-tuning language models. Join us for a fascinating discussion about the current capabilities of AI models like Claude 3.5 and GPT-4, and what their performance tells us about the trajectory of artificial intelligence development. Check out METR's work: blog post: https://metr.org/blog/2024-11-22-evaluating-r-d-capabilities-of-llms/ paper: https://metr.org/AI_R_D_Evaluation_Report.pdf jobs: https://hiring.metr.org/ The Cognitive Revolution Ask Me Anything and Listener Survey: https://docs.google.com/forms/d/1aYv2XLID7RqGxj2_Y4_6x9mo_aqXcGCeLw1EQhy4IpY/edit Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: GiveWell: GiveWell has spent over 17 years researching global health and philanthropy to identify the highest-impact giving opportunities. Over 125,000 donors have contributed more than $2 billion, saving over 200,000 lives through evidence-backed recommendations. First-time donors can have their contributions matched up to $100 before year-end. Visit https://GiveWell.org, select podcast, and enter Cognitive Revolution at checkout to make a difference today. SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers.
OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. CHAPTERS: (00:00:00) Teaser (00:01:04) About the Episode (00:05:14) Introducing METR (00:07:36) Specialization of AI Risk (00:09:52) AI R&D vs. Autonomy (00:12:41) Benchmark Design Choices (00:16:04) Benchmark Design Principles (Part 1) (00:18:54) Sponsors: GiveWell | SelectQuote (00:21:44) Benchmark Design Principles (Part 2) (00:22:35) AI vs. Human Evaluation (00:26:55) Optimizing Runtimes (00:36:02) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++ (00:38:20) AI Myopia (00:43:37) Optimizing Loss (00:47:59) Optimizing Win Rate (00:50:24) Best of K Analysis (01:02:26) Best of K Limitations (01:09:04) Agent Interaction Modalities (01:12:34) Analyzing Benchmark Results (01:17:16) Model Performance Differences (01:22:49) Elicitation and Scaffolding (01:27:08) Context Window & Best of K (01:35:17) Reward Hacking & Bad Behavior (01:43:47) Future Directions & Hiring (01:46:20) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/

Equity
Are AI companies just defense tech now?

Equity

Play Episode Listen Later Dec 20, 2024 30:49


This week, the Equity pod gang — which included newcomer Max Zeff, Margaux MacColl, and Kirsten Korosec — noticed an emerging trend: the worlds of AI and defense tech are colliding. Listen to the full episode to hear about: A new fund is in town. And surprise, surprise, Humba Ventures' $40 million fund is focused on deep tech and defense.  How enterprise AI startup Cohere is unlike all the other AI startups out there, and why they've been so quiet. Particularly in this new deal with Palantir. Dig into the great philosophical question of 2024: is it dumb to IPO in an election year? And, perhaps more importantly, will this IPO dry spell continue in 2025? Should founders be cautious of investors with foreign backing? Equity is TechCrunch's flagship podcast, produced by Theresa Loconsolo, and posts every Wednesday and Friday. Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. For the full episode transcript, for those who prefer reading over listening, check out our full archive of episodes here. Credits: Equity is produced by Theresa Loconsolo with editing by Kell. Bryce Durbin is our Illustrator. We'd also like to thank the audience development team and Henry Pickavet, who manages TechCrunch audio products.

CanCon Podcast
The biggest tech stories of 2024

CanCon Podcast

Play Episode Listen Later Dec 20, 2024 76:02


“We are about to see a bunch of people who are powerful, successful, and feel like they know the best way to do things bring this approach into government.” The BetaKit Podcast reviews the biggest tech stories of 2024 before doling out annual letter grades for Shopify, Wealthsimple, Cohere, and many more. A podcast so good we had to record it twice. Have a different take on 2024? Let us know: podcast@betakit.com. The BetaKit Podcast is presented by ‘Devious Web,' a Shelley Grandy mystery novel available in digital or paperback formats. A successful Canadian entrepreneur is offered millions from Silicon Valley for his data analytics business. As CEO Tom Oliver considers the deal, he is targeted by an unknown perpetrator, and his friend, homicide detective Jason Liu, must strive to keep him safe. If you enjoyed ‘Suits' and ‘Succession,' this book is for you. Order your copy today.

The Game Plan
#29 Q&A - What 2024 Taught Me About Life & Fitness

The Game Plan

Play Episode Listen Later Dec 18, 2024 23:21


This episode is sponsored by Oracle. Harness the power of AI without overspending with Oracle Cloud Infrastructure (OCI). Ideal for AI model training, OCI offers 4-8x more bandwidth than competitors at half the cost. Transform your business like Uber and Cohere with OCI. Try it for free at https://oracle.com/gameplan We're diving into a classic Q&A session today! I'm answering all your questions about fitness, training, nutrition, motivation, and more, straight from my Instagram. From my go-to workouts when energy is low, to my favorite supplements, cheat meals, and top fitness advice—this video covers it all. Smash that like button and subscribe for more! Try Whoop for free: http://join.whoop.com/Lipsett Join my mentorship program: https://www.bygameplan.com/mentorship Alphalete Athletics: https://alphaleteathletics.com Code: LIPSETT for 10% off GHOST Supplements: https://www.ghostlifestyle.com Discount Code: LIPSETT GHOST Supplements UK: https://uk.ghostlifestyle.com Discount Code: LIPSETT

Scouting Frontiers in AI for Biology: Dynamics, Diffusion, and Design, with Amelie Schreiber

Play Episode Listen Later Dec 14, 2024 107:28


Nathan welcomes back computational biochemist Amelie Schreiber for a fascinating update on AI's revolutionary impact in biology. In this episode of The Cognitive Revolution, we explore recent breakthroughs including AlphaFold3, ESM3, and new diffusion models transforming protein engineering and drug discovery. Join us for an insightful discussion about how AI is reshaping our understanding of molecular biology and making complex protein engineering tasks more accessible than ever before. Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
CHAPTERS: (00:00:00) Teaser (00:00:46) About the Episode (00:04:30) AI for Biology (00:07:14) David Baker's Impact (00:11:49) AlphaFold 3 & ESM3 (00:16:40) Protein Interaction Prediction (Part 1) (00:16:44) Sponsors: Shopify | SelectQuote (00:19:18) Protein Interaction Prediction (Part 2) (00:31:12) MSAs & Embeddings (Part 1) (00:32:32) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++ (00:34:49) MSAs & Embeddings (Part 2) (00:35:57) Beyond Structure Prediction (00:51:13) Dynamics vs. Statics (00:57:24) In-Painting & Use Cases (00:59:48) Workflow & Platforms (01:06:45) Design Process & Success Rates (01:13:23) Ambition & Task Definition (01:19:25) New Models: PepFlow & GeoAB (01:28:23) Flow Matching vs. Diffusion (01:30:42) ESM3 & Multimodality (01:37:10) Summary & Future Directions (01:45:34) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431

Telecoms.com Podcast
Cohere, Vodafone and radio waves

Telecoms.com Podcast

Play Episode Listen Later Dec 5, 2024 110:30


This unprecedented episode of the pod not only took place in the middle of the week, but also featured a dialled-in special guest in addition to Ray from Cohere in the studio. They start by finding out what Cohere does (spoiler alert: it has invented some cleverness that makes better use of radio waves), which gets into the weeds regarding radio technology in general. Cohere recently won a Glotel award in partnership with Vodafone Spain, so they eventually patch in Paco from that company to join the fun, before concluding with some broader musing about the future direction of the industry.

EUVC
EUVC | E385 | Inovia's Michael McGraw on European LPs and why a higher risk appetite could pay for itself

EUVC

Play Episode Listen Later Dec 4, 2024 54:09


In this episode of the EUVC podcast, Andreas sits down with Michael McGraw, Principal at Inovia Capital, a €415M growth equity fund headquartered in Canada but making waves in Europe. Inovia has €2.3B in assets under management and a track record of backing companies like Cohere, Lightspeed, Neo4j, and Wealthsimple. Mike brings a unique perspective shaped by his journey from LP at CDPQ—one of the world's largest pension funds—to leading growth-stage investments at Inovia. Together, we'll dive deep into the evolving role of European LPs, exploring why embracing a higher risk appetite could yield outsized returns and drive systemic innovation. We'll also discuss Inovia's strategy for scaling Series B to pre-IPO companies across North America and Europe, shedding light on key challenges and opportunities in the software space. Whether you're an LP curious about market dynamics or a founder navigating growth-stage fundraising, this episode is packed with insights you won't want to miss. Go to eu.vc to read the core take-aways. Chapters: 01:00 Meet Michael McGraw from Inovia 01:59 Inovia's Strategy and Focus 02:23 Inovia's European Expansion 03:22 Success Stories and Notable Investments 04:05 The Role of CDPQ and Mike's Experience 04:55 Canadian vs. European VC Ecosystems 07:22 CDPQ's Investment Strategy 11:42 Challenges for European LPs 16:49 Fundraising in Europe: Insights and Observations 27:27 Firepower and Fund Allocation 28:05 Late Stage Market in Europe 28:28 Investment Strategies and Risk Appetite 29:49 Challenges in European Venture Growth Capital 31:45 Government's Role in Venture Capital 32:27 Canadian Venture Capital Action Plan 34:09 Fund of Funds in Europe 37:45 Mike McGraw's Background 41:41 Lessons Learned in Venture Capital 48:06 Fundraising Tips for VCs

The Evolution of AI Agents: Lessons from 2024, with MultiOn CEO Div Garg

Play Episode Listen Later Dec 3, 2024 90:21


In this episode of The Cognitive Revolution, Nathan welcomes back Div Garg, Co-Founder and CEO of MultiOn, for his third appearance to discuss the evolving landscape of AI agents. We explore how agent development has shifted from open-ended frameworks to intelligent workflows, MultiOn's unique approach to agent development, and their journey toward achieving human-level performance. Dive into fascinating insights about data collection strategies, model fine-tuning techniques, and the future of agent authentication. Join us for an in-depth conversation about why 2025 might be the breakthrough year for AI agents. Check out MultiOn: https://www.multion.ai/ Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders.
Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg CHAPTERS: (00:00:00) Teaser (00:00:40) About the Episode (00:04:10) The Rise of AI Agents (00:06:33) Open-Ended vs On-Rails (00:10:00) Agent Architecture (00:12:01) AI Learning & Feedback (00:14:01) Data Collection (Part 1) (00:18:27) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote (00:20:51) Data Collection (Part 2) (00:22:25) Self-Play & Rewards (00:25:04) Model Strategy & Agent Q (00:33:28) Sponsors: Weights & Biases RAG++ (00:34:39) Understanding Agent Q (00:43:16) Search & Learning (00:45:39) Benchmarks vs Reality (00:50:18) Positive Transfer & Scale (00:51:47) Fine-Tuning Strategies (00:55:16) Vision Strategy (01:00:16) Authentication & Security (01:03:48) Future of AI Agents (01:16:14) Cost, Latency, Reliability (01:19:30) Avoiding the Bitter Lesson (01:25:58) Agent-Assisted Future (01:27:11) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

Beyond Preference Alignment: Teaching AIs to Play Roles & Respect Norms, with Tan Zhi Xuan

Play Episode Listen Later Nov 30, 2024 117:12


In this episode of The Cognitive Revolution, Nathan explores groundbreaking perspectives on AI alignment with MIT PhD student Tan Zhi Xuan. We dive deep into Xuan's critique of preference-based AI alignment and their innovative proposal for role-based AI systems guided by social consensus. The conversation extends into their fascinating work on how AI agents can learn social norms through Bayesian rule induction. Join us for an intellectually stimulating discussion that bridges philosophical theory with practical implementation in AI development. Check out: "Beyond Preferences in AI Alignment" paper: https://arxiv.org/pdf/2408.16984 "Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" paper: https://arxiv.org/pdf/2402.13399 Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S.
customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg CHAPTERS: (00:00:00) Teaser (00:01:09) About the Episode (00:04:25) Guest Intro (00:06:25) Xuan's Background (00:12:03) AI Near-Term Outlook (00:17:32) Sponsors: Notion | Weights & Biases RAG++ (00:20:18) Alignment Approaches (00:26:11) Critiques of RLHF (00:34:40) Sponsors: Oracle Cloud Infrastructure (OCI) (00:35:50) Beyond Preferences (00:40:27) Roles and AI Systems (00:45:19) What AI Owes Us (00:51:52) Drexler's AI Services (01:01:08) Constitutional AI (01:09:43) Technical Approach (01:22:01) Norms and Deviations (01:32:31) Norm Decay (01:38:06) Self-Other Overlap (01:44:05) Closing Thoughts (01:54:23) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Model Plateaus and Enterprise AI Adoption with Cohere's Aidan Gomez

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Play Episode Listen Later Nov 21, 2024 44:15


In this episode of No Priors, Sarah is joined by Aidan Gomez, cofounder and CEO of Cohere. Aidan reflects on his journey to co-authoring the groundbreaking 2017 paper, “Attention is All You Need,” during his internship, and shares his motivations for building Cohere, which delivers AI-powered language models and solutions for businesses. The discussion explores the current state of enterprise AI adoption and Aidan's advice for companies navigating the build vs. buy decision for AI tools. They also examine the drivers behind the flattening of model improvements and discuss where large language models (LLMs) fall short for predictive tasks. The conversation explores what the market has yet to account for in the rapidly evolving AI ecosystem, as well as Aidan's personal perspectives on AGI—what it might look like and when it could arrive. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @AidanGomez Show Notes: 0:00 Introduction 0:36 Co-authoring “Attention is all you need” 2:27 Leaving Google and founding Cohere 4:04 Cohere's mission and models 6:15 Pitfalls of current AI  8:14 How enterprises are deploying AI today 10:58 Build vs. buy strategy for AI tools 14:37 Barriers to enterprise adoption  20:04 Which types of companies should pretrain models? 24:25 Addressing flaws in open-source models 25:12 Current and expected progress in scaling laws 29:54 Advances in multi-step problem solving and reasoning 32:29  Key drivers behind the flattening curve of model improvements  36:25 Exploring AGI 39:59 Limitations of LLMs 42:10 What the market has mispriced

AI Under Trump? The Stakes of 2024 w/ Joshua Steinman [Pt 2 of 2]

Play Episode Listen Later Nov 2, 2024 77:00


In this special episode of The Cognitive Revolution, Nathan shares his thoughts on the upcoming election and its potential impact on AI development. He explores the AI-forward case for Trump, featuring an interview with Joshua Steinman. Nathan outlines his reasons for not supporting Trump, focusing on US-China relations, leadership approach, and the need for a positive-sum mindset in the AI era. He discusses the importance of stable leadership during pivotal moments and explains why he'll be voting for Kamala Harris, despite some reservations. This thought-provoking episode offers a nuanced perspective on the intersection of politics and AI development. Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess SPONSORS: Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake.
Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr CHAPTERS: (00:00:00) About the Show (00:00:22) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:13:13) Reflecting on Trump (00:15:32) Introducing Josh (00:16:35) AI Arms Race Concerns (00:20:20) Arms Race History (00:22:35) Building Trust (00:25:19) Ashenbrenner Model (00:27:17) Global Good vs. Self-Interest (00:28:20) Sponsors: Shopify | Notion (00:31:16) Working with Trump (00:33:54) Media Misrepresentation (00:40:09) Cabinet Member Leverage (00:44:41) Sponsors: LMNT (00:46:23) China's Communist Party (00:48:36) AI and National Policy (00:50:14) The Reality of AGI (00:52:39) Framing the Disagreement (01:01:41) Slaughterbots and AI Future (01:04:24) Risks of Engagement (01:09:29) Sustainability of Military Tech (01:13:01) Closing Statements (01:14:55) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

The Case for Trump and the Future of AI – Part 1, with Samuel Hammond, Senior Economist, Foundation of American Innovation

Play Episode Listen Later Nov 1, 2024 139:49


In this special episode of The Cognitive Revolution, Nathan shares his thoughts on the upcoming election and its potential impact on AI development. He explores the AI-forward case for Trump, featuring an interview with Samuel Hammond. Nathan outlines his reasons for not supporting Trump, focusing on US-China relations, leadership approach, and the need for a positive-sum mindset in the AI era. He discusses the importance of stable leadership during pivotal moments and explains why he'll be voting for Kamala Harris, despite some reservations. This thought-provoking episode offers a nuanced perspective on the intersection of politics and AI development. Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess SPONSORS: Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake.
Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr CHAPTERS: (00:00:00) About the Show (00:00:22) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:13:13) Introductions (00:14:22) The Case for Trump (00:16:32) Trump: A Wildcard (00:26:10) Sponsors: Shopify | Notion (00:29:06) Ideological AI Policy (00:33:47) Republican Ideologies (00:40:31) Sponsors: LMNT (00:42:11) Trump and Silicon Valley (00:47:49) Republican Nuance (00:53:36) Elon Musk and AI (00:55:43) Utilitarian Analysis (00:58:01) Internal Consistency (01:00:31) Trump's Cabinet (01:05:53) Immigration Reform (01:15:30) Creative Destruction (01:22:29) Racing China (01:32:51) The Chip Ban (01:44:20) Standard Setting (01:48:36) Values and Diplomacy (01:52:50) American Strength (01:55:56) Red Queen Dynamic (01:59:23) Interest Groups & AI (02:08:32) Concluding Thoughts (02:17:45) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast

Gemini Update: Search Grounding, JSON Mode, Code Execution, & More – with Google's Logan Kilpatrick and Shrestha Basu Mallick

Play Episode Listen Later Oct 31, 2024 56:59


Nathan interviews Google product managers Shrestha Basu Mallick and Logan Kilpatrick about the Gemini API and AI Studio. They discuss Google's new grounding feature, allowing Gemini models to access real-time web information via Google search. The conversation explores Gemini's rapid growth, its position in the AI landscape, and Google's competitive strategy. Nathan shares insights from integrating Gemini into his own application and ponders the future of large language model capabilities across providers. Tune in for an in-depth look at Google's AI API product strategy and the latest Gemini features. Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess SPONSORS: Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake.
Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr CHAPTERS: (00:00:00) About the Show (00:00:53) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:04:15) Gemini API Growth (00:05:26) Intro to AI Studio (00:07:35) Vertex vs. AI Studio (00:09:33) Developer Adoption (00:14:23) Gemini Use Cases (Part 1) (00:17:41) Sponsors: Shopify | Notion (00:20:01) Gemini Use Cases (Part 2) (00:23:08) Multimodality & Flash (00:26:29) Free Tier & Costs (00:31:43) Inference Costs (00:32:55) Fine-tuning & Vision (00:36:59) Sponsors: LMNT (00:38:04) Search Grounding (00:44:42) Grounding Sources (00:46:58) Competitive Landscape (00:50:36) Design Decisions (00:54:54) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

Big Technology Podcast
The Next Gen AI Models: Reliable, Consistent, Trustworthy — With Aidan Gomez

Big Technology Podcast

Play Episode Listen Later Oct 30, 2024 45:17


Aidan Gomez is the co-author of the "Attention Is All You Need" paper that launched the AI revolution and CEO of Cohere, an enterprise AI company. Gomez joins Big Technology to discuss the myths, facts, and realities of today's AI landscape. Tune in to hear why the real value of AI isn't in flashy consumer apps but in automating crucial back-office processes that could save businesses billions. We also cover the truth about AI capabilities, the likelihood of AGI, synthetic data training, and whether an intelligence explosion is possible. Hit play for a refreshingly grounded discussion about where AI is actually making an impact, from one of the field's pioneering voices. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Want a discount for Big Technology on Substack? Here's 40% off for the first year: https://tinyurl.com/bigtechnology Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

Training Zamba: A Hybrid Model Master Class with Zyphra's Quentin Anthony

Play Episode Listen Later Oct 30, 2024 145:00


In this episode of The Cognitive Revolution, Nathan dives deep into the world of state space models with returning co-host Jason Meaux and special guest Quentin Anthony, Head of Model Training at Zyphra. Explore the cutting-edge Zamba 2-7b model, which combines selective state space and attention mechanisms. Uncover practical insights on model training, architectural choices, and the challenges of scaling AI. From learning schedules to hybrid architectures, loss metrics to context length extension, this technical discussion covers it all. Don't miss this in-depth conversation on the future of personalized, on-device AI. Check out more about Zyphra and Jason Meaux here: Zyphra's website: https://www.zyphra.com Zamba2-7B Blog: https://www.zyphra.com/post/zamba2-7b Zamba2 GitHub: https://github.com/Zyphra/Zamba2 Tree attention: https://www.zyphra.com/post/tree-attention-topology-aware-decoding-for-long-context-attention-on-gpu-clusters Jason Meaux's Twitter: https://x.com/KamaraiCode Jason Meaux's website: https://www.statespace.info Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess SPONSORS: Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation.
With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake. Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr CHAPTERS: (00:00:00) Teaser (00:00:42) About the Show (00:01:05) About the Episode (00:03:09) Introducing Zyphra (00:07:28) Personalization in AI (00:12:48) State Space Models & Efficiency (Part 1) (00:19:22) Sponsors: Weights & Biases RAG++ | Shopify (00:21:26) State Space Models & Efficiency (Part 2) (00:22:23) Dense Attention to Shared Attention (00:29:41) Zyphra's Early Bet on Mamba (Part 1) (00:33:18) Sponsors: Notion | LMNT (00:36:00) Zyphra's Early Bet on Mamba (Part 2) (00:37:22) Loss vs. Model Quality (00:44:53) Emergence & Grokking (00:50:06) Loss Landscapes & Convergence (00:56:55) Sophia, Distillation & Secrets (01:09:00) Competing with Big Tech (01:23:50) The Future of Model Training (01:30:02) Deep Dive into Zamba 1 (01:34:24) Zamba 2 and Mamba 2 (01:38:56) Context Extension & Memory (01:44:04) Sequence Parallelism (01:45:44) Zamba 2 Architecture (01:53:57) Mamba Attention Hybrids (02:00:00) Lock-in Effects (02:05:32) Mamba Hybrids in Robotics (02:07:07) Ease of Use & Compatibility (02:12:10) Tree Attention vs. Ring Attention (02:22:02) Zyphra's Vision & Goals (02:23:57) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/

Can AIs Generate Novel Research Ideas? with lead author Chenglei Si

Play Episode Listen Later Oct 23, 2024 84:56


In this episode of The Cognitive Revolution, Nathan delves into the fascinating world of AI-generated research ideas with Stanford PhD student Chenglei Si. They discuss a groundbreaking study that pits AI against human researchers in generating novel AI research concepts. Learn about the surprising results that show AI-generated ideas scoring higher on novelty and excitement, and explore the implications for the future of AI research and development. Join us for an insightful conversation that challenges our understanding of AI capabilities and their potential impact on scientific discovery.

Link to the research paper being discussed: https://arxiv.org/abs/2409.04109

Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess

SPONSORS:
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Brave: The Brave search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference. All while remaining affordable with developer first pricing, integrating the Brave search API into your workflow translates to more ethical data sourcing and more human representative data sets. Try the Brave search API for free for up to 2000 queries per month at https://bit.ly/BraveTCR
Oracle: Oracle Cloud Infrastructure (OCI) is a single platform for your infrastructure, database, application development, and AI needs. OCI has four to eight times the bandwidth of other clouds; offers one consistent price, and nobody does data better than Oracle. If you want to do more and spend less, take a free test drive of OCI at https://oracle.com/cognitive

CHAPTERS: (00:00:00) About the Show (00:00:22) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:05:30) Introducing Chenglei Si (00:06:22) Path to Automating Research (00:07:58) Notable AI Research Projects (00:15:26) Evaluating Research Ideas (Part 1) (00:19:39) Sponsors: Shopify | Notion (00:22:33) Evaluating Research Ideas (Part 2) (00:25:49) Research Setup and Design (00:29:38) AI Prompting and Idea Generation (00:34:40) Diversity vs. Quality of Ideas (Part 1) (00:34:40) Sponsors: Brave | Oracle (00:36:44) Diversity vs. Quality of Ideas (Part 2) (00:42:05) Inference Scaling and Execution (00:45:04) Anonymizing and Evaluating Ideas (00:53:22) Headline Results and Analysis (00:58:45) Observations and Insights (01:09:02) Novelty Indicators and Deception (01:11:59) Top AI-Generated Ideas (01:14:41) Next Steps and Future Directions (01:20:43) Expectations for the Future (01:23:14) Outro

通勤十分鐘 On The Way To Work
S5EP438 The Rise of Canadian Startup Cohere: The Toronto Force Behind AI Innovation and Its Ambitions in the Enterprise Market, Plus JP Morgan Earnings Beat Expectations and Lift Bank Stocks

通勤十分鐘 On The Way To Work

Play Episode Listen Later Oct 18, 2024 30:01


This episode is sponsored by the Zenfone 11 Ultra. You soak up AI news eagerly, but do you know how to put AI to use? The Zenfone 11 Ultra bills itself as the AI phone that best understands Taiwanese users, with handy, high-efficiency features like AI voice-memo transcription and AI real-time call interpretation, and it is the only one to support Traditional Chinese! You can even shoot ultra-stable AI-assisted footage on the go. Learn more about everyday uses for an AI phone >> https://reurl.cc/0d02WK

Happy Friday, everyone! This episode covers October 18, Taiwan time.
How to enable podcast subscriptions
Subscribe via Patreon here
Subscribe to the free 通勤精釀 newsletter
For partnership inquiries, contact: onthewaytowork2020@gmail.com
IG: @onthe_waytowork https://www.instagram.com/onthe_waytowork/
Powered by Firstory Hosting

Leading Indicators of AI Danger: Owain Evans on Situational Awareness & Out-of-Context Reasoning, from The Inside View

Play Episode Listen Later Oct 16, 2024 146:37


In this special crossover episode of The Cognitive Revolution, Nathan introduces a conversation from The Inside View featuring Owain Evans, AI alignment researcher at UC Berkeley's Center for Human Compatible AI. Evans and host Michael Trazzi delve into critical AI safety topics, including situational awareness and out-of-context reasoning. Discover Evans' groundbreaking work on the reversal curse and connecting the dots, exploring how large language models process and infer information. This timely discussion highlights the importance of situational awareness in AI systems, particularly in light of recent advancements in AI capabilities. Don't miss this insightful exploration of the evolving relationship between human and artificial intelligence.

Check out "The Inside View" Podcast here: https://theinsideview.ai/

Apply to join over 400 Founders and Execs in the Turpentine Network: https://www.turpentinenetwork.co/

SPONSORS:
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive.
LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake. Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr.
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Oracle: Oracle Cloud Infrastructure (OCI) is a single platform for your infrastructure, database, application development, and AI needs. OCI has four to eight times the bandwidth of other clouds; offers one consistent price, and nobody does data better than Oracle. If you want to do more and spend less, take a free test drive of OCI at https://oracle.com/cognitive

CHAPTERS: (00:00:00) About the Show (00:00:22) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:04:10) Intro (00:05:09) Owain Evans' Research (00:06:36) Situational Awareness (00:09:07) Measuring Situational Awareness (00:14:29) Claude's Situational Awareness (00:19:06) Sponsors: Shopify | LMNT (00:22:01) Needle in a Haystack (00:26:26) Concrete Examples of Tasks (00:34:51) Sponsors: Notion | Oracle (00:37:29) Anti-Imitation Tasks (00:50:03) GPT-4 Base Model Results (01:01:48) Benchmark Saturation (01:07:23) Future Research Directions (01:12:01) Out-of-Context Reasoning (01:27:29) Safety Implications (01:36:24) Scaling and Reasoning (01:44:28) Mixture of Functions (01:54:12) Research Style and Taste (02:08:51) Capabilities and Downsides (02:18:56) Reception and Impact (02:25:30) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

POLITICO Dispatch
Cohere's CEO wants to build a ‘boring but profound' AI future

POLITICO Dispatch

Play Episode Listen Later Oct 9, 2024 19:53


Artificial intelligence may not be as smart as humans — at least not yet — but the technology is progressing faster than Aidan Gomez ever imagined. Now, the Cohere CEO says the trick is convincing people and companies to embrace it. On POLITICO Tech, Gomez sits down with host Steven Overly to talk about what that will take and how fast it can happen. Learn more about your ad choices. Visit megaphone.fm/adchoices

Runway's Video Revolution: Empowering Creators with General World Models, with CTO Anastasis Germanidis

Play Episode Listen Later Oct 9, 2024 57:01


Nathan and co-host Stephen Parker delve into the world of AI video generation with Anastasis Germanidis, Co-Founder and CTO of Runway. This episode of The Cognitive Revolution explores the cutting-edge technology behind Runway's Gen 3 models and their impact on the creative industry. From emergent properties in scaled-up models to the democratization of video creation, join us for an illuminating discussion on the future of AI-generated content and its potential to reshape entertainment and culture.

Check out Runway here: https://runwayml.com

Apply to join over 400 Founders and Execs in the Turpentine Network: https://www.turpentinenetwork.co/

SPONSORS:
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
Omneky: Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off https://www.omneky.com/
Oracle: Oracle Cloud Infrastructure (OCI) is a single platform for your infrastructure, database, application development, and AI needs. OCI has four to eight times the bandwidth of other clouds; offers one consistent price, and nobody does data better than Oracle. If you want to do more and spend less, take a free test drive of OCI at https://oracle.com/cognitive

RECOMMENDED PODCAST: This Won't Last - Eavesdrop on Keith Rabois, Kevin Ryan, Logan Bartlett, and Zach Weinberg's monthly backchannel ft their hottest takes on the future of tech, business, and venture capital. Spotify: https://open.spotify.com/show/2HwSNeVLL1MXy0RjFPyOSz

CHAPTERS: (00:00:00) About the Show (00:00:22) About the Episode (00:03:05) Introduction and AI for Creative Work (00:03:39) Video Generation as World Modeling (00:05:52) Emergent Properties in Scaled Models (00:08:44) Importance of Architecture vs Data (00:10:57) Multimodal Models (00:15:52) Sponsors: Notion | Weights & Biases RAG++ (00:18:37) Video Understanding and AGI (00:25:03) AI Agents for Video Creation (00:27:30) Runway's Culture of Shipping (00:29:20) Balancing Research Publication and Strategy (00:33:19) Sponsors: Omneky | Oracle (00:34:40) Features Variety Release Cycle (00:36:54) Power Users (00:38:56) Interactive Video (00:40:40) Scaling Challenges (00:42:21) Future of Creativity (00:45:24) Competing with Giants (00:47:39) Model Divergence (00:49:28) Disclosure vs. Strategy (00:51:19) Runway ML's API (00:54:23) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/

Equity
Found: Has Rippling won? with Parker Conrad from Rippling

Equity

Play Episode Listen Later Oct 2, 2024 49:47


HR software is big, big business. And no one understands that better than Parker Conrad, the CEO and co-founder of Rippling, a global HR company that offers global payroll, onboarding, time tracking, benefits management and more. This week, Equity is bringing you an episode of our sister show, Found. The Found crew talk with Conrad about what goes into building a leading HR tech company, from what it's like building out features companies love to dealing with fierce competition in this ever-growing landscape. Conrad also gets into the power imbalance that can arise between VCs and founders, and the drama at his previous company that inspired him to build Rippling.

Found posts every Tuesday. Subscribe on Apple, Spotify or wherever you listen to podcasts to be alerted when new episodes drop. Subscribe to Found to hear more stories from founders each week.

Credits: Equity is produced by Theresa Loconsolo with editing by Kell. Bryce Durbin is our Illustrator. We'd also like to thank the audience development team and Henry Pickavet, who manages TechCrunch audio products.

Equity
'Super weird' is the best way to describe this startup's pivot

Equity

Play Episode Listen Later Sep 20, 2024 25:50


This week on Equity, the podcast crew discusses several weird things and at least one cool thing. Kirsten Korosec, Devin Coldewey, and Rebecca Bellan first talked about the least weird thing of the week, how nice it is that Cohere co-founder Nick Frosst has a band that people really like.

Then we get weird. First the good weird: a helmet that squeezes your head, but for a really good reason. It prevents hair loss from chemotherapy. Devin covered Luminate's latest fundraise and news, and everyone was pleased that money was going to a startup that may really be helping people feel better about themselves during a difficult time. The company is hoping to improve at-home care as well.

Next, Kirsten explained the weird phenomenon of Flink, the “quick commerce” startup that just recently was rumored to be on the block for about $106 million, instead raising $115 million. Quite a turnaround! But as the team discusses, it may be that investors see the possibility that the “tumultuous time” for this sector is ending and Flink may have a good grip on the German market. Still…

Then the weirdness begins in earnest. Rebecca is at the “Principled Business Summit,” aimed at “reclaiming capitalism” from, apparently, itself. She is getting mixed messages from the crowd and the content, which seems to combine enthusiasm for doing the right thing with some fringe tendencies to do… other things.

And weirdest of all, autonomous trucking startup TuSimple's pivot to… AI-generated animation and video games. What?! Though there is some overlap between simulation and animation/gaming, it's a wild and unexpected change for the company, and a lot of shareholders are not going for it. Apparently the new division is working on another adaptation of “The Three-Body Problem,” so that's good… but what about the $450 million they were going to spend on trucks? That conflict is playing out before our eyes.

Press play, and catch up!

Equity is TechCrunch's flagship podcast, produced by Theresa Loconsolo, and posts every Wednesday and Friday. Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. For the full episode transcript, for those who prefer reading over listening, check out our full archive of episodes over at Simplecast.

Credits: Equity is produced by Theresa Loconsolo with editing by Kell. Bryce Durbin is our Illustrator. We'd also like to thank the audience development team and Henry Pickavet, who manages TechCrunch audio products.

Techmeme Ride Home
Mon. 09/16 – A Bunch Of Apple Stories

Techmeme Ride Home

Play Episode Listen Later Sep 16, 2024 16:32


A bunch of Apple stories today. FDA approval for sleep apnea detection for the watch. Signs of poor pre-order sales for the phone. And a quick review of the new AirPods. Also, how did Intel lose out on making the chips for the next-gen PlayStation? And are dating apps responsible for income inequality?

Sponsors:
ArcticWolf.com/register

Links:
Apple Watch sleep apnea detection gets FDA approval (TechCrunch)
iPhone 16 first weekend pre-order analysis: estimated total sales of about 37 million units; Pro series demand lower than expected (Ming-Chi Kuo)
France picks Sejourne as nominee for EU Commission after Breton clash (Reuters)
Slack now lets users add AI agents from Asana, Cohere, Adobe, Workday and more (VentureBeat)
Exclusive: How Intel lost the Sony PlayStation business (Reuters)
Apple AirPods 4 review: defying expectations (The Verge)
Online Dating Caused a Rise in US Income Inequality, Research Paper Shows (Bloomberg)

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.