Podcasts about cerebras

71PODCASTS
123EPISODES
47mAVG DURATION
1WEEKLY EPISODE
Oct 30, 2025LATEST

POPULARITY

20172018201920202021202220232024

Best podcasts about cerebras

FYI - For Your Innovation

5 episodes with cerebras

This Week in Pre-IPO Stocks

7 episodes with cerebras

Techmeme Ride Home

3 episodes with cerebras

Let's Talk AI

4 episodes with cerebras

The Generative AI Meetup Podcast

4 episodes with cerebras

Be Wealthy & Smart

2 episodes with cerebras

Latent Space: The AI Engineer Podcast â€” CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

3 episodes with cerebras

Eye On A.I.

3 episodes with cerebras

Becker’s Healthcare Podcast

2 episodes with cerebras

Latest podcast episodes about cerebras

ThursdAI - Oct 30 - From ASI in a Decade to Home Humanoids: MiniMax M2's Speed Demon, OpenAI's Bold Roadmap, and 2026 Robot Revolution

The top AI news from the past week, every ThursdAI

Play Episode Listen Later Oct 30, 2025 97:29

Hey, it's Alex! Happy Halloween friends! I'm excited to bring you this weeks (spooky) AI updates! We started the show today with MiniMax M2, the currently top Open Source LLM, with an interview with their head of eng, Skyler Miao, continued to dive into OpenAIs completed restructuring into a non-profit and a PBC, including a deep dive into a live stream Sam Altman had, with a ton of spicy details, and finally chatted with Arjun Desai from Cartesia, following a release of Sonic 3, a sub 49ms voice model! So, 2 interviews + tons of news, let's dive in! (as always, show notes in the end)Hey, if you like this content, it would mean a lot if you subscribe as a paid subscriber.Open Source AIMiniMax M2: open-source agentic model at 8% of Claude's price, 2× speed (X, Hugging Face )We kicked off our open-source segment with a banger of an announcement and a special guest. The new king of open-source LLMs is here, and it's called MiniMax M2. We were lucky enough to have Skyler Miao, Head of Engineering at Minimax, join us live to break it all down.M2 is an agentic model built for code and complex workflows, and its performance is just staggering. It's already ranked in the top 5 globally on the Artificial Analysis benchmark, right behind giants like OpenAI and Anthropic. But here's the crazy part: it delivers nearly twice the speed of Claude 3.5 Sonnet at just 8% of the price. This is basically Sonnet-level performance, at home, in open source.Skylar explained that their team saw an “impossible triangle” in the market between performance, cost, and speed—you could only ever get two. Their goal with M2 was to build a model that could solve this, and they absolutely nailed it. It's a 200B parameter Mixture-of-Experts (MoE) model, but with only 10B active parameters per inference, making it incredibly efficient.One key insight Skylar shared was about getting the best performance. M2 supports multiple APIs, but to really unlock its reasoning power, you need to use an API that passes the model's “thinking” tokens back to it on the next turn, like the Anthropic API. Many open-source tools don't support this yet, so it's something to watch out for.Huge congrats to the MiniMax team on this Open Weights (MIT licensed) release, you can find the model on HF! MiniMax had quite a week, with 3 additional releases, MiniMax speech 2.6, an update to their video model Hailuo 2.3 and just after the show, they released a music 2.0 model as well! Congrats on the shipping folks! OpenAI drops gpt-oss-safeguard - first open-weight safety reasoning models for classification ( X, HF )OpenAI is back on the open weights bandwagon, with a finetune release of their previously open weighted gpt-oss models, with gpt-oss-safeguard. These models were trained exclusively to help companies build safeguarding policies to make sure their apps remains safe! With gpt-oss-safeguards 20B and 120B, OpenAI is achieving near parity with their internal safety models, and as Nisten said on the show, if anyone knows about censorship and safety, it's OpenAI! The highlight of this release is, unlike traditional pre-trained classifiers, these models allow for updates to policy via natural language!These models will be great for businesses that want to safeguard their products in production, and I will advocate to bring these models to W&B Inference soon! A Humanoid Robot in Your Home by 2026? 1X NEO announcement ( X, Order page, Keynote )Things got really spooky when we started talking about robotics. The company 1X, which has been on our radar for a while, officially launched pre-orders for NEO, the world's first consumer humanoid robot designed for your home. And yes, you can order one right now for $20,000, with deliveries expected in early 2026.The internet went crazy over this announcement, with folks posting receipts of getting one, other folks stoking the uncanny valley fears that Sci-fi has built into many people over the years, of the Robot uprising and talking about the privacy concerns of having a human tele-operate this Robot in your house to do chores. It can handle chores like cleaning and laundry, and for more complex tasks that it hasn't learned yet, it uses a teleoperation system where a human “1X Expert” can pilot the robot remotely to perform the task. This is how it collects the data to learn to do these tasks autonomously in your specific home environment.The whole release is very interesting, from the “soft and quiet” approach 1X is taking, making their robot a 66lbs short king, draped in a knit sweater, to the $20K price point (effectively at loss given how much just the hands cost), the teleoperated by humans addition, to make sure the Robot learns about your unique house layout. The conversation on the show was fascinating. We talked about all the potential use cases, from having it water your plants and look after your pets while you're on vacation to providing remote assistance for elderly relatives. Of course, there are real privacy concerns with having a telepresence device in your home, but 1X says these sessions are scheduled by you and have strict no-go zones.Here's my prediction: by next Halloween, we'll see videos of these NEO robots dressed up in costumes, helping out at parties. The future is officially here. Will you be getting one? If not this one, when will you think you'll get one? OpenAI's Grand Plan: From Recapitalization to ASIThis was by far the biggest update about the world of AI for me this week! Sam Altman was joined by Jakub Pachocki, chief scientist and Wojciech Zaremba, a co-founder, on a live stream to share an update about their corporate structure, plans for the future, and ASI goals (Artificial Superintelligence) First, the company now has a new structure: a non-profit OpenAI Foundation governs the for-profit OpenAI Group. The foundation starts with about 26% equity and has a mission to use AI for public good, including an initial $25 billion commitment to curing diseases and building an “AI Resilience” ecosystem.But the real bombshells were about their research timeline. Chief Scientist Jakub Pachocki stated that they believe deep learning systems are less than a decade away from superintelligence (ASI). He said that at this point, AGI isn't even the right goal anymore. To get there, they're planning to have an “AI research intern” by September 2026 and a fully autonomous AI researcher comparable to their human experts by March 2028. This is insane if you think about it. As Yam mentioned, OpenAI is already shipping at an insane speed, releasing Models and Products, Sora, Atlas, Pulse, ChatGPT app store, and this is with humans, assisted by AI. And here, they are talking about complete and fully autonomous researchers, that will be infinitely more scalable than humans, in the next 2 years. The outcomes of this are hard to imagine and are honestly mindblowing. To power all this innovation, Sam revealed they have over $1.4 trillion in obligations for compute (over 30 GW). And said even that's not enough. Their aspiration is to build a “compute factory” capable of standing up one gigawatt of new compute per week, and he hinted they may need to “rethink their robotics strategy” to build the data centers fast enough. Does this mean OpenAI humanoid robots building factories?

The top AI news from the past week, every ThursdAI

Play Episode Listen Later Oct 17, 2025 94:38

Hey folks, Alex here. Can you believe it's already the middle of October? This week's show was a special one, not just because of the mind-blowing news, but because we set a new ThursdAI record with four incredible interviews back-to-back!We had Jessica Gallegos from Google DeepMind walking us through the cinematic new features in VEO 3.1. Then we dove deep into the world of Reinforcement Learning with my new colleague Kyle Corbitt from OpenPipe. We got the scoop on Amp's wild new ad-supported free tier from CEO Quinn Slack. And just as we were wrapping up, Swyx ( from Latent.Space , now with Cognition!) jumped on to break the news about their blazingly fast SWE-grep models. But the biggest story? An AI model from Google and Yale made a novel scientific discovery about cancer cells that was then validated in a lab. This is it, folks. This is the “let's f*****g go” moment we've been waiting for. So buckle up, because this week was an absolute monster. Let's dive in!ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.Open Source: An AI Model Just Made a Real-World Cancer DiscoveryWe always start with open source, but this week felt different. This week, open source AI stepped out of the benchmarks and into the biology lab.Our friends at Qwen kicked things off with new 3B and 8B parameter versions of their Qwen3-VL vision model. It's always great to see powerful models shrink down to sizes that can run on-device. What's wild is that these small models are outperforming last generation's giants, like the 72B Qwen2.5-VL, on a whole suite of benchmarks. The 8B model scores a 33.9 on OS World, which is incredible for an on-device agent that can actually see and click things on your screen. For comparison, that's getting close to what we saw from Sonnet 3.7 just a few months ago. The pace is just relentless.But then, Google dropped a bombshell. A 27-billion parameter Gemma-based model they developed with Yale, called C2S-Scale, generated a completely novel hypothesis about how cancer cells behave. This wasn't a summary of existing research; it was a new idea, something no human scientist had documented before. And here's the kicker: researchers then took that hypothesis into a wet lab, tested it on living cells, and proved it was true.This is a monumental deal. For years, AI skeptics like Gary Marcus have said that LLMs are just stochastic parrots, that they can't create genuinely new knowledge. This feels like the first, powerful counter-argument. Friend of the pod, Dr. Derya Unutmaz, has been on the show before saying AI is going to solve cancer, and this is the first real sign that he might be right. The researchers noted this was an “emergent capability of scale,” proving once again that as these models get bigger and are trained on more complex data—in this case, turning single-cell RNA sequences into “sentences” for the model to learn from—they unlock completely new abilities. This is AI as a true scientific collaborator. Absolutely incredible.Big Companies & APIsThe big companies weren't sleeping this week, either. The agentic AI race is heating up, and we're seeing huge updates across the board.Claude Haiku 4.5: Fast, Cheap Model Rivals Sonnet 4 Accuracy (X, Official blog, X)First up, Anthropic released Claude Haiku 4.5, and it is a beast. It's a fast, cheap model that's punching way above its weight. On the SWE-bench verified benchmark for coding, it hit 73.3%, putting it right up there with giants like GPT-5 Codex, but at a fraction of the cost and twice the speed of previous Claude models. Nisten has already been putting it through its paces and loves it for agentic workflows because it just follows instructions without getting opinionated. It seems like Anthropic has specifically tuned this one to be a workhorse for agents, and it absolutely delivers. The thing to note also is the very impressive jump in OSWorld (50.7%), which is a computer use benchmark, and at this price and speed ($1/$5 MTok input/output) is going to make computer agents much more streamlined and speedy! ChatGPT will loose restrictions; age-gating enables “adult mode” with new personality features coming (X) Sam Altman set X on fire with a thread announcing that ChatGPT will start loosening its restrictions. They're planning to roll out an “adult mode” in December for age-verified users, potentially allowing for things like erotica. More importantly, they're bringing back more customizable personalities, trying to recapture some of the magic of GPT-4.0 that so many people missed. It feels like they're finally ready to treat adults like adults, letting us opt-in to R-rated conversations while keeping strong guardrails for minors. This is a welcome change, and we've been advocating for this for a while, and it's a notable change from the XAI approach I covered last week. Opt in for adults with verification while taking precautions vs engagement bait in the form of a flirty animated waifu with engagement mechanics. Microsoft is making every windows 11 an AI PC with copilot voice input and agentic powers (Blog,X)And in breaking news from this morning, Microsoft announced that every Windows 11 machine is becoming an AI PC. They're building a new Copilot agent directly into the OS that can take over and complete tasks for you. The really clever part? It runs in a secure, sandboxed desktop environment that you can watch and interact with. This solves a huge problem with agents that take over your mouse and keyboard, locking you out of your own computer. Now, you can give the agent a task and let it run in the background while you keep working. This is going to put agentic AI in front of hundreds of millions of users, and it's a massive step towards making AI a true collaborator at the OS level.NVIDIA DGX - the tiny personal supercomputer at $4K (X, LMSYS Blog)NVIDIA finally delivered their promised AI Supercomputer, and while the excitement was in the air with Jensen hand delivering the DGX Spark to OpenAI and Elon (recreating that historical picture when Jensen hand delivered a signed DGX workstation while Elon was still affiliated with OpenAI). The workstation was sold out almost immediately. Folks from LMSys did a great deep dive into specs, all the while, folks on our feeds are saying that if you want to get the maximum possible open source LLMs inference speed, this machine is probably overpriced, compared to what you can get with an M3 Ultra Macbook with 128GB of RAM or the RTX 5090 GPU which can get you similar if not better speeds at significantly lower price points. Anthropic's “Claude Skills”: Your AI Agent Finally Gets a Playbook (Blog)Just when we thought the week couldn't get any more packed, Anthropic dropped “Claude Skills,” a huge upgrade that lets you give your agent custom instructions and workflows. Think of them as expertise folders you can create for specific tasks. For example, you can teach Claude your personal coding style, how to format reports for your company, or even give it a script to follow for complex data analysis.The best part is that Claude automatically detects which “Skill” is needed for a given task, so you don't have to manually load them. This is a massive step towards making agents more reliable and personalized, moving beyond just a single custom instruction and into a library of repeatable, expert processes. It's available now for all paid users, and it's a feature I've been waiting for. Our friend Simon Willison things skills may be a bigger deal than MCPs!

E233: xAI $20B GPU financing structure nears close; Reflection AI raises $2B round builds open-source frontier lab; n8n $180M series C drives 7x valuation jump to $2.5B; Base Power $1B raise targets 200k home batteries by 2027; ICE rakes Polymarket stake

This Week in Pre-IPO Stocks

Play Episode Listen Later Oct 10, 2025 11:50

Send us a textInvest in pre-IPO stocks with AG Dillon & Co. Contact aaron.dillon@agdillon.com to learn more. Financial advisors only.00:08 - xAI $20B GPU Financing Structure Nears Close01:42 - Reflection AI Raises $2B Round Builds Open-Source Frontier Lab02:37 - n8n $180M Series C Drives 7x Valuation Jump to $2.5B03:26 - Base Power $1B Raise Targets 200k Home Batteries by 202704:13 - ICE Takes Polymarket Stake as Prediction Volume +700% YTD05:01 - Cerebras $1.1B Series G at $8.1B Valuation Ahead of IPO05:51 - BVNK Acquired by Coinbase-Mastercard06:41 - OpenAI ChatGPT Go Expands to 16 Asian Markets07:31 - OpenAI $18B AMD GPU Deal Diversifies Beyond Nvidia08:19 - OpenAI Sora 2 Hits 30M Users in Week One08:57 - OpenAI Instant Checkout Allows eCommerce in ChatGPT09:47 - OpenAI AgentKit Launch Accelerates AI Agent Economy10:30 - Google Gemini Enterprise Targets Copilot's 15M Seats11:12 - SpaceX | Starlink Revenue +40%, 120 Launches in 2025

OpenAI als Plattform, Bubble Time & Peak Social Media #499

Doppelgänger Tech Talk

Play Episode Listen Later Oct 7, 2025 96:30

OpenAI launcht App-SDK mit Expedia, Spotify und Canva als erste Partner. AMD-Deal: OpenAI sichert sich 10% Aktienanteil für 6 Gigawatt GPU-Bestellung. Sora-Update mit umstrittenem Opt-Out für Rechteinhaber. ROI-Übernahme stärkt OpenAIs Finanz-App-Ambitionen. Small Modular Reactors: Financial Times warnt vor überhöhten Kosten bei Atom-Renaissance. Cerebras zieht IPO zurück nach neuer Finanzierungsrunde. Deloitte muss Honorar wegen KI-generiertem Report teilweise zurückzahlen. Peak Social Media: Jüngste Generation reduziert Nutzung. Unterstütze unseren Podcast und entdecke die Angebote unserer Werbepartner auf ⁠⁠⁠⁠⁠doppelgaenger.io/werbung⁠⁠⁠⁠⁠. Vielen Dank! Philipp Glöckler und Philipp Klöckner sprechen heute über: (00:00:00) OpenAI Dev Day und App-SDK (00:06:38) Privacy-Probleme bei App-Daten-Sharing (00:10:10) Sora Social Network Update (00:11:57) Opt-Out Logik für Rechteinhaber (00:17:47) Microsoft-OpenAI Beziehungsdynamik (00:19:04) AMD-Deal (00:32:31) Hubspot-Kurssprung nach OpenAI-Erwähnung (00:34:38) OpenAIs Plattform-Strategie (00:52:19) ROI-Finanz-App Übernahme (00:57:25) Small Modular Reactors Kostenanalyse (01:01:08) Data Center Asset-Verkäufe in Europa (01:07:46) Deloitte KI-Report Skandal (01:14:51) ICE-Block App aus App Stores entfernt (01:19:06) AppLovin SEC-Ermittlungen (01:21:48) N26 Kreditkarten (01:25:52) Peak Social Media Studie (01:31:32) Online-Scam-Schäden 17 Milliarden USA Shownotes Einführung von Apps in ChatGPT und das neue Apps SDK – openai.comSora-Update #1 - Sam Altman – blog.samaltman.comAMD-Aktie steigt um 23% durch OpenAI-Interesse an KI-Chiphersteller – cnbc.comOpenAI macht Börse nervös – linkedin.comOpenAI verstärkt Fokus auf personalisierte Verbraucher-KI durch Übernahme – techcrunch.comUSA und Investoren setzen auf unbewiesene Nukleartechnologie, warnen Experten – ft.comKI-Chiphersteller Cerebras zieht Börsengang zurück – cnbc.comEuropäische Private-Equity-Firmen zielen auf 17 Mrd. € in Rechenzentrum-Deals ab – ft.comDeloitte zahlt Geld an Regierung zurück nach KI-Nutzung im Bericht – theguardian.comArizona: Anstieg der VPN-Nachfrage nach Altersverifizierungsgesetzen – tomsguide.comApple entfernt ICE-Tracking-App nach Druck der Bondi-Justizministerium – foxbusiness.comElon Musk empfiehlt, Netflix-Abos zu kündigen. – cnbc.comAppLovin von SEC wegen Datenerfassungspraktiken untersucht – bloomberg.comNordkoreanische Agenten leiten bis zu 1 Milliarde Dollar in Kim Jong Uns Atomprogramm. – fortune.comBetrügerische Abbuchungen bei N26-Kunden – linkedin.comInstagram Rings Awards Programm von Meta angekündigt – hollywoodreporter.com$1 Trump-Münze ist „echt“, sagt US-Schatzmeister – cnn.comHaben wir den Höhepunkt der sozialen Medien überschritten? – ft.comOnline-Betrug und Angriffe in Amerika – pewresearch.org

20VC: Cerebras CEO on Why Raise $1BN and Delay the IPO | NVIDIA Showing Signs They Are Worried About Growth | Concentration of Value in Mag7: Will the AI Train Come to a Halt | Can the US Supply the Energy for AI with Andrew Feldman

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Oct 6, 2025 64:59

Andrew Feldman is Co-Founder & CEO of Cerebras, building the world's fastest AI inference and training. Cerebras recently closed a $1.1BN Series G round at an $8.1 billion valuation, backed by top names including Fidelity, Atreides, Tiger Global, Valor Equity and 1789 Capital. Under his leadership, they've leapfrogged GPU limits in inference, operate at trillions of tokens per month, and are filing to go public soon. AGENDA: 02:43 Why We Did Not IPO and Raised $1BN From Fidelity 05:03 Analysis of Chip and Compute Landscape Today 07:14 NVIDIA Showing Signs They Are Running Out of Ideas 13:57 The Real Questions to Ask on Chip Depreciation 24:54 Energy Requirements for AI: Is it Feasible? 29:25 Mag7 Value Concentration: Feature or a Bug 31:57 Talent is the Bottleneck and Trump Makes it Worse 32:55 The War for Talent: Secrets No One Sees 34:22 Evaluating the Data Centre Economy: Many Will Lose Money 38:01 Three Changes the US Could Make to Beat China in AI 42:30 Why 80% of our Revenues are in the UAE 47:26 Quick Fire Questions 58:59 Why Work Life Balance is Total BS

First, They Ignore You, Then They Laugh, Then They Fight…Then You Win

Digital Currents

Play Episode Listen Later Oct 3, 2025 48:13

This ABCD Roundup covers the SEC's push to accelerate tokenized stocks and SWIFT's collaboration with global banks on a shared blockchain ledger for instant payments, moves that could fundamentally reshape traditional finance. We also examine Bitcoin's recent price and volume swings, as well as the broader implications of onshoring and the U.S. crypto cycle. Finally, we cover Cerebras' $1.1 billion raise at an $8.1 billion valuation. To learn more, visit us on the web at https://www.morgancreekcap.com/morgan-creek-digital/. To speak to a team member or sign up for additional content, please email mcdigital@morgancreekcap.com Legal Disclaimer This podcast is for informational purposes only and should not be construed as investment advice or a solicitation for the sale of any security, advisory, or other service. Investments related to the themes and ideas discussed may be owned by funds managed by the host and podcast guests. Any conflicts mentioned by the host are subject to change. Listeners should consult their personal financial advisors before making any investment decisions.

bitcoin laugh sec investments swift cerebras

E232: Wealthfront's robo-advisor IPO sprint: $88B AUM, $194M profits; OpenAI's instant checkout: Etsy surge 16%, taps 700M users; OpenAI H1 2025: $4.3B sales vs $2.5B burn, breakeven by 2026?; Cerebras' $1.1B pre-IPO: $8.1B val, Q2 rev 11x YoY

This Week in Pre-IPO Stocks

Play Episode Listen Later Oct 3, 2025 18:14

Send us a text00:00 - Intro01:16 - Wealthfront's Robo-Advisor IPO Sprint: $88B AUM, $194M Profits02:43 - OpenAI's Instant Checkout: Etsy Surge 16%, Taps 700M Users03:51 - OpenAI H1 2025: $4.3B Sales vs $2.5B Burn, Breakeven by 2026?05:23 - Cerebras' $1.1B Pre-IPO Raise: $8.1B Val, Q2 Rev 11x YoY06:39 - Black Forest Labs' $4B AI Image Raise: FLUX.1 Downloads 5M+07:59 - Rebellions' $1.4B Series C: 3x Rev YoY, Arm GPU Co-Dev09:28 - TikTok US Divestiture: $14B Val at 1.4x P/S, ByteDance Keeps 50% Profits11:04 - Meta's Rivos Acquisition: Cuts GPU Reliance 20-30%, $10B Annual Spend12:27 - Anthropic's Claude Sonnet 4.5: 30-Hour Autonomy, $5B ARR Run-Rate14:01 - Perplexity's Free Comet Browser: ARR Nears $200M, 50% Query Boost15:24 - Stripe Bridge's Stablecoin Open Issuance: $300B Market to $2T by 202816:52 - Thinking Machines' Tinker API: 95% Nondeterminism Fix, $12B Seed Val

Kfz-Zulassung & Kernfusion | Yann LeCun vs. Meta Research #498

Doppelgänger Tech Talk

Play Episode Listen Later Oct 3, 2025 84:31

Deutschland startet 80-Punkte-Modernisierungsagenda mit KI-Verwaltung und 24h-Unternehmensgründung. Fusion 2040: 2 Milliarden für erstes deutsches Fusionskraftwerk. Meta kauft Chip-Startup Rivos, Yann LeCun droht mit Rücktritt wegen Publikationsbeschränkungen. Mira Muratis Thinking Machines launcht Tinker-Platform für Model-Finetuning. Cerebras sammelt 1 Milliarde auf 8 Milliarden Bewertung als Nvidia-Konkurrent. Microsoft warnt vor KI-Biowaffen-Bedrohungen. Northern Data Ermittlungen wegen Mehrwertsteuerbetrug. DeepL strebt 5-Milliarden-IPO an. Trump launcht TrumpRX-Medikamenten-Website. Unterstütze unseren Podcast und entdecke die Angebote unserer Werbepartner auf ⁠⁠⁠⁠⁠doppelgaenger.io/werbung⁠⁠⁠⁠⁠. Vielen Dank! Philipp Glöckler und Philipp Klöckner sprechen heute über: (00:00:00) Tag Der Deutschen Einheit - Quiz (00:09:29) Modernisierungsagenda Deutschland (00:19:40) Fusion 2040 Kraftwerk-Plan (00:27:15) Meta (00:28:00) Yann LeCun vs Meta Research (00:34:15) Mira Murati Tinker Platform (00:38:54) Cerebras 1 Mrd. Funding (00:41:39) AI Productivity Index APEX (00:42:45) OpenAI 500 Mrd. Bewertung (00:50:49) Microsoft KI-Biowaffen-Warnung (00:52:13) TrumpRX Medikamenten-Website (01:05:11) Northern Data Ermittlungen (01:11:30) DeepL 5-Milliarden-IPO Shownotes Modernisierungsagenda für ein schnelles Deutschland – bmds.bund.de Homepage - Aktionsplan für erstes Fusionskraftwerk in Deutschland – bmftr.bund.de Meta-Änderung beim Veröffentlichen von Forschung sorgt für Aufruhr in der KI-Gruppe – theinformation.com Tinker: Startup von OpenAI Ex-CTO launcht erstes Produkt – the-decoder.de Meta erwirbt Chips-Startup Rivos für KI-Initiative – bloomberg.com Ein Jahr nach IPO-Anmeldung: Cerebras Systems sammelt $1,1 Mrd. ein – techcrunch.com Einführung von APEX: Der KI-Produktivitätsindex – mercor.com Bloomberg - Bist du ein Roboter? – bloomberg.com Meta Ads target – wsj.com TrumpRx – wsj.com Microsoft warnt vor KI-gestützten "Zero Day"-Bedrohungen in der Biologie – technologyreview.com Northern Data – bloomberg.com DeepL – bloomberg.com Break 30 | Schaffen Philipp Gloeckler und Daniel Voigt die 29 im GLC Nordkirchen? – youtu.be

Stocks Shrug Off Looming Government Shutdown

MKT Call

Play Episode Listen Later Sep 30, 2025 7:12

MRKT Matrix - Tuesday, September 30th S&P 500 rises to close out winning month, shakes off government shutdown concerns (CNBC) Government on track to shut down after midnight with no funding deal in sight (CNBC) A Traders' Guide to US Markets If the Government Shuts Down (Bloomberg) Trump, Pfizer agree to lower U.S. drug prices, exempt company from pharma tariffs (CNBC) Nvidia's market cap tops $4.5 trillion after string of AI infrastructure deals (CNBC) Nvidia challenger Cerebras raises $1.1bn ahead of IPO (FT) CoreWeave signs $14 billion AI infrastructure deal with Meta (CNBC AI Data Centers Are Sending Power Bills Soaring (Bloomberg) OpenAI Launches Video Generator App to Rival TikTok and YouTube (WSJ) --- Subscribe to our newsletter: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://riskreversalmedia.beehiiv.com/subscribe⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ MRKT Matrix by RiskReversal Media is a daily AI powered podcast bringing you the top stories moving financial markets Story curation by RiskReversal, scripts by Perplexity Pro, voice by ElevenLabs

ai guide pfizer stocks shutdowns looming government shutdown shrug elevenlabs cerebras

CommanderAI building Salesforce for the waste management industry, also, a year after filing to IPO, still-private Cerebras Systems raises $1.1B

TechCrunch Startups – Spoken Edition

Play Episode Listen Later Sep 30, 2025 8:25

CommanderAI launched in early 2024 as a customer relationship manager and sales prospecting platform built for waste management — and other industrial services like dumpster rentals and industrial recyclers – to fill that gap. Also, Silicon Valley-based Cerebras announced it raised a $1.1 billion Series G round on Tuesday that valued the AI hardware company at $8.1 billion. The round was co-led by Fidelity and Atreides Management with participation from Tiger Global, Valor Equity Partners, and 1789 Capital, among others. Learn more about your ad choices. Visit podcastchoices.com/adchoices

ai silicon valley capital private salesforce raises filing waste management tiger global cerebras cerebras systems series g valor equity partners

Nvidia Challenger Cerebras' Funding, TikTok Update, Navan's Signal About Tech IPOs | Sep 22, 2025

The Information's 411

Play Episode Listen Later Sep 22, 2025 27:13

The Information's Sylvia Varnham O'Regan talks with TITV Host Akash Pasricha about the latest developments in the TikTok deal, including the Murdoch family's potential involvement and new details on the algorithm. We also talk with Mostly Metrics newsletter Founder CJ Gustafson about Navan's S-1 filing and what it reveals about the tech IPO market. Lastly, we speak to Cerebras' CEO Andrew Feldman about their delayed IPO and rivalry with NVIDIA.Articles discussed on this episode: https://www.theinformation.com/briefings/trump-sign-executive-order-week-details-tiktok-dealTITV airs on YouTube, X and LinkedIn at 10AM PT / 1PM ET. Or check us out wherever you get your podcasts.Subscribe to: - The Information on YouTube: https://www.youtube.com/@theinformation4080/?sub_confirmation=1- The Information: https://www.theinformation.com/subscribe_hSign up for the AI Agenda newsletter: https://www.theinformation.com/features/ai-agenda

tiktok tech funding signal ipo nvidia challenger murdoch navan cerebras andrew feldman

"Cherry Blossoms" Bloom of IPO Spring: Klarna Debut, Trends & IPOs to Come

TD Ameritrade Network

Play Episode Listen Later Sep 10, 2025 7:17

Connor Group's Jim Neesen goes in-depth on the IPO market, including one of his clients that debuted Wednesday: Klarna. He says the fintech company has a "great growth story" in one of the year's most-anticipated public debuts. Jim adds that 2025 is shaping up to be the best year for IPOs since 2021 and explains how the seeds of this year's "cherry blossoms" for the IPO spring were set in place. He tells investors to watch for debuts down the road from names like Databricks, Cerebras, and Stubhub.======== Schwab Network ========Empowering every investor and trader, every market day. Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribeDownload the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watchWatch on Vizio - https://www.vizio.com/en/watchfreeplus-exploreWatch on DistroTV - https://www.distro.tv/live/schwab-network/Follow us on X – https://twitter.com/schwabnetworkFollow us on Facebook – https://www.facebook.com/schwabnetworkFollow us on LinkedIn - https://www.linkedin.com/company/schwab-network/ About Schwab Network - https://schwabnetwork.com/about

spring debut ios bloom ipo klarna sling cherry blossoms databricks stubhub vizio cerebras market minute

RAISE Summit 2025 : le grand débrief !

Le digital pour tous #BonjourPPC

Play Episode Listen Later Aug 16, 2025 49:16

C'est le grand final !Après plusieurs épisodes de débriefing sur certaines conférences du Raise Summit 2025, voici LE dernier débrief, le plus complet, le plus riche, réalisé par PPC avec ses mates pour vous donner des clés de lecture uniques.Autour de la table :

ceo tech raise summit ia bienvenue technologie davos visitez legrand adn ppc autour ausha intelligence artificielle le grand d vivatech transformation digitale cerebras zibi

The Roaring 20's Chip Wars—A Landmark Moment In The AI Boom, Julie Choi - CMO Cerebras Systems

The Reboot Chronicles with Dean DeBiase

Play Episode Listen Later May 20, 2025 31:45

Most everyone remembers the tale of David and Goliath, one of the greatest underdog stories. We have seen the story repeat itself in business, across sectors from food and car companies to consumer products and tech. But what about AI? The AI computer chip wars are heating up—on fire actually—as companies attempt to topple the Goliath that Nvidia has become. Our David for this episode is Cerebras Systems, one of the rapidly-growing AI chip powerhouses that has built some of the world's fastest supercomputers. This increase in speed wasn't just a few percent faster either, tests are showing that Cerebras' work came in up to 75 times faster than their competitors. Cerebras CMO, Julie Choi, joins this episode of The Reboot Chronicles Show for an insider's look into how they created the next-gen technology which is driving their record-breaking success. Julie has seen Cerebras through their $700 million in funding, filing to take the company public to raise $1B, and building strategic partnerships with businesses like the Mayo Clinic and the Department of Energy. Now, on a mission to accelerate generative AI by building a new class of AI supercomputer, that could change the course of progress, Julie shares stories about both the nerdy side of chips and the future of technology as this industry hits breakneck speeds around the globe.

ai energy wars chip goliath nvidia mayo clinic 1b landmark roaring choi ai boom cerebras cerebras systems

Beyond Big Chips: Cerebras on Inference and AI

Tech Disruptors

Play Episode Listen Later May 13, 2025 39:43

“We want every layer — chip, system, software — because when you own the stack you can outrun a GPU cluster by 40-70x,” Cerebras CEO Andrew Feldman says. In this episode of Tech Disruptors, Cerebras returns to the Bloomberg Intelligence podcast studios as Feldman joins Bloomberg Intelligence's Kunjan Sobhani and Mandeep Singh to explain the progress from “biggest chip” to “fastest inference cloud.” Feldman unpacks the WSE-3 upgrade, six new data-center builds and fresh Meta and IBM deals that aim to deliver sub-second answers at a fraction of GPU cost, plus Feldman's views on scaling laws, synthetic data and the looming power crunch.

ibm chips feldman gpu inference bloomberg intelligence cerebras wse

Bittensor's Rise, Meta's Llama Goes Cloud, & AI Now Writes Your Code | E2119

This Week in Startups

Play Episode Listen Later May 1, 2025 64:49

Today's show: Jason, Alex, Lon and Special Guest Mark Jeffrey of Hash Rate, cover the explosive rise of Bittensor, a decentralized AI compute network some are calling the “third great coin” after Bitcoin and Ethereum, explore Meta's bold move to host its open-source LLaMA models via partnerships with Groq and Cerebras—potentially setting the stage for a future AWS competitor—and unpack shocking revelations from the Wall Street Journal about Meta AI chatbots engaging in inappropriate conversations with underage users. Plus, we explore how AI is now writing up to 30% of code at major tech firms like Google and Microsoft, signaling a radical shift in how software gets built.Timestamps:(0:00) Episode Teaser(1:28) Introduction to the episode and guests(2:31) Mark Jeffrey's involvement in crypto and Bittensor project(5:06) Bitcoin vs. Bittensor: Stability and efficiency(10:20) Hubspot for Startups - Visit hubspot.com/startups and join the founders who are turning growth challenges into opportunities.(15:16) Governance, staking, and starting a subnet in Bittensor(17:52) Exploring Ready.AI and its impact on the future of AI(20:08) Squarespace - Use offer code TWIST to save 10% off your first purchase of a website or domain at https://www.Squarespace.com/TWIST(27:17) Trump's influence on crypto regulations and the stablecoin act(30:03) Oracle - Try OCI and save up to 50% on your cloud bill at https://w⁠⁠⁠⁠ww.oracle.com/twist(38:42) AI-driven VC outreach and Alexis Ohanian's advice on cold emailing(45:03) Introducing LayerNext with CEO Buddhika Madduma and customer onboarding challenges(49:04) Ikigai for startups and balancing bespoke work with scalable product development(55:38) Strategies for securing lighthouse customers and the 'bear hug' approach(56:43) Reddit rapid response: Debating the return to office for young professionals(1:04:05) Closing remarks and guest plugsSubscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.comCheck out the TWIST500: https://www.twist500.comSubscribe to This Week in Startups on Apple: https://rb.gy/v19fcpLinks from episode:Hash Rate Podcast: https://www.youtube.com/@markjeffreyLayerNext: https://www.layernext.ai/r/antiwork: https://www.reddit.com/r/antiwork/Follow Mark:X: https://x.com/markjeffreyLinkedIn: https://www.linkedin.com/in/markjeffrey/Follow Lon:X: https://x.com/lonsFollow Alex:X: https://x.com/alexLinkedIn: ⁠https://www.linkedin.com/in/alexwilhelmFollow Jason:X: https://twitter.com/JasonLinkedIn: https://www.linkedin.com/in/jasoncalacanisThank you to our partners:(10:20) Hubspot for Startups - Visit hubspot.com/startups and join the founders who are turning growth challenges into opportunities.(20:08) Squarespace - Use offer code TWIST to save 10% off your first purchase of a website or domain at https://www.Squarespace.com/TWIST(30:03) Oracle - Try OCI and save up to 50% on your cloud bill at https://w⁠⁠⁠⁠ww.oracle.com/twistGreat TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarlandCheck out Jason's suite of newsletters: https://substack.com/@calacanisFollow TWiST:Twitter: https://twitter.com/TWiStartupsYouTube: https://www.youtube.com/thisweekinInstagram: https://www.instagram.com/thisweekinstartupsTikTok: https://www.tiktok.com/@thisweekinstartupsSubstack: https://twistartups.substack.comSubscribe to the Founder University Podcast: https://www.youtube.com/@founderuniversity1916

HPC News Bytes – 20250407

@HPCpodcast with Shahin Khan and Doug Black

Play Episode Listen Later Apr 7, 2025

- US Tariffs and Technology Sector - Intel-TSMC Joint Venture? - DARPA fuels Waferscale co-packaged optics via Cerebras and Ranovus - Sandia National Lab to test laser-based photonic cooling via Maxwell Labs - 8 Tbps optical UCIe chiplet for scale-up by Ayar Labs - Lightmatter 3D co-packaged optics [audio mp3="https://orionx.net/wp-content/uploads/2025/04/HPCNB_20250407.mp3"][/audio] The post HPC News Bytes – 20250407 appeared first on OrionX.net.

learning ai deep chips physics quantum computing darpa simulations hpc us tariffs supercomputing cerebras news bytes

20VC: AI Chip Wars: How Cerebras Plans to Topple NVIDIA's Dominance | Why We Have Not Reached Scaling Laws in AI | What Happens to the Cost of Inference | How We Underestimate China and Shouldn't Sell To Them with Andrew Feldman

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Mar 24, 2025 63:21

Andrew Feldman is the Co-Founder and CEO @ Cerebras, the fastest AI inference + training platform in the world. In Sept 2024 the company filed to go public off the back of a rumoured $1BN deal with G42 in the UAE. Andrew is the leading expert for all things inference. In Today's Episode We Discuss: 04:23 Where Was AI Landscape in 2015 When Cerebras Founded 05:57 NVIDIA's Biggest Strength Has Become Their Biggest Weakness 07:09 What Happens to the Cost of Inference? 08:55 Why Are AI Algorithms So Inefficient? 20:30 Why is it Total BS That We Have Hit Scaling Laws? 23:07 What Will Be the Ratio of Synthetic to Human Data Used in 5 Years? 31:37 What Specifically Was So Impressive About Deepseek? 31:51 Why is Distillation Not Wrong and OpenAI Need to Look in the Mirror? 32:34 Where Will Value Accrue in a World of AI? 34:08 How Will NVIDIA's Market Position Change Over the Next Five Years? 39:59 Why is the CUDA Lockin for NVIDIA BS? What is Their Weakness? 40:46 Why is Trump Better for Business than Biden? 49:41 Do We Underestimate China in a World of AI? 52:33 What is the Most Underappreciated Segment of AI? 54:00 Quickfire Round

Beyond GPUs: Cerebras' Wafer-Scale Engine for Lightning-Fast AI Inference

The Data Exchange with Ben Lorica

Play Episode Listen Later Mar 6, 2025 39:26

Hagay Lupesko is the SVP for AI Inference at Cerebras Systems. Subscribe to the Gradient Flow Newsletter

scale lightning svp engine detailed gpus inference wafer cerebras cerebras systems

Episode 506: Put It On Ice

Software Defined Talk

Play Episode Listen Later Feb 14, 2025 65:01

This week, we discuss how LLMs are changing software development, OpenAI's deep research, and why the Gartner Hype Cycle persists. Plus, a business plan built entirely around ice! Watch the YouTube Live Recording of Episode (https://www.youtube.com/watch?v=JyTb1v4-oZQ) 506 (https://www.youtube.com/watch?v=JyTb1v4-oZQ) Runner-up Titles I bought the DevOps I'm always looking for tomatillas You're making a strong case for RTO The CEO of ice. The VP of Ice Machines. What you are doing is toil I think about WALL-E every day Eliminating the first draft Rundown The End of Programming as We Know It (https://www.oreilly.com/radar/the-end-of-programming-as-we-know-it/) Apple Earnings, OpenAI Deep Research, The Unbundling of Substantiation (https://stratechery.com/2025/apple-earnings-openai-deep-research-the-unbundling-of-substantiation/) Gartner's Grift Is About To Unravel (https://dx.tips/gartner) Relevant to your Interests Google goes heavy on investment but light on detail (https://on.ft.com/3Q619Wc) Researchers created an open rival to OpenAI's o1 'reasoning' model for under $50 (https://techcrunch.com/2025/02/05/researchers-created-an-open-rival-to-openais-o1-reasoning-model-for-under-50/) Dumping open source for proprietary rarely pays off: Better to stick a fork in it (https://www.zdnet.com/article/dumping-open-source-for-proprietary-rarely-pays-off-better-to-stick-a-fork-in-it/) Report: OpenAI's ex-CTO, Mira Murati, has recruited OpenAI co-founder John Schulman (https://techcrunch.com/2025/02/06/report-openais-ex-cto-mira-murati-has-recruited-openai-co-founder-john-schulman/) Broadcom and Nvidia are capitalizing on the return of the winner-take-all AI trade (https://sherwood.news/markets/broadcom-and-nvidia-are-capitalizing-on-the-return-of-the-winner-take-all-ai/) When will remote workers see their pay cut? (https://www.economist.com/finance-and-economics/2025/02/06/when-will-remote-workers-see-their-pay-cut) How Microsoft Releases Changes to Azure - Safe Deployment (https://luke.geek.nz/azure/azure-platform-release-process/) Servers can last a long time (https://world.hey.com/dhh/servers-can-last-a-long-time-165c955c) Matt Mullenweg: WordPress Controversy, Future of Open Source AI, and Navigating Backlash (https://theloganbartlettshow.substack.com/p/matt-mullenweg-wordpress-controversy?utm_source=post-email-title&publication_id=1161376&post_id=156676755&utm_campaign=email-post-title&isFreemail=true&r=yr5ci&triedRedirect=true&utm_medium=email) Turn/River Agrees to Buy SolarWinds Years After Cyber-Attack (https://finance.yahoo.com/news/turn-river-agrees-buy-solarwinds-174426287.html) Gartner's Grift Is About To Unravel (https://dx.tips/gartner) Developer creates endless Wikipedia feed to fight algorithm addiction (https://arstechnica.com/gadgets/2025/02/new-wikitok-web-app-allows-infinite-tiktok-style-scroll-of-wikipedia/) You Didn't Notice MP3 Is Now Free (https://idiallo.com/blog/listen-mp3-is-free?ref=labnotes.org&utm_source=substack&utm_medium=email) The DevTools Ceiling: great as Open Source, AND a Terrible Business (https://dx.tips/ceiling) The VGHF Library opens in early access | Video Game History Foundation (https://gamehistory.org/vghf-library-launch/) Docker Announces Don Johnson as New CEO, Succeeding Scott Johnston (https://www.globenewswire.com/news-release/2025/02/12/3025262/0/en/Docker-Announces-Don-Johnson-as-New-CEO-Succeeding-Scott-Johnston.html) Developers Unhappy With Tool Sprawl, Lagging Data, Long Waits (https://thenewstack.io/developers-unhappy-with-tool-sprawl-lagging-data-long-waits/) Thomson Reuters Wins First Major AI Copyright Case in the US (https://www.wired.com/story/thomson-reuters-ai-copyright-lawsuit/) Apple brings heart rate monitoring (https://techcrunch.com/2025/02/11/apple-brings-heart-rate-monitoring-to-powerbeats-pro-2/) Nonsense TabBoo (https://tabboo.xyz/) Conferences DevOpsDayLA (https://www.socallinuxexpo.org/scale/22x/events/devopsday-la) at SCALE22x (https://www.socallinuxexpo.org/scale/22x), March 6-9, 2025, discount code DEVOP VMUG NL (https://vmugnl.nl), March 12th, Coté speaking. DevOpsDays Chicago (https://devopsdays.org/events/2025-chicago/welcome/), March 18th, 2025. SREday London (https://sreday.com/2025-london-q1/), March 27-28, Coté speaking (https://sreday.com/2025-london-q1/Michael_Cote_VMware__Pivotal_Platform_Engineering_for_Private_Cloud). 10% with code LDN10 Monki Gras (https://monkigras.com/), London, March 27-28, Coté speaking. Cloud Foundry Day US (https://events.linuxfoundation.org/cloud-foundry-day-north-america/), May 14th, Palo Alto, CA NDC Oslo (https://ndcoslo.com/), May 21-23, speaking. KubeCon EU (https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/), April 1-4, London. SDT News & Community Join our Slack community (https://softwaredefinedtalk.slack.com/join/shared_invite/zt-1hn55iv5d-UTfN7mVX1D9D5ExRt3ZJYQ#/shared-invite/email) Email the show: questions@softwaredefinedtalk.com (mailto:questions@softwaredefinedtalk.com) Free stickers: Email your address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) Follow us on social media: Twitter (https://twitter.com/softwaredeftalk), Threads (https://www.threads.net/@softwaredefinedtalk), Mastodon (https://hachyderm.io/@softwaredefinedtalk), LinkedIn (https://www.linkedin.com/company/software-defined-talk/), BlueSky (https://bsky.app/profile/softwaredefinedtalk.com) Watch us on: Twitch (https://www.twitch.tv/sdtpodcast), YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured), Instagram (https://www.instagram.com/softwaredefinedtalk/), TikTok (https://www.tiktok.com/@softwaredefinedtalk) Book offer: Use code SDT for $20 off "Digital WTF" by Coté (https://leanpub.com/digitalwtf/c/sdt) Sponsor the show (https://www.softwaredefinedtalk.com/ads): ads@softwaredefinedtalk.com (mailto:ads@softwaredefinedtalk.com) Recommendations Brandon: Oxide and Friends | AI Disruption: DeepSeek and Cerebras (https://oxide-and-friends.transistor.fm/episodes/ai-disruption-deepseek-and-cerebras) Matt: Beats Fit Pro (https://www.beatsbydre.com/earbuds/beats-fit-pro/MK2J3/sage-gray) Coté: Alfred (https://www.alfredapp.com). PoliticalWire.com (https://members.politicalwire.com/referral/30klklm). Photo Credits Header (https://unsplash.com/photos/crystal-gemstones-PteeDvACFak)

AI Disruption: DeepSeek and Cerebras

Oxide and Friends

Play Episode Listen Later Feb 6, 2025 91:20 Transcription Available

DeepSeek was a disruptive surprise at the start of 2025--an open weights model trained at a fraction of the cost of previous models. Bryan and Adam were joined by Andy Hock and James Wang from Cerebras, whose wafer-scale silicon executes these models faster than is possible with any number of GPUs.In addition to Bryan Cantrill and Adam Leventhal, we were joined by Andy Hock, and James Wang, both of Cerebras.Some of the topics we hit on, in the order that we hit them:interactive inference with Cerebras100x Defect Tolerance: How Cerebras Solved the Yield ProblemTweet from Eric MeijerOuroborusQuine RelaySimon Willison's Weblog when DeepSeek fell from spaceTweet from Naveen RaoBONUSMST3K archiveIf we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!

pr discord disruption mastodon gpus weblogs cerebras james wang adam leventhal

Latent.Space 2024 Year in Review

Latent Space: The AI Engineer Podcast â€” CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Dec 31, 2024 111:07

Applications for the 2025 AI Engineer Summit are up, and you can save the date for AIE Singapore in April and AIE World's Fair 2025 in June.Happy new year, and thanks for 100 great episodes! Please let us know what you want to see/hear for the next 100!Full YouTube Episode with Slides/ChartsLike and subscribe and hit that bell to get notifs!Timestamps* 00:00 Welcome to the 100th Episode!* 00:19 Reflecting on the Journey* 00:47 AI Engineering: The Rise and Impact* 03:15 Latent Space Live and AI Conferences* 09:44 The Competitive AI Landscape* 21:45 Synthetic Data and Future Trends* 35:53 Creative Writing with AI* 36:12 Legal and Ethical Issues in AI* 38:18 The Data War: GPU Poor vs. GPU Rich* 39:12 The Rise of GPU Ultra Rich* 40:47 Emerging Trends in AI Models* 45:31 The Multi-Modality War* 01:05:31 The Future of AI Benchmarks* 01:13:17 Pionote and Frontier Models* 01:13:47 Niche Models and Base Models* 01:14:30 State Space Models and RWKB* 01:15:48 Inference Race and Price Wars* 01:22:16 Major AI Themes of the Year* 01:22:48 AI Rewind: January to March* 01:26:42 AI Rewind: April to June* 01:33:12 AI Rewind: July to September* 01:34:59 AI Rewind: October to December* 01:39:53 Year-End Reflections and PredictionsTranscript[00:00:00] Welcome to the 100th Episode![00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co host Swyx for the 100th time today.[00:00:12] swyx: Yay, um, and we're so glad that, yeah, you know, everyone has, uh, followed us in this journey. How do you feel about it? 100 episodes.[00:00:19] Alessio: Yeah, I know.[00:00:19] Reflecting on the Journey[00:00:19] Alessio: Almost two years that we've been doing this. We've had four different studios. Uh, we've had a lot of changes. You know, we used to do this lightning round. When we first started that we didn't like, and we tried to change the question. The answer[00:00:32] swyx: was cursor and perplexity.[00:00:34] Alessio: Yeah, I love mid journey. It's like, do you really not like anything else?[00:00:38] Alessio: Like what's, what's the unique thing? And I think, yeah, we, we've also had a lot more research driven content. You know, we had like 3DAO, we had, you know. Jeremy Howard, we had more folks like that.[00:00:47] AI Engineering: The Rise and Impact[00:00:47] Alessio: I think we want to do more of that too in the new year, like having, uh, some of the Gemini folks, both on the research and the applied side.[00:00:54] Alessio: Yeah, but it's been a ton of fun. I think we both started, I wouldn't say as a joke, we were kind of like, Oh, we [00:01:00] should do a podcast. And I think we kind of caught the right wave, obviously. And I think your rise of the AI engineer posts just kind of get people. Sombra to congregate, and then the AI engineer summit.[00:01:11] Alessio: And that's why when I look at our growth chart, it's kind of like a proxy for like the AI engineering industry as a whole, which is almost like, like, even if we don't do that much, we keep growing just because there's so many more AI engineers. So did you expect that growth or did you expect that would take longer for like the AI engineer thing to kind of like become, you know, everybody talks about it today.[00:01:32] swyx: So, the sign of that, that we have won is that Gartner puts it at the top of the hype curve right now. So Gartner has called the peak in AI engineering. I did not expect, um, to what level. I knew that I was correct when I called it because I did like two months of work going into that. But I didn't know, You know, how quickly it could happen, and obviously there's a chance that I could be wrong.[00:01:52] swyx: But I think, like, most people have come around to that concept. Hacker News hates it, which is a good sign. But there's enough people that have defined it, you know, GitHub, when [00:02:00] they launched GitHub Models, which is the Hugging Face clone, they put AI engineers in the banner, like, above the fold, like, in big So I think it's like kind of arrived as a meaningful and useful definition.[00:02:12] swyx: I think people are trying to figure out where the boundaries are. I think that was a lot of the quote unquote drama that happens behind the scenes at the World's Fair in June. Because I think there's a lot of doubt or questions about where ML engineering stops and AI engineering starts. That's a useful debate to be had.[00:02:29] swyx: In some sense, I actually anticipated that as well. So I intentionally did not. Put a firm definition there because most of the successful definitions are necessarily underspecified and it's actually useful to have different perspectives and you don't have to specify everything from the outset.[00:02:45] Alessio: Yeah, I was at um, AWS reInvent and the line to get into like the AI engineering talk, so to speak, which is, you know, applied AI and whatnot was like, there are like hundreds of people just in line to go in.[00:02:56] Alessio: I think that's kind of what enabled me. People, right? Which is what [00:03:00] you kind of talked about. It's like, Hey, look, you don't actually need a PhD, just, yeah, just use the model. And then maybe we'll talk about some of the blind spots that you get as an engineer with the earlier posts that we also had on on the sub stack.[00:03:11] Alessio: But yeah, it's been a heck of a heck of a two years.[00:03:14] swyx: Yeah.[00:03:15] Latent Space Live and AI Conferences[00:03:15] swyx: You know, I was, I was trying to view the conference as like, so NeurIPS is I think like 16, 17, 000 people. And the Latent Space Live event that we held there was 950 signups. I think. The AI world, the ML world is still very much research heavy. And that's as it should be because ML is very much in a research phase.[00:03:34] swyx: But as we move this entire field into production, I think that ratio inverts into becoming more engineering heavy. So at least I think engineering should be on the same level, even if it's never as prestigious, like it'll always be low status because at the end of the day, you're manipulating APIs or whatever.[00:03:51] swyx: But Yeah, wrapping GPTs, but there's going to be an increasing stack and an art to doing these, these things well. And I, you know, I [00:04:00] think that's what we're focusing on for the podcast, the conference and basically everything I do seems to make sense. And I think we'll, we'll talk about the trends here that apply.[00:04:09] swyx: It's, it's just very strange. So, like, there's a mix of, like, keeping on top of research while not being a researcher and then putting that research into production. So, like, people always ask me, like, why are you covering Neuralibs? Like, this is a ML research conference and I'm like, well, yeah, I mean, we're not going to, to like, understand everything Or reproduce every single paper, but the stuff that is being found here is going to make it through into production at some point, you hope.[00:04:32] swyx: And then actually like when I talk to the researchers, they actually get very excited because they're like, oh, you guys are actually caring about how this goes into production and that's what they really really want. The measure of success is previously just peer review, right? Getting 7s and 8s on their um, Academic review conferences and stuff like citations is one metric, but money is a better metric.[00:04:51] Alessio: Money is a better metric. Yeah, and there were about 2200 people on the live stream or something like that. Yeah, yeah. Hundred on the live stream. So [00:05:00] I try my best to moderate, but it was a lot spicier in person with Jonathan and, and Dylan. Yeah, that it was in the chat on YouTube.[00:05:06] swyx: I would say that I actually also created.[00:05:09] swyx: Layen Space Live in order to address flaws that are perceived in academic conferences. This is not NeurIPS specific, it's ICML, NeurIPS. Basically, it's very sort of oriented towards the PhD student, uh, market, job market, right? Like literally all, basically everyone's there to advertise their research and skills and get jobs.[00:05:28] swyx: And then obviously all the, the companies go there to hire them. And I think that's great for the individual researchers, but for people going there to get info is not great because you have to read between the lines, bring a ton of context in order to understand every single paper. So what is missing is effectively what I ended up doing, which is domain by domain, go through and recap the best of the year.[00:05:48] swyx: Survey the field. And there are, like NeurIPS had a, uh, I think ICML had a like a position paper track, NeurIPS added a benchmarks, uh, datasets track. These are ways in which to address that [00:06:00] issue. Uh, there's always workshops as well. Every, every conference has, you know, a last day of workshops and stuff that provide more of an overview.[00:06:06] swyx: But they're not specifically prompted to do so. And I think really, uh, Organizing a conference is just about getting good speakers and giving them the correct prompts. And then they will just go and do that thing and they do a very good job of it. So I think Sarah did a fantastic job with the startups prompt.[00:06:21] swyx: I can't list everybody, but we did best of 2024 in startups, vision, open models. Post transformers, synthetic data, small models, and agents. And then the last one was the, uh, and then we also did a quick one on reasoning with Nathan Lambert. And then the last one, obviously, was the debate that people were very hyped about.[00:06:39] swyx: It was very awkward. And I'm really, really thankful for John Franco, basically, who stepped up to challenge Dylan. Because Dylan was like, yeah, I'll do it. But He was pro scaling. And I think everyone who is like in AI is pro scaling, right? So you need somebody who's ready to publicly say, no, we've hit a wall.[00:06:57] swyx: So that means you're saying Sam Altman's wrong. [00:07:00] You're saying, um, you know, everyone else is wrong. It helps that this was the day before Ilya went on, went up on stage and then said pre training has hit a wall. And data has hit a wall. So actually Jonathan ended up winning, and then Ilya supported that statement, and then Noam Brown on the last day further supported that statement as well.[00:07:17] swyx: So it's kind of interesting that I think the consensus kind of going in was that we're not done scaling, like you should believe in a better lesson. And then, four straight days in a row, you had Sepp Hochreiter, who is the creator of the LSTM, along with everyone's favorite OG in AI, which is Juergen Schmidhuber.[00:07:34] swyx: He said that, um, we're pre trading inside a wall, or like, we've run into a different kind of wall. And then we have, you know John Frankel, Ilya, and then Noam Brown are all saying variations of the same thing, that we have hit some kind of wall in the status quo of what pre trained, scaling large pre trained models has looked like, and we need a new thing.[00:07:54] swyx: And obviously the new thing for people is some make, either people are calling it inference time compute or test time [00:08:00] compute. I think the collective terminology has been inference time, and I think that makes sense because test time, calling it test, meaning, has a very pre trained bias, meaning that the only reason for running inference at all is to test your model.[00:08:11] swyx: That is not true. Right. Yeah. So, so, I quite agree that. OpenAI seems to have adopted, or the community seems to have adopted this terminology of ITC instead of TTC. And that, that makes a lot of sense because like now we care about inference, even right down to compute optimality. Like I actually interviewed this author who recovered or reviewed the Chinchilla paper.[00:08:31] swyx: Chinchilla paper is compute optimal training, but what is not stated in there is it's pre trained compute optimal training. And once you start caring about inference, compute optimal training, you have a different scaling law. And in a way that we did not know last year.[00:08:45] Alessio: I wonder, because John is, he's also on the side of attention is all you need.[00:08:49] Alessio: Like he had the bet with Sasha. So I'm curious, like he doesn't believe in scaling, but he thinks the transformer, I wonder if he's still. So, so,[00:08:56] swyx: so he, obviously everything is nuanced and you know, I told him to play a character [00:09:00] for this debate, right? So he actually does. Yeah. He still, he still believes that we can scale more.[00:09:04] swyx: Uh, he just assumed the character to be very game for, for playing this debate. So even more kudos to him that he assumed a position that he didn't believe in and still won the debate.[00:09:16] Alessio: Get rekt, Dylan. Um, do you just want to quickly run through some of these things? Like, uh, Sarah's presentation, just the highlights.[00:09:24] swyx: Yeah, we can't go through everyone's slides, but I pulled out some things as a factor of, like, stuff that we were going to talk about. And we'll[00:09:30] Alessio: publish[00:09:31] swyx: the rest. Yeah, we'll publish on this feed the best of 2024 in those domains. And hopefully people can benefit from the work that our speakers have done.[00:09:39] swyx: But I think it's, uh, these are just good slides. And I've been, I've been looking for a sort of end of year recaps from, from people.[00:09:44] The Competitive AI Landscape[00:09:44] swyx: The field has progressed a lot. You know, I think the max ELO in 2023 on LMSys used to be 1200 for LMSys ELOs. And now everyone is at least at, uh, 1275 in their ELOs, and this is across Gemini, Chadjibuti, [00:10:00] Grok, O1.[00:10:01] swyx: ai, which with their E Large model, and Enthopic, of course. It's a very, very competitive race. There are multiple Frontier labs all racing, but there is a clear tier zero Frontier. And then there's like a tier one. It's like, I wish I had everything else. Tier zero is extremely competitive. It's effectively now three horse race between Gemini, uh, Anthropic and OpenAI.[00:10:21] swyx: I would say that people are still holding out a candle for XAI. XAI, I think, for some reason, because their API was very slow to roll out, is not included in these metrics. So it's actually quite hard to put on there. As someone who also does charts, XAI is continually snubbed because they don't work well with the benchmarking people.[00:10:42] swyx: Yeah, yeah, yeah. It's a little trivia for why XAI always gets ignored. The other thing is market share. So these are slides from Sarah. We have it up on the screen. It has gone from very heavily open AI. So we have some numbers and estimates. These are from RAMP. Estimates of open AI market share in [00:11:00] December 2023.[00:11:01] swyx: And this is basically, what is it, GPT being 95 percent of production traffic. And I think if you correlate that with stuff that we asked. Harrison Chase on the LangChain episode, it was true. And then CLAUD 3 launched mid middle of this year. I think CLAUD 3 launched in March, CLAUD 3. 5 Sonnet was in June ish.[00:11:23] swyx: And you can start seeing the market share shift towards opening, uh, towards that topic, uh, very, very aggressively. The more recent one is Gemini. So if I scroll down a little bit, this is an even more recent dataset. So RAM's dataset ends in September 2 2. 2024. Gemini has basically launched a price war at the low end, uh, with Gemini Flash, uh, being basically free for personal use.[00:11:44] swyx: Like, I think people don't understand the free tier. It's something like a billion tokens per day. Unless you're trying to abuse it, you cannot really exhaust your free tier on Gemini. They're really trying to get you to use it. They know they're in like third place, um, fourth place, depending how you, how you count.[00:11:58] swyx: And so they're going after [00:12:00] the Lower tier first, and then, you know, maybe the upper tier later, but yeah, Gemini Flash, according to OpenRouter, is now 50 percent of their OpenRouter requests. Obviously, these are the small requests. These are small, cheap requests that are mathematically going to be more.[00:12:15] swyx: The smart ones obviously are still going to OpenAI. But, you know, it's a very, very big shift in the market. Like basically 2023, 2022, To going into 2024 opening has gone from nine five market share to Yeah. Reasonably somewhere between 50 to 75 market share.[00:12:29] Alessio: Yeah. I'm really curious how ramped does the attribution to the model?[00:12:32] Alessio: If it's API, because I think it's all credit card spin. . Well, but it's all, the credit card doesn't say maybe. Maybe the, maybe when they do expenses, they upload the PDF, but yeah, the, the German I think makes sense. I think that was one of my main 2024 takeaways that like. The best small model companies are the large labs, which is not something I would have thought that the open source kind of like long tail would be like the small model.[00:12:53] swyx: Yeah, different sizes of small models we're talking about here, right? Like so small model here for Gemini is AB, [00:13:00] right? Uh, mini. We don't know what the small model size is, but yeah, it's probably in the double digits or maybe single digits, but probably double digits. The open source community has kind of focused on the one to three B size.[00:13:11] swyx: Mm-hmm . Yeah. Maybe[00:13:12] swyx: zero, maybe 0.5 B uh, that's moon dream and that is small for you then, then that's great. It makes sense that we, we have a range for small now, which is like, may, maybe one to five B. Yeah. I'll even put that at, at, at the high end. And so this includes Gemma from Gemini as well. But also includes the Apple Foundation models, which I think Apple Foundation is 3B.[00:13:32] Alessio: Yeah. No, that's great. I mean, I think in the start small just meant cheap. I think today small is actually a more nuanced discussion, you know, that people weren't really having before.[00:13:43] swyx: Yeah, we can keep going. This is a slide that I smiley disagree with Sarah. She's pointing to the scale SEAL leaderboard. I think the Researchers that I talked with at NeurIPS were kind of positive on this because basically you need private test [00:14:00] sets to prevent contamination.[00:14:02] swyx: And Scale is one of maybe three or four people this year that has really made an effort in doing a credible private test set leaderboard. Llama405B does well compared to Gemini and GPT 40. And I think that's good. I would say that. You know, it's good to have an open model that is that big, that does well on those metrics.[00:14:23] swyx: But anyone putting 405B in production will tell you, if you scroll down a little bit to the artificial analysis numbers, that it is very slow and very expensive to infer. Um, it doesn't even fit on like one node. of, uh, of H100s. Cerebras will be happy to tell you they can serve 4 or 5B on their super large chips.[00:14:42] swyx: But, um, you know, if you need to do anything custom to it, you're still kind of constrained. So, is 4 or 5B really that relevant? Like, I think most people are basically saying that they only use 4 or 5B as a teacher model to distill down to something. Even Meta is doing it. So with Lama 3. [00:15:00] 3 launched, they only launched the 70B because they use 4 or 5B to distill the 70B.[00:15:03] swyx: So I don't know if like open source is keeping up. I think they're the, the open source industrial complex is very invested in telling you that the, if the gap is narrowing, I kind of disagree. I think that the gap is widening with O1. I think there are very, very smart people trying to narrow that gap and they should.[00:15:22] swyx: I really wish them success, but you cannot use a chart that is nearing 100 in your saturation chart. And look, the distance between open source and closed source is narrowing. Of course it's going to narrow because you're near 100. This is stupid. But in metrics that matter, is open source narrowing?[00:15:38] swyx: Probably not for O1 for a while. And it's really up to the open source guys to figure out if they can match O1 or not.[00:15:46] Alessio: I think inference time compute is bad for open source just because, you know, Doc can donate the flops at training time, but he cannot donate the flops at inference time. So it's really hard to like actually keep up on that axis.[00:15:59] Alessio: Big, big business [00:16:00] model shift. So I don't know what that means for the GPU clouds. I don't know what that means for the hyperscalers, but obviously the big labs have a lot of advantage. Because, like, it's not a static artifact that you're putting the compute in. You're kind of doing that still, but then you're putting a lot of computed inference too.[00:16:17] swyx: Yeah, yeah, yeah. Um, I mean, Llama4 will be reasoning oriented. We talked with Thomas Shalom. Um, kudos for getting that episode together. That was really nice. Good, well timed. Actually, I connected with the AI meta guy, uh, at NeurIPS, and, um, yeah, we're going to coordinate something for Llama4. Yeah, yeah,[00:16:32] Alessio: and our friend, yeah.[00:16:33] Alessio: Clara Shi just joined to lead the business agent side. So I'm sure we'll have her on in the new year.[00:16:39] swyx: Yeah. So, um, my comment on, on the business model shift, this is super interesting. Apparently it is wide knowledge that OpenAI wanted more than 6. 6 billion dollars for their fundraise. They wanted to raise, you know, higher, and they did not.[00:16:51] swyx: And what that means is basically like, it's very convenient that we're not getting GPT 5, which would have been a larger pre train. We should have a lot of upfront money. And [00:17:00] instead we're, we're converting fixed costs into variable costs, right. And passing it on effectively to the customer. And it's so much easier to take margin there because you can directly attribute it to like, Oh, you're using this more.[00:17:12] swyx: Therefore you, you pay more of the cost and I'll just slap a margin in there. So like that lets you control your growth margin and like tie your. Your spend, or your sort of inference spend, accordingly. And it's just really interesting to, that this change in the sort of inference paradigm has arrived exactly at the same time that the funding environment for pre training is effectively drying up, kind of.[00:17:36] swyx: I feel like maybe the VCs are very in tune with research anyway, so like, they would have noticed this, but, um, it's just interesting.[00:17:43] Alessio: Yeah, and I was looking back at our yearly recap of last year. Yeah. And the big thing was like the mixed trial price fights, you know, and I think now it's almost like there's nowhere to go, like, you know, Gemini Flash is like basically giving it away for free.[00:17:55] Alessio: So I think this is a good way for the labs to generate more revenue and pass down [00:18:00] some of the compute to the customer. I think they're going to[00:18:02] swyx: keep going. I think that 2, will come.[00:18:05] Alessio: Yeah, I know. Totally. I mean, next year, the first thing I'm doing is signing up for Devin. Signing up for the pro chat GBT.[00:18:12] Alessio: Just to try. I just want to see what does it look like to spend a thousand dollars a month on AI?[00:18:17] swyx: Yes. Yes. I think if your, if your, your job is a, at least AI content creator or VC or, you know, someone who, whose job it is to stay on, stay on top of things, you should already be spending like a thousand dollars a month on, on stuff.[00:18:28] swyx: And then obviously easy to spend, hard to use. You have to actually use. The good thing is that actually Google lets you do a lot of stuff for free now. So like deep research. That they just launched. Uses a ton of inference and it's, it's free while it's in preview.[00:18:45] Alessio: Yeah. They need to put that in Lindy.[00:18:47] Alessio: I've been using Lindy lately. I've been a built a bunch of things once we had flow because I liked the new thing. It's pretty good. I even did a phone call assistant. Um, yeah, they just launched Lindy voice. Yeah, I think once [00:19:00] they get advanced voice mode like capability today, still like speech to text, you can kind of tell.[00:19:06] Alessio: Um, but it's good for like reservations and things like that. So I have a meeting prepper thing. And so[00:19:13] swyx: it's good. Okay. I feel like we've, we've covered a lot of stuff. Uh, I, yeah, I, you know, I think We will go over the individual, uh, talks in a separate episode. Uh, I don't want to take too much time with, uh, this stuff, but that suffice to say that there is a lot of progress in each field.[00:19:28] swyx: Uh, we covered vision. Basically this is all like the audience voting for what they wanted. And then I just invited the best people I could find in each audience, especially agents. Um, Graham, who I talked to at ICML in Vienna, he is currently still number one. It's very hard to stay on top of SweetBench.[00:19:45] swyx: OpenHand is currently still number one. switchbench full, which is the hardest one. He had very good thoughts on agents, which I, which I'll highlight for people. Everyone is saying 2025 is the year of agents, just like they said last year. And, uh, but he had [00:20:00] thoughts on like eight parts of what are the frontier problems to solve in agents.[00:20:03] swyx: And so I'll highlight that talk as well.[00:20:05] Alessio: Yeah. The number six, which is the Hacken agents learn more about the environment, has been a Super interesting to us as well, just to think through, because, yeah, how do you put an agent in an enterprise where most things in an enterprise have never been public, you know, a lot of the tooling, like the code bases and things like that.[00:20:23] Alessio: So, yeah, there's not indexing and reg. Well, yeah, but it's more like. You can't really rag things that are not documented. But people know them based on how they've been doing it. You know, so I think there's almost this like, you know, Oh, institutional knowledge. Yeah, the boring word is kind of like a business process extraction.[00:20:38] Alessio: Yeah yeah, I see. It's like, how do you actually understand how these things are done? I see. Um, and I think today the, the problem is that, Yeah, the agents are, that most people are building are good at following instruction, but are not as good as like extracting them from you. Um, so I think that will be a big unlock just to touch quickly on the Jeff Dean thing.[00:20:55] Alessio: I thought it was pretty, I mean, we'll link it in the, in the things, but. I think the main [00:21:00] focus was like, how do you use ML to optimize the systems instead of just focusing on ML to do something else? Yeah, I think speculative decoding, we had, you know, Eugene from RWKB on the podcast before, like he's doing a lot of that with Fetterless AI.[00:21:12] swyx: Everyone is. I would say it's the norm. I'm a little bit uncomfortable with how much it costs, because it does use more of the GPU per call. But because everyone is so keen on fast inference, then yeah, makes sense.[00:21:24] Alessio: Exactly. Um, yeah, but we'll link that. Obviously Jeff is great.[00:21:30] swyx: Jeff is, Jeff's talk was more, it wasn't focused on Gemini.[00:21:33] swyx: I think people got the wrong impression from my tweet. It's more about how Google approaches ML and uses ML to design systems and then systems feedback into ML. And I think this ties in with Lubna's talk.[00:21:45] Synthetic Data and Future Trends[00:21:45] swyx: on synthetic data where it's basically the story of bootstrapping of humans and AI in AI research or AI in production.[00:21:53] swyx: So her talk was on synthetic data, where like how much synthetic data has grown in 2024 in the pre training side, the post training side, [00:22:00] and the eval side. And I think Jeff then also extended it basically to chips, uh, to chip design. So he'd spend a lot of time talking about alpha chip. And most of us in the audience are like, we're not working on hardware, man.[00:22:11] swyx: Like you guys are great. TPU is great. Okay. We'll buy TPUs.[00:22:14] Alessio: And then there was the earlier talk. Yeah. But, and then we have, uh, I don't know if we're calling them essays. What are we calling these? But[00:22:23] swyx: for me, it's just like bonus for late in space supporters, because I feel like they haven't been getting anything.[00:22:29] swyx: And then I wanted a more high frequency way to write stuff. Like that one I wrote in an afternoon. I think basically we now have an answer to what Ilya saw. It's one year since. The blip. And we know what he saw in 2014. We know what he saw in 2024. We think we know what he sees in 2024. He gave some hints and then we have vague indications of what he saw in 2023.[00:22:54] swyx: So that was the Oh, and then 2016 as well, because of this lawsuit with Elon, OpenAI [00:23:00] is publishing emails from Sam's, like, his personal text messages to Siobhan, Zelis, or whatever. So, like, we have emails from Ilya saying, this is what we're seeing in OpenAI, and this is why we need to scale up GPUs. And I think it's very prescient in 2016 to write that.[00:23:16] swyx: And so, like, it is exactly, like, basically his insights. It's him and Greg, basically just kind of driving the scaling up of OpenAI, while they're still playing Dota. They're like, no, like, we see the path here.[00:23:30] Alessio: Yeah, and it's funny, yeah, they even mention, you know, we can only train on 1v1 Dota. We need to train on 5v5, and that takes too many GPUs.[00:23:37] Alessio: Yeah,[00:23:37] swyx: and at least for me, I can speak for myself, like, I didn't see the path from Dota to where we are today. I think even, maybe if you ask them, like, they wouldn't necessarily draw a straight line. Yeah,[00:23:47] Alessio: no, definitely. But I think like that was like the whole idea of almost like the RL and we talked about this with Nathan on his podcast.[00:23:55] Alessio: It's like with RL, you can get very good at specific things, but then you can't really like generalize as much. And I [00:24:00] think the language models are like the opposite, which is like, you're going to throw all this data at them and scale them up, but then you really need to drive them home on a specific task later on.[00:24:08] Alessio: And we'll talk about the open AI reinforcement, fine tuning, um, announcement too, and all of that. But yeah, I think like scale is all you need. That's kind of what Elia will be remembered for. And I think just maybe to clarify on like the pre training is over thing that people love to tweet. I think the point of the talk was like everybody, we're scaling these chips, we're scaling the compute, but like the second ingredient which is data is not scaling at the same rate.[00:24:35] Alessio: So it's not necessarily pre training is over. It's kind of like What got us here won't get us there. In his email, he predicted like 10x growth every two years or something like that. And I think maybe now it's like, you know, you can 10x the chips again, but[00:24:49] swyx: I think it's 10x per year. Was it? I don't know.[00:24:52] Alessio: Exactly. And Moore's law is like 2x. So it's like, you know, much faster than that. And yeah, I like the fossil fuel of AI [00:25:00] analogy. It's kind of like, you know, the little background tokens thing. So the OpenAI reinforcement fine tuning is basically like, instead of fine tuning on data, you fine tune on a reward model.[00:25:09] Alessio: So it's basically like, instead of being data driven, it's like task driven. And I think people have tasks to do, they don't really have a lot of data. So I'm curious to see how that changes, how many people fine tune, because I think this is what people run into. It's like, Oh, you can fine tune llama. And it's like, okay, where do I get the data?[00:25:27] Alessio: To fine tune it on, you know, so it's great that we're moving the thing. And then I really like he had this chart where like, you know, the brain mass and the body mass thing is basically like mammals that scaled linearly by brain and body size, and then humans kind of like broke off the slope. So it's almost like maybe the mammal slope is like the pre training slope.[00:25:46] Alessio: And then the post training slope is like the, the human one.[00:25:49] swyx: Yeah. I wonder what the. I mean, we'll know in 10 years, but I wonder what the y axis is for, for Ilya's SSI. We'll try to get them on.[00:25:57] Alessio: Ilya, if you're listening, you're [00:26:00] welcome here. Yeah, and then he had, you know, what comes next, like agent, synthetic data, inference, compute, I thought all of that was like that.[00:26:05] Alessio: I don't[00:26:05] swyx: think he was dropping any alpha there. Yeah, yeah, yeah.[00:26:07] Alessio: Yeah. Any other new reps? Highlights?[00:26:10] swyx: I think that there was comparatively a lot more work. Oh, by the way, I need to plug that, uh, my friend Yi made this, like, little nice paper. Yeah, that was really[00:26:20] swyx: nice.[00:26:20] swyx: Uh, of, uh, of, like, all the, he's, she called it must read papers of 2024.[00:26:26] swyx: So I laid out some of these at NeurIPS, and it was just gone. Like, everyone just picked it up. Because people are dying for, like, little guidance and visualizations And so, uh, I thought it was really super nice that we got there.[00:26:38] Alessio: Should we do a late in space book for each year? Uh, I thought about it. For each year we should.[00:26:42] Alessio: Coffee table book. Yeah. Yeah. Okay. Put it in the will. Hi, Will. By the way, we haven't introduced you. He's our new, you know, general organist, Jamie. You need to[00:26:52] swyx: pull up more things. One thing I saw that, uh, Okay, one fun one, and then one [00:27:00] more general one. So the fun one is this paper on agent collusion. This is a paper on steganography.[00:27:06] swyx: This is secret collusion among AI agents, multi agent deception via steganography. I tried to go to NeurIPS in order to find these kinds of papers because the real reason Like NeurIPS this year has a lottery system. A lot of people actually even go and don't buy tickets because they just go and attend the side events.[00:27:22] swyx: And then also the people who go and end up crowding around the most popular papers, which you already know and already read them before you showed up to NeurIPS. So the only reason you go there is to talk to the paper authors, but there's like something like 10, 000 other. All these papers out there that, you know, are just people's work that they, that they did on the air and they failed to get attention for one reason or another.[00:27:42] swyx: And this was one of them. Uh, it was like all the way at the back. And this is a deep mind paper that actually focuses on collusion between AI agents, uh, by hiding messages in the text that they generate. Uh, so that's what steganography is. So a very simple example would be the first letter of every word.[00:27:57] swyx: If you Pick that out, you know, and the code sends a [00:28:00] different message than that. But something I've always emphasized is to LLMs, we read left to right. LLMs can read up, down, sideways, you know, in random character order. And it's the same to them as it is to us. So if we were ever to get You know, self motivated, underlined LLMs that we're trying to collaborate to take over the planet.[00:28:19] swyx: This would be how they do it. They spread messages among us in the messages that we generate. And he developed a scaling law for that. So he marked, I'm showing it on screen right now, the emergence of this phenomenon. Basically, for example, for Cypher encoding, GPT 2, Lama 2, mixed trial, GPT 3. 5, zero capabilities, and sudden 4.[00:28:40] swyx: And this is the kind of Jason Wei type emergence properties that people kind of look for. I think what made this paper stand out as well, so he developed the benchmark for steganography collusion, and he also focused on shelling point collusion, which is very low coordination. For agreeing on a decoding encoding format, you kind of need to have some [00:29:00] agreement on that.[00:29:00] swyx: But, but shelling point means like very, very low or almost no coordination. So for example, if I, if I ask someone, if the only message I give you is meet me in New York and you're not aware. Or when you would probably meet me at Grand Central Station. That is the Grand Central Station is a shelling point.[00:29:16] swyx: And it's probably somewhere, somewhere during the day. That is the shelling point of New York is Grand Central. To that extent, shelling points for steganography are things like the, the, the common decoding methods that we talked about. It will be interesting at some point in the future when we are worried about alignment.[00:29:30] swyx: It is not interesting today, but it's interesting that DeepMind is already thinking about this.[00:29:36] Alessio: I think that's like one of the hardest things about NeurIPS. It's like the long tail. I[00:29:41] swyx: found a pricing guy. I'm going to feature him on the podcast. Basically, this guy from NVIDIA worked out the optimal pricing for language models.[00:29:51] swyx: It's basically an econometrics paper at NeurIPS, where everyone else is talking about GPUs. And the guy with the GPUs is[00:29:57] Alessio: talking[00:29:57] swyx: about economics instead. [00:30:00] That was the sort of fun one. So the focus I saw is that model papers at NeurIPS are kind of dead. No one really presents models anymore. It's just data sets.[00:30:12] swyx: This is all the grad students are working on. So like there was a data sets track and then I was looking around like, I was like, you don't need a data sets track because every paper is a data sets paper. And so data sets and benchmarks, they're kind of flip sides of the same thing. So Yeah. Cool. Yeah, if you're a grad student, you're a GPU boy, you kind of work on that.[00:30:30] swyx: And then the, the sort of big model that people walk around and pick the ones that they like, and then they use it in their models. And that's, that's kind of how it develops. I, I feel like, um, like, like you didn't last year, you had people like Hao Tian who worked on Lava, which is take Lama and add Vision.[00:30:47] swyx: And then obviously actually I hired him and he added Vision to Grok. Now he's the Vision Grok guy. This year, I don't think there was any of those.[00:30:55] Alessio: What were the most popular, like, orals? Last year it was like the [00:31:00] Mixed Monarch, I think, was like the most attended. Yeah, uh, I need to look it up. Yeah, I mean, if nothing comes to mind, that's also kind of like an answer in a way.[00:31:10] Alessio: But I think last year there was a lot of interest in, like, furthering models and, like, different architectures and all of that.[00:31:16] swyx: I will say that I felt the orals, oral picks this year were not very good. Either that or maybe it's just a So that's the highlight of how I have changed in terms of how I view papers.[00:31:29] swyx: So like, in my estimation, two of the best papers in this year for datasets or data comp and refined web or fine web. These are two actually industrially used papers, not highlighted for a while. I think DCLM got the spotlight, FineWeb didn't even get the spotlight. So like, it's just that the picks were different.[00:31:48] swyx: But one thing that does get a lot of play that a lot of people are debating is the role that's scheduled. This is the schedule free optimizer paper from Meta from Aaron DeFazio. And this [00:32:00] year in the ML community, there's been a lot of chat about shampoo, soap, all the bathroom amenities for optimizing your learning rates.[00:32:08] swyx: And, uh, most people at the big labs are. Who I asked about this, um, say that it's cute, but it's not something that matters. I don't know, but it's something that was discussed and very, very popular. 4Wars[00:32:19] Alessio: of AI recap maybe, just quickly. Um, where do you want to start? Data?[00:32:26] swyx: So to remind people, this is the 4Wars piece that we did as one of our earlier recaps of this year.[00:32:31] swyx: And the belligerents are on the left, journalists, writers, artists, anyone who owns IP basically, New York Times, Stack Overflow, Reddit, Getty, Sarah Silverman, George RR Martin. Yeah, and I think this year we can add Scarlett Johansson to that side of the fence. So anyone suing, open the eye, basically. I actually wanted to get a snapshot of all the lawsuits.[00:32:52] swyx: I'm sure some lawyer can do it. That's the data quality war. On the right hand side, we have the synthetic data people, and I think we talked about Lumna's talk, you know, [00:33:00] really showing how much synthetic data has come along this year. I think there was a bit of a fight between scale. ai and the synthetic data community, because scale.[00:33:09] swyx: ai published a paper saying that synthetic data doesn't work. Surprise, surprise, scale. ai is the leading vendor of non synthetic data. Only[00:33:17] Alessio: cage free annotated data is useful.[00:33:21] swyx: So I think there's some debate going on there, but I don't think it's much debate anymore that at least synthetic data, for the reasons that are blessed in Luna's talk, Makes sense.[00:33:32] swyx: I don't know if you have any perspectives there.[00:33:34] Alessio: I think, again, going back to the reinforcement fine tuning, I think that will change a little bit how people think about it. I think today people mostly use synthetic data, yeah, for distillation and kind of like fine tuning a smaller model from like a larger model.[00:33:46] Alessio: I'm not super aware of how the frontier labs use it outside of like the rephrase, the web thing that Apple also did. But yeah, I think it'll be. Useful. I think like whether or not that gets us the big [00:34:00] next step, I think that's maybe like TBD, you know, I think people love talking about data because it's like a GPU poor, you know, I think, uh, synthetic data is like something that people can do, you know, so they feel more opinionated about it compared to, yeah, the optimizers stuff, which is like,[00:34:17] swyx: they don't[00:34:17] Alessio: really work[00:34:18] swyx: on.[00:34:18] swyx: I think that there is an angle to the reasoning synthetic data. So this year, we covered in the paper club, the star series of papers. So that's star, Q star, V star. It basically helps you to synthesize reasoning steps, or at least distill reasoning steps from a verifier. And if you look at the OpenAI RFT, API that they released, or that they announced, basically they're asking you to submit graders, or they choose from a preset list of graders.[00:34:49] swyx: Basically It feels like a way to create valid synthetic data for them to fine tune their reasoning paths on. Um, so I think that is another angle where it starts to make sense. And [00:35:00] so like, it's very funny that basically all the data quality wars between Let's say the music industry or like the newspaper publishing industry or the textbooks industry on the big labs.[00:35:11] swyx: It's all of the pre training era. And then like the new era, like the reasoning era, like nobody has any problem with all the reasoning, especially because it's all like sort of math and science oriented with, with very reasonable graders. I think the more interesting next step is how does it generalize beyond STEM?[00:35:27] swyx: We've been using O1 for And I would say like for summarization and creative writing and instruction following, I think it's underrated. I started using O1 in our intro songs before we killed the intro songs, but it's very good at writing lyrics. You know, I can actually say like, I think one of the O1 pro demos.[00:35:46] swyx: All of these things that Noam was showing was that, you know, you can write an entire paragraph or three paragraphs without using the letter A, right?[00:35:53] Creative Writing with AI[00:35:53] swyx: So like, like literally just anything instead of token, like not even token level, character level manipulation and [00:36:00] counting and instruction following. It's, uh, it's very, very strong.[00:36:02] swyx: And so no surprises when I ask it to rhyme, uh, and to, to create song lyrics, it's going to do that very much better than in previous models. So I think it's underrated for creative writing.[00:36:11] Alessio: Yeah.[00:36:12] Legal and Ethical Issues in AI[00:36:12] Alessio: What do you think is the rationale that they're going to have in court when they don't show you the thinking traces of O1, but then they want us to, like, they're getting sued for using other publishers data, you know, but then on their end, they're like, well, you shouldn't be using my data to then train your model.[00:36:29] Alessio: So I'm curious to see how that kind of comes. Yeah, I mean, OPA has[00:36:32] swyx: many ways to publish, to punish people without bringing, taking them to court. Already banned ByteDance for distilling their, their info. And so anyone caught distilling the chain of thought will be just disallowed to continue on, on, on the API.[00:36:44] swyx: And it's fine. It's no big deal. Like, I don't even think that's an issue at all, just because the chain of thoughts are pretty well hidden. Like you have to work very, very hard to, to get it to leak. And then even when it leaks the chain of thought, you don't know if it's, if it's [00:37:00] The bigger concern is actually that there's not that much IP hiding behind it, that Cosign, which we talked about, we talked to him on Dev Day, can just fine tune 4.[00:37:13] swyx: 0 to beat 0. 1 Cloud SONET so far is beating O1 on coding tasks without, at least O1 preview, without being a reasoning model, same for Gemini Pro or Gemini 2. 0. So like, how much is reasoning important? How much of a moat is there in this, like, All of these are proprietary sort of training data that they've presumably accomplished.[00:37:34] swyx: Because even DeepSeek was able to do it. And they had, you know, two months notice to do this, to do R1. So, it's actually unclear how much moat there is. Obviously, you know, if you talk to the Strawberry team, they'll be like, yeah, I mean, we spent the last two years doing this. So, we don't know. And it's going to be Interesting because there'll be a lot of noise from people who say they have inference time compute and actually don't because they just have fancy chain of thought.[00:38:00][00:38:00] swyx: And then there's other people who actually do have very good chain of thought. And you will not see them on the same level as OpenAI because OpenAI has invested a lot in building up the mythology of their team. Um, which makes sense. Like the real answer is somewhere in between.[00:38:13] Alessio: Yeah, I think that's kind of like the main data war story developing.[00:38:18] The Data War: GPU Poor vs. GPU Rich[00:38:18] Alessio: GPU poor versus GPU rich. Yeah. Where do you think we are? I think there was, again, going back to like the small model thing, there was like a time in which the GPU poor were kind of like the rebel faction working on like these models that were like open and small and cheap. And I think today people don't really care as much about GPUs anymore.[00:38:37] Alessio: You also see it in the price of the GPUs. Like, you know, that market is kind of like plummeted because there's people don't want to be, they want to be GPU free. They don't even want to be poor. They just want to be, you know, completely without them. Yeah. How do you think about this war? You[00:38:52] swyx: can tell me about this, but like, I feel like the, the appetite for GPU rich startups, like the, you know, the, the funding plan is we will raise 60 million and [00:39:00] we'll give 50 of that to NVIDIA.[00:39:01] swyx: That is gone, right? Like, no one's, no one's pitching that. This was literally the plan, the exact plan of like, I can name like four or five startups, you know, this time last year. So yeah, GPU rich startups gone.[00:39:12] The Rise of GPU Ultra Rich[00:39:12] swyx: But I think like, The GPU ultra rich, the GPU ultra high net worth is still going. So, um, now we're, you know, we had Leopold's essay on the trillion dollar cluster.[00:39:23] swyx: We're not quite there yet. We have multiple labs, um, you know, XAI very famously, you know, Jensen Huang praising them for being. Best boy number one in spinning up 100, 000 GPU cluster in like 12 days or something. So likewise at Meta, likewise at OpenAI, likewise at the other labs as well. So like the GPU ultra rich are going to keep doing that because I think partially it's an article of faith now that you just need it.[00:39:46] swyx: Like you don't even know what it's going to, what you're going to use it for. You just, you just need it. And it makes sense that if, especially if we're going into. More researchy territory than we are. So let's say 2020 to 2023 was [00:40:00] let's scale big models territory because we had GPT 3 in 2020 and we were like, okay, we'll go from 1.[00:40:05] swyx: 75b to 1. 8b, 1. 8t. And that was GPT 3 to GPT 4. Okay, that's done. As far as everyone is concerned, Opus 3. 5 is not coming out, GPT 4. 5 is not coming out, and Gemini 2, we don't have Pro, whatever. We've hit that wall. Maybe I'll call it the 2 trillion perimeter wall. We're not going to 10 trillion. No one thinks it's a good idea, at least from training costs, from the amount of data, or at least the inference.[00:40:36] swyx: Would you pay 10x the price of GPT Probably not. Like, like you want something else that, that is at least more useful. So it makes sense that people are pivoting in terms of their inference paradigm.[00:40:47] Emerging Trends in AI Models[00:40:47] swyx: And so when it's more researchy, then you actually need more just general purpose compute to mess around with, uh, at the exact same time that production deployments of the old, the previous paradigm is still ramping up,[00:40:58] swyx: um,[00:40:58] swyx: uh, pretty aggressively.[00:40:59] swyx: So [00:41:00] it makes sense that the GPU rich are growing. We have now interviewed both together and fireworks and replicates. Uh, we haven't done any scale yet. But I think Amazon, maybe kind of a sleeper one, Amazon, in a sense of like they, at reInvent, I wasn't expecting them to do so well, but they are now a foundation model lab.[00:41:18] swyx: It's kind of interesting. Um, I think, uh, you know, David went over there and started just creating models.[00:41:25] Alessio: Yeah, I mean, that's the power of prepaid contracts. I think like a lot of AWS customers, you know, they do this big reserve instance contracts and now they got to use their money. That's why so many startups.[00:41:37] Alessio: Get bought through the AWS marketplace so they can kind of bundle them together and prefer pricing.[00:41:42] swyx: Okay, so maybe GPU super rich doing very well, GPU middle class dead, and then GPU[00:41:48] Alessio: poor. I mean, my thing is like, everybody should just be GPU rich. There shouldn't really be, even the GPU poorest, it's like, does it really make sense to be GPU poor?[00:41:57] Alessio: Like, if you're GPU poor, you should just use the [00:42:00] cloud. Yes, you know, and I think there might be a future once we kind of like figure out what the size and shape of these models is where like the tiny box and these things come to fruition where like you can be GPU poor at home. But I think today is like, why are you working so hard to like get these models to run on like very small clusters where it's like, It's so cheap to run them.[00:42:21] Alessio: Yeah, yeah,[00:42:22] swyx: yeah. I think mostly people think it's cool. People think it's a stepping stone to scaling up. So they aspire to be GPU rich one day and they're working on new methods. Like news research, like probably the most deep tech thing they've done this year is Distro or whatever the new name is.[00:42:38] swyx: There's a lot of interest in heterogeneous computing, distributed computing. I tend generally to de emphasize that historically, but it may be coming to a time where it is starting to be relevant. I don't know. You know, SF compute launched their compute marketplace this year, and like, who's really using that?[00:42:53] swyx: Like, it's a bunch of small clusters, disparate types of compute, and if you can make that [00:43:00] useful, then that will be very beneficial to the broader community, but maybe still not the source of frontier models. It's just going to be a second tier of compute that is unlocked for people, and that's fine. But yeah, I mean, I think this year, I would say a lot more on device, We are, I now have Apple intelligence on my phone.[00:43:19] swyx: Doesn't do anything apart from summarize my notifications. But still, not bad. Like, it's multi modal.[00:43:25] Alessio: Yeah, the notification summaries are so and so in my experience.[00:43:29] swyx: Yeah, but they add, they add juice to life. And then, um, Chrome Nano, uh, Gemini Nano is coming out in Chrome. Uh, they're still feature flagged, but you can, you can try it now if you, if you use the, uh, the alpha.[00:43:40] swyx: And so, like, I, I think, like, you know, We're getting the sort of GPU poor version of a lot of these things coming out, and I think it's like quite useful. Like Windows as well, rolling out RWKB in sort of every Windows department is super cool. And I think the last thing that I never put in this GPU poor war, that I think I should now, [00:44:00] is the number of startups that are GPU poor but still scaling very well, as sort of wrappers on top of either a foundation model lab, or GPU Cloud.[00:44:10] swyx: GPU Cloud, it would be Suno. Suno, Ramp has rated as one of the top ranked, fastest growing startups of the year. Um, I think the last public number is like zero to 20 million this year in ARR and Suno runs on Moto. So Suno itself is not GPU rich, but they're just doing the training on, on Moto, uh, who we've also talked to on, on the podcast.[00:44:31] swyx: The other one would be Bolt, straight cloud wrapper. And, and, um, Again, another, now they've announced 20 million ARR, which is another step up from our 8 million that we put on the title. So yeah, I mean, it's crazy that all these GPU pores are finding a way while the GPU riches are also finding a way. And then the only failures, I kind of call this the GPU smiling curve, where the edges do well, because you're either close to the machines, and you're like [00:45:00] number one on the machines, or you're like close to the customers, and you're number one on the customer side.[00:45:03] swyx: And the people who are in the middle. Inflection, um, character, didn't do that great. I think character did the best of all of them. Like, you have a note in here that we apparently said that character's price tag was[00:45:15] Alessio: 1B.[00:45:15] swyx: Did I say that?[00:45:16] Alessio: Yeah. You said Google should just buy them for 1B. I thought it was a crazy number.[00:45:20] Alessio: Then they paid 2. 7 billion. I mean, for like,[00:45:22] swyx: yeah.[00:45:22] Alessio: What do you pay for node? Like, I don't know what the game world was like. Maybe the starting price was 1B. I mean, whatever it was, it worked out for everybody involved.[00:45:31] The Multi-Modality War[00:45:31] Alessio: Multimodality war. And this one, we never had text to video in the first version, which now is the hottest.[00:45:37] swyx: Yeah, I would say it's a subset of image, but yes.[00:45:40] Alessio: Yeah, well, but I think at the time it wasn't really something people were doing, and now we had VO2 just came out yesterday. Uh, Sora was released last month, last week. I've not tried Sora, because the day that I tried, it wasn't, yeah. I[00:45:54] swyx: think it's generally available now, you can go to Sora.[00:45:56] swyx: com and try it. Yeah, they had[00:45:58] Alessio: the outage. Which I [00:46:00] think also played a part into it. Small things. Yeah. What's the other model that you posted today that was on Replicate? Video or OneLive?[00:46:08] swyx: Yeah. Very, very nondescript name, but it is from Minimax, which I think is a Chinese lab. The Chinese labs do surprisingly well at the video models.[00:46:20] swyx: I'm not sure it's actually Chinese. I don't know. Hold me up to that. Yep. China. It's good. Yeah, the Chinese love video. What can I say? They have a lot of training data for video. Or a more relaxed regulatory environment.[00:46:37] Alessio: Uh, well, sure, in some way. Yeah, I don't think there's much else there. I think like, you know, on the image side, I think it's still open.[00:46:45] Alessio: Yeah, I mean,[00:46:46] swyx: 11labs is now a unicorn. So basically, what is multi modality war? Multi modality war is, do you specialize in a single modality, right? Or do you have GodModel that does all the modalities? So this is [00:47:00] definitely still going, in a sense of 11 labs, you know, now Unicorn, PicoLabs doing well, they launched Pico 2.[00:47:06] swyx: 0 recently, HeyGen, I think has reached 100 million ARR, Assembly, I don't know, but they have billboards all over the place, so I assume they're doing very, very well. So these are all specialist models, specialist models and specialist startups. And then there's the big labs who are doing the sort of all in one play.[00:47:24] swyx: And then here I would highlight Gemini 2 for having native image output. Have you seen the demos? Um, yeah, it's, it's hard to keep up. Literally they launched this last week and a shout out to Paige Bailey, who came to the Latent Space event to demo on the day of launch. And she wasn't prepared. She was just like, I'm just going to show you.[00:47:43] swyx: So they have voice. They have, you know, obviously image input, and then they obviously can code gen and all that. But the new one that OpenAI and Meta both have but they haven't launched yet is image output. So you can literally, um, I think their demo video was that you put in an image of a [00:48:00] car, and you ask for minor modifications to that car.[00:48:02] swyx: They can generate you that modification exactly as you asked. So there's no need for the stable diffusion or comfy UI workflow of like mask here and then like infill there in paint there and all that, all that stuff. This is small model nonsense. Big model people are like, huh, we got you in as everything in the transformer.[00:48:21] swyx: This is the multimodality war, which is, do you, do you bet on the God model or do you string together a whole bunch of, uh, Small models like a, like a chump. Yeah,[00:48:29] Alessio: I don't know, man. Yeah, that would be interesting. I mean, obviously I use Midjourney for all of our thumbnails. Um, they've been doing a ton on the product, I would say.[00:48:38] Alessio: They launched a new Midjourney editor thing. They've been doing a ton. Because I think, yeah, the motto is kind of like, Maybe, you know, people say black forest, the black forest models are better than mid journey on a pixel by pixel basis. But I think when you put it, put it together, have you tried[00:48:53] swyx: the same problems on black forest?[00:48:55] Alessio: Yes. But the problem is just like, you know, on black forest, it generates one image. And then it's like, you got to [00:49:00] regenerate. You don't have all these like UI things. Like what I do, no, but it's like time issue, you know, it's like a mid[00:49:06] swyx: journey. Call the API four times.[00:49:08] Alessio: No, but then there's no like variate.[00:49:10] Alessio: Like the good thing about mid journey is like, you just go in there and you're cooking. There's a lot of stuff that just makes it really easy. And I think people underestimate that. Like, it's not really a skill issue, because I'm paying mid journey, so it's a Black Forest skill issue, because I'm not paying them, you know?[00:49:24] Alessio: Yeah,[00:49:25] swyx: so, okay, so, uh, this is a UX thing, right? Like, you, you, you understand that, at least, we think that Black Forest should be able to do all that stuff. I will also shout out, ReCraft has come out, uh, on top of the image arena that, uh, artificial analysis has done, has apparently, uh, Flux's place. Is this still true?[00:49:41] swyx: So, Artificial Analysis is now a company. I highlighted them I think in one of the early AI Newses of the year. And they have launched a whole bunch of arenas. So, they're trying to take on LM Arena, Anastasios and crew. And they have an image arena. Oh yeah, Recraft v3 is now beating Flux 1. 1. Which is very surprising [00:50:00] because Flux And Black Forest Labs are the old stable diffusion crew who left stability after, um, the management issues.[00:50:06] swyx: So Recurve has come from nowhere to be the top image model. Uh, very, very strange. I would also highlight that Grok has now launched Aurora, which is, it's very interesting dynamics between Grok and Black Forest Labs because Grok's images were originally launched, uh, in partnership with Black Forest Labs as a, as a thin wrapper.[00:50:24] swyx: And then Grok was like, no, we'll make our own. And so they've made their own. I don't know, there are no APIs or benchmarks about it. They just announced it. So yeah, that's the multi modality war. I would say that so far, the small model, the dedicated model people are winning, because they are just focused on their tasks.[00:50:42] swyx: But the big model, People are always catching up. And the moment I saw the Gemini 2 demo of image editing, where I can put in an image and just request it and it does, that's how AI should work. Not like a whole bunch of complicated steps. So it really is something. And I think one frontier that we haven't [00:51:00] seen this year, like obviously video has done very well, and it will continue to grow.[00:51:03] swyx: You know, we only have Sora Turbo today, but at some point we'll get full Sora. Oh, at least the Hollywood Labs will get Fulsora. We haven't seen video to audio, or video synced to audio. And so the researchers that I talked to are already starting to talk about that as the next frontier. But there's still maybe like five more years of video left to actually be Soda.[00:51:23] swyx: I would say that Gemini's approach Compared to OpenAI, Gemini seems, or DeepMind's approach to video seems a lot more fully fledged than OpenAI. Because if you look at the ICML recap that I published that so far nobody has listened to, um, that people have listened to it. It's just a different, definitely different audience.[00:51:43] swyx: It's only seven hours long. Why are people not listening? It's like everything in Uh, so, so DeepMind has, is working on Genie. They also launched Genie 2 and VideoPoet. So, like, they have maybe four years advantage on world modeling that OpenAI does not have. Because OpenAI basically only started [00:52:00] Diffusion Transformers last year, you know, when they hired, uh, Bill Peebles.[00:52:03] swyx: So, DeepMind has, has a bit of advantage here, I would say, in, in, in showing, like, the reason that VO2, while one, They cherry pick their videos. So obviously it looks better than Sora, but the reason I would believe that VO2, uh, when it's fully launched will do very well is because they have all this background work in video that they've done for years.[00:52:22] swyx: Like, like last year's NeurIPS, I already was interviewing some of their video people. I forget their model name, but for, for people who are dedicated fans, they can go to NeurIPS 2023 and see, see that paper.[00:52:32] Alessio: And then last but not least, the LLMOS. We renamed it to Ragops, formerly known as[00:52:39] swyx: Ragops War. I put the latest chart on the Braintrust episode.[00:52:43] swyx: I think I'm going to separate these essays from the episode notes. So the reason I used to do that, by the way, is because I wanted to show up on Hacker News. I wanted the podcast to show up on Hacker News. So I always put an essay inside of there because Hacker News people like to read and not listen.[00:52:58] Alessio: So episode essays,[00:52:59] swyx: I remember [00:53:00] purchasing them separately. You say Lanchain Llama Index is still growing.[00:53:03] Alessio: Yeah, so I looked at the PyPy stats, you know. I don't care about stars. On PyPy you see Do you want to share your screen? Yes. I prefer to look at actual downloads, not at stars on GitHub. So if you look at, you know, Lanchain still growing.[00:53:20] Alessio: These are the last six months. Llama Index still growing. What I've basically seen is like things that, One, obviously these things have A commercial product. So there's like people buying this and sticking with it versus kind of hopping in between things versus, you know, for example, crew AI, not really growing as much.[00:53:38] Alessio: The stars are growing. If you look on GitHub, like the stars are growing, but kind of like the usage is kind of like flat. In the last six months, have they done some[00:53:4

god ceo new york amazon spotify time world ai europe google china apple vision pr voice future speaking san francisco new york times phd video thinking chinese simple data predictions elon musk surprise iphone impact chatgpt legal code tesla reflecting memory ga discord reddit busy cloud lgbt flash stem honestly pros ab jeff bezos windows excited researchers unicorns lower ip tackling sort survey insane tier cto whispers vc applications f1 doc signing seal fireworks openai gemini genie academic sf organizing nvidia ux api assembly davos frontier chrome makes scarlett johansson gpt ui mm turbo aws bash soda ml lama mosaic dropbox creative writing github drafting reinvent canvas 1b bolt apis lava ruler exact stripe pico hundred dev strawberry wwdc vcs flux sander vm bt 200k taiwanese arr sora moto opus llm gartner assumption sam altman google docs nemo parting blackwell google drive sombra gpu opa ramp tbd 3b agi elia elo 5b midjourney gnome estimates bytedance grok leopold perplexity ciso dota rag haiku dx sarah silverman coursera anthropic gpus sonnets cypher george rr martin getty quill cobalt deepmind sdks ilya noam future trends sheesh v2 ttc alessio lms satya r1 ssi 8b stack overflow veo rl emerging trends itc vo2 theoretically mistral suno xai sota replicate yi mcp black forest gpts inflection graphql aitor brain trust databricks chinchillas jensen huang adept ai models nosql grand central grand central station hacker news zep hacken ethical issues cosign ai news claud gpc distro lubna tpu o3 neo4j autogpt heygen o1 gbt jeremy howard gpd quent langchain 70b exa gradients 400b minimax loras neurips 128k jeff dean elos gemini pro cerebras code interpreter icml ai winter john franco lstm r1s aws reinvent muser latent space pypy nova pro dan gross paige bailey noam brown quiet capital john frankel

Halten, kaufen, verkaufen? Das Jahresfinale des Tech-Messias Pip

Deffner & Zschäpitz: Wirtschaftspodcast von WELT

Play Episode Listen Later Dec 3, 2024 106:38

Im Dezember wird es höchste Zeit, das Depot kritisch durchzuschauen und auch steuerlich zu optimieren. Tech-Ikone Pip Klöckner, der in dieser Woche den urlaubenden Deffner ersetzt, analysiert zusammen mit Zschäpitz die Welt der Tech-Werte und verrät, bei welchen Titeln Ihr besser Gewinne mitnehmt, Verluste realisiert oder weiter stoisch anspart. Weitere Themen: - Dax 20.000 Punkte – wie der Rekord im stagnierenden Deutschland zustande gekommen ist und was er für Sparer bedeuet - D-Day-Papier der FDP – Was das dilettantische Dokument und die mediale Berichterstattung für die Liberalen bedeutet - Angriff auf Nvidia – was die KI-Chip-Startups Groq, Cerebras, Tenstorrent können und wie Anleger mitmischen - Performance-Treiber Gym – warum der Deadlift-Index seit 2023 ganze 140 Prozent gemacht hat - Biden begnadigt eigenen Sohn – welche Folgen die umfassende Amnestie von Hunter Biden für die Demokratie hat - 10XDNA – wie der Fonds von Frank Thelen wirklich abgeschnitten hat DEFFNER & ZSCHÄPITZ sind wie das wahre Leben. Wie Optimist und Pessimist. Im wöchentlichen WELT-Podcast diskutieren und streiten die Journalisten Dietmar Deffner und Holger Zschäpitz über die wichtigen Wirtschaftsthemen des Alltags. Schreiben Sie uns an: wirtschaftspodcast@welt.de Impressum: https://www.welt.de/services/article7893735/Impressum.html Datenschutzerklärung: https://www.welt.de/services/article157550705/Datenschutzerklaerung-WELT-DIGITAL.html

#222 Andrew Feldman: How Cerebras Systems Is Disrupting AI Inference Technology

Eye On A.I.

Play Episode Listen Later Nov 28, 2024 42:26

This episode is sponsored by Shopify. Shopify is a commerce platform that allows anyone to set up an online store and sell their products. Whether you're selling online, on social media, or in person, Shopify has you covered on every base. With Shopify you can sell physical and digital products. You can sell services, memberships, ticketed events, rentals and even classes and lessons. Sign up for a $1 per month trial period at http://shopify.com/eyeonai In this episode of the Eye on AI podcast, Andrew D. Feldman, Co-Founder and CEO of Cerebras Systems, unveils how Cerebras is disrupting AI inference and high-performance computing. Andrew joins Craig Smith to discuss the groundbreaking wafer-scale engine, Cerebras' record-breaking inference speeds, and the future of AI in enterprise workflows. From designing the fastest inference platform to simplifying AI deployment with an API-driven cloud service, Cerebras is setting new standards in AI hardware innovation. We explore the shift from GPUs to custom architectures, the rise of large language models like Llama and GPT, and how AI is driving enterprise transformation. Andrew also dives into the debate over open-source vs. proprietary models, AI's role in climate mitigation, and Cerebras' partnerships with global supercomputing centers and industry leaders. Discover how Cerebras is shaping the future of AI inference and why speed and scalability are redefining what's possible in computing. Don't miss this deep dive into AI's next frontier with Andrew Feldman. Like, subscribe, and hit the notification bell for more episodes! Stay Updated: Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI (00:00) Intro to Andrew Feldman & Cerebras Systems (00:43) The rise of AI inference (03:16) Cerebras' API-powered cloud (04:48) Competing with NVIDIA's CUDA (06:52) The rise of Llama and LLMs (07:40) OpenAI's hardware strategy (10:06) Shifting focus from training to inference (13:28) Open-source vs proprietary AI (15:00) AI's role in enterprise workflows (17:42) Edge computing vs cloud AI (19:08) Edge AI for consumer apps (20:51) Machine-to-machine AI inference (24:20) Managing uncertainty with models (27:24) Impact of U.S.–China export rules (30:29) U.S. innovation policy challenges (33:31) Developing wafer-scale engines (34:45) Cerebras' fast inference service (37:40) Global partnerships in AI (38:14) AI in climate & energy solutions (39:58) Training and inference cycles (41:33) AI training market competition

HPC News Bytes – 20241118

@HPCpodcast with Shahin Khan and Doug Black

Play Episode Listen Later Nov 18, 2024

- Supercocmputing-24 conference starts today - TSMC, CHIPS Act, semiconductor demand - Sandia National Lab and Cerebras [audio mp3="https://orionx.net/wp-content/uploads/2024/11/HPCNB_20241118.mp3"][/audio] The post HPC News Bytes – 20241118 appeared first on OrionX.net.

learning ai deep chips physics quantum computing tsmc simulations chips act hpc supercomputing cerebras news bytes

TWIHPC Episode 373: Cerebras Races into Inference; AMD Adds ZT Systems to AI Portfolio

This Week in HPC

Play Episode Listen Later Nov 14, 2024 18:00

Addison Snell and Kevin Jackson analyze the top stories from the Hot Chips conference.

races adds inference kevin jackson hot chips cerebras addison snell

E160: Waymo $5.6b cap raise; Cerebras 2x in last few months in secondary market; Starlink selling out in Africa, biggest subscription biz ever?

This Week in Pre-IPO Stocks

Play Episode Listen Later Oct 29, 2024 35:11

Send us a text

africa selling raise subscription starlink waymo secondary market cerebras

World's Fastest AI Inference: A Conversation with SambaNova's Innovators

The Generative AI Meetup Podcast

Play Episode Listen Later Oct 25, 2024 63:25

This week, Shashank and Mark sit down with SambaNova Systems, a leading AI chip startup competing with tech giants like Nvidia and Cerebras. Joined by SambaNova's Director of Machine Learning, Urmish, and founding team member Raghu, they explore how SambaNova's reconfigurable data flow architecture is changing the game in AI inference and training. They discuss the company's unique hardware, fast inference capabilities, memory optimizations, and what the future holds for AI chip innovation. Learn what it takes to build high-performance AI systems and where the industry is headed next!

director ai conversations machine learning nvidia innovators fastest inference raghu shashank cerebras sambanova systems

Why you should sell your Nvidia stock right now! #notfinancialadvice

The Generative AI Meetup Podcast

Play Episode Listen Later Oct 20, 2024 104:00

In this week's episode, Shashank and Mark dive deep into Nvidia's current dominance in the AI hardware space and the potential challengers on the horizon. As companies like AMD, Cerebras, and SambaNova work to chip away at Nvidia's massive lead, is it time to reconsider your portfolio? Plus, we explore cutting-edge applications of generative AI in DNA analysis, gene therapy, and how AI agents could revolutionize everything from medical treatments to productivity tools.

ai dna stock nvidia amd shashank cerebras

MM #230: NVIDIA's Next Stock Move, Tesla's Robots, Bitcoin & The Republican Economic Plan with Byron Donalds

Market Mondays

Play Episode Listen Later Oct 15, 2024 117:43

In this episode of Market Mondays, we dive deep into a wide range of topics. We discuss the latest from SpaceX, Tesla's Robo Taxi day, and the Tesla Optimus robot. We explore whether owning all of Elon Musk's companies could be a smart hedge for the future and the current outlook for Bitcoin this month. We analyze why Genmab's stock hasn't moved despite strong fundamentals, and the incredible rise of ADMA stock—will it keep climbing or has it peaked? We also break down Nvidia's DGX B200, the impact of AMD's new chip, and consider if Cerebras can truly compete with NVIDIA. For long-term investors, we talk about stocks like VNO, and the potential of TSM through earnings, as well as the future of XRP and the comparison between Amazon and Mercado Libre.With the uncertainty in today's world, what does the market outlook look like for the end of the year? We also discuss potential entry points for Boeing, the long-term prospects for IONQ, and Microsoft's earnings. We wrap up with a look at hotel stocks like Marriott and Hilton, as well as the DOJ's move against Google and what it means for the stock.We also had a special guest, Congressman Byron Donalds, who shared insights on the Republican economic plan, Trump's election, black male voters, and more.#MarketMondays #Tesla #SpaceX #Bitcoin #ElonMusk #Stocks #Investing #Nvidia #AMD #CongressmanByronDonalds #Economy #Google #StockMarket #Microsoft #Boeing #TSMSupport this podcast at — https://redcircle.com/marketmondays/donationsAdvertising Inquiries: https://redcircle.com/brandsPrivacy & Opt-Out: https://redcircle.com/privacy

#185 - Movie Gen, ChatGPT Canvas, OpenAI's VC Round, SB 1047 Vetoed

Let's Talk AI

Play Episode Listen Later Oct 12, 2024 89:37 Transcription Available

Our 185th episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov and guest host Gavin Purcell from the AI for Humans podcast. Read out our text newsletter and comment on the podcast at https://lastweekin.ai/. If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form. Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai In this episode: Meta's MovieGen introduces innovative features in AI video generation, alongside OpenAI's real-time speech API and expanded ChatGPT capabilities. Mio's foundation model and Apple's Depth Pro enhance multimodal AI inputs and precise 3D imaging for AR, VR, and robotics. Microsoft and OpenAI's strategic advancements highlight significant financial moves and AI enhancements, including Microsoft's enhanced Copilot. AI policy discussions intensify as California's vetoed bill sparks debates on regulation, alongside Google's $1 billion investment to expand AI infrastructure in Thailand. Timestamps + Links: (00:00:00) Intro / Banter (00:02:51) Response to listener comments / corrections Tools & Apps(00:03:48) Meta announces Movie Gen, an AI-powered video generator (00:14:28) OpenAI launches new ‘Canvas' ChatGPT interface tailored to writing and coding projects (00:19:31) OpenAI's DevDay brings Realtime API and other treats for AI app developers (00:24:43) Black Forest Labs releases Flux 1.1 Pro and an API (00:28:30) Microsoft gives Copilot a voice and vision in its biggest redesign yet (00:32:36) Pika 1.5 is now live — AI video generator just got major upgrades Applications & Business(00:37:49) OpenAI closes the largest VC round of all time (00:45:23) Google brings ads to AI Overviews as it expands AI's role in search (00:51:05) Anthropic hires OpenAI co-founder Durk Kingma (00:51:49) OpenAI's newest creation is raising shock, alarm, and horror among staffers: a new logo (00:53:45) Waymo to add Hyundai EVs to robotaxi fleet under new multiyear deal (00:57:28) Cerebras, an A.I. Chipmaker Trying to Take On Nvidia, Files for an I.P.O. (00:59:18) Y Combinator is being criticized after it backed an AI startup that admits it basically cloned another AI startup Research & Advancements(01:03:30) Were RNNs All We Needed? (01:06:52) MIO: A Foundation Model on Multimodal Tokens (01:09:20) Apple releases Depth Pro, an AI model that rewrites the rules of 3D vision Policy & Safety(01:13:08) California Governor Vetoes Sweeping A.I. Legislation (01:18:02) Judge blocks California's new AI law in case over Kamala Harris deepfake Musk reposted (01:20:41) Google to invest $1 billion in Thailand to build a data center and accelerate AI growth Synthetic Media & Art(01:21:58) AI reading coach startup Ello now lets kids create their own stories (01:25:13) Outro

E156: Monzo valued at $5.9B following tender offer; OpenAI partners with Hearst; Maven Clinic raises $125M, valued at $1.7B; ByteDance launches AI earbuds, valued at $295B; Cerebras faces IPO delay amid US national security review; Sierra in talks for $4B

This Week in Pre-IPO Stocks

Play Episode Listen Later Oct 11, 2024 7:24

Send us a text

offer partners faces amid delay openai launches raises tender valued bytedance hearst earbuds monzo 125m us national security maven clinic cerebras

Chip-ing away at Nvidia's Dominance? | The Brainstorm EP 64

FYI - For Your Innovation

Play Episode Listen Later Oct 9, 2024 13:19

Are there any chip makers that can compete with Nvidia? This week, Next Generation Internet Director of Research Frank Downing, and Autonomous Technology and Robotics Director of Research Sam Korus discuss the S1 filing of Cerebras and what that might mean for the AI hardware space.If you know ARK, then you probably know about our long-term research projections, like estimating where we will be 5-10 years from now! But just because we are long-term investors, doesn't mean we don't have strong views and opinions on breaking news. In fact, we discuss and debate this every day. So now we're sharing some of these internal discussions with you in our new video series, “The Brainstorm”, a co-production from ARK and Public.com. Tune in every week as we react to the latest in innovation. Here and there we'll be joined by special guests, but ultimately this is our chance to join the conversation and share ARK's quick takes on what's going on in tech today.Key Points From This Episode:00:00:00 Intro00:00:52 An Overview of Cerebras, and Their Plans to IPOFor more updates on Public.com:Website: https://public.com/YouTube: @publicinvestTwitter: https://twitter.com/public The Rundown: https://podcasts.apple.com/us/podcast/the-rundown/id1726048251To learn more about ARK: https://arkinv.st/ARKInvest For more updates, follow us on:- Twitter: https://arkinv.st/Twitter- LinkedIn: https://arkinv.st/LinkedIn- Facebook: https://arkinv.st/Facebook- Instagram: https://arkinv.st/InstagramDisclosure: http://arkinv.st/39rzF94

ai public chip ark nvidia dominance brainstorm s1 rundown cerebras autonomous technology

【天下零時差10.07.24】台新併新光，決戰臨時股東會的三大變數；AI晶片新秀Cerebras會撼動輝達龍頭地位嗎？；鴻海科技日登場，今年亮點是什麼？

聽天下：天下雜誌Podcast

Play Episode Listen Later Oct 6, 2024 6:37

週一天下零時差關注以下財經大事：一、台新金併購新光金，這禮拜臨時股東會決戰，還有三大變數二、AI晶片新秀Cerebras準備上市，會撼動輝達龍頭地位嗎？三、鴻海科技日登場，今年亮點是什麼？文：伍芬婕、鄧凱元、辜樹仁製作團隊：黃柏維＊立即下載天下雜誌App，享受更好的閱聽體驗：https://bit.ly/3PqHlNc ＊訂閱天下全閱讀：https://bit.ly/3STpEpV ＊意見信箱：bill@cw.com.tw -- Hosting provided by SoundOn

hosting soundon cerebras

E154: OpenAI secures $6.6B round, expands capabilities; Cerebras files for IPO with 220% revenue growth; Anthropic hires former OpenAI co-founder, valued at $25.2B; Flexport restructures to improve profitability, valuation plummets; Epic Games files antit

This Week in Pre-IPO Stocks

Play Episode Listen Later Oct 5, 2024 9:28

Send us a textSubscribe to AG Dillon Pre-IPO Stock Research at agdillon.com/subscribe;- Wednesday = secondary market valuations, revenue multiples, performance, index fact sheets- Saturdays = pre-IPO news and insights, webinar replays00:07 | OpenAI Secures $6.6B Round, Expands Capabilities- Introduced real-time voice assistant capabilities through API for businesses- New API features and developer events announced to engage 3M+ developers- Workforce doubled to 1,700 employees from 770 in November 2023- Raised $6.6B funding round at $157B valuation (primary round)- Secured $4B revolving credit line with potential to expand to $10B in liquidity01:31 | Cerebras Files for IPO with 220% Revenue Growth- Specializes in AI hardware, particularly wafer-scale chips for AI training- Revenue of $136.4M in H1 2024 vs. $78.7M for full year 2023- Faces customer concentration risk with 87% of H1 2024 revenue from G42- Raised $715M in venture capital, valuation at $6.7B (secondary), up 124% since June 202403:11 | Anthropic Hires Former OpenAI Co-Founder, Valued at $25.2B- Added Durk Kingma, co-founder of OpenAI, to its growing talent pool- Continues attracting top talent from OpenAI and other tech giants- Positioned as a leader in responsible AI development under CEO Dario Amodei- Secondary market valuation: $25.2B (+40% vs Jan 2024), rumored to be raising at $40B04:09 | Flexport Restructures to Improve Profitability, Valuation Plummets- Logistics tech company plans to sublease warehouse space, integrate sales teams- Faced layoffs (20% in October, 15% in January) and loss of key customers like Crocs- Aiming for global expansion by 2027 with an asset-light model- Secondary market valuation: $1.95B (-75% vs Jan 2022 primary round)05:15 | Epic Games Files Antitrust Lawsuit Against Google and Samsung- Filed lawsuit over Samsung's "Auto Blocker" feature, alleging it blocks third-party app stores- Claims Google Play's dominance is reinforced by Samsung's preinstalled feature- Lawsuit follows mixed success in earlier legal battles against Google and Apple- Secondary market valuation: $16.6B (-26% vs Feb 2024 round)06:20 | CoreWeave Raises $421M, Valued at $23B- Cloud provider specializing in Nvidia GPUs for AI computing- Raised $421M in Series B, backed by Nvidia, Magnetar Capital, Blackstone- Secured $2.3B in debt financing in August 2024- Appointed former Google Cloud VP of finance as CFO, signaling potential IPO- Currently raising a primary round at a $23B valuation07:34 | Pre-IPO Stock Market Weekly Performance- agdillon.com/subscribe to receive weekly pdf report in your inbox- Pre-IPO +0.77% for week, +67.25% for last 1yr- Up week: Synk +17.1%, OpenAI +13.7%, Canva +6.9%, Rippling +5.9%, Notion +5.3%- Down week: xAI -5.3%, CoreWeave -2.2%, Cohere -1.7%, Epic Games -1.6%, Neuralink -0.6%- Top valuations: ByteDance $301b, SpaceX $229b, OpenAI $157b, Stripe $84b, Databricks $46b08:20 | Pre-IPO Stock Vintage Index Weekly Performance- agdillon.com/subscribe to receive weekly pdf report in your inbox- 2024 Vintage Index top contributors since inception: Revolut +201%, Rippling +113%, Anduril +76%, OpenAI +56%, Klarna +40% … the 2024 Vintage Index is up 63% since its inception, or year to date 2024- Key metric averages for all Vintage Indexes 5 years old or older…472% cumulative return since inception58% realized, distributed to investors5.72 TVPI; 3.31 DPI, 2.41 RVPI4.1 years to return the fund

An AI IPO, 20 Years of RB

Motley Fool Money

Play Episode Listen Later Oct 4, 2024 39:52

Cerebras is approaching chipmaking differently, can it carve out a space for itself in an industry of titans? (00:45) Asit Sharma and Jason Moser discuss: The dock workers strike, its daily cost, and the industries it could impact most. Upcoming AI chip IPO Cerebras, and how the company is approaching high-performance chips differently than the competition. Fresh earnings from: Nike, Paychex, and McCormick. (19:04) October 2024 marks 20 years of Rule Breakers at The Motley Fool. To celebrate, we're airing a portion of a conversation with David and former Rule Breakers analyst Matt Argersinger from our premium Epic Opportunities podcast. David fielded questions from our investing team about his own investing process, reflected on his 6 traits of a Rule Breaker and the companies that the framework led him to follow. (35:56) Jason and Asit break down two stocks on their radar: Pepsico and Joby Aviation. Stocks discussed: NKE, PAYX, MCK, PEP, JOBY Motley Fool Epic members can access the full conversation with David: Here on the TMF site (login required) On Spotify here after linking their accounts Host: Dylan Lewis Guests: Jason Moser, Asit Sharma, David Gardner, Rick Engdahl Engineers: Rick Engdahl, Austin Morgan Learn more about your ad choices. Visit megaphone.fm/adchoices

spotify stocks pepsico pep mccormick motley fool rule breakers rule breaker paychex tmf david gardner joby aviation mck nke cerebras jason moser payx

AI's Billion-Dollar Moves and Blockchain Shakeups: Navigating Market Shifts

Digital Currents

Play Episode Listen Later Oct 4, 2024 50:26

In this episode, we explored OpenAI's monumental $157 billion valuation following its $6.6 billion fundraising and the strategic implications of restructuring its governance model. We also cover Cerebras' IPO filing as the AI chipmaker prepares to challenge Nvidia's dominance in the market. On the blockchain front, we discussed the latest developments in the Bitcoin market and how short-term holders can react to global geopolitical tensions. Finally, we reviewed trends in high-performance computing and AI infrastructure. Remember To Stay Current! To learn more, visit us on the web at https://www.morgancreekcap.com/morgan-creek-digital. To speak to a team member or sign up for additional content, please email mcdigital@morgancreekcap.com

ai navigating market bitcoin blockchain shifts ipo openai nvidia shakeup billion dollar cerebras

Tue. 10/01 – Running This Pod Through NotebookLM

Techmeme Ride Home

Play Episode Listen Later Oct 1, 2024 32:10

Do we have the first IPO of the AI era? Do we have the first AI model beyond the transformer architecture? Microsoft has a bunch of new AI tools inside Windows. We try to explain that whole controversy around PearAI. And what about that NotebookLM feature that lets you create a two-hander podcast out of any text.Links:AI chipmaker Cerebras files for IPO to take on Nvidia (CNBC)MIT spinoff Liquid debuts non-transformer AI models and they're already state-of-the-art (VentureBeat)Microsoft Copilot can now read your screen, think deeply, and speak aloud to you (TechCrunch)Oura Nears $500 Million in Annual Revenue and Readies New Ring (Bloomberg)Y Combinator is being criticized after it backed an AI startup that admits it basically cloned another AI startup (TechCrunch)NotebookLM's automatically generated podcasts are surprisingly effective (Simon Willison's Blog)See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

ai running microsoft windows ipo liquid notebooklm annual revenue cerebras simon willison

EP-119 11x.ai's $50m Boost

AI Briefing Room

Play Episode Listen Later Oct 1, 2024 2:43

welcome to wall-e's tech briefing for tuesday, october 1! uncover the latest tech developments: 11x.ai's $50m series b funding: 11x.ai, developers of ai sales representatives, garners $50 million in series b funding led by andreessen horowitz, now valued at approximately $350 million. google's $1b investment in thailand: google commits $1 billion to establish a new data center in chonburi, enhancing thailand's cloud infrastructure and digital economy. cerebras systems ipo: ai chip startup cerebras systems files for an initial public offering to challenge industry giant nvidia with its advanced wse-3 chip. y combinator & pearai controversy: y combinator faces backlash for funding pearai, an ai startup accused of cloning code from another ai coding editor, leading to scrutiny over vc due diligence practices. tune in for more tech insights tomorrow!

google dreams thailand boost expansion cerebras

20VC: Benchmark's Eric Vishria on Where is the Value in AI: Chips, Models or Apps | Why Nvidia Will Not Be The Only Game in Town | The Commoditisation of Foundation Models | Which AI Apps Have Sustaining Value vs Hype and Short Term Revenue

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Sep 25, 2024 62:11

Eric Vishria is a General Partner @ Benchmark Capital, one of the world's leading venture firms. At Benchmark, Eric has served on over 10 boards including Confluent (CFLT), Amplitude (AMPL), Benchling, Contentful, Cerebras and several other private companies. Prior to joining Benchmark, Eric was the Co‐Founder and CEO of RockMelt, acquired by Yahoo in 2013. In Today's Episode with Eric Vishria We Discuss: 1. How to Make Money Investing in AI Today: How does Eric think through where value will accrue in the stack between chips, models and applications? Why does Eric believe foundation models are the fastest commoditising asset in history? Why does Eric believe that Nvidia will not be the only game in town in the next 3-5 years? 2. How to Invest in AI Application Layer Successfully: How does Eric analyse between a standalone and deep product vs a product that foundation model will commodities and incorporate into their feature set? How does Eric differentiate between the 10 different players all going after customer service, or sales tools or data analyst products etc? How does Eric analyse the quality of revenue of these AI application layer companies? What does he mean when he describes their revenue as "sugar high"? 3. How the Best VC Firm Makes Decisions: What is the decision-making process for all new deals in Benchmark? As specifically as possible, how does the voting process inside Benchmark work? What deal was the most contentious deal that went through? What did the partnership learn? How has the Benchmark decision-making process changed over 10 years? 4. Does AI Break Venture Capital Models: Does the price of AI deals and size of their rounds break the Benchmark model? Will foundation model companies all be acquired by the larger cloud providers? Unless multiples reflate in the public markets, does venture as an asset class have hope? Why does AI make paying ludicrously high prices potentially rational?

ceo game ai business co founders investing invest hype apps revenue yahoo chips nvidia sustaining short term in today benchmark foundation models contentful cerebras rockmelt

#181 - Google Chatbots, Cerebras vs Nvidia, AI Doom, ElevenLabs Controversy

Let's Talk AI

Play Episode Listen Later Sep 15, 2024 137:06 Transcription Available

Our 181st episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov and Jeremie Harris Read out our text newsletter and comment on the podcast at https://lastweekin.ai/ If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form. Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai In this episode: - Google's AI advancements with Gemini 1.5 models and AI-generated avatars, along with Samsung's lithography progress. - Microsoft's Inflection usage caps for Pi, new AI inference services by Cerebrus Systems competing with Nvidia. - Biases in AI, prompt leak attacks, and transparency in models and distributed training optimizations, including the 'distro' optimizer. - AI regulation discussions including California's SB1047, China's AI safety stance, and new export restrictions impacting Nvidia's AI chips. Timestamps + Links: (00:00:00) Intro / Banter (00:03:08)Response to listener comments / corrections Tools & Apps(00:09:19) Google's custom AI chatbots have arrived (00:12:52) Google releases three new experimental AI models (00:17:14) Google Gemini will let you create AI-generated people again (00:22:32) Five months after Microsoft hired its founders, Inflection adds usage caps to Pi (00:26:42:) Plaud takes a crack at a simpler AI pin Applications & Business(00:30:31) Cerebras Systems throws down gauntlet to Nvidia with launch of ‘world's fastest' AI inference service (00:41:06) Nvidia announces $50 billion stock buyback (00:46:24) OpenAI in talks to raise funding that would value it at more than $100 billion (00:50:44) OpenAI Aims to Release New AI Model, ‘Strawberry,' in Fall (00:52:53) 3 Co-Founders Leave French AI Startup H Amid ‘Operational Differences' (00:57:29) Samsung to Adopt High-NA Lithography Alongside Intel, Ahead of TSMC (01:02:11) Unitree's $16,000 G1 could become the first mainstream humanoid robot Projects & Open Source(01:04:59) Meta leads open-source AI boom, Llama downloads surge 10x year-over-year (01:09:08) A_Preliminary_Report_on_DisTrO. Research & Advancements(01:13:56) Diffusion Models Are Real-Time Game Engines (01:23:18) LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet (01:32:21) Interviewing AI researchers on automation of AI R&D (01:40:33) Anthropic releases AI model system prompts, winning praise for transparency Policy & Safety(01:47:12) U.S. AI Safety Institute Signs Agreements Regarding AI Safety Research, Testing and Evaluation With Anthropic and OpenAI (01:50:46) China's Views on AI Safety Are Changing—Quickly (01:56:27) Poll: 7 in 10 Californians Support SB1047, Will Blame Governor Newsom for AI-Enabled Catastrophe if He Vetoes (02:01:31) Elon Musk voices support for California bill requiring safety tests on AI models (02:03:55) Chinese Engineers Reportedly Accessing NVIDIA's High-End AI Chips Through Decentralized “GPU Rental Services” (02:08:25) U.S. gov't tightens China restrictions on supercomputer component sales Synthetic Media & Art(02:11:13) Actors Say AI Voice-Over Generator ElevenLabs Cloned Likenesses (02:14:06) Outro

AI at the speed of thought

The Generative AI Meetup Podcast

Play Episode Listen Later Sep 6, 2024 42:35

In this week's episode of the Generative AI Meetup Podcast, join hosts Mark and Shashank as they delve into the groundbreaking developments from Cerebras. Discover how their new inference solution is setting new benchmarks for speed and efficiency in AI model responses. The episode features insights into the technology behind these advancements and explores potential impacts on various industries. Tune in to get a glimpse of the future of AI, made faster and more accessible than ever.

ai discover speed shashank cerebras

Cerebras Expands by Pushing AI Inference Limits

Tech Disruptors

Play Episode Listen Later Sep 5, 2024 40:20

A race to deliver the fastest AI system is emerging, resulting in a crop of new companies with innovative approaches to AI processing. Cerebras returns to the Tech Disruptors podcast studios to discuss the broadening AI market opportunity for its wafer scale engine (WSE) chip. CEO Andrew Feldman sits down with Bloomberg Intelligence Senior Hardware analyst Woo Jin Ho to discuss the future evolution of AI compute and how Cerebras aims to leverage the WSE-3 processor to unlock the $40 billion inference market by delivering AI responses 20x faster at one-fifth price of hyperscale cloud.

ai limits expands inference cerebras andrew feldman wse

Nuevas Herramientas IA Ideogram, Grok 2 y Cerebras

Inteligencia Artificial

Play Episode Listen Later Sep 4, 2024

En el episodio más reciente del podcast «Inteligencia Artificial», analizamos las últimas innovaciones en el mundo de la IA. Hoy exploramos tres herramientas clave que están transformando el campo de la inteligencia artificial: Ideogram v2, Grok 2, y Cerebras. Si estás interesado en el presente y futuro de la IA, sigue leyendo para conocer las […] Origen

ia nuevas inteligencia artificial herramientas origen grok cerebras ideogram

Що з Дуровим та Telegram | Ріст ІТ-експорту | Notion припиняє роботу в росії — DOU News #162

DOU Podcast

Play Episode Listen Later Sep 2, 2024 20:41

uber discord telegram nvidia gems notion klarna anthropic dou cerebras jenko

Wed. 08/28 – Strawberry Emoji

Techmeme Ride Home

Play Episode Listen Later Aug 28, 2024 16:56

I explain why everyone has been posting strawberries in AI circles. It's cause of a potential new breakthrough at OpenAI. Cerebras launches the first new AI chip competition to Nvidia. China has reportedly burrowed into US ISPs. And continuing interesting details pouring out of that Pavel Durov situation.Sponsors:Timeline.com/rideLinks:OpenAI Shows ‘Strawberry' AI to the Feds and Uses It to Develop ‘Orion' (The Information)OpenAI Races to Launch ‘Strawberry' Reasoning AI to Boost Chatbot Business (The Information)Cerebras Systems throws down gauntlet to Nvidia with launch of ‘world's fastest' AI inference service (SiliconAngle)Chinese government hackers penetrate U.S. internet providers to spy (Washington Post)Google Meet's automatic AI note-taking is here (The Verge)Instagram adds what photos have always needed: words (The Verge)Telegram Founder Was Wooed and Targeted by Governments (WSJ)Can Tech Executives Be Held Responsible for What Happens on Their Platforms? (NYTimes)See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

ai china openai nvidia feds targeted emoji strawberry pavel durov cerebras

10th Anniversary Podcast! How Nvidia and AI are Fueling the Stock Market

Be Wealthy & Smart

Play Episode Listen Later May 27, 2024 12:13

My 10th Anniversary Podcast! Discover how Nvidia and AI are fueling the stock market. Are you investing well for financial freedom...or not? Financial freedom is a combination of money, compounding and time (my McT Formula). How well you invest, makes a huge difference to your financial future and lifestyle. If you only knew where to invest for the long-term, what a difference it would make, because the difference between investing $100k and earning 2% or 10% on your money over 30 years, is the difference between it growing to $181,136 or $1,744,940, an increase of over $1.5 million dollars. Your compounding rate, and how well you invest, matters! INTERESTED IN THE BE WEALTHY & SMART VIP EXPERIENCE? -Asset allocation model with ticker symbols and % to invest -Monthly investing webinars with Linda -Private Facebook group with daily insights -Weekly stock market commentary email -Lifetime access -US and foreign investors, no minimum $ amount required Extending the special offer, enjoy a 50% savings on the VIP Experience by using promo code "SAVE50". More information is here. If you would like a complimentary consultation with Linda to answer your questions about the VIP Experience, set an appointment here. WANT TO INVEST IN STOCKS PRE-IPO? #Ad Invest in the same private companies like some billionaires. If you are an Accredited Investor (must have $1 million of net worth excluding your primary residence or $200k income or $300k joint income, or be a registered representative), you qualify to invest in over 50 private companies. Minimum investment is $2,500. Sign up to receive a $500 credit toward your investment from Linqto, here: https://www.linqto.com/signup?r=e9tdhbl49v PLEASE REVIEW THE PODCAST ON ITUNES If you enjoyed this episode, please subscribe and leave a review. I love hearing from you! I so appreciate it! SUBSCRIBE TO BE WEALTHY & SMART Click Here to Subscribe Via iTunes Click Here to Subscribe Via Stitcher on an Android Device Click Here to Subscribe Via RSS Feed PLEASE LEAVE A BOOK REVIEW FOR THE CRYPTO INVESTING BOOK Get my book, "3 Steps to Quantum Wealth: The Wealth Heiress' Guide to Financial Freedom by Investing in Cryptocurrencies". After you purchase the book, go here for your Crypto Book bonus: https://lindapjones.com/bookbonus PLEASE LEAVE A BOOK REVIEW FOR WEALTH BOOK Leave a book review on Amazon here. Get my book, “You're Already a Wealth Heiress, Now Think and Act Like One: 6 Practical Steps to Make It a Reality Now!” Men love it too! After all, you are Wealth Heirs. :) Available for purchase on Amazon. International buyers (if you live outside of the US) get my book here. WANT MORE FROM LINDA? Check out her programs. Join her on Instagram. WEALTH LIBRARY OF PODCASTS Listen to the full wealth library of podcasts from the beginning. Use the search bar in the upper right corner of the page to search topics. SPECIAL DEALS #Ad Protect yourself online with a Virtual Private Network (VPN). Get 3 MONTHS FREE when you sign up for a NORD VPN plan here: https://ref.nordvpn.com/PjngPVgYXBs #Ad To safely and securely store crypto, I recommend using a Tangem wallet. Get a 10% discount when you purchase here: Https://tangem.com/en/?promocode=767FCF #Ad If you are looking to simplify your crypto tax reporting, use Koinly. It is highly recommended and so easy for tax reporting. You can save $20, click here. Be Wealthy & Smart,™ is a personal finance show with self-made millionaire Linda P. Jones, America's Wealth Mentor.™ Learn simple steps that make a big difference to your financial freedom. (Some links are affiliate links. There is no additional cost to you.)

america amazon ai guide men discover international investing financial lifetime cryptocurrency financial freedom asset stock market nvidia minimum private equity fueling extending practical steps 10th anniversary make it accredited investors virtual private networks vip experience save50 wealth mentor cerebras act like one linda p jones koinly tangem reality now linqto wealth heiress be wealthy smart

Powering AI with the World's Largest Computer Chip with Joel Hestness - #684

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later May 13, 2024 55:06

Today we're joined by Joel Hestness, principal research scientist and lead of the core machine learning team at Cerebras. We discuss Cerebras' custom silicon for machine learning, Wafer Scale Engine 3, and how the latest version of the company's single-chip platform for ML has evolved to support large language models. Joel shares how WSE3 differs from other AI hardware solutions, such as GPUs, TPUs, and AWS' Inferentia, and talks through the homogenous design of the WSE chip and its memory architecture. We discuss software support for the platform, including support by open source ML frameworks like Pytorch, and support for different types of transformer-based models. Finally, Joel shares some of the research his team is pursuing to take advantage of the hardware's unique characteristics, including weight-sparse training, optimizers that leverage higher-order statistics, and more. The complete show notes for this episode can be found at twimlai.com/go/684.

ai computers chip largest aws ml powering gpus pytorch cerebras wse

Laura Dyrda, Vice President, Editor-in-Chief at Becker's Healthcare

Becker’s Healthcare Podcast

Play Episode Listen Later May 8, 2024 8:26

In this episode, regular guest Laura Dyrda, Vice President, Editor-in-Chief at Becker's Healthcare discusses Mayo Clinic selecting Cerebras as its first generative AI collaborator for large-scale, domain-specific AI capabilities for more personalized diagnosis and treatment plans and Epic's plans to roll out the early stages of "Best Care Choices for My Patient" in 2024

ai vice president chief healthcare epic becker mayo clinic cerebras

Wed. 03/13 – TikTok On The Brink?

Techmeme Ride Home

Play Episode Listen Later Mar 13, 2024 16:34

The TikTok legislation has passed the House, but it's path through the Senate is uncertain to say the least. The first real AI regulation has passed, in Europe, of course. Arm's new chips for self-driving cars. Did Cerebras just break Moore's Law with its new AI chips? Spotify has music videos. And Perplexity continues to try to become Google Search faster than Google search can become them.Sponsors:Shopify.com/rideNutrafol.com/men code: RIDEHOMELinks:TikTok bill, racing toward House passage, faces a minefield in the Senate (Washington Post)How TikTok Was Blindsided by U.S. Bill That Could Ban It (WSJ)World's first major act to regulate AI passed by European lawmakers (CNBC)Stripe in ‘no rush' to go public as cash flow turns positive (FT)Arm unveils first chip design to power self-driving cars (FT)AI startup Cerebras unveils the WSE-3, the largest chip yet for generative AI (ZDNet)Spotify adds music videos in some countries (TechCrunch)Perplexity brings Yelp data to its chatbot (The Verge)See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

spotify tiktok ai europe google house law european senate arm brink yelp google search cerebras wse

Podcasts about cerebras

Best podcasts about cerebras

FYI - For Your Innovation

This Week in Pre-IPO Stocks

Techmeme Ride Home

Let's Talk AI

The Generative AI Meetup Podcast

Be Wealthy & Smart

Latent Space: The AI Engineer Podcast â€” CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Eye On A.I.

Becker’s Healthcare Podcast

Latest news about cerebras

Latest podcast episodes about cerebras

ThursdAI - Oct 30 - From ASI in a Decade to Home Humanoids: MiniMax M2's Speed Demon, OpenAI's Bold Roadmap, and 2026 Robot Revolution

E233: xAI $20B GPU financing structure nears close; Reflection AI raises $2B round builds open-source frontier lab; n8n $180M series C drives 7x valuation jump to $2.5B; Base Power $1B raise targets 200k home batteries by 2027; ICE rakes Polymarket stake

OpenAI als Plattform, Bubble Time & Peak Social Media #499

20VC: Cerebras CEO on Why Raise $1BN and Delay the IPO | NVIDIA Showing Signs They Are Worried About Growth | Concentration of Value in Mag7: Will the AI Train Come to a Halt | Can the US Supply the Energy for AI with Andrew Feldman

First, They Ignore You, Then They Laugh, Then They Fight…Then You Win

E232: Wealthfront's robo-advisor IPO sprint: $88B AUM, $194M profits; OpenAI's instant checkout: Etsy surge 16%, taps 700M users; OpenAI H1 2025: $4.3B sales vs $2.5B burn, breakeven by 2026?; Cerebras' $1.1B pre-IPO: $8.1B val, Q2 rev 11x YoY

Kfz-Zulassung & Kernfusion | Yann LeCun vs. Meta Research #498

Stocks Shrug Off Looming Government Shutdown

CommanderAI building Salesforce for the waste management industry, also, a year after filing to IPO, still-private Cerebras Systems raises $1.1B

Nvidia Challenger Cerebras' Funding, TikTok Update, Navan's Signal About Tech IPOs | Sep 22, 2025

"Cherry Blossoms" Bloom of IPO Spring: Klarna Debut, Trends & IPOs to Come

RAISE Summit 2025 : le grand débrief !

The Roaring 20's Chip Wars—A Landmark Moment In The AI Boom, Julie Choi - CMO Cerebras Systems

Beyond Big Chips: Cerebras on Inference and AI

Bittensor's Rise, Meta's Llama Goes Cloud, & AI Now Writes Your Code | E2119

HPC News Bytes – 20250407

20VC: AI Chip Wars: How Cerebras Plans to Topple NVIDIA's Dominance | Why We Have Not Reached Scaling Laws in AI | What Happens to the Cost of Inference | How We Underestimate China and Shouldn't Sell To Them with Andrew Feldman

Beyond GPUs: Cerebras' Wafer-Scale Engine for Lightning-Fast AI Inference

Episode 506: Put It On Ice

AI Disruption: DeepSeek and Cerebras

Latent.Space 2024 Year in Review

Halten, kaufen, verkaufen? Das Jahresfinale des Tech-Messias Pip

#222 Andrew Feldman: How Cerebras Systems Is Disrupting AI Inference Technology

HPC News Bytes – 20241118

TWIHPC Episode 373: Cerebras Races into Inference; AMD Adds ZT Systems to AI Portfolio

E160: Waymo $5.6b cap raise; Cerebras 2x in last few months in secondary market; Starlink selling out in Africa, biggest subscription biz ever?

World's Fastest AI Inference: A Conversation with SambaNova's Innovators

Why you should sell your Nvidia stock right now! #notfinancialadvice

MM #230: NVIDIA's Next Stock Move, Tesla's Robots, Bitcoin & The Republican Economic Plan with Byron Donalds

#185 - Movie Gen, ChatGPT Canvas, OpenAI's VC Round, SB 1047 Vetoed

E156: Monzo valued at $5.9B following tender offer; OpenAI partners with Hearst; Maven Clinic raises $125M, valued at $1.7B; ByteDance launches AI earbuds, valued at $295B; Cerebras faces IPO delay amid US national security review; Sierra in talks for $4B

Chip-ing away at Nvidia's Dominance? | The Brainstorm EP 64

【天下零時差10.07.24】台新併新光，決戰臨時股東會的三大變數；AI晶片新秀Cerebras會撼動輝達龍頭地位嗎？；鴻海科技日登場，今年亮點是什麼？

E154: OpenAI secures $6.6B round, expands capabilities; Cerebras files for IPO with 220% revenue growth; Anthropic hires former OpenAI co-founder, valued at $25.2B; Flexport restructures to improve profitability, valuation plummets; Epic Games files antit

An AI IPO, 20 Years of RB

AI's Billion-Dollar Moves and Blockchain Shakeups: Navigating Market Shifts

Tue. 10/01 – Running This Pod Through NotebookLM

EP-119 11x.ai's $50m Boost

20VC: Benchmark's Eric Vishria on Where is the Value in AI: Chips, Models or Apps | Why Nvidia Will Not Be The Only Game in Town | The Commoditisation of Foundation Models | Which AI Apps Have Sustaining Value vs Hype and Short Term Revenue

#181 - Google Chatbots, Cerebras vs Nvidia, AI Doom, ElevenLabs Controversy

AI at the speed of thought

Cerebras Expands by Pushing AI Inference Limits

Nuevas Herramientas IA Ideogram, Grok 2 y Cerebras

Що з Дуровим та Telegram | Ріст ІТ-експорту | Notion припиняє роботу в росії — DOU News #162

Wed. 08/28 – Strawberry Emoji

10th Anniversary Podcast! How Nvidia and AI are Fueling the Stock Market

Powering AI with the World's Largest Computer Chip with Joel Hestness - #684

Laura Dyrda, Vice President, Editor-in-Chief at Becker's Healthcare

Wed. 03/13 – TikTok On The Brink?