Podcasts about GPUs

  • 1,148 PODCASTS
  • 2,513 EPISODES
  • 50m AVG DURATION
  • 1 DAILY NEW EPISODE
  • LATEST: May 28, 2025

POPULARITY (chart: 2017–2024)

Best podcasts about GPUs

Show all podcasts related to GPUs

Latest podcast episodes about GPUs

Marketplace
Can anyone compete with Nvidia?

May 28, 2025 · 25:29


Nvidia, as you probably know, makes chips — more specifically, GPUs, which are needed to power artificial intelligence systems. But as AI adoption ramps up, why does it feel like Nvidia's still the only chipmaker in the game? In this episode, why the California-based firm is, for now, peerless, and which companies may be angling to compete. Plus: Dwindling tourists worry American retailers, Dick's Sporting Goods sticks to its partly-sunny forecast and the share of single women as first-time homebuyers grows.

Every story has an economic angle. Want some in your inbox? Subscribe to our daily or weekly newsletter.

Marketplace is more than a radio show. Check out our original reporting and financial literacy content at marketplace.org — and consider making an investment in our future.

Marketplace All-in-One
Can anyone compete with Nvidia?

May 28, 2025 · 25:29


Nvidia, as you probably know, makes chips — more specifically, GPUs, which are needed to power artificial intelligence systems. But as AI adoption ramps up, why does it feel like Nvidia's still the only chipmaker in the game? In this episode, why the California-based firm is, for now, peerless, and which companies may be angling to compete. Plus: Dwindling tourists worry American retailers, Dick's Sporting Goods sticks to its partly-sunny forecast and the share of single women as first-time homebuyers grows.

Every story has an economic angle. Want some in your inbox? Subscribe to our daily or weekly newsletter.

Marketplace is more than a radio show. Check out our original reporting and financial literacy content at marketplace.org — and consider making an investment in our future.

Startup Project
How Chronosphere Solved Observability in Containerized Environments to Build a $1.6B Company | Uber Spin-out, 5x Cheaper & the Impact of AI on Observability | CEO Martin Mao | Startup Project #101

May 18, 2025 · 50:47


Martin Mao is the co-founder and CEO of Chronosphere, an observability platform built for the modern containerized world. Prior to Chronosphere, Martin led the observability team at Uber, tackling the unique challenges of large-scale distributed systems. With a background as a technical lead at AWS, Martin brings unique experience in building scalable and reliable infrastructure. In this episode, he shares the story behind Chronosphere, its approach to cost-efficient observability, and the future of monitoring in the age of AI.

What you'll learn:
- The specific observability challenges that arise when transitioning to containerized environments and microservices architectures, including increased data volume and new problem sources.
- How Chronosphere addresses the issue of wasteful data storage by providing features that identify and optimize useful data, ensuring customers only pay for valuable insights.
- Chronosphere's strategy for competing with observability solutions offered by major cloud providers like AWS, Azure, and Google Cloud, focusing on a specialized end-to-end product.
- The innovative ways in which Chronosphere's products, including their observability platform and telemetry pipeline, improve the process of detecting and resolving problems.
- How Chronosphere is leveraging AI and knowledge graphs to normalize unstructured data, enhance its analytics engine, and provide more effective insights to customers.
- Why targeting early adopters and tech-forward companies is beneficial for product innovation, providing valuable feedback for further improvements and new features.
- How observability requirements are changing with the rise of AI and LLM-based applications, and the unique data collection and evaluation criteria needed for GPUs.

Takeaways:
- Chronosphere originated from the observability challenges faced at Uber, where existing solutions couldn't handle the scale and complexity of a containerized environment.
- Cost efficiency is a major differentiator for Chronosphere, offering significantly better cost-benefit ratios compared to other solutions, making it attractive for companies operating at scale.
- The company's telemetry pipeline product can be used with existing observability solutions like Splunk and Elastic to reduce costs without requiring a full platform migration.
- Chronosphere's architecture is purposely single-tenant to minimize coupled infrastructure, ensuring reliability and continuous monitoring even when core components go down.
- AI-driven insights for observability may not benefit from LLMs trained on private business data, which can be diverse and may cause models to overfit to a specific case.
- Many tech-forward companies use the platform to monitor model training, which involves GPU clusters and evaluation criteria unlike general CPU workloads.
- The company found huge potential in scrubbing diverse data and building knowledge graphs to serve as a source of useful information when problems are recognized.

Subscribe to Startup Project for more engaging conversations with leading entrepreneurs!
→ Email updates: https://startupproject.substack.com/

#StartupProject #Chronosphere #Observability #Containers #Microservices #Uber #AWS #Monitoring #CloudNative #CostOptimization #AI #ArtificialIntelligence #LLM #MLOps #Entrepreneurship #Podcast #YouTube #Tech #Innovation

airhacks.fm podcast with adam bien
Accelerating LLMs with TornadoVM: From GPU Kernels to Model Inference

May 18, 2025 · 71:04


An airhacks.fm conversation with Juan Fumero (@snatverk) about: TornadoVM as a Java parallel framework for accelerating data parallelization on GPUs and other hardware, first GPU experiences with ELSA Winner and Voodoo cards, explanation of TornadoVM as a plugin to existing JDKs that uses Graal as a library, TornadoVM's programming model with @Parallel and @Reduce annotations for parallelizable code, introduction of the Kernel API for lower-level GPU programming, TornadoVM's ability to dynamically reconfigure and select the best hardware for workloads, implementation of LLM inference acceleration with TornadoVM, challenges in accelerating Llama models on GPUs, introduction of tensor types in TornadoVM to support FP8 and FP16 operations, shared buffer capabilities for GPU memory management, comparison of Java Vector API performance versus GPU acceleration, discussion of model quantization as a potential use case for TornadoVM, exploration of Deep Java Library (DJL) and its NDArray implementation, potential standardization of tensor types in Java, integration possibilities with Project Babylon and its Code Reflection capabilities, TornadoVM's execution plans and task graphs for defining accelerated workloads, ability to run on multiple GPUs with different backends simultaneously, potential enterprise applications for LLMs in Java including model distillation for domain-specific models, discussion of Foreign Function & Memory API integration in TornadoVM, performance comparison between different GPU backends like OpenCL and CUDA, collaboration with Intel Level Zero oneAPI and integrated graphics support, future plans for RISC-V support in TornadoVM

Juan Fumero on Twitter: @snatverk
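The annotation-based model mentioned above can be sketched in a few lines. This is an illustrative sketch, not code from the episode: it assumes TornadoVM's publicly documented TaskGraph/TornadoExecutionPlan API and @Parallel annotation from recent releases, and the package names and signatures should be verified against the TornadoVM docs.

```java
// Illustrative TornadoVM sketch (verify API names against the project docs):
// a data-parallel vector addition offloaded to a GPU or other accelerator.
import java.util.Arrays;

import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;

public class VectorAdd {

    // @Parallel marks the loop as data-parallel; TornadoVM JIT-compiles this
    // method into an OpenCL, CUDA/PTX, or SPIR-V kernel at run time.
    public static void add(float[] a, float[] b, float[] c) {
        for (@Parallel int i = 0; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        float[] a = new float[n];
        float[] b = new float[n];
        float[] c = new float[n];
        Arrays.fill(a, 1.0f);
        Arrays.fill(b, 2.0f);

        // A task graph declares device transfers and the method to run as a kernel.
        TaskGraph graph = new TaskGraph("s0")
                .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, b)
                .task("t0", VectorAdd::add, a, b, c)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, c);

        // The execution plan selects a device and runs the graph; TornadoVM can
        // dynamically reconfigure to the best available hardware.
        ImmutableTaskGraph immutableGraph = graph.snapshot();
        TornadoExecutionPlan plan = new TornadoExecutionPlan(immutableGraph);
        plan.execute();

        System.out.println("c[0] = " + c[0]); // expected: 3.0
    }
}
```

Switching between backends (OpenCL, PTX, Level Zero) is a launch-time choice rather than a code change, which is the dynamic-reconfiguration idea discussed in the episode.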

Rebel FM
Rebel FM Episode 663 - 05/16/2025

May 17, 2025 · 66:58


We're just a pair with a shorter show this week as we chat a bit more about Doom: The Dark Ages and the nature of PC system requirements (and the state of PC hardware and GPUs right now), plus Crashlands 2, Final Destination Bloodlines, then a bunch of your emails.

This week's music: Finishing Move Inc. - Unchained Predator

TechLinked
Fortnite/App Store Shenanigans, Computex GPUs, Grok's breakdown + more!

May 17, 2025 · 9:47


Timestamps:
0:00 See ya on Wed, May 21
0:09 Epic's plan for Apple to block Fortnite
3:29 Intel Arc Pro B60, RX 9060 XT
4:27 OpenAI Codex, Grok's breakdown
5:50 MSI!
6:41 QUICK BITS INTRO
6:47 Spotify podcast play counts
7:14 The Steam data breach that wasn't
7:41 Australian rocket top fell off
8:07 BREAKING: Vader is bad guy

NEWS SOURCES: https://lmg.gg/oRJxT

Learn more about your ad choices. Visit megaphone.fm/adchoices

Open to Debate
Can the U.S. Outpace China in AI Through Chip Controls?

May 16, 2025 · 53:15


The AI revolution is underway, and the U.S. and China are racing to the top. At the heart of this competition are semiconductors—especially advanced GPUs that power everything from natural language processing to autonomous weapons. The U.S. is betting that export controls can help check China's technological ambitions. But will this containment strategy work—or could it inadvertently accelerate China's drive for self-sufficiency? Those who think chip controls will work argue that restricting China's access gives the U.S. critical breathing room to advance AI safely, set global norms, and maintain dominance. Those who believe chip controls are inadequate, or could backfire, warn that domestic chipmakers, like Nvidia and Intel, also rely on sales from China. Cutting off access could harm U.S. competitiveness in the long run, especially if other countries don't fully align with U.S. policy.

As the race for AI supremacy intensifies, we debate the question: Can the U.S. Outpace China in AI Through Chip Controls?

Arguing Yes:
Lindsay Gorman, Managing Director and Senior Fellow of the German Marshall Fund's Technology Program; Venture Scientist at Deep Science Ventures
Will Hurd, Former U.S. Representative and CIA Officer

Arguing No:
Paul Triolo, Senior Vice President and Partner at DGA-Albright Stonebridge Group
Susan Thornton, Former Diplomat; Visiting Lecturer in Law and Senior Fellow at the Yale Law School Paul Tsai China Center

Emmy award-winning journalist John Donvan moderates.

This debate was produced in partnership with Johns Hopkins University. It was recorded on May 14, 2025 at 6 PM at Shriver Hall, 3400 N Charles St Ste 14, in Baltimore, Maryland.

Learn more about your ad choices. Visit podcastchoices.com/adchoices

The Hardware Unboxed Podcast
Will AMD Make the RX 9060 XT Overpriced?

May 16, 2025 · 85:35


Episode 71: We chat about potential upcoming Intel Arc GPUs including the B770 and B580 24GB, discuss the RX 9060 XT and what AMD might do with pricing, and round out the chat with some discoveries about laptop GPUs.

CHAPTERS
00:00 - Intro
05:13 - Intel Arc B770, Is It Real?
22:10 - New Arc B580 Configurations?
26:18 - Lead-Up to the RX 9060 XT
39:04 - Price Considerations and Concerns with 9060 XT
1:04:30 - The RTX 5090 Laptop is a Joke
1:14:25 - Updates From Our Boring Lives

SUBSCRIBE TO THE PODCAST
Audio: https://shows.acast.com/the-hardware-unboxed-podcast
Video: https://www.youtube.com/channel/UCqT8Vb3jweH6_tj2SarErfw

SUPPORT US DIRECTLY
Patreon: https://www.patreon.com/hardwareunboxed

LINKS
YouTube: https://www.youtube.com/@Hardwareunboxed/
Twitter: https://twitter.com/HardwareUnboxed
Bluesky: https://bsky.app/profile/hardwareunboxed.bsky.social

Hosted on Acast. See acast.com/privacy for more information.

Geek-Tech Shorts
PixxelCast 152 - Steam Hacked, Your Switch 2 Isn't Yours, and NVIDIA Hides

May 16, 2025 · 145:15


Subscribe for more: https://www.youtube.com/c/pixxelers
Follow me on social media: https://linktr.ee/jlrock92
Discord: https://discord.gg/EFkfqhMZDU

NOTES:
- Steam hack: https://tinyurl.com/36sj4297
- Nintendo EULA: https://tinyurl.com/2makhwks
- Switch 2 specs: https://youtu.be/huxDoYXS8Ng
- Hidden RTX 5060: https://tinyurl.com/y4evnup3
- NVIDIA raises prices: https://tinyurl.com/y8488vhu
- Intel Arc B770: https://tinyurl.com/y58yzpjj
- Huawei laptop OS: https://tinyurl.com/2nrfzdy4
- Chromebook trend: https://tinyurl.com/ym7ha275
- Apple strikes out: https://tinyurl.com/34a43r5a
- Nobody uses Google: https://tinyurl.com/2j2uapnv
- Android spam: https://www.youtube.com/live/l3yDd3CmA_Y
- Trump x Elon AI: https://tinyurl.com/238npmhu
- Elon Musk sells X: https://tinyurl.com/3a5ude3r
- Republican AI law: https://tinyurl.com/2feam4d5
- GPU-tracking law: https://tinyurl.com/527du9fz
- AI testimony: https://tinyurl.com/9z3mcw2e
- HBO Max: https://tinyurl.com/bdhytbux

Bitcoin Magazine
The Riyadh Accord, Bitcoin Treasury Arbitrage & Tech Feudalism | The Bitcoin Policy Hour Ep. 6

May 14, 2025 · 66:53


In Episode 6 of The Bitcoin Policy Hour, the Bitcoin Policy Institute team unpacks the emerging “Riyadh Accord,” a sweeping geopolitical realignment where the United States, Saudi Arabia, and other Gulf nations are bundling AI, Bitcoin, and techno-industrial leverage into a new framework of global influence.

As Blackwell chips begin to replace F-35s as diplomatic bargaining tools, and sovereign wealth funds quietly accumulate Bitcoin, Riyadh is fast becoming the epicenter of digital energy, intelligence infrastructure, and monetary power. The conversation explores how U.S. foreign policy is shifting from military entanglements toward high-tech trade agreements and capital co-investments — with AI and Bitcoin at the core.

PLUS, they explore the recent slew of BTC treasury companies amping up activity and how they fit into the picture of jurisdictional and memetic arbitrage.

Chapters:
00:00 - Introduction
01:45 - What Is the “Riyadh Accord”?
07:30 - Blackwell Chips Replace F-35s in Middle East Bargains
13:30 - AGI Infrastructure: Will the AI Run Happen in Riyadh?
17:00 - A New Multipolar World Centered on Energy and Compute
21:00 - Samourai Wallet, Legal Overreach, and Bitcoin's Core Ethos
26:30 - Policy Hypocrisy: Bitcoin Freedom vs. Surveillance State
31:00 - AI Feudalism and the Fight for Decentralized Money
36:00 - The Open Source AI vs. Corporate Subscription Future
41:00 - Bearer Assets: Bitcoin, GPUs, and Energy as Sovereign Tools
46:00 - Global Reflexivity and Bitcoin Treasury Companies
51:00 - Metaplanet, Nakamoto, and the Meme Wrapped in Arbitrage
56:00 - Bitcoin's Geopolitical Moment: What Comes Next?
01:01:00 - Closing Thoughts: Bitcoin Banks, Policy Risks, and Meme Economics

⭐ Join top policymakers, technologists, and Bitcoin industry leaders at the 2025 Bitcoin Policy Summit, June 25–26 in Washington, D.C.

Immigration Law for Tech Startups
227: Why We Still Don't Have Robot Butlers with Anshuman Kumar

May 13, 2025 · 37:55


Unlock the secrets behind the rapid evolution of robotics with Anshuman Kumar, head of hardware at Matic Robots, as we dissect what makes a robot more than just a machine. Discover how modern marvels, from everyday tools to cutting-edge autonomous vehicles, are reshaping our lives. Anshuman shares the technological breakthroughs that are fueling this transformation, revealing the vital roles that GPUs, AI, and a blend of mechanics, electronics, and algorithms play in creating robots capable of perceiving and interacting with their surroundings like never before.

Anshuman Kumar is the Head of Hardware at Matic Robots, where he pioneered the mechanical design for Matic, the world's first truly autonomous, private, and perceptive floor-cleaning robot. Previously, he was a key engineer at Tesla Motors, resolving critical reliability and scaling challenges for the Model S and Model 3 traction inverters. With a Master's in Product Design from Carnegie Mellon University and a Bachelor's in Mechanical Engineering from IIT Delhi, Anshuman also founded and led the Carnegie Mellon Hyperloop team to an award in the SpaceX Hyperloop competition.

In this episode, you'll hear about:
- Exploration of the robotics spectrum from simple tools to complex autonomous vehicles.
- Technological breakthroughs in AI, GPUs, and algorithms driving robotic advancements.
- The role of cameras and computer vision in enhancing home robotics and ensuring privacy.
- Matic Robots' innovative on-device processing to address privacy concerns in consumer robotics.
- Cultural and market dynamics explored through a roti-making appliance's success in the US.
- The importance of curiosity and tackling unglamorous problems in the startup and tech industry.

Follow and Review: We'd love for you to follow us if you haven't yet. Click that purple '+' in the top right corner of your Apple Podcasts app. We'd love it even more if you could drop a review or 5-star rating over on Apple Podcasts. Simply select “Ratings and Reviews” and “Write a Review,” then add a quick line with your favorite part of the episode. It only takes a second and it helps spread the word about the podcast.

Supporting Resources:
LinkedIn - https://www.linkedin.com/in/anshuman-kumar/
Website - https://maticrobots.com/
Contact: anshuman@maticrobots.com ; anshumankumar.iitd@gmail.com
Matic Website: https://maticrobots.com/
Hardware Nation Episode: https://www.youtube.com/watch?v=AoUnXZg0Wb0&t=249s&pp=ygUVaGFyZHdhcmUgbmF0aW9uIG1hdGlj
Matic Privacy: https://maticrobots.com/blog/why-matic-is-the-most-private-and-secure-robot-vacuum/
Matic Mopping: https://maticrobots.com/blog/the-magic-behind-matics-mopping/
Matic Sweeping: https://maticrobots.com/blog/why-matic-brushroll-is-different/

Alcorn Immigration Law: Subscribe to the monthly Alcorn newsletter
Sophie Alcorn Podcast:
Episode 16: E-2 Visa for Founders and Employees
Episode 19: Australian Visas Including E-3
Episode 20: TN Visas and Status for Canadian and Mexican Citizens
Immigration Options for Talent, Investors, and Founders
Immigration Law for Tech Startups eBook

The New Stack Podcast
Google AI Infrastructure PM On New TPUs, Liquid Cooling and More

May 13, 2025 · 19:38


At Google Cloud Next '25, the company introduced Ironwood, its most advanced custom Tensor Processing Unit (TPU) to date. With 9,216 chips per pod delivering 42.5 exaflops of compute power, Ironwood doubles the performance per watt compared to its predecessor. Senior product manager Chelsie Czop explained that designing TPUs involves balancing power, thermal constraints, and interconnectivity. Google's long-term investment in liquid cooling, now in its fourth generation, plays a key role in managing the heat generated by these powerful chips. Czop highlighted the incremental design improvements made visible through changes in the data center setup, such as liquid cooling pipe placements. Customers often ask whether to use TPUs or GPUs, but the answer depends on their specific workloads and infrastructure. Some, like Moloco, have seen a 10x performance boost by moving directly from CPUs to TPUs. However, many still use both TPUs and GPUs. As models evolve faster than hardware, Google relies on collaborations with teams like DeepMind to anticipate future needs.

Learn more from The New Stack about the latest AI infrastructure insights from Google Cloud:
Google Cloud Therapist on Bringing AI to Cloud Native Infrastructure
A2A, MCP, Kafka and Flink: The New Stack for AI Agents

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
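For a rough sense of per-chip scale, simple division of the quoted pod numbers (assuming the 42.5 exaflops figure applies to a full 9,216-chip pod) gives:

\[
\frac{42.5 \times 10^{18}\ \text{FLOP/s}}{9{,}216\ \text{chips}} \approx 4.6 \times 10^{15}\ \text{FLOP/s} \approx 4.6\ \text{petaflops per chip}
\]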

Pivot
AMD CEO Lisa Su on the “Dead Sexy” AI Chips Race - On with Kara Swisher

May 10, 2025 · 52:47


In 2014, when Lisa Su took over as CEO of Advanced Micro Devices, AMD was on the verge of bankruptcy. Su bet hard on hardware and not only pulled the semiconductor company back from the brink, but also led it to surpass its historical rival, Intel, in market cap. Since the launch of ChatGPT made high-powered chips like AMD's “sexy” again, demand for chips has intensified exponentially, but so has the public spotlight on the industry — including from the federal government.

In a live conversation at the Johns Hopkins University Bloomberg Center, as part of its inaugural Discovery Series, Kara talks to Su about her strategy in the face of the Trump administration's tariff and export control threats, how to safeguard the US in the global AI race, and what she says when male tech leaders brag about the size of their GPUs.

Listen to more from On with Kara Swisher here. Learn more about your ad choices. Visit podcastchoices.com/adchoices

Waking Up With AI
Decentralizing AI

May 8, 2025 · 13:45


This week on “Waking Up With AI,” Anna Gressel looks at how decentralized AI training could revolutionize the field by allowing for the collaborative use of advanced GPUs worldwide, expanding access to model development while raising interesting questions about export controls and regulatory frameworks.

Learn More About Paul, Weiss's Artificial Intelligence Practice: https://www.paulweiss.com/practices/litigation/artificial-intelligence

The New Stack Podcast
Google Cloud Therapist on Bringing AI to Cloud Native Infrastructure

May 8, 2025 · 24:04


At Google Cloud Next, Bobby Allen, Group Product Manager for Google Kubernetes Engine (GKE), emphasized GKE's foundational role in supporting AI platforms. While AI dominates current tech conversations, Allen highlighted that cloud-native infrastructure like Kubernetes is what enables AI workloads to function efficiently. GKE powers key Google services like Vertex AI and is trusted by organizations including DeepMind, gaming companies, and healthcare providers for AI model training and inference.

Allen explained that GKE offers scalability, elasticity, and support for AI-specific hardware like GPUs and TPUs, making it ideal for modern workloads. He noted that Kubernetes was built with capabilities—like high availability and secure orchestration—that are now essential for AI deployment. Looking forward, GKE aims to evolve into a model router, allowing developers to access the right AI model based on function, not vendor, streamlining the development experience. Allen described GKE as offering maximum control with minimal technical debt, future-proofed by Google's continued investment in open source and scalable architecture.

Learn more from The New Stack about the latest insights with Google Cloud:
Google Kubernetes Engine Customized for Faster AI Work
KubeCon Europe: How Google Will Evolve Kubernetes in the AI Era
Apache Ray Finds a Home on the Google Kubernetes Engine

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Big Technology Podcast
Is AI Scaling Dead? — With Gary Marcus

May 7, 2025 · 54:22


Gary Marcus is a cognitive scientist, author, and longtime AI skeptic. Marcus joins Big Technology to discuss whether large‑language‑model scaling is running into a wall. Tune in to hear a frank debate on the limits of “just add GPUs” and what that means for the next wave of AI. We also cover data‑privacy fallout from ad‑driven assistants, open‑source bio‑risk fears, and the quest for interpretability. Hit play for a reality check on AI's future — and the insight you need to follow where the industry heads next.

---

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice.

Want a discount for Big Technology on Substack? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b

Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

Construction Brothers
The Tin Hat Club

May 7, 2025 · 79:34


Today we're in conversation with Siddhant Mehta, Project Manager at Skanska, to explore how AI is transforming construction. From choosing the right tools to critiquing SaaS pricing models, Sid shares insights on tech adoption, AI coding, and the future of project management.

00:46 – Sid's Journey Abroad: Sid Mehta shares his story from Mumbai to the U.S., managing multimillion-dollar projects and finding his place in construction management.
02:03 – Building Tech Networks: How Skanska leverages emerging tech groups, vendor evaluations, and peer networks to spread innovation across teams.
03:55 – Tech Adoption Realities: Sid challenges perceptions of slow adoption in construction, highlighting why pilot projects need time to show results.
05:14 – The Feedback Gap: Why construction tech tools often miss the mark, and how missing field feedback hurts tool development.
06:43 – Choosing the Right Tool: Sid explains why not every tech solution fits every project, stressing the importance of aligning tools with project type and phase.
09:06 – SaaS Pricing Rant: A frank critique of SaaS pricing in construction, questioning project-based fees versus simpler subscriptions.
12:00 – Naming Names (Kinda): A playful yet pointed critique of familiar industry pricing models — without naming names (but we all know who).
17:05 – Rise of AI Coding: Exploring tools like Replit, Claude, and Cursor, and the rise of “vibe coding” in construction tech and software development.
23:02 – AI's Development Impact: How AI coding shifts the role of developers, and why front-end engineering faces more disruption than back-end.
28:00 – Data Centers & Demand: How AI's growth drives demand for data centers, reshaping infrastructure needs for GPUs, power, and cooling.
35:00 – Environmental Impacts: A look at the ecological consequences of data center expansion, from water usage to energy demands.
40:48 – AI Saves the Day: Real-world examples of AI replacing executive assistants, saving hours on email, scheduling, and admin tasks in construction.
45:00 – Skanska's Internal AI: How Skanska built internal chatbots to automate project schedules, saving schedulers hours every week.
47:26 – Ripple Effect of AI: Sid reflects on how AI's time savings can scale across thousands of employees, transforming workflows organization-wide.
50:00 – Marketing's AI Shift: Why SEO strategies are changing in an AI world, and how creative content is being reshaped by generative tools.
54:00 – AI's Rapid Acceleration: Closing thoughts on how quickly AI is evolving, and why getting on board now is key for construction leaders.

Go build something awesome!

CHECK OUT THE PARTNERS THAT MAKE OUR SHOW POSSIBLE: https://www.brospodcast.com/partners

FIND US ONLINE:
- Our website: https://www.brospodcast.com
- LinkedIn: /constructionbrospodcast
- Instagram: /constructionbrospodcast
- TikTok: https://www.tiktok.com/@constructionbrothers?lang=en
- Eddie on LinkedIn: /eddie-c-057b3b11
- Tyler on LinkedIn: /tylerscottcampbell

If you enjoy the podcast, please rate us on Apple Podcasts or wherever you listen to us! Thanks for listening!

Data Protection Gumbo
298: The Battle for AI Supremacy Isn't About Models—It's About Infrastructure - Thunder Compute

May 6, 2025 · 33:32


Carl Peterson, CEO of Thunder Compute, uncovers how Thunder Compute is redefining GPU utilization by enabling network-attached virtual GPUs—dramatically slashing costs and democratizing access. Carl shares the startup's Y Combinator origin story, the impact of DeepSeek, and how virtualization is transforming AI development for individuals and enterprises alike. We also unpack GPU security, job disruption from AI, and the accelerating arms race in model development. A must-listen for anyone navigating AI, compute efficiency, and data protection.

Tech.eu
“We are sprinting towards implementation of our first chip,” says UK startup in AI inference race

May 5, 2025 · 33:46


In the Tech.eu podcast, Fractile founder Walter Goodwin discusses Fractile's AI inference chips, which he claims can run LLMs faster and more energy-efficiently than Nvidia's GPUs.

Estadão Notícias
Technology #375: #Start Eldorado: planning and experimenting, the key to success with AI

May 3, 2025 · 24:16


When adopting Artificial Intelligence systems for one or more areas of the business, companies should start each project in a structured, well-planned way — thinking smaller, if need be — and then, guided by data, scale their use of AI more assertively, avoiding the hype of pouring resources and time into such an innovative technology without much planning, just to ride the wave of the moment. To discuss this topic, along with initiatives to democratize AI for large, medium, and small businesses through processing on CPUs, which are cheaper and more energy-efficient than the big GPU-based systems gaining ground in the market, and the marriage between Artificial Intelligence and Open Source development, which encourages collaboration and the integration of ecosystems with different partners, Start Eldorado welcomes Sandra Vaz, country manager of Red Hat Brazil, who talked about these and other topics with host Daniel Gonzales. The program airs every Wednesday at 9 PM on FM 107.3 across Greater São Paulo, as well as on the site, app, digital channels, and voice assistants.

See omnystudio.com/listener for privacy information.

On with Kara Swisher
AMD CEO Lisa Su on AI Chips, Trump's Tariffs and the Magic of Open Source

May 1, 2025 · 52:47


In 2014, when Lisa Su took over as CEO of Advanced Micro Devices, AMD was on the verge of bankruptcy. Su bet hard on hardware and not only pulled the semiconductor company back from the brink, but also led it to surpass its historical rival, Intel, in market cap. Since the launch of ChatGPT made high-powered chips like AMD's “sexy” again, demand for chips has intensified exponentially, but so has the public spotlight on the industry — including from the federal government.

In a live conversation at the Johns Hopkins University Bloomberg Center, as part of its inaugural Discovery Series, Kara talks to Su about her strategy in the face of the Trump administration's tariff and export control threats, how to safeguard the US in the global AI race, and what she says when male tech leaders brag about the size of their GPUs.

Questions? Comments? Email us at on@voxmedia.com or find us on Instagram, TikTok, and Bluesky @onwithkaraswisher. Learn more about your ad choices. Visit podcastchoices.com/adchoices

The New Stack Podcast
Arm's Open Source Leader on Meeting the AI Challenge

May 1, 2025 · 18:21


At Arm, open source is the default approach, with proprietary software requiring justification, says Andrew Wafaa, fellow and senior director of software communities. Speaking at KubeCon + CloudNativeCon Europe, Wafaa emphasized Arm's decade-long commitment to open source, highlighting its investment in key projects like the Linux kernel, GCC, and LLVM. This investment is strategic, ensuring strong support for Arm's architecture through vital tools and system software.

Wafaa also challenged the hype around GPUs in AI, asserting that CPUs—especially those enhanced with Arm's Scalable Matrix Extension (SME2) and Scalable Vector Extension (SVE2)—are often more suitable for inference workloads. CPUs offer greater flexibility, and Arm's innovations aim to reduce dependency on expensive GPU fleets.

On the AI framework front, Wafaa pointed to PyTorch as the emerging hub, likening its ecosystem-building potential to Kubernetes. As a PyTorch Foundation board member, he sees PyTorch becoming the central open source platform in AI development, with broad community and industry backing.

Learn more from The New Stack about the latest insights about Arm:
Edge Wars Heat Up as Arm Aims to Outflank Intel, Qualcomm
Arm: See a Demo About Migrating an x86-Based App to ARM64

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Irish Tech News Audio Articles
Alienware Launches Next-Generation Area-51 Laptops to Power the Future of Gaming

Apr 30, 2025 · 3:12


Alienware has officially launched its most powerful and design-forward gaming laptops to date, the new Alienware Area-51 series. First announced at CES 2025, the Area-51 laptops mark the return of the brand's most iconic platform, reimagined with next-generation technology and a striking new design language inspired by extraterrestrial phenomena. Available in 16- and 18-inch models, the Area-51 laptops are built to deliver maximum performance and innovation for gamers and creators alike.

Equipped with up to an NVIDIA GeForce RTX 5090 Laptop GPU and an Intel Core Ultra 9 275HX CPU, the laptops offer a total performance package of up to 280W, making them the most powerful laptops Alienware has ever produced. A completely reengineered Cryo-tech thermal architecture enables up to 37% more airflow while being 15% quieter than previous models, ensuring high performance without compromise.

The new design, dubbed AW30, draws inspiration from the Aurora Borealis, bringing a serene, ethereal aesthetic to the Alienware lineup. A Liquid Teal finish with colour-shifting iridescence, a translucent rear thermal shelf with gradient AlienFX lighting, and a Clear Gorilla Glass panel on the underside offer both form and function. These elements combine to deliver an immersive visual experience that mirrors the laptops' gaming capabilities. Additional features include RGB fans, a zero-hinge design for cleaner lines, and support for the latest Gen 5 SSDs with up to 12TB of storage and Microsoft DirectStorage for faster load times.

The Alienware Area-51 18 and 16 laptops with RTX 5080 GPUs are now available for purchase on the Dell Technologies Ireland website, starting at €4,098.99 and €4,298.99, respectively. Configurations with RTX 5090 and 5070 Ti GPUs will be available soon. With this launch, Alienware reaffirms its commitment to pushing the boundaries of gaming performance and design, offering Irish gamers and tech enthusiasts a window into the future of high-end computing.

More about Irish Tech News

Irish Tech News are Ireland's No. 1 Online Tech Publication and often Ireland's No. 1 Tech Podcast too. You can find hundreds of fantastic previous episodes and subscribe using whatever platform you like via our Anchor.fm page here: https://anchor.fm/irish-tech-news

If you'd like to be featured in an upcoming Podcast email us at Simon@IrishTechNews.ie now to discuss. Irish Tech News have a range of services available to help promote your business. Why not drop us a line at Info@IrishTechNews.ie now to find out more about how we can help you reach our audience. You can also find and follow us on Twitter, LinkedIn, Facebook, Instagram, TikTok and Snapchat.

Gradient Dissent - A Machine Learning Podcast by W&B
Inside Cursor: The future of AI coding with Co-founder Sualeh Asif

Apr 29, 2025 · 49:36


In this episode of Gradient Dissent, host Lukas Biewald talks with Sualeh Asif, the CPO and co-founder of Cursor, one of the fastest-growing and most loved AI-powered coding platforms. Sualeh shares the story behind Cursor's creation, the technical and design decisions that set it apart, and how AI models are changing the way we build software. They dive deep into infrastructure challenges, the importance of speed and user experience, and how emerging trends in agents and reasoning models are reshaping the developer workflow.

Sualeh also discusses scaling AI inference to support hundreds of millions of requests per day, building trust through product quality, and his vision for how programming will evolve in the next few years.

⏳ Timestamps:
00:00 How Cursor got started and why it took off
04:50 Switching from Vim to VS Code and the rise of CoPilot
08:10 Why Cursor won among competitors: product philosophy and execution
10:30 How user data and feedback loops drive Cursor's improvements
12:20 Iterating on AI agents: what made Cursor hold back and wait
13:30 Competitive coding background: advantage or challenge?
16:30 Making coding fun again: latency, flow, and model choices
19:10 Building Cursor's infrastructure: from GPUs to indexing billions of files
26:00 How Cursor prioritizes compute allocation for indexing
30:00 Running massive ML infrastructure: surprises and scaling lessons
34:50 Why Cursor chose DeepSeek models early
36:00 Where AI agents are heading next
40:07 Debugging and evaluating complex AI agents
42:00 How coding workflows will change over the next 2–3 years
46:20 Dream future projects: AI for reading codebases and papers

What's new in Cloud FinOps?
WNiCF - March 2025 - News

Apr 28, 2025 · 25:56


Send us a text

In this episode of What's New in Cloud FinOps, Stephen Old and Frank discuss the latest updates in cloud computing, focusing on Azure, Google Cloud, and AWS. They cover the retirement of certain Azure virtual machines, the introduction of serverless GPUs, and the benefits of Amazon Bedrock for cost transparency. The conversation also touches on new features for Azure databases, insights from a Forrester study on Spanner, and the importance of calculating AI costs. Additionally, they discuss licensing changes for Amazon FSx, tiered storage for Spanner, and the deprecation of the AWS connector to Azure. The episode concludes with a look at sustainability efforts and upcoming events in the cloud computing space.

Takeaways:
- Serverless GPUs enable on-demand AI workloads with automatic scaling.
- Amazon Bedrock introduces real-time cost transparency for custom models.
- Physical snapshots for Azure databases enhance backup flexibility.
- A Forrester study shows significant ROI with Spanner.
- Understanding AI costs on Google Cloud is crucial for budgeting.
- Amazon FSx for NetApp removes SnapLock licensing fees.
- Tiered storage for Spanner optimizes cost and performance.
- The AWS connector to Azure is deprecated, focusing on native solutions.
- Azure OpenAI service offers discounts for provisioned reservations.

Irish Tech News Audio Articles
Simplifying IT for the AI and Multicloud Era

Apr 28, 2025 · 5:36


Guest post by Brian O'Toole, Consumption and Software Sales Leader at Dell Technologies.

AI is rapidly reshaping the business landscape, making digital transformation not just a priority but a necessity for Irish organisations. Yet as companies look to harness its potential, they often find themselves navigating increasingly complex IT environments - a challenge that can feel overwhelming for businesses of all sizes. Whether it's navigating cloud migration, staying secure, scaling AI projects, or just managing day-to-day IT workloads with limited resources, there's one thing we keep hearing from businesses and organisations alike: 'we need to simplify'. At Dell Technologies, we've seen these challenges firsthand - and that's why we're helping organisations embrace technology as-a-Service. Adopting this approach can help simplify operations, modernise IT infrastructure, and give businesses the agility they need to innovate at speed in the AI era.

A Fresh Approach to IT Management

Today, IT teams face a perfect storm of priorities from business leaders responding to external challenges. These priorities pressure IT leaders to do more with less as they get operations teams to innovate while addressing expanding regulatory frameworks around data. All these pressures and potentially competing priorities increase the risk of IT decision sprawl that could solve problems in one area while adding complexity in others. To help IT and business leaders navigate this environment and shift IT costs from capital expenditure (CapEx) to operational expenditure (OpEx), Dell APEX Cloud Platforms provide integrated infrastructure management that reduces multicloud complexity while strengthening security and governance. APEX is a portfolio of fully integrated, turnkey systems that integrate Dell infrastructure, software and cloud operating stacks to deliver consistent multicloud operations. By extending cloud operating models to on-premises and edge environments, Dell APEX Cloud Platforms bridge the cloud divide by delivering consistent cloud operations everywhere.

With Dell APEX Cloud Platforms, you can:
- Minimize multicloud costs and complexity in the cloud ecosystem of your choice.
- Increase application value by accelerating productivity with familiar experiences that enable you to develop anywhere and deploy everywhere.
- Improve security and governance by enforcing consistent cloud ecosystem management from cloud to edge and enhancing control with layered security.

The shift to an as-a-Service approach gives businesses control without the chaos. Whether a scaling startup or an established large business planning to advance its multicloud solutions or leverage AI-driven applications, it can get access to the latest technology such as storage, servers, devices and cloud services - on demand, paying only for what it uses.

Enabling organisations to innovate in an AI and multicloud era

For organisations, the shift to an as-a-service model is not just about simplifying IT systems, it's about ensuring they can unlock innovation and growth. Businesses can pay for what they use, which aligns technology investment to actual value and usage. This approach is especially critical for costly infrastructure such as GPUs, servers, and storage, which all require substantial investment. By spreading costs over time, organisations in Ireland can forge a cost-effective pathway to leveraging cutting-edge AI capabilities without being locked into long-term technology commitments.

In Ireland, we're seeing a growing appetite for more agile, scalable IT models, especially among businesses embracing AI, hybrid work, and multicloud strategies. As the debate between public and private clouds fades, multicloud ecosystems are the future, and Dell APEX is leading the charge. With partnerships spanning hyperscalers like Microsoft, Red Hat, VMware, and Google Cloud, Dell APEX delivers simplified IT management across environments. Dell APEX innovatio...

The Hardware Unboxed Podcast
8GB GPUs Are Very Bad Now, Is The RX 9060 XT in Trouble?

The Hardware Unboxed Podcast

Play Episode Listen Later Apr 25, 2025 73:36


Episode 69: The GeForce RTX 5060 Ti 8GB is really bad, with many problems (especially at the price). So is the upcoming AMD Radeon RX 9060 XT 8GB in trouble? We discuss all of that in today's episode, and yes, we're getting into VRAM yet again.

CHAPTERS
00:00 - Intro
00:33 - 8GB GPUs are Dead on Arrival
13:54 - The Main Problem is the Name
34:56 - Can it Use the Advertised Features?
41:18 - AMD Radeon RX 9060 XT Rumor Talk
59:16 - Updates From Our Boring Lives

SUBSCRIBE TO THE PODCAST
Audio: https://shows.acast.com/the-hardware-unboxed-podcast
Video: https://www.youtube.com/channel/UCqT8Vb3jweH6_tj2SarErfw

SUPPORT US DIRECTLY
Patreon: https://www.patreon.com/hardwareunboxed

LINKS
YouTube: https://www.youtube.com/@Hardwareunboxed/
Twitter: https://twitter.com/HardwareUnboxed
Bluesky: https://bsky.app/profile/hardwareunboxed.bsky.social

Hosted on Acast. See acast.com/privacy for more information.

AI DAILY: Breaking News in AI
CREATIVITY IN AN AI WORLD

AI DAILY: Breaking News in AI

Play Episode Listen Later Apr 23, 2025 3:44


Plus: AI Films Can Now Win Oscars

Like this? Get AIDAILY, delivered to your inbox, 3x a week. Subscribe to our newsletter at https://aidaily.us

MIT's 'Periodic Table' of Machine Learning Is a Total Game-Changer
MIT researchers just dropped a "periodic table" for machine learning, mapping out how 20+ classic algorithms are mathematically connected. This framework lets scientists remix existing methods to create new AI models—like one that beat current image classifiers by 8%. Even cooler? There are blank spots hinting at undiscovered algorithms. It's like AI's own version of scientific alchemy.

When AI Makes Art: What Happens to Human Creativity?
As AI-generated art floods our feeds, the real question isn't "Can AI be creative?" but "What does creativity mean now?" This editor's letter explores how AI reshapes art, urging us to see it not as a threat but as a partner. The future of creativity might be more collaborative than we ever imagined.

AI Films Are Now Eligible for Oscars—But Human Creativity Still Reigns
The Academy has updated its rules: films using AI tools are officially Oscar-eligible. But here's the twist—AI won't boost or hurt your chances. What matters is how much human creativity is at the core. This comes after AI-enhanced performances like Adrien Brody's in The Brutalist stirred debate. The message? AI can assist, but the heart of the story better be human.

China and the U.S. Are Building AI Empires—But on Totally Different Foundations
The U.S. and China are both racing to dominate AI, but their strategies are worlds apart. While the U.S. flexes with cutting-edge GPUs and cloud giants, China is going DIY—crafting its own chips like Huawei's Ascend 910C and training models like DeepSeek on limited hardware. Despite U.S. export bans, China's scaling up with what it's got, proving that necessity really is the mother of invention.

The 4 Types of AI Agent Users—and What They Want
A new survey breaks down AI agent users into four vibes: Smarty Pants (info junkies), Minimalists (keep it simple), Life-Hackers (efficiency nerds), and Tastemakers (curated everything). Each group has unique needs, from decision support to personalized recs. For brands, it's a cheat sheet for building AI tools that actually click with users.

Who Has Time to Be Polite to ChatGPT?
TechRadar's Graham Barlow questions the habit of saying "please" and "thank you" to AI like ChatGPT. He argues that since AI lacks consciousness, such politeness is unnecessary and time-consuming. Barlow also notes that these extra words increase computational load, leading to higher costs and environmental impact. He advocates for concise interactions, reserving courtesy for human exchanges.

Oracle University Podcast
Integrating APEX with OCI AI Services

Oracle University Podcast

Play Episode Listen Later Apr 22, 2025 20:01


Discover how Oracle APEX leverages OCI AI services to build smarter, more efficient applications. Hosts Lois Houston and Nikita Abraham interview APEX experts Chaitanya Koratamaddi, Apoorva Srinivas, and Toufiq Mohammed about how key services like OCI Vision, Oracle Digital Assistant, and Document Understanding integrate with Oracle APEX.   Packed with real-world examples, this episode highlights all the ways you can enhance your APEX apps.   Oracle APEX: Empowering Low Code Apps with AI: https://mylearn.oracle.com/ou/course/oracle-apex-empowering-low-code-apps-with-ai/146047/ Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode.   ---------------------------------------------------------------   Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast. I'm Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services. Nikita: Hi everyone! Last week, we looked at how generative AI powers Oracle APEX and in today's episode, we're going to focus on integrating APEX with OCI AI Services. Lois: That's right, Niki. We're going to look at how you can use Oracle AI services like OCI Vision, Oracle Digital Assistant, Document Understanding, OCI Generative AI, and more to enhance your APEX apps. 01:03 Nikita: And to help us with it all, we've got three amazing experts with us, Chaitanya Koratamaddi, Director of Product Management at Oracle, and senior product managers, Apoorva Srinivas and Toufiq Mohammed. In today's episode, we'll go through each Oracle AI service and look at how it interacts with APEX. Apoorva, let's start with you. Can you explain what the OCI Vision service is? Apoorva: Oracle Cloud Infrastructure Vision is a serverless multi-tenant service accessible using the console or REST APIs. You can upload images to detect and classify objects in them. With prebuilt models available, developers can quickly build image recognition into their applications without machine learning expertise. OCI Vision service provides a fully managed model infrastructure. With complete integration with OCI Data Labeling, you can build custom models easily. OCI Vision service provides pretrained models-- Image Classification, Object Detection, Face Detection, and Text Recognition. You can build custom models for Image Classification and Object Detection. 02:24 Lois: Ok. What about its use cases? How can OCI Vision make APEX apps more powerful? Apoorva: Using OCI Vision, you can make images and videos discoverable and searchable in your APEX app.  You can use OCI Vision to detect and classify objects in the images. OCI Vision also highlights the objects using a red rectangular box. This comes in handy in use cases such as detecting vehicles that have violated the rules in traffic images. You can use OCI Vision to identify visual anomalies in your data. This is a very popular use case where you can detect anomalies in cancer X-ray images to detect cancer. These are some of the most popular use cases of using OCI Vision with your APEX app. 
But the possibilities are endless and you can use OCI Vision for any of your image analysis. 03:29 Nikita: Let's shift gears to Oracle Digital Assistant. Chaitanya, can you tell us what it's all about? Chaitanya: Oracle Digital Assistant is a low-code conversational AI platform that allows businesses to build and deploy AI assistants. It provides natural language understanding, automatic speech recognition, and text-to-speech capabilities to enable human-like interactions with customers and employees. Oracle Digital Assistant comes with prebuilt templates for you to get started.  04:00 Lois: What are its key features and benefits, Chaitanya? How does it enhance the user experience? Chaitanya: Oracle Digital Assistant provides conversational AI capabilities that include generative AI features, natural language understanding and ML, AI-powered voice, and analytics and insights. Integration with enterprise applications become easier with unified conversational experience, prebuilt chatbots for Oracle Cloud applications, and chatbot architecture frameworks. Oracle Digital Assistant provides advanced conversational design tools, conversational designer, dialogue and domain trainer, and native multilingual support. Oracle Digital Assistant is open, scalable, and secure. It provides multi-channel support, automated bot-to-agent transfer, and integrated authentication profile. 04:56 Nikita: And what about the architecture? What happens at the back end? Chaitanya: Developers assemble digital assistants from one or more skills. Skills can be based on prebuilt skills provided by Oracle or third parties, custom developed, or based on one of the many skill templates available. 05:16 Lois: Chaitanya, what exactly are “skills” within the Oracle Digital Assistant framework?  Chaitanya: Skills are individual chatbots that are designed to interact with users and fulfill specific type of tasks. Each skill helps a user complete a task through a combination of text messages and simple UI elements like select list. When a user request is submitted through a channel, the Digital Assistant routes the user's request to the most appropriate skill to satisfy the user's request. Skills can combine multilingual NLP deep learning engine, a powerful dialogflow engine, and integration components to connect to back-end systems.  Skills provide a modular way to build your chatbot functionality. Now users connect with a chatbot through channels such as Facebook, Microsoft Teams, or in our case, Oracle APEX chatbot, which is embedded into an APEX application. 06:21 Nikita: That's fascinating. So, what are some use cases of Oracle Digital Assistant in APEX apps? Chaitanya: Digital assistants streamline approval processes by collecting information, routing requests, and providing status updates. Digital assistants offer instant access to information and documentation, answering common questions and guiding users. Digital assistants assist sales teams by automating tasks, responding to inquiries, and guiding prospects through the sales funnel. Digital assistants facilitate procurement by managing orders, tracking deliveries, and handling supplier communication. Digital assistants simplify expense approvals by collecting reports, validating receipts, and routing them for managerial approval. Digital assistants manage inventory by tracking stock levels, reordering supplies, and providing real-time inventory updates. Digital assistants have become a common UX feature in any enterprise application. 
07:28 Want to learn how to design stunning, responsive enterprise applications directly from your browser with minimal coding? The new Oracle APEX Developer Professional learning path and certification enables you to leverage AI-assisted development, including generative AI and Database 23ai, to build secure, scalable web and mobile applications with advanced AI-powered features. From now through May 15, 2025, we're waiving the certification exam fee (valued at $245). So, what are you waiting for? Visit mylearn.oracle.com to get started today. 08:09 Nikita: Welcome back! Thanks for that, Chaitanya. Toufiq, let's talk about the OCI Document Understanding service. What is it? Toufiq: Using this service, you can upload documents to extract text, tables, and other key data. This means the service can automatically identify and extract relevant information from various types of documents, such as invoices, receipts, contracts, etc. The service is serverless and multitenant, which means you don't need to manage any servers or infrastructure. You can access this service using the console, REST APIs, SDK, or CLI, giving you multiple ways to integrate. 08:55 Nikita: What do we use for APEX apps?  Toufiq: For APEX applications, we will be using REST APIs to integrate the service. Additionally, you can process individual files or batches of documents using the ProcessorJob API endpoint. This flexibility allows you to handle different volumes of documents efficiently, whether you need to process a single document or thousands at once. With these capabilities, the OCI Document Understanding service can significantly streamline your document processing tasks, saving time and reducing the potential for manual errors. 09:36 Lois: Ok.  What are the different types of models available? How do they cater to various business needs? Toufiq: Let us start with pre-trained models. These are ready-to-use models that come right out of the box, offering a range of functionalities. The available models are Optical Character Recognition (OCR) enables the service to extract text from documents, allowing you to digitize, scan the documents effortlessly. You can precisely extract text content from documents. Key-value extraction, useful in streamlining tasks like invoice processing. Table extraction can intelligently extract tabular data from documents. Document classification automatically categorizes documents based on their content. OCR PDF enables seamless extraction of text from PDF files. Now, what if your business needs go beyond these pre-trained models. That's where custom models come into play. You have the flexibility to train and build your own models on top of these foundational pre-trained models. Models available for training are key value extraction and document classification. 10:50 Nikita: What does the architecture look like for OCI Document Understanding? Toufiq: You can ingest or supply the input file in two different ways. You can upload the file to an OCI Object Storage location. And in your request, you can point the Document Understanding service to pick the file from this Object Storage location.  Alternatively, you can upload a file directly from your computer. Once the file is uploaded, the Document Understanding service can process the file and extract key information using the pre-trained models. You can also customize models to tailor the extraction to your data or use case. 
11:52 Lois: And what about use cases? How are various industries using this service? Toufiq: In financial services, you can utilize Document Understanding to extract data from financial statements, classify and categorize transactions, identify and extract payment details, and streamline tax document management. In manufacturing, you can perform text extraction from shipping labels and bill of lading documents, extract data from production reports, and identify and extract vendor details. In the healthcare industry, you can automatically process medical claims, extract patient information from forms, classify and categorize medical records, and identify and extract diagnostic codes. This is not an exhaustive list, but it provides insights into some industry-specific use cases for Document Understanding. 12:50 Nikita: Toufiq, let's switch to the big topic everyone's excited about—the OCI Generative AI Service. What exactly is it? Toufiq: OCI Generative AI is a fully managed service that provides a set of state-of-the-art, customizable large language models that cover a wide range of use cases. It provides enterprise-grade generative AI with data governance and security, which means only you have access to your data and custom-trained models. OCI Generative AI provides pre-trained, out-of-the-box LLMs for text generation, summarization, and text embedding. OCI Generative AI also provides the necessary tools and infrastructure to define models with your own business knowledge. 13:37 Lois: Generally speaking, how is OCI Generative AI useful? Toufiq: It supports various large language models. New models available from Meta and Cohere include Llama2, developed by Meta, and Cohere's Command model, their flagship text generation model. Additionally, Cohere offers the Summarize model, which provides high-quality summaries, accurately capturing essential information from documents, and the Embed model, converting text to vector embedding representations. OCI Generative AI also offers dedicated AI clusters, enabling you to host foundational models on private GPUs. It integrates LangChain, an open-source framework for developing new interfaces for generative AI applications powered by language models. Moreover, OCI Generative AI facilitates generative AI operations, providing content moderation controls, zero-downtime endpoint model swaps, and endpoint deactivation and activation capabilities. For each model endpoint, OCI Generative AI captures a series of analytics, including call statistics, tokens processed, and error counts. 14:58 Nikita: What about the architecture? How does it handle user input? Toufiq: Users can input natural language, input/output examples, and instructions. The LLM analyzes the text and can generate, summarize, transform, extract information, or classify text according to the user's request. The response is sent back to the user in the specified format, which can include raw text or formatting like bullets and numbering, etc. 15:30 Lois: Can you share some practical use cases for generative AI in APEX apps? Toufiq: Some of the OCI Generative AI use cases for your Oracle APEX apps include text summarization.
Generative AI can quickly summarize lengthy documents such as articles, transcripts, doctor's notes, and internal documents. Businesses can utilize generative AI to draft marketing copy, emails, blog posts, and product descriptions efficiently. Generative AI-powered chatbots are capable of brainstorming, problem solving, and answering questions. With generative AI, content can be rewritten in different styles or languages. This is particularly useful for localization efforts and catering to diverse audiences. Generative AI can classify intent in customer chat logs, support tickets, and more. This helps businesses understand customer needs better and provide tailored responses and solutions. By searching call transcripts and internal knowledge sources, generative AI enables businesses to efficiently answer user queries. This enhances information retrieval and decision-making processes. 16:47 Lois: Before we let you go, can you explain what Select AI is? How is it different from the other AI services? Toufiq: Select AI is a feature of Autonomous Database. This is where Select AI differs from the other AI services. Be it OCI Vision, Document Understanding, or OCI Generative AI, these are all fully managed standalone services on Oracle Cloud, accessible via REST APIs. Select AI, on the other hand, is a feature available in Autonomous Database. That means to use Select AI, you need Autonomous Database. 17:26 Nikita: And what can developers do with Select AI? Toufiq: Traditionally, SQL is the language used to query the data in the database. With Select AI, you can talk to the database and get insights from the data in the database using human language. At its most basic, what Select AI does is generate SQL queries from natural language, an NL2SQL capability. 17:52 Nikita: How does it actually do that? Toufiq: When a user asks a question, the first step Select AI takes is to look into the AI profile, which you, as a developer, define. The AI profile holds crucial information, such as table names, the LLM provider, and the credentials needed to authenticate with the LLM service. Next, Select AI constructs a prompt. This prompt includes information from the AI profile and the user's question. Essentially, it's a packet of information containing everything the LLM service needs to generate SQL. The next step is generating SQL using the LLM. The prompt prepared by Select AI is sent to the available LLM services via REST. Which LLM to use is configured in the AI profile. The supported providers are OpenAI, Cohere, Azure OpenAI, and OCI Generative AI. Once the SQL is generated by the LLM service, it is returned to the application. The app can then handle the SQL query in various ways, such as displaying the SQL results in a report format or as charts, etc.
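As a rough, hedged sketch of that flow from an application's side: the SELECT AI syntax is what you would type interactively, while the documented DBMS_CLOUD_AI PL/SQL package exposes the same NL2SQL step programmatically. The connection details and the DEMO_PROFILE name below are hypothetical placeholders, and the snippet assumes an AI profile was already created with DBMS_CLOUD_AI.CREATE_PROFILE.

```python
import oracledb

# Connect to an Autonomous Database (credentials and DSN are placeholders)
conn = oracledb.connect(user="app_user", password="...", dsn="mydb_low")
cur = conn.cursor()

# Select AI always works against a profile; assume DEMO_PROFILE was created
# earlier with DBMS_CLOUD_AI.CREATE_PROFILE (table list, provider, credential).
cur.execute("BEGIN DBMS_CLOUD_AI.SET_PROFILE('DEMO_PROFILE'); END;")

# Ask a natural-language question; action='showsql' returns the generated
# SQL, while 'runsql' would execute it and return the results instead.
generated = cur.callfunc(
    "DBMS_CLOUD_AI.GENERATE", oracledb.DB_TYPE_CLOB,
    keyword_parameters={
        "prompt": "how many orders were placed last month",
        "profile_name": "DEMO_PROFILE",
        "action": "showsql",
    },
)
print(generated.read())   # the LLM-generated SQL statement
```

An APEX app would typically wrap the same GENERATE call in PL/SQL and render the result as a report or chart, which matches the "display the SQL results" step described above.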
19:05 Lois: This has been an incredible discussion! Thank you, Chaitanya, Apoorva, and Toufiq, for walking us through all of these amazing AI tools. If you're ready to dive deeper, visit mylearn.oracle.com and search for the Oracle APEX: Empowering Low Code Apps with AI course. You'll find step-by-step guides and demos for everything we covered today. Nikita: Until next week, this is Nikita Abraham… Lois: And Lois Houston signing off! 19:31 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 506: How Distributed Computing is Unlocking Affordable AI at Scale

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later Apr 17, 2025 22:26


Everyone's chasing bigger AI. The real opportunity? Smarter scaling.
Distributed computing is quietly rewriting the rules of what's possible—not just for tech giants, but for everyone building with AI.
We're talking cost. We're talking scale. And we're definitely talking disruption.
Tom Curry, CEO and Co-Founder of DistributeAI, joins us as we dig into the future of distributed power and practical AI performance.
Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn
Topics Covered in This Episode:
Distributed Computing for Affordable AI
Open Source vs. Proprietary AI Models
GPU Demand and Compute Limitations
Edge Computing and Privacy Concerns
Small Business AI Compute Solutions
Future Trends in AI Model Sizes
Impact of Open Source AI Dominance
Timestamps:
00:00 Rising Importance of AI Compute
06:21 AI Model Resource Constraints
09:24 AI Models' Efficiency vs. Complexity
12:24 Edge Compute for Daily Tasks
16:00 Compute Cost Drives AI Market
16:58 AI Models: Balancing Cost and Innovation
20:43 Adaptability in Rapidly Changing Business
Keywords: Distributed computing, compute, GPUs, generative AI, ChatGPT, large language models, open source models, proprietary models, affordable AI, scale, Distribute AI, spare compute, Tom Curry, mid-level businesses, accessible AI ecosystem, API access, power grid, NVIDIA, OpenAI, tokens, chain of thought, models size, reasoning models, edge computing, cell phones analogy, data privacy, DeepSeek, Google Gemini 3, Eloscores, open models, hybrid models, centralized model, OpenAI strategy, Anthropic, Claw tokens, commoditization, applications, government contracts, integration, UX and UI, technology advancements, private source AI, business leaders, AI deployment strategy, flexibility in AI.
Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Evan Conrad, co-founder of SF Compute, joined us to talk about how they started as an AI lab that avoided bankruptcy by selling GPU clusters, why CoreWeave financials look like a real estate business, and how GPUs are turning into a commodities market. Chapters: 00:00:05 - Introductions 00:00:12 - Introduction of guest Evan Conrad from SF Compute 00:00:12 - CoreWeave Business Model Discussion 00:05:37 - CoreWeave as a Real Estate Business 00:08:59 - Interest Rate Risk and GPU Market Strategy Framework 00:16:33 - Why Together and DigitalOcean will lose money on their clusters 00:20:37 - SF Compute's AI Lab Origins 00:25:49 - Utilization Rates and Benefits of SF Compute Market Model 00:30:00 - H100 GPU Glut, Supply Chain Issues, and Future Demand Forecast 00:34:00 - P2P GPU networks 00:36:50 - Customer stories 00:38:23 - VC-Provided GPU Clusters and Credit Risk Arbitrage 00:41:58 - Market Pricing Dynamics and Preemptible GPU Pricing Model 00:48:00 - Future Plans for Financialization? 00:52:59 - Cluster auditing and quality control 00:58:00 - Futures Contracts for GPUs 01:01:20 - Branding and Aesthetic Choices Behind SF Compute 01:06:30 - Lessons from Previous Startups 01:09:07 - Hiring at SF Compute Chapters 00:00:00 Introduction and Background 00:00:58 Analysis of GPU Business Models 00:01:53 Challenges with GPU Pricing 00:02:48 Revenue and Scaling with GPUs 00:03:46 Customer Sensitivity to GPU Pricing 00:04:44 Core Weave's Business Strategy 00:05:41 Core Weave's Market Perception 00:06:40 Hyperscalers and GPU Market Dynamics 00:07:37 Financial Strategies for GPU Sales 00:08:35 Interest Rates and GPU Market Risks 00:09:30 Optimal GPU Contract Strategies 00:10:27 Risks in GPU Market Contracts 00:11:25 Price Sensitivity and Market Competition 00:12:21 Market Dynamics and GPU Contracts 00:13:18 Hyperscalers and GPU Market Strategies 00:14:15 Nvidia and Market Competition 00:15:12 Microsoft's Role in GPU Market 00:16:10 Challenges in GPU Market Dynamics 00:17:07 Economic Realities of the GPU Market 00:18:03 Real Estate Model for GPU Clouds 00:18:59 Price Sensitivity and Chip Design 00:19:55 SF Compute's Beginnings and Challenges 00:20:54 Navigating the GPU Market 00:21:54 Pivoting to a GPU Cloud Provider 00:22:53 Building a GPU Market 00:23:52 SF Compute as a GPU Marketplace 00:24:49 Market Liquidity and GPU Pricing 00:25:47 Utilization Rates in GPU Markets 00:26:44 Brokerage and Market Flexibility 00:27:42 H100 Glut and Market Cycles 00:28:40 Supply Chain Challenges and GPU Glut 00:29:35 Future Predictions for the GPU Market 00:30:33 Speculations on Test Time Inference 00:31:29 Market Demand and Test Time Inference 00:32:26 Open Source vs. 
Closed AI Demand 00:33:24 Future of Inference Demand 00:34:24 Peer-to-Peer GPU Markets 00:35:17 Decentralized GPU Market Skepticism 00:36:15 Redesigning Architectures for New Markets 00:37:14 Supporting Grad Students and Startups 00:38:11 Successful Startups Using SF Compute 00:39:11 VCs and GPU Infrastructure 00:40:09 VCs as GPU Credit Transformators 00:41:06 Market Timing and GPU Infrastructure 00:42:02 Understanding GPU Pricing Dynamics 00:43:01 Market Pricing and Preemptible Compute 00:43:55 Price Volatility and Market Optimization 00:44:52 Customizing Compute Contracts 00:45:50 Creating Flexible Compute Guarantees 00:46:45 Financialization of GPU Markets 00:47:44 Building a Spot Market for GPUs 00:48:40 Auditing and Standardizing Clusters 00:49:40 Ensuring Cluster Reliability 00:50:36 Active Monitoring and Refunds 00:51:33 Automating Customer Refunds 00:52:33 Challenges in Cluster Maintenance 00:53:29 Remote Cluster Management 00:54:29 Standardizing Compute Contracts 00:55:28 Unified Infrastructure for Clusters 00:56:24 Creating a Commodity Market for GPUs 00:57:22 Futures Market and Risk Management 00:58:18 Reducing Risk with GPU Futures 00:59:14 Stabilizing the GPU Market 01:00:10 SF Compute's Anti-Hype Approach 01:01:07 Calm Branding and Expectations 01:02:07 Promoting San Francisco's Beauty 01:03:03 Design Philosophy at SF Compute 01:04:02 Artistic Influence on Branding 01:05:00 Past Projects and Burnout 01:05:59 Challenges in Building an Email Client 01:06:57 Persistence and Iteration in Startups 01:07:57 Email Market Challenges 01:08:53 SF Compute Job Opportunities 01:09:53 Hiring for Systems Engineering 01:10:50 Financial Systems Engineering Role 01:11:50 Conclusion and Farewell

TD Ameritrade Network
NVDA Strengths to Weather Tariff Storm, DeepSeek Hurdles Ahead

TD Ameritrade Network

Play Episode Listen Later Apr 11, 2025 8:40


Citi lowered its price target on Nvidia (NVDA) to $150 from $163 and trimmed GPU estimates as hyperscalers cut back A.I. CapEx spending. Jeff Pierce believes the price target cut won't leave a scratch on the A.I. giant's trajectory. 90% of analysts still have a buy rating on Nvidia, though Jeff Pierce urges investors to keep DeepSeek in mind, as the Chinese A.I. model can generate headwinds. Dan Deming offers an example options trade for Nvidia.
======== Schwab Network ========
Empowering every investor and trader, every market day.
Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe
Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185
Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7
Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch
Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore
Watch on DistroTV - https://www.distro.tv/live/schwab-network/
Follow us on X – / schwabnetwork
Follow us on Facebook – / schwabnetwork
Follow us on LinkedIn - / schwab-network
About Schwab Network - https://schwabnetwork.com/about

100x Entrepreneur
$0 To $5M In 6 Months: Sharad Sanghi On Why Local AI Clouds Matter NOW | Neon Show

100x Entrepreneur

Play Episode Listen Later Apr 11, 2025 51:54


Meet Sharad Sanghi, who built India's first data center and spent 25 years building Netmagic, India's largest data center company. He came to India in 1995 and worked at VSNL as the country was discovering the internet, then built a company focused on internet for businesses. Now he is building Neysa, an AI cloud startup, which recently raised $50M to help businesses adopt AI, all from India. With Neysa, businesses can use AI without writing a lot of code or using five different tools to run it, and can do everything—train models, test them, deploy them, and monitor them—in one single dashboard. Sharad has a lot of perspective to share—as someone who was at the forefront while India adopted the internet, and now, the AI wave. If you're building in AI, part of an enterprise exploring AI, or simply thinking about where India is in the AI race—this episode is for you.
0:00 – Trailer
02:00 – How Neysa makes AI easy & cheap?
04:37 – From datacentre to AI
06:54 – 1200 GPUs & 15 Clients
07:24 – Selling internet to Businesses
12:03 – Why India Needs Local AI Clouds
13:20 – How Neysa Plans to Stand Out
15:14 – Is Scaling a Hyperscaler Easy?
19:26 – Can India Shift from IT to AI?
26:40 – AI in Large Enterprises vs Mid-Market
27:51 – AI Revolution vs Cloud in 2006
31:06 – What's the Moat for AI Startups?
33:08 – Is Private Data the Real AI Goldmine?
35:59 – Why Product & GTM Matter Early On
38:06 – Why Scaling must come before Demand
42:12 – OpenAI's edge on Deepseek?
44:52 – How should enterprises navigate AI?
46:32 – Did Only NVIDIA Predict the AI Boom?
48:17 – Founder 1.0 V/S 2.0
-------------
Hi, I am your host Siddhartha! I have been an entrepreneur from 2012-2017 building two products AddoDoc and Babygogo. After selling my company to SHEROES, I and my partner Nansi decided to start up again. But we felt unequipped in our skillset in 2018 to build a large company. We had known 0-1 journeys from our startups but lacked the experience of building 1-10 journeys. Hence was born The Neon Show (earlier 100x Entrepreneur) to learn from founders and investors the mindset to scale yourself and your company. This quest still keeps us excited even after 5 years and 200+ episodes. We welcome you to our journey to understand what goes behind building a super successful company. Every episode is done with a very selfish motive, that I and Nansi should come out as better entrepreneurs and professionals after absorbing the learnings.
-------------
Check us out on:
Website: https://neon.fund/
Instagram: https://www.instagram.com/theneonshoww/
LinkedIn: https://www.linkedin.com/company/beneon/
Twitter: https://x.com/TheNeonShoww
Connect with Siddhartha on:
LinkedIn: https://www.linkedin.com/in/siddharthaahluwalia/
Twitter: https://x.com/siddharthaa7
-------------
This video is for informational purposes only. The views expressed are those of the individuals quoted and do not constitute professional advice.
Send us a text

Web3 with Sam Kamani
244: From Harvard to Hardware: Hoansoo from Exabits on Web3 Compute and Scaling with GPUs

Web3 with Sam Kamani

Play Episode Listen Later Apr 9, 2025 26:12


In this episode of Web3 with Sam Kamani, Sam is joined by co-host Amanda Whitcroft to interview Hoansoo Lee, co-founder of Exabits.ai. With a PhD from Harvard and deep expertise in edge computing, Hoansoo shares how Exabits is decentralizing the GPU cloud for AI by combining high-performance chips like the H100 and Blackwell with tokenized infrastructure on Web3 rails. They explore why AI compute is the "new energy," how Exabits differentiates from competitors like CoreWeave, and the opportunities for DeFi and structured finance in this emerging landscape. Hoansoo also discusses the limitations of decentralized compute, the challenges around AI experimentation, and how data, compute, and causality intersect in building next-gen AI. Whether you're a founder building in AI, a researcher, or a curious investor, this episode is packed with deep insights into the future of decentralized compute and what's next in the AI x Web3 convergence.
Key Timestamps
[00:00:00] Introduction: Sam introduces co-host Amanda and guest Hoansoo Lee from Exabits.ai.
[00:01:00] What is Exabits?: Hoansoo explains Exabits in one sentence—high-quality GPU compute for AI.
[00:02:00] Who Uses It: Discussing their customer base across Web2 and Web3.
[00:03:00] Hardware Stack: Exabits runs 60,000+ GPUs including H100s and Blackwells.
[00:04:00] Competitive Landscape: Why Exabits is different from other Web3 dePIN projects.
[00:05:00] Founding Story: How a background in edge computing led to building Exabits.
[00:06:00] Go-to-Market: Customer acquisition through partnerships, referrals, and conferences.
[00:07:00] Growth Opportunity: Why structured finance and GPU financialization is the next big thing.
[00:08:00] AI Efficiency vs. Demand: DeepSeek, scaling laws, and the compute boom.
[00:10:00] Energy + Compute: AI's demand for energy and its parallels to historical tech trends.
[00:11:00] Decentralized Compute: Limitations of latency-sensitive decentralized AI infrastructure.
[00:13:00] AI = Bitcoin Mining 2.0: The evolution from minting Bitcoin to minting intelligence.
[00:14:00] Pillars of AI: From compute/data/models to experimentation and causal inference.
[00:17:00] AI Limits: Why synthetic data can't replace real-world experimentation.
[00:18:00] Scarcity & Innovation: How chip scarcity could spark further innovation.
[00:20:00] In-House Servers: Why building H200 racks in-house is a differentiator.
[00:21:00] How It Works: A user's experience on Exabits from login to compute access.
[00:23:00] Founder Advice: Hoansoo's take on building something with real customers and solid fundamentals.
[00:24:00] Roadmap: Data center expansion, orchestration features, and governance via staking.
[00:25:00] TGE Ahead: Exabits' upcoming token generation event and next steps.
Connect
https://www.exabits.ai/
https://www.linkedin.com/company/exabitsai/
https://x.com/exa_bits
https://www.linkedin.com/in/hoansoo-lee-21586b9/
https://www.linkedin.com/in/amanda-whitcroft-324879164/
Disclaimer
Nothing mentioned in this podcast is investment advice and please do your own research. Finally, it would mean a lot if you can leave a review of this podcast on Apple Podcasts or Spotify and share this podcast with a friend.
Be a guest on the podcast or contact us - https://www.web3pod.xyz/

Audio-Podcast – OrionX.net: Deep Insight, Market Execution, Customer Engagement
Analyst Roundtable: GPUs, AI, Quantum, Bitcoin, China – OXD27

Audio-Podcast – OrionX.net: Deep Insight, Market Execution, Customer Engagement

Play Episode Listen Later Apr 8, 2025


Analyst roundtable with Adrian Cockcroft, Stephen Perrenod, Chris Kruell, and Shahin Khan covering recent advances in, and impacts of: GPUs, including a post-view of the GTC conference, AI DataCenters, Quantum Computing, Bitcoin, and China.

One Knight in Product
The TRUTH About Large Language Models and Agentic AI (with Andriy Burkov, Author "The Hundred-Page Language Models Book")

One Knight in Product

Play Episode Listen Later Apr 8, 2025 84:32


Andriy Burkov is a renowned machine learning expert and leader. He's also the author of (so far) three books on machine learning, including the recently-released "The Hundred-Page Language Models Book", which takes curious people from the very basics of language models all the way up to building their own LLM. Andriy is also a formidable online presence and is never afraid to call BS on over-the-top claims about AI capabilities via his punchy social media posts.
Episode highlights:
1. Large Language Models are neither magic nor conscious
LLMs boil down to relatively simple mathematics at an unfathomably large scale. Humans are terrible at visualising big numbers and cannot comprehend the size of the dataset or the number of GPUs that have been used to create the models. You can train the same LLM on a handful of records and get garbage results, or throw millions of dollars at it and get good results, but the fundamentals are identical, and there's no consciousness hiding in between the equations. We see good-looking output, and we think it's talking to us. It isn't.
2. As soon as we saw it was possible to do mathematics on words, LLMs were inevitable
There were language models before LLMs, but the invention of the transformer architecture truly accelerated everything. That said, the fundamentals trace further back to "simpler" algorithms, such as word2vec, which proved that it is possible to encode language information in a numeric format, which meant that the vast majority of linguistic information could be represented by embeddings, which enabled people to run equations on language. After that, it was just a matter of time before they got scaled out. (A toy sketch of this vector arithmetic follows this list.)
3. LLMs look intelligent because people generally ask about things they already know about
The best way to be disappointed by an LLM's results is to ask detailed questions about something you know deeply. It's quite likely that it'll give good results to start with, because most people's knowledge is so unoriginal that, somewhere in the LLM's training data, there are documents that talk about the thing you asked about. But, it will degrade over time and confidently keep writing even when it doesn't know the answer. These are not easily solvable problems and are, in fact, fundamental parts of the design of an LLM.
4. Agentic AI relies on unreliable actors with no true sense of agency
The concept of agents is not new, and people have been talking about them for years. The key aspect of AI agents is that they need self-motivation and goals of their own, rather than being told to have goals and then simulating the desire to achieve them. That's not to say that some agents are not useful in their own right, but the goal of fully autonomous, agentic systems is a long way off, and may not even be solvable.
5. LLMs represent the most incredible technical advance since the personal computer, but people should quit it with their most egregious claims
LLMs are an incredible tool and can open up whole new worlds for people who are able to get the best out of them. There are limits to their utility, and some of their shortcomings are likely unsolvable, but we should not minimise their impact. However, there are unethical people out there making completely unsubstantiated claims based on zero evidence and a fundamental misunderstanding of how these models work. These people are scaring people and encouraging terrible decision-making from the gullible. We need to see through the hype.
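Point 2 is easy to make tangible. Below is a toy sketch of "mathematics on words": the four vectors are invented for illustration (real word2vec embeddings are learned from co-occurrence data and have hundreds of dimensions), but the arithmetic, king minus man plus woman landing nearest queen by cosine similarity, is exactly the kind of equation-on-language that made scaled-up language models feel inevitable.

```python
import numpy as np

# Invented 4-D "embeddings" purely for illustration; real word2vec vectors
# are learned, not hand-written, and are far higher-dimensional.
vecs = {
    "king":  np.array([0.9, 0.8, 0.1, 0.6]),
    "queen": np.array([0.9, 0.1, 0.8, 0.6]),
    "man":   np.array([0.1, 0.9, 0.1, 0.2]),
    "woman": np.array([0.1, 0.1, 0.9, 0.2]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: closer to 1.0 means "pointing the same way"
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Encode an analogy as arithmetic: king - man + woman ~= queen
target = vecs["king"] - vecs["man"] + vecs["woman"]
best = max(vecs, key=lambda w: cosine(vecs[w], target))
print(best, round(cosine(vecs[best], target), 3))  # -> queen 0.996
```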
Buy "The Hundred-Page Language Model Book" "Large language models (LLMs) have fundamentally transformed how machines process and generate information. They are reshaping white-collar jobs at a pace comparable only to the revolutionary impact of personal computers. Understanding the mathematical foundations and inner workings of language models has become crucial for maintaining relevance and competitiveness in an increasingly automated workforce. This book guides you through the evolution of language models, starting from machine learning fundamentals. Rather than presenting transformers right away, which can feel overwhelming, we build understanding of language models step by step—from simple count-based methods through recurrent neural networks to modern architectures. Each concept is grounded in clear mathematical foundations and illustrated with working Python code." Check it out on the book's website: https://thelmbook.com/. You can also check out Machine Learning Engineering: https://www.mlebook.com and The Hundred-Page Machine Learning Book: https://www.themlbook.com/. Follow Andriy You can catch up with Andriy here: LinkedIn: https://www.linkedin.com/in/andriyburkov/ Twitter/"X": https://twitter.com/burkov True Positive Newsletter: https://aiweekly.substack.com/

Chain Reaction
Travis Good: Machine Intelligence as a new world currency: facing down OpenAI with Ambient, a hyperscaled decentralized PoW-powered alternative

Chain Reaction

Play Episode Listen Later Apr 7, 2025 91:23


Join Tom Shaughnessy as he hosts Travis Good, CEO and co-founder of Ambient, for a deep dive into the world's first useful proof-of-work blockchain powered by AI. Fresh out of stealth, Ambient reimagines the intersection of crypto and AI by creating a decentralized network where mining secures the chain through verified AI inference on a 600B+ parameter model.

The Tech Blog Writer Podcast
3230: Inside io.net's On-Demand GPU Infrastructure

The Tech Blog Writer Podcast

Play Episode Listen Later Apr 4, 2025 52:17


What happens when blockchain meets AI infrastructure at scale? In today's episode, I sit down with Tory Green from io.net to explore how a decentralized GPU network could reshape the future of machine learning, AI development, and compute accessibility. io.net has grown rapidly over the past year. With more than 325,000 verified GPUs already in its decentralized network, it's offering an alternative to the high costs and limitations of traditional cloud compute. What caught my attention is the platform's ability to reduce GPU costs by up to 90 percent, giving startups and researchers access to performance that would otherwise be out of reach. In fact, over 73 partners have already integrated io.net's infrastructure, helping drive month-over-month network earnings growth of nearly 60 percent. But this conversation goes far deeper than computing. Tory walks me through the vision of a more transparent, open, and incentive-driven AI development ecosystem. From its collaboration with OKX to power Web3 infrastructure for AI developers to enabling real-world applications like Zerebro AI agents, io.net is building a new paradigm for how infrastructure should work in the era of intelligent systems. We also dive into the convergence of blockchain and AI, and why this isn't just a niche Web3 experiment. It's about creating real incentives for data sharing, enabling collaboration across models, and removing bottlenecks in how builders access the tools they need. Tory also shares how the company is evolving from a computing network into a full-stack AI development platform, including tools for no-code agent creation. So what will the next generation of AI applications look like when they're powered by a global, decentralized network instead of a handful of cloud giants? And how can developers take advantage of this shift today?

The Tech Blog Writer Podcast
3250: Couchbase: Overcoming Infrastructure Hurdles in Enterprise AI

The Tech Blog Writer Podcast

Play Episode Listen Later Apr 1, 2025 33:51


Here's the thing: we all heard Sundar Pichai say that the easy wins in AI have faded and that we may see fewer headline‑grabbing releases from the big players over the next year. That comment feels like a red flag for momentum, but I see it as a green light for action.  In this episode, I chatted with Rahul Pradhan, VP of Product and Strategy at Couchbase, about how teams can take advantage of this pause to move projects from simple experiments into solid, production‑ready services. I ask why many organizations hesitate to send their data to public AI endpoints. Rahul explains that when you've invested years building data platforms, handing over your proprietary information—even in encrypted form—can feel like handing over the keys to your kingdom. He walks us through how running models inside your security perimeter keeps private data safe and brings up model accuracy since you can tailor inputs and scrub out noise before it ever reaches the inference engine. Next, we tackle the question of stability. Companies often assume that the path to a live service is straightforward once a pilot works. Rahul warns that managing GPUs, orchestrating models, and serving them at low latency all require skill sets that live at the crossroads of ML engineering and traditional software development. We round out our conversation by shifting focus from tools to teams. Technology alone cannot carry an AI initiative. We need leaders who set a clear vision, data stewards who govern every data flow, and developers who feel as comfortable writing database queries as they define training pipelines. Rahul offers thoughtful advice on building that culture and shares examples of industries—healthcare, financial services, and retail—where the most far‑reaching uses of AI are taking root. If you're wondering how to push your proof of concept into a robust service that customers depend on, this episode is for you. I promise you'll come away with ideas you can apply tomorrow and a fresh view of why a little breathing room in AI releases can become the launch pad for your subsequent big success.    

Mo News
Trump Angry At Putin; President Discusses Third Term; Top Vaccine Scientist Pushed Out; ChatGPT Images Breaks Internet

Mo News

Play Episode Listen Later Mar 31, 2025 44:18


A daily non-partisan, conversational breakdown of today's top news and breaking news stories.
Headlines:
– Trump Says He's ‘Very Angry' at Putin; Says He Doesn't Care If Foreign Car Prices Rise (03:40)
– Americans' Economic Outlook A Bit More Pessimistic, Despite Egg Prices Plummeting (08:50)
– Will Anyone Be Fired In The Aftermath Of Signalgate? (11:30)
– Trump Says There Are “Methods” For Him To Pursue A Third Term (17:30)
– Myanmar, Thailand Quake Death Toll Rises Above 1,600 (19:50)
– RFK Jr. Forces Out Peter Marks, FDA's Top Vaccine Scientist (22:45)
– Leader Of Violent MS-13 Gang Arrested In Virginia, Feds Say (26:00)
– Columbia President Is Replaced as Trump Threatens University's Funding (27:30)
– Sam Altman: ChatGPT's Viral Image-Generation AI is ‘Melting' OpenAI's GPUs (32:20)
– NCAA Final Four Is Set On Men's Side (34:20)
– On This Day In History (36:00)
Thanks To Our Sponsors:
– Vanta – Get $1,000 off
– Shopify – $1 per-month trial Code: monews
– Industrious - Coworking office. 30% off day pass
– LMNT - Free Sample Pack with any LMNT drink mix purchase
– Athletic Greens – AG1 Powder + 1 year of free Vitamin D & 5 free travel packs
– BetterHelp – 10% off your first month

The Circuit
Episode 111: Talking Chips and Wafers with Chips and Wafers

The Circuit

Play Episode Listen Later Mar 31, 2025 55:06


In this episode, Ben Bajarin and Jay Goldberg engage with Simi Sherman and Chaim Eisenberg from Chips and Wafers to explore the intricacies of the semiconductor industry. They discuss the importance of both qualitative and quantitative analysis in understanding market trends, the challenges of data collection, and the unique insights their company provides. The conversation delves into the competitive landscape of ASICs versus GPUs, the significance of tracking various data points, and how this information can be leveraged for predictive analysis in investments. In this conversation, Simi Sherman and Ben Bajarin delve into the intricacies of investment data, emphasizing the importance of using the right data points for informed decision-making. They discuss specific company examples, the predictive power of data, and the evolving landscape of the semiconductor industry, particularly the shift towards disaggregated designs and chiplets. The conversation highlights the gap between investor expectations and company performance, and concludes with insights into how analysts can leverage data to build a clearer picture of future trends.

The Vergecast
OpenAI has a Studio Ghibli problem

The Vergecast

Play Episode Listen Later Mar 28, 2025 124:36


In this episode, we do a Studio Ghibli-like rendition of The Vergecast. First, Nilay and David discuss some big news in the gadget world, from the mysteriously viral midrange Canon camera to the upgrades we're expecting out of Apple in the next few months. Plus, is it over for Amazon's Echo brand? After all that, The Verge's Kylie Robison joins the show to discuss everything happening at OpenAI: the company launched a new image generator inside of ChatGPT, and it immediately became both a huge hit and a big mess. (Par for the course with OpenAI, really.) Kylie also explains why Perplexity is probably not buying TikTok, no matter how much it might want to. Finally, in the lightning round, it's time for everyone's favorite segment, Brendan Carr Is a Dummy, followed by the latest on the Signal attack-planning chaos in the government, some news about Elon Musk pressuring Reddit CEO Steve Huffmann, and what's next for the car industry with huge tariffs looming. Oh, and a little bit of exciting e-bike news.
Further reading:
From Meta: Bringing the Magic of Friends Back to Facebook
Apple's AirPods Max with USB-C will soon support lossless audio
The Apple Watch may get cameras and Apple Intelligence
Apple's WWDC 2025 event starts June 9th
Don't expect an overhauled Messages app in iOS 19.
Amazon tests renaming Echo smart speakers and smart displays to just ‘Alexa'
OpenAI reshuffles leadership as Sam Altman pivots to technical focus
OpenAI upgrades image generation and rolls it out in ChatGPT and Sora
ChatGPT's new image generator is delayed for free users
ChatGPT is turning everything into Studio Ghibli art
OpenAI says ‘our GPUs are melting' as it limits ChatGPT image generation requests
OpenAI expects to earn $12.7 billion in revenue this year.
Nvidia Infinite Creative
Microsoft adds ‘deep reasoning' Copilot AI for research and data analysis
Google says its new ‘reasoning' Gemini AI models are the best ones yet
Google is rolling out Gemini's real-time AI video features
Perplexity's bid for TikTok continues
Trump's FCC says it will start investigating Disney, too
From Status: Sounding the Carr Alarm
Trump officials leaked a military strike in a Signal group chat
The Atlantic releases strike group chat messages
And the Most Tortured Signal-Gate Backronym Award goes to… | The Verge
Elon Musk pressured Reddit's CEO on content moderation | The Verge
Trump's plans to save TikTok may fail to keep it online, Democrats warn
Rivian spins out secret e-bike lab into a new company called Also
BYD beats Tesla.
Trump says he will impose a 25 percent tariff on imported vehicles
Email us at vergecast@theverge.com or call us at 866-VERGE11, we love hearing from you. Learn more about your ad choices. Visit podcastchoices.com/adchoices

Explain Like I'm Five - ELI5 Mini Podcast
ELI5 GPUs - why are they better than CPUs for artificial intelligence?

Explain Like I'm Five - ELI5 Mini Podcast

Play Episode Listen Later Mar 28, 2025 7:57


What distinguishes CPUs from GPUs in architecture, and how does this impact their performance in computing tasks? Why are GPUs considered better at handling tasks like graphics rendering compared to CPUs? How do different rendering techniques in games versus offline programs affect the processing demands on CPUs and GPUs?  ... we explain like I'm five Thank you to the r/explainlikeimfive community and in particular the following users whose questions and comments formed the basis of this discussion: insane_eraser, popejustice, warlocktx, pourliver, dmartis, and arentol. To the community that has supported us so far, thanks for all your feedback and comments. Join us on Twitter: https://www.twitter.com/eli5ThePodcast/ or send us an e-mail: ELI5ThePodcast@gmail.com
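A rough way to feel the answer the episode gives: a CPU has a few sophisticated cores, while a GPU has thousands of simple ones that apply the same small operation to huge batches of data. The sketch below only mimics that data-parallel style on a CPU using NumPy's vectorized ops, so the speedup is just a hint, but it shows why "same operation over millions of pixels or matrix entries" workloads, including AI math, map so naturally onto GPUs.

```python
import time
import numpy as np

# Serial style: touch one element at a time, like a single CPU thread.
# Data-parallel style: one operation over the whole array at once, which
# is the access pattern GPUs are built to run across thousands of cores.
n = 2_000_000
a = np.random.rand(n)
b = np.random.rand(n)

t0 = time.perf_counter()
serial = [a[i] * b[i] for i in range(n)]   # element-by-element loop
t1 = time.perf_counter()
parallel_style = a * b                     # whole array in one vectorized op
t2 = time.perf_counter()

print(f"serial-style loop:   {t1 - t0:.2f} s")
print(f"vectorized multiply: {t2 - t1:.4f} s")
```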

PC Perspective Podcast
Podcast #816 - RTX 5060 Delay, RX 9070 Breaks Sales Records, Thrustmaster T818 Review, and much MORE

PC Perspective Podcast

Play Episode Listen Later Mar 28, 2025 76:16


Josh finally had another burger, and published his SECOND review of the month!! All hail the Thrustmaster! Mindfactory never went anywhere, Windows likes your printers again, and all the GPU news you can stand! And we know who you are, since 23andMe sold all your data... J/K!
00:00 Intro
01:33 Food with Josh
03:19 RTX 5060 series delay rumor, 8GB and 16GB Ti confirmed
06:19 AMD says RX 9070 series had 10x more first-week sales
12:12 Making wafers at TSMC Arizona might be just 10% more expensive
14:02 Mindfactory attempts a comeback
15:31 Windows update fixes printer issues
16:32 Also, Windows update breaks VEEAM recovery
19:28 MSI selling PSUs with only one 8-pin PCIe connector
21:58 Google Maps may have deleted your timeline data
25:48 23andMe potentially selling all of your personal data
29:31 Podcast sponsor - Incogni
30:55 (in)Security Corner
43:16 Podcast sponsor - Stash
44:25 Gaming Quick Hits
51:36 Thrustmaster T818 Review
1:05:51 Picks of the Week
1:14:27 Outro
★ Support this podcast on Patreon ★

The Jubal Show
Nina's What's Trending - From "Brat Summer" to "Dilly-Dally Spring," BTS Lullabies, and Melting GPUs

The Jubal Show

Play Episode Listen Later Mar 28, 2025 5:15 Transcription Available


The Headlines:
Brat Summer Is Out—Welcome to Dilly-Dally Spring
BTS Drops Lullaby Album for the Next Generation
OpenAI Struggles to Keep Up with Demand as GPUs "Melt"
"Brat Summer" Is Over—Make Way for "Dilly-Dally Spring"
Last year, Brat Summer took over social media, fueled by Charli XCX’s bass-heavy party album Brat. The trend embodied wild, reckless nights, smudged makeup from the night before, and spontaneous tattoos. But 2025 is ushering in a different vibe—TikTokers are now embracing Dilly-Dally Spring. Instead of partying until dawn, the new aesthetic is all about doing… absolutely nothing. Slower days, lazy afternoons, and taking life at a leisurely pace are the new seasonal goals. Will you be dilly-dallying this spring? Source
BTS Is Making Lullabies Now—Yes, Really
BTS may be on hiatus for military service, but that’s not stopping them from reaching a new audience—babies. The K-pop giants are lending their music to a Rockabye Baby! album, turning hits like Butter, Permission to Dance, and Dynamite into soothing lullabies. The album drops next Friday, making it the perfect soundtrack for the next generation of BTS stans—starting from the crib. Source
OpenAI Limits Image Generation as GPUs "Melt" Under Demand
The hype around ChatGPT’s AI image generation has been so intense that OpenAI is struggling to keep up. CEO Sam Altman announced on X that the company has temporarily limited image requests, saying, “It’s super fun seeing people love images in ChatGPT, but our GPUs are melting.” No word yet on how long the limit will last, but OpenAI is working to boost efficiency to handle the overwhelming demand. Source
Nina's What's Trending is your daily dose of the hottest headlines, viral moments, and must-know stories from The Jubal Show! From celebrity gossip and pop culture buzz to breaking news and weird internet trends, Nina’s got you covered with everything trending right now. She delivers it with wit, energy, and a touch of humor. Stay in the know and never miss a beat—because if it’s trending, Nina’s talking about it!
This is just a tiny piece of The Jubal Show. You can find every podcast we have, including the full show every weekday right here…
➡︎ https://thejubalshow.com/podcasts
The Jubal Show is everywhere, and also these places:
Website ➡︎ https://thejubalshow.com
Instagram ➡︎ https://instagram.com/thejubalshow
X/Twitter ➡︎ https://twitter.com/thejubalshow
TikTok ➡︎ https://www.tiktok.com/@the.jubal.show
Facebook ➡︎ https://facebook.com/thejubalshow
YouTube ➡︎ https://www.youtube.com/@JubalFresh
Support the show: https://the-jubal-show.beehiiv.com/subscribe
See omnystudio.com/listener for privacy information.

The Hardware Unboxed Podcast
GPU Pricing is Bad But Improving?

The Hardware Unboxed Podcast

Play Episode Listen Later Mar 21, 2025 109:25


Episode 64: This week we're chatting about GPU pricing and supply, Intel missing a big opportunity with the B580, upcoming entry-level GPUs and how they should be positioned, and the Ryzen 9 9950X3D.
CHAPTERS
00:00 - Intro
02:43 - GPU Pricing Is All Sorts of Bad
16:30 - Bye Bye RTX 5090 Supply
29:22 - Intel Misses B580 Opportunity
41:05 - Entry Level GPU Discussion
1:06:34 - AMD Should Learn from RX 9070
1:17:47 - Ryzen 9 9950X3D Recap
1:26:36 - Updates From Our Boring Lives
SUBSCRIBE TO THE PODCAST
Audio: https://shows.acast.com/the-hardware-unboxed-podcast
Video: https://www.youtube.com/channel/UCqT8Vb3jweH6_tj2SarErfw
SUPPORT US DIRECTLY
Patreon: https://www.patreon.com/hardwareunboxed
LINKS
YouTube: https://www.youtube.com/@Hardwareunboxed/
Twitter: https://twitter.com/HardwareUnboxed
Bluesky: https://bsky.app/profile/hardwareunboxed.bsky.social
Hosted on Acast. See acast.com/privacy for more information.

DLC
BONUS CONTENT: Half-Life 2 RTX

DLC

Play Episode Listen Later Mar 18, 2025 22:04


Jeff and Christian received 50-series GPUs from Nvidia to test out a brand-new demo of Half-Life 2 RTX, the update to the FPS classic utilizing DLSS 4, full path ray-tracing, updated textures, and so much more. The embargo is up, and the guys are excited to share their thoughts about the demo.

Broken Silicon
301. AMD 9070 XT Ultimate Edition, FSR 4 vs DLSS 4, Nvidia Supply | Ancient Gameplays

Broken Silicon

Play Episode Listen Later Mar 17, 2025 101:31


Fabio joins to discuss how close FSR 4 is to DLSS 4, RX 9000 Series, and the future of GPUs! [SPON: Use "brokensilicon“ at CDKeyOffer for $23 Win11 Pro: https://www.cdkeyoffer.com/cko/Moore11 ] [SPON: Check out MINISFORUM's AI X1 Pro Zen 5 Mini PC: https://shrsl.com/4uyi9 ] 0:00 Fabio's YouTube Origins (Intro Banter) 3:28 RDNA 4 Launch Discussion, Frame Generation Thoughts 9:38 Nvidia Blackwell Thoughts 15:23 Nvidia's Plummeting Mindshare, RADEON Market Share 23:15 Zen 6 Medusa Halo vs RTX 5000 Laptops 31:28 RDNA 4 Architecture & Ray Tracing 38:39 RX 9070 XT 32GB Ultimate Edition 46:09 RDNA 4 Reference Coolers, AIB Control 51:29 FSR 4 vs DLSS 4 Analysis 1:10:52 European Supply & Demand, RX 9060 XT Pricing, RTX 5050 1:25:39 FSR 4 on Linux, Shopping Advice, Wafer Costs Check out Ancient Gameplays: https://www.youtube.com/ancientgameplays Last Episode Fabio was on: https://youtu.be/vyQxNN9EF3w?si=AfgncxfOalyJABQv AG 9070 XT Review: https://youtu.be/KnL_PtQBGqk?si=WIjG67LaaoYljQ9m MLID 9070 XT Analysis: https://youtu.be/huy65HPPLSY?si=vwblHxshld7mGX6S MLID Supply Update: https://www.youtube.com/live/hgq-7ViVPx8?si=SAmvtdkbOTnl62-7 https://www.techpowerup.com/gpu-specs/geforce-rtx-5080.c4217

Thoughts on the Market
Will GenAI Turn a Profit in 2025?

Thoughts on the Market

Play Episode Listen Later Mar 3, 2025 12:49


Our Semiconductors and Software analysts Joe Moore and Keith Weiss dive into the biggest market debate around AI and why it's likely to shape conversations at Morgan Stanley's Technology, Media and Telecom (TMT) Conference in San Francisco. ----- Transcript ----- Joe Moore: Welcome to Thoughts on the Market. I'm Joe Moore, Morgan Stanley's Head of U.S. Semiconductors. Keith Weiss: And I'm Keith Weiss, Head of U.S. Software. Joe Moore: Today on the show, one of the biggest market debates in the tech sector has been around AI and the Return On Investment, or ROI. In fact, we think this will be the number one topic of conversation at Morgan Stanley's annual Technology, Media and Telecom (TMT) conference in San Francisco. And that's precisely where we're bringing you this episode from. It's Monday, March 3rd, 7am in San Francisco. So, let's get right into it. ChatGPT was released November 2022. Since then, the biggest tech players have gained more than $9 trillion in combined market capitalization. They're up more than double the amount of the S&P 500 index. And there's a lot of investor expectation for a new technology cycle centered around AI. And that's what's driving a lot of this momentum. You know, that said, there's also a significant investor concern around this topic of ROI, especially given the unprecedented level of investment that we've seen and sparse data points still on the returns. So where are we now? Is 2025 going to be the year when the ROI on GenAI finally turns positive? Keith Weiss: If we take a step back and think about the staging of how innovation cycles tend to play out, I think it's helpful context. And it starts with research. I would say the period up until when ChatGPT was released – up until that November 2022 – was a period where the fundamental research was being done on the transformer models, utilizing machine learning. And what fundamental research is, is trying to figure out if these fundamental capabilities are realistic. If we can do this in software, if you will. And with the release of ChatGPT, it was a very strong, uh, stamp of approval of 'Yes, these transformer models can work.' Then you start stage two. And I think that's basically November '22 through where we are today, where you have two tracks going on. One is development. So these large language models, they can do natural language processing well. They can contextually understand unstructured and semi-structured data. They can generate content. They could create text; they could create images and videos. So, there's these fundamental capabilities. But you have to develop a product to get work done. How are we going to utilize those capabilities? So, we've been working on development of product over the past two years. And at the same time, we've been scaling out the infrastructure for that product development. And now, heading into 2025, I think we're ready to go into the next stage of the innovation cycle, which will be market uptake. And that's when revenue starts to flow to the software companies that are trying to automate business processes. We definitely think that monetization starts to ramp in 2025, which should prove out a better ROI or start to prove out the ROI of all this investment that we've been making. Joe Moore: Morgan Stanley Research projects that GenAI can potentially drive a $1.1 trillion revenue opportunity in 2028, up from $45 billion in 2024.
Can you break this down for our listeners? Keith Weiss: We recently put out a report where we tried to size what the revenue generation capability is from generative AI, because that's an important part of this ROI equation. You have the return on the top of where you could actually monetize this. On the bottom, obviously, investment. And we took a look at all the investment needed to serve this type of functionality. The $1.1 trillion, if you will, breaks down into two big components. One side of the equation is in my backyard, and that's the enterprise software side of the equation. It's about a third of that number. And what we see occurring is the automation of more and more of the work being done by information workers, and by people overall. And what we see is about 25 percent of overall labor being impacted today. And we see that growing to over 45 percent over the next three years. So, what that's going to look like from a software perspective is an opportunity ramping up to just about $400 billion of software opportunity by 2028. At that point, generative AI will represent about 22 percent of overall software spending; at that point, we expect the overall software market to be about a $1.8 trillion market. The other side of the equation, the bigger side of the equation, is actually the consumer platforms. And that kind of makes sense if you think about the broader economy: it's basically one-third B2B, two-thirds B2C. The automation is relatively equivalent on both sides of the equation.
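A quick back-of-the-envelope check of the figures Keith cites (a reader's sketch, not Morgan Stanley's model): the $1.1 trillion splits roughly one-third B2B and two-thirds B2C, and the roughly $400 billion software slice works out to about 22 percent of a $1.8 trillion software market.

```python
# Figures from the episode; the splits are approximate.
total_2028 = 1.1e12                    # total GenAI revenue opportunity, 2028
b2b = total_2028 / 3                   # enterprise software, "about a third"
b2c = total_2028 * 2 / 3               # consumer platforms, the bigger side

software_market_2028 = 1.8e12          # projected overall software market
genai_software = 400e9                 # GenAI software opportunity cited

print(f"B2B ~ ${b2b/1e9:,.0f}B, B2C ~ ${b2c/1e9:,.0f}B")
print(f"GenAI share of software spend: {genai_software/software_market_2028:.0%}")  # ~22%
```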
Joe Moore: So, let's drill further into your outlook for software. What are the biggest catalysts you expect to see this year, and then over the coming three years? Keith Weiss: The key catalyst for this year is proving out the efficacy of these solutions, right? Proving out that they're going to drive productivity gains and yield real hard-dollar ROI for the end customer. And I think where we'll see that is from labor savings. Once that occurs, and I think it's going to be over the next 12 to 18 months, then we go into the period of mainstream adoption. You need to start utilizing these technologies to drive the efficiencies within your businesses to be able to keep up with your competitors. So, that's the main thing that we're looking for in the near term. Over the next three years, what you're looking for is the breakthrough technologies. Where can we find opportunities not just to create efficiencies within existing processes, but to completely rewrite the business process? That's where you see new big companies emerge within the software opportunity – the people that really fundamentally change the equation around some of these processes. So, Joe, turning it over to you, hardware remains a bottleneck for AI innovation. Why is that the case? And what are the biggest hurdles in the semiconductor space right now? Joe Moore: Well, this has proven to be an extremely computationally intensive application, and I think it started with training – where you started seeing tens of thousands of GPUs or XPUs clustered together to train these big models, these Large Language Models. And you started hearing comments two years ago around the development of ChatGPT that, you know, the scaling laws are tricky. You might need five times as much hardware to make a model that's 10 percent smarter. But the challenge of making a model that's 10 percent smarter, the table stakes of that, are very significant. And so, you see, you know, those investments continuing to scale up. And that's been a big debate for the market. But we've heard from most of the big spenders in the market that we are continuing to scale up training. And then after that happened, we started seeing inference suddenly as a big user of advanced processors, GPUs, in a way that they hadn't before. And that was sort of simple conversational types of AI. Now as you start migrating into more of a reasoning AI, a multi-pass approach, you're looking at a really dramatic scaling in the amount of hardware that's required from both GPUs and XPUs. And at the same time the hardware companies are focused a lot on how do we deliver that – so that it doesn't become prohibitively expensive; which it is very expensive. But there's a lot of improvement. And that's where you're sort of seeing this tug of war in the stocks; that when you see something that's deflationary, uh, it becomes a big negative. But the reality is the hardware is designed to be deflationary because the workloads themselves are inflationary. And so I think there's a lot of growth still ahead of us. A lot of investment, and a lot of rich debate in the market about this. Keith Weiss: Let's pull on that thread a little bit. You talked initially about the scaling of the GPU clusters to support training. Over the past year, we've gotten a little bit more pushback on the ideas or the efficacy of those scaling laws. They've come more under question. And at the same time, we've seen the availability of some lower-cost, but still very high-performance models. Is this going to reshape the investments from the large semiconductor players in terms of how they're looking to address the market? Joe Moore: I think we have to assess that over time. Right now, there are very clear comments from everybody who's in charge of scaling large models that they intend to continue to scale. I think there is a benefit to doing so from the standpoint of creating a richer model, but is the ROI there? You know, and that's where I think, you know, your numbers do a very good job of justifying our model for our core companies – where we can say, okay, this is not a bubble. This is investment that's driven by these areas of economic benefit that our software and internet teams are seeing. And I think there is a bit of an arms race at the high end of the market where people just want to have the biggest cluster. And we think that's about 30 percent of the revenue right now in hardware – supporting those really big models. But we're also seeing, to your point, a very rich hardware configuration on the inference side, post-training model customization. Nvidia said on their earnings call recently that they see several orders of magnitude more compute required for those applications than for that pre-training. So, I think over time that's where the growth is going to come from. But you know, right now we're seeing growth really from all aspects of the market. Keith Weiss: Got it. So, a lot of really big opportunities out there utilizing these GPUs and ASICs, but also a lot of unknowns and potential risks. So, what are the key catalysts that you're looking for in the semiconductor space over the course of this year and maybe over the next three years? Joe Moore: Well, 2025 is a year that is really mostly about supply. You know, we're ramping up new hardware, but also several companies are doing custom silicon.
We have to ramp all that hardware up and it's very complicated. It uses every kind of trick and technique that semiconductors use to do advanced packaging and things like that. And so, it's a very challenging supply chain and it has been for two years. And fortunately, it's happened at a time when there's plenty of semiconductor capacity out there. But I think, you know, we're ramping very quickly. And I think what you're seeing is the things that matter this year are gonna be more about how quickly we can get that supply, what are the gross margins on hardware, things like that. I think beyond that, we have to really get a sense of, you know, these ROI questions are really important beyond 2025. Because again, this is not a bubble. But hardware is cyclical, and it doesn't slow gracefully. So, there will be periods where investment may fall off and it'll be a difficult time to own the stocks. And that's, you know, we do think that over time, the value sort of transitions from hardware to software. But we model for 2026 to be a year where it starts to slow down a little bit. We start to see some consolidation in these investments. Now, 12 months ago, I thought that about 2025. So, the timeframe keeps getting pushed out. It remains very robust. But I think at some point it will plateau a little bit and we'll start to see some fragmentation; and we'll start to see markets like, you know, reasoning models, inference models becoming more and more critical. But when I hear you and Brian Nowak talking about the early stage we're at in actually implementing this stuff, it's clear that inference has a long way to go in terms of growth. So, we're optimistic around the whole AI space for semiconductors. Obviously, the market is as well. So, there's expectations, challenges there. But there's still a lot of growth ahead of us. So Keith, looking towards the future, as AI expands the functionality of software, how will that transform the business models of your companies? Keith Weiss: We're also fundamentally optimistic about software and what generative AI means for the overall software industry. If we look at software companies today, particularly application companies, a lot of what you're trying to do is make information workers more productive. So, it made a lot of sense to price based upon the number of people who are using your software. Or you've got a lot of seat-based models. Now we're talking about completely automating some of those processes, taking people out of the loop altogether. You have to price differently. You have to price based upon the number of transactions you're running, or some type of consumptive element of the amount of work that you're getting done. I think the other thing that we're going to see is the market opportunity expanding well beyond information workers. So, the way that we count the value, the way that we accrue the value might change a little bit. But the underlying value proposition remains the same. It's about automating, creating productivity in those business processes, and then the software companies pricing for their fair share of that productivity. Joe Moore: Great. Well, let me just say this has been a really useful process for me. The collaboration between our teams is really helpful because as a semiconductor analyst, you can see the data points, you can see the hardware being built. And I know the enthusiasm that people have on a tactical level.
But understanding where the returns are going to come from and what milestones we need to watch to see any potential course correction is very valuable.So on that note, it's time for us to get to the exciting panels at the Morgan Stanley TMT conference. Uh, And we'll have more from the conference on the show later this week. Keith, thanks for taking the time to talk.Keith Weiss: Great speaking with you, Joe.Joe Moore: And thanks for listening. If you enjoy Thoughts on the Market, please leave us a review wherever you listen and share the podcast with a friend or colleague today.
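
To make the "orders of magnitude more compute" point from the episode concrete, here is a minimal back-of-envelope sketch in Python. Every constant in it – the model size, tokens per reply, number of reasoning passes – is an illustrative assumption, not a figure from the episode; the takeaway is only that inference cost grows multiplicatively with output length and pass count.

# Back-of-envelope inference compute: single-pass chat vs. multi-pass reasoning.
# All constants are illustrative assumptions, not sourced figures.

PARAMS = 70e9                      # assume a 70B-parameter model
FLOPS_PER_TOKEN = 2 * PARAMS       # rough rule of thumb: ~2 FLOPs per parameter per generated token

def inference_flops(output_tokens: int, passes: int = 1) -> float:
    """Total FLOPs to serve one query, ignoring prefill and KV-cache effects."""
    return output_tokens * passes * FLOPS_PER_TOKEN

chat = inference_flops(output_tokens=300)                  # short conversational reply
reasoning = inference_flops(output_tokens=3000, passes=8)  # long chain of thought, several candidate passes

print(f"chat query:      {chat:.2e} FLOPs")
print(f"reasoning query: {reasoning:.2e} FLOPs")
print(f"ratio:           {reasoning / chat:.0f}x")

Under these made-up assumptions a single reasoning query costs about 80x a chat query; multiply that by growing usage and the dramatic hardware scaling Joe describes follows directly.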
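
Keith's pricing argument can be sketched the same way. The toy model below contrasts per-seat revenue with a hybrid seat-plus-consumption model once agents automate most of the work; all prices and volumes are invented for illustration, and the function names are hypothetical.

# Hypothetical comparison of seat-based vs. consumption-based software pricing.
# Every figure below is invented for illustration.

def seat_revenue(seats: int, price_per_seat_month: float) -> float:
    """Annual revenue under a classic per-seat subscription."""
    return seats * price_per_seat_month * 12

def usage_revenue(transactions_per_year: int, price_per_transaction: float) -> float:
    """Annual revenue when pricing follows work performed rather than headcount."""
    return transactions_per_year * price_per_transaction

# Before automation: 500 information workers, each on a $50/month seat.
before = seat_revenue(seats=500, price_per_seat_month=50.0)

# After automation: only 50 seats remain, but 2 million transactions run per year.
after = (seat_revenue(seats=50, price_per_seat_month=50.0)
         + usage_revenue(transactions_per_year=2_000_000, price_per_transaction=0.25))

print(f"seat-only model:     ${before:,.0f}/yr")   # $300,000/yr
print(f"hybrid seat + usage: ${after:,.0f}/yr")    # $530,000/yr

The point of the toy numbers is Keith's exactly: seats can shrink while vendor revenue grows, provided the pricing unit shifts from people to work performed.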

Market Mondays
Navigating Market Volatility: Insights on AI, Stock Opportunities, and Global Competition

Market Mondays

Play Episode Listen Later Feb 27, 2025 6:54


Welcome back to Market Mondays! In this insightful clip, hosts Rashad Bilal, Ian Dunlap, and Troy Millings dive into the current state of the stock market, analyzing the recent volatility and its implications for investors. With the Dow dropping 700 points last Friday, the financial experts dissect whether we are experiencing a cycle of buying opportunities or continued uncertainty.

The trio kicks off the discussion by addressing investor anxieties amid a turbulent market. Ian Dunlap offers lessons on market volatility, emphasizing that what goes up must come down, and foresees a potential recovery between Tuesday and Thursday. In discussing the market's intricacies, Ian highlights factors such as futures expiration, executive leadership plans in Washington, and the impact of aggressive cuts through DOGE, all contributing to the current financial climate.

Troy Millings adds to the discourse, pointing out Meta's historic 21-day run of stock appreciation and the subsequent pullback. Viewing it as a natural correction, Troy encourages investors to see this as an opportunity for better entry points. Alongside Meta, they delve into other tech giants with robust AI spending plans, including Nvidia's forthcoming earnings report, anticipated as a significant event in the first quarter.

As the conversation broadens to international competition, Troy and Ian emphasize the global AI race between China and the United States. Alibaba's allocation of over 70 billion into AI and its purchase of 230,000 GPUs set the stage for a tech revolution, signaling China's keen interest in maintaining a competitive edge.

The discussion reflects on the broader implications of AI advancements, citing companies like Baidu, the Chinese counterpart to Google, and BYD versus Tesla. The hosts stress the significance of the ongoing tech civil war and the importance of the United States remaining a frontrunner in the AI domain.

Lastly, the clip underscores the critical need to monitor both domestic and international developments. China's strategic moves and scheduled meetings with large-cap companies highlight the importance of staying informed about global market trends and events that could substantially influence the AI landscape.

Join us as we navigate the complexities of today's market and explore the endless possibilities and challenges on the horizon. Stay tuned to Market Mondays for more expert insights and comprehensive analyses. Don't forget to like, subscribe, and hit the notification bell to stay updated!

#MarketMondays #StockMarket #AIRevolution #InvestmentOpportunities #GlobalCompetition #MarketVolatility #TechCivilWar #FinanceInsights #DowJones #AI #InvestingTips

Support this podcast at — https://redcircle.com/marketmondays/donations
Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy