Podcasts about Observability

  • 449 podcasts
  • 1,383 episodes
  • 41m average duration
  • 5 new episodes weekly
  • Latest episode: Jul 19, 2025

[Chart: popularity of the topic by year, 2017 to 2024]

Latest podcast episodes about Observability

Software Lifecycle Stories
Insights into Dev Tooling with Tom Elliott

Jul 19, 2025 (50:51)


I am in conversation with Tom Elliott, founder of Ocuroot and former Engineering Productivity lead at Yext.

Introduction:
• Tom Elliott shares his career journey, from his early interest in computers to his current role in dev tooling.

Career Insights:
• Tom discusses the challenges of entering the industry during the financial crash and his transition from contract work to a full-time role at VMware.
• He highlights his experience at VMware, working on early-stage projects like building login pages and authentication systems.

Shift to New York:
• Tom talks about his move to New York and his work at a small VPN startup, focusing on user-facing applications.

Experience at Yext:
• Tom shares his journey at Yext, starting as a mobile developer and gradually moving to backend development and dev tooling.
• He emphasizes the importance of being close to the users and getting immediate feedback on the tools he built.

Challenges and Solutions:
• Tom discusses the challenges of working in large organizations, such as resolving merge conflicts and managing long-lived branches.
• He explains the benefits of trunk-based development and feature flags for managing multiple features and environments.

Observability and Deployment:
• Tom highlights the importance of observability and the use of tools like OpenTelemetry for distributed tracing.
• He shares insights on managing different deployment environments and ensuring consistency across regions.

Quality and CI/CD Pipelines:
• Tom talks about the emphasis on quality and the importance of CI/CD pipelines in ensuring reliable software releases.
• He shares his experience of setting up CI/CD pipelines to avoid issues like broken installers.

Conclusion:
• Tom reflects on the importance of flexibility and prototyping in software development.
• He shares his thoughts on the future of AI in coding and the role of human operators in leveraging AI tools.

Bio:
During nearly 20 years in the tech industry, Tom has worked for companies large and small on both sides of the pond and all layers of the tech stack, from user-facing mobile and desktop applications to the backest of backends: DevOps. He is currently building Ocuroot, his own take on a CI/CD solution, based on his experiences scaling large numbers of environments for B2B SaaS products.

Links:
• LinkedIn: https://www.linkedin.com/in/telliott1984/
• BlueSky: https://bsky.app/profile/telliott.me
• Blog: https://thefridaydeploy.substack.com/
• Ocuroot: https://www.ocuroot.com
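
The episode summary mentions feature flags as a way to keep unfinished work on trunk while controlling where it is active. A minimal sketch of that idea follows, assuming a hypothetical in-memory flag table; the names `FLAGS`, `is_enabled`, `checkout`, and the environment names are invented for illustration and do not come from the episode or from any particular flag service.

```python
# Hypothetical flag table: flag name -> environments where the flag is on.
# A real setup would usually load this from a config service, not a dict.
FLAGS = {
    "new_checkout": {"dev", "staging"},
    "dark_mode": {"dev", "staging", "production"},
}


def is_enabled(flag: str, environment: str) -> bool:
    """Return True if the flag is switched on for the given environment.

    Unknown flags default to off, so half-finished code stays dark.
    """
    return environment in FLAGS.get(flag, set())


def checkout(environment: str) -> str:
    # Both code paths live on trunk; the flag picks one at runtime.
    if is_enabled("new_checkout", environment):
        return "new checkout flow"
    return "legacy checkout flow"


print(checkout("staging"))     # new checkout flow
print(checkout("production"))  # legacy checkout flow
```

Defaulting unknown flags to off is one possible design choice: merged code then stays inactive until someone deliberately enables it for an environment.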

inControl
ep34 - The inControl guide to ... Controllability & Observability

Jul 17, 2025 (59:28)


Outline
• 00:00 - Intro
• 01:06 - The big idea
• 03:42 - Controllability, observability, and ... the space race!
• 14:52 - Kálmán and the state-space paradigm
• 00:00 - The math and intuition: state-space basics, definitions, and duality
• 00:00 - A touch of nonlinearity
• 00:00 - Developments in the field: a chronological tour
• 00:00 - Controllability and observability: quo vaditis?
• 00:00 - Outro

Links
• Kálmán: https://tinyurl.com/bdzj7mtr
• Controllability: https://tinyurl.com/28s5zxpn
• Observability: https://tinyurl.com/yjxncxdn
• Paper - "Contributions to the theory of optimal control": https://tinyurl.com/9wwf8pvh
• Paper - "Discovery and Invention": https://tinyurl.com/ryfn463n
• Kálmán's speech - Kyoto Prize: https://tinyurl.com/2ahrjdah
• Paper - Controllability of complex networks: https://tinyurl.com/3zk99n4s

Podcast info
• Podcast website: https://www.incontrolpodcast.com/
• Apple Podcasts: https://tinyurl.com/5n84j85j
• Spotify: https://tinyurl.com/4rwztj3c
• RSS: https://tinyurl.com/yc2fcv4y
• Youtube: https://tinyurl.com/bdbvhsj6
• Facebook: https://tinyurl.com/3z24yr43
• Twitter: https://twitter.com/IncontrolP
• Instagram: https://tinyurl.com/35cu4kr4

Acknowledgments and sponsors
This episode was supported by the National Centre of Competence in Research on «Dependable, ubiquitous automation» and the IFAC Activity fund. The podcast benefits from the help of an incredibly talented and passionate team. Special thanks to L. Seward, E. Cahard, F. Banis, F. Dörfler, J. Lygeros, ETH studio and mirrorlake. Music was composed by A New Element.
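
For reference, the state-space definitions and the duality named in the outline can be written down in their standard textbook form for a linear time-invariant system. This is a sketch under the usual conventions, not a transcript of the episode; the symbols $A$, $B$, $C$, $x$, $u$, $y$, and $n$ are assumptions and do not appear in the show notes.

```latex
\begin{align}
\dot{x}(t) &= A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t), \qquad x(t) \in \mathbb{R}^{n} \\
\mathcal{C} &= \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix},
  \qquad (A, B)\ \text{controllable} \iff \operatorname{rank}\,\mathcal{C} = n \\
\mathcal{O} &= \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix},
  \qquad (A, C)\ \text{observable} \iff \operatorname{rank}\,\mathcal{O} = n \\
(A, C)\ \text{observable} &\iff (A^{\top}, C^{\top})\ \text{controllable}
\end{align}
```

The last line is the duality: transposing the observability matrix of $(A, C)$ gives the controllability matrix of $(A^{\top}, C^{\top})$, so the two rank tests coincide.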

Getup Kubicast
#176 - IA + DevOps & Machine Learning

Jul 17, 2025 (61:25)


We welcome Daniel Romeiro, better known as Infoslack, to dive headfirst into the fast-moving world of Artificial Intelligence, DevOps, and Machine Learning. In this episode, we explore how to filter out the noise of the hype with a reverse-filter approach and discuss what goes on behind the scenes when deploying Machine Learning models to production.

We trade experiences on advanced observability in AI pipelines and share insights on how to build up DevOps skills over a career without ever losing touch with reality. Between jokes, we also look at the impact of real-time A/B testing and the complexity of managing AI artifacts at scale.

Finally, we reflect on what lies ahead: what will be the next big step for SREs who want to stay relevant in a landscape dominated by generative AI? We talk about how poorly planned architectures can become latency bottlenecks and present strategies for ensuring high availability even when external APIs decide to go down.

Important links:
• Daniel Romeiro - https://www.linkedin.com/in/infoslack/
• João Brito - https://www.linkedin.com/in/juniorjbn
• Watch the film TEArapia - https://youtu.be/M4QFmW_HZh0?si=HIXBDWZJ8yPbpflM

Join our early-access program and get a more secure environment in moments! https://getup.io/zerocve

Kubicast is produced by Getup, a company specializing in Kubernetes and open source projects for Kubernetes. Podcast episodes are available on the main digital audio platforms and on YouTube.com/@getupcloud.

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 677: Jacob Visovatti and Conner Goodrum on Testing ML Models for Enterprise Products

Jul 15, 2025 (60:54)


Jacob Visovatti and Conner Goodrum of Deepgram speak with host Kanchan Shringi about testing ML models for enterprise use and why it's critical for product reliability and quality. They discuss the challenges of testing machine learning models in enterprise environments, especially in foundational AI contexts. The conversation particularly highlights the differences in testing needs between companies that build ML models from scratch and those that rely on existing infrastructure. Jacob and Conner describe how testing is more complex in ML systems due to unstructured inputs, varied data distribution, and real-time use cases, in contrast to traditional software testing frameworks such as the testing pyramid. To address the difficulty of ensuring LLM quality, they advocate for iterative feedback loops, robust observability, and production-like testing environments. Both guests underscore that testing and quality assurance are interdisciplinary efforts that involve data scientists, ML engineers, software engineers, and product managers. Finally, this episode touches on the importance of synthetic data generation, fuzz testing, automated retraining pipelines, and responsible model deployment—especially when handling sensitive or regulated enterprise data. Brought to you by IEEE Computer Society and IEEE Software magazine.

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 676: Samuel Colvin on the Pydantic Ecosystem

Jul 10, 2025 (62:06)


Samuel Colvin, the CEO and founder of Pydantic, speaks with host Gregory M. Kapfhammer about the ecosystem of Pydantic's Python frameworks, including Pydantic, Pydantic AI, and Pydantic Logfire. Along with discussing the design, implementation, and use of these frameworks, they dive into the refactoring of Pydantic and the follow-on performance improvements. They also explore ways in which Python programmers can use these three frameworks to build, test, evaluate, and monitor their own applications that interact with both local and cloud-based large language models. Brought to you by IEEE Computer Society and IEEE Software magazine.

Solving for Change
Observability Isn't About Tools—It's About People

Jul 4, 2025 (33:58), transcription available


With a growing number of interconnected applications distributed across different environments, enterprise software systems are becoming more complex every day. At the same time, today's organizations and their customers expect applications to be available 24/7. So how can IT teams ensure all of these applications and systems are running at their best? Observability offers a solution.

In this episode, Marc LeBlanc talks with Adriana Villela, Principal Developer Advocate at Dynatrace, about the importance of observability for today's enterprises and why people and culture, more than tools, are the keys to success.

Smart Software with SmartLogic
SDUI at Scale: GraphQL & Elixir at Cars.com with Zack Kayser

Jul 3, 2025 (49:18)


Zack Kayser, Staff Software Engineer at Cars.com, joins Elixir Wizards Sundi Myint and Charles Suggs to discuss how Cars.com adopted a server-driven UI (SDUI) architecture powered by Elixir and GraphQL to deliver consistent, updatable interfaces across web, iOS, and Android. We explore why SDUI matters for feature velocity, how a mature design system and schema planning make it feasible, and what it takes, culturally and technically, to move UI logic from client code into a unified backend.

Key topics discussed in this episode:
• SDUI fundamentals and how it differs from traditional server-side rendering
• GraphQL as the single source of truth for UI components and layouts
• Defining abstract UI components on the server to eliminate duplicate logic
• Leveraging a robust design system as the foundation for SDUI success
• API-first development and cross-team coordination for schema changes
• Mock data strategies for early UI feedback without breaking clients
• Handling breaking changes and hot-fix deployments via server-side updates
• Enabling flexible layouts and A/B testing through server-controlled ordering
• Balancing server-driven vs. client-managed UI
• Iterative SDUI rollout versus “big-bang” migrations in large codebases
• Using type specs and Dialyxir for clear cross-team communication
• Integration testing at the GraphQL layer to catch UI regressions early
• Quality engineering's role in validating server-driven interfaces
• Production rollback strategies across web and native platforms
• Considerations for greenfield projects adopting SDUI from day one
• Zack and Ethan's upcoming Instrumenting Elixir Apps book

Links mentioned:
• https://cars.com
• https://github.com/absinthe-graphql/absinthe
• Telemetry & Observability for Elixir Apps Ep: https://youtu.be/1V2xEPqqCso
• https://www.phoenixframework.org/blog/phoenix-liveview-1.0-released
• https://hexdocs.pm/phoenixliveview/assigns-eex.html
• https://graphql.org/
• https://tailwindcss.com/
• https://github.com/jeremyjh/dialyxir
• https://github.com/rrrene/credo
• GraphQL Schema: https://graphql.org/learn/schema/
• SwiftUI: https://developer.apple.com/documentation/swiftui/
• Kotlin: https://kotlinlang.org/
• https://medium.com/airbnb-engineering/a-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5
• Zack's Twitter: https://x.com/kayserzl/
• Zack's LinkedIn: https://www.linkedin.com/in/zack-kayser-93b96b88

Special Guest: Zack Kayser.

Getup Kubicast
#174 - ObservIAbilidade com Luccas Quadros

Jul 3, 2025 (49:49)


In episode 174 of Kubicast, we invited Luccas Quadros, a software developer on Grafana's AI and Machine Learning team, to dive into the world of observability. In a technical and good-humored conversation, we explore how logs and natural language processing (NLP) intersect to turn raw data into actionable insights, and we cover the evolution of anomaly detection algorithms for time series.

We move on to generative AI applied to monitoring: from building dynamic dashboards to intelligent configuration of alerts and SLOs. We also talk about the architecture of observability agents able to navigate huge volumes of metrics, traces, and logs, helping to speed up incident investigations.

To close, we discuss security aspects and knowledge exchange through MCP protocols that connect LLMs to our repositories, dashboards, and runbooks. We comment on use cases, data privacy challenges, and the outlook for automation in observability.

Important links:
• Luccas Quadros - No social media!!!
• AIOps at KCD RJ - https://youtu.be/WTWmOybEOK4?si=QujwWRx8QxpOY43g
• João Brito - https://www.linkedin.com/in/juniorjbn
• Watch the film TEArapia - https://youtu.be/M4QFmW_HZh0?si=HIXBDWZJ8yPbpflM

Join our early-access program and get a more secure environment in moments! https://getup.io/zerocve

Kubicast is produced by Getup, a company specializing in Kubernetes and open source projects for Kubernetes. Podcast episodes are available on the main digital audio platforms and on YouTube.com/@getupcloud.

Code RED
#28 - Infrastructure in Flux: Marcin Wyszynski on OpenTofu, Observability, and Standardizing IaC at Scale

Jul 3, 2025 (43:57)


Spacelift co-founder Marcin Wyszynski joins Dash0's Mirko Novakovic to talk about the past, present, and future of Infrastructure as Code. They unpack the birth of OpenTofu and why standardizing observability – especially with OpenTelemetry – is critical for managing increasingly complex infrastructure. Marcin also explains Spacelift's mission to build infrastructure tooling that plays well in today's agentic ecosystem and why he thinks DevOps often misapplies AI.

AWS for Software Companies Podcast
Ep114: From Chaos to Clarity - AI-Powered Security and Observability Investigation with Sumo Logic Mo Copilot on AWS

Jul 2, 2025 (26:14)


Kui Jia, Sumo Logic's Vice President of Engineering and Head of AI, shares how their AWS-powered AI agents transform chaotic security investigations into streamlined workflows.

Topics Include:
• Kui Jia leads AI Engineering at Sumo Logic
• SREs and SOC analysts work under chaotic, high-pressure conditions
• Teams constantly switch between different vendor tools and platforms
• Investigation requires quick hypothesis formation and complex query writing
• Sumo Logic processes petabytes of data daily across enterprises
• Company serves 2,000+ enterprise customers for 15 years
• Platform focuses on observability and cybersecurity use cases
• Investigation journey: discover, diagnose, decide, act, learn phases
• Data flows from ingestion through analytics to human insights
• Traditional workflow relies heavily on tribal domain knowledge
• Senior engineers create queries that juniors struggle to understand
• War room situations demand immediate answers, not learning curves
• Context switching between tools wastes time and creates friction
• Multiple AI generations deployed: ML anomaly detection to GenAI
• Agentic AI enables reasoning, planning, tools, and evaluation capabilities
• Mo Copilot launched at AWS re:Invent as AI agent suite
• Natural language converts high-level questions into Sumo queries
• System provides intelligent autocomplete and multi-turn conversations
• Insight agents summarize logs and security signals automatically
• Knowledge integration combines foundation models with proprietary metadata
• AI generates playbooks and remediation scripts for automated actions
• Three-tier architecture: Infrastructure, AI Tooling, and Application layers
• Built on AWS Bedrock with Nova models for performance
• Focus on reusable infrastructure and AI tooling components
• Data differentiation more important than AI model selection
• Golden datasets and contextualized metadata are development challenges
• Guardrails and evaluation frameworks critical for enterprise deployment
• AI observability enables debugging and performance monitoring
• Enterprise agents achievable within one year development timeline
• Future vision: multiple AI agents collaborating with human investigators

Participants:
• Kui Jia – Vice President of AI Engineering, Head of AI, Sumo Logic

Further Links:
• Website: https://www.sumologic.com/
• Sumo Logic in the AWS Marketplace

See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 675: Brian Demers on Observability into the Toolchain

Jul 1, 2025 (47:41)


Brian Demers, Developer Advocate at Gradle, speaks with host Giovanni Asproni about the importance of having observability in the toolchain. Such information about build times, compiler warnings, test executions, and any other system used to build the production code can help to reduce defects, increase productivity, and improve the developer experience. During the conversation they touch upon what is possible with today's tools; the impact on productivity and developer experience; and the impact, both in terms of risks and opportunities, introduced by the use of artificial intelligence. Brought to you by IEEE Computer Society and IEEE Software magazine.

MLOps.community
AI Reliability, Spark, Observability, SLAs and Starting an AI Infra Company

Jun 27, 2025 (97:22)


LLMs are reshaping the future of data and AI, and ignoring them might just be career malpractice. Yoni Michael and Kostas Pardalis unpack what's breaking, what's emerging, and why inference is becoming the new heartbeat of the data pipeline.

// Bio

Kostas Pardalis
Kostas is an engineer-turned-entrepreneur with a passion for building products and companies in the data space. He's currently the co-founder of Typedef. Before that, he worked closely with the creators of Trino at Starburst Data on some exciting projects. Earlier in his career, he was part of the leadership team at Rudderstack, helping the company grow from zero to a successful Series B in under two years. He also founded Blendo in 2014, one of the first cloud-based ELT solutions.

Yoni Michael
Yoni is the Co-Founder of typedef, a serverless data platform purpose-built to help teams process unstructured text and run LLM inference pipelines at scale. With a deep background in data infrastructure, Yoni has spent over a decade building systems at the intersection of data and AI, including leading infrastructure at Tecton and engineering teams at Salesforce.
Yoni is passionate about rethinking how teams extract insight from massive troves of text, transcripts, and documents, and believes the future of analytics depends on bridging traditional data pipelines with modern AI workflows. At Typedef, he's working to make that future accessible to every team, without the complexity of managing infrastructure.

// Related Links
• Website: https://www.typedef.ai
• https://techontherocks.show
• https://www.cpard.xyz

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
• Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
• MLOps Swag/Merch: [https://shop.mlops.community/]
• Connect with Demetrios on LinkedIn: /dpbrinkm
• Connect with Kostas on LinkedIn: /kostaspardalis/
• Connect with Yoni on LinkedIn: /yonimichael/

Timestamps:
[00:00] Breaking Tools, Evolving Data Workloads
[06:35] Building Truly Great Data Teams
[10:49] Making Data Platforms Actually Useful
[18:54] Scaling AI with Native Integration
[24:04] Empowering Employees to Build Agents
[28:17] Rise of the AI Sherpa
[36:09] Real AI Infrastructure Pain Points
[38:05] Fixing Gaps Between Data, AI
[46:04] Smarter Decisions Through Better Data
[50:18] LLMs as Human-Machine Interfaces
[53:40] Why Summarization Still Falls Short
[01:01:15] Smarter Chunking, Fixing Text Issues
[01:09:08] Evaluating AI with Canary Pipelines
[01:11:46] Finding Use Cases That Matter
[01:17:38] Cutting Costs, Keeping AI Quality
[01:25:15] Aligning MLOps to Business Outcomes
[01:29:44] Communities Thrive on Cross-Pollination
[01:34:56] Evaluation Tools Quietly Consolidating

Screaming in the Cloud
See Why GenAI Workloads Are Breaking Observability with Wayne Segar

Jun 26, 2025 (33:15)


What happens when you try to monitor something fundamentally unpredictable? In this featured guest episode, Wayne Segar from Dynatrace joins Corey Quinn to tackle the messy reality of observing AI workloads in enterprise environments. They explore why traditional monitoring breaks down with non-deterministic AI systems, how AI Centers of Excellence are helping overcome compliance roadblocks, and why “human in the loop” beats full automation in most real-world scenarios.

From Cursor's AI-driven customer service fail to why enterprises are consolidating from 15+ observability vendors, this conversation dives into the gap between AI hype and operational reality, and why the companies not shouting the loudest about AI might be the ones actually using it best.

Show Highlights
(00:00) - Cold Open
(00:48) - Introductions and what Dynatrace actually does
(03:28) - Who Dynatrace serves
(04:55) - Why AI isn't prominently featured on Dynatrace's homepage
(05:41) - How Dynatrace built AI into its platform 10 years ago
(07:32) - Observability for GenAI workloads and their complexity
(08:00) - Why AI workloads are "non-deterministic" and what that means for monitoring
(12:00) - When AI goes wrong
(13:35) - “Human in the loop”: Why the smartest companies keep people in control
(16:00) - How AI Centers of Excellence are solving the compliance bottleneck
(18:00) - Are enterprises too paranoid about their data?
(21:00) - Why startups can innovate faster than enterprises
(26:00) - The "multi-function printer problem" plaguing observability platforms
(29:00) - Why you rarely hear customers complain about Dynatrace
(31:28) - Free trials and playground environments

About Wayne Segar
Wayne Segar is Director of Global Field CTOs at Dynatrace and part of the Global Center of Excellence, where he focuses on cutting-edge cloud technologies and enabling the adoption of Dynatrace at large enterprise customers. Prior to joining Dynatrace, Wayne was a Dynatrace customer, responsible for performance and customer experience at a large financial institution.

Links
• Dynatrace website: https://dynatrace.com
• Dynatrace free trial: https://dynatrace.com/trial
• Dynatrace AI observability: https://dynatrace.com/platform/artificial-intelligence/
• Wayne Segar on LinkedIn: https://www.linkedin.com/in/wayne-segar/

Sponsor
Dynatrace: http://www.dynatrace.com

Complex Systems with Patrick McKenzie (patio11)
The AI infrastructure stack with Jennifer Li, a16z

Jun 26, 2025 (45:50)


In this episode, Patrick McKenzie (@patio11) is joined by Jennifer Li, a general partner at a16z investing in enterprise, infrastructure and AI. Jennifer breaks down how AI workloads are creating new demands on everything from inference pipelines to observability systems, explaining why we're seeing a bifurcation between language models and diffusion models at the infrastructure level. They explore emerging categories like reinforcement learning environments that help train agents, the evolution of web scraping for agentic workflows, and why Jennifer believes the API economy is about to experience another boom as agents become the primary consumers of software interfaces.

Full transcript: www.complexsystemspodcast.com/the-ai-infrastructure-stack-with-jennifer-li-a16z/

Sponsor: Vanta
Vanta automates security compliance and builds trust, helping companies streamline ISO, SOC 2, and AI framework certifications. Learn more at https://vanta.com/complex

Links:
• Jennifer Li's writing at a16z: https://a16z.com/author/jennifer-li/

Timestamps:
(00:00) Intro
(00:55) The AI shift and infrastructure
(02:24) Diving into middleware and AI models
(04:23) Challenges in AI infrastructure
(07:07) Real-world applications and optimizations
(15:15) Sponsor: Vanta
(16:38) Real-world applications and optimizations (cont'd)
(19:05) Reinforcement learning and synthetic environments
(23:05) The future of SaaS and AI integration
(26:02) Observability and self-healing systems
(32:49) Web scraping and automation
(37:29) API economy and agent interactions
(44:47) Wrap

The MAD Podcast with Matt Turck
Guillermo Rauch: Why Software Development Will Never Be the Same

Jun 26, 2025 (105:40)


In this episode, Vercel CEO Guillermo Rauch goes deep on how V0, their text-to-app platform, has already generated over 100 million applications and doubled Vercel's user base in under a year.

Guillermo reveals how a tiny SWAT team inside Vercel built V0 from scratch, why “vibe coding” is making software creation accessible to everyone (not just engineers), and how the AI Cloud is automating DevOps, making cloud infrastructure self-healing, and letting companies expose their data to AI agents in just five lines of code.

You'll hear why “every company will have to rethink itself as a token factory,” how Vercel's Next.js went from a conference joke to powering Walmart, Nike, and Midjourney, and why the next billion app creators might not write a single line of code. Guillermo breaks down the difference between vibe coding and agentic engineering, shares wild stories of users building apps from napkin sketches, and explains how Vercel is infusing “taste” and best practices directly into their AI models.

We also dig into the business side: how Vercel's AI-powered products are driving explosive growth, why retention and margins are strong, and how the company is adapting to a new wave of non-technical users. Plus: the future of MCP servers, the security challenges of agent-to-agent communication, and why prompting and AI literacy are now must-have skills.

Vercel
• Website - https://vercel.com
• X/Twitter - https://x.com/vercel

Guillermo Rauch
• LinkedIn - https://www.linkedin.com/in/rauchg
• X/Twitter - https://x.com/rauchg

FIRSTMARK
• Website - https://firstmark.com
• X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
• LinkedIn - https://www.linkedin.com/in/turck/
• X/Twitter - https://twitter.com/mattturck

(00:00) Intro
(02:08) What Is V0 and Why Did It Take Off So Fast?
(04:10) How Did a Tiny Team Build V0 So Quickly?
(07:51) V0 vs Other AI Coding Tools
(10:35) What is Vibe Coding?
(17:05) Is V0 Just Frontend? Moving Toward Full Stack and Integrations
(19:40) What Skills Make a Great Vibe Coder?
(23:35) Vibe Coding as the GUI for AI: The Future of Interfaces
(29:46) Developer Love = Agent Love
(33:41) Having Taste as Developer
(39:10) MCP Servers: The New Protocol for AI-to-AI Communication
(43:11) Security, Observability, and the Risks of Agentic Web
(45:25) Are Enterprises Ready for the Agentic Future?
(49:42) Closing the Feedback Loop: Customer Service and Product Evolution
(56:06) The Vercel AI Cloud: From Pixels to Tokens
(01:10:14) How Vercel Adapts to the ICP Change?
(01:13:47) Retention, Margins, and the Business of AI Products
(01:16:51) The Secret Behind Vercel Last Year Growth
(01:24:15) The Importance of Online Presence
(01:30:49) Everything, Everywhere, All at Once: Being CEO 101
(01:34:59) Guillermo's Advice to Younger Self

The Tech Blog Writer Podcast
3324: How Splunk Helps Businesses Cut Through Digital Noise

Jun 23, 2025 (21:14)


How do you keep complex digital experiences running smoothly when every layer, from networks to cloud infrastructure to applications, can break in ways that frustrate customers and burn out IT teams? This question is at the heart of my conversation recorded live at Cisco Live in San Diego with Patrick Lin, Senior Vice President and General Manager for Observability at Splunk, now part of Cisco. In this episode, Patrick explains how observability has evolved far beyond simple monitoring and is becoming the nerve centre for digital resilience in a world where reactive alerts no longer cut it. We unpack how Splunk and Cisco ThousandEyes are now deeply integrated, giving teams a single source of truth that connects application behaviour, infrastructure health, and network performance, even across systems they do not directly control. Patrick also shares what these two-way integrations mean in practice: faster incident resolution, fewer blame games, and far less time wasted chasing false alerts. We explore how AI is enhancing this vision by cutting through the noise to detect real anomalies, correlate related events, and suggest root causes at a speed no human team could match. If your business depends on staying online and your teams are drowning in disconnected data, this conversation offers a glimpse into the next phase of unified observability and assurance. It might even help quiet the flood of alerts that keep IT professionals awake at night. How is your organisation tackling alert fatigue and rising complexity? Listen in and tell me what strategies you have found that actually work.

TechCrunch Startups – Spoken Edition
Observability startup Coralogix becomes a unicorn, eyes India expansion

Jun 20, 2025 (5:37)


Coralogix, an Israeli startup offering a full-stack observability and security platform, has raised $115 million at a pre-money valuation of over $1 billion, almost doubling in three years from its last round in 2022.

A Bootiful Podcast
Micrometer.io lead Tommy Ludwig on the latest-and-greatest in observability for the Spring developer

Jun 19, 2025 (46:40)


Hi, Spring fans! In this episode, I talk to Micrometer.io lead Tommy Ludwig about the latest-and-greatest in observability for the Spring developer.

Smart Software with SmartLogic
LangChain: LLM Integration for Elixir Apps with Mark Ericksen

Jun 12, 2025 (38:18)


Mark Ericksen, creator of the Elixir LangChain framework, joins the Elixir Wizards to talk about LLM integration in Elixir apps. He explains how LangChain abstracts away the quirks of different AI providers (OpenAI, Anthropic's Claude, Google's Gemini) so you can work with any LLM through one more consistent API. We dig into core features like conversation chaining, tool execution, automatic retries, and production-grade fallback strategies.

Mark shares his experiences maintaining LangChain in a fast-moving AI world: how it shields developers from API drift, manages token budgets, and handles rate limits and outages. He also reveals testing tactics for non-deterministic AI outputs, configuration tips for custom authentication, and the highlights of the new v0.4 release, including “content parts” support for thinking-style models.

Key topics discussed in this episode:
• Abstracting LLM APIs behind a unified Elixir interface
• Building and managing conversation chains across multiple models
• Exposing application functionality to LLMs through tool integrations
• Automatic retries and fallback chains for production resilience
• Supporting a variety of LLM providers
• Tracking and optimizing token usage for cost control
• Configuring API keys, authentication, and provider-specific settings
• Handling rate limits and service outages with degradation
• Processing multimodal inputs (text, images) in LangChain workflows
• Extracting structured data from unstructured LLM responses
• Leveraging “content parts” in v0.4 for advanced thinking-model support
• Debugging LLM interactions using verbose logging and telemetry
• Kickstarting experiments in LiveBook notebooks and demos
• Comparing Elixir LangChain to the original Python implementation
• Crafting human-in-the-loop workflows for interactive AI features
• Integrating LangChain with the Ash framework for chat-driven interfaces
• Contributing to open-source LLM adapters and staying ahead of API changes
• Building fallback chains (e.g., OpenAI → Azure) for seamless continuity
• Embedding business logic decisions directly into AI-powered tools
• Summarization techniques for token efficiency in ongoing conversations
• Batch processing tactics to leverage lower-cost API rate tiers
• Real-world lessons on maintaining uptime amid LLM service disruptions

Links mentioned:
• https://rubyonrails.org/
• https://fly.io/
• https://zionnationalpark.com/
• https://podcast.thinkingelixir.com/
• https://github.com/brainlid/langchain
• https://openai.com/
• https://claude.ai/
• https://gemini.google.com/
• https://www.anthropic.com/
• Vertex AI Studio: https://cloud.google.com/generative-ai-studio
• https://www.perplexity.ai/
• https://azure.microsoft.com/
• https://hexdocs.pm/ecto/Ecto.html
• https://oban.pro/
• Chris McCord's ElixirConf EU 2025 Talk: https://www.youtube.com/watch?v=ojL_VHc4gLk
• Getting started: https://hexdocs.pm/langchain/gettingstarted.html
• https://ash-hq.org/
• https://hex.pm/packages/langchain
• https://hexdocs.pm/igniter/readme.html
• https://www.youtube.com/watch?v=WM9iQlQSFg
• @brainlid on Twitter and BlueSky

Special Guest: Mark Ericksen.

O11ycast
Ep. #83, Observability Isn't Just SRE on Steroids with Dan Ravenstone

O11ycast

Play Episode Listen Later Jun 11, 2025 36:15


In episode 83 of o11ycast, the Honeycomb team chats with Dan Ravenstone, the o11yneer. Dan unpacks the crucial, often underappreciated, role of the observability engineer. He discusses how this position champions the user, bridging the gap between technical performance and real-world customer experience. Learn about the challenges of mobile observability, the importance of clear terminology, and how building alliances across an organization drives successful observability practices.

Heavybit Podcast Network: Master Feed
Ep. #83, Observability Isn't Just SRE on Steroids with Dan Ravenstone

Heavybit Podcast Network: Master Feed

Play Episode Listen Later Jun 11, 2025 36:15


In episode 83 of o11ycast, the Honeycomb team chats with Dan Ravenstone, the o11yneer. Dan unpacks the crucial, often underappreciated, role of the observability engineer. He discusses how this position champions the user, bridging the gap between technical performance and real-world customer experience. Learn about the challenges of mobile observability, the importance of clear terminology, and how building alliances across an organization drives successful observability practices.

VMware Communities Roundtable
#719 - Beyond Monitoring: How Network Observability Transforms IT Operations

VMware Communities Roundtable

Play Episode Listen Later Jun 11, 2025


Bob and Eric discuss Network Observability with VMware tools.

VMware Communities Roundtable
#720 - A Developer's Journey into Observability and Automation with Garvit Kataria

VMware Communities Roundtable

Play Episode Listen Later Jun 11, 2025


PodRocket - A web development podcast from LogRocket
Server functions don't exist with Jack Herrington

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Jun 5, 2025 21:20


Jack Herrington, podcaster, software engineer, writer and YouTuber, joins the pod to uncover the truth behind server functions and why they don't actually exist in the web platform. We dive into the magic behind frameworks like Next.js, TanStack Start, and Remix, breaking down how server functions work, what they simplify, what they hide, and what developers need to know to build smarter, faster, and more secure web apps.
Links
YouTube: https://www.youtube.com/@jherr
Twitter: https://x.com/jherr
Github: https://github.com/jherr
ProNextJS: https://www.pronextjs.dev
Discord: https://discord.com/invite/KRVwpJUG6p
LinkedIn: https://www.linkedin.com/in/jherr
Website: https://jackherrington.com
Resources
Server Functions Don't Exist (It Matters) (https://www.youtube.com/watch?v=FPJvlhee04E)
We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Em, at emily.kochanek@logrocket.com, or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod).
Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!
What does LogRocket do? LogRocket provides AI-first session replay and analytics that surfaces the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today. (https://logrocket.com/signup/?pdr)
Special Guest: Jack Herrington.

CaSE: Conversations about Software Engineering
Mirko Novakovic on Waves of Innovation and Observability Product Management

CaSE: Conversations about Software Engineering

Play Episode Listen Later Jun 5, 2025 106:27 Transcription Available


In this episode of the CaSE Podcast, Mirko Novakovic, a seasoned entrepreneur and investor, shares his journey through the waves of technological innovation, from the early days of online banking to the rise of AI and OpenTelemetry. We explore with him how the lessons learned in diverse industries, including the food business, can reshape our approach to software development and architecture, emphasizing the importance of curiosity, adaptability, and a solid grasp of the fundamentals.

MLOps.community
Product Metrics are LLM Evals // Raza Habib CEO of Humanloop // #320

MLOps.community

Play Episode Listen Later Jun 3, 2025 53:06


Raza Habib, the CEO of LLM eval platform Humanloop, talks to us about how to make your AI products more accurate and reliable by shortening the feedback loop of your evals. We cover quickly iterating on prompts and testing what works, along with some of his favorite quotes from Dario of Anthropic.
// Bio
Raza is the CEO and Co-founder at Humanloop. He has a PhD in Machine Learning from UCL, was the founding engineer of Monolith AI, and has built speech systems at Google. For the last 4 years, he has led Humanloop and supported leading technology companies such as Duolingo, Vanta, and Gusto to build products with large language models. Raza was featured in the Forbes 30 Under 30 technology list in 2022, and Sifted recently named him one of the most influential Gen AI founders in Europe.
// Related Links
Website: https://humanloop.com
~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Raza on LinkedIn: /humanloop-raza
Timestamps:
[00:00] Cracking Open System Failures and How We Fix Them
[05:44] LLMs in the Wild — First Steps and Growing Pains
[08:28] Building the Backbone of Tracing and Observability
[13:02] Tuning the Dials for Peak Model Performance
[13:51] From Growing Pains to Glowing Gains in AI Systems
[17:26] Where Prompts Meet Psychology and Code
[22:40] Why Data Experts Deserve a Seat at the Table
[24:59] Humanloop and the Art of Configuration Taming
[28:23] What Actually Matters in Customer-Facing AI
[33:43] Starting Fresh with Private Models That Deliver
[34:58] How LLM Agents Are Changing the Way We Talk
[39:23] The Secret Lives of Prompts Inside Frameworks
[42:58] Streaming Showdowns — Creativity vs. Convenience
[46:26] Meet Our Auto-Tuning AI Prototype
[49:25] Building the Blueprint for Smarter AI
[51:24] Feedback Isn't Optional — It's Everything

Engineering Kiosk
#198 RBAC & Co: Who Is Allowed to Do What? Sounds Trivial, but It's Damn Important!

Engineering Kiosk

Play Episode Listen Later Jun 3, 2025 67:34


Who is actually allowed to do what? And should all of us really be allowed to do everything? Every tech project begins with a simple question: who is allowed to do what? But once the startup grows, customers demand compliance, or the first intern touches the production database, Role-Based Access Control (RBAC) suddenly becomes a question of survival, and anyone who underestimates the topic quickly ends up in permissions hell. In this episode we take apart the well-known concept of role-based access control. We clarify which concrete problem RBAC actually solves, why so much technical depth and organizational drama hides behind those harmless checkboxes, and why not all RBAC is the same. Along the way we give you practical insights: how do Grafana, Sentry, Elasticsearch, OpenSearch, or tracing tools like Jaeger implement this permission model? Where are the pitfalls in complex, multi-tenant systems? Whether you finally want to understand why RBAC, ABAC (Attribute-Based), ReBAC (Relationship-Based), and policy engines are more than just buzzwords, or want to know how to get policies, edge cases, and constraints under control, that is what this deep dive is about. Also included: open source highlights such as Casbin, SpiceDB, OpenFGA, and OPA, plus real project and startup tips for a pragmatic start and later scaling. Bonus: a fairy tale with Kevin and Max, in which the intern sometimes still wins against the admin.

Cup o' Go

Cup o' Go

Play Episode Listen Later May 29, 2025 31:04 Transcription Available


This episode was sponsored by Elastic! Elastic is the company behind Elasticsearch; they help teams find, analyze, and act on their data in real time through their Search, Observability, and Security solutions. Thanks Elastic! This episode was recorded at Elastic's offices in San Francisco during a meetup. Find info about the show, past episodes including transcripts, our swag store, Patreon link, and more at https://cupogo.dev/.

The Data Stack Show
246: AI, Abstractions, and the Future of Data Engineering with Pete Hunt of Dagster

The Data Stack Show

Play Episode Listen Later May 28, 2025 48:59


Highlights from this week's conversation include:
Pete's Background and Journey in Data (1:36)
Evolution of Data Practices (3:02)
Integration Challenges with Acquired Companies (5:13)
Trust and Safety as a Service (8:12)
Transition to Dagster (11:26)
Value Creation in Networking (14:42)
Observability in Data Pipelines (18:44)
The Era of Big Complexity (21:38)
Abstraction as a Tool for Complexity (24:41)
Composability and Workflow Engines (28:08)
The Need for Guardrails (33:13)
AI in Development Tools (36:24)
Internal Components Marketplace (40:14)
Reimagining Data Integration (43:03)
Importance of Abstraction in Data Tools (46:17)
Parting Advice for Listeners and Closing Thoughts (48:01)
The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

IBM Analytics Insights Podcasts
Part 2: Automation Remix: Observability, IBM Concert & the Next Wave of IT Ops

IBM Analytics Insights Podcasts

Play Episode Listen Later May 28, 2025 20:26


We're back for Part 2 of our Automation deep-dive—and the hits just keep coming! Host Al Martin reunites with IBM automation aces Sarah McAndrew (WW Automation Technical Sales) and Vikram Murali (App Mod & IT Automation Development) to push past the hype and map out the road ahead.

Making Data Simple
Part 2: Automation Remix: Observability, IBM Concert & the Next Wave of IT Ops

Making Data Simple

Play Episode Listen Later May 28, 2025 20:26


We're back for Part 2 of our Automation deep-dive—and the hits just keep coming! Host Al Martin reunites with IBM automation aces Sarah McAndrew (WW Automation Technical Sales) and Vikram Murali (App Mod & IT Automation Development) to push past the hype and map out the road ahead.

OpenObservability Talks
ClickHouse: Breaking the Speed Limit for Observability and Analytics - OpenObservability Talks S5E12

OpenObservability Talks

Play Episode Listen Later May 27, 2025 58:27


The ClickHouse® project is a rising star in observability and analytics, challenging performance conventions with its breakneck speed. This open source OLAP column store, originally developed at Yandex to power their web analytics platform at massive scale, has quickly evolved into one of the hottest open source observability data stores around. Its published performance benchmarks have been the topic of conversation, outperforming many legacy databases and setting a new bar for fast queries over large volumes of data.
Our guest for this episode is Robert Hodges, CEO of Altinity, the second largest contributor to the ClickHouse project. With over 30 years of experience in databases, Robert brings deep insights into how ClickHouse is challenging legacy databases at scale. We'll also explore Altinity's just-launched groundbreaking open source project, Project Antalya, which extends ClickHouse with Apache Iceberg shared storage, unlocking dramatic improvements in both performance and cost efficiency. Think 90% reductions in storage costs and 10 to 100x faster queries, all without requiring any changes to your existing applications.
The episode was live-streamed on 20 May 2025 and the video is available at https://www.youtube.com/watch?v=VeyTL2JlWp0
You can read the recap post: https://medium.com/p/2004160b2f5e/
OpenObservability Talks episodes are released monthly, on the last Thursday of each month, and are available for listening on your favorite podcast app and on YouTube.
We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and chime in with your comments and questions on the live chat.
https://www.youtube.com/@openobservabilitytalks
https://www.twitch.tv/openobservability
Show Notes:
00:00 - Intro
01:38 - ClickHouse elevator pitch
02:46 - guest intro
04:48 - ClickHouse under the hood
08:15 - SQL and the database evolution path
11:20 - the return of SQL
16:13 - design for speed
17:14 - use cases for ClickHouse
19:18 - ClickHouse ecosystem
22:22 - ClickHouse on Kubernetes
31:45 - know how ClickHouse works inside to get the most out of it
38:59 - ClickHouse for Observability
46:58 - Project Antalya
55:03 - Kubernetes 1.33 release
55:32 - OpenSearch 3.0 release
56:01 - New Permissive License for ML Models Announced by the Linux Foundation
57:08 - Outro
Resources:
ClickHouse on GitHub: https://github.com/ClickHouse/ClickHouse
Shopify's Journey to Planet-Scale Observability: https://medium.com/p/9c0b299a04dd
Project Antalya: https://altinity.com/blog/getting-started-with-altinitys-project-antalya
https://cmtops.dev/posts/building-observability-with-clickhouse/
Kubernetes 1.33 release highlights: https://www.linkedin.com/feed/update/urn:li:activity:7321054742174924800/
New Permissive License for Machine Learning Models Announced by the Linux Foundation: https://www.linkedin.com/feed/update/urn:li:share:7331046183244611584
OpenSearch 3.0 major release: https://www.linkedin.com/posts/horovits_opensearch-activity-7325834736008880128-kCqr
Socials:
Twitter: https://twitter.com/OpenObserv
YouTube: https://www.youtube.com/@openobservabilitytalks
Dotan Horovits
============
X (Twitter): @horovits
LinkedIn: www.linkedin.com/in/horovits
Mastodon: @horovits@fosstodon
BlueSky: @horovits.bsky.social
Robert Hodges
=============
LinkedIn: https://www.linkedin.com/in/berkeleybob2105/

PurePerformance
The Research Behind the AI and Observability Innovation with Otmar Ertl and Martin Flechl

PurePerformance

Play Episode Listen Later May 26, 2025 50:59


Scientific research is the foundation of many innovative solutions in any field. Did you know that Dynatrace runs its own Research Lab within the campus of the Johannes Kepler University (JKU) in Linz, Austria, just 2 kilometers away from our global engineering headquarters? What started in 2020 has grown to 20 full-time researchers and many more students who do research on topics such as GenAI, Agentic AI, Log Analytics, Processing of Large Data Sets, Sampling Strategies, Cloud Native Security, and Memory and Storage Optimizations.
Tune in and hear from Otmar and Martin how they are researching the N+2 generation of Observability and AI, how they are contributing to open source projects such as OpenTelemetry, and what their predictions are for when AI finally takes control of us humans!
To learn more about their work check out these links:
Martin's LinkedIn: https://www.linkedin.com/in/mflechl/
Otmar's LinkedIn: https://www.linkedin.com/in/otmar-ertl/
Dynatrace Research Lab: https://careers.dynatrace.com/locations/linz/#__researchLab

The Data Stack Show
244: Postgres to ClickHouse: Simplifying the Modern Data Stack with Aaron Katz & Sai Krishna Srirampur

The Data Stack Show

Play Episode Listen Later May 20, 2025 34:51


Highlights from this week's conversation include:
Background of ClickHouse (1:14)
PostgreSQL Data Replication Tool (3:19)
Emerging Technologies Observations (7:25)
Observability and Market Dynamics (11:26)
Product Development Challenges (12:39)
Challenges with PostgreSQL Performance (15:30)
Philosophy of Open Source (18:01)
Open Source Advantages (22:56)
Simplified Stack Vision (24:48)
End-to-End Use Cases (28:13)
Migration Strategies (30:21)
Final Thoughts and Takeaways (33:29)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Federal Tech Podcast: Listen and learn how successful companies get federal contracts
Ep. 239 Boosting Federal Cybersecurity with Agentless Observability

Federal Tech Podcast: Listen and learn how successful companies get federal contracts

Play Episode Listen Later May 20, 2025 24:38


Connect to John Gilroy on LinkedIn: https://www.linkedin.com/in/john-gilroy/
Want to listen to other episodes? www.Federaltechpodcast.com
AFCEA's TechNet Cyber conference, held in Baltimore, Maryland, was the perfect opportunity to sit down with Bryan Rosensteel, Head of Public Sector Marketing at Wiz. Wiz is the “new kid on the block,” and it has had tremendous growth. During the interview, Bryan Rosensteel shows how agentless approaches can improve visibility and assist with compliance. We all know how complexity has infiltrated federal technology. We have the usual suspects of Cloud Service Providers, hybrid clouds, private clouds, and, if that was not complicated enough, alt-clouds. As a result, it is almost impossible to get the “bird's eye” visibility needed to provide cyber security. Two main ways have been proposed to secure this much-desired system view.
Agent. One approach is to put a bit of code on each device, called an “agent” method. It is good for granular control, but it can slow down a scan and must be maintained.
Agentless. Bryan Rosensteel from Wiz describes something called an “agentless” method to gain visibility into complex systems. This method leverages existing infrastructure and protocols to accomplish the scanning objective much faster. Bryan Rosensteel states that in a world of constant attacks, this faster method allows for rapid updates to threats.
Beyond better observation, an agentless method, like the one provided by Wiz, allows for compliance automation and continuous monitoring, and sets the groundwork for effective Zero Trust implementation.

Startup Project
How Chronosphere Solved Observability in Containerized Environments to Build $1.6B Company | Uber spin-out, 5x Cheap & Impact of AI in Observability | CEO Martin Mao | Startup Project #101

Startup Project

Play Episode Listen Later May 18, 2025 50:47


Martin Mao is the co-founder and CEO of Chronosphere, an observability platform built for the modern containerized world. Prior to Chronosphere, Martin led the observability team at Uber, tackling the unique challenges of large-scale distributed systems. With a background as a technical lead at AWS, Martin brings unique experience in building scalable and reliable infrastructure. In this episode, he shares the story behind Chronosphere, its approach to cost-efficient observability, and the future of monitoring in the age of AI.
What you'll learn:
The specific observability challenges that arise when transitioning to containerized environments and microservices architectures, including increased data volume and new problem sources.
How Chronosphere addresses the issue of wasteful data storage by providing features that identify and optimize useful data, ensuring customers only pay for valuable insights.
Chronosphere's strategy for competing with observability solutions offered by major cloud providers like AWS, Azure, and Google Cloud, focusing on a specialized end-to-end product.
The innovative ways in which Chronosphere's products, including their observability platform and telemetry pipeline, improve the process of detecting and resolving problems.
How Chronosphere is leveraging AI and knowledge graphs to normalize unstructured data, enhance its analytics engine, and provide more effective insights to customers.
Why targeting early adopters and tech-forward companies is beneficial for product innovation, providing valuable feedback for further improvements and new features.
How observability requirements are changing with the rise of AI and LLM-based applications, and the unique data collection and evaluation criteria needed for GPUs.
Takeaways:
Chronosphere originated from the observability challenges faced at Uber, where existing solutions couldn't handle the scale and complexity of a containerized environment.
Cost efficiency is a major differentiator for Chronosphere, offering significantly better cost-benefit ratios compared to other solutions, making it attractive for companies operating at scale.
The company's telemetry pipeline product can be used with existing observability solutions like Splunk and Elastic to reduce costs without requiring a full platform migration.
Chronosphere's architecture is purposely single-tenanted to minimize coupled infrastructures, ensuring reliability and continuous monitoring even when core components go down.
AI-driven insights for observability may not benefit from LLMs that are trained on private business data, which can be diverse and may cause models to overfit to a specific case.
Many tech-forward companies are using the platform to monitor model training, which involves GPU clusters and new evaluation criteria unlike those for general CPU workloads.
The company found huge potential in scrubbing the diverse data and building knowledge graphs to be used as a source of useful information when problems are recognized.
Subscribe to Startup Project for more engaging conversations with leading entrepreneurs!
→ Email updates: https://startupproject.substack.com/
#StartupProject #Chronosphere #Observability #Containers #Microservices #Uber #AWS #Monitoring #CloudNative #CostOptimization #AI #ArtificialIntelligence #LLM #MLOps #Entrepreneurship #Podcast #YouTube #Tech #Innovation

O11ycast
Ep. #81, Observability 3.0 and Beyond with Hazel Weakly and Matt Klein

O11ycast

Play Episode Listen Later May 14, 2025 40:36


In episode 81 of o11ycast, Charity Majors and Martin Thwaites dive into a lively discussion with Hazel Weakly and Matt Klein on the evolving landscape of observability. The guests explore the concept of observability versioning, the challenges of cost and ROI, and the future of observability tools, including the potential convergence with AI and business intelligence.

TestGuild Performance Testing and Site Reliability Podcast
Observability at Scale with AI with Jacob Leverich

TestGuild Performance Testing and Site Reliability Podcast

Play Episode Listen Later May 14, 2025 36:47


In this episode of the DevOps Toolchain podcast, Joe Colantonio sits down with Jacob Leverich, cofounder and Chief Product Officer at Observe, to explore how AI and cutting-edge data strategies are transforming the world of observability. With a career spanning heavyweight roles from Splunk to Google and Kuro Labs, Jacob shares his journey from banging out Perl scripts as a Linux sysadmin to building scalable, data-driven solutions that address the complex realities of today's digital infrastructure. Tune in as Joe and Jacob explore why traditional monitoring approaches are struggling with massive data volumes, how knowledge graphs and data lakes are breaking down tool silos, and what engineering leaders often get wrong when scaling visibility across teams. Whether you're a tester, developer, SRE, or team lead, get ready to discover actionable insights on maximizing the value of your data, the true role of AI in troubleshooting, and practical tips for leading your organization into the future of DevOps observability. Don't miss it! Try out Insight Hub free for 14 days now: https://testguild.me/insighthub. No credit card required.

Heavybit Podcast Network: Master Feed
Ep. #81, Observability 3.0 and Beyond with Hazel Weakly and Matt Klein

Heavybit Podcast Network: Master Feed

Play Episode Listen Later May 14, 2025 40:36


In episode 81 of o11ycast, Charity Majors and Martin Thwaites dive into a lively discussion with Hazel Weakly and Matt Klein on the evolving landscape of observability. The guests explore the concept of observability versioning, the challenges of cost and ROI, and the future of observability tools, including the potential convergence with AI and business intelligence.

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 524: Agentic AI Done Right - How to avoid missing out or messing up.

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later May 13, 2025 18:33


Agentic AI is as daunting as it is dynamic. So…… how do you not screw it up? After all, the more robust and complex agentic AI becomes, the more room there is for error. Luckily, we've got Dr. Maryam Ashoori to guide our agentic ways. Maryam is the Senior Director of Product Management of watsonx at IBM. She joined us at IBM Think 2025 to break down agentic AI done right.
Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Have a question? Join the convo here.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn
Topics Covered in This Episode:
Agentic AI Benefits for Enterprises
Watson X's New Features & Announcements
AI-Powered Enterprise Solutions at IBM
Responsible Implementation of Agentic AI
LLMs in Enterprise Cost Optimization
Deployment and Scalability Enhancements
AI's Impact on Developer Productivity
Problem-Solving with Agentic AI
Timestamps:
00:00 AI Agents: A Business Imperative
06:14 "Optimizing Enterprise Agent Strategy"
09:15 Enterprise Leaders' AI Mindset Shift
09:58 Focus on Problem-Solving with Technology
13:34 "Boost Business with LLMs"
16:48 "Understanding and Managing AI Risks"
Keywords:
Agentic AI, AI agents, Agent lifecycle, LLMs taking actions, WatsonX.ai, Product management, IBM Think conference, Business leaders, Enterprise productivity, WatsonX platform, Custom AI solutions, Environmental Intelligence Suite, Granite Code models, AI-powered code assistant, Customer challenges, Responsible AI implementation, Transparency and traceability, Observability, Optimization, Larger compute, Cost performance optimization, Chain of thought reasoning, Inference time scaling, Deployment service, Scalability of enterprise, Access control, Security requirements, Non-technical users, AI-assisted coding, Developer time-saving, Function calling, Tool calling, Enterprise data integration, Solving enterprise problems, Responsible implementation, Human in the loop, Automation, IBM savings, Risk assessment, Empowering workforce.
Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Packet Pushers - Full Podcast Feed
TNO028: Move From Monitoring to Full Internet Stack Observability: New Strategies for NetOps (Sponsored)

Packet Pushers - Full Podcast Feed

Play Episode Listen Later May 9, 2025 52:35


Network monitoring, Internet monitoring, and observability are all key components of NetOps. We speak with sponsor Catchpoint to understand how Catchpoint can help network operators proactively identify and resolve issues before they impact customers. We discuss past and current network monitoring strategies and the challenges that operators face with both on-prem and cloud monitoring, along...

Packet Pushers - Fat Pipe
TNO028: Move From Monitoring to Full Internet Stack Observability: New Strategies for NetOps (Sponsored)

Packet Pushers - Fat Pipe

Play Episode Listen Later May 9, 2025 52:35


Network monitoring, Internet monitoring, and observability are all key components of NetOps. We speak with sponsor Catchpoint to understand how Catchpoint can help network operators proactively identify and resolve issues before they impact customers. We discuss past and current network monitoring strategies and the challenges that operators face with both on-prem and cloud monitoring, along...

AWS re:Think Podcast
Episode 40: AI Observability and Evaluation with Arize AI

AWS re:Think Podcast

Play Episode Listen Later May 7, 2025 39:04


AI can still sometimes hallucinate and give less than optimal answers. To address this, we are joined by Arize AI's Co-Founder Aparna Dhinakaran for a discussion on Observability and Evaluation for AI. We begin by discussing the challenges of AI Observability and Evaluation. For example, how does “LLM as a Judge” work? We conclude with some valuable advice from Aparna for first-time entrepreneurs.
Begin Observing and Evaluating your AI Applications with Open Source Phoenix: https://phoenix.arize.com/
AWS Hosts: Nolan Chen & Malini Chatterjee
Email Your Feedback: rethinkpodcast@amazon.com

The Cloud Gambit
Seeing Through the Clouds: Observability with Justin Ryburn

The Cloud Gambit

Play Episode Listen Later May 6, 2025 48:30 Transcription Available


Justin Ryburn is the Field CTO at Kentik and works as a Limited Partner (LP) for Stage 2 Capital. Justin has 25 years of experience in network operations, engineering, sales, and marketing with service providers and vendors. In this conversation, we discuss startup funding, the challenges that organizations face with hybrid and multi-cloud visibility, the impact of AI on network monitoring, and explore how companies can build more reliable systems through proper observability practices.
Where to Find Justin
LinkedIn: https://www.linkedin.com/in/justinryburn/
Twitter: https://x.com/JustinRyburn
Blog: http://ryburn.org/
Talks: https://www.youtube.com/playlist?list=PLRrjaaisdWrYaue9KVLRdq5mlGE_2i0RT
Show Links
Kentik: https://www.kentik.com/
Day One: Deploying BGP FlowSpec: https://www.juniper.net/documentation/en_US/day-one-books/DO_BGP_FLowspec.pdf
Stage 2 Capital: https://www.stage2.capital/
Doug Madory's Internet Analysis: https://www.kentik.com/blog/author/doug-madory/
Netflix Tech Blog: https://netflixtechblog.com/
Multi-Region AWS: https://www.pluralsight.com/resources/blog/cloud/why-and-how-do-we-build-a-multi-region-active-active-architecture
AutoCon: https://events.networktocode.com/autocon/
Follow, Like, and Subscribe!
Podcast: https://www.thecloudgambit.com/
YouTube: https://www.youtube.com/@TheCloudGambit
LinkedIn: https://www.linkedin.com/company/thecloudgambit
Twitter: https://twitter.com/TheCloudGambit
TikTok: https://www.tiktok.com/@thecloudgambit

Catalog & Cocktails
TAKEAWAYS - What is Data + AI Observability and Why It's Part of Your Competitive Moat with Barr Moses

Catalog & Cocktails

Play Episode Listen Later May 1, 2025 4:10


Barr Moses, CEO & Co-Founder of Monte Carlo, challenges the notion that models alone create competitive advantage, arguing instead that the real moat lies in how organizations manage their proprietary data and ensure end-to-end reliability. Tim and Juan chat with Barr to get the Honest, No-BS scoop of what AI observability is (hint, it's really data + AI) and how organizations can build resilient AI applications.

Catalog & Cocktails
What is Data + AI Observability and Why It's Part of Your Competitive Moat with Barr Moses

Catalog & Cocktails

Play Episode Listen Later May 1, 2025 53:09


Barr Moses, CEO & Co-Founder of Monte Carlo, challenges the notion that models alone create competitive advantage, arguing instead that the real moat lies in how organizations manage their proprietary data and ensure end-to-end reliability. Tim and Juan chat with Barr to get the Honest, No-BS scoop of what AI observability is (hint, it's really data + AI) and how organizations can build resilient AI applications.

AWS for Software Companies Podcast
Ep097: Specialized Agents & Agentic Orchestration - New Relic and the Future of Observability

AWS for Software Companies Podcast

Play Episode Listen Later Apr 28, 2025 29:04


New Relic's Head of AI and ML Innovation, Camden Swita, discusses their four-cornered AI strategy and envisions a future of "agentic orchestration" with specialized agents.
Topics Include:
Introduction of Camden Swita, Head of AI at New Relic.
New Relic invented the observability space for monitoring applications.
Started with Java workloads monitoring and APM.
Evolved into full-stack observability with infrastructure and browser monitoring.
Uses advanced query language (NRQL) with time series database.
AI strategy focuses on AI ops for automation.
First cornerstone: Intelligent detection capabilities with machine learning.
Second cornerstone: Incident response with generative AI assistance.
Third cornerstone: Problem management with root cause analysis.
Fourth cornerstone: Knowledge management to improve future detection.
Initially overwhelmed by "ocean of possibilities" with LLMs.
Needed narrow scope and guardrails for measurable progress.
Natural language to NRQL translation proved immensely complex.
Selecting from thousands of possible events caused accuracy issues.
Shifted from "one tool" approach to many specialized tools.
Created routing layer to select right tool for each job.
Evaluation of NRQL is challenging even when syntactically correct.
Implemented multi-stage validation with user confirmation step.
AWS partnership involves fine-tuning models for NRQL translation.
Using Bedrock to select appropriate models for different tasks.
Initially advised prototyping on biggest, best available models.
Now recommends considering specialized, targeted models from start.
Agent development platforms have improved significantly since beginning.
Future focus: "Agentic orchestration" with specialized agents.
Envisions agents communicating through APIs without human prompts.
Integration with AWS tools like Amazon Q.
Industry possibly plateauing in large language model improvements.
Increasing focus on inference-time compute in newer models.
Context and quality prompts remain crucial despite model advances.
Potential pros and cons to inference-time compute approach.
Participants:
Camden Swita – Head of AI & ML Innovation, Product Management, New Relic
See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon/isv/

Packet Pushers - Full Podcast Feed
Tech Bytes: Network Observability AIOps Tips For Success (Sponsored)

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Apr 21, 2025 23:39


Today on the Tech Bytes podcast we're talking AI readiness with sponsor Broadcom. More specifically, getting your network observability ready to support AI operations. This isn't just a hardware or software issue. It's also a data issue. We'll get some tips with our guest Jeremy Rossbach. Jeremy is Chief Technical Evangelist and Lead Product Marketing...

Software Engineering Daily
Prometheus and Open-Source Observability with Eric Schabell

Software Engineering Daily

Play Episode Listen Later Apr 15, 2025 46:06


Modern cloud-native systems are highly dynamic and distributed, which makes it difficult to monitor cloud infrastructure using traditional tools designed for static environments. This has motivated the development and widespread adoption of dedicated observability platforms. Prometheus is an open-source observability tool designed for cloud-native environments. Its strong integration with Kubernetes and pull-based data collection model...
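
For readers unfamiliar with the pull-based model mentioned above, here is a minimal sketch (not taken from the episode) of what it can look like from the application side: the application exposes its current metric values over HTTP, and a Prometheus server scrapes that endpoint on its own schedule. The metric name `demo_requests_total`, the port, and the `REQUEST_COUNT` holder are hypothetical and chosen only for illustration. A real service would more likely use an official Prometheus client library than hand-render the text format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter; the application would increment it.
REQUEST_COUNT = {"value": 0}


def render_metrics(count: int) -> str:
    """Render a single counter in the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Total requests handled (illustrative).\n"
        "# TYPE demo_requests_total counter\n"
        f"demo_requests_total {count}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current metric values whenever a scraper pulls /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNT["value"]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrape requests on the given port (assumed)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


if __name__ == "__main__":
    serve()
```

The application never pushes data anywhere. It only answers when asked, which is why the collector, not the application, controls the scrape interval and the list of targets.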