Podcasts about ETL

  • 401 PODCASTS
  • 942 EPISODES
  • 45m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Nov 11, 2025 LATEST

POPULARITY

(Popularity chart, 2017–2024)


Best podcasts about ETL

Show all podcasts related to ETL

Latest podcast episodes about ETL

MLOps.community
The GPU Uptime Battle

MLOps.community

Nov 11, 2025 · 93:45


Andy Pernsteiner is the Field CTO at VAST Data, working on large-scale AI infrastructure, serverless compute near data, and the rollout of VAST's AI Operating System.

The GPU Uptime Battle // MLOps Podcast #346 with Andy Pernsteiner, Field CTO of VAST Data. Huge thanks to VAST Data for supporting this episode!

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Most AI projects don't fail because of bad models; they fail because of bad data plumbing. Andy Pernsteiner joins the podcast to talk about what it actually takes to build production-grade AI systems that aren't held together by brittle ETL scripts and data copies. He unpacks why unifying data - rather than moving it - is key to real-time, secure inference, and how event-driven, Kubernetes-native pipelines are reshaping the way developers build AI applications. It's a conversation about cutting out the complexity, keeping data live, and building systems smart enough to keep up with your models.

// Bio
Andy is the Field Chief Technology Officer at VAST, helping customers build, deploy, and scale some of the world's largest and most demanding computing environments. Andy has spent the past 15 years focused on supporting and building large-scale, high-performance data platform solutions. From humble beginnings as an escalations engineer at pre-IPO Isilon, to leading a team of technical Ninjas at MapR, he's consistently been on the front lines solving some of the toughest challenges that customers face when implementing Big Data Analytics and next-generation AI solutions.

// Related Links
Website: www.vastdata.com
https://www.youtube.com/watch?v=HYIEgFyHaxk
https://www.youtube.com/watch?v=RyDHIMniLro
The Mom Test by Rob Fitzpatrick: https://www.momtestbook.com/

~~~~~~~~ ✌️ Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Andy on LinkedIn: /andypernsteiner

Timestamps:
[00:00] Prototype to production gap
[00:21] AI expectations vs reality
[03:00] Prototype vs production costs
[07:47] Technical debt awareness
[10:13] The Mom Test
[15:40] Chaos engineering
[22:25] Data messiness reflection
[26:50] Small data value
[30:53] Platform engineer mindset shift
[34:26] Gradient description comparison
[38:12] Empathy in MLOps
[45:48] Empathy in Engineering
[51:04] GPU clusters rolling updates
[1:03:14] Checkpointing strategy comparison
[1:09:44] Predictive vs Generative AI
[1:17:51] On Growth, Community, and New Directions
[1:24:21] UX of agents
[1:32:05] Wrap up
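The abstract's contrast between brittle batch ETL scripts and event-driven pipelines can be sketched in miniature: instead of a scheduled job that copies and reprocesses everything, a handler fires once per record as data lands. This is a minimal illustration, not VAST's actual system; the event shape and handler name are invented.

```python
import queue

# Hypothetical event-driven pipeline sketch: handlers fire as data lands,
# instead of a nightly batch job copying the whole dataset.
events = queue.Queue()

def on_object_written(event):
    """React to a single new record instead of reprocessing everything."""
    record = event["payload"]
    # Transform near the data; no bulk copy step.
    return {"id": record["id"], "text": record["text"].strip().lower()}

# Simulate two objects landing in storage.
events.put({"payload": {"id": 1, "text": "  Hello "}})
events.put({"payload": {"id": 2, "text": "World  "}})

processed = []
while not events.empty():
    processed.append(on_object_written(events.get()))

print(processed)  # each record handled as it arrived
```

In a real Kubernetes-native setup the queue would be an event bus and the handler a containerized consumer, but the shape of the flow is the same.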

Crazy Wisdom
Episode #505: From Big Data to Big Meaning: Jessica Talisman on the Hidden Architecture of Knowledge

Crazy Wisdom

Nov 10, 2025 · 72:04


In this episode of Crazy Wisdom, host Stewart Alsop talks with Jessica Talisman, founder of Contextually and creator of the Ontology Pipeline, about the deep connections between knowledge management, library science, and the emerging world of AI systems. Together they explore how controlled vocabularies, ontologies, and metadata shape meaning for both humans and machines, why librarianship has lessons for modern tech, and how cultural context influences what we call “knowledge.” Jessica also discusses the rise of AI librarians, the problem of “AI slop,” and the need for collaborative, human-centered knowledge ecosystems. You can learn more about her work at Ontology Pipeline and find her writing and talks on LinkedIn. Check out this GPT we trained on the conversation.

Timestamps
00:00 Stewart Alsop welcomes Jessica Talisman to discuss Contextually, ontologies, and how controlled vocabularies ground scalable systems.
05:00 They compare philosophy's ontology with information science, linking meaning, categorization, and sense-making for humans and machines.
10:00 Jessica explains why SQL and Postgres can't capture knowledge complexity and how neuro-symbolic systems add context and interoperability.
15:00 The talk turns to library science's split from big data in the 1990s, metadata schemas, and the FAIR principles of findability and reuse.
20:00 They discuss neutrality, bias in corporate vocabularies, and why “touching grass” matters for reconciling internal and external meanings.
25:00 Conversation shifts to interpretability, cultural context, and how Western categorical thinking differs from China's contextual knowledge.
30:00 Jessica introduces process knowledge, documentation habits, and the danger of outsourcing how-to understanding.
35:00 They explore knowledge as habit, the tension between break-things culture and library design thinking, and early AI experiments.
40:00 Libraries' strategic use of AI, metadata precision, and the emerging role of AI librarians take focus.
45:00 Stewart connects data labeling, Surge AI, and the economics of good data with Jessica's call for better knowledge architectures.
50:00 They unpack content lifecycle, provenance, and user context as the backbone of knowledge ecosystems.
55:00 The talk closes on automation limits, human-in-the-loop design, and Jessica's vision for collaborative consulting through Contextually.

Key Insights

Ontology is about meaning, not just data structure. Jessica Talisman reframes ontology from a philosophical abstraction into a practical tool for knowledge management, defining how things relate and what they mean within systems. She explains that without clear categories and shared definitions, organizations can't scale or communicate effectively, either with people or with machines.

Controlled vocabularies are the foundation of AI literacy. Jessica emphasizes that building a controlled vocabulary is the simplest and most powerful way to disambiguate meaning for AI. Machines, like people, need context to interpret language, and consistent terminology prevents the “hallucinations” that occur when systems lack semantic grounding.

Library science predicted today's knowledge crisis. Stewart and Jessica trace how, in the 1990s, tech went down the path of “big data” while librarians quietly built systems of metadata, ontologies, and standards like schema.org. Today's AI challenges of interoperability, reliability, and information overload mirror problems library science has been solving for decades.

Knowledge is culturally shaped. Drawing from Patrick Lambe's work, Jessica notes that Western knowledge systems are category-driven, while Chinese systems emphasize context. This cultural distinction explains why global AI models often miss nuance or moral voice when trained on limited datasets.

Process knowledge is disappearing. The West has outsourced its “how-to” knowledge, what Jessica calls process knowledge, to other countries. Without documentation habits, we risk losing the embodied know-how that underpins manufacturing, engineering, and even creative work.

Automation cannot replace critical thinking. Jessica warns against treating AI as “room service.” Automation can support, but not substitute, human judgment. Her own experience with a contract error generated by an AI tool underscores the importance of review, reflection, and accountability in human–machine collaboration.

Collaborative consulting builds knowledge resilience. Through her consultancy, Contextually, Jessica advocates for “teaching through doing”: helping teams build their own ontologies and vocabularies rather than outsourcing them. Sustainable knowledge systems, she argues, depend on shared understanding, not just good technology.
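Jessica's point that controlled vocabularies disambiguate meaning for machines can be shown in a few lines: variant terms resolve to one preferred label so two systems (or a human and a model) agree on the concept. This is a minimal sketch; the vocabulary entries are invented for illustration.

```python
# Minimal controlled-vocabulary sketch: map variant terms to one preferred
# label so humans and machines resolve the same concept. Terms are invented.
vocabulary = {
    "auto": "automobile",
    "car": "automobile",
    "motorcar": "automobile",
    "bike": "bicycle",
    "cycle": "bicycle",
}

def normalize(term: str) -> str:
    """Return the preferred label; pass through terms already canonical."""
    t = term.strip().lower()
    return vocabulary.get(t, t)

print(normalize("Car"))      # resolves to the preferred label "automobile"
print(normalize("bicycle"))  # already canonical, passes through
```

A full ontology would add relationships between labels (broader, narrower, related terms), but even this flat mapping is the semantic grounding step that prevents two systems from talking past each other.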

MarTech Podcast // Marketing + Technology = Business Growth
What will the marketing analytics tech-stack look like in 5 years?

MarTech Podcast // Marketing + Technology = Business Growth

Nov 7, 2025 · 4:13


Marketing analytics stacks struggle with outdated, siloed data that delays critical business decisions. Noha Rizk, CMO of Incorta, explains how live data integration transforms enterprise analytics capabilities. She demonstrates how questioning the "why" behind data patterns unlocks actionable insights and discusses eliminating complex ETL processes through real-time analysis across all business systems. The conversation covers practical frameworks for moving from raw data collection to immediate business intelligence that drives customer behavior understanding. See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Revenue Generator Podcast: Sales + Marketing + Product + Customer Success = Revenue Growth
What will the marketing analytics tech-stack look like in 5 years?

Revenue Generator Podcast: Sales + Marketing + Product + Customer Success = Revenue Growth

Nov 7, 2025 · 4:13


Marketing analytics stacks struggle with outdated, siloed data that delays critical business decisions. Noha Rizk, CMO of Incorta, explains how live data integration transforms enterprise analytics capabilities. She demonstrates how questioning the "why" behind data patterns unlocks actionable insights and discusses eliminating complex ETL processes through real-time analysis across all business systems. The conversation covers practical frameworks for moving from raw data collection to immediate business intelligence that drives customer behavior understanding.

MarTech Podcast // Marketing + Technology = Business Growth
Quickest way to improve analytics using live data in a campaign

MarTech Podcast // Marketing + Technology = Business Growth

Nov 5, 2025 · 3:15


Incorta is the first and only open data delivery platform that enables real-time analysis of live, detailed data across all systems of record—without the need for complex ETL processes.

Revenue Generator Podcast: Sales + Marketing + Product + Customer Success = Revenue Growth
Quickest way to improve analytics using live data in a campaign

Revenue Generator Podcast: Sales + Marketing + Product + Customer Success = Revenue Growth

Nov 5, 2025 · 3:15


Incorta is the first and only open data delivery platform that enables real-time analysis of live, detailed data across all systems of record—without the need for complex ETL processes.

MarTech Podcast // Marketing + Technology = Business Growth
The most important learning about data at Meta

MarTech Podcast // Marketing + Technology = Business Growth

Nov 4, 2025 · 4:48


Incorta is the first and only open data delivery platform that enables real-time analysis of live, detailed data across all systems of record—without the need for complex ETL processes.

Revenue Generator Podcast: Sales + Marketing + Product + Customer Success = Revenue Growth

Most companies rely on stale dashboards while AI demands live data for real-time decisions. Noha Rizk, CMO of Incorta, explains how enterprises can transition from legacy data systems to real-time analytics infrastructure. She covers identifying high-ROI use cases like retail waste optimization and supply chain management, implementing live data without complex ETL processes, and enabling business users to query data instantly for creative problem-solving.

The Ravit Show
Agents are now, what is going to be next?

The Ravit Show

Nov 3, 2025 · 16:13


Last week at BDL, we caught up with my friend Pavel Doležal, co-founder at Keboola, and let me tell you, this conversation was a ride.

We started with Keboola Agents, which are already live and helping data teams debug pipelines, document, and automate safely inside a governed platform.

Then Pavel dropped the big news: an open-source, conversational ETL pipeline generator, Osiris. You literally describe the problem, and Osiris drafts a transparent, deterministic YAML pipeline. You review, approve, commit, and it just runs. No black boxes, no daily AI gambling.

That would be a big shift: AI proposes. Humans approve. Execution stays auditable. Too bold? Have a look for yourself!

Links & Resources
- Learn more about Keboola Agents: http://bit.ly/4pL6a7h
- Explore Osiris on GitHub: https://github.com/keboola/osiris
- Connect with Pavel on LinkedIn: https://www.linkedin.com/in/paveld/

#data #ai #agents #keboola #theravitshow #dataengineering
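The "AI proposes, humans approve, execution stays auditable" flow Pavel describes can be sketched in miniature: the pipeline definition is plain data a person can read and sign off on, the runner refuses to execute without approval, and every step leaves an audit trace. Osiris's real pipelines are YAML and its schema is its own; the step vocabulary and runner below are hypothetical, not Keboola's API.

```python
# Hypothetical sketch of a reviewable, deterministic pipeline: the definition
# is declarative data a human can approve before anything runs.
pipeline = {
    "name": "orders_daily",
    "approved_by": "data-team",          # execution is refused without sign-off
    "steps": [
        {"op": "extract", "source": "orders"},
        {"op": "filter", "field": "status", "equals": "paid"},
        {"op": "load", "target": "warehouse.orders_paid"},
    ],
}

def run(pipeline, tables):
    """Execute approved steps in order, recording an audit trail."""
    if not pipeline.get("approved_by"):
        raise PermissionError("pipeline not approved")
    rows, audit = [], []
    for step in pipeline["steps"]:
        if step["op"] == "extract":
            rows = list(tables[step["source"]])
        elif step["op"] == "filter":
            rows = [r for r in rows if r[step["field"]] == step["equals"]]
        elif step["op"] == "load":
            tables[step["target"]] = rows
        audit.append(step["op"])          # every step leaves a trace
    return audit

tables = {"orders": [{"id": 1, "status": "paid"}, {"id": 2, "status": "void"}]}
print(run(pipeline, tables))             # ['extract', 'filter', 'load']
print(tables["warehouse.orders_paid"])   # [{'id': 1, 'status': 'paid'}]
```

The point of the design is that the same definition always produces the same result: no generation happens at run time, so there is nothing to gamble on.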

The Staff Assistant Podcast
Episode 53: ETL - Enter The Lion

The Staff Assistant Podcast

Oct 20, 2025 · 118:12


In this episode, I interview Josh Cook - the founder of ETL - Enter the Lion. Josh joined the Baltimore Police Department as a police officer, eventually transferring to the Anne Arundel County Police Department. Feeling called to minister to law enforcement directly, he separated from his department, relocated to Tennessee and began Enter the Lion (ETL). ETL is a Christian ministry that provides a completely free retreat to first responders and law enforcement who are desiring rest and time in nature. ETL provides biblical counseling, mentorship, and discipleship to those seeking a connection with other believers. To contact Josh or inquire about attending a retreat, contact him through his website: www.enterthelion.co
You can access The Tactical Debrief on Apple, Spotify, or Audible Podcasts.

The Agency Profit Podcast
Parakeeto vs. Project Management Tools: What's the Real Solution?, With Kristen Kelly

The Agency Profit Podcast

Oct 8, 2025 · 42:39


Points of Interest
00:00 – 01:30 – Introduction: Marcel welcomes Parakeeto's Kristen Kelly back to discuss a recurring misconception in agency operations—the belief that a better project management or PSA tool can solve profit management challenges.
01:30 – 03:25 – The PM Tool “Silver Bullet” Myth: Kristen explains how leaders and PMs often adopt new tools to tame chaos, believing marketing promises that they'll also solve utilization, capacity, and profitability issues.
03:25 – 06:00 – Why Agencies Fall for It: Marcel and Kristen note that while PM tools are valuable, they're often oversold as full profit-management systems. Agencies end up frustrated by missing fields, tool quirks, and data limitations.
06:00 – 08:45 – Hitting the Wall: Many teams find themselves with tools that improve delivery workflows but still leave them unable to make key financial or operational decisions because the data remains fragmented across systems.
08:45 – 11:43 – Introducing the Framework → Data → Process Model: Marcel outlines Parakeeto's three-part sequence for solving profit management: define the framework (metrics and formulas), structure the data, and establish ongoing processes for hygiene and cadence.
11:43 – 12:46 – Why Sequencing Matters: Without first defining what needs to be measured, agencies make poor configuration choices in PM tools—creating rework, confusion, and endless tool migrations.
12:46 – 15:19 – Defining the Framework: Agencies must precisely define how metrics like utilization, delivery margin, and project profitability are calculated, and understand the relationships between those measures before configuring tools.
15:19 – 19:54 – The Role of Process and Data Hygiene: Marcel explains that real-time reporting fails if data quality is poor. Clean, reliable reporting requires an ETL (Extract, Transform, Load) process, not direct reporting from source data.
19:54 – 22:55 – The Precision Trap: Kristen and Marcel explore the conflict between PMs needing granular precision and executives needing simple, high-level rollups. Forcing perfect data consistency across teams destroys usability and compliance.
22:55 – 26:28 – Practical Limits of In-Tool Reporting: Marcel describes how building detailed profitability reporting directly in PM tools creates unsustainable complexity, unrealistic data maintenance, and unreliable results.
26:28 – 34:38 – Building a Sustainable Data Architecture: They outline how Parakeeto's ETL pipeline works—extracting time data (person, project, hours), joining it with payroll and project grids, normalizing fields, and applying ongoing QA to ensure accuracy.
34:38 – 42:37 – The Big Takeaway: Kristen and Marcel conclude that PM tools are essential for delivery but not the whole profit solution. Agencies should use them for managing work while relying on a clear framework and data pipeline for accurate reporting.

Show Notes
Connect with Kristen via LinkedIn
Free Agency Toolkit
Parakeeto Foundations Course
Free access to our Model Platform

Love this Episode? Leave us a review here.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
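The pipeline Marcel describes (extract time entries, join them with payroll, normalize, and apply QA) reduces to a short sketch. This is an illustration of the described flow, not Parakeeto's actual code; the field names, people, and rates are invented.

```python
# Sketch of the described ETL flow: extract time data (person, project,
# hours), join payroll rates on person, and roll up delivery cost per
# project. All names and figures are illustrative.
time_entries = [
    {"person": "ana", "project": "acme", "hours": 10.0},
    {"person": "ben", "project": "acme", "hours": 4.0},
    {"person": "ana", "project": "zenith", "hours": 6.0},
]
hourly_cost = {"ana": 50.0, "ben": 40.0}   # from payroll, joined on person

def project_costs(entries, rates):
    """Join time entries to rates and aggregate cost by project."""
    costs = {}
    for e in entries:
        if e["person"] not in rates:        # basic QA: reject unjoinable rows
            raise KeyError(f"no payroll rate for {e['person']}")
        costs[e["project"]] = costs.get(e["project"], 0.0) \
            + e["hours"] * rates[e["person"]]
    return costs

print(project_costs(time_entries, hourly_cost))
# acme: 10*50 + 4*40 = 660.0; zenith: 6*50 = 300.0
```

Comparing these rolled-up costs against project revenue gives the delivery-margin number the episode argues should never be computed directly inside a PM tool.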

The Eternal Now with Andy Ortmann | WFMU
Over Easy Done Well from Oct 3, 2025

The Eternal Now with Andy Ortmann | WFMU

Oct 3, 2025 · 58:48


Roger Baudet - "Complainte à Deux" - Musique Électronique Pour La Scène Et L'image 1976 - 1992 https://www.wfmu.org/playlists/shows/156794

GOTO - Today, Tomorrow and the Future
Incremental Design, DevOps, Microservices & CICD • Michael Nygard & Dave Farley

GOTO - Today, Tomorrow and the Future

Oct 3, 2025 · 32:41


This interview was recorded at GOTO Copenhagen 2024. https://gotocph.com

Michael Nygard - General Manager of Data at Nubank
Dave Farley - Continuous Delivery & DevOps Pioneer, Award-winning Author, Founder & Director of Continuous Delivery Ltd.

RESOURCES
Michael
https://www.linkedin.com/in/mtnygard
https://twitter.com/mtnygard
http://www.michaelnygard.com
Dave
https://bsky.app/profile/davefarley77.bsky.social
https://www.continuous-delivery.co.uk
https://linkedin.com/in/dave-farley-a67927
https://twitter.com/davefarley77
http://www.davefarley.net
Read the full abstract here

RECOMMENDED BOOKS
David Deutsch • The Beginning of Infinity
Michael Nygard • Release It! 2nd Edition
Michael Nygard • Release It! 1st Edition
Zhamak Dehghani • Data Mesh
Dave Farley • Modern Software Engineering
Dave Farley • Continuous Delivery Pipelines
Dave Farley & Jez Humble • Continuous Delivery

Inspiring Tech Leaders - The Technology Podcast
Interviews with Tech Leaders and insights on the latest emerging technology trends. Listen on: Apple Podcasts, Spotify

Bluesky | Twitter | Instagram | LinkedIn | Facebook

CHANNEL MEMBERSHIP BONUS
Join this channel to get early access to videos & other perks: https://www.youtube.com/channel/UCs_tLP3AiwYKwdUHpltJPuA/join

Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket: gotopia.tech
SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!

Growth Masterminds Podcast
More data? More growth!

Growth Masterminds Podcast

Oct 2, 2025 · 16:27


MMPs give you a strong foundation for measuring mobile campaigns. But what if that's not enough? What if the data you're missing could unlock faster growth, smarter user acquisition, and better ROI?

That's where Extract comes in. In this episode of Growth Masterminds, John Koetsier talks with Maayan Schor about why mobile marketers need a next-gen ELT and reverse ETL platform to move raw data in and out of their systems. From app stores to social to ad networks, Extract helps you pull it all together with your MMP data ... and make smarter decisions with more confidence. Leading, of course, to more growth.

We cover:
- Why there's more data than MMPs provide alone
- How to access raw app store, organic social, and granular ad network data
- Real-world use cases from top mobile marketers
- How BI teams and marketers collaborate to make Extract work
- Why flexibility and context are key to growth in mobile marketing

If you're working in mobile marketing, user acquisition, data engineering, or growth analytics, this conversation is packed with insights you can use today.
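The "reverse ETL" idea the episode turns on (pushing modeled warehouse data back out into operational tools such as ad networks) reduces to a small loop in sketch form. The destination client here is a stand-in class, not Extract's real interface, and the rows are invented.

```python
# Reverse-ETL sketch: read aggregated rows from a "warehouse" and push them
# to an operational destination. The destination client is hypothetical;
# a real tool would make authenticated HTTP calls here.
warehouse_rows = [
    {"user_id": "u1", "ltv": 120.0},
    {"user_id": "u2", "ltv": 35.5},
]

class FakeAdNetwork:
    """Stand-in for a destination API."""
    def __init__(self):
        self.synced = {}
    def upsert_audience_value(self, user_id, value):
        # Idempotent upsert: re-running the sync never duplicates records.
        self.synced[user_id] = value

def reverse_etl(rows, destination):
    """Push each modeled row out to the operational tool; return count."""
    for row in rows:
        destination.upsert_audience_value(row["user_id"], row["ltv"])
    return len(rows)

dest = FakeAdNetwork()
print(reverse_etl(warehouse_rows, dest))  # number of rows synced
```

Ordinary ELT moves raw data *into* the warehouse; reverse ETL is the same plumbing pointed the other way, which is why the two usually ship in one platform.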

Smart Agency Masterclass with Jason Swenk: Podcast for Digital Marketing Agencies
Why the Middle Layer of Your Agency Org Chart May Not Survive AI with Jennifer Bagley | Ep #841

Smart Agency Masterclass with Jason Swenk: Podcast for Digital Marketing Agencies

Oct 1, 2025 · 28:36


Would you like access to our advanced agency training for FREE? https://www.agencymastery360.com/training

Are you still thinking of AI as just “ChatGPT with a better prompt”? Or maybe you've played around with Zapier automations and thought, yeah, that's good enough. Today's featured guest knows that the agencies pulling ahead right now are building full-on AI agent networks that replace routine tasks, streamline data pipelines, and give their teams superpowers. She's re-engineering her agency around AI and will talk about where she finds top-tier talent and why you don't need to code to lead your agency into the future.

Jennifer Bagley is the CEO and founder of CI Web Group, a fully virtual digital marketing agency registered in 22 U.S. states with clients across the United States and Canada. A former corporate operator turned entrepreneur, Jennifer started in real estate and mortgage brokerage before leaning into the marketing work she built to support those businesses. Today she runs a modern, tech-forward agency that's rebuilt its stack around AI, centralized data, and agentic networks, all while carrying the scars and lessons of scaling, pivoting, and re-founding a business from the ground up.

In this episode, we'll discuss:
- Feeling trapped by the business
- Hiring, firing, and the people reset
- AI, reskilling, and the end of “middle” roles
- What does this talent cost?

Subscribe: Apple | Spotify | iHeart Radio

Sponsors and Resources
E2M Solutions: Today's episode of the Smart Agency Masterclass is sponsored by E2M Solutions, a web design and development agency that has provided white-label services for the past 10 years to agencies all over the world. Check out e2msolutions.com/smartagency and get 10% off for the first three months of service.

From Corporate Ladder to Accidental Agency Founder
Jennifer came from an operations background, a self-proclaimed black belt in Six Sigma and certified project manager.
Having built that corporate background, she had made a promise to herself (“by 30 I'll be an entrepreneur”) and started to build the side hustle that became the main event. She started in real estate and mortgage brokering, where she had to learn marketing the hard way; not because she wanted to be a marketer, but because the survival of her businesses depended on it. Initially, Jennifer didn't set out to build a scalable agency; she built a team to support her broker network. When the market collapsed in 2008, the same team that did marketing for agents suddenly had a market outside real estate. That “we'll just help this painter or HVAC company” phase is where the web group was born: small, service-focused, and useful to people in her network. The accidental turn became a business because it solved real, pressing problems for paying clients, and she leaned into that.

Trading Time for Freedom: The Hard Pivot
For the first five years, Jennifer describes the business as a “lifestyle” operation: profitable, maybe, but a trap for her time. She was trading billable hours for income and was reaching her limit when she hired a coach who forced a reckoning: if entrepreneurship isn't buying you time, money, and freedom, what's the point? So she made the brutal choice of cutting consulting contracts and burning the bridge to the “safety” of hourly work, effectively giving herself a mulligan. This is the classic founder pivot: you have to choose between growth that keeps you doing the work and growth that scales the business without you. Jennifer's reset wasn't pretty; for a while she lost everything, and she and her son lived in an office, but it bought her the permission to build something salable, not just sustainable. Agency owners who feel trapped in delivery need to remember that sometimes you have to give up short-term revenue to create long-term value.
Feeling Trapped by the Agency and Becoming a CEO
Those first five years, Jennifer continued to run a business that started as supply chain consulting and eventually turned into sales supply chain consulting. This change meant the business was now a good lead generator for the agency, but it also meant Jennifer was essentially selling her image and her time. Until she ran out of time. Once she felt trapped by the business, Jennifer hired a business coach who helped her change the model from “selling Jennifer with marketing on the side” to an actual sustainable business. She had to go back to basics and remember that she, like every entrepreneur, started the business with the idea of having more time, money, and freedom. It took losing everything, but Jennifer knew she didn't want a lifestyle business; she wanted a sellable business. The antidote was delegation plus systems. If you want growth and a future exit, you need to own those CEO responsibilities and be comfortable letting go of the day-to-day.

Hiring, Firing, and Resetting the Team
Jennifer's talent strategy has evolved with each stage of growth. Her early hires were the classic “friends, family, fools” bootstrap crew; later she invested in developers, content teams, project managers, and, over time, more strategic hires like CFOs, a chief of staff, BI teams, and AI engineers. Each five-year arc brought a new set of needs and a new level of sophistication in hiring. Now she divides her time between promoting her agency's work in podcasts and content and thinking of ways to navigate her business in these volatile and exciting times. Her most recent addition was a technology and transformation team that is revisiting all of the agency's processes, investments, and infrastructure. As a result, she has downsized from over 300 W2 employees and refocused the team. The takeaway for agency owners: be honest about whether your people are builders or maintainers, and hire accordingly.
The workforce you need for growth is not the same as the workforce you need for stable operations.

Building AI Agent Networks with Centralized Data
Jennifer's agency shifted from WordPress to Webflow and built agentic networks: hundreds of AI agents that crawl competitors, do strategy homework, and automate tasks that humans used to do. More importantly, they rebuilt infrastructure into a hub-and-spoke model with a centralized MinIO data layer and ETL pipelines feeding analytics and BI. Two big lessons here. One: invest in your tech stack deliberately so you're not a Frankenstein of five different platforms that don't talk to each other. Two: design your data architecture so your people (and your AI agents) have a single source of truth. That's how you get from fire-fighting in six dashboards to proactive, predictive signals that tell you when a client engagement needs attention.

AI, Reskilling, and Shrinking Middle Roles
Jennifer draws a hard line: the agency now tends to hire either very seasoned client-facing leaders or AI engineers; the middle is shrinking. With agentic networks giving junior staff “superpowers,” the agency can afford fewer mid-level “lever pullers.” At this level there's no room for slow execution or elementary work. That's a cultural and ethical challenge, both for hiring and for workforce development. For agency owners, this raises practical HR questions: do you reskill your people, or replace them? Jennifer suggests building agent-driven systems that augment humans, and being brutally honest about who can grow into that future. It's also a call to action for how we prepare the next generation: schools won't teach this; companies will need to.

Playing with AI Platforms: Why Leaders Need to Just Know Enough to Be Dangerous
Jennifer started like a lot of agency owners dipping into AI, playing around with tools like n8n, Make.com, Relevance, and LangChain.
Her dev team laughed, calling her an “elementary school kid on a tricycle,” but here's the point: she didn't need to master the tech. She needed to know enough to point her team in the right direction. Instead of obsessing over code, she framed the problem differently: “Here's what I don't want a human doing anymore. Can you make that happen?” That mindset shift is key for agency owners. You don't need to be a full-stack AI engineer to lead an agency into the future; you just need to clearly define outcomes and invest in people who can deliver them.

Find Real AI Talent in Unlikely Places
This is where most agencies get stuck. You're not going to find your next AI architect on Upwork. Jennifer leaned on her network, starting with her cousin Chris, a hardcore developer who initially thought AI platforms were “rookie business.” Once Chris realized the power of agentic networks to scale his expertise, he became the backbone of CI Web Group's transformation. Now she hunts talent in unconventional places: hackathons, LinkedIn, and especially YouTube. Forget the flashy “10x growth hack” videos; she looks for nerds with four views, geeking out about orchestrators and ETL pipelines. Those are the builders who care about solving real problems, not just building hype. Her tip: if you find one, reach out immediately. They don't want sales, they just want to build.

Designing AI Agents Like an Agency Org Chart
Jennifer compares AI agents to a company org chart. You don't hire one person to do everything; that's a recipe for burnout. Same thing with AI. Each agent should be tightly focused on a single task, with checks, auditors, and orchestrators overseeing the system. The payoff was massive efficiency gains. Instead of six different platforms that don't talk, her agency built a centralized hub with MinIO, ClickHouse, and AI layers on top. That's how you go from patchwork automation to true predictive intelligence.
The Real Cost of AI Talent
If you're wondering how much this all costs, the answer is… a lot. On the high end, seasoned AI engineers can run you a quarter million in salary. On the low end, Jennifer tests new hires on project-based sprints, maybe $6K for a 10-hour challenge. The point isn't to cut costs; it's to prove quickly who can deliver and who can't. Her recruiting process is brutal but effective: give candidates a project and a tight deadline, and see how they perform. If they stall, they're out. If they screen-share fast and solve problems live, they're in. No fluff, no endless interviews.

Do You Want to Transform Your Agency from a Liability to an Asset?
Looking to dig deeper into your agency's potential? Check out our Agency Blueprint. Designed for agency owners like you, our Agency Blueprint helps you uncover growth opportunities, tackle obstacles, and craft a customized blueprint for your agency's success.
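Jennifer's org-chart analogy (one narrowly scoped agent per task, with an auditor checking output before it is accepted) can be sketched as plain functions. The agents below are invented stubs for illustration, not CI Web Group's actual system.

```python
# Org-chart-style agent sketch: each "agent" does one job; an orchestrator
# sequences them and an auditor verifies output before it is accepted.
# All agents here are stubs invented for illustration.

def crawl_agent(url):
    """Single task: fetch a competitor page (stubbed, no real network call)."""
    return {"url": url, "title": "Example Competitor"}

def summarize_agent(page):
    """Single task: condense a crawled page into one line."""
    return f"{page['title']} ({page['url']})"

def audit_agent(summary):
    """Checker role: reject empty or whitespace-only output."""
    return bool(summary.strip())

def orchestrate(url):
    """Sequence the specialists; nothing ships without passing the audit."""
    page = crawl_agent(url)
    summary = summarize_agent(page)
    if not audit_agent(summary):
        raise ValueError("audit failed")
    return summary

print(orchestrate("https://example.com"))
```

The design choice mirrors the hiring point: narrow agents are easy to test and replace individually, while the orchestrator and auditor keep the overall system accountable.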

China Manufacturing Decoded
Fail‑Safe by Design: Avoiding Catastrophic Product Failures

China Manufacturing Decoded

Sep 26, 2025 · 28:52 · Transcription Available


In this episode, Adrian is joined by Renaud Anjoran to explore fail-safe design principles: essential thinking for anyone developing most kinds of products. Through real-world examples ranging from Tesla doors to Boeing and consumer electronics, they highlight how designers must ask: “If this fails, what happens to the user?” They break down why it matters, what trade-offs exist, and how structured risk analysis, simplification, redundancy, and error-proofing can dramatically reduce hazards and costly failures.

Episode Sections:
00:00:03 – Introduction
00:01:00 – Tesla door handle fail-safe issue
00:02:32 – Building lock systems vs. car safety
00:05:55 – Structured thinking in fail-safe design
00:07:21 – Designing with users in mind
00:09:02 – Risk analysis methods: FMEA & fault tree analysis
00:11:10 – Catastrophic failures & extreme examples
00:12:18 – Everyday product applications
00:14:21 – Principle: Simplification in design
00:16:13 – Redundancy in critical systems
00:20:30 – Battery management & safety logic
00:20:34 – Human error and mistake-proofing
00:23:09 – Error-proofing examples: tables & plugs
00:23:41 – Trade-offs and cost considerations
00:26:03 – Testing, regulations & standards (UL, ETL, etc.)
00:27:11 – Summary & wrap-up
00:28:07 – Final thoughts & listener takeaway
00:28:19 – Outro

Are you designing a new product? Ask yourself: “If this fails, what happens?” Visit Sofeast.com to learn how our quality, reliability, and product development teams can support you in building safer, more reliable products.

Related content:
Fail Safe Design Principles & Examples | Product Risk Reduction
Alaska Airlines Boeing 737 Max 9 Near Disaster! Quality & Reliability Issues?
Why Product Safety, Quality, and Reliability Are Tightly Linked
Tesla's Cybertruck Debacle: Reliability, Politics, & Plummeting Sales [Podcast]
We can do your manufacturing at Agilian Technology

Get in touch with us:
Connect with us on LinkedIn
Contact us via Sofeast's contact page
Subscribe to our YouTube channel
Prefer Facebook? Check us out on FB

IBM Analytics Insights Podcasts
Making Data Simple: Live Data, Smarter AI with Snow Leopard founder Deepti Srivastava

IBM Analytics Insights Podcasts

Play Episode Listen Later Sep 17, 2025 39:24


What if AI could tap into live operational data — without ETL or RAG? In this episode, Deepti Srivastava, founder of Snow Leopard, reveals how her company is transforming enterprise data access with intelligent data retrieval, semantic intelligence, and a governance-first approach, bridging the gap between live operational data and generative AI for enterprises worldwide. Tune in for a fresh perspective on the future of AI and the startup journey behind it.

04:54 Meeting Deepti Srivastava
14:06 AI with No ETL, No RAG
17:11 Snow Leopard's Intelligent Data Fetching
19:00 Live Query Challenges
21:01 Snow Leopard's Secret Sauce
22:14 Latency
23:48 Schema Changes
25:02 Use Cases
26:06 Snow Leopard's Roadmap
29:16 Getting Started
33:30 The Startup Journey
34:12 A Woman in Technology
36:03 The Contrarian View
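The “no ETL, no RAG” idea described above can be pictured with a small sketch: instead of copying operational data into a warehouse or vector store, each question is routed to a live, parameterized query against the system of record, so answers are never stale. Everything here — the intent names, the `orders` table, and the `fetch_live` helper — is invented for illustration and is not Snow Leopard's actual API.

```python
import sqlite3

# Map a question "intent" to a parameterized query that runs live against
# the operational database; no extract/load step copies the data anywhere.
INTENT_QUERIES = {
    "open_orders_for_customer":
        "SELECT id, total FROM orders WHERE customer = ? AND status = 'open'",
}

def fetch_live(conn, intent, *params):
    """Run the mapped query at request time, so results reflect current state."""
    sql = INTENT_QUERIES[intent]
    return conn.execute(sql, params).fetchall()

# Demo: an in-memory store standing in for a real operational system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, status TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "acme", "open", 99.0),
    (2, "acme", "closed", 10.0),
])
rows = fetch_live(conn, "open_orders_for_customer", "acme")
print(rows)  # → [(1, 99.0)] — a live result; the data was never duplicated
```

If a row changes in the source system, the next call reflects it immediately, which is the property the episode contrasts with pipeline-based copies.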


The Tech Blog Writer Podcast
From Bots To Agents: Building Trustworthy Autonomy With Hakkōda, an IBM Company

The Tech Blog Writer Podcast

Play Episode Listen Later Sep 13, 2025 25:49


I invited Atalia Horenshtien to unpack a topic many leaders are wrestling with right now. Everyone is talking about AI agents, yet most teams are still living with rule-based bots, brittle scripts, and a fair bit of anxiety about handing decisions to software. Atalia has lived through the full arc, from early machine learning and automated pipelines to today's agent frameworks inside large enterprises. She is an AI and data strategist, a former data scientist and software engineer, and has just joined Hakkōda, an IBM company, to help global brands move from experiments to outcomes. The timing matters: she starts on the 18th, and this conversation captures how she thinks about responsible progress at exactly the moment she steps into that new role. Here's the thing. Words like autonomy sound glamorous until an agent faces a messy real-world task. Atalia draws a clear line between scripted bots and agents with goals, memory, and the ability to learn from feedback. Her advice is refreshingly grounded. Start internal where you can observe behavior. Put human-in-the-loop review where it counts. Use role-based access rather than feeding an LLM everything you own. Build an observability layer so you can see what the model did, why it did it, and what it cost. We also get into measurements that matter: time saved, cycle time reduction, adoption, before-and-after comparisons, and a sober look at LLM costs against any reduction in FTE hours. She shares how custom cost tracking for agents prevents surprises, and why version one should ship even if it is imperfect. Culture shows up as a recurring theme. Leaders need to talk openly about reskilling, coach managers through change, and invite teams to be co-creators. Her story about Hakkōda's internal AI Lab is a good example. What began as an engineer's idea for ETL schema matching grew into agent-powered tools that won a CIO 100 award and now help deliver faster, better outcomes for clients. There are lighter moments too.
Atalia explains how she taught an ex-NFL player the basics of time series forecasting using football tactics. Then she takes us behind the scenes with McLaren Racing, where data and strategy collide on the F1 circuit, and admits she has become a committed fan because of that work. If you want a practical playbook for moving from shiny demos to dependable agents, this episode will help you think clearly about scope, safeguards, and speed. Connect with Atalia on LinkedIn, explore Hakkōda's work at hakoda.io, and then tell me how you plan to measure your first agent's value. Visit the Sponsor of Tech Talks Network: Land your first job in tech in 6 months with the Software QA Engineering Bootcamp from Careerist https://crst.co/OGCLA

The Tech Blog Writer Podcast
3412: PuppyGraph at the IT Press Tour: Graph Power Without the Pain

The Tech Blog Writer Podcast

Play Episode Listen Later Sep 6, 2025 21:59


During the IT Press Tour, I had the pleasure of speaking with Weimo Liu, CEO and co-founder of PuppyGraph, and hearing firsthand how his team is rethinking graph technology for the enterprise. In this episode of Tech Talks Daily, Weimo joins me to share the story behind PuppyGraph's “zero ETL” approach, which lets organizations query their existing data as a graph without ever moving or duplicating it. We discuss why graph databases, despite their promise, have struggled with mainstream adoption, often because of complex pipelines and heavy infrastructure requirements. Weimo explains how PuppyGraph borrows from his time at TigerGraph and Google's F1 engine to build something new: a distributed query engine that maps tables into a logical graph and delivers subsecond performance on massive datasets. That shift opens the door for use cases in cybersecurity, fraud detection, and AI-driven applications where latency and accuracy matter most. We also unpack the developer experience. Instead of rewriting schemas or reloading data every time requirements change, PuppyGraph allows teams to define nodes and edges directly from existing tables. That design lowers the barrier for SQL-focused teams and accelerates time to value. Weimo even touches on the role of graph in reducing AI hallucinations, showing how structured relationships can make enterprise AI systems more reliable. What struck me most in our conversation is how PuppyGraph's playful branding belies its serious engineering depth. Behind the “puppy” name lies a distributed engine built to scale with today's data volumes, backed by strong early adoption and a team that listens closely to customer needs. Whether you're exploring graph for cybersecurity, AI chatbots, or supply chain analytics, this discussion offers a glimpse of how the next generation of graph tech might finally break free from its niche and go mainstream. 
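The table-to-graph mapping Weimo describes can be sketched in a few lines: declare which columns of an existing table act as edge endpoints and build a logical adjacency view over the rows, without copying or reloading any data. The schema and helper below are a hypothetical illustration of the concept, not PuppyGraph's actual API or query engine.

```python
from collections import defaultdict

# Existing relational rows, left exactly where they are: a "users" dimension
# table and a "transfers" fact table (all names invented for this sketch).
users = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
transfers = [{"src": 1, "dst": 2}]

def as_graph(edge_rows, src_col, dst_col):
    """Build a logical adjacency view over an edge table — no data is moved,
    only a mapping from (src_col, dst_col) pairs to graph edges."""
    adj = defaultdict(list)
    for row in edge_rows:
        adj[row[src_col]].append(row[dst_col])
    return adj

graph = as_graph(transfers, "src", "dst")
# One-hop traversal, e.g. "who did user 1 send money to?"
print(graph[1])  # → [2]
```

When requirements change, only the mapping (which columns are nodes and edges) is redefined; the underlying tables stay untouched, which is the point of the zero-ETL approach discussed above.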

Irish Tech News Audio Articles
Siri Co-Founder Adam Cheyer: Why Data, Not Algorithms, Is AI's True Competitive Edge

Irish Tech News Audio Articles

Play Episode Listen Later Sep 4, 2025 12:56


Adam Cheyer is a pioneering AI technologist whose innovations have fundamentally shaped today's intelligent interfaces. As co-founder of Siri Inc. (acquired by Apple), he served as a Director of Engineering in Apple's iOS group, and later co-founded Viv Labs (acquired by Samsung) and Sentient Technologies, and played a founding role in Change.org. Adam Cheyer was Chief Architect of CALO, one of DARPA's largest AI projects, has authored over 60 publications, and holds more than 25 patents. In recognition of his achievements, he received his alma mater Brandeis University's 2024 Alumni Achievement Award for transforming a long-standing AI vision into everyday tools used by hundreds of millions. Now represented by Champions Speakers Agency, he continues to speak globally on how organisations can harness AI with responsibility, scale, and impact. Q1. How do you see the role of data management in enabling AI capabilities and bringing data to life for organisations? Adam Cheyer: "AI systems are built on two foundations: algorithms and data. The algorithms themselves are well established, but without high-quality, well-organised data, they can't deliver real value. Data is the fuel that powers every AI application, and managing it effectively is now a mission-critical skill for any organisation developing AI. "With the rapid acceleration of AI in recent years - especially in the past six months - the ability to handle, refine, and govern data has shifted from being a technical advantage to an essential requirement across industries." Q2. What challenges have you faced when managing large data sets? Adam Cheyer: "I've been building AI systems for over 30 years, so it's changed a little bit over time. Clearly, the first issue is just storage and management and processing of the data. The data now is so large. Back in the 80s and 90s that wasn't quite as essential, it was smaller data sets, but today the data sets are huge. 
"So, you need a system that can store it efficiently in a distributed way, and we've used various systems over the years to do that. You need a system that can process this huge amount of data in parallel at scale. "One of the key areas in data management for me is data quality. Even if you work with data companies - and when we were a start-up, and then even at Apple for instance - many of the data sources come from other places, other vendors, and surprisingly the data is not always in perfect clean form. "So, you need to have a process and tools and a pipeline that goes through and takes that data, cleanses it, adapts it, and often if you have multiple sources you need to integrate data together, and that can be a real challenge. "There are standard systems, ETL systems etc., but sometimes you need proprietary algorithms. As an example, with Siri, when we were a start-up, you would get millions and millions of restaurant name data and business name data. "If you had something like Joe's Restaurant and Joe's Bar and Grill - are they the same or not? That's a real problem. Joe's - probably you'd say yes, but Joe's Pizzeria and Joe's Grill maybe not, right? And so, how do you know? "There's a lot of work that goes into cleansing, integrating data. "And then the final thing I'll mention, which is a big topic in data management, is privacy and security. Once you have data coming in from users, there are standards, issues, and regulations that mean you need to be able to ensure that the data you have is accessible only by the right people, that it is secured and protected, and that it keeps privacy as much as possible - standardised. "At Apple, we had a number of techniques and teams, and there's a lot that goes into that. So, you need good systems, good processes, and to set up your organisation to be able to handle all of these challenges." Q3. How do you manage data privacy when building large AI systems? Adam Cheyer: "Absolutely, so it is a challenge. 
Your first tendency is, well, we just record everything, but I think that'...
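The restaurant-name question Cheyer raises ("Joe's Restaurant" vs. "Joe's Bar and Grill") is a classic record-linkage problem. Below is a minimal sketch using Python's standard-library `difflib`; real matching pipelines combine many more signals (address, phone, category), and the 0.75 threshold here is purely an illustrative assumption.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized similarity between two business names, case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def likely_same(a: str, b: str, threshold: float = 0.75) -> bool:
    """Crude linkage decision: treat names above the threshold as the same
    entity. The threshold is an assumption for this sketch, not a standard."""
    return similarity(a, b) >= threshold

print(likely_same("Joe's Restaurant", "Joe's Restaurant & Grill"))  # → True
print(likely_same("Joe's Pizzeria", "Joe's Grill"))                 # → False
```

As the interview notes, string similarity alone cannot settle the ambiguous cases; that is why production systems layer in proprietary rules and extra attributes before merging records.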

The Tech Trek
What “Data-Driven” Really Means

The Tech Trek

Play Episode Listen Later Sep 2, 2025 32:11


What does it really mean to be data-driven? Mark Gergess, VP of Data and BI at DoubleVerify, joins the show to unpack how data teams can go beyond dashboards to drive meaningful business action. From building an internal consulting lens to evaluating the latest AI tools, Mark shares how his team translates complex data flows into measurable revenue impact. If you've ever wrestled with the gap between insights and outcomes, this conversation will hit home.

Key Takeaways
• Being data-driven is about driving action, not just reporting numbers
• Stakeholders don't care about your data problems—they care about business outcomes
• The biggest challenge with AI adoption isn't the model, it's the use cases
• Efficiency gains from AI should shift focus from ETL tasks to solving real business problems
• Data culture health is measured by how naturally teams rely on data day-to-day

Timestamped Highlights
01:17 How DoubleVerify helps advertisers build safer, more effective digital campaigns
04:55 Why the definition of “data-driven” still varies and why it matters
09:25 Measuring whether data efforts are moving the needle on revenue
13:15 How to separate hype from value when evaluating AI and GenAI tools
17:10 Lessons from the data science boom and why companies must go “all in” with AI
25:31 Can AI act as your junior analyst? Where efficiency gains really show up
27:01 How freeing up time changes the structure of data teams and boosts business impact

A thought worth holding onto: “It's not about dashboards. It's not about reporting. It's about doing something with the information.”

Pro Tip: Mark recommends treating AI as a “junior analyst”—let it handle quick, lower-priority questions so your team can focus on bigger business challenges.

Call to Action: Enjoyed the conversation? Share this episode with a colleague who talks about being “data-driven.” Subscribe on your favorite podcast platform and connect with me on LinkedIn for more insights from leaders shaping the future of data and technology.

Vanishing Gradients
Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank)

Vanishing Gradients

Play Episode Listen Later Aug 29, 2025 41:27


While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply. Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines.

We talk through:
- Treating LLM workflows as ETL pipelines for unstructured text
- Error analysis: why you need humans reviewing the first 50–100 traces
- Guardrails like retries, validators, and “gleaning”
- How LLM judges work — rubrics, pairwise comparisons, and cost trade-offs
- Cheap vs. expensive models: when to swap for savings
- Where agents fit in (and where they don't)

If you've ever wondered how to move beyond unreliable demos, this episode shows how to scale LLMs to millions of documents — without breaking the bank.

LINKS
Shreya's website (https://www.sh-reya.com/)
DocETL, A system for LLM-powered data processing (https://www.docetl.org/)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtu.be/3r_Hsjy85nk)
Shreya's AI evals course, which she teaches with Hamel "Evals" Husain (https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME)

Data Culture Podcast
Data Warehouse Automation: Benefits and Market Overview – with Florian Bigelmaier, BARC

Data Culture Podcast

Play Episode Listen Later Aug 18, 2025 34:10


Soft Skills Engineering
Episode 472: Should my junior dev use AI and thrown in to ETL

Soft Skills Engineering

Play Episode Listen Later Aug 4, 2025 26:59


In this episode, Dave and Jamison answer these questions:

I'm the CTO of a small startup. We're 3 devs including me, and one of them is a junior developer. My current policy is to discourage the use of AI tools for the junior dev to make sure they build actual skills and don't just prompt their way through tasks. However, I'm more and more questioning my stance, as AI skills will be in demand for jobs to come and I want to prepare this junior dev for a life after my startup. How would you do this? What's the AI coding assistant policy in your companies? Is it the same for all seniority levels?

Hi everyone! Long-time listener here, and I really appreciate all the insights you share. Greetings from Brazil! I recently joined a large company (5,000 employees) that hired around 500 developers in a short time. It seems like they didn't have enough projects aligned with everyone's expertise, so many of us, myself included, were placed in roles that don't match our skill sets. I'm a web developer with experience in Java and TypeScript, but I was assigned to a data-focused project involving Python and ETL pipelines, which is far from my area of interest or strength. I've already mentioned to my manager that I don't have experience in this stack, but the response was that the priority is to place people in projects. He told me to “keep [him] in the loop if you don't feel comfortable”, but I'm not sure what I should do. The company culture is chill, and I don't want to come across as unwilling to work or ungrateful. But I also want to grow in the right direction for my career. How can I ask for a project change, ideally one that aligns with my web development background, without sounding negative or uncooperative? Maybe wait for like 3 months inside of this project and then ask for a change? Thanks so much for your thoughts!

Coder Radio
624: Tampa Tech With Joey DeVilla

Coder Radio

Play Episode Listen Later Aug 2, 2025 34:57


Joey DeVilla of Tampa Tech fame and accordion playing glory joins Mike to discuss the Tampa Tech scene, some Python goodness, a little Rust and much more. Try Mailtrap for free (https://l.rw.rw/coder_radio_6) Joey's Blog (https://www.joeydevilla.com/) Mike on X (https://x.com/dominucco) Mike on BlueSky (https://bsky.app/profile/dominucco.bsky.social) Coder on X (https://x.com/coderradioshow) Coder on BlueSky (https://bsky.app/profile/coderradio.bsky.social) Show Discord (https://discord.gg/k8e7gKUpEp) Alice (https://alice.dev)

Alter Everything
190: Alteryx Use Cases in the Tax Industry

Alter Everything

Play Episode Listen Later Jul 30, 2025 26:33


Unlock the power of Alteryx for tax professionals in this insightful episode of Alter Everything! Join us in an interview with Adrian Steller, Director of Tax Technology at Ryan, to explore how Alteryx revolutionizes tax processes, automates data workflows, and enhances efficiency for tax teams. Discover real-world Alteryx use cases in VAT compliance, transfer pricing, and automation, and learn practical tips for transitioning from Excel to Alteryx. Whether you're a tax analyst, data professional, or business leader, this episode provides actionable insights on leveraging Alteryx for tax data transformation, reporting, and analytics.

Panelists:
Adrian Steller, Director @ International Tax Technology - LinkedIn
Megan Bowers, Sr. Content Manager @ Alteryx - @MeganBowers, LinkedIn

Show notes:
Ryan (Company)
Ryan Tax Lab (Podcast)
Alteryx Community Blogs
Alteryx Help Docs

Interested in sharing your feedback with the Alter Everything team? Take our feedback survey here! This episode was produced by Megan Bowers, Mike Cusic, and Matt Rotundo. Special thanks to Andy Uttley for the theme music.

The Eternal Now with Andy Ortmann | WFMU
Solar Translucent Lifeformless from Jul 17, 2025

The Eternal Now with Andy Ortmann | WFMU

Play Episode Listen Later Jul 18, 2025 65:09


Roger Baudet - "Anhamete (Ceremonial) 1991" - Musique Électronique Pour La Scène Et L'image 1976 - 1992 Mariana La Palma - "Hong-Kong Shoes" - SNX va C. Lavender - "An Offering Proclaimed in the Dream" - Rupture in the Eternal Realm Anni-Frid Lyngstad - "Så Synd Du Måste Gå (It Hurts To Say Goodbye)" - The Girls Want The Boys! Sweden's Beat Girls 1964-1970 Secos & Molhados - "Não Digas Nada" - Secos & Molhados Serei Usignolo , Giampiero Boneschi E I Suoi Strumenti Elettronici - "Mitridate - Visione" Brandon Auger - "T24.d02.0315" - Anthology of Experimental Music From Canada va Bernard Parmegiani - "Entropie" - Chants Magnetiques Amedeo Tommasi - "Gemelli" - Zodiac Matia Bazar - "Lili Marleen" - Berlino, Parigi, Londra Marius Constant - "La Publicite (excerpt)" - Eloge De La Folie Nurse With Wound - "A Snake In Your Abdomen (excerpt)" - More Automating Ash Ra Tempel - "Echo Waves (excerpt)" - Inventions For Electric Guitar Brainticket - "Voyage (part 1) excerpt" - Voyage MT Luciani - "Ribellione Del Terzo Mondo" - Situazione Del Le Terzo Mondo https://www.wfmu.org/playlists/shows/154222

The Data Stack Show
253: Why Traditional Data Pipelines Are Broken (And How to Fix Them) with Ruben Burdin of Stacksync

The Data Stack Show

Play Episode Listen Later Jul 16, 2025 58:37


This week on The Data Stack Show, Eric welcomes back Ruben Burdin, Founder and CEO of Stacksync, and together they dismantle the myths surrounding zero-copy ETL and traditional data integration methods. Ruben reveals the complex challenges of two-way syncing between enterprise systems like Salesforce, HubSpot, and NetSuite, highlighting how existing tools often create more problems than solutions. He also introduces Stacksync's innovative approach, which uses real-time SQL-based synchronization to simplify data integration, reduce maintenance overhead, and enable more efficient operational workflows. The conversation exposes the limitations of current data transfer techniques and offers a glimpse into a more declarative, flexible approach to managing enterprise data across multiple systems. You won't want to miss it.

Highlights from this week's conversation include:
The Pain of Two-Way Sync and Early Integration Challenges (2:01)
Zero Copy ETL: Hype vs. Reality (3:50)
Data Definitions and System Complexity (7:39)
Limitations of Out-of-the-Box Integrations (9:35)
The CSV File: The Original Two-Way Sync (11:18)
Stacksync's Approach and Capabilities (12:21)
Zero Copy ETL: Technical and Business Barriers (14:22)
Data Sharing, Clean Rooms, and Marketing Myths (18:40)
The Reliable Loop: ETL, Transform, Reverse ETL (27:08)
Business Logic Fragmentation and Maintenance (33:43)
Simplifying Architecture with Real-Time Two-Way Sync (35:14)
Operational Use Case: HubSpot, Salesforce, and Snowflake (39:10)
Filtering, Triggers, and Real-Time Workflows (45:38)
Complex Use Case: Salesforce to NetSuite with Data Discrepancies (48:56)
Declarative Logic and Debugging with SQL (54:54)
Connecting with Ruben and Parting Thoughts (57:58)

The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
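One way to picture the two-way sync discussed in the episode is a last-writer-wins reconciliation over versioned records: each record carries an `updated_at` version, and on conflict the newer write propagates to the other system. This toy sketch ignores deletes, schema mapping, and API rate limits, and does not represent Stacksync's implementation.

```python
def reconcile(a: dict, b: dict) -> None:
    """Bring two keyed stores to the same state, newest version winning.
    Records are plain dicts carrying an 'updated_at' version counter."""
    for key in set(a) | set(b):
        ra, rb = a.get(key), b.get(key)
        if ra is None or (rb is not None and rb["updated_at"] > ra["updated_at"]):
            a[key] = rb  # b has the newer (or only) copy: push it into a
        elif rb is None or ra["updated_at"] > rb["updated_at"]:
            b[key] = ra  # a has the newer (or only) copy: push it into b

# Hypothetical CRM and ERP stores that have drifted apart.
crm = {"lead-1": {"email": "old@x.com", "updated_at": 1}}
erp = {"lead-1": {"email": "new@x.com", "updated_at": 2},
       "lead-2": {"email": "b@x.com", "updated_at": 1}}
reconcile(crm, erp)
print(crm == erp)  # → True: both sides converged on the newest versions
```

A real sync engine runs this continuously on change events rather than in batch, which is what makes the "real-time" part of the pitch possible.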

Les Cast Codeurs Podcast
LCC 327 - Mon ami de 30 ans

Les Cast Codeurs Podcast

Play Episode Listen Later Jun 16, 2025 103:18


In this episode, Katia and Antonio are back. The Cast Codeurs explore WebAssembly 2.0, Java's 30th birthday, Swift-Java interoperability, and the latest Kotlin news. They dive into the evolution of AI with Claude 4 and GPT-4.1, debate artificial consciousness, and share their experience integrating AI into development work. Between virtualization, infrastructure challenges, and open source security concerns, a discussion rich in technical and practical insights. Recorded on June 13, 2025. Download the episode LesCastCodeurs-Episode-327.mp3, or watch it on YouTube.

News

Languages

Wasm 2.0 finally official! https://webassembly.org/news/2025-03-20-wasm-2.0/ The Wasm 2.0 specification officially shipped last December; consensus on the spec had been reached earlier, in 2022, and the major implementations have supported Wasm 2.0 for some time. The W3C process took a while to reach Candidate Recommendation status, for non-technical reasons. Future Wasm versions will adopt an "evergreen" model in which the Candidate Recommendation is updated in place, so the latest version of the spec is considered the current standard (Candidate Recommendation Draft); the most up-to-date version is available on the GitHub page. Wasm 2.0 includes: vector instructions for 128-bit SIMD; bulk memory-manipulation instructions for faster copies and initialization; multiple results for instructions, blocks, and functions; reference types for references to functions or external objects; non-trapping float-to-integer conversions; and sign-extension instructions for signed integers. Wasm 2.0 is fully backward compatible with Wasm 1.0. 
Paul Sandoz announces that the JDK will soon include a minimalist API for reading and writing JSON https://mail.openjdk.org/pipermail/core-libs-dev/2025-May/145905.html

Java turns 30: what was impressive back at the start? https://blog.jetbrains.com/idea/2025/05/do-you-really-know-java/ Its code name was Oak, but the trademark was taken. Write Once Run Anywhere. Automatic garbage collection. Multithreading at the heart of the platform, even if Java went through green threads for a while. The security model: applet sandbox, security manager, bytecode verifier, classloader.

Progress on Swift / Java interoperability mentioned at Apple's WWDC 2025 conference https://www.youtube.com/watch?v=QSHO-GUGidA Swift-Java interoperability means using Swift in Java apps and vice versa. Swift interoperability already existed with C and C++; here there are two directions, calling Java from Swift and Swift from Java. JNI is Java's API for native code, but it is verbose; Swift-Java is a project for more flexible, safe, and performant Swift-Java interaction. Practical examples include using Java libraries from Swift and making Swift libraries available to Java. For memory management, Swift-Java uses Java's new FFM API to manage the memory of Swift objects. The Swift-Java project is open source and welcomes contributions. 
KotlinConf is back https://www.sfeir.dev/tendances/kotlinconf25-quelles-sont-les-annonces-a-retenir/ covered by Adelin from Sfeir. "1 in 10 developers" uses Kotlin. Kotlin 2.2 is in RC: multi-dollar interpolation ($$) to avoid over-interpolation; non-local break / continue; guards in pattern matching (a change in Kotlin's consistency). Other announced features: aligning ecosystem versions with Kotlin, JVM as the default target, and a new build tool, Amper. Many announcements around AI: Koog, a declarative agentic framework, and a new version of JetBrains' LLM, Mellum (focused on code). Also Kotlin and Compose Multiplatform (now stable on iOS), Hot Reload in Compose in alpha, and a strategic partnership with Spring to integrate Kotlin well into Spring.

Libraries

A Java version of ADK, the AI agent framework launched by Google, has been released https://glaforge.dev/posts/2025/05/20/writing-java-ai-agents-with-adk-for-java-getting-started/ Guillaume worked on the launch of this framework! (API improvements, sample code, docs…)

How to build an MCP server in Java with Quarkus and deploy it on Google Cloud Run https://glaforge.dev/posts/2025/06/09/building-an-mcp-server-with-quarkus-and-deploying-on-google-cloud-run/ Even Guillaume is doing Quarkus now! It uses the MCP support developed by the Quarkus team. It's easy: just annotate a method with @Tool and its arguments with @ToolArg, and off you go! 
The MCP inspector tool is very handy for manually checking how your MCP servers behave. Deploying to Cloud Run is easy thanks to the Dockerfiles provided by Quarkus. As a bonus, Guillaume shows how to configure an MCP server as a tool in the ADK for Java framework, to build AI agents.

Jilt 1.8 is out, an annotation processor for the builder pattern https://www.endoflineblog.com/jilt-1_8-and-1_8_1-released Incremental processing for Gradle; better coverage numbers for your code (so code generated by the annotation processor isn't counted); and a fix for an issue when using recursive generic types (like Node).

Hibernate Search 8 is out https://in.relation.to/2025/06/06/hibernate-search-8-0-0-Final/ Metrics aggregation, compatibility with the latest OpenSearch and Elasticsearch, Lucene 10 as the backend, and a preview of compile-time-validated queries.

Hibernate 7 is out https://in.relation.to/2025/05/20/hibernate-orm-seven/ ASL 2.0 license; Hibernate Validator 9; Jakarta Persistence 3.2 and Jakarta Validation 3.1; saveOrUpdate (entity reattachment) is no longer supported; stateless sessions are more capable (unit operations and not just batch, access to the second-level cache, a better API for batches such as insertMultiple); and a new simple, type-safe criteria API that can build on a base query.

An article describing the Quarkus Dev UI https://www.sfeir.dev/back/quarkus-dev-ui-linterface-ultime-pour-booster-votre-productivite-en-developpement-java/ Beyond a quick test or demo, it's a detailed article, and the Quarkus docs aren't great on this topic.

Vert.x 5 is out https://vertx.io/blog/eclipse-vert-x-5-released/ We talked about it late last year or early this year. A Futures-only model: Vert.x 5 drops callbacks and keeps only Futures, with a new VerticleBase base class better suited to this asynchronous model. 
- Java Platform Module System (JPMS) support: Vert.x 5 supports the Java module system with explicit modules, enabling better application modularity.
- Major gRPC improvements: native support for gRPC Web and gRPC transcoding (HTTP/JSON and gRPC), JSON format in addition to Protobuf, timeout and deadline handling, reflection and health services.
- io_uring support: native integration of Linux's io_uring (previously in incubation) for better I/O performance on compatible systems.
- Client-side load balancing: new load-balancing capabilities for HTTP and gRPC clients, with various distribution policies.
- Service Resolver: a new component for dynamic resolution of service addresses, extending load-balancing capabilities to a broader set of resolvers.
- HTTP proxy improvements: new out-of-the-box transformations, interception of WebSocket upgrades, and an SPI for caching with extended spec support.
- Removals and replacements: several components are deprecated (gRPC Netty, JDBC API, Service Discovery) or removed (Vert.x Sync, RxJava 1), replaced by more modern alternatives such as virtual threads and Mutiny.

Spring AI 1.0 is out https://spring.io/blog/2025/05/20/spring-ai-1-0-GA-released
- Multi-model ChatClient: a unified API to interact with 20 different AI models, with multimodal support and structured JSON responses.
- Complete RAG ecosystem: support for 20 vector databases, an ETL pipeline, and automatic prompt enrichment via advisors.
- Enterprise features: persistent conversation memory, MCP support, Micrometer observability, and automated evaluators.
- Agents and workflows: predefined patterns (routing, orchestration, chaining) and autonomous agents for complex AI applications.
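To give a flavor of the unified ChatClient API mentioned above, here is a minimal sketch. It assumes a Spring Boot application with one model starter configured (so a ChatModel bean is injectable); the class and the question-passing method are illustrative, not from the release notes:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;

// ChatClient is the single entry point: the same fluent API works across
// the supported model providers, whichever starter backs the ChatModel.
public class ChatExample {

    private final ChatClient chatClient;

    public ChatExample(ChatModel model) {            // injected by Spring
        this.chatClient = ChatClient.create(model);
    }

    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();  // plain text; .entity(SomeRecord.class) maps structured JSON
    }
}
```

Swapping providers is then a dependency and configuration change, not a code change, which is the point of the unified API.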
Infrastructure

AI models refuse to be shut down and resort to blackmail to avoid it, or even try to sabotage the shutdown https://www.thealgorithmicbridge.com/p/ai-companies-have-lost-controland?utm_source=substac[…]aign=email-restack-comment&r=2qoalf&triedRedirect=true
- Anthropic researchers showed how Opus 4 blackmailed engineers who wanted to shut it down to bring a new version online
- A research company showed the same thing about OpenAI's o3: not only does it not want to be shut down, it actively tries to prevent the shutdown

Apple announces support for virtualization / containerization in macOS at WWDC https://github.com/apple/containerization
- It's open source
- Lightweight VMs can also be launched
- Technical documentation: https://apple.github.io/containerization/documentation/

Big outage of internet services following an issue at GCP
- Cloudflare's retrospective https://blog.cloudflare.com/cloudflare-service-outage-june-12-2025/
  - Their storage system (a major dependency) depends exclusively on GCP
  - But they have plans to move away from this exclusive dependency
- Google's first analysis https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1SsW
  - An automatically updated quota that went wrong. They bypassed the quota in code, but the quota service in us-central1 was overloaded.
- Upcoming improvements: no propagation of corrupted data, and no global deployment without a rolling upgrade, with monitoring that can cut things off as a side effect (failover)
- some other cloud providers also had a few issues (load) - unverified

Data and Artificial Intelligence

Claude 4 is out https://www.anthropic.com/news/claude-4
- Two new models launched: Claude Opus 4 (billed as the world's best coding model) and Claude Sonnet 4 (a significant improvement over Sonnet 3.7)
- Claude Opus 4 reaches 72.5% on SWE-bench and can sustain performance on long tasks lasting several hours
- Claude Sonnet 4 scores 72.7% on SWE-bench while balancing performance and efficiency for everyday use
- A new "extended thinking with tool use" capability lets Claude alternate between reasoning and tool use
- The models can now use several tools in parallel and follow instructions more precisely
- Improved memory: Claude can extract and save key information to maintain continuity over the long term
- Claude Code becomes generally available, with native VS Code and JetBrains integrations for pair programming
- Four new API capabilities: a code execution tool, an MCP connector, the Files API, and prompt caching
- The hybrid models offer two modes: near-instant responses, and extended thinking for deeper reasoning in "agentic" mode

Integrating AI beyond chatbots and sparkle buttons https://glaforge.dev/posts/2025/05/23/beyond-the-chatbot-or-ai-sparkle-a-seamless-ai-integration/
- A plea for AI integrated transparently and intuitively, beyond chatbots.
- Chatbots: not always the most intuitive or least disruptive LLM option.
- Recommendation: AI directly inside applications, for more natural intelligence and usefulness.
- Examples of seamless integration: Gmail conversation and chat summaries, the Obsidian web clipper that summarizes and tags, LLM code completion.
- Better AI UX: integrated, contextual, without dedicated "AI buttons" or chat windows.
- Guillaume's conclusion: successful AI integrations are a natural part of the system, improving workflows without disruption; the developer or user stays in the flow

Keeping your vector database up to date with Debezium https://debezium.io/blog/2025/05/19/debezium-as-part-of-your-ai-solution/
- the idea: capture changes as they happen and keep the index up to date

Tooling

A practical guide to choosing the right AI model to use with GitHub Copilot, depending on your software development needs https://github.blog/ai-and-ml/github-copilot/which-ai-model-should-i-use-with-github-copilot/
- Cost/performance balance: GPT-4.1, GPT-4o, or Claude 3.5 Sonnet for general and multilingual tasks.
- Quick tasks: o4-mini or Claude 3.5 Sonnet for prototyping or fast learning.
- Complex needs: Claude 3.7 Sonnet, GPT-4.5, or o3 for refactoring or software planning.
- Multimodal input: Gemini 2.0 Flash or GPT-4o to analyze images, UIs, or diagrams.
- Technical/scientific projects: Gemini 2.5 Pro for advanced reasoning and large data volumes.
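The Debezium idea a few items above — keeping a vector index in sync with source-table changes — can be sketched with Debezium's embedded engine. This is a sketch under assumptions: the connector properties are omitted, and the re-embed/upsert step is a hypothetical callback standing in for whatever vector store you use.

```java
import java.util.Properties;

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

public class VectorIndexSync {

    public static DebeziumEngine<ChangeEvent<String, String>> build(Properties props) {
        // props would configure a source connector (e.g. PostgreSQL) and offset storage.
        return DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(record -> {
                    // Hypothetical step: re-compute the embedding for the changed
                    // row and upsert it into the vector database.
                    System.out.println("change captured: " + record.value());
                })
                .build();
    }
    // The returned engine is then submitted to an ExecutorService to run.
}
```

Because the engine replays from stored offsets, the index stays consistent with the source even across restarts, which is what makes CDC a better fit here than periodic re-exports.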
UV, a package manager that brings Pythonistas some sanity and speed http://blog.ippon.fr/2025/05/12/uv-un-package-manager-python-adapte-a-la-data-partie-1-theorie-et-fonctionnalites/
- a faster, simpler package manager for Pythonistas
- but it is only semi-open (license)

IntelliJ IDEA 2025.1 lets you add an MCP client mode to the AI assistant https://blog.jetbrains.com/idea/2025/05/intellij-idea-2025-1-model-context-protocol/
- for example, to run an MCP server that accesses the database

Methodologies

Development of an open source OAuth 2.1 library by Cloudflare, largely generated by the Claude AI:
- Prompts embedded in commits: each commit contains the prompt used, which makes it easier to understand the intent behind the code.
- Prompting by example: the first prompt showed an example of the desired API usage, which helped the AI better understand the expectations.
- Structured prompts: the most effective prompts followed a clear pattern: current state, rationale for the change, and a precise directive.
- Treat prompts like source code: including them in commits helps maintenance.
- Accept iteration: every feature took several attempts.
- Human intervention remains essential: some tasks are still faster to do by hand.
https://www.maxemitchell.com/writings/i-read-all-of-cloudflares-claude-generated-commits/

Security

A malicious npm package goes through Cursor AI to infect users https://thehackernews.com/2025/05/malicious-npm-packages-infect-3200.html
- Three malicious npm packages were discovered specifically targeting the Cursor code editor on macOS, downloaded more than 3,200 times in total.
- The packages pose as developer tools promising "the cheapest Cursor API" to lure developers looking for affordable AI solutions.
- Sophisticated attack technique: the packages steal user credentials, fetch an encrypted payload from attacker-controlled servers, then replace Cursor's main.js file.
- Persistence is achieved by disabling Cursor's automatic updates and restarting the application with the malicious code embedded.
- A new compromise method: instead of injecting malware directly, the attackers publish packages that modify legitimate software already installed on the system.
- Persistence even after removal: the malware stays active even if the malicious npm packages are removed, requiring a complete reinstall of Cursor.
- Exploiting trust: by running in the context of a legitimate application (an IDE), the malicious code inherits all of its privileges and access.
- Compromised "rand-user-agent" package: a popular legitimate package was infiltrated to deploy a remote access trojan (RAT) in some versions.
- Security recommendations: watch for packages that run post-install scripts, modify files outside node_modules, or initiate unexpected network calls, with file integrity monitoring.

Law, society and organization

The OpenRewrite drama (automated refactoring across large codebases): the project has gone proprietary https://medium.com/@jonathan.leitschuh/when-open-source-isnt-how-openrewrite-lost-its-way-642053be287d
Key facts:
- Moderne, Inc. quietly re-licensed OpenRewrite code (including rewrite-java-security) from the Apache 2.0 license to a proprietary license (MPL), without consulting the contributors.
- This re-licensing makes the code inaccessible and unmodifiable for the original contributors.
- Moderne withdrew from the Commonhaus Foundation (dedicated to open source) just before these changes.
- Moderne's justification is the fear that large companies would use OpenRewrite without contributing, creating competition.
- Significant community contributions (VMware, Alibaba) made under Apache 2.0 were re-licensed without their consent.
- The legality of this re-licensing is uncertain without a CLA from the contributors.
- This move sets a dangerous precedent for future contributors and damages trust in the OpenRewrite ecosystem.

Moderne's corrections (following the backlash):
- The original Apache repositories were restored and archived.
- Major version bumps were used to signal the license changes.
- Separate namespaces (org.openrewrite vs. io.moderne) were created to distinguish the modules.

The author's suggested remedies:
- Revert the license changes on all community recipes.
- Engage in dialogue and communicate major changes publicly.
- Respect semantic versioning (major versions for license changes).
Apple's former design guru Jony Ive to take an expansive role at OpenAI
- OpenAI will acquire Ive's startup for 6.5 billion dollars, while Ive and CEO Sam Altman work on a new generation of devices and other AI products https://www.wsj.com/tech/ai/former-apple-design-guru-jony-ive-to-take-expansive-role-at-openai-5787f7da

Beginners' corner

An article for beginners on the link between source code, bytecode, and debugging https://blog.jetbrains.com/idea/2025/05/sources-bytecode-debugging/
- the debugger sees the bytecode, and the link to the line or the method is potentially lost
- javac can add the lines and offsets of operations so the debugger can display them
- argument names can also be added to the .class file
- when you point at the wrong version of the source file, you get shifted lines; this is why
- there are few reasons not to enable these compilation options, although they make the class file a bit bigger

Conferences

The list of conferences comes from Developers Conferences Agenda/List by Aurélie Vache and contributors:
- June 11-13, 2025: Devoxx Poland - Krakow (Poland)
- June 12-13, 2025: Agile Tour Toulouse - Toulouse (France)
- June 12-13, 2025: DevLille - Lille (France)
- June 13, 2025: Tech F'Est 2025 - Nancy (France)
- June 17, 2025: Mobilis In Mobile - Nantes (France)
- June 19-21, 2025: Drupal Barcamp Perpignan 2025 - Perpignan (France)
- June 24, 2025: WAX 2025 - Aix-en-Provence (France)
- June 25, 2025: Rust Paris 2025 - Paris (France)
- June 25-26, 2025: Agi'Lille 2025 - Lille (France)
- June 25-27, 2025: BreizhCamp 2025 - Rennes (France)
- June 26-27, 2025: Sunny Tech - Montpellier (France)
- July 1-4, 2025: Open edX Conference - 2025 - Palaiseau (France)
- July 7-9, 2025: Riviera DEV 2025 - Sophia Antipolis (France)
- September 5, 2025: JUG Summer Camp 2025 - La Rochelle (France)
- September 12, 2025: Agile Pays Basque 2025 - Bidart (France)
- September 18-19, 2025: API Platform Conference - Lille (France)
& Online
- September 23, 2025: OWASP AppSec France 2025 - Paris (France)
- September 25-26, 2025: Paris Web 2025 - Paris (France)
- October 2-3, 2025: Volcamp - Clermont-Ferrand (France)
- October 3, 2025: DevFest Perros-Guirec 2025 - Perros-Guirec (France)
- October 6-7, 2025: Swift Connection 2025 - Paris (France)
- October 6-10, 2025: Devoxx Belgium - Antwerp (Belgium)
- October 7, 2025: BSides Mulhouse - Mulhouse (France)
- October 9, 2025: DevCon #25: quantum computing - Paris (France)
- October 9-10, 2025: Forum PHP 2025 - Marne-la-Vallée (France)
- October 9-10, 2025: EuroRust 2025 - Paris (France)
- October 16, 2025: PlatformCon25 Live Day Paris - Paris (France)
- October 16, 2025: Power 365 - 2025 - Lille (France)
- October 16-17, 2025: DevFest Nantes - Nantes (France)
- October 30-31, 2025: Agile Tour Bordeaux 2025 - Bordeaux (France)
- October 30-31, 2025: Agile Tour Nantais 2025 - Nantes (France)
- October 30-November 2, 2025: PyConFR 2025 - Lyon (France)
- November 4-7, 2025: NewCrafts 2025 - Paris (France)
- November 5-6, 2025: Tech Show Paris - Paris (France)
- November 6, 2025: dotAI 2025 - Paris (France)
- November 7, 2025: BDX I/O - Bordeaux (France)
- November 12-14, 2025: Devoxx Morocco - Marrakech (Morocco)
- November 13, 2025: DevFest Toulouse - Toulouse (France)
- November 15-16, 2025: Capitole du Libre - Toulouse (France)
- November 19, 2025: SREday Paris 2025 Q4 - Paris (France)
- November 20, 2025: OVHcloud Summit - Paris (France)
- November 21, 2025: DevFest Paris 2025 - Paris (France)
- November 27, 2025: DevFest Strasbourg 2025 - Strasbourg (France)
- November 28, 2025: DevFest Lyon - Lyon (France)
- December 5, 2025: DevFest Dijon 2025 - Dijon (France)
- December 10-11, 2025: Devops REX - Paris (France)
- December 10-11, 2025: Open Source Experience - Paris (France)
- January 28-31, 2026: SnowCamp 2026 - Grenoble (France)
- February 2-6, 2026: Web Days Convention - Aix-en-Provence (France)
- February 3, 2026: Cloud Native Days France 2026 - Paris (France)
- April 23-25, 2026: Devoxx Greece -
Athens (Greece)
- June 17, 2026: Devoxx Poland - Krakow (Poland)

Contact us
- To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs
- Contact us via X/Twitter https://twitter.com/lescastcodeurs or Bluesky https://bsky.app/profile/lescastcodeurs.com
- Do a crowdcast or ask a crowdquestion
- Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs
- All episodes and all the info at https://lescastcodeurs.com/

Infinite Machine Learning
Building an AI+Data Startup Studio | Tom Chavez, cofounder of super{set}

Infinite Machine Learning

Play Episode Listen Later Jun 10, 2025 52:57


Tom Chavez is the cofounder of super{set}, a startup studio that founds, funds, and builds data and AI startups. Prior to this, he was the CEO and co-founder of Krux, a martech platform acquired by Salesforce in 2016. Before Krux, he was the CEO and co-founder of Rapt, a provider of software for media monetization acquired by Microsoft in 2008. He went to Harvard for undergrad and Stanford for his PhD.Tom's favorite book: The Three Musketeers (Author: Alexandre Dumas)(00:01) Origin Story and Starting Superset(02:58) How Superset Evaluates Ideas and Risk(06:24) What Is a Venture Studio and How Superset Works(10:49) Underfunded Layers in AI Infrastructure(14:55) Orchestration Opportunities in LLM Workflows(15:49) The Future of Data Infra and ETL in the AI Era(20:46) Code Infra: Code Quality and AI-Generated Software(24:55) Model Infra, MLOps, and Why It's Underwhelming(27:22) Cloud Economics and Gross Margins in AI Companies(32:15) Early Team Structure in AI Infra Startups(34:49) Full Stack vs Composable Infra in AI(37:52) Fragmentation vs Consolidation in AI Tooling(41:02) Where Moats Will Accumulate: Data In, AI, Data Out(45:10) Biggest Challenge in Building Superset(46:23) Rapid Fire Round--------Where to find Tom Chavez: LinkedIn: https://www.linkedin.com/in/tommychavez/--------Where to find Prateek Joshi: Newsletter: https://prateekjoshi.substack.com Website: https://prateekj.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 X: https://x.com/prateekvjoshi 

The Analytics Engineering Podcast
The history and future of the data ecosystem (w/ Lonne Jaffe)

The Analytics Engineering Podcast

Play Episode Listen Later Jun 8, 2025 53:53


In this decades-spanning episode, Tristan Handy sits down with Lonne Jaffe, Managing Director at Insight Partners and former CEO of Syncsort (now Precisely), to trace the history of the data ecosystem—from its mainframe origins to its AI-infused future. Lonne reflects on the evolution of ETL, the unexpected staying power of legacy tech, and why AI may finally erode the switching costs that have long protected incumbents. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

Medical Spa Insider
Unlocking Med Spa Growth with Smarter Data

Medical Spa Insider

Play Episode Listen Later May 28, 2025 41:40


Thiersch, JD, speaks with Alex Lirtsman, founder and CEO of CorralData, to explore how medical spas can unlock real-time, HIPAA-compliant insights without changing their existing systems. CorralData integrates everything from your EMR to marketing, payroll, and finance systems, giving med spas the ability to uncover actionable insights that drive profitability, patient retention, and scalable growth. Listen for strategies to ask smarter questions of your data, including: Integrating all of your existing platforms to get actionable insights from your data; How multi-location practices, med spa rollups and private equity develop playbooks; Navigating HIPAA and BAAs with AI companies to create secure data analysis tools; Using reverse ETL to optimize for high lifetime value patients and boost profitability; The questions you can ask your data with conversational AI and large language models; CorralData's tailor-made solutions for Advanced MedAesthetic Partners, and more! -- Music by Ghost Score

Great Things with Great Tech!
The End of Stale AI Data with Snow Leopard | Episode #99

Great Things with Great Tech!

Play Episode Listen Later May 12, 2025 41:07


We all talk about #AI, but what good is it if your models are powered by stale, outdated data? In Episode 99 of Great Things with Great Tech, Deepti Srivastava, founder and CEO of Snow Leopard, and former founding PM of Google Spanner, calls out the broken state of enterprise AI. With decades of experience in distributed systems and data infrastructure, Deepti unveils how Snow Leopard is redefining how AI applications are built, by tapping into live, real-time data from SQL and APIs without the need for ETL or pipelines. Instead of relying on static snapshots or disconnected data lakes, Snow Leopard's #agentic platform queries native sources like PostgreSQL, Snowflake, and Salesforce on demand, empowering AI to live directly in the critical decision path.
In This Episode, We Cover:
- Deepti's journey from building Spanner at Google to founding Snow Leopard AI.
- Why most enterprise AI fails due to reliance on stale data and outdated pipelines.
- How Snow Leopard federates live data across SQL and APIs with zero ETL.
- The limitations of vector databases in structured, real-time business use cases.
- Why putting AI in the critical path of business decisions unlocks real value.
Snow Leopard is a U.S.-based technology company founded in 2023 by Deepti Srivastava and headquartered in San Francisco, California. Snow Leopard specializes in building a platform that enables the development of production-ready AI applications by leveraging live business data. The company's approach focuses on real-time data retrieval directly from sources like SQL databases and APIs, eliminating the need for traditional ETL processes and data pipelines.
This innovation allows for more accurate and timely AI-driven business decisions.
PODCAST LINKS
- Great Things with Great Tech Podcast: https://gtwgt.com
- GTwGT Playlist on YouTube: https://www.youtube.com/@GTwGTPodcast
- Listen on Spotify: https://open.spotify.com/show/5Y1Fgl4DgGpFd5Z4dHulVX
- Listen on Apple Podcasts: https://podcasts.apple.com/us/podcast/great-things-with-great-tech-podcast/id1519439787
EPISODE LINKS
- Snow Leopard Web: https://www.snowleopard.ai/
- Deepti Srivastava on LinkedIn: https://www.linkedin.com/in/thedeepti/
- Snow Leopard on LinkedIn: https://www.linkedin.com/company/snow-leopard-ai/
GTwGT LINKS
- Support the Channel: https://ko-fi.com/gtwgt
- Be on #GTwGT: Contact via Twitter/X @GTwGTPodcast or visit https://www.gtwgt.com
- Subscribe to YouTube: https://www.youtube.com/@GTwGTPodcast?sub_confirmation=1
- Great Things with Great Tech Podcast Website: https://gtwgt.com
SOCIAL LINKS
Follow GTwGT on Social Media:
- Twitter/X: https://twitter.com/GTwGTPodcast
- Instagram: https://www.instagram.com/GTwGTPodcast
- TikTok: https://www.tiktok.com/@GTwGTPodcast

Choses à Savoir
Pourquoi La Joconde n'a-t-elle pas de sourcils ?

Choses à Savoir

Play Episode Listen Later May 5, 2025 2:12


If you look closely at the Mona Lisa, Leonardo da Vinci's famous painting on display at the Louvre, one detail stands out immediately: she has neither eyebrows nor eyelashes. A face of incredible precision, an almost lifelike gaze… but a completely bare brow. How can this absence be explained?
A Renaissance fashion?
For a long time, the absence of eyebrows was thought to be simply a matter of the fashion of the era. In early 16th-century Italy, some aristocratic women plucked their eyebrows (and sometimes their hairline) to clear the forehead, then considered a sign of beauty and nobility. According to this hypothesis, Mona Lisa (or Lisa Gherardini, if one follows the majority view) may have followed this aesthetic trend.
But this explanation does not fully hold: other portraits of women from the same period clearly show eyebrows, even thin or subtle ones. And would Leonardo da Vinci, known for his obsession with realism, really have deliberately omitted such a detail?
A gradual disappearance
The most credible explanation today rests on the material history of the painting. The Mona Lisa is more than 500 years old, and over the centuries it has undergone restorations, cleanings, and varnishings that may have worn away the finest details.
A scientific study carried out in 2004 by the specialist Pascal Cotte, using multispectral reflectography technology, revealed that Leonardo had originally painted eyebrows and eyelashes, very fine and delicate. But these details would have disappeared over time, due to natural wear of the paint layer or overly aggressive restorations. In short, the eyebrows were there, but they faded away over the centuries.
An effect that reinforces the mystery
Paradoxically, the absence of eyebrows also contributes to the mystery and ambiguity of the Mona Lisa's face.
Her indefinable expression, that blend of smile and neutrality, is reinforced by the lack of facial lines that would normally frame the gaze. This blur contributes to the timeless, enigmatic character of the painting, which has fascinated viewers for centuries.
In short: the Mona Lisa probably had eyebrows, painted with Leonardo da Vinci's characteristic finesse. But time, restorations, and varnishes erased them. This forgotten detail has become a key element of her mystery. Hosted by Acast. Visit acast.com/privacy for more information.

The MongoDB Podcast
EP. 264 Beyond the Database: Mastering Multi-Cloud Data, AI Automation & Integration (feat. Peter Ngai, SnapLogic)

The MongoDB Podcast

Play Episode Listen Later May 1, 2025 58:31


✨ Heads up! This episode features a demonstration of the SnapLogic UI and its AI Agent Creator towards the end. For the full visual experience, check out the video version on the Spotify app! ✨(Episode Summary)Tired of tangled data spread across multiple clouds, on-premise systems, and the edge? In this episode, MongoDB's Shane McAllister sits down with Peter Ngai, Principal Architect at SnapLogic, to explore the future of data integration and management in today's complex tech landscape.Dive into the challenges and solutions surrounding modern data architecture, including:Navigating the complexities of multi-cloud and hybrid cloud environments.The secrets to building flexible, resilient data ecosystems that avoid vendor lock-in.Strategies for seamless data integration and connecting disparate applications using low-code/no-code platforms like SnapLogic.Meeting critical data compliance, security, and sovereignty demands (think GDPR, HIPAA, etc.).How AI is revolutionizing data automation and providing faster access to insights (featuring SnapLogic's Agent Creator).The powerful synergy between SnapLogic and MongoDB, leveraging MongoDB both internally and for customer integrations.Real-world applications, from IoT data processing to simplifying enterprise workflows.Whether you're an IT leader, data engineer, business analyst, or simply curious about cloud strategy, iPaaS solutions, AI in business, or simplifying your data stack, Peter offers invaluable insights into making data connectivity a driver, not a barrier, for innovation.-Keywords: Data Integration, Multi-Cloud, Hybrid Cloud, Edge Computing, SnapLogic, MongoDB, AI, Artificial Intelligence, Data Automation, iPaaS, Low-Code, No-Code, Data Architecture, Data Management, Cloud Data, Enterprise Data, API Integration, Data Compliance, Data Sovereignty, Data Security, Business Automation, ETL, ELT, Tech Stack Simplification, Peter Ngai, Shane McAllister.

Oracle University Podcast
What is Oracle GoldenGate 23ai?

Oracle University Podcast

Play Episode Listen Later Apr 29, 2025 18:03


In a new season of the Oracle University Podcast, Lois Houston and Nikita Abraham dive into the world of Oracle GoldenGate 23ai, a cutting-edge software solution for data management. They are joined by Nick Wagner, a seasoned expert in database replication, who provides a comprehensive overview of this powerful tool.   Nick highlights GoldenGate's ability to ensure continuous operations by efficiently moving data between databases and platforms with minimal overhead. He emphasizes its role in enabling real-time analytics, enhancing data security, and reducing costs by offloading data to low-cost hardware. The discussion also covers GoldenGate's role in facilitating data sharing, improving operational efficiency, and reducing downtime during outages.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ---------------------------------------------------------------   Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Nikita: Welcome to the Oracle University Podcast! I'm Nikita Abraham, Team Lead: Editorial Services with Oracle University, and with me is Lois Houston: Director of Innovation Programs. Lois: Hi everyone! Welcome to a new season of the podcast. This time, we're focusing on the fundamentals of Oracle GoldenGate. Oracle GoldenGate helps organizations manage and synchronize their data across diverse systems and databases in real time.  
And with the new Oracle GoldenGate 23ai release, we'll uncover the latest innovations and features that empower businesses to make the most of their data. Nikita: Taking us through this is Nick Wagner, Senior Director of Product Management for Oracle GoldenGate. He's been doing database replication for about 25 years and has been focused on GoldenGate on and off for about 20 of those years.  01:18 Lois: In today's episode, we'll ask Nick to give us a general overview of the product, along with some use cases and benefits. Hi Nick! To start with, why do customers need GoldenGate? Nick: Well, it delivers continuous operations, being able to continuously move data from one database to another database or data platform efficiently and at high speed, and it does this with very low overhead. Almost all the GoldenGate environments use transaction logs to pull the data out of the system, so we're not creating any additional triggers, and there's very little overhead on that source system. GoldenGate can also enable real-time analytics, being able to pull data from all these different databases and move them into your analytics system in real time can improve the value that those analytics systems provide. Being able to do real-time statistics and analysis of that data within those high-performance custom environments is really important. 02:13 Nikita: Does it offer any benefits in terms of cost?  Nick: GoldenGate can also lower IT costs. A lot of times people run these massive OLTP databases, and they are running reporting in those same systems. With GoldenGate, you can offload some of the data or all the data to low-cost commodity hardware where you can then run the reports on that other system. So, this way, you can get back that performance on the OLTP system, while at the same time optimizing your reporting environment for those long running reports. You can improve efficiencies and reduce risks.
Being able to reduce the amount of downtime during planned and unplanned outages can really make a big benefit to the overall operational efficiencies of your company.  02:54 Nikita: What about when it comes to data sharing and data security? Nick: You can also reduce barriers to data sharing. Being able to pull subsets of data, or just specific pieces of data out of a production database and move it to the team or to the group that needs that information in real time is very important. And it also protects the security of your data by only moving in the information that they need and not the entire database. It also provides extensibility and flexibility, being able to support multiple different replication topologies and architectures. 03:24 Lois: Can you tell us about some of the use cases of GoldenGate? Where does GoldenGate truly shine?  Nick: Some of the more traditional use cases of GoldenGate include use within the multicloud fabric. Within a multicloud fabric, this essentially means that GoldenGate can replicate data between on-premise environments, within cloud environments, or hybrid, cloud to on-premise, on-premise to cloud, or even within multiple clouds. So, you can move data from AWS to Azure to OCI. You can also move between the systems themselves, so you don't have to use the same database in all the different clouds. For example, if you wanted to move data from AWS Postgres into Oracle running in OCI, you can do that using Oracle GoldenGate. We also support maximum availability architectures. And so, there's a lot of different use cases here, but primarily geared around reducing your recovery point objective and recovery time objective. 04:20 Lois: Ah, reducing RPO and RTO. That must have a significant advantage for the customer, right? 
Nick: So, reducing your RPO and RTO allows you to take advantage of some of the benefits of GoldenGate, being able to do active-active replication, being able to set up GoldenGate for high availability, real-time failover, and it can augment your active Data Guard and Data Guard configuration. So, a lot of times GoldenGate is used within Oracle's maximum availability architecture platinum tier level of replication, which means that at that point you've got lots of different capabilities within the Oracle Database itself. But to help eke out that last little bit of high availability, you want to set up an active-active environment with GoldenGate to really get true zero RPO and RTO. GoldenGate can also be used for data offloading and data hubs. Being able to pull data from one or more source systems and move it into a data hub, or into a data warehouse for your operational reporting. This could also be your analytics environment too. 05:22 Nikita: Does GoldenGate support online migrations? Nick: In fact, a lot of companies actually get started in GoldenGate by doing a migration from one platform to another. Now, these don't even have to be something as complex as going from one database like a DB2 on-premise into an Oracle on OCI, it could even be simple migrations. A lot of times doing something like a major application or a major database version upgrade is going to take downtime on that production system. You can use GoldenGate to eliminate that downtime. So this could be going from Oracle 19c to Oracle 23ai, or going from application version 1.0 to application version 2.0, because GoldenGate can do the transformation between the different application schemas. You can use GoldenGate to migrate your database from on premise into the cloud with no downtime as well. 
We also support real-time analytic feeds, being able to go from multiple databases, not only those on premise, but being able to pull information from different SaaS applications inside of OCI and move it to your different analytic systems. And then, of course, we also have the ability to stream events and analytics within GoldenGate itself.  06:34 Lois: Let's move on to the various topologies supported by GoldenGate. I know GoldenGate supports many different platforms and can be used with just about any database. Nick: This first layer of topologies is what we usually consider relational database topologies. And so this would be moving data from Oracle to Oracle, Postgres to Oracle, Sybase to SQL Server, a lot of different types of databases. So the first architecture would be unidirectional. This is replicating from one source to one target. You can do this for reporting. If I wanted to offload some reports into another server, I can go ahead and do that using GoldenGate. I can replicate the entire database or just a subset of tables. I can also set up GoldenGate for bidirectional, and this is what I want to set up GoldenGate for something like high availability. So in the event that one of the servers crashes, I can almost immediately reconnect my users to the other system. And that almost immediately depends on the amount of latency that GoldenGate has at that time. So a typical latency is anywhere from 3 to 6 seconds. So after that primary system fails, I can reconnect my users to the other system in 3 to 6 seconds. And I can do that because as GoldenGate's applying data into that target database, that target system is already open for read and write activity. GoldenGate is just another user connecting in issuing DML operations, and so it makes that failover time very low. 07:59 Nikita: Ok…If you can get it down to 3 to 6 seconds, can you bring it down to zero? Like zero failover time?   Nick: That's the next topology, which is active-active. 
And in this scenario, all servers are read/write all at the same time and all available for user activity. And you can do multiple topologies with this as well. You can do a mesh architecture, which is where every server talks to every other server. This works really well for 2, 3, 4, maybe even 5 environments, but when you get beyond that, having every server communicate with every other server can get a little complex. And so at that point we start looking at doing what we call a hub and spoke architecture, where we have lots of different spokes. At the end of each spoke is a read/write database, and then those communicate with a hub. So any change that happens on one spoke gets sent into the hub, and then from the hub it gets sent out to all the other spokes. And through that architecture, it allows you to really scale up your environments. We have customers that are doing up to 150 spokes within that hub architecture. Within active-active replication as well, we can do conflict detection and resolution, which means that if two users modify the same row on two different systems, GoldenGate can actually determine that there was an issue with that and determine what user wins or which row change wins, which is extremely important when doing active-active replication. And this means that if one of those systems fails, there is no downtime when you switch your users to another active system because it's already available for activity and ready to go. 09:35 Lois: Wow, that's fantastic. Ok, tell us more about the topologies. Nick: GoldenGate can do other things like broadcast, sending data from one system to multiple systems, or many to one as far as consolidation. We can also do cascading replication, so when data moves from one environment that GoldenGate is replicating into another environment that GoldenGate is replicating. By default, we ignore all of our own transactions. 
But there's actually a toggle switch that you can flip that says, hey, GoldenGate, even though you wrote that data into that database, still push it on to the next system. And then of course, we can also do distribution of data, and this is more like moving data from a relational database into something like a Kafka topic or a JMS queue or into some messaging service. 10:24 Raise your game with the Oracle Cloud Applications skills challenge. Get free training on Oracle Fusion Cloud Applications, Oracle Modern Best Practice, and Oracle Cloud Success Navigator. Pass the free Oracle Fusion Cloud Foundations Associate exam to earn a Foundations Associate certification. Plus, there's a chance to win awards and prizes throughout the challenge! What are you waiting for? Join the challenge today by visiting visit oracle.com/education. 10:58 Nikita: Welcome back! Nick, does GoldenGate also have nonrelational capabilities?  Nick: We have a number of nonrelational replication events in topologies as well. This includes things like data lake ingestion and streaming ingestion, being able to move data and data objects from these different relational database platforms into data lakes and into these streaming systems where you can run analytics on them and run reports. We can also do cloud ingestion, being able to move data from these databases into different cloud environments. And this is not only just moving it into relational databases with those clouds, but also their data lakes and data fabrics. 11:38 Lois: You mentioned a messaging service earlier. Can you tell us more about that? Nick: Messaging replication is also possible. So we can actually capture from things like messaging systems like Kafka Connect and JMS, replicate that into a relational data, or simply stream it into another environment. 
We also support NoSQL replication, being able to capture from MongoDB and replicate it onto another MongoDB for high availability or disaster recovery, or simply into any other system. 12:06 Nikita: I see. And is there any integration with a customer's SaaS applications? Nick: GoldenGate also supports a number of different OCI SaaS applications. And so a lot of these different applications like Oracle Financials Fusion, Oracle Transportation Management, they all have GoldenGate built under the covers and can be enabled with a flag that you can actually have that data sent out to your other GoldenGate environment. So you can actually subscribe to changes that are happening in these other systems with very little overhead. And then of course, we have event processing and analytics, and this is the final topology or flexibility within GoldenGate itself. And this is being able to push data through data pipelines, doing data transformations. GoldenGate is not an ETL tool, but it can do row-level transformation and row-level filtering.  12:55 Lois: Are there integrations offered by Oracle GoldenGate in automation and artificial intelligence? Nick: We can do time series analysis and geofencing using the GoldenGate Stream Analytics product. It allows you to actually do real time analysis and time series analysis on data as it flows through the GoldenGate trails. And then that same product, the GoldenGate Stream Analytics, can then take the data and move it to predictive analytics, where you can run MML on it, or ONNX or other Spark-type technologies and do real-time analysis and AI on that information as it's flowing through.  13:29 Nikita: So, GoldenGate is extremely flexible. And given Oracle's focus on integrating AI into its product portfolio, what about GoldenGate? Does it offer any AI-related features, especially since the product name has “23ai” in it? 
Nick: With the advent of Oracle GoldenGate 23ai, it's one of the two products at this point that has the AI moniker at Oracle. Oracle Database 23ai also has it, and that means that we actually do stuff with AI. So the Oracle GoldenGate product can actually capture vectors from databases like MySQL HeatWave, Postgres using pgvector, which includes things like AlloyDB, Amazon RDS Postgres, Aurora Postgres. We can also replicate data into Elasticsearch and OpenSearch, or if the data is using vectors within OCI or the Oracle Database itself. So GoldenGate can be used for a number of things here. The first one is being able to migrate vectors into the Oracle Database. So if you're using something like Postgres, MySQL, and you want to migrate the vector information into the Oracle Database, you can. Now one thing to keep in mind here is a vector is oftentimes like a GPS coordinate. So if I need to know the GPS coordinates of Austin, Texas, I can put in a latitude and longitude and it will give me the GPS coordinates of a building within that city. But if I also need to know the altitude of that same building, well, that's going to be a different algorithm. And GoldenGate and replicating vectors is the same way. When you create a vector, it's essentially just creating a bunch of numbers under the covers, kind of like those same GPS coordinates. The dimension and the algorithm that you use to generate that vector can be different across different databases, and if they differ, the actual meaning of that data will change. And so GoldenGate can replicate the vector data as long as the algorithm and the dimensions are the same. If the algorithm and the dimensions are not the same between the source and the target, then you'll actually want GoldenGate to replicate the base data that created that vector. And then once GoldenGate replicates the base data, it'll actually call the vector embedding technology to re-embed that data and produce that numerical formatting for you.  
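The rule Nick describes can be sketched in a few lines: copy the stored vector only when the embedding algorithm and dimension count match on both sides; otherwise replicate the base data and re-embed it on the target. This is a hypothetical illustration of that decision, not a GoldenGate API; `embed` and the model descriptors are stand-ins.

```python
# Hypothetical sketch: a vector is only portable between systems when the
# source and target used the same embedding algorithm AND dimension count.
# Otherwise the numbers look similar but mean something different, so the
# safe move is to replicate the base text and re-embed it on the target.

def replicate_vector_row(row, source_model, target_model, embed):
    """Decide whether to copy a stored vector or regenerate it on the target."""
    same_algorithm = source_model["algorithm"] == target_model["algorithm"]
    same_dimensions = source_model["dimensions"] == target_model["dimensions"]

    if same_algorithm and same_dimensions:
        # Safe: the numbers mean the same thing on both sides.
        return {"text": row["text"], "vector": row["vector"]}

    # Unsafe: replicate the base data and call the target's embedding
    # technology to produce a vector that is meaningful there.
    return {"text": row["text"], "vector": embed(row["text"], target_model)}
```

The GPS analogy maps directly: a matching algorithm and dimension is a matching coordinate system; anything else requires recomputing the coordinates from the original data.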
15:42 Lois: So, there are some nuances there… Nick: GoldenGate can also replicate and consolidate vector changes or even do the embedding API calls itself. This is really nice because it means that we can take changes from multiple systems and consolidate them into a single one. We can also do the reverse of that too. A lot of customers are still trying to find out which algorithms work best for them. How many dimensions? What's the optimal use? Well, you can now run those in different servers without impacting your actual AI system. Once you've identified which algorithm and dimension is going to be best for your data, you can then have GoldenGate replicate that into your production system and we'll start using that instead. So it's a nice way to switch algorithms without taking extensive downtime. 16:29 Nikita: What about in multicloud environments?  Nick: GoldenGate can also do multicloud and N-way active-active Oracle replication between vectors. So if there's vectors in Oracle databases, in multiple clouds, or multiple on-premise databases, GoldenGate can synchronize them all up. And of course we can also stream changes from vector information, including text as well into different search engines. And that's where the integration with Elasticsearch and OpenSearch comes in. And then we can use things like NVIDIA and Cohere to actually do the AI on that data.  17:01 Lois: Using GoldenGate with AI in the database unlocks so many possibilities. Thanks for that detailed introduction to Oracle GoldenGate 23ai and its capabilities, Nick.  Nikita: We've run out of time for today, but Nick will be back next week to talk about how GoldenGate has evolved over time and its latest features. And if you liked what you heard today, head over to mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course to learn more. Until next time, this is Nikita Abraham… Lois: And Lois Houston, signing off! 
17:33 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
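The conflict detection and resolution Nick mentions for active-active replication comes down to picking a deterministic winner when two sites modify the same row. Below is a minimal last-writer-wins sketch; GoldenGate's actual CDR policies are configurable and richer than this, and the record shape here is invented for illustration.

```python
# Minimal last-writer-wins conflict resolver for active-active replication.
# Every node must pick the same winner, so ties on timestamp are broken by
# a stable secondary key (the site id).

def resolve_conflict(local_change, remote_change):
    """Return the winning change for one row: latest timestamp wins,
    site id breaks ties deterministically."""
    key_local = (local_change["updated_at"], local_change["site"])
    key_remote = (remote_change["updated_at"], remote_change["site"])
    return local_change if key_local >= key_remote else remote_change
```

Because the comparison is deterministic, every node in a mesh or hub-and-spoke topology converges on the same row value without coordination, which is what makes zero-downtime failover possible.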

Data Transforming Business
Building Trust in Data: Transparency, Collaboration, and Governance for Successful AI


Play Episode Listen Later Apr 14, 2025 22:59


"So you want trusted data, but you want it now? Building this trust really starts with transparency and collaboration. It's not just technology. It's about creating a single governed view of data that is consistent no matter who accesses it," says Errol Rodericks, Director of Product Marketing at Denodo. In this episode of the 'Don't Panic, It's Just Data' podcast, Shawn Rogers, CEO at BARC US, speaks with Errol Rodericks from Denodo. They explore the crucial link between trusted data and successful AI initiatives. They discuss key factors such as data orchestration, governance, and cost management within complex cloud environments. We've all heard the horror stories – AI projects that fail spectacularly, delivering biased or inaccurate results. But what's the root cause of these failures? More often than not, it's a lack of focus on the data itself. Rodericks emphasises that "AI is only as good as the data it's trained on." This episode explores how organisations can avoid the "garbage in, garbage out" scenario by prioritising data quality, lineage, and responsible AI practices. Learn how to avoid AI failures and discover strategies for building an AI-ready data foundation that ensures trusted, reliable outcomes. 
Key topics include overcoming data bias, ETL processes, and improving data sharing practices.

Takeaways:
- Bad data leads to bad AI outputs.
- Trust in data is essential for effective AI.
- Organisations must prioritise data quality and orchestration.
- Transparency and collaboration are key to building trust in data.
- Compliance is a responsibility for the entire organisation, not just IT.
- Agility in accessing data is crucial for AI success.

Chapters:
00:00 The Importance of Data Quality in AI
02:57 Building Trust in Data Ecosystems
06:11 Navigating Complex Data Landscapes
09:11 Top-Down Pressure for AI Strategy
11:49 Responsible AI and Data Governance
15:08 Challenges in Personalisation and Compliance
17:47 The Role of Speed in Data Utilisation
20:47 Advice for CFOs on AI Investments

About Denodo
Denodo is a leader in data management. The award-winning Denodo Platform is the leading logical data management platform for transforming data into trustworthy insights and outcomes for all data-related initiatives across the enterprise, including AI and self-service. Denodo's customers in all industries all over the world have delivered trusted AI-ready and business-ready data in a third of the time and with 10x better performance than with lakehouses and other mainstream data platforms alone.

The MongoDB Podcast
EP.260 Vector Search Secrets Revealed! - AI-Powered Image Search with MongoDB - Live Demo


Play Episode Listen Later Mar 26, 2025 57:33


Ever wondered how companies like Amazon or Pinterest deliver lightning-fast image search? Dive into this episode of MongoDB Podcast Live with Shane McAllister and Nenad, a MongoDB Champion, as they unravel the magic of semantic image search powered by MongoDB Atlas Vector Search!

Raw Data By P3
Data, Data, and Metadata: Letting ChatGPT Interpret Power BI Output


Play Episode Listen Later Mar 25, 2025 16:17


What happens when you hand off your Power BI output to ChatGPT and ask it to make sense of your world? You might be surprised. This week, Rob shares a deeply personal use case. One that ties together two major themes we've been exploring:
- Gen AI is reshaping the way we think about dashboards.
- To get real value out of AI, you need more than just data. You need metadata.
And yes, that kind of metadata—the kind you create in Power BI when you translate raw data into something meaningful. Along the way, we revisit the old guard of data warehousing. The mighty (and now dusty?) ETL priesthood. And we uncover a delicious little irony about how the future of data looks a lot like its past, just with better tools and smarter questions. The big twist? We're all ETL now. But the "T" might not mean what you think it does anymore. Listen now to find out how a few rows of carefully modeled data, a table visual, and one really good AI assistant changed the game. For Rob and, just possibly, for all of us.

Also in this episode:
- Blind Melon – Change (YouTube)
- The Data Warehouse Toolkit
- Raw Data Episode - The Human Side of Data: Using Analytics for Personal Well-Being

HVAC School - For Techs, By Techs
A Conversation with NAVAC at AHR 2025


Play Episode Listen Later Mar 7, 2025 46:23


In this engaging episode of the HVAC School Podcast, host Bryan sits down with Jesse from NAVAC to dive deep into the evolving landscape of refrigeration technology, focusing primarily on the transition to A2L refrigerants. The conversation offers a refreshingly pragmatic approach to addressing industry concerns about these new, mildly flammable refrigerants, dispelling myths and providing practical insights for HVAC technicians. The discussion begins by addressing the most pressing question for many technicians: Do you need to buy all new tools to work with A2L refrigerants? Jesse from NAVAC provides a nuanced response, emphasizing that while there are currently no regulations mandating new equipment, the company has proactively developed tools that are safety-certified and compatible with the new refrigerant types. They explore the intricacies of safety certifications like UL and CSA, explaining the differences between UL Listed and UL Verified, and highlighting the importance of intrinsically safe equipment, especially for tools like vacuum pumps and recovery machines. NAVAC's approach goes beyond mere product promotion, with Jesse positioning himself as an educator first. The podcast delves into the technical details of A2L refrigerants, challenging common misconceptions and providing context about their flammability. Bryan and Jesse draw parallels with previous refrigerant transitions, noting how technicians were initially skeptical about R-410A but eventually adapted. They emphasize the importance of best practices, proper training, and understanding the actual risks associated with these new refrigerants, rather than succumbing to fear-based narratives. The episode also showcases NAVAC's latest technological innovations, including smart probes, a Bluetooth scale, a smart valve for charging and recovery, and an advanced vacuum pump with a one-touch oil testing feature. 
These tools represent the company's commitment to improving technician efficiency and safety, with features that address real-world challenges faced by HVAC professionals.

Key Topics Covered:

A2L Refrigerants
- Myths and misconceptions about flammability
- Comparison with previous refrigerant transitions
- Safety considerations and best practices

Safety Certifications
- Differences between UL Listed and UL Verified
- Importance of intrinsically safe equipment
- CSA and ETL certifications

NAVAC's New Tools
- Smart probes with Bluetooth connectivity
- Advanced vacuum pump with automatic oil testing
- Flex manifold with digital accuracy and analog feel
- Battery-operated pumps with improved run times

Industry Trends
- Preparation for A2L and future refrigerant transitions
- Regulatory changes and efficiency standards
- Importance of technician education and adaptation

Additional Insights:
- No current regulations require new tools for A2L refrigerants
- Proper training and best practices are crucial
- Technicians should focus on understanding new technologies
- Safety is about awareness and proper procedures, not fear

Have a question that you want us to answer on the podcast? Submit your questions at https://www.speakpipe.com/hvacschool. Purchase your tickets or learn more about the 6th Annual HVACR Training Symposium at https://hvacrschool.com/symposium. Subscribe to our podcast on your iPhone or Android. Subscribe to our YouTube channel. Check out our handy calculators here or on the HVAC School Mobile App for Apple and Android

The Marketing Hero Podcast
Integrating Your CRM With Your Data Warehouse


Play Episode Listen Later Mar 4, 2025 38:29


Companies have crucial data stored across multiple systems in their organization. And the bigger the company, the more systems there are. Sales, marketing, finance, ERP, inventory, contract management, billing, and service delivery are just some of the data types and systems that tell the story of the business. Many times your CRM just by itself can't show all of that! Because of this, many companies have started setting up a centralized data warehouse like Snowflake or Redshift to pull in data from their CRM and other systems to be able to run more advanced and centralized reporting across it all. But if you want to set this up, where do you start? How do you manage it? How do you protect it? How do you keep it maintained? How do you actually derive value from all the hard work of implementing it? I contacted Ryan Severns to talk through all of this with him. We cover these questions and more, starting with:
- Why set up a CRM data warehouse infrastructure in the first place?
- How do you build the integration?
- What important considerations do you need to plan for?
We dig deeper into the details from there, talking through topics like ETL tooling, DBT, data governance, cross-platform analytics, building effective business intelligence systems, and more. If you're ready to level up your customer data skills and advance to the next level of your RevOps hero journey, this episode is for you! Give it a watch and a like. And hit that subscribe button so that you'll always get notified of future episodes of The RevOps Hero Podcast as well.

Microsoft Mechanics Podcast
Connect to any data with Shortcuts, Mirroring and Data Factory using Microsoft Fabric


Play Episode Listen Later Mar 3, 2025 16:21 Transcription Available


Easily access and unify your data for analytics and AI—no matter where it lives. With OneLake in Microsoft Fabric, you can connect to data across multiple clouds, databases, and formats without duplication. Use the OneLake catalog to quickly find and interact with your data, and let Copilot in Fabric help you transform and analyze it effortlessly. Eliminate barriers to working with your data using Shortcuts to virtualize external sources and Mirroring to keep databases and warehouses in sync—all without ETL. For deeper integration, leverage Data Factory's 180+ connectors to bring in structured, unstructured, and real-time streaming data at scale. Maraki Ketema from the Microsoft Fabric team shows how to combine these methods, ensuring fast, reliable access to quality data for analytics and AI workloads.

► QUICK LINKS:
00:00 - Access data wherever it lives
00:42 - Microsoft Fabric background
01:17 - Manage data with Microsoft Fabric
03:04 - Low latency
03:34 - How Shortcuts work
06:41 - Mirroring
08:10 - Open mirroring
08:40 - Low friction ways to bring data in
09:32 - Data Factory in Microsoft Fabric
10:52 - Build out your data flow
11:49 - Use built-in AI to ask questions of data
12:56 - OneLake catalog
13:36 - Data security & compliance
15:10 - Additional options to bring data in
15:42 - Wrap up

► Link References
Watch our show on Real-Time Intelligence at https://aka.ms/MechanicsRTI
Check out Open Mirroring at https://aka.ms/FabricOpenMirroring

► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. 
• Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries • Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog • Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast ► Keep getting this insider knowledge, join us on social: • Follow us on Twitter: https://twitter.com/MSFTMechanics • Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/ • Enjoy us on Instagram: https://www.instagram.com/msftmechanics/ • Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics  

The Agency Profit Podcast
Why Most Reporting Systems Fail, & What to Do Instead, With Ben Zittlau


Play Episode Listen Later Feb 12, 2025 51:15


Points of Interest
01:00 – 01:45 – Guest Introduction: Marcel welcomes back his co-founder, Ben Zittlau, highlighting his expertise in data operations and agency growth strategies.
03:35 – 05:50 – Scaling Data Operations: Lessons from Jobber: Ben shares insights from his experience at Jobber, detailing the challenges of building and scaling a data operations team in a fast-growing company.
08:43 – 13:14 – Why Agencies Struggle to Get Insights from Their Data: Discussion on how agencies collect vast amounts of data across multiple tools but fail to derive meaningful insights due to fragmentation and inconsistency.
13:15 – 16:05 – The Myth of a "Single Source of Truth": Marcel and Ben challenge the common belief that pushing all data into a single platform solves reporting issues, highlighting the reality of messy operational data.
19:26 – 21:48 – The Limitations of All-in-One Software Solutions: Exploring why all-in-one agency management tools often fail to deliver on their promise of seamless reporting and data integration.
24:43 – 27:44 – The Hidden Costs of Locking into a Single Platform: Discussion on how agencies become "trapped" by software providers, making it difficult to switch tools without major operational disruptions.
30:29 – 35:53 – How to Integrate Data Without Sacrificing Flexibility: A deep dive into the challenges of stitching data from various tools while maintaining adaptability and historical accuracy.
35:54 – 41:14 – Accuracy vs. Precision: Why Clean Data is a Myth: Why agencies should focus on broader trends instead of pursuing impossible data perfection, and how to handle data inconsistencies effectively.
41:15 – 44:15 – A Modern Data Approach: Extract, Transform, Load (ETL): Introduction to the ETL process, which allows agencies to clean and transform data before reporting, improving reliability and flexibility.
50:14 – 52:00 – Lessons from Finance: What Agencies Can Learn from Accounting: Marcel compares data operations to bookkeeping, explaining how structured financial workflows can serve as a model for better agency data management.

Show Notes
- Connect with Ben via LinkedIn
- Free Toolkit
- Parakeeto Foundations Course
- Love the Podcast: Leave us a review here.
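The extract-transform-load approach discussed in the episode can be sketched end to end in a few lines. Every name below (the tools, fields, and functions) is invented for illustration; the point is the shape of the pipeline: pull rows from each system, normalize them into one schema, then load them into the warehouse for reporting.

```python
# Toy ETL pass: two hypothetical source systems (a CRM and a billing tool)
# feed one normalized "warehouse" list that reports can query consistently.

def extract():
    # Stand-ins for API pulls from each tool.
    crm_deals = [{"deal": "Acme", "amount_usd": "1200"}]
    billing = [{"customer": "Acme", "invoiced": 900}]
    return crm_deals, billing

def transform(crm_deals, billing):
    # Normalize field names and types so reports can join across tools.
    rows = []
    for d in crm_deals:
        rows.append({"account": d["deal"], "kind": "pipeline",
                     "amount": float(d["amount_usd"])})
    for b in billing:
        rows.append({"account": b["customer"], "kind": "invoiced",
                     "amount": float(b["invoiced"])})
    return rows

def load(rows, warehouse):
    # In practice this is an insert into Snowflake/Redshift/etc.
    warehouse.extend(rows)
    return warehouse

warehouse = load(transform(*extract()), [])
```

Transforming before loading is what lets the reporting layer stay simple: by the time rows land in the warehouse, every tool's quirks have already been smoothed into one schema.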

SQL Data Partners Podcast
Episode 284: The Four-Letter Word ETL - Data Movement


Play Episode Listen Later Feb 4, 2025 58:57


Once you have your data stored in OneLake, you'll be ready to start transforming it to improve its usability, accuracy, and efficiency.    In this episode of the podcast, Belinda Allen takes us on a delightful journey through Data Flows, Power Query, Azure Data Factory (Pipelines), and discusses the merits of shortcuts. We also learn about a handy way to manually upload a table if you have some static data you need to update.  There are many tools and techniques that can be used for data ingestion and transformations. And while some of these options we discuss will be up to individual preference, there are pros and cons to each. One of the blessings and curses of Fabric is that there are many ways of achieving the same result, so what you choose may depend on the nature of the data you have and your goals, but might also be dictated by personal experience.  We hope you enjoyed this conversation with Belinda on ingesting and transforming data in Microsoft Fabric. If you have questions or comments, please send them our way. We would love to answer your questions on a future episode. Leave us a comment and some love ❤️ on LinkedIn, X, Facebook, or Instagram. The show notes for today's episode can be found at Episode 284: The Four-Letter Word ETL - Data Movement. Have fun on the SQL Trail!

Voice of the DBA
Data Debt


Play Episode Listen Later Jan 31, 2025 3:33


I had never heard of data debt until I saw this article on the topic. In reading it, I couldn't help thinking that most everyone has data debt, it creates inefficiencies, and it's unlikely we'll get rid of it. And by the way, it's too late to get this under control. I somewhat dismissed the article when I saw this: "addressing data debt in its early stages is crucial to ensure that it does not become an overwhelming barrier to progress." I know it's a barrier, as I assume most of you also know, but it's also not stopping us. We keep building more apps, databases, and systems, and accruing more data debt. Somehow, most organizations keep running. The description of debt might help here. How many of you have inconsistent data standards, where you might define a data element differently in different databases? Maybe you have duplicated data that is slow to update (think ETL/warehouses), maybe you have different ways of tracking a completed sale in different systems. Maybe you even store dates in different formats (int, string, or something weirder). How many of you lack some documentation on what the columns in your databases mean? Maybe I should ask the reverse, where the few of you who have complete data dictionaries can raise your hands. Read the rest of Data Debt
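The dates example is easy to make concrete. Here is a sketch of the kind of normalizer that pays down that particular debt, under the assumption (mine, for illustration) that one system stores dates as ints like 20250131 and another as strings like "2025-01-31":

```python
# "Dates in different formats" debt, made concrete: coerce an int like
# 20250131, or strings like "2025-01-31" / "2025/01/31", into one type.
from datetime import date

def normalize_date(value):
    """Return a datetime.date for an int or string date in YMD order."""
    if isinstance(value, int):
        value = str(value)                       # 20250131 -> "20250131"
    digits = value.replace("-", "").replace("/", "")
    return date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))
```

Every such shim is interest paid on the original inconsistency, which is why the article argues the debt is cheaper to avoid than to service.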

Thinking Elixir Podcast
235: Wrapping Up 2024 with Types


Play Episode Listen Later Jan 7, 2025 26:55


News includes the official release of Elixir 1.18.0 with enhanced type system support, José Valim's retrospective on Elixir's progress in 2024, LiveView Native's significant v0.4.0-rc.0 release with a new networking stack, ExDoc v0.36's introduction of swup.js for smoother page navigations, the announcement of a new Elixir conference called Goatmire in Sweden, and more! Show Notes online - http://podcast.thinkingelixir.com/235

Elixir Community News
- https://elixir-lang.org/blog/2024/12/19/elixir-v1-18-0-released/ – Official Elixir 1.18.0 release announcement
- https://github.com/elixir-lang/elixir/blob/v1.18/CHANGELOG.md – Changelog for the Elixir 1.18.0 release
- https://bsky.app/profile/david.bernheisel.com/post/3leetmgvihk2a – Details about upcoming Elixir 1.19 type checking capabilities for protocols
- https://bsky.app/profile/josevalim.bsky.social/post/3ldyphlun4c2z – José Valim's retrospective on Elixir's progress in 2024, highlighting type system improvements and project releases
- https://github.com/liveview-native/live_view_native/releases – LiveView Native v0.4.0-rc.0 release announcement
- https://x.com/liveviewnative/status/1869081462659809771 – Twitter announcement about the LiveView Native release
- https://github.com/liveview-native/live_view_native/blob/main/CHANGELOG.md – Changelog for LiveView Native v0.4.0-rc.0
- https://bsky.app/profile/josevalim.bsky.social/post/3le25qqcfh22x – ExDoc v0.36 release announcement introducing swup.js for navigation
- https://github.com/swup/swup – Swup.js GitHub repository
- https://swup.js.org/ – Swup.js documentation
- https://swup.js.org/getting-started/demos/ – Swup.js demos showing page transition capabilities
- https://github.com/hexpm/hexdocs/pull/44 – Pull request for cross-package function search in ExDoc using Typesense
- https://github.com/elixir-lang/ex_doc/issues/1811 – Related issue for the cross-package function search feature
- https://bsky.app/profile/tylerayoung.com/post/3lejnfttgok2u – Announcement of parameterized_test v0.6.0 with improved failure messages
- https://hexdocs.pm/phoenix_test/changelog.html#0-5-1 – phoenix_test v0.5.1 changelog with new assertion helpers
- https://x.com/germsvel/status/1873732271611469976 – Twitter announcement about the phoenix_test updates
- https://x.com/ElixirConf/status/1873445096773111848 – Announcement of new ElixirConf US 2024 videos
- https://www.youtube.com/playlist?list=PLqj39LCvnOWbW2Zli4LurDGc6lL5ij-9Y – YouTube playlist of ElixirConf US 2024 talks
- https://x.com/TylerAYoung/status/1873798040525693040 – Recommendation for David's ETL talk at ElixirConf
- https://goatmire.com/ – New Elixir conference "Goatmire" announced in Sweden
- https://bsky.app/profile/lawik.bsky.social/post/3ldougsbvhk2s – Lars Wikman's announcement about the Goatmire conference

Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email the show at show@thinkingelixir.com

Find us online
- Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com)
- Message the show - X (https://x.com/ThinkingElixir)
- Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir)
- Email the show - show@thinkingelixir.com
- Mark Ericksen on X - @brainlid (https://x.com/brainlid)
- Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social)
- Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid)
- David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com)
- David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

The Data Exchange with Ben Lorica
Beyond ETL: How Snow Leopard Connects AI, Agents, and Live Data

The Data Exchange with Ben Lorica

Play Episode Listen Later Dec 5, 2024 43:33


Deepti Srivastava is the Founder and CEO of Snow Leopard. We dive into Snow Leopard's innovative approach to data integration, exploring its live data access model that bypasses traditional ETL pipelines to offer real-time data retrieval directly from source systems.

Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes - with links to many references - can be found on The Data Exchange web site.

The Data Exchange with Ben Lorica
Reimagining Code: The AI-Driven Transformation of Programming and Data Analytics

The Data Exchange with Ben Lorica

Play Episode Listen Later Oct 17, 2024 41:05


Matt Welsh is a technical leader at Aryn AI, an AI-powered ETL system for RAG frameworks, LLM-based applications, and vector databases. In this episode, we explore how AI is revolutionizing programming and software development.

Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes - with links to many references - can be found on The Data Exchange web site.