We all know the future of software is “cloud-native.” How did we get here? What’s coming next? Join Sam Ramji and become part of this growing community of friends. Let's go far together.
Discover how Rackspace Spot is democratizing cloud infrastructure with an open-market, transparent option for cloud servers. Kevin Carter, Product Director at Rackspace Technology, discusses Rackspace Spot's hypothesis, the impact of an open marketplace for cloud resources, and how this novel approach is transforming the industry.

TIMESTAMPS
[00:00:00] – Introduction & Kevin Carter's Background
[00:02:00] – Journey to Rackspace and Open Source
[00:04:00] – Engineering Culture and Pushing Boundaries
[00:06:00] – Rackspace Spot and Market-Based Compute
[00:08:00] – Cognitive vs. Technical Barriers in Cloud Adoption
[00:10:00] – Tying Spot to OpenStack and Resource Scheduling
[00:12:00] – Product Roadmap and Expansion of Spot
[00:16:00] – Hardware Constraints and Power Consumption
[00:18:00] – Scrappy Startups and Emerging Hardware Solutions
[00:20:00] – Programming Languages for Accelerators (e.g., Mojo)
[00:22:00] – Evolving Role of Software Engineers
[00:24:00] – Importance of Collaboration and Communication
[00:28:00] – Building Personal Networks Through Open Source
[00:30:00] – The Power of Asking and Offering Help
[00:34:00] – A Question No One Asks: Mentors
[00:38:00] – The Power of Educators and Mentorship
[00:40:00] – Rackspace's OpenStack and Spot Ecosystem Strategy
[00:42:00] – Open Source Communities to Join
[00:44:00] – Simplifying Complex Systems
[00:46:00] – Getting Started with Rackspace Spot and GitHub
[00:48:00] – Human Skills in the Age of GenAI - Post Interview Conversation
[00:54:00] – Processing Feedback with Emotional Intelligence
[00:56:00] – Encouraging Inclusive and Clear Collaboration

QUOTES
CHARNA PARKEY
"If you can't engage with this infrastructure in a way that's going to help you, then I guarantee you it's not up to par for the direction that we're going. [...] This democratization — if you don't know how to use it — it's not doing its job."

KEVIN CARTER
"Those scrappy startups are going to be the ones that solve it. They're going to figure out new and interesting ways to leverage instructions. [...] You're going to see a push from them into the hardware manufacturers to enhance workloads on FPGAs, leveraging AVX 512 instruction sets that are historically on CPU silicon, not on a GPU."
In this episode of Open Source Data, Charna Parkey interviews Pete Pachal, founder of The Media Copilot. With over two decades of experience covering technology, Pete shares his insights on how AI is transforming media and journalism, and discusses how journalists can embrace AI as a tool to enhance their work and to adapt and thrive in this new environment.

QUOTES
PETE PACHAL: "AI is something that you control. I know, it feels like it's a wave that's coming over that it's unstoppable, inevitable. And that's true to a large extent. But at the same time, it's not, there's no there, right? There's no spark, there's no intent. (...) Never relinquish your role as the ultimate creator and person responsible for what's coming out of this thing."

CHARNA PARKEY: "I think that there was a point where I found myself shifting more away from media and towards individual curated newsletters because like subject matter experts in that area, I could be like maybe they're going to summarize it incorrectly, et cetera. But at least I know my theory of mind of that individual. And then when I expand that to media, I don't know who's writing what and who's shadow writing what for who."

TIMESTAMPS
00:00:00 - Introduction of Pete Pachal and his background in journalism and AI.
00:02:00 - Pete's career journey, including his work at CoinDesk and founding The Media Copilot.
00:04:00 - AI training for media professionals (journalists, PR, marketers).
00:06:00 - Evolution of AI in journalism: From skepticism to ethical frameworks.
00:08:00 - AI in content pipelines: Idea generation vs. post-production tasks.
00:10:00 - Open-source builders needing to cater to domain experts (e.g., journalists).
00:12:00 - Meta's removal of fact-checking and its implications.
00:16:00 - Public tolerance for AI errors (e.g., Apple's AI summaries).
00:18:00 - Consumer trust shifts away from platforms like Facebook/X.
00:22:00 - Ghostwriting vs. authenticity in AI-generated content.
00:24:00 - Preference for human-curated newsletters over AI summaries.
00:26:00 - AI in news digests (e.g., Perplexity, Alexa).
00:28:00 - Publisher AI experiments (Washington Post chatbot, TIME summaries).
00:32:00 - AI's impact on click-through rates and publisher economics.
00:34:00 - AI-written articles (e.g., ESPN's use case) and copyright issues.
00:36:00 - Legal battles over AI training data (NYT vs. OpenAI).
00:38:00 - Copyright concerns with AI-generated outputs.
00:40:00 - AI search tools (Perplexity, ChatGPT) and publisher licensing deals.
00:46:00 - The unhealthy impact of social media trends on journalism.
00:48:00 - Post-interview discussion: Accountability in AI and media.
00:56:00 - Leo's perspective as a journalist on AI adoption.
00:58:00 - Closing thoughts on balancing AI innovation with industry needs.
In this episode, Dr. Joan Bajorek—AI entrepreneur, author of Your AI Roadmap, and founder of Clarity AI—joins Charna Parkey to talk about what it really takes to build a future in AI. From career pivots and layoff anxiety to financial transparency and finding joy in your work, Joan shares practical advice and personal stories about navigating fear, burnout, and career uncertainty in tech, while staying grounded in purpose, community, and long-term resilience.

TIMESTAMPS
[00:00:00] — Introduction to Joan Bajorek & Her Work
[00:02:00] — Transparency About Finances and Career
[00:04:00] — The Taboo Around Talking About Money
[00:06:00] — Resilience During Tech Layoffs
[00:08:00] — How to Get Credit for Your Work
[00:12:00] — Should You Chase an AI Job?
[00:14:00] — Career Goals vs. Financial Security
[00:16:00] — Translating Academic and Life Skills into Tech
[00:18:00] — Defining and Finding Joy in Work
[00:20:00] — Multiple Income Streams and Personal Freedom
[00:24:00] — AI's Near-Future Impact on Jobs and Industries
[00:26:00] — Data and AI Opportunities in Underexplored Domains
[00:34:00] — Creating Scalable, Alternative Income Models
[00:36:00] — How Joan Maintains Long-Term Motivation
[00:42:00] — Post-Interview Discussion

QUOTES
Joan Bajorek
"Networking is how I've gotten the best opportunities and jobs of my life... LinkedIn has this research about how after COVID layoffs, 70% of people landed their next job based on an intro."

Charna Parkey
"I always try to strive for transparency, and I get such mixed results where at work with coworkers, it's absolutely valued. And then there seems to always be some sort of consequences in my personal life."
Dr. Jason Corso joins Charna Parkey to debate the critical role of data quality, how its transparency shapes AI development, and the rise of smaller, domain-specific AI models - making 2025 the year of small, specialized AI.

QUOTES
Charna Parkey
"Knowing the right data is incredibly important, because it'll save you money, but predicting the impact of that data means that you don't have to do the training at all to even directionally know if it's going to work out, right?"

Jason Corso
"You can't understand and analyze an AI system in the way you can analyze open source software if you don't have access to the data."

Timestamps
[00:00:00] - Introduction
[00:02:00] - Jason Corso's journey on open source
[00:08:00] - The importance of data in AI
[00:10:00] - Voxel51's mission
[00:14:00] - The value of open source and the importance of data in AI systems
[00:20:00] - Recent discoveries in AI
[00:28:00] - The cost of training AI models
[00:36:00] - Cooperative AI in healthcare
[00:40:00] - Charna Parkey on the impact of AI in education
[00:56:00] - The year of small AI
In this episode of Open Source Data, Charna Parkey talks with Alex Gallego, CEO and founder of Redpanda Data, about his journey as a builder, the evolution of Redpanda, and the company's new agent framework for the enterprise. Alex shares insights on low-latency storage, distributed stream processing, and the importance of developer experience to the growth of AI and the Open Source space.

Timestamps
[00:00:00] Introduction
[00:02:00] Alex Gallego talks about his background
[00:04:00] Charna Parkey discusses the importance of hands-on experience in learning.
[00:06:00] Alex explains the origins of Redpanda and how it emerged from challenges in the streaming space.
[00:08:00] Alex details the evolution of Redpanda, its use of Seastar and FlatBuffers, and its low-latency design.
[00:11:00] Alex discusses the positioning of Kafka versus Redpanda in the market.
[00:20:00] Alex introduces Redpanda's new agent framework and multi-agent orchestration.
[00:24:00] Alex explains how Redpanda fits into the evolving landscape of AI-powered applications.
[00:30:00] The future of multi-agent orchestration.
[00:44:00] Thoughts on AI model training and data retention.
[00:46:00] Alex encourages future founders and shares his perspective on risk-taking.
[00:50:00] Charna Parkey and Leo Godoy discuss the key takeaways from the conversation with Alex Gallego.
[00:52:00] Charna reflects on open source trends and the role of developer experience in adoption.
[00:54:00] Charna and Leo talk about the different types of founder journeys and the importance of team dynamics.

Quotes
Charna Parkey
"For AI, unifying historical and real-time data is critical. If you're just using nightly or monthly data, it doesn't match the context in which your prediction is being made. So it becomes very important in the future of applying AI because you need to align those things."

Alex Gallego
"Every app is going to span three layers. The first layer is going to be your operational layer, just like you have to do business right now. Then there always has to be an analytical layer, and the third layer is this layer of autonomy."
In this episode, we dive deep into the world of neuro-symbolic AI with Emin Can Turan, CEO of Pebbles AI. Learn how this technology combines neuroscience, behavioral economics, and AI to revolutionize B2B go-to-market strategies. Emin explains how neuro-symbolic AI bridges the gap between human logic and machine learning, enabling smarter, context-aware systems that democratize complex workflows for startups and enterprises alike.

Timestamps
[00:00:00] - Introduction by Charna Parkey and introduction of Emin Can Turan.
[00:02:00] - Emin's journey to AI and his background in go-to-market strategies.
[00:06:00] - Emin explains his deep R&D phase and the development of neuro-symbolic AI.
[00:08:00] - Emin describes the architecture of their AI system, including neuro-symbolic AI, generative AI, and agentic frameworks.
[00:10:00] - Explanation of neuro-symbolic AI and its relevance to domain-specific problems.
[00:12:00] - Discussion on the components of go-to-market strategies and the role of psychology and communication.
[00:16:00] - The limitations of generative AI and how they applied strict communication tactics.
[00:22:00] - Discussion on the importance of contextual science and data insights.
[00:24:00] - The three agentic frameworks they use in their system.
[00:26:00] - Explanation of how users control the product and the two co-pilots (strategy and execution).
[00:36:00] - The ethical implications of AI and the potential for misuse.
[00:38:00] - Discussion on the future of AI and the balance between dystopian and hopeful outcomes.
[00:40:00] - Emin emphasizes the importance of truth and transparency in AI development.
[00:42:00] - Emin shares his personal motivation for building his AI startup.
[00:48:00] - Closing remarks and discussion on the user experience of their platform.
[00:50:00] - Charna and Leo discuss the connection between Emin's work and the open-source community.

Quotes
Emin Can Turan
"I felt that this was the future and that AI was the only technology that can digitalize this level of complexity for everyone to use. Nothing else could, you know, you can't use normal neural networks to do this. Even generative AI is not sufficient enough."

Charna Parkey
"I would love to be able to use Gen AI for more personal things. I love technology. I have the Oura Ring. I've got the Apple Watch. I want to feed that data into something that can somehow tell me and others, here's your state of mind. Here's what you're going to be affected by."
Learn how BrightHive's AI-powered platform is democratizing data insights, making them accessible to non-technical teams across organizations. Suzanne El-Moursi discusses the importance of data fluency and how BrightHive is helping businesses harness the power of their data.

Timestamps
00:00:00 - Introduction and Background
00:02:30 - Journey to BrightHive and open source
00:06:00 - The evolution of AI and BrightHive's approach
00:14:00 - The data problem and the role of AI agents
00:22:00 - Building BrightBot with open source frameworks
00:26:00 - The future of AI agents and open source
00:30:00 - People's reaction to DeepSeek
00:34:00 - The future of work and AI
00:40:00 - AI in education and personal growth
00:42:00 - Suzanne's legacy
00:48:00 - Recap and takeaways with producer Leo Godoy

Quotes
Charna Parkey
"Every single innovation comes out of some form of restriction or need. (...) Don't come and say, 'oh, what is this? This is terrible'. I heard all kinds of responses to my excitement and to my belief."

Suzanne El-Moursi
"So if 97% of an organization is data consumers, there are strategists, the marketing analysts, the customer success associates, the managers all across the enterprise, who need to understand the insights in the company's data, in their functions, in their units, so that they can make the next right step for the customer and for their plan."
Quotes
Kent Keirsey
"When we look at open source models, if you just release the weights, and you don't really release information on how the data set was captioned, for example, or how you construct the data set, if you don't really know how it got to the artifact that was released, as a user, you do not understand how it works."

Charna Parkey
"But there's still a lot of claims by big tech right now about how anything on the internet should be fair use for training, even if, you know, it might have its own kind of copyright."

Timestamps
[00:02:00] - Kent Keirsey on his journey to open source
[00:06:00] - Kent Keirsey on the Open Model Initiative (OMI)
[00:08:00] - What makes a model truly open source
[00:12:00] - The legal landscape of AI and copyright
[00:14:00] - Kent Keirsey on the ethical implications of AI training data, fair use, and AI development
[00:26:00] - Creativity, AI tools, personal AI models and recommendation algorithms
[00:32:00] - Kent Keirsey on TikTok and cultural clash
[00:38:00] - AI, self-reflection and a decision-making tool
[00:42:00] - The Bria AI partnership
[00:52:00] - The future of creativity, AI and Robotics
[01:00:00] - Final thoughts with producer Leo Godoy

Connect with Kent Keirsey
Connect with Charna Parkey
Join Charna Parkey as she recaps a transformative year in AI, exploring the delicate balance between innovation and ethics. From open source communities to global regulations, discover how trust, diversity, and collaboration are shaping the future of technology.
Episode Quotes
Vinay Kumar
"I always believe in this: you don't need to solve a very large problem. Maybe it will take a lot of time to do that. A lot of resources to do that but something small, which you can have an opportunity to solve that could be very big or a fundamental for quite a bit is fantastic. Think of a scenario where your small fundamental idea is a base for another small fundamental idea for someone else."

Charna Parkey
"We also want to ground it a little bit in impact we've been seeing. And I think in the financial, banking, insurance industries it's not, I would say, an even distribution of advancement. Different countries have different regulations and different appetites for risk."

Timestamps
- [00:00:00] Introduction by Charna Parkey.
- [00:01:57] Vinay Kumar begins talking about his journey.
- [00:05:27] Discussion on building a search engine for STEM researchers.
- [00:07:06] Challenges with early deep learning.
- [00:09:55] Conversation shifts to ML observability.
- [00:17:06] Discussion on simplifying verticalized AI.
- [00:22:30] Impact of large language models (LLMs) on AI.
- [00:30:58] Comparison of autonomous cars with AI regulation.
- [00:37:58] Vinay mentions his science fiction novels.
- [00:42:19] Conversation summary with Producer Leo Godoy.
Quotes
Brian Magerko
"We're really trying to show that we could co-create experiences with AI technology that augmented our experience rather than served as something to replace us in creative act."

"For every project like [LuminAI], there's a thousand companies out there just trying to do their best to get our money... That's an uncomfortable place to be in for someone who has worked in AI for decades."

"I had no idea what was going to happen kind of in the future. When we started EarSketch... we were advised by a couple of colleagues to not do it. And here we are, having engaged over a million and a half learners globally."

Charna Parkey
"I remember the first robot that I built. It was part of the first robotic systems... and watching these machines work with each other was just crazy."

"If you're building a product and your goal is to engage underrepresented groups, it is on you to make sure that you're educating the folks in a way that you're trying to reach."

Episode timestamps
(01:11) Brian Magerko's Journey into AI and Robotics
(05:00) LuminAI and Human-Machine Collaboration in Dance
(09:00) Challenges of AI Literacy and Public Perception
(17:32) Explainable AI and Accountability
(20:00) The Future of AI and Its Impact on Human Interaction
Timestamps
00:00:00 - 00:01:23 - Introduction
00:01:23 - 00:04:30 - Heather Domin's Journey
00:09:50 - 00:12:48 - Open Source and AI Ethics
00:12:48 - 00:15:25 - Generative AI and Governance
00:23:40 - 00:26:22 - Future of Responsible AI Practices
00:35:37 - 00:37:31 - Advice for the Audience
00:37:31 - 00:46:04 - Reflection on Risk and Hope in AI

Quotes
Heather Domin
"I think that each of us individually can scan our environment and understand, you know, where can I make an impact? What problem can I help solve? What is the next thing that I can really contribute to?"

"There are absolutely ways to automate, you know, the prompt testing and many of the routine tasks that you want to leverage automation in that way so that you can actually have the humans focus on other things so they can focus on the critical thinking and outside the box sort of thinking that we want the humans to be focused on."

Charna Parkey
"I think that it's hard for people getting into it for the first time to jump to hope if they've experienced something that they should fear in the past. By that, I mean, groups that have been marginalized by other forms of technology are not going to start hopeful with this new one that is using their data without their permission."

"If for some reason I came to understand in a month what that meant, I should be able to go back and revoke and be like, nope, I actually don't want you to have that anymore. So I think that that would help people feel better."

Check Heather's paper: On the ROI of AI Ethics and Governance Investments
Connect with Heather
Connect with Charna
Timestamps
1. Introduction and Background (00:00:00 - 00:01:16)
2. Ethan's Journey (00:01:16 - 00:05:12)
3. The Role of Food and Agriculture (00:05:12 - 00:06:52)
4. Investment in Regenerative Agriculture and Generative AI (00:06:52 - 00:07:44)
5. Levels of AI Impact (00:07:44 - 00:12:42)
6. HowGood's Use of AI (00:12:42 - 00:13:20)
7. Consumer Impact and Corporate Responsibility (00:13:20 - 00:15:44)
8. Future of AI in Food Systems (00:15:44 - 00:20:30)
9. Innovative Perspectives on AI Training (00:20:30 - 00:21:10)

Quotes
Ethan Soloviev
"What if we're using ecological data? What if we're training on trees and insects and animals and whale song? What kind of questions would a gen AI trained on whale song and hummingbird language ask us?"

Charna Parkey
"If we have this great translator that is Gen AI, we already have text and language to code. We can do code generation. We can already interpret this code and tell me what it's going to do. Take that code to language. Why can't we do that with some of these other senses and these other measurements?"

Connect with Ethan
Connect with Charna
Timestamps
00:00:00 - Intro
00:02:00 - Beth's Journey
00:19:33 - Ontologies in AI
00:21:44 - Data Lineage and Provenance
00:32:52 - Open Source Tools
00:38:38 - Explainable AI
00:44:58 - Inspiration from Nature

Quotes
Beth Rudden: "The best thing that I could tell you that I see is that it's going to shift from more pure mathematical and statistical to much more semantic, more qualitative. Instead of quantity, we're going to have quality."

Charna Parkey: "I love that because I've been so mathematical for most of my life. I didn't have a lot of words for the feelings or expressions, right? And so I had sort of this lack of data and the Brené Brown reference you make, like I have many of her books on my shelf and I often pull, I don't even know where it is right now, but the Atlas of the Heart because I am having this feeling and I don't know what it is."

Links
Connect with Beth
Connect with Charna
Learn how Andrea Brown, CEO of Reliabl, is revolutionizing AI by ensuring diverse communities are represented in data annotation. Discover how this approach not only reduces bias but also improves algorithmic performance. Andrea shares insights from her journey as an entrepreneur and AI researcher.

Episode timestamps
(02:22) Andrea's Career Journey and Experience with Open Source (Adobe, Macromedia, and Alteryx)
(11:59) Origins of Alteryx's AI and ML Capabilities / Challenges of Data Annotation and Bias in AI
(19:00) Data Transparency & Agency
(26:05) Ethical Data Practices
(31:00) Open Source Inclusion Algorithms
(38:20) Translating AI Governance Policies into Technical Controls
(39:00) Future Outlook for AI and ML
(42:34) Impact of Diversity Data and Inclusion in Open Source

Quotes
Andrea Brown
"If we get more of this with data transparency, if we're able to include more inputs from marginalized communities into open source data sets, into open source algorithms, then these smaller platforms that maybe can't pay for a custom algorithm can use an algorithm without having to sacrifice inclusion."

Charna Parkey
"I think if we lift every single platform up, then we'll advance all of the state of the art and I'm excited for that to happen."

Connect with Andrea
Connect with Charna
Episode timestamps
(01:47) Asa Whillock's career journey at market-leading companies and the role of open source in each (Adobe, Macromedia, Alteryx)
(04:56) Feature Labs acquisition by Alteryx and its open source roots in democratizing machine learning capabilities
(11:00) Survey findings on enterprise board members' perspectives on AI and the need to move beyond policy creation to implementation and governance
(27:00) Applying AI capabilities and decision-making related to AI
(30:00) The future of AI predominance, including cost reduction, open source model advancements, and the push for demonstrating business value
(43:33) Advice for navigating AI expertise and decision-making, including continuous learning, self-awareness of decision-making models, and acknowledging knowledge limits

Quotes
Asa Whillock
"I love regulation. I think it's great. And people are like, what? Why would you say that? And the reason why I say that is because I think it puts a floor underneath all of us of what do we think good looks like?"

Charna Parkey
"I think we need to, as a community, focus on meeting them where they are if we really want the democratization that is promised. Yeah, I don't know any other way to do it."
Episode Timestamps
(02:11): Robbi Armstrong's role at KeyBank and intersection with open source and AI initiatives in the financial industry
(04:06): Compliance and regulatory trends in AI for banking
(12:10): Organizational Change Management with AI
(28:00): Responsible and Ethical AI
(37:00): Financial Literacy and AI

Quotes
Robbi Armstrong
"I truly believe that if you are an organization and you are sitting back and you're not organizing a team and you're not organizing a program and you're not learning, you're not looking at education, you're not looking at change management around Gen AI, I don't think you'll be here in two years. I really truly believe that. Because you won't be able to compete."

Charna Parkey
"I think the democratization is real and I think it's incredibly important because that step in between the domain expert and the technology is very lossy. You know, oftentimes we say, well, if only I had the data to answer your question let me give you a different answer or let me answer it completely and now we can actually put it in the hands of the experts and say, well, oh, then let's go collect that data."

Links
Connect with Robbi
Connect with Charna
Episode timestamps
(05:06): State of open source in the UK
(07:22): Importance of open source community
(15:19): Balancing openness and regulation in AI
(21:19): Pace of technological development and regulation
(28:21): Reliability and discernment with AI outputs
(35:24): Universal advice

Quotes
Amanda Brock
"I think the governments that are going to win, the governments that are going to have the best regulation that promotes most innovation are going to be the ones which are able to make their regulatory environment flow in the same way as the technology evolution and innovation flows."

Charna Parkey
"I think the expectation needs to change. Part of what has happened with, you know, literal text search or keyword search and just Google and things like that, is that the average person expects what comes back to be relatively factual. That it's been referenced and, you know, backlinked, etc. That's a deterministic system. These are not. These are based upon statistical likelihoods of what word should come next."

Links
Connect with Charna
Connect with Amanda
Episode timestamps
(02:15): Tacita's unconventional career path to becoming a CTO
(07:00): Textio's practices for building AI responsibly and ethically
(14:00): The impact of Textio's AI on performance feedback
(17:00): The importance of purpose-built vs generic AI models
(28:00): Balancing open source and proprietary data/models
(42:00): Advice for the AI industry moving forward

Quotes
Tacita Morway
"When you've got a team with different backgrounds, educational, lived experiences, identity, careers, all of those things, we have those different perspectives in the room. And we're all working off of the same expectations. We can catch each other's gaps."

Charna Parkey
"There's an interesting conversation happening, I think, in the community right now about these purpose-built LLMs. Are they as good as generic LLMs? Sure, certainly if you're not going to apply something purpose-built to something generic or outside of its domain, it is not as good. But I think some of this shows us that unless you have something purpose-built and unless you're leveraging the data in the right way, you may just be feeding noise back into the system."

Links
Connect with Tacita
Connect with Charna
Timestamps
(00:02:29) Fabiana's journey starting YData and becoming a public speaker
(00:20:19) Misconceptions and hype around generative AI and AGI
(00:32:46) Potential real-world impact and use cases of LLMs today
(00:34:55) The role of synthetic data in making AI models more robust and fair
(00:43:55) Advice for founders: value your time and learn to say no
(00:48:24) The importance of technical leaders being able to communicate well

Quotes
Charna Parkey: "It's a balance. I think that's also what led us to some of the demographic based data science. Essentially, folks were making like event data into pre-aggregated data. And then they were trying to obscure it so much that you couldn't get back to the person. And so you're like, okay, what's their age and what's their gender? And you're like, that's not actually the most useful part of data science that can't predict behavior or intent or any of that. It throws out time as a component of the entire process, seasonality, everything. And so there just, there has to be a better way."

Fabiana Clemente: "I have to say, that's a very beautiful way to put it. Hallucinations, I have to say. I never thought about that. And it makes a lot of sense. I do think, though, that in terms of LLMs, it's so language, it's so definitely, it sounds like we are getting very, very intelligent system, exactly, because language is very complex. And we know that was needed for the leap of humanity. I do think there are other, the sense of combining. Well, and here we enter in the multimodal kind of space. It's what's missing."

Links
Connect with Charna
Connect with Fabiana
Episode timestamps
(02:15): Challenges of collecting open source usage data
(22:06): Driving impact with open source usage data
(28:27): Avi's entrepreneurial journey
(39:42): Persistence and vision in startups
(44:03): Tracking outcomes to stay motivated

Quotes
Avi Press
"I mean, one thing is, for any project that you might be thinking about doing or any initiative that you want to work on or goal that you have, I think there's a lot of power in just trying the thing. You may not have all the details figured out, but just try it anyway and see where it takes you. And I think a lot of projects that I've ever worked on that led anywhere, I didn't know all these details, but I just start trying and seeing what works anyway and being very open to it not working out, but attempting it anyway. And then the other thing, which is I think admittedly fitting into our agenda at Scarf, but it is something that I really believe, which is that for any of these things you're doing, tracking the outcomes of that thing is very, very important and will both be tactically helpful, but also I think, like you said, give you these inspirational moments that keep you going, whether that's awe or inspiration or fulfillment or whatever that feeling is that helps you keep going. I think that tracking the outputs of your work such that you can understand the impact that you have is both very strategic and the most rewarding way to do anything, I think."

Charna Parkey
"Given the venture-backed nature of a lot of these startups, there's going to have to be some sort of monetization at some point. You're not gonna have 1 million, 10 million, 40 million dollars dumped into just giving software away for free. So sort of these misaligned motivations are certainly what raised my hackles where I'm like, oh, you're claiming forever or you're claiming that you're like a values-driven organization, but you're venture-backed and you need to make money. And so show me how those motivations align or misalign. Tell me what your monetization strategy is gonna be. I know you need one. That way I'm not wondering, should I use this? Should I not?"

Links
Connect with Charna
Connect with Avi
Timestamps
00:00 - Intro
05:10 - Paula's Professional Journey
10:30 - What Inspired Paula to Go Through the Open Source Path
14:50 - What are some of the biggest challenges and impacts that Paula sees in companies trying to derive value?
23:30 - Is the Tech World a Meritocracy?
25:35 - A Shift Of What is a Tech Company?
27:30 - Kids Interacting with New Technologies
31:30 - What Does Open Source Data Mean to Paula?
42:50 - What is a Question that Paula has never been asked before?
47:00 - What Advice would you give to the audience?
51:50 - Backstage with Executive Producer Leo Godoy

LinkedIn - Connect with Charna
LinkedIn - Connect with Paula
This episode features an interview with Charna Parkey, Real-Time AI Product and Strategy Leader at DataStax. Charna has been developing AI and ML products over the last 17 years and has worked with 90 of the Fortune 100 in her various roles. She is also a co-author and inventor on several patents.

In this episode, Sam and Charna discuss handing over the role as host, Sam's new startup journey, and how their thinking has evolved during the explosion of LLMs.

-------------------

"Now, it seems like we have this opportunity where the conversation and the place that society is at is different. Where we want to contribute to the right set of data when we talk open source data. We want to make sure that we have the right data to train this model in order to get the right outcome. We want to provide a lens of, 'All right, you are this persona. How would you say this thing?' I do think that from a lot of what the LLMs have today, the outcome of those words are still missing. And we need to solve that. Like, 'Is this piece of writing actually going to achieve the outcome I want versus am I following legal's guidelines? Am I technically correct? Is my CEO going to like it?' That doesn't mean you're achieving impact in the world. There's an aspect there where we've given feedback loops, it seems, to be like, 'Did I like the answer or not?' But not, 'Did I take an action?' As we get to autonomousness, we're going to have to have an outcome or multiple outcomes associated with the reward of the system." – Charna Parkey

"I personally believe that all cognition is bias. My degree is in cognitive science. One of the things that we trained on is attention. And to pay attention, literally means to selectively choose what data is coming in from the world that you're going to pay attention to and what you're going to discard. Which is also, to me, the definition of bias. All cognition is bias, but what do we care about? Do you trust this thing? What does that mean? Well, do you trust it to do these particular actions to a level of consistency in this particular domain? It doesn't mean that you're going to trust it in all environments. There's a lot more nuance that hopefully will evolve in this strange age of nuanced destruction machines." – Sam Ramji

-------------------

Episode Timestamps:

(01:04): Sam and Charna catch up
(06:05): Sam explains his new company, Sailplane
(14:21): How Charna's thinking has evolved during the LLM explosion
(25:45): Sam's thoughts after 5 seasons of Open||Source||Data
(38:52): What Charna is looking forward to in the next season of the podcast
(40:44): A question Sam wishes to be asked
(45:45): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Charna
LinkedIn - Connect with Sam
Learn more about Sailplane
This episode features a panel discussion with Stefano Maffulli, Executive Director of the Open Source Initiative (OSI); and Stephen O'Grady, Co-founder of RedMonk. Stefano has decades of experience in open source advocacy. He co-founded the Italian chapter of Free Software Foundation Europe, built the developer community of the OpenStack Foundation, and led open source marketing teams at several international companies. Stephen has been an industry analyst for several decades and is author of the developer playbook, The New Kingmakers: How Developers Conquered the World.

In this episode, Sam, Stefano, and Stephen discuss the intersection of open source and AI, good data for everyone, and open data foundations.

-------------------

"Internet Archive, Wikipedia, they have that mission to accumulate data. The OpenStreetMap is another big one with a lot of interesting data. It's a fascinating space, though. There are so many facets of the word 'data.' One of the reasons why open data is so hard to manage and hasn't had that same impact of open source is because, like Stephen, the stories that he was telling about the startups having a hard time assembling the mixing and matching, or modifying of data has a different connotation. It's completely different from being able to do the same with software." – Stefano Maffulli

"It's also not clear how said foundation would get buy-in. Because, as far as a lot of the model holders themselves, they've been able to do most of what they want already. What's the foundation really going to offer them? They've done what they wanted. Not having any inside information here, but just judging by the fact that they are willing to indemnify their users, they feel very confident legally in their stance. Therefore, it at least takes one of the major cards off the table for them." – Stephen O'Grady

-------------------

Episode Timestamps:

(01:44): What open source in the context of AI means to each guest
(16:21): Stefano explains OSI's opportunity to shine a light on models and teams
(21:22): The next step of open source AI according to Stephen
(25:38): Creating better definitions in order to modify software
(33:09): The case of funding an open data foundation
(42:31): The future of open source data
(51:54): Executive producer, Audra Montenegro's backstage takeaways

-------------------

Links:

LinkedIn - Connect with Stefano
Visit Open Source Initiative
LinkedIn - Connect with Stephen
Visit RedMonk
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset.

In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack.

-------------------

"We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information." – Mikiko Bazeley

"I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding providers as the translators to take all of our unstructured data and bring it into vector space into a common machine language, vector search engines are essentially the workhorses that allow us to compute and search over these objects in vectorized format. They're essentially the calculators of the AI stack." – Zain Hasan

"Haystack, I would really position as the kitchen. I need Mikiko to bring the apples. I need Zain to bring the pears. I need Hugging Face or OpenAI to bring the oranges to make a good fruit salad. But, Haystack will provide the spoons and the pans and the knives to make that into something that works together." – Tuana Celik

-------------------

Episode Timestamps:

(02:58): What open source data means to the panelists
(09:11): What interested the panelists about AI/ML
(24:10): Mikiko explains Featureform
(27:00): Zain explains Weaviate
(30:23): Tuana explains deepset
(36:00): The panelists discuss how their companies fit into the AI-first ecosystem
(44:58): How jobs need to evolve with the AI-native stack
(54:35): Executive producer, Audra Montenegro's backstage takeaways

-------------------

Links:

LinkedIn - Connect with Mikiko
Visit Featureform
LinkedIn - Connect with Zain
Visit Weaviate
LinkedIn - Connect with Tuana
Visit deepset
Visit Data-centric AI
This episode features an interview with Mona Rakibe, CEO and Co-founder of Telmai, an AI-based data observability platform built for open architecture. Mona is a veteran in the data infrastructure space and has held engineering and product leadership positions that drove product innovation and growth strategies for startups and enterprises. She has served companies like Reltio, EMC, Oracle, and BEA where AI-driven solutions have played a pivotal role.

In this episode, Sam sits down with Mona to discuss the application of LLMs, cleaning up data pipelines, and how we should think about data reliability.

-------------------

"When this push of large language model generative AI came in, the discussions shifted a little bit. People are more keen on, 'How do I control the noise level in my data, in-stream, so that my model training is proper or is not very expensive, we have better precision?' We had to shift a little bit that, 'Can we separate this data in-stream for our users?' Like good data, suspicious data, so they train it on little bit pre-processed data and they can optimize their costs. There's a lot that has changed from even people, their education level, but use cases also just within the last three years. Can we, as a tool, let users have some control and what they define as quality data reliability, and then monitor on those metrics was some of the things that we have done. That's how we think of data reliability. Full pipeline from ingestion to consumption, ability to have some human's input in the system." – Mona Rakibe

-------------------

Episode Timestamps:

(01:04): The journey of Telmai
(05:30): How we should think about data reliability, quality, and observability
(13:37): What open source data means to Mona
(15:34): How Mona guides people on cleaning up their data pipelines
(26:08): LLMs in real life
(30:37): A question Mona wishes to be asked
(33:22): Mona's advice for the audience
(36:02): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Mona
Learn more about Telmai
This episode features an interview with Larry Augustin, angel investor and advisor to early-stage technology companies. Larry previously served as the Vice President for Applications at AWS, where he was responsible for application services like Pinpoint, Chime, and WorkSpaces.

Before joining AWS, Larry was the CEO of SugarCRM, an open source CRM vendor. He also was the founder and CEO of VA Linux, where he launched SourceForge. Among the group who coined the term "open source", Larry has sat on the boards of several open source and Linux organizations.

In this episode, Sam and Larry discuss who owns the rights to data, the data in to data out ratio, and why Larry is an open source titan.

-------------------

"People are willing to give up so much of their personal information because they get an awful lot back. And privacy experts come along and say, 'Well, you're taking all this personal information'. But then most people look at that and say, 'But I get a lot of value back out of that.' And it's this data ratio value question, which is: for a little in, I get a lot back. That becomes a key element in this. And I think there has to be some kind of similar thought process around open source data in general, which is if I contribute some data into this, I'm going to get a lot of value back. So this data in to data out ratio, I think it's an incredibly important one. And it gets everyone in the mindset of, 'How do I provide more and more and take less and less?' It's a principle of application development that I like a lot. And I think there's a similar concept here around open source data. Are there models or structures that we can come up with where people can contribute small amounts of data and as a result of that, they get back a lot of value." – Larry Augustin

-------------------

Episode Timestamps:

(02:52): How Larry is spending his time now after AWS
(06:25): What drove Larry to open source
(18:41): What is the GPL for data?
(24:28): Areas of progress in open source data
(28:57): The data in to data out ratio
(36:39): Larry's advice for folks in open source

-------------------

Links:

LinkedIn - Connect with Larry
Twitter - Follow Larry
This episode features an interview with Jorge Torres, Co-founder and CEO of MindsDB. MindsDB is a virtual AI database that works with existing data to help developers build AI-centered apps. In 2008, Jorge began his work on scaling solutions using machine learning as the first full-time engineer at Couchsurfing, growing the company from a few thousand users to a few million. He has also served a number of data-intensive start-ups and was a visiting scholar at UC Berkeley researching machine learning automation and explainability.

In this episode, Sam and Jorge discuss the inspiration and challenges behind MindsDB, classic data science AI versus applied AI, and time series transformers.

-------------------

"So much data in the world is time series data, so much data. Even data that people don't know is time series, it's time series. So long as it's moving over time, it is time series data. Whether you store it or not, that's a different thing. For having a pre-trained model on time series data, it even enabled the fact that you don't have to store all the historical data. You can just take the model and start passing data as it comes through, and then you get out the forecast. So you don't even have to have the historical data. All you need to have is the data at that given instance, and you can pass it to the model and you get an output. It's mind blowing." – Jorge Torres

-------------------

Episode Timestamps:

(05:20): The inspiration behind MindsDB
(10:20): Classic data science AI approach vs. applied AI
(22:09): What open source data means to Jorge
(28:51): What excites Jorge about Nixtla and time series transformers
(37:07): A question Jorge wishes to be asked
(40:20): Jorge's advice for the audience
(41:38): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Jorge
Learn more about MindsDB open source code
Learn more about MindsDB
On this episode, we've partnered with the Future Rodeo podcast for a discussion between Sam and Matt Wallace. Matt is the Chief Technology Officer and EVP at Faction, a pioneer of multi-cloud data services, and host of Future Rodeo.

In this episode, Sam and Matt discuss Microsoft's transformation, the impact of Kubernetes on container orchestration, and the rapid acceleration of AI research and development.

-------------------

Episode Timestamps:

(01:38): Microsoft's open source transformation
(13:19): The impact of Kubernetes and how it defragmented the industry
(22:06): The transformative power of AI and how it's changing the value of reasoning
(54:58): The concept of cognitive economy and its potential impact on AI and software development
(01:03:25): Potential implications of advancements in robotics, AI, and clean energy
(01:04:17): Sam's advice for those entering the industry or choosing a career path

-------------------

Links:

LinkedIn - Connect with Matt
Listen to the Future Rodeo podcast
This episode features an interview with Abby Kearns, technology executive, board director, and angel investor. Her career has spanned executive leadership, product marketing, product management, and consulting across Fortune 500 companies and startups, including Puppet, Cloud Foundry Foundation, and Verizon. Abby currently serves as a board director for Lightbend, Stackpath, and Invoke.

In this episode, Sam sits down with Abby to discuss the betrayal source license, the role open source plays in AI, and empowering trust.

-------------------

"There's so much happening so quickly that I think open source has the power to help harness a lot of that innovative conversation. In a way that I think it's going to be really, really hard to match in a proprietary way. I think open source and the ability, given the fact that we're talking about AI and data, the two are very interrelated at this point. AI is not super interesting without data. I think the power of open source right now and what's happening, I think it has to happen in open source and I think it really has to have that level of transparency and visibility. But, always the ability for everyone to step up and understand what's happening at this moment in time and shape it." – Abby Kearns

-------------------

Episode Timestamps:

(00:50): Sam and Abby discuss the betrayal source license
(14:12): What open source data means to Abby
(23:30): Abby dives into the companies she's investing in
(34:30): How nonprofits can empower trust
(38:32): A question Abby wishes to be asked
(40:21): Abby's advice for the audience
(43:53): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Abby
Twitter - Follow Abby
Read Design the Life You Love
This episode features an interview with Daniel Lenton, Founder and CEO of Ivy, where the team is on a mission to unify the fragmented AI stack. Prior to Ivy, Daniel was a Robotics Research Engineer at Dyson and a Deep Learning Research Scientist for Amazon Prime Air. During his PhD, Daniel explored the intersection between learning-based geometric representations, ego-centric perception, spatial memory, and visuomotor control for robotics.

In this episode, Sam and Daniel discuss the inspiration behind Ivy, open source reproducibility, and democratizing AI.

-------------------

"There's too much amazing stuff going on, from too many different parties. We just want to be the objective source of truth to show you the data and show you where your model will be doing best, and continue to do this as a service or something like this. This is high-level, some of the areas we see and going into, we really want to be a useful tool for anybody that wants to just kind of understand this fragmented complex space quickly and intuitively, and we are trying to be the tool that does that." – Daniel Lenton

-------------------

Episode Timestamps:

(01:00): What open source data means to Daniel
(05:37): The challenges of building Ivy
(15:37): The future of Ivy
(25:19): Who should know about Ivy
(28:46): Daniel's advice for the audience
(32:00): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Daniel
Learn more about Ivy
This episode features an interview with Demetrios Brinkmann, Founder of the MLOps Community, an organization for people to share best practices around MLOps. Demetrios fell into the Machine Learning Operations world and has since interviewed leading names around MLOps, data science, and machine learning.

In this episode, Sam sits down with Demetrios to discuss LLM in production use cases, ML engineering teams, and the LLM Survey Report from the MLOps Community.

-------------------

"I think the most novel ones that I saw from the survey were when a chat bot would prompt a human as opposed to the human prompting the chat bot. It's almost like you have this LLM coach. And in that way, it's not necessarily like this isn't LLM in production that an end user is getting that's not outside the business or that is outside the business. It's more like internally, you can think about maybe it's an accountant and the accountant is filing my taxes for the year. As they're filing them, the LLM is prompting them on different tax laws that maybe they weren't thinking about or different ways that they could file things." – Demetrios Brinkmann

-------------------

Episode Timestamps:

(04:30): LLMs as the new standard
(19:26): Key LLM in production use cases
(31:18): What open source data means to Demetrios
(34:36): What Demetrios is seeing in open source AI models
(42:44): One question Demetrios wishes to be asked
(44:41): Demetrios's advice for the audience
(47:19): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Demetrios
Read the LLM Survey Report
Listen to The MLOps Podcast
This bonus episode features conversations from season 5 of the Open||Source||Data podcast. In this episode, you'll hear from Jaya Gupta, Partner at Foundation Capital; Yuliia Tkachova, Co-founder and CEO of Masthead Data; and Omoju Miller, Founder and CEO of Fimio.

Sam sat down with each guest to discuss how they are building foundations for trust, inspiration, and reputation as we all race into the AI-centric future.

You can listen to the full episodes from Jaya Gupta, Yuliia Tkachova, and Omoju Miller by clicking the links below.

-------------------

Episode Timestamps:

(00:49): Jaya Gupta
(01:48): Yuliia Tkachova
(03:03): Omoju Miller

-------------------

Links:

Listen to Jaya's episode
Listen to Yuliia's episode
Listen to Omoju's episode
This episode features an interview with Jaya Gupta, Partner at Foundation Capital, where she leads early-stage investments across the enterprise software stack. Previously, Jaya was a Senior Business Analyst at McKinsey & Company focusing on software diligence and helping startups expand their go-to-market strategies.

In this episode, Sam and Jaya discuss her journey to Foundation Model Ops, how software is becoming more accessible, and the democratization of AI tools.

-------------------

"At the end of the day, FMOps isn't just about the new tools. It's actually more about the new builders, the new workflows, and a completely new market of customers. I was on the other day, looking at LangChain's page of integrations, I don't know if you've seen it, but it's like Anyscale, Databricks, all these other huge legendary companies are integrating with LangChain, and I think it's clear that there's a huge community that is building something real and valuable." – Jaya Gupta

-------------------

Episode Timestamps:

(01:05): What open source data means to Jaya
(08:51): Jaya's journey to Foundation Model Ops
(15:58): How software is becoming more accessible
(23:04): The democratization of AI tools
(27:01): One question Jaya wishes to be asked
(29:32): Jaya's advice for the audience
(31:51): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Jaya
Follow Jaya on Twitter
Learn more about FMOps
This episode features an interview with Bart Farrell, a CNCF Ambassador, Cloud Native Community Consultant, and Content Creator. An American entrepreneur living in Spain, Bart has spent the last decade helping tech companies broaden their audience through exceptional content. He has organized and hosted over 250 cloud native in-person and virtual events in 10 different countries.

In this episode, Audra and Bart discuss upcoming AI and MLOps events, his work as a community consultant, and what open source data means to him.

-------------------

"When we're looking at other technologies, in particular use cases like low latency, if we're talking about autonomous vehicles, we're talking about the financial sector, we're talking about fraud detection, things where decisions have to be made in real time. What are the technologies that are helping out with that? How can organizations, some that are more advanced than others, go through that adoption phase? And others that aren't so advanced, that haven't really moved things yet into production, how can they be better prepared in order to tackle these challenges that are coming up? That being said, we've got quite a cross section of different larger and smaller organizations that are really playing a pivotal role in the changes that are going on when it comes to edge meeting AI and MLOps." – Bart Farrell

-------------------

Episode Timestamps:

(01:27): Bart's background
(02:45): Bart dives into The Cutting Edge of MLOps live event
(06:18): What open source data means to Bart

-------------------

Links:

LinkedIn - Connect with Bart
Twitter - Follow Bart
Learn more about The Cutting-EDGE of MLOps webinar
Learn more about Edgecase 2023
Listen to The AI-Native Stack with Mikiko Bazeley, Zain Hasan, and Tuana Celik
This episode features an interview with Omoju Miller, Founder and CEO of Fimio, a web3 reputation company. Originally from Lagos, Nigeria, Omoju holds a doctoral degree in Computer Science Education from UC Berkeley. Her expertise in machine learning and computational intelligence led her to companies such as Google and GitHub. Omoju also served as a volunteer advisor to the Obama administration's White House Presidential Innovation Fellows.

In this episode, Sam sits down with Omoju to discuss how machine learning can make applications more secure, what the future of the internet looks like, and the fascinating story behind Fimio.

-------------------

"So my first view is, in this future internet we have people, we also have bots, we have machines, we have code doing things. And bots sounds like such a horrible word now. [...] You need to have a level of trust on what that bot is. Everything from the humans to the machines collaborating in this decentralized world, we need to have some kind of reputation attached to each of those nodes. And the reason why we need that reputation is, as the thing scales, it becomes overwhelming to get value from it. You need something to help you filter, to find what you're looking for. Otherwise, you get stuck in that environment where you're just completely overwhelmed and you don't even know what to do. So I think of what I'm doing as just reputation to make this decentralized future slightly more attainable." – Omoju Miller

-------------------

Episode Timestamps:

(00:59): Omoju's inspiration for starting Fimio
(10:27): The future of smart contracts
(28:47): Using mathematics to guarantee the safety of algorithms
(34:34): What led Omoju to building a mathematical product
(51:27): What open source data means to Omoju
(55:38): One question Omoju wishes to be asked
(57:47): Omoju's advice for the audience
(01:00:08): Backstage takeaways with executive producer, Audra Montenegro

-------------------

Links:

LinkedIn - Connect with Omoju
Visit Fimio
This episode features an interview with Yuliia Tkachova, Co-founder and CEO of Masthead Data, an observability platform that catches anomalies in Google BigQuery in real time. She holds degrees in Management Information Systems, Math, Statistics, and Marketing. Prior to Masthead, Yuliia designed complex BI products and solutions powered by ML and utilized by Fortune 500 companies. In this episode, Sam and Yuliia discuss how ML is shaping the future of data analytics, caring about users, and the fundamental human right to privacy.-------------------“We map those errors and anomalies on lineage, helping to understand what upstreams and downstreams are affected, what business users are affected. And that actually speeds up all the troubleshooting from hours to minutes. And this is the ultimate goal where we deliver. Because again, my belief is that if you don't have this lineage piece with mapped anomalies and errors, it's not observability. It's monitoring. [...] What is also very unique to us, because Masthead operates on logs, it's triggered by logs. So, we do support streaming data. Unlike SQL-first solutions, as you can guess. We don't have to run SQL queries to see if they're anomalous, we're triggered by logs. And this is also what sets us apart.” – Yuliia Tkachova-------------------Episode Timestamps:(01:14): What got Yuliia excited about math and statistics(11:31): The basic human right to privacy(18:21): What open source data means to Yuliia(28:00): Yuliia's reason for building a solution focused on privacy and security(38:09): One question Yuliia wishes to be asked(42:21): Yuliia's advice for the audience(44:46): Backstage takeaways with executive producer, Audra Montenegro-------------------Links:LinkedIn - Connect with YuliiaVisit Masthead Data
This episode features an interview with Maxim Fateev, Co-founder and CEO of Temporal, an open source, distributed, and scalable workflow orchestration engine capable of running millions of workflows. He has 20 years of experience architecting mission-critical systems at Uber, Google, Amazon, and Microsoft. In this episode, Sam sits down with Maxim to discuss workflow services, the power behind Temporal, and bringing determinism to highly complex environments.-------------------“[Temporal] has this notion of workflows, which can run for a very long time and handle external events, you can treat them as a durable actor. And they're very good at implementing a lifecycle. For example, you can have an object per model and let this object handle all the events. Like, new data came in, notify this object, this object will go and retrain it. Or, it'll run an activity to periodically check the status. So you can have end-to-end lifecycle implemented fully in Temporal.” – Maxim Fateev-------------------Episode Timestamps:(01:03): What's top of mind for Maxim in workflow services(04:09): What open source data means to Maxim(11:07): Maxim explains his time at AWS and building Cadence at Uber(23:09): Use cases and the community of Temporal(28:26): How Temporal is being used for ML workloads(32:28): One question Maxim wishes to be asked(36:38): Maxim's advice for those working with complex distributed systems(39:11): Backstage takeaways with executive producer, Audra Montenegro-------------------Links:LinkedIn - Connect with MaximTemporal.ioWatch Maxim's talk “Designing a Workflow Engine from First Principles”Replay Conference 2023
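The durable-actor pattern Maxim describes maps fairly directly onto Temporal's SDKs. Below is a minimal sketch (not from the episode) using the Temporal Python SDK; the workflow, signal, and model-retraining activity names are hypothetical, and a real deployment would also register these with a Worker against a Temporal server.

```python
# Sketch of a long-running "durable actor" workflow: one workflow instance per
# model, woken by external events and running an activity to retrain.
from datetime import timedelta
from temporalio import activity, workflow


@activity.defn
async def retrain_model(model_id: str) -> str:
    # Placeholder for a long-running retraining job.
    return f"retrained:{model_id}"


@workflow.defn
class ModelLifecycleWorkflow:
    def __init__(self) -> None:
        self._new_data = False

    @workflow.signal
    def new_data_arrived(self) -> None:
        # External event: fresh data exists for this model.
        self._new_data = True

    @workflow.run
    async def run(self, model_id: str) -> None:
        while True:
            # Sleep durably until a signal arrives, however long that takes.
            await workflow.wait_condition(lambda: self._new_data)
            self._new_data = False
            await workflow.execute_activity(
                retrain_model,
                model_id,
                start_to_close_timeout=timedelta(hours=1),
            )
```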
This episode features a panel discussion with Charna Parkey, a Real-Time AI Product and Strategy leader at DataStax; and Sam Bean, Staff Engineer at You.com. Charna is a co-author and inventor on several patents, including patent-pending work on ML/coordinated feature engine at the edge. Sam helped create the Spark connector to Weaviate, and is passionate about Big Data, Spark, NLP, Hugging Face, and large language models. In this episode, Charna and Sam discuss adapting to user expectations, what's missing in the AI stack, and how to become an advanced citizen in open source.-------------------"We've seen these companies start to better understand that these streaming technologies have a place, whether it's Kafka or Flink or Pulsar, but it's still incredibly difficult to use and we need a different level of abstraction. [...] We're starting to see the stack change so that it becomes more interchangeable of the components and try to sort of raise that layer of abstraction so that we can get these types of models and these types of capabilities to more people." – Charna Parkey"I think that a lot of what you need to adjust to are these, what you were discussing as I call interaction data, you were calling it event data. But these interactions that people have with the internet and trying to find ways to model that in a way that even if your models aren't real-time, having ways to featurize real-time data in a way that's interpretable by a model. [...] I think Spark and Kafka and Delta and all of those things, give you a lot more flexibility now to move in different directions and readjust and I think, pivot what you want to do with the system." – Sam Bean-------------------Episode Timestamps:(01:29): Sam explains his background(03:36): Charna explains her background(18:13): Sam explains the problems You.com is solving for(28:21): Changes in user expectations in the AI-native stack(39:09): Advice for becoming an advanced citizen in open source(47:25): What's missing in the AI stack(54:51): What open source data means to the panelists(58:22): How technologists should prepare for the future(01:03:10): Executive producer, Audra Montenegro's backstage takeaways-------------------Links:LinkedIn - Connect with CharnaVisit DataStaxLinkedIn - Connect with SamVisit You.com
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset. In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack.-------------------“We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information.” – Mikiko Bazeley“I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding providers as the translators to take all of our unstructured data and bring it into vector space into a common machine language, vector search engines are essentially the workhorses that allow us to compute and search over these objects in vectorized format. They're essentially the calculators of the AI stack.” – Zain Hasan“Haystack, I would really position as the kitchen. I need Mikiko to bring the apples. I need Zain to bring the pears. I need Hugging Face or OpenAI to bring the oranges to make a good fruit salad. But, Haystack will provide the spoons and the pans and the knives to make that into something that works together.” – Tuana Celik-------------------Episode Timestamps:(02:08): What open source data means to the panelists(08:22): What interested the panelists about AI/ML(23:20): Mikiko explains Featureform(26:11): Zain explains Weaviate(29:34): Tuana explains deepset(35:11): The panelists discuss how their companies fit into the AI-first ecosystem(44:12): How jobs need to evolve with the AI-native stack(53:45): Executive producer, Audra Montenegro's backstage takeaways-------------------Links:LinkedIn - Connect with MikikoVisit FeatureformLinkedIn - Connect with ZainVisit WeaviateLinkedIn - Connect with TuanaVisit deepsetVisit Data-centric AI
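Zain's "calculators of the AI stack" framing can be made concrete in a few lines of linear algebra. The sketch below (my own illustration, not from the episode) brute-forces the core operation a vector search engine performs, using random stand-in embeddings; engines like Weaviate add approximate indexes, filtering, and persistence on top.

```python
# Brute-force nearest-neighbor search over embeddings with cosine similarity.
import numpy as np

rng = np.random.default_rng(0)
corpus_vectors = rng.normal(size=(1000, 384))   # 1,000 objects, 384-dim embeddings
query_vector = rng.normal(size=384)

# Cosine similarity is the dot product of L2-normalized vectors.
corpus_norm = corpus_vectors / np.linalg.norm(corpus_vectors, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)

scores = corpus_norm @ query_norm
top_k = np.argsort(scores)[::-1][:5]            # indices of the 5 closest objects
print(top_k, scores[top_k])
```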
This special episode of Open||Source||Data features an interview with Patrick McFadin. Patrick has been a distributed systems hacker since he first plugged a modem into his Atari computer. Looking for adventure, he joined the US Navy, working on the Naval Tactical Data System (NTDS), which cemented his love of distributed systems. He is now an Apache Cassandra Committer and the Vice President of Developer Relations at DataStax. Sam catches up with Patrick at Data Day Texas to discuss his book Managing Cloud Native Data on Kubernetes, Cassandra Forward, and the future of Apache Cassandra.-------------------“I can now use my Parquet file in Iceberg or DuckDB, and this is data that I created with Cassandra. And we're not getting to the point where we have to reinvent an entire database. We can just connect the Lego parts together and if they're open, then I don't have these encumbrances. I'm not like, ‘Well, I can connect that if I call a salesperson and get a license.' [...] That's what's exciting to me about Cassandra, the way that the ecosystem is evolving around Cassandra. It's not ‘Cassandra's at the center.' It's just a player. It's at the party.” – Patrick McFadin-------------------Episode Timestamps:(01:06): What open source data means to Patrick(02:11): Patrick discusses his book Managing Cloud Native Data on Kubernetes(10:02): Patrick discusses Cassandra Forward(11:09): The future of Apache Cassandra-------------------Links:LinkedIn - Connect with PatrickCassandra Forward
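The "Lego parts" interoperability Patrick points to is easy to demonstrate: a Parquet file exported from Cassandra (or anywhere else) can be queried by DuckDB directly, with no driver, license, or salesperson in between. A minimal sketch, with a hypothetical file name:

```python
# Query a Parquet file in place with DuckDB, regardless of which system wrote it.
import duckdb

con = duckdb.connect()  # in-memory database
rows = con.execute(
    "SELECT count(*) AS row_count FROM read_parquet('cassandra_export.parquet')"
).fetchall()
print(rows)
```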
This episode features an interview with Denise Gosnell, Principal Product Manager at Amazon Web Services. At AWS, Denise leads product and strategy for Amazon Neptune, a fully managed graph database service. Her career centers on her passion for examining, applying, and advocating for the applications of graph data. Denise has also authored, patented, and spoken on graph theory, algorithms, databases, and applications across all industry verticals. In this episode, Sam sits down with Denise to discuss graph initiatives, the future of developer models, and what Denise learned from hiking the Appalachian Trail.-------------------“We just open sourced something called graph-explorer, which is something for the community by the community, Apache 2.0 license. graph-explorer is a low-code visualization tool. But, the best part about it is that it works for JanusGraph, it works for Blazegraph, it works for all of these graph models that we've talked about, because we've got this divided graph community, but it was written to work with all graphs. [...] Today it's all, ‘Here's your Lego blocks and build one on your own. If you want to go ahead and fork Jupyter Notebook and figure out a way to get that D3 force-directed graph layout to pop up, have fun.' It's the first time that we've had a unified way across graph vendors and graph implementations to have a way to visualize your graph data in one tool that's open source.” – Denise Gosnell-------------------Episode Timestamps:(01:17): What open source data means to Denise(04:27): How Denise got interested in computer science(08:39): Denise's work on graph initiatives(14:30): How Denise's work at LDBC relates to SQL standards(23:43): The future of developer models(29:43): One question Denise wishes to be asked(34:05): Denise's advice for graph practitioners(37:37): Executive producer, Audra Montenegro's backstage takeaways-------------------Links:LinkedIn - Connect with DeniseThe Practitioner's Guide to Graph Data
This episode features an interview with Ben Lorica, Co-founder and Principal of Gradient Flow, a company that provides a wide range of content on data and technology. Ben is an industry expert on data, machine learning, and AI. He is a Technical Advisor for Databricks, a program chair for several data conferences, and he hosts The Data Exchange Podcast. In this episode, Sam and Ben discuss Big Data and the improvements and future opportunities of AI and machine learning.-------------------“The reason I use the word decentralize is because when you try to explain it to someone, let's say you want to train a different model for each user, or region, or sensor, or device. So you can't use necessarily just personalized because recommenders can be personalized, but they're still centralized models.” – Ben Lorica-------------------Episode Timestamps:(01:17): What open source data means to Ben(05:54): What intrigued Ben about Big Data(12:07): What brought Ben to working on Ray(16:15): Ben's opinion on how far AI and ML have come in the last 5 years(26:38): What Ben sees happening in this space in the next 5 years(39:06): What challenges Ben sees in the next 5 years (43:51): One question Ben's always wanted to be asked(44:55): Ben's advice for those starting their open source data adventure(46:34): Executive producer, Audra Montenegro's backstage takeaways-------------------Links:LinkedIn - Connect with BenGradient Flow's NewsletterGradient Flow's 2023 Trends ReportVisit Sky Labs
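Ben's distinction between "personalized" and "decentralized" models is easiest to see in code: instead of one centralized model with a region feature, you fit an independent model per region (or user, sensor, device). A minimal sketch with made-up data, using scikit-learn:

```python
# One independently trained model per region rather than a single shared model.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "region": ["us", "us", "eu", "eu", "eu"],
    "x": [1.0, 2.0, 1.0, 2.0, 3.0],
    "y": [1.1, 2.1, 0.9, 1.8, 2.7],
})

models = {}
for region, group in df.groupby("region"):
    models[region] = LinearRegression().fit(group[["x"]], group["y"])

# Each entity gets predictions from its own model.
print(models["eu"].predict(pd.DataFrame({"x": [4.0]})))
```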
This episode features an interview with Holden Karau, an Open Source Engineer at Netflix. Holden is best known for her work on Apache Spark, her advocacy in the open source software movement, and her creation of a variety of related projects including spark-testing-base. Previously, Holden worked at Big Tech companies like Apple, IBM, and Google as a software engineer and developer advocate. In this episode, Sam sits down with Holden to discuss the data analysis stack, functional programming, and the future of open source software data tooling.-------------------“These things are not one off. We may think that they're one off and they don't need testing, but that's not the reality. When you write something, it needs to be maintainable and as software people, the only real way that I think we know to make something vaguely maintainable is to at least have tests. And these tests need to cover common failure cases that we've experienced. And certainly, there's different approaches to this. There's property based testing, there's golden sets, all kinds of different options. I don't think necessarily any one approach is right or better here, but I think we need something. We need less untitled 5.IPython Notebook running in production, scheduled every hour. That is not a way to run a company.” – Holden Karau-------------------Episode Timestamps:(02:27): What open source data means to Holden(04:37): What interested Holden in mathematical computer science (09:51): What drew Holden to Spark(12:49): What Holden has learned about cognitive systems(20:02): What we need to learn as developers and data specialists(25:28): The future of the data analysis stack(31:21): Improvements in data tooling over the next 5 years(34:25): A question Holden wishes to be asked(40:51): Holden's advice for open source data project committers(43:18): Executive producer, Audra Montenegro's backstage takeaways-------------------Links:LinkedIn - Connect with HoldenBuy Holden's booksVisit Holden's website
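Holden's point about tests applies to any data job, not just Spark. The sketch below is not spark-testing-base; it is a plain pytest-plus-PySpark example of pinning one hypothetical transformation against a known failure case (null emails):

```python
# Unit-test a small Spark transformation with pytest on a local SparkSession.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def drop_null_emails(df):
    # The "job" under test: remove rows with a missing email.
    return df.filter(F.col("email").isNotNull())


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_drop_null_emails(spark):
    df = spark.createDataFrame([("a", "a@example.com"), ("b", None)], ["id", "email"])
    result = drop_null_emails(df).collect()
    assert [row.id for row in result] == ["a"]
```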
This episode features an interview with Tom Baeyens, Co-founder and CTO of Soda, where he oversees the company's product development, software architecture, and technology strategy. He is passionate about open source and committed to building a community where data engineers can succeed using the Soda Data Monitoring Platform. Tom is the inventor of the widely used open source projects jBPM and Apache Activiti. He also co-founded Effektif, a cloud process automation company. In this episode, Sam and Tom discuss the evolution of open source workflow engines, data contracts, and why data quality needs a language approach.-------------------“Where we're heading is what I think is exactly the same as with software engineering in the testing. Test-driven development was a radical new thing back then. But then it turns out, you can much more reliably release software. And this is exactly the same here. If you don't inject data testing, data observability throughout your data stack, then how are you going to trust the data that you put into your machine learning model? This is something that people are realizing, but we're still figuring out the best practices, the dos, the don'ts. We've come a long way, but there's still a way to go before this is as common and as normal as in the test-driven development software engineering space.” - Tom Baeyens-------------------Episode Timestamps:(01:23): What open source data means to Tom(04:34): Tom's motivations for creating jBPM(09:39): What led Tom to building Soda(13:57): Why data quality needs a language approach(19:24): The community of Soda(22:47): The future of Soda as a technology(24:59): A question Tom wishes to be asked(30:24): Tom's advice for engineers who want to leverage data observability tools-------------------Links:LinkedIn - Connect with TomTwitter - Follow TomVisit SodaCL
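In the test-driven spirit Tom describes, a data check can start as small as a unit test over a dataframe. The sketch below deliberately uses plain pandas and pytest rather than Soda's own tooling; the table, columns, and checks are hypothetical.

```python
# Minimal data-quality checks run as an ordinary test in the pipeline.
import pandas as pd


def check_orders(df: pd.DataFrame) -> list:
    """Return human-readable data-quality failures (empty list means pass)."""
    failures = []
    if df.empty:
        failures.append("orders table is empty")
    if df["order_id"].isna().any():
        failures.append("order_id contains nulls")
    if (df["amount"] < 0).any():
        failures.append("amount contains negative values")
    return failures


def test_orders_quality():
    # In a real pipeline this would load the upstream job's output instead.
    df = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 5.5, 2.0]})
    assert check_orders(df) == []
```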
This episode features an interview with Matthew Rocklin, CEO of Coiled, the scalable Dask-based cloud platform. Prior to founding Coiled, Matthew worked on Dask at Anaconda and then NVIDIA where his teams focused on accelerating Dask through parallel computing and GPUs. Matthew is an industry speaker, author, and founding member of Pangeo, whose mission is to develop open source analysis tools for ocean, atmosphere, and climate science. In this episode, Sam sits down with Matthew to discuss enabling edge workers, the future of data science, and the revolution of AI and ML.-------------------“There's all sorts of fun people using these tools and that's the most fun part of this job. You get to learn so much about so many different applications that are all so different and all so fascinating. You were thinking about all these different tools and technologies and I was talking to someone once, it's like, ‘Oh, it's like you're standing on the shoulders of giants.' That's not quite right. There's lots of sort of normal size people all standing on each other's shoulders in like a massive pyramid. [...] Dask was designed to scale up an existing ecosystem. There's a legacy Python ecosystem that'll provide a layer of parallel computing on top of it. You can do that either by rewriting the whole thing, which is not feasible, or you can do it by talking to lots of people and getting them to integrate in interesting, fun ways. That's actually been the fun parts of Dask. I think I've probably talked to every major maintainer group ever. I have worked with them to find out the ways to get everything to work smoothly together. And that's super fun. There's an interesting sort of technical and social hacking that occurs, which I think Python has done pretty well at, historically. Which is why it has success.” – Matthew Rocklin-------------------Episode Timestamps:(00:58): What open source data means to Matthew(03:29): Matthew's motivations behind Python(18:58): How Matthew is enabling edge workers (34:46): What the future of data Python space looks like(39:29): Matthew's advice for the technical data audience-------------------Links:LinkedIn - Connect with MatthewTwitter - Follow MatthewVisit Matthew's WebsiteVisit DaskDask ExamplesVisit CoiledSciPy Mission
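"Scaling up an existing ecosystem" shows in the API itself: Dask dataframes mirror pandas, so familiar code parallelizes across partitions instead of being rewritten. A minimal sketch with a hypothetical file pattern and columns:

```python
# Dask layers lazy, parallel execution on top of the familiar pandas idioms.
import dask.dataframe as dd

df = dd.read_parquet("measurements/*.parquet")       # lazy, one partition per file
mean_temp = df.groupby("station")["temp_c"].mean()   # same groupby idiom as pandas
print(mean_temp.compute())                           # .compute() triggers parallel work
```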
This episode features an interview with Nithya Ruff, Head of Open Source Program Office at Amazon. At Amazon, she drives open source culture, coordination, and engagement with external communities. Prior to Amazon, Nithya spearheaded and grew Open Source Program Offices (OSPOs) for Comcast and Western Digital. She has also served as the Director-At-Large on the Linux Foundation Board since 2016, where she works to advance the mission of building sustainable ecosystems grounded in open collaboration. In this episode, Sam and Nithya discuss OSPOs, how to measure success, and the evolution of the data ecosystem.-------------------“I think if we look at what matters to customers, which is innovation, trust, and being a force for change with open source, then we can really deliver on the metrics that the company cares about.” – Nithya Ruff-------------------Episode Timestamps:(04:02): What open source data means to Nithya(06:29): What interested Nithya about open source software(12:34): What Nithya learned at Western Digital and Comcast that she uses now at Amazon(18:23): What Nithya teaches people in OSPO curriculum(22:06): How the open source data ecosystem has evolved in the last decade(27:44): One question Nithya wishes to be asked(30:37): Nithya's advice for folks who want to create an OSPO-------------------Links:LinkedIn - Connect with NithyaTwitter - Follow NithyaOpen Source Law, Policy and PracticeLinkedIn - Connect with AmazonTwitter - Follow AmazonVisit Amazon
This episode features an interview with Jonathan Beri, Founder & CEO of Golioth, a commercial IoT development platform built for scale. Previously, Jonathan was a Product Manager at Particle, Google/Nest, Magento, and Myspace, where he spent his time building IoT solutions. In this episode, Sam sits down with Jonathan to discuss the concept of digital twins, the future of IoT databases, and how to build a real holodeck.-------------------“I think about IoT when I started at Nest, we had some of the best engineers I've ever worked with. Starting from first principles, defining networking protocols, and introducing new specifications that became parts of the fabric of the internet. And fast forward 10 years later, a lot of that exists now as building blocks. Someone who's not a PhD with a lifetime achievement award from the IETF can go actually design systems that are highly productive, integrated, and enabling. And that's where I get excited. And the through line I think is enabling teams of developers to really create more with their own bare hands. And the technology around it, that is that enabler.” – Jonathan Beri-------------------Episode Timestamps:(01:33): Jonathan's motivation for starting Golioth(08:59): The role of data in IoT(11:01): What is a digital twin and why does it matter?(17:12): The classes of problems Jonathan is trying to solve(20:35): The future of IoT databases in the next five years(31:04): What open source data means to Jonathan(32:24): Jonathan explains how to build a real holodeck(33:42): Jonathan's advice for those excited about industrial data-------------------Links:LinkedIn - Connect with JonathanTwitter - Follow JonathanVisit Jonathan's WebsiteLinkedIn - Connect with GoliothTwitter - Follow GoliothVisit Golioth
This episode features an interview with Indu Navar, CEO and Founder of EverythingALS, a patient-driven non-profit bringing technological innovations and data science to support efforts from care to cure for people with ALS. Indu's impressive career includes being an original member of the WebMD engineering team, where she was instrumental in using emerging technologies to achieve application scalability and performance. In this episode, Sam sits down with Indu to discuss healthcare infrastructure applications, her strategies for providing reliable patient data, and the future of ALS research.-------------------“We said, ‘Okay, we're going to make this a citizen-driven research.' That means patients are going to come and enroll because it's their project and it's patient-driven. So, it's a patient-driven, open innovation. So, once you do open patient-driven, open innovation, now we are the custodians of the data. Patients own the data, so all the data is shared with the patient. That was not done before in any of the research. And so, we give all the data back to the patients. And of course, we give them metrics as well. What was the rate of their speed of their speech? And if they don't want to see it, it's fine, at least they have it. And that data, we are the custodians and as custodians we share the data. So, once we did this model, we got almost close to one thousand people enrolled, consented, within 16 months. As opposed to about 25 people in one year or 50 people in one to two years.” – Indu Navar-------------------Episode Timestamps:(01:19): What's changed for Indu in the last year(05:46): What data infrastructure was like 25 years ago to solve for health outcomes(13:00): Indu's personal experience with healthcare data(16:47): What Indu is looking forward to in ALS research(20:43): How regulatory establishments have shifted in healthcare(30:31): Where Indu wants to see EverythingALS go in the next year(36:28): One question Indu wishes to be asked(38:28): Indu's advice for people inspired by EverythingALS-------------------Links:LinkedIn - Connect with InduTwitter - Follow InduTwitter - Follow EverythingALSVisit EverythingALS
This bonus episode features conversations from season 3 of the Open||Source||Data podcast. In this episode, you'll hear from DeVaris Brown, CEO & Co-founder of Meroxa; Tomer Shiran, Founder & CPO of Dremio; and Erica Brescia, Managing Director at Redpoint Ventures. Sam sat down with each guest to discuss how they're making data more programmable by shifting left. You can listen to the full episodes from DeVaris Brown, Tomer Shiran, and Erica Brescia by clicking the links below.-------------------Episode Timestamps:(00:12): DeVaris Brown(00:42): Tomer Shiran(01:32): Erica Brescia-------------------Links:Listen to DeVaris' episodeListen to Tomer's episodeListen to Erica's episode