Here comes SQL Server 2025! While at Build, Richard chatted with Bob Ward about the preview release of SQL Server 2025. Bob describes SQL Server 2025 as an AI-ready enterprise database with numerous capabilities tailored to your organization's AI needs, including a new vector data type. These capabilities include making REST API calls to Azure OpenAI, Ollama, or OpenAI. This is also the version of SQL Server designed to integrate with Microsoft Fabric through mirroring. There are many more features, even a new icon!
Links: SQL Server 2025 Announcement | JSON Data Type | Ollama
Recorded May 20, 2025
SHOW NOTES
Guest: Andrew Amann
Website: ninetwothree.co
LinkedIn: Andrew Amann
X/Twitter: @andrewamann
Key topics:
- Andrew's pivot from mechanical engineering to AI and software development
- Early experiments with digital transformation, including VBA-coded automations
- Founding 923 Studio and delivering 150+ innovative AI and ML products
- Ideal clients: established brands with innovation labs and funded startups
- How Andrew and his team win business through SEO, conferences, and LinkedIn outreach
- Stabilization and growth goals for 923 Studio in 2025
- How AI can be implemented in enterprise businesses, starting with a knowledge base
- Balancing business growth with a holistic lifestyle for employees
- Andrew's best advice: become an apprentice, learn from both good and bad bosses
- The 923 Studio name: inspired by their early days working 9 PM to 3 AM
- Tips for building AI solutions that truly solve real-world problems
Key Questions
(01:19) Can you tell us a bit about how you ended up where you are today?
(03:15) Who would be your ideal client these days?
(04:03) How do you get in front of these people?
(04:35) Do you have repeat customers?
(05:55) What are some big goals that you'd like to achieve in the next year?
(06:45) Do you use AI within your business?
(08:07) So your goals that you have, how would that affect your business?
(08:55) What do you feel is the number one roadblock from you guys getting there?
(09:20) Can you talk a little bit about successful AI transformation in enterprise companies?
(11:33) Do you have any tips or anything about how to build AI solutions that will solve our real problems like you were talking about?
(12:55) How about running a holistic agency that uses profit to enhance the lifestyle of all employees?
(13:49) What is the best piece of advice that you've ever received?
(15:13) How did you come up with the business name?
(15:54) What's the best advice you have ever given?
(17:54) Is there anything else that you would like to touch on?
(18:02) Where can we go to learn more about you and what you're doing?
Andrew Amann
www.ninetwothree.co
Andrew Amann | LinkedIn
x.com/andrewamann
Virginia Purnell
Funnel & Visibility Specialist
Distinct Digital Marketing
(833) 762-5336
virginia@distinctdigitalmarketing.com
www.distinctdigitalmarketing.com
www.distinctdigitalmarketing.co
Join Jobi George and Tony Le from Weaviate as we discuss how their AI Native Vector Database is powering modern AI applications. We begin with an intro to vector databases, followed by common use cases and how they fit into agentic workflows. We also talk about the advantages of using an AI Native Database from the open source community.
To learn more:
Website - www.weaviate.io
LinkedIn - https://www.linkedin.com/company/weaviate-io/
AWS Marketplace - https://aws.amazon.com/marketplace/seller-profile?id=seller-jxgfug62rvpxs
Developer Community - https://weaviate.io/developers/weaviate/model-providers/aws
Instagram - https://www.instagram.com/weaviate.io/
YouTube - https://www.youtube.com/@Weaviate
AWS Hosts: Nolan Chen & Malini Chatterjee
Email Your Feedback: rethinkpodcast@amazon.com
API docs used to be by and for developers. Now, non-technical people use AI tools to build integrations into our SaaS products. We need to rethink how we communicate with them (and the AI agents that write their code).
The blog post: https://thebootstrappedfounder.com/your-api-documentation-is-not-for-developers-anymore/
The podcast episode: https://tbf.fm/episodes/your-api-documentation-is-not-for-developers-anymore
Check out Podscan, the podcast database that transcribes every podcast episode out there minutes after it gets released: https://podscan.fm
Send me a voicemail on Podline: https://podline.fm/arvid
You'll find my weekly article on my blog: https://thebootstrappedfounder.com
Podcast: https://thebootstrappedfounder.com/podcast
Newsletter: https://thebootstrappedfounder.com/newsletter
My book Zero to Sold: https://zerotosold.com/
My book The Embedded Entrepreneur: https://embeddedentrepreneur.com/
My course Find Your Following: https://findyourfollowing.com
Here are a few tools I use. Using my affiliate links will support my work at no additional cost to you.
- Notion (which I use to organize, write, coordinate, and archive my podcast + newsletter): https://affiliate.notion.so/465mv1536drx
- Riverside.fm (that's what I recorded this episode with): https://riverside.fm/?via=arvid
- TweetHunter (for speedy scheduling and writing Tweets): http://tweethunter.io/?via=arvid
- HypeFury (for massive Twitter analytics and scheduling): https://hypefury.com/?via=arvid60
- AudioPen (for taking voice notes and getting amazing summaries): https://audiopen.ai/?aff=PXErZ
- Descript (for word-based video editing, subtitles, and clips): https://www.descript.com/?lmref=3cf39Q
- ConvertKit (for email lists, newsletters, even finding sponsors): https://convertkit.com?lmref=bN9CZw
Enterprise RAG? CEO of Elastic Ashutosh Kulkarni sat down with our host Daniel Newman at AWS re:Invent 2024. They touched on how Elastic is driving generative AI adoption by empowering developers with tools and partnerships, focusing on efficiency and integration within the AWS ecosystem, and aiming to lead the enterprise AI sector in 2025. Specific highlights covered ⤵️
- Drivers behind Elastic's strong momentum in generative AI adoption and their efforts to help customers accelerate their GenAI projects.
- Insights into the newly announced Elastic AI Ecosystem and its role in aiding developers to navigate AI product choices and integrations more efficiently.
- The significant influence of the AWS partnership on Elastic's strategic directions and key takeaways from this collaboration.
- Elastic's achievements in 2024 within the enterprise tech landscape.
- Future prospects for Elastic and the evolving enterprise AI sector in 2025.
Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
In this conversation, Krish Palaniappan introduces Weaviate, an open-source vector database, and explores its functionality compared to traditional databases. The discussion covers the setup and configuration of Weaviate, hands-on coding examples, and the importance of vectorization and embeddings in AI. The conversation also addresses debugging challenges faced during implementation and concludes with a recap of the key points discussed.
Takeaways
- Weaviate is an open-source vector database designed for AI applications.
- Vector databases differ fundamentally from traditional databases in data retrieval methods.
- Understanding vector embeddings is crucial for leveraging vector databases effectively.
- Hands-on coding examples help illustrate the practical use of Weaviate.
- Python is often preferred for AI-related programming due to its extensive support.
- Debugging is an essential part of working with new technologies like Weaviate.
- Vectorization optimizes database operations for modern CPU architectures.
- Embedding models can encode various types of unstructured data.
- The conversation emphasizes co-learning and exploration of new technologies.
- Future discussions may delve deeper into the capabilities of vector databases.
Chapters
00:00 Introduction to Weaviate and Vector Databases
06:58 Understanding Vector Databases vs Traditional Databases
12:05 Exploring Weaviate: Setup and Configuration
20:32 Hands-On with Weaviate: Coding and Implementation
34:50 Deep Dive into Vectorization and Embeddings
42:15 Debugging and Troubleshooting Weaviate Code
01:20:40 Recap and Future Directions
Purchase the course in one of two ways:
1. Go to https://getsnowpal.com and purchase it on the web.
2. On your phone: (i) if you are an iPhone user, go to http://ios.snowpal.com and watch the course on the go; (ii) if you are an Android user, go to http://android.snowpal.com.
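The retrieval difference highlighted in the takeaways (similarity scoring instead of exact matching) can be sketched in a few lines of dependency-free Python. This is a toy illustration of what any vector database does at its core, not Weaviate's actual implementation; the embeddings below are made-up three-dimensional vectors standing in for real model output:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the
    # vector magnitudes; 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, documents, top_k=2):
    # Brute-force nearest-neighbor search: score every stored vector
    # against the query and keep the best matches. Real vector
    # databases replace this linear scan with an approximate index.
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    return sorted(scored, reverse=True)[:top_k]

# Toy embeddings standing in for model output.
docs = {
    "doc_cats":    [0.9, 0.1, 0.0],
    "doc_dogs":    [0.8, 0.2, 0.1],
    "doc_tax_law": [0.0, 0.1, 0.9],
}
print(nearest([0.85, 0.15, 0.05], docs))
```

A query vector close to the "cats" and "dogs" embeddings ranks those documents first and the unrelated one last, which is exactly the behavior a keyword-based lookup cannot provide.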
In this conversation, Krish Palaniappan interviews Bob van Luijt, CEO of Weaviate, about the emerging field of vector databases and their significance in AI applications. Bob explains the concept of vector embeddings, the evolution of databases from SQL to NoSQL and now to vector databases, and the unique capabilities that vector databases offer for search and recommendation systems. They discuss the importance of developer experience, community feedback, and the future of database technology in the context of AI integration. Bob discusses the evolution of AI development, emphasizing the shift towards AI-native applications and the democratization of AI tools for developers. Bob explains the concept of Retrieval Augmented Generation (RAG) and its significance in enhancing AI applications. They discuss the integration of models with vector databases, the various data storage options available in Weaviate, and the importance of user-friendly documentation for developers. The conversation concludes with insights into the future of AI and the potential for innovative applications.
Takeaways
- Vector databases are designed for AI and machine learning applications.
- Vector embeddings allow for semantic search, improving data retrieval.
- The developer experience is crucial for the adoption of new database technologies.
- Community feedback plays a significant role in shaping database features.
- Vector databases can handle large volumes of data efficiently.
- The architecture of vector databases differs from traditional databases.
- AI-native databases are becoming essential for modern applications.
- Search systems have evolved from keyword-based to semantic-based.
- The future of databases will focus on AI integration and flexibility.
- Understanding vector embeddings is key to leveraging vector databases.
- The early adopters of AI were well-informed and specialized.
- In the post-ChatGPT era, all developers want to build with AI.
- AI-enabled applications can function without the model, while AI-native applications cannot.
- Weaviate focuses on AI-native applications at the core of their technology.
- The developer experience is crucial for building AI applications.
- RAG allows for the integration of generative models with database retrieval.
- Vector databases are essential for machine learning models.
- Weaviate offers multiple data storage options to meet various needs.
- Documentation should be accessible and easy to understand for developers.
- The future of AI applications is about seamless integration and user experience.
Chapters
00:00 Introduction to Vector Databases
02:46 Understanding Vector Embeddings
05:47 The Evolution of Databases: From SQL to Vector
09:08 Use Cases for Vector Databases
11:47 The Role of AI in Vector Databases
14:45 Storage and Indexing in Vector Databases
17:49 Building Applications with Vector Databases
21:01 Community Feedback and Market Trends
23:57 The Future of Database Technology
33:43 The Evolution of AI Development
39:08 Democratizing AI Application Development
41:52 Understanding Retrieval Augmented Generation (RAG)
47:07 Integrating Models with Vector Databases
50:17 Data Storage Options in Weaviate
53:34 Closing Thoughts and Future Directions
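The RAG pattern Bob describes — retrieve relevant context from the database, then ground the generative model's prompt in it — can be sketched end to end in plain Python. Everything here is a toy stand-in: the `embed` function counts letter frequencies instead of calling a real embedding model, and the corpus is two hypothetical sentences; the point is only the retrieve-then-prompt flow:

```python
def embed(text):
    # Stand-in embedding: a 26-dim character-frequency vector. A real
    # RAG stack would call an embedding model here instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha() and ch.isascii():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def similarity(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, corpus, top_k=1):
    # Retrieval step: rank stored documents by similarity to the question.
    q = embed(question)
    ranked = sorted(corpus, key=lambda doc: similarity(q, embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, corpus):
    # Augmentation step: splice the retrieved context into the prompt
    # that would be sent to a generative model.
    context = "\n".join(retrieve(question, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

corpus = [
    "Weaviate stores objects together with their vector embeddings.",
    "The office coffee machine is cleaned every Friday.",
]
print(build_prompt("How does Weaviate store embeddings?", corpus))
```

The relevant sentence wins the similarity ranking and ends up in the prompt while the unrelated one is left out, which is the whole value of grounding generation in retrieval.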
In this episode, Stephen Batifol, a Developer Advocate at Zilliz, discusses his role in fostering the MLOps community, the significance of vector databases like Milvus, and the importance of open source ecosystems. We covered the excitement of developing creative demos, the challenges facing developers in the AI space, and the rapid advancements in LLMs and AI agents. We even learn some trivia about Germany and fax machines!
00:00 Introduction
00:16 Developer Advocacy
01:02 The MLOps Community in Berlin
01:51 Joining Zilliz and Working with Milvus
04:46 Fun and Creative Demos
10:21 Challenges in the AI/ML Community
13:00 The Importance of Open Source
17:02 Upcoming Open Source Summit Presentation
20:14 Future of AI and LLMs
24:24 Conclusion
Guest: Stephen Batifol is a Developer Advocate at Zilliz. He previously worked as a Machine Learning Engineer at Wolt, where he created and worked on the ML Platform, and previously as a Data Scientist at Brevo. Stephen studied Computer Science and Artificial Intelligence. He is a founding member of the MLOps.community Berlin group, where he organizes Meetups and hackathons. He enjoys boxing and surfing.
In this MongoDB Podcast episode, Pankaj Prasad and Vijai Anand from Airwave discuss how they are using MongoDB Atlas to support their RAG/AI chatbot application, and why they decided to migrate to Atlas and replace their existing vector database.
Chroma is an open-source AI application database. Anton Troynikov is a Founder at Chroma. He has a background in computer vision and previously worked at Meta. In this episode Anton speaks with Sean Falconer about Chroma, and the goal of building the memory and storage subsystem for the new computing primitive that AI models represent. The post Chroma's Vector Database with Anton Troynikov appeared first on Software Engineering Daily.
Get a unified solution for secure access management, identity verification, and Zero Trust security for cloud and on-premises resources. The new Microsoft Entra suite integrates five capabilities: Private Access, Internet Access, ID Protection, ID Governance, and Face Check as part of Verified ID Premium, included with Microsoft Entra Suite. With these capabilities, you can streamline user onboarding, enhance security with automated workflows, and protect against threats using Conditional Access policies. See how to reduce security gaps, block lateral attacks, and replace legacy VPNs, ensuring efficient and secure access to necessary resources. Jarred Boone, Identity Security Senior Product Manager, shares how to experience advanced security and management with Microsoft Entra Suite.
► QUICK LINKS:
00:00 - Unified solution with Microsoft Entra Suite
00:38 - Microsoft Entra Private Access
01:39 - Microsoft Entra Internet Access
02:42 - Microsoft Entra ID Protection
03:31 - Microsoft Entra ID Governance
04:18 - Face Check in Verified ID Premium, included with Microsoft Entra Suite
04:52 - How core capabilities work with onboarding process
06:08 - Protect access to resources
07:22 - Control access to internet endpoints
08:05 - Establish policies to dynamically adjust
08:45 - Wrap up
► Link References
Try it out at https://entra.microsoft.com
Watch our related deep dives at https://aka.ms/EntraSuitePlaylist
Check out https://aka.ms/EntraSuiteDocs
► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
• Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
• Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
• Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast
► Keep getting this insider knowledge, join us on social:
• Follow us on Twitter: https://twitter.com/MSFTMechanics
• Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
• Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
• Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
Leverage Azure Cosmos DB for generative AI workloads for automatic scalability, low latency, and global distribution to handle massive data volumes and real-time processing. With support for versatile data models and built-in vector indexing, it efficiently retrieves natural language queries, making it ideal for grounding large language models. Seamlessly integrate with Azure OpenAI Studio for API-level access to GPT models and access a comprehensive gallery of open-source tools and frameworks in Azure AI Studio to enhance your AI applications.
► QUICK LINKS:
00:00 - Azure Cosmos DB for generative AI workloads
00:18 - Versatile Data Models
00:39 - Scalability and performance
01:19 - Global distribution
01:31 - Vector indexing and search
02:07 - Grounding LLMs
02:30 - Wrap up
Check out new AI integrations for your Azure SQL databases. With Retrieval Augmented Generation, you can bridge structured data with generative AI, enhancing natural language queries across applications. With advanced vector-based semantic search, discover precise insights tailored to your data, while Copilot in Azure streamlines troubleshooting and T-SQL query authoring. Optimize workflows, personalize responses, and unlock new levels of efficiency in SQL-driven AI applications. Accelerate performance troubleshooting and complex query authoring tasks with Copilot in Azure. Quickly diagnose database issues and receive expert recommendations for optimization, ensuring optimal performance and reliability. Seamlessly traverse hierarchies within tables and generate intricate queries with ease, saving time and resources. Bob Ward, Azure Principal Architect, shows how to unleash the full potential of your SQL data, driving innovation and intelligence across your applications.
► QUICK LINKS:
00:00 - AI and Azure SQL
01:40 - Using T-SQL for search
02:30 - Using Azure AI Search
03:17 - Vector embeddings and skillsets
04:08 - Connect your SQL data to an AI app
05:44 - Test it in Azure OpenAI Studio playground
07:22 - Combine native JSON data type in SQL
08:30 - Hybrid search
09:56 - Copilot in Azure: Performance troubleshooting
11:11 - Copilot in Azure: Query authoring
12:24 - Permissions
12:40 - Wrap up
► Link References
For building AI apps, check out https://aka.ms/sqlai
Try out new copilot experiences at https://aka.ms/sqlcopilot
Improve search capabilities for your PostgreSQL-backed applications using vector search and embeddings generated in under 10 milliseconds without sending data outside your PostgreSQL instance. Integrate real-time translation, sentiment analysis, and advanced AI functionalities securely within your database environment with Azure Local AI and Azure AI Service. Combine the Azure Local AI extension with the Azure AI extension to maximize the potential of AI-driven features in your applications, such as semantic search and real-time data translation, all while maintaining data security and efficiency. Joshua Johnson, Principal Technical PM for Azure Database for PostgreSQL, demonstrates how you can reduce latency and ensure predictable performance by running locally deployed models, making it ideal for highly transactional applications.
► QUICK LINKS:
00:00 - Improve search for PostgreSQL
01:21 - Increased speed
02:47 - Plain text descriptive query
03:20 - Improve search results
04:57 - Semantic search with vector embeddings
06:10 - Test it out
06:41 - Azure local AI extension with Azure AI Service
07:39 - Wrap up
► Link References
Check out our previous episode on Azure AI extension at https://aka.ms/PGAIMechanics
Get started with Azure Database for PostgreSQL - Flexible Server at https://aka.ms/postgresql
To stay current with all the updates, check out our blog at https://aka.ms/azurepostgresblog
Got a chance to chat with the one and only Jerry Liu, Co-Founder & CEO of LlamaIndex, on The Ravit Show at NVIDIA GTC! We discussed LLMs, RAG, vector databases, enterprise use cases, and much more! #data #ai #nvidiagtc #llamaindex #theravitshow
Jeff Huber is Co-Founder of Chroma, the open source vector database. Their open source project, also called chroma, has 13K stars on GitHub. Chroma has raised $20M from investors including Quiet Ventures and Bloomberg Beta. In this episode, we dig into why vector databases are important for AI applications & why AI workloads are different, how their partnership with LangChain helped with early growth, why data is really the only tool a user has to change modern AI's behavior & more!
Amazon Web Services (AWS) offers pgvector, an open-source extension that brings vector capabilities to PostgreSQL databases for generative AI workloads. Sirish Chandrasekaran, General Manager of Amazon Relational Database Services, explained at Open Source Summit 2024 in Seattle that pgvector allows users to store vector types in Postgres and perform similarity searches, a key feature for generative AI applications. The extension, developed by Andrew Kane and offered by AWS in services like Aurora and RDS, originally used an indexing scheme called IVFFlat but has since added Hierarchical Navigable Small World (HNSW) indexing for improved query performance. HNSW offers a graph-based approach, enhancing the ability to find nearest neighbors efficiently, which is crucial for generative AI tasks. AWS emphasizes customer feedback and continuous innovation in the rapidly evolving field of generative AI, aiming to stay responsive and adaptive to customer needs.
Learn more from The New Stack about vector databases:
Top 5 Vector Database Solutions for Your AI Project
Vector Databases Are Having a Moment – A Chat with Pinecone
Why Vector Size Matters
Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/
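The IVFFlat scheme mentioned above can be illustrated in a few lines: vectors are bucketed under the nearest of a small set of centroids, and a query probes only the closest bucket instead of scanning everything. This is a toy sketch with made-up 2-D points, not pgvector's actual implementation (and note HNSW takes a different, graph-based approach):

```python
import math

def build_ivf(vectors, centroids):
    # Inverted file (IVF): assign every vector to its nearest centroid's bucket.
    buckets = {i: [] for i in range(len(centroids))}
    for vid, vec in vectors.items():
        best = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        buckets[best].append(vid)
    return buckets

def ivf_search(query, vectors, centroids, buckets):
    # Probe only the bucket under the closest centroid: far fewer distance
    # computations than a full scan, but approximate; a true nearest
    # neighbor sitting just across a bucket border can be missed.
    probe = min(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    return min(buckets[probe], key=lambda vid: math.dist(query, vectors[vid]))

vectors = {"a": (0.1, 0.2), "b": (0.2, 0.1), "c": (5.0, 5.1), "d": (5.2, 4.9)}
centroids = [(0.0, 0.0), (5.0, 5.0)]
buckets = build_ivf(vectors, centroids)
print(ivf_search((5.05, 5.08), vectors, centroids, buckets))
```

A query near (5, 5) touches only the two vectors in that region's bucket, which is why partitioned (and, even more so, graph-based) indexes scale where linear scans do not.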
Ensuring the reliability and effectiveness of AI systems remains a significant challenge. In most use cases, generative AI must be combined with access to your company data, a process called retrieval-augmented generation (RAG). The results from generative AI are vastly improved when the model is enhanced with contextual data from your organization. Most practitioners rely on vector embeddings to surface content based on semantic similarity. While this can be a great step forward, achieving good quality requires combining multiple vectors with text and structured data, using machine learning to make final decisions. Vespa.ai, a leading player in the field, enables solutions that do this while keeping latencies suitable for end users, at any scale.
In this episode of the EM360 Podcast, Kevin Petrie, VP of Research at BARC US, speaks to Jon Bratseth, CEO of Vespa.ai, to discuss:
- the opportunity for generative AI in business
- why you need more than vectors to achieve high quality in real systems
- how to create high-quality generative AI solutions at an enterprise scale
Build low-latency recommendation engines with Azure Cosmos DB and Azure OpenAI Service. Elevate user experience with vector-based semantic search, going beyond traditional keyword limitations to deliver personalized recommendations in real-time. With pre-trained models stored in Azure Cosmos DB, tailor product predictions based on user interactions and preferences. Explore the power of augmented vector search for optimized results prioritized by relevance. Kirill Gavrylyuk, Azure Cosmos DB General Manager, shows how to build recommendation systems with limitless scalability, leveraging pre-computed vectors and collaborative filtering for next-level, real-time insights.
► QUICK LINKS:
00:00 - Build a low latency recommendation engine
00:59 - Keyword search
01:46 - Vector-based semantic search
02:39 - Vector search built-in to Cosmos DB
03:56 - Model training
05:18 - Code for product predictions
06:02 - Test code for product prediction
06:39 - Augmented vector search
08:23 - Test code for augmented vector search
09:16 - Wrap up
► Link References
Walk through an example at https://aka.ms/CosmosDBvectorSample
Try out Cosmos DB for MongoDB for free at https://aka.ms/TryC4M
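The "pre-computed vectors" approach described above boils down to folding a user's interactions into a preference vector and ranking unseen items by similarity to it. A minimal sketch, assuming hypothetical two-dimensional item embeddings (real systems store model-produced vectors of hundreds of dimensions and serve them from the database):

```python
def dot(a, b):
    # Dot-product similarity between two vectors.
    return sum(x * y for x, y in zip(a, b))

def user_vector(liked_items, item_vectors):
    # Fold a user's interactions into one preference vector by
    # averaging the embeddings of items they engaged with.
    dims = len(next(iter(item_vectors.values())))
    acc = [0.0] * dims
    for item in liked_items:
        for i, v in enumerate(item_vectors[item]):
            acc[i] += v
    return [v / len(liked_items) for v in acc]

def recommend(liked_items, item_vectors, top_k=1):
    # Rank items the user has not seen by similarity to their
    # preference vector; in production this ranking is a vector
    # search against the stored embeddings.
    profile = user_vector(liked_items, item_vectors)
    candidates = [i for i in item_vectors if i not in liked_items]
    ranked = sorted(candidates, key=lambda i: dot(profile, item_vectors[i]), reverse=True)
    return ranked[:top_k]

# Hypothetical pre-computed item embeddings.
items = {
    "running_shoes": [0.9, 0.1],
    "trail_shoes":   [0.8, 0.2],
    "espresso_cups": [0.1, 0.9],
}
print(recommend(["running_shoes"], items))
```

A user who liked running shoes gets the nearby trail shoes rather than the distant espresso cups, which is the semantic-similarity behavior keyword matching cannot deliver.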
In this episode, Nathan sits down with Andrew Lee, founder of Shortwave, an AI-powered email app that describes itself as “the smartest email app on planet earth.” They discuss Shortwave's RAG stack, how the AI assistant works, the models Shortwave is using, and much more. This is truly a masterclass in RAG app development.
Sponsors:
Oracle Cloud Infrastructure (OCI) is a single platform for your infrastructure, database, application development, and AI needs. OCI has four to eight times the bandwidth of other clouds, offers one consistent price, and nobody does data better than Oracle. If you want to do more and spend less, take a free test drive of OCI at https://oracle.com/cognitive
The Brave Search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference, all while remaining affordable with developer-first pricing. Integrating the Brave Search API into your workflow translates to more ethical data sourcing and more human-representative data sets. Try the Brave Search API for free for up to 2,000 queries per month at https://bit.ly/BraveTCR
Plumb is a no-code AI app builder designed for product teams who care about quality and speed. What is taking you weeks to hand-code today can be done confidently in hours. Check out https://bit.ly/PlumbTCR for early access.
Head to Squad to access global engineering without the headache and at a fraction of the cost: head to choosesquad.com and mention “Turpentine” to skip the waitlist.
TIMESTAMPS:
(00:00) - Intro
(01:07) - Shortwave: AI-Powered Email
(06:22) - Genesis of Shortwave
(15:25) - Sponsors: Oracle / Omneky
(16:46) - Data Processing Pipeline
(21:57) - Choosing a Vector Database
(26:34) - How the AI Assistant Works
(36:24) - Sponsors: Brave / Plumb / Squad
(40:53) - Fine-Tuning & Hosting Models
(45:11) - Email Automation
(47:55) - Email Retrieval
(51:44) - Optimizing AI Performance
(53:20) - Safety & Embedded Ethics
(01:04:51) - Strategic Product Launches
(01:09:58) - Inbox Management
(01:12:24) - The Future of Collaborative AI
(01:18:54) - Scaling Personalized Outreach
(01:26:04) - Closing Thoughts & What's Next
Welcome back to an episode where we're talking vectors, vector databases, and AI with Linpeng Tang, CTO and co-founder of MyScale. MyScale is a super interesting technology. They're combining the best of OLAP databases with vector search. The project started back in 2019, when they forked ClickHouse and adapted it to support vector storage, indexing, and search. The really unique and cool thing is you get the familiarity and usability of SQL with the power of being able to compare the similarity between unstructured data. We think this has really fascinating use cases for analytics well beyond what we're seeing with other vector database technology that's mostly restricted to building RAG models for LLMs. Also, because it's built on ClickHouse, MyScale is massively scalable, which is an area that many of the dedicated vector databases actually struggle with. We cover a lot about how vector databases work, why they decided to build off of ClickHouse, and how they plan to open source the database.
Timestamps
02:29 Introduction
06:22 Value of a Vector Database
12:40 Forking ClickHouse
18:53 Transforming ClickHouse into a SQL vector database
32:08 Data modeling
32:56 What data can be vectorized
38:37 Indexing
43:35 Achieving scale
46:35 Bottlenecks
48:41 MyScale vs other dedicated vector databases
51:38 Going open source
56:04 Closing thoughts
Today's guest is Yujian Tang from Zilliz, one of the big players in the vector database market. This is the first episode in a series of episodes we're doing on vectors and vector databases. We start with the basics: what is a vector? What are vector embeddings? How does vector search work? And why the heck do I even need a vector database? RAG models for customizing LLMs is where vector databases are getting a lot of their use. On the surface, it seems pretty simple, but in reality, there's a lot of tinkering that goes into taking RAG to production. Yujian explains some of the tripwires that you might run into and how to think through those problems. We think you're going to really enjoy this episode.
Timestamps
02:08 Introduction
03:16 What is a Vector?
07:01 How does Vector Search work?
14:08 Why need a Vector database?
15:11 Use Cases
17:37 What is RAG?
20:34 RAG vs fine-tuning
29:51 Measuring Performance
32:32 Is RAG here to stay?
35:43 Milvus
37:17 History of Milvus
47:44 Rapid Fire
X: https://twitter.com/yujian_tang | https://twitter.com/seanfalconer
The rapid progress in AI technology has fueled the evolution of new tools and platforms. One such tool is vector search. If the function of AI is to reason and think, the key to achieving this is not just processing data, but also understanding the relationships among data. Vector databases give AI systems the ability to explore these relationships, draw similarities, and make logical conclusions. Understanding and harnessing the power of vector databases will have a transformative impact on the future of AI. Edo Liberty is optimistic about a future where knowledge can be accessed at any time. Edo is the CEO and Founder of Pinecone, the managed database for large-scale vector search. Previously, he was a Director of Research at AWS and Head of Amazon AI Labs, where he built groundbreaking machine learning algorithms, systems, and services. He also served as Yahoo's Senior Research Director and led the research lab building horizontal ML platforms and improving applications. Satyen and Edo give a crash course on vector databases: what they are, who needs them, how they will evolve, and what role AI plays.--------“We as a community need to learn how to reason and think. We need to teach our machines how to reason and think and talk and read. This is the intelligence and we need to teach them how to know and remember and recall relevant stuff. Which is the capacity of knowing and remembering. The question is, what does it mean to know something? To know something is to be able to digest it, somehow to make the connections. When I ask you something about it, to figure out, ‘Oh, what's relevant? And I know how to bring the right information to bear so that I can reason about it.' 
This ping pong between reasoning and retrieving the right knowledge is what we need to get good at.” – Edo Liberty
--------
Time Stamps:
(03:13) How vector databases revolutionize AI
(14:13) Transforming the digital landscape with semantic search and LLM integration
(28:10) Exploring AI's black box: the challenge of understanding complex systems
(37:02) Striking a balance between AI innovation and thoughtful regulation
(40:01) Satyen's Takeaways
--------
Sponsor: This podcast is presented by Alation. Learn more:
Subscribe to the newsletter: https://www.alation.com/podcast/
Alation's LinkedIn Profile: https://www.linkedin.com/company/alation/
Satyen's LinkedIn Profile: https://www.linkedin.com/in/ssangani/
--------
Links: Connect with Edo on LinkedIn | Watch Edo's TED Talk
Mukai takes a look at a survey of vector DBs, which have been gaining traction amid the AI boom.
This episode features a discussion with Amar Goel, co-founder and CEO of Bito, a company revolutionizing the software development process through AI-driven tools. Focusing on increasing developer productivity and code quality, Bito integrates ChatGPT into IDEs and CLIs, streamlining the coding process. Amar and I discuss Bito's role in enhancing software development, its key features, and the future trajectory of AI in coding. The conversation also touches on job security in the evolving landscape of AI and software development. Founded by seasoned entrepreneurs and technologists in the heart of Silicon Valley in 2020, Bito has emerged as a pioneering AI-driven software development company. Topics Covered: Amar Goel's background and journey to founding Bito. The inception of Bito and its mission to transform software development with Gen AI. Bito's integration with IDEs for enhanced coding efficiency. How Bito AI assists in understanding, writing, testing, and documenting code. The significance of Bito for non-developers, including product managers. Discussion on Bito's AI models and their selection process for specific tasks. The concept and future development of AI agents in Bito for automating coding workflows. Exploring the impact of generative AI on job security and the software industry. Personal insights and plans for Bito in 2024, focusing on developing AI agents for code review and unit testing. ☑️ Web: Bito AI Official Website☑️ Crunchbase: Bito AI Crunchbase Profile ☑️ Support the Channel by buying a coffee? - https://ko-fi.com/gtwgt ☑️ Technology and Topics Mentioned: Bito, Generative AI, ChatGPT, IDE Integration, CLI Tools, Software Development Efficiency, AI-Powered Coding, Code Quality Enhancement, Job Security in AI Era, DevOps, AI Code Completions, AI Agents, Vector Database, AI Model Selection, Automation in Coding. ☑️ Interested in being on #GTwGT? 
Contact via Twitter @GTwGTPodcast or visit https://www.gtwgt.com ☑️ Subscribe to YouTube: https://www.youtube.com/@GTwGTPodcast?sub_confirmation=1 ☑️ Subscribe to Spotify: https://open.spotify.com/show/5Y1Fgl4DgGpFd5Z4dHulVX • Web - https://gtwgt.com • Twitter - https://twitter.com/GTwGTPodcast • Apple Podcasts - https://podcasts.apple.com/us/podcast/id1519439787?mt=2&ls=1 ☑️ Music: https://www.bensound.com
TNS publisher Alex Williams spoke with Ben Kramer, co-founder and CTO of Monterey.ai, and Cole Hoffer, Senior Software Engineer at Monterey.ai, to discuss how the company uses vector search to analyze user voices, feedback, reviews, bug reports, and support tickets from various channels to provide product development recommendations. Monterey.ai connects customer feedback to the development process, bridging customer support and leadership to align with user needs. Figma and Comcast are among the companies using this approach. In this interview, Kramer discussed the challenges of building Large Language Model (LLM) based products, the importance of diverse skills at AI companies, and how Monterey.ai employs Zilliz for vector search, leveraging Milvus, an open-source vector database. Kramer highlighted Zilliz's flexibility, its underlying Milvus technology, and its choice of algorithms for semantic search. The decision to choose Zilliz was influenced by its performance in the company's use case, its privacy and security features, and its ease of integration into their private network. The cloud-managed solution and Zilliz's ability to meet their needs were crucial factors for Monterey.ai, given its small team and preference to avoid managing infrastructure. Learn more from The New Stack about Zilliz and vector database search: Improving ChatGPT's Ability to Understand Ambiguous Prompts; Create a Movie Recommendation Engine with Milvus and Python; Using a Vector Database to Search White House Speeches. Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/
An embedding is a concept in machine learning that refers to a particular representation of text, images, audio, or other information. Embeddings are designed to make data consumable by ML models. However, storing embeddings presents a challenge to traditional databases. Vector databases are designed to solve this problem. Pinecone has developed one of the most popular vector databases. The post Pinecone Vector Database with Marek Galovic appeared first on Software Engineering Daily.
Summary Building machine learning systems and other intelligent applications is a complex undertaking. This often requires retrieving data from a warehouse engine, adding an extra barrier to every workflow. The RelationalAI engine was built as a co-processor for your data warehouse that adds a greater degree of flexibility in the representation and analysis of the underlying information, simplifying the work involved. In this episode CEO Molham Aref explains how RelationalAI is designed, the capabilities that it adds to your data clouds, and how you can start using it to build more sophisticated applications on your data. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Your host is Tobias Macey and today I'm interviewing Molham Aref about RelationalAI and the principles behind it for powering intelligent applications Interview Introduction How did you get involved in machine learning? Can you describe what RelationalAI is and the story behind it? On your site you call your product an "AI Co-processor". Can you explain what you mean by that phrase? What are the primary use cases that you address with the RelationalAI product? What are the types of solutions that teams might build to address those problems in the absence of something like the RelationalAI engine? Can you describe the system design of RelationalAI? How have the design and goals of the platform changed since you first started working on it? For someone who is using RelationalAI to address a business need, what does the onboarding and implementation workflow look like? What is your design philosophy for identifying the balance between automating the implementation of certain categories of application (e.g. NER) vs. providing building blocks and letting teams assemble them on their own? 
What are the data modeling paradigms that teams should be aware of to make the best use of the RKGS platform and Rel language? What are the aspects of customer education that you find yourself spending the most time on? What are some of the most under-utilized or misunderstood capabilities of the RelationalAI platform that you think deserve more attention? What are the most interesting, innovative, or unexpected ways that you have seen the RelationalAI product used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on RelationalAI? When is RelationalAI the wrong choice? What do you have planned for the future of RelationalAI? Contact Info LinkedIn (https://www.linkedin.com/in/molham/) Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site (https://www.themachinelearningpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com (mailto:hosts@themachinelearningpodcast.com) with your story. To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers. 
Links RelationalAI (https://relational.ai/) Snowflake (https://www.snowflake.com/en/) AI Winter (https://en.wikipedia.org/wiki/AI_winter) BigQuery (https://cloud.google.com/bigquery) Gradient Descent (https://en.wikipedia.org/wiki/Gradient_descent) B-Tree (https://en.wikipedia.org/wiki/B-tree) Navigational Database (https://en.wikipedia.org/wiki/Navigational_database) Hadoop (https://hadoop.apache.org/) Teradata (https://www.teradata.com/) Worst Case Optimal Join (https://relational.ai/blog/worst-case-optimal-join-algorithms-techniques-results-and-open-problems) Semantic Query Optimization (https://relational.ai/blog/semantic-optimizer) Relational Algebra (https://en.wikipedia.org/wiki/Relational_algebra) HyperGraph (https://en.wikipedia.org/wiki/Hypergraph) Linear Algebra (https://en.wikipedia.org/wiki/Linear_algebra) Vector Database (https://en.wikipedia.org/wiki/Vector_database) Pathway (https://pathway.com/) Data Engineering Podcast Episode (https://www.dataengineeringpodcast.com/pathway-database-that-thinks-episode-334/) Pinecone (https://www.pinecone.io/) Data Engineering Podcast Episode (https://www.dataengineeringpodcast.com/pinecone-vector-database-similarity-search-episode-189/) The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/)/CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
I sat down with Praveen Viswanath, the Co-Founder and Chief Architect of Alpha Ori Technologies, at the Amazon Web Services (AWS) re:Invent conference. Our conversation was nothing short of enlightening, and I'm thrilled to share it with you all.
* Intro by Slava Kovalevskyi - reference to the Kovalevsky Academy YouTube channel for live broadcasts
* Discussion Topics - Sovereign Shadow Leadership Society on Discord - release of The Beatles' new song 'Now and Then' - Google's opt-out extension for Analytics on major browsers - personal interest in the Indian Scout Motorcycle and Eagle Lights 5 headlight modification - recent changes in OpenAI's leadership - Bing Chat plug-ins and Character AI's speculated acquisition by Google
* Jessica GPT and VectorDB - discussion of Jessica GPT, its prompt engineering, and VectorDB for long-term memory storage - explanation of embeddings as a method to represent text as a vector of numbers
* Text management with Python - AI tools for text-to-vector conversions and vector databases - benefits of a user-friendly vector database such as Pinecone
* New Releases - introduction of new GPTs: Engineering Doc Critics and iPhone Wallpaper Maker - introduction of Career Decision Helper
* Insights on Reducing Phone Usage - strategies including using an Apple Watch, creating phone-free zones, applying a grayscale filter, and reorganizing the home screen
* AWS Services Review - review of PartyRock (AWS) and comparison with ChatGPT for app creation
* Discord link for further discussion: https://discord.gg/T38WpgkHGQ
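The notes above describe embeddings as "a method to represent text as a vector of numbers." As a toy illustration only — real embedding models (the kind stored in Pinecone or other vector databases) are neural networks, and the `VOCAB` list here is invented for the example — the simplest possible text-to-vector conversion is a bag-of-words count:

```python
from collections import Counter

# Hypothetical fixed vocabulary; real models learn dense dimensions instead.
VOCAB = ["vector", "database", "music", "song"]

def embed(text):
    # Map text to a fixed-length vector: one count per vocabulary word.
    # Counter returns 0 for words that don't appear, so every vector has len(VOCAB) entries.
    counts = Counter(text.lower().split())
    return [counts[w] for w in VOCAB]

print(embed("vector database vector"))  # prints [2, 1, 0, 0]
```

The point of the real thing is that semantically similar texts land on nearby vectors even when they share no words, which bag-of-words cannot do; that gap is exactly what learned embedding models close.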
Jeff Huber (@jeffreyhuber, Founder/CEO @trychroma) talks vector databases, data integration, RAG, embedded models, and AI integration.
SHOW: 771
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
NEW TO CLOUD? CHECK OUT - "CLOUDCAST BASICS"
SHOW SPONSORS:
Datadog Kubernetes Solution: Maximum Visibility into Container Environments. Start monitoring the health and performance of your container environment with a free 14-day Datadog trial. Listeners of The Cloudcast will also receive a free Datadog T-shirt.
CloudZero – Cloud Cost Visibility and Savings. CloudZero provides immediate and ongoing savings with 100% visibility into your total cloud spend.
SHOW NOTES:
Chroma Homepage
Chroma Reaches Major Milestone
Chroma GitHub
Topic 1 - Welcome to the show. Tell us a little bit about your background, and what brought you to create Chroma.
Topic 2 - Our audience, like many out there, is learning AI as fast as they can. So, let's start with a few basic concepts. What is a vector database, and why is it important to AI?
Topic 3 - Does this only help foundational models? What goes into the integration into a model, and can this be used with any model?
Topic 4 - Is RAG (Retrieval-Augmented Generation) possible without a vector database?
Topic 5 - What are the trade-offs between fine-tuning a model (adding the data in) vs. RAG (keeping the data external)?
Topic 6 - What else is needed to create an “AI Stack”?
Topic 7 - Let's talk about Chroma. First off, there is the OSS project, which has been a huge success: over 3 million downloads, 9.5k GitHub stars, and inclusion in some 10k-plus projects. Tell everyone a little bit about Chroma and what makes it different. You recently also announced a major milestone: over one million instances running.
FEEDBACK?
Email: show at the cloudcast dot net
Twitter: @thecloudcastnet
Buzzcast - Keep up to date on the latest podcasting tech & news with the folks at Buzzsprout! Listen on: Apple Podcasts
Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solutions of many performance bottlenecks and the work that he is doing to shine some light on PostgreSQL to make it easier to understand how to keep it running smoothly. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! Data lakes are notoriously complex. 
For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Your host is Tobias Macey and today I'm interviewing Lukas Fittl about optimizing your database performance and tips for tuning Postgres Interview Introduction How did you get involved in the area of data management? What are the different ways that database performance problems impact the business? 
What are the most common contributors to performance issues? What are the useful signals that indicate performance challenges in the database? For a given symptom, what are the steps that you recommend for determining the proximate cause? What are the potential negative impacts to be aware of when tuning the configuration of your database? How does the database engine influence the methods used to identify and resolve performance challenges? Most of the database engines that are in common use today have been around for decades. How have the lessons learned from running these systems over the years influenced the ways to think about designing new engines or evolving the ones we have today? What are the most interesting, innovative, or unexpected ways that you have seen to address database performance? What are the most interesting, unexpected, or challenging lessons that you have learned while working on databases? What are your goals for the future of database engines? Contact Info LinkedIn (https://www.linkedin.com/in/lfittl/) @LukasFittl (https://twitter.com/LukasFittl) on Twitter Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links PGAnalyze (https://pganalyze.com/) Citus Data (https://www.citusdata.com/) Podcast Episode (https://www.dataengineeringpodcast.com/citus-data-with-ozgun-erdogan-and-craig-kerstiens-episode-13/) ORM == Object Relational Mapper (https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping) N+1 Query (https://docs.sentry.io/product/issues/issue-details/performance-issues/n-one-queries/) Autovacuum (https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM) Write-ahead Log (https://en.wikipedia.org/wiki/Write-ahead_logging) pg_stat_io (https://pgpedia.info/p/pg_stat_io.html) random_page_cost (https://postgresqlco.nf/doc/en/param/random_page_cost/) pgvector (https://github.com/pgvector/pgvector) Vector Database (https://en.wikipedia.org/wiki/Vector_database) Ottertune (https://ottertune.com/) Podcast Episode (https://www.dataengineeringpodcast.com/ottertune-database-performance-optimization-episode-197/) Citus Extension (https://github.com/citusdata/citus) Hydra (https://github.com/hydradatabase/hydra) Clickhouse (https://clickhouse.tech/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) MyISAM (https://en.wikipedia.org/wiki/MyISAM) MyRocks (http://myrocks.io/) InnoDB (https://en.wikipedia.org/wiki/InnoDB) Great Expectations (https://greatexpectations.io/) Podcast Episode (https://www.dataengineeringpodcast.com/great-expectations-data-contracts-episode-352) OpenTelemetry (https://opentelemetry.io/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! 
This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Data projects are notoriously complex. With multiple stakeholders to manage across varying backgrounds and toolchains, even simple reports can become unwieldy to maintain. Miro is your single pane of glass where everyone can discover, track, and collaborate on your organization's data. I especially like the ability to combine your technical diagrams with data documentation and dependency mapping, allowing your data engineers and data consumers to communicate seamlessly about your projects. Find simplicity in your most complex projects with Miro. Your first three Miro boards are free when you sign up today at dataengineeringpodcast.com/miro (https://www.dataengineeringpodcast.com/miro). That's three free boards at dataengineeringpodcast.com/miro (https://www.dataengineeringpodcast.com/miro). Your host is Tobias Macey and today I'm interviewing Tanya Bragin about her views on the database products market Interview Introduction How did you get involved in the area of data management? What are the aspects of the database market that keep you interested as a VP of product? How have your experiences at Elastic informed your current work at Clickhouse? 
What are the main product categories for databases today? What are the industry trends that have the most impact on the development and growth of different product categories? Which categories do you see growing the fastest? When a team is selecting a database technology for a given task, what are the types of questions that they should be asking? Transactional engines like Postgres, SQL Server, Oracle, etc. were long used as analytical databases as well. What is driving the broad adoption of columnar stores as a separate environment from transactional systems? What are the inefficiencies/complexities that this introduces? How can the database engine used for analytical systems work more closely with the transactional systems? When building analytical systems there are numerous moving parts with intricate dependencies. What is the role of the database in simplifying observability of these applications? What are the most interesting, innovative, or unexpected ways that you have seen Clickhouse used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on database products? What are your predictions for the future of the database market? Contact Info LinkedIn (https://www.linkedin.com/in/tbragin/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) Elastic (https://www.elastic.co/) OLAP (https://en.wikipedia.org/wiki/Online_analytical_processing) OLTP (https://en.wikipedia.org/wiki/Online_transaction_processing) Graph Database (https://en.wikipedia.org/wiki/Graph_database) Vector Database (https://en.wikipedia.org/wiki/Vector_database) Trino (https://trino.io/) Presto (https://prestodb.io/) Foreign data wrapper (https://wiki.postgresql.org/wiki/Foreign_data_wrappers) dbt (https://www.getdbt.com/) Podcast Episode (https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/) OpenTelemetry (https://opentelemetry.io/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/tabular-iceberg-lakehouse-tables-episode-363) Parquet (https://parquet.apache.org/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
With the rise of GenAI, LLMs are now accessible to everyone. They start with a very easy learning curve that grows more complicated the deeper you go. But not all models are created equal. It's critical to design effective prompts so users stay focused and have context that will drive how productive the model is. In this episode, Matthew Lynley, Founding Writer of Supervised, delivers a crash course on LLMs. From the basics of what they are, to vector databases, to trends in the market, you'll learn everything about LLMs that you've always wanted to know. Matthew has spent the last decade reporting on the tech industry at publications like Business Insider, The Wall Street Journal, BuzzFeed News, and TechCrunch. He founded the AI newsletter, Supervised, with the goal of helping readers understand the implications of new technologies and the team building it. Satyen and Matt discuss the inspiration behind Supervised, LLMs, and the rivalry between Databricks and Snowflake.--------“This idea of, ‘How does an LLM work?' I think, the second you touch one for the first time, you get it right away. Now, there's an enormous level of intricacy and complication once you go a single step deeper, which is the differences between the LLMs. How do you think about crafting the right prompt? Knowing that they can go off the rails really fast if you're not careful, and the whole network of tools that are associated on top of it. But, when you think from an education perspective, the education really only starts when you are talking to people that are like, ‘This is really cool. I've tried it, it's awesome. It's cool as hell. But how can I use it to improve my business?' Then it starts to get complicated. Then you have to start understanding how expensive is OpenAI? How do you integrate it? Do I go closed source or open source? The learning curve starts off very, very, very easy because you can get it right away. 
Then, it quickly becomes one of the hardest possible products to understand once you start trying to dig into it.” – Matthew Lynley--------Time Stamps:*(04:21): The genesis of Supervised*(11:34): The LLM learning curve*(21:35): Time to build a vector database?*(31:55): Open source vs. proprietary LLMs *(41:35): Snowflake/Databricks overlap*(47:47): Satyen's Takeaways--------SponsorThis podcast is presented by Alation.Learn more:* Subscribe to the newsletter: https://www.alation.com/podcast/* Alation's LinkedIn Profile: https://www.linkedin.com/company/alation/* Satyen's LinkedIn Profile: https://www.linkedin.com/in/ssangani/--------LinksRead SupervisedConnect with Matthew on LinkedIn
Highlights from this week's conversation include:Chang's background and journey with Pandas (6:26)The persisting challenges in data collection and preparation (10:37)The resistance to change in using Python for data workflows (13:05)AI hype and its impact (14:09)The success and evolution of Pandas as a data framework (20:04)The vision for a next-generation data infrastructure (26:48)LanceDB's file and table format (34:35)Trade-offs in the Lance format (42:45)Introducing the Vector Database (46:30)The split between production and serving databases (51:14)The importance of unstructured data and multimodal use cases (57:01)The potential of generative AI and the balance between value and hype (1:01:34)Changing expectations of interacting with information systems (1:13:53)Final thoughts and takeaways (1:15:32)The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Summary Large language models have gained a substantial amount of attention in the area of AI and machine learning. While they are impressive, there are many applications where they are not the best option. In this episode Piero Molino explains how declarative ML approaches allow you to make the best use of the available tools across use cases and data formats. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Your host is Tobias Macey and today I'm interviewing Piero Molino about the application of declarative ML in a world being dominated by large language models Interview Introduction How did you get involved in machine learning? Can you start by summarizing your perspective on the effect that LLMs are having on the AI/ML industry? In a world where LLMs are being applied to a growing variety of use cases, what are the capabilities that they still lack? How does declarative ML help to address those shortcomings? The majority of current hype is about commercial models (e.g. GPT-4). Can you summarize the current state of the ecosystem for open source LLMs? For teams who are investing in ML/AI capabilities, what are the sources of platform risk for LLMs? What are the comparative benefits of using a declarative ML approach? What are the most interesting, innovative, or unexpected ways that you have seen LLMs used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on declarative ML in the age of LLMs? When is an LLM the wrong choice? What do you have planned for the future of declarative ML and Predibase? Contact Info LinkedIn (https://www.linkedin.com/in/pieromolino/?locale=en_US) Website (https://w4nderlu.st/) Closing Announcements Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. 
Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. Visit the site (https://www.themachinelearningpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com (mailto:hosts@themachinelearningpodcast.com) with your story. To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Links Predibase (https://predibase.com/) Podcast Episode (https://www.themachinelearningpodcast.com/predibase-declarative-machine-learning-episode-4) Ludwig (https://ludwig.ai/latest/) Podcast.__init__ Episode (https://www.pythonpodcast.com/ludwig-horovod-distributed-declarative-deep-learning-episode-341/) Recommender Systems (https://en.wikipedia.org/wiki/Recommender_system) Information Retrieval (https://en.wikipedia.org/wiki/Information_retrieval) Vector Database (https://thenewstack.io/what-is-a-real-vector-database/) Transformer Model (https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)) BERT (https://en.wikipedia.org/wiki/BERT_(language_model)) Context Windows (https://www.linkedin.com/pulse/whats-context-window-anyway-caitie-doogan-phd/) LLAMA (https://en.wikipedia.org/wiki/LLaMA) The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/)/CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
Improve the information retrieval process, so you have the optimal set of grounding data needed to generate useful AI responses. See how Azure Cognitive Search combines different search strategies out of the box and at scale - so you don't have to. Keyword search—matches the exact words to search your grounding data Vector search—focuses on conceptual similarity, where the app is using part of the dialogue to retrieve grounding information Hybrid approach—combines both keyword and vector searches Semantic ranking—to boost precision, a re-ranking step can re-score the top results using a larger deep learning ranking model Pablo Castro, Azure AI Distinguished Engineer, shows how to improve the quality of generative AI responses using Azure Cognitive Search. ► QUICK LINKS: 00:00 - How to generate high-quality AI responses 01:06 - Improve quality of generative AI outputs 02:56 - Why use vectors? 04:57 - Vector Database 06:56 - Apply to real data and text 08:00 - Vectors using images 09:40 - Keyword search 11:22 - Hybrid retrieval 12:18 - Re-ranking 14:18 - Wrap up ► Link References Sample code available at https://aka.ms/MechanicsVectors Complete Copilot sample app at https://aka.ms/EntGPTSearch Evaluation details for relevance quality at https://aka.ms/ragrelevance ► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. 
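The hybrid approach described above can be sketched concretely. A common way to merge a keyword ranking with a vector ranking is Reciprocal Rank Fusion; the toy Python sketch below is illustrative only (it assumes two already-ranked lists of document ids, not the actual Azure Cognitive Search API):

```python
def rrf_fuse(keyword_ranking, vector_ranking, k=60):
    """Merge two ranked lists of document ids with Reciprocal Rank Fusion.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list it appears in; k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranking in (keyword_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# A document ranked well in *both* searches can beat one that only
# appears in a single list:
fused = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])
```

Note how "b" (ranked 2nd and 1st) outscores "a" (ranked 1st and 3rd): fusion rewards agreement between the keyword and vector views of relevance, which is the intuition behind hybrid retrieval.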
• Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries • Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog • Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast ► Keep getting this insider knowledge, join us on social: • Follow us on Twitter: https://twitter.com/MSFTMechanics • Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/ • Enjoy us on Instagram: https://www.instagram.com/msftmechanics/ • Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
Since the release of OpenAI's ChatGPT in late 2022, various industries have been actively exploring its applications. Madhukar Kumar, CMO of SingleStore, discussed his experiments with large language models (LLMs) in this podcast episode with TNS host Heather Joslyn. He mentioned a specific LLM called Gorilla, which is trained on APIs and can generate API calls for specific tasks. Kumar also talked about SingleStore Now, an AI conference, where they plan to teach attendees how to build generative AI applications from scratch, focusing on enterprise applications. Kumar highlighted a limitation with current LLMs - they are "frozen in time" and cannot provide real-time information. To address this, a method called "retrieval augmented generation" (RAG) has emerged. SingleStore is using RAG to keep LLMs updated. In this approach, a user query is first matched with up-to-date enterprise data to provide context, and then the LLM is tasked with generating answers based on this context. This method aims to prevent the generation of factually incorrect responses and relies on storing data as vectors for efficient real-time processing, which SingleStore enables. This strategy ensures that LLMs can provide current and contextually accurate information, making AI applications more reliable and responsive for enterprises. Learn more from The New Stack about LLMs and SingleStore:
Top 5 Large Language Models and How to Use Them Effectively
Using ChatGPT for Questions Specific to Your Company Data
6 Reasons Private LLMs Are Key for Enterprises
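The RAG flow Kumar describes can be shown in miniature. Everything below is illustrative: the tiny hand-made vectors stand in for a real embedding model, and the in-memory list stands in for a vector database such as SingleStore:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, top_k=2):
    """Step 1 of RAG: find the stored documents most similar to the query."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:top_k]]

def build_prompt(question, context_docs):
    """Step 2: ground the model's answer in the retrieved enterprise data."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")

# Toy store: in practice the vectors come from an embedding model.
store = [
    {"text": "Q3 revenue was $12M.",     "vec": [0.9, 0.1, 0.0]},
    {"text": "The office is in Austin.", "vec": [0.0, 0.2, 0.9]},
    {"text": "Q3 churn fell to 2%.",     "vec": [0.8, 0.3, 0.1]},
]
query_vec = [0.85, 0.2, 0.05]   # pretend embedding of "How did Q3 go?"
prompt = build_prompt("How did Q3 go?", retrieve(query_vec, store))
```

The resulting prompt contains only the Q3-related facts, not the unrelated office location, which is how RAG steers the model toward current, contextually accurate answers.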
Highlights from this week's conversation include:How music impacted Bob's data journey (3:16)Music's relationship with creativity and innovation (11:38)The genesis of Weaviate and the idea of vector databases (14:09)The joy of creation (19:02)OLAP Databases (22:21)The progression of complexity in databases (24:31)Vector database (29:23)Scaling suboptimal algorithms (34:34)The future of vector space representation (35:51)Databases' role in different industries (39:14)The brute force approach to discovery (45:57)Retrieval augmented generation (51:26)How the generative model interacts with the database (57:55)Final thoughts and takeaways (1:03:20)The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Allen and Mark revisit a conversation from episode 146 where they discovered Google had a Vector Database. Now, several months later, Allen has done some work with the Google Cloud Vertex AI Matching Engine and incorporated it into LangChain JS. We discuss why this is important, and how it fits into the overall landscape of LLMs and MLs today. (And Allen has a little announcement towards the end.) More info: * Matching Engine: https://cloud.google.com/vertex-ai/docs/matching-engine/overview * LangChain JS: https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/googlevertexai
Summary Generative AI has unlocked a massive opportunity for content creation. There is also an unfulfilled need for experts to be able to share their knowledge and build communities. Illumidesk was built to take advantage of this intersection. In this episode Greg Werner explains how they are using generative AI as an assistive tool for creating educational material, as well as building a data driven experience for learners. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) You shouldn't have to throw away the database to build with fast-changing data. 
You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! Your host is Tobias Macey and today I'm interviewing Greg Werner about building IllumiDesk, a data-driven and AI powered online learning platform Interview Introduction How did you get involved in the area of data management? Can you describe what Illumidesk is and the story behind it? What are the challenges that educators and content creators face in developing and maintaining digital course materials for their target audiences? How are you leaning on data integrations and AI to reduce the initial time investment required to deliver courseware? What are the opportunities for collecting and collating learner interactions with the course materials to provide feedback to the instructors? What are some of the ways that you are incorporating pedagogical strategies into the measurement and evaluation methods that you use for reports? What are the different categories of insights that you need to provide across the different stakeholders/personas who are interacting with the platform and learning content? Can you describe how you have architected the Illumidesk platform? How have the design and goals shifted since you first began working on it? 
What are the strategies that you have used to allow for evolution and adaptation of the system in order to keep pace with the ecosystem of generative AI capabilities? What are the failure modes of the content generation that you need to account for? What are the most interesting, innovative, or unexpected ways that you have seen Illumidesk used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Illumidesk? When is Illumidesk the wrong choice? What do you have planned for the future of Illumidesk? Contact Info LinkedIn (https://www.linkedin.com/in/wernergreg/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links Illumidesk (https://www.illumidesk.com/) Generative AI (https://en.wikipedia.org/wiki/Generative_artificial_intelligence) Vector Database (https://www.pinecone.io/learn/vector-database/) LTI == Learning Tools Interoperability (https://en.wikipedia.org/wiki/Learning_Tools_Interoperability) SCORM (https://scorm.com/scorm-explained/) XAPI (https://xapi.com/overview/) Prompt Engineering (https://en.wikipedia.org/wiki/Prompt_engineering) GPT-4 (https://en.wikipedia.org/wiki/GPT-4) LLama (https://en.wikipedia.org/wiki/LLaMA) Anthropic (https://www.anthropic.com/) FastAPI (https://fastapi.tiangolo.com/) LangChain (https://www.langchain.com/) Celery (https://docs.celeryq.dev/en/stable/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
MLOps Coffee Sessions #167 with Maxime Beauchemin, Treating Prompt Engineering More Like Code. // Abstract Promptimize is an innovative tool designed to scientifically evaluate the effectiveness of prompts. Discover the advantages of open-sourcing the tool and its relevance, drawing parallels with test suites in software engineering. Uncover the increasing interest in this domain and the necessity for transparent interactions with language models. Delve into the world of prompt optimization, deterministic evaluation, and the unique challenges in AI prompt engineering. // Bio Maxime Beauchemin is the founder and CEO of Preset, a series B startup supporting and commercializing the Apache Superset project. Max was the original creator of Apache Airflow and Apache Superset when he was at Airbnb. Max has over a decade of experience in data engineering, at companies like Lyft, Airbnb, Facebook, and Ubisoft. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Max's first MLOps Podcast episode: https://go.mlops.community/KBnOgN Test-Driven Prompt Engineering for LLMs with Promptimize blog: https://maximebeauchemin.medium.com/mastering-ai-powered-product-development-introducing-promptimize-for-test-driven-prompt-bffbbca91535 Test-Driven Prompt Engineering for LLMs with Promptimize podcast: https://talkpython.fm/episodes/show/417/test-driven-prompt-engineering-for-llms-with-promptimize Taming AI Product Development Through Test-driven Prompt Engineering // Maxime Beauchemin // LLMs in Production Conference lightning talk: https://home.mlops.community/home/videos/taming-ai-product-development-through-test-driven-prompt-engineering --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: 
https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Max on LinkedIn: https://www.linkedin.com/in/maximebeauchemin/ Timestamps: [00:00] Max introducing the Apache Superset project at Preset [01:04] Max's preferred coffee [01:16] Airflow creator [01:45] Takeaways [03:53] Please like, share, and subscribe to our MLOps channels! [04:31] Check Max's first MLOps Podcast episode [05:20] Promptimize [06:10] Interaction with API [08:27] Deterministic evaluation of SQL queries and AI [12:40] Figuring out the right edge cases [14:17] Reaction with Vector Database [15:55] Promptimize Test Suite [18:48] Promptimize vision [20:47] The open-source blood [23:04] Impact of open source [23:18] Dangers of open source [25:25] AI-Language Models Revolution [27:36] Test-driven design [29:46] Prompt tracking [33:41] Building Test Suites as Assets [36:49] Adding new prompt cases to new capabilities [39:32] Monitoring speed and cost [44:07] Creating own benchmarks [46:19] AI feature adding more value to the end users [49:39] Perceived value of the feature [50:53] LLMs costs [52:15] Specialized model versus Generalized model [56:58] Fine-tuning LLMs use cases [1:02:30] Classic Engineer's Dilemma [1:03:46] Build exciting tech that's available [1:05:02] Catastrophic forgetting [1:10:28] Prompt-driven development [1:13:23] Wrap up
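The test-suite parallel Max draws can be illustrated with a toy sketch. None of the names below come from Promptimize itself; `fake_llm`, `prompt_cases`, and `run_suite` are hypothetical stand-ins that only show the idea of deterministic, pass/fail prompt evaluation:

```python
# Hypothetical names throughout -- this is NOT the Promptimize API,
# just a sketch of treating prompts like a software test suite.

def fake_llm(prompt):
    """Stand-in for a real model call; deterministic so the demo is repeatable."""
    if "capital of France" in prompt:
        return "The capital of France is Paris."
    return "I don't know."

prompt_cases = [
    # (prompt, evaluation function returning True on success)
    ("What is the capital of France?", lambda out: "Paris" in out),
    ("What is the capital of Mars?",   lambda out: "don't know" in out),
]

def run_suite(llm, cases):
    """Run every prompt case and report a success rate, like a test runner."""
    passed = sum(1 for prompt, check in cases if check(llm(prompt)))
    return passed / len(cases)

score = run_suite(fake_llm, prompt_cases)
```

Tracking `score` across prompt revisions is the point: a prompt change that helps one case but silently breaks another shows up as a regression, just as it would in a code test suite.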
Summary Real-time data processing has steadily been gaining adoption due to advances in the accessibility of the technologies involved. Despite that, it is still a complex set of capabilities. To bring streaming data in reach of application engineers Matteo Pelati helped to create Dozer. In this episode he explains how investing in high performance and operationally simplified streaming with a familiar API can yield significant benefits for software and data teams together. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) Modern data teams are using Hex to 10x their data impact. Hex combines a notebook style UI with an interactive report builder. This allows data teams to both dive deep to find insights and then share their work in an easy-to-read format to the whole org. In Hex you can use SQL, Python, R, and no-code visualization together to explore, transform, and model data. Hex also has AI built directly into the workflow to help you generate, edit, explain and document your code. The best data teams in the world such as the ones at Notion, AngelList, and Anthropic use Hex for ad hoc investigations, creating machine learning models, and building operational dashboards for the rest of their company. Hex makes it easy for data analysts and data scientists to collaborate together and produce work that has an impact. Make your data team unstoppable with Hex. 
Sign up today at dataengineeringpodcast.com/hex (https://www.dataengineeringpodcast.com/hex) to get a 30-day free trial for your team! Your host is Tobias Macey and today I'm interviewing Matteo Pelati about Dozer, an open source engine that includes data ingestion, transformation, and API generation for real-time sources Interview Introduction How did you get involved in the area of data management? Can you describe what Dozer is and the story behind it? What was your decision process for building Dozer as open source? As you note in the documentation, Dozer has overlap with a number of technologies that are aimed at different use cases. What was missing from each of them and the center of their Venn diagram that prompted you to build Dozer? In addition to working in an interesting technological cross-section, you are also targeting a disparate group of personas. Who are you building Dozer for and what were the motivations for that vision? What are the different use cases that you are focused on supporting? What are the features of Dozer that enable engineers to address those uses, and what makes it preferable to existing alternative approaches? Can you describe how Dozer is implemented? How have the design and goals of the platform changed since you first started working on it? What are the architectural "-ilities" that you are trying to optimize for? What is involved in getting Dozer deployed and integrated into an existing application/data infrastructure? How can teams who are using Dozer extend/integrate with Dozer? What does the development/deployment workflow look like for teams who are building on top of Dozer? What is your governance model for Dozer and balancing the open source project against your business goals? What are the most interesting, innovative, or unexpected ways that you have seen Dozer used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dozer? When is Dozer the wrong choice? 
What do you have planned for the future of Dozer? Contact Info LinkedIn (https://www.linkedin.com/in/matteopelati/?originalSubdomain=sg) @pelatimtt (https://twitter.com/pelatimtt) on Twitter Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links Dozer (https://getdozer.io/) Data Robot (https://www.datarobot.com/) Netflix Bulldozer (https://netflixtechblog.com/bulldozer-batch-data-moving-from-data-warehouse-to-online-key-value-stores-41bac13863f8) CubeJS (http://cube.dev/) Podcast Episode (https://www.dataengineeringpodcast.com/cubejs-open-source-headless-data-analytics-episode-248/) JVM == Java Virtual Machine (https://en.wikipedia.org/wiki/Java_virtual_machine) Flink (https://flink.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/apache-flink-with-fabian-hueske-episode-57/) Airbyte (https://airbyte.com/) Podcast Episode (https://www.dataengineeringpodcast.com/airbyte-open-source-data-integration-episode-173/) Fivetran (https://www.fivetran.com/) Podcast Episode 
(https://www.dataengineeringpodcast.com/fivetran-data-replication-episode-93/) Delta Lake (https://delta.io/) Podcast Episode (https://www.dataengineeringpodcast.com/delta-lake-data-lake-episode-85/) LMDB (http://www.lmdb.tech/doc/) Vector Database (https://thenewstack.io/what-is-a-real-vector-database/) LLM == Large Language Model (https://en.wikipedia.org/wiki/Large_language_model) Rockset (https://rockset.com/) Podcast Episode (https://www.dataengineeringpodcast.com/rockset-serverless-analytics-episode-101/) Tinybird (https://www.tinybird.co/) Podcast Episode (https://www.dataengineeringpodcast.com/tinybird-analytical-api-platform-episode-185) Rust Language (https://www.rust-lang.org/) Materialize (https://materialize.com/) Podcast Episode (https://www.dataengineeringpodcast.com/materialize-streaming-analytics-episode-112/) RisingWave (https://www.risingwave.com/) DuckDB (https://duckdb.org/) Podcast Episode (https://www.dataengineeringpodcast.com/duckdb-in-process-olap-database-episode-270/) DataFusion (https://docs.rs/datafusion/latest/datafusion/) Polars (https://www.pola.rs/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
In this episode of The New Stack Makers podcast, the focus is on the challenges of handling unstructured data in today's data-rich world and the potential solutions offered by vector databases and vector searches. The use of relational databases is limited when dealing with text, images, and voice data, which makes it difficult to uncover meaningful relationships between different data points. Vector databases, which facilitate vector searches, have become increasingly popular for addressing this issue. They allow organizations to store, search, and index data that would be challenging to manage in traditional databases. Semantic search and Large Language Models have sparked interest in vector databases, providing developers with new possibilities. Beyond standard applications like information search and recommendation bots, vector searches have also proven useful in combating copyright infringement. Social media companies like Facebook have pioneered this approach by using vectors to check copyrighted media uploads. Vector databases excel at finding similarities between data objects, as they operate in vector spaces and perform approximate nearest neighbor searches, sacrificing a bit of accuracy for increased efficiency. However, developers need to understand their specific use cases and the scale of their applications to make the most of vector databases and search. Frank Liu, the director of operations at Zilliz, advised listeners to educate themselves about vector databases, vector search, and machine learning to leverage the existing ecosystem of tools effectively. 
One notable indexing strategy for vectors is Hierarchical Navigable Small Worlds (HNSW), a graph-based algorithm created by Yury Malkov, a distinguished software engineer at VerSE Innovation who also joined us along with Nils Reimers of Cohere. It's crucial to view vector databases and search as additional tools in the developer's toolbox rather than replacements for existing database management systems or document databases. The ultimate goal is to build applications focused on user satisfaction, not just optimizing clicks. To delve deeper into the topic and explore the gaps in current tooling, check out the full episode. Listen on Podurama. Learn more about vector databases at thenewstack.io:
Vector Databases: What Devs Need to Know about How They Work
Vector Primer: Understand the Lingua Franca of Generative AI
How Large Language Models Fuel the Rise of Vector Databases
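The accuracy-for-efficiency trade-off mentioned above is easiest to see next to the exact baseline. The brute-force search below scores the query against every stored vector, which is exactly the linear cost that approximate graph indexes like HNSW avoid by visiting only a small fraction of the data; this sketch is illustrative and is not an HNSW implementation:

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def brute_force_knn(query, vectors, k=3):
    """Exact nearest neighbors: score *every* stored vector against the
    query. Cost grows linearly with collection size, which is why vector
    databases trade a little recall for approximate indexes instead."""
    scored = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return scored[:k]

random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
query = vectors[42]            # a vector is its own nearest neighbor
neighbors = brute_force_knn(query, vectors)
```

An approximate index answers the same query while touching far fewer of the 1,000 vectors, occasionally missing a true neighbor; that is the "bit of accuracy" sacrificed for efficiency.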
Bob van Luijt has gone from building websites in middle school to raising tens of millions of dollars for his tech startup. The venture, Weaviate, has acquired funding from top-tier investors like Index Ventures, Cortical Ventures, Zetta Venture Partners, and Battery Ventures.
Summary
Batch vs. streaming is a long-running debate in the world of data integration and transformation. Proponents of the streaming paradigm argue that stream processing engines can easily handle batched workloads, but the reverse isn't true. The batch world has been the default for years because of the complexities of running a reliable streaming system at scale. In order to remove that barrier, the team at Estuary have built the Gazette and Flow systems from the ground up to resolve the pain points of other streaming engines, while providing an intuitive interface for data and application engineers to build their streaming workflows. In this episode David Yaffe and Johnny Graettinger share the story behind the business and technology and how you can start using it today to build a real-time data lake without all of the headache.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management
RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack)
Your host is Tobias Macey and today I'm interviewing David Yaffe and Johnny Graettinger about using streaming data to build a real-time data lake and how Estuary gives you a single path to integrating and transforming your various sources
Interview
Introduction
How did you get involved in the area of data management?
Can you describe what Estuary is and the story behind it?
Stream processing technologies have been around for around a decade. How would you characterize the current state of the ecosystem?
What was missing in the ecosystem of streaming engines that motivated you to create a new one from scratch?
With the growth in tools that are focused on batch-oriented data integration and transformation, what are the reasons that an organization should still invest in streaming?
What is the comparative level of difficulty and support for these disparate paradigms?
What is the impact of continuous data flows on dags/orchestration of transforms?
What role do modern table formats have on the viability of real-time data lakes?
Can you describe the architecture of your Flow platform?
What are the core capabilities that you are optimizing for in its design?
What is involved in getting Flow/Estuary deployed and integrated with an organization's data systems?
What does the workflow look like for a team using Estuary?
How does it impact the overall system architecture for a data platform as compared to other prevalent paradigms?
How do you manage the translation of poll vs. push availability and best practices for API and other non-CDC sources?
What are the most interesting, innovative, or unexpected ways that you have seen Estuary used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Estuary?
When is Estuary the wrong choice?
What do you have planned for the future of Estuary?
Contact Info
Dave Y (mailto:dave@estuary.dev)
Johnny G (mailto:johnny@estuary.dev)
Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning.
Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers
Links
Estuary (https://estuary.dev)
Try Flow Free (https://dashboard.estuary.dev/register)
Gazette (https://gazette.dev)
Samza (https://samza.apache.org/)
Flink (https://flink.apache.org/)
Podcast Episode (https://www.dataengineeringpodcast.com/apache-flink-with-fabian-hueske-episode-57/)
Storm (https://storm.apache.org/)
Kafka Topic Partitioning (https://www.openlogic.com/blog/kafka-partitions)
Trino (https://trino.io/)
Avro (https://avro.apache.org/)
Parquet (https://parquet.apache.org/)
Fivetran (https://www.fivetran.com/)
Podcast Episode (https://www.dataengineeringpodcast.com/fivetran-data-replication-episode-93/)
Airbyte (https://www.dataengineeringpodcast.com/airbyte-open-source-data-integration-episode-173/)
Snowflake (https://www.snowflake.com/en/)
BigQuery (https://cloud.google.com/bigquery)
Vector Database (https://learn.microsoft.com/en-us/semantic-kernel/concepts-ai/vectordb)
CDC == Change Data Capture (https://en.wikipedia.org/wiki/Change_data_capture)
Debezium (https://debezium.io/)
Podcast Episode (https://www.dataengineeringpodcast.com/debezium-change-data-capture-episode-114/)
MapReduce (https://en.wikipedia.org/wiki/MapReduce)
Netflix DBLog (https://netflixtechblog.com/dblog-a-generic-change-data-capture-framework-69351fb9099b)
JSON-Schema (http://json-schema.org/)
The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
### Apéro
* A new job in AI: C3PO -> https://podcast.ausha.co/le-podcast-des-eclaireurs/c3po-ce-metier-qui-va-sauver-les-journalistes

### GenAI
* Microsoft JARVIS / HuggingGPT -> https://analyticsindiamag.com/microsoft-jarvis-is-the-path-towards-agi/
* New ways to manage your data in ChatGPT -> https://openai.com/blog/new-ways-to-manage-your-data-in-chatgpt

### Vector DB
* What is a Vector Database? -> What is a Vector Database? - Zilliz Vector database learn
* Chroma, the AI-native open-source embedding database -> https://www.trychroma.com

### Cloud
* BigQuery augmentation, continued ->

### Databases
* MySQL Locking Reads -> https://vincepergolizzi.com/programming/2020/09/02/mysql-locking-reads.html
* ClickHouse -> https://affini-tech.com/blog/clickhouse/
This is a recap of the top 10 posts on Hacker News on May 5th, 2023.
(00:41): Show HN: The HN Recap – AI generated daily HN podcast
Original post: https://news.ycombinator.com/item?id=35831177
(01:57): Htmx Is the Future
Original post: https://news.ycombinator.com/item?id=35829733
(03:37): Build your own private WireGuard VPN with PiVPN
Original post: https://news.ycombinator.com/item?id=35828046
(04:58): What is a Vector Database? (2021)
Original post: https://news.ycombinator.com/item?id=35826929
(06:28): Element is one of fourteen messaging apps blocked by Central Indian Government
Original post: https://news.ycombinator.com/item?id=35826946
(07:33): The EARN IT bill is back. We've killed it twice, let's do it again
Original post: https://news.ycombinator.com/item?id=35826088
(08:37): Journalist writes about discovering she'd been surveilled by TikTok
Original post: https://news.ycombinator.com/item?id=35829294
(09:50): Unlimiformer: Long-Range Transformers with Unlimited Length Input
Original post: https://news.ycombinator.com/item?id=35832802
(11:30): Releasing 3B and 7B RedPajama
Original post: https://news.ycombinator.com/item?id=35836411
(12:37): U.S. Hits Z-Library with New Domain Name Seizures
Original post: https://news.ycombinator.com/item?id=35828664
(13:54): When “free forever” means “free for the next 4 months”
Original post: https://news.ycombinator.com/item?id=35836541
This is a third-party project, independent from HN and YC. Text and audio generated using AI, by wondercraft.ai. Create your own studio quality podcast with text as the only input in seconds at app.wondercraft.ai. Issues or feedback? We'd love to hear from you: team@wondercraft.ai
This week's episode sponsored by Keep, an open-source alerting tool built by developers, for developers.
Security fixes in Go 1.20.1, 1.19.6, golang.org/x/image, and golang.org/x/image/tiff
Go 1.20.1 changes
Go 1.19.6 changes
Labstack Echo v4.10.1
TinyGo 0.27.0 changes
Golang Weekly newsletter
Purego, a library for calling C functions from Go without Cgo.
Accepted proposal: New standard library package based on x/exp/slices
Go Blog: All your comparable types by Robert Griesemer
Go-Redis is now an official Redis client
plumber v2.1.0
Reddit discussion: What are the best alternatives to gorilla session?
FOSDEM'23 talk: Our Mad Journey of Building a Vector Database in Go
Interview with Daniel Nephin
On GitHub
gotestsum
gotest.tools
Find us at cupogo.dev.
00:00 Introduction
01:11 Yaniv's background and intro to Searchium & GSI
04:12 Ways to consume the APU acceleration for vector search
05:39 Power consumption dimension in vector search
07:40 Place of the platform in terms of applications, use cases and developer experience
12:06 Advantages of APU Vector Search Plugins for Elasticsearch and OpenSearch compared to their own implementations
17:54 Everyone needs to save: the economic profile of the APU solution
20:51 Features and ANN algorithms in the solution
24:23 Consumers most interested in dedicated hardware for vector search vs SaaS
27:08 Vector Database or a relevance oriented application?
33:51 Where to go with vector search?
42:38 How Vector Search fits into Search
48:58 Role of the human in the AI loop
58:05 The missing bit in the AI/ML/Search space
1:06:37 Magical WHY question
1:09:54 Announcements
- Searchium vector search: https://searchium.ai/
- Dr. Avidan Akerib, founder behind the APU technology: https://www.linkedin.com/in/avidan-akerib-phd-bbb35b12/
- OpenSearch benchmark for performance tuning: https://betterprogramming.pub/tired-of-troubleshooting-idle-search-resources-use-opensearch-benchmark-for-performance-tuning-d4277c9f724
- APU KNN plugin for OpenSearch: https://towardsdatascience.com/bolster-opensearch-performance-with-5-simple-steps-ca7d21234f6b
- Multilingual and Multimodal Search with Hardware Acceleration: https://blog.muves.io/multilingual-and-multimodal-vector-search-with-hardware-acceleration-2091a825de78
- Muves talk at Berlin Buzzwords, where we have utilized GSI APU: https://blog.muves.io/muves-at-berlin-buzzwords-2022-3150eef01c4
- Not All Vector Databases are made equal: https://towardsdatascience.com/milvus-pinecone-vespa-weaviate-vald-gsi-what-unites-these-buzz-words-and-what-makes-each-9c65a3bd0696
Episode on YouTube: https://youtu.be/EerdWRPuqd4
Podcast design: Saurabh Rai: https://twitter.com/srvbhr
Frank Liu is Director of Operations & ML Architect at Zilliz, the company behind Milvus, an open source vector database. We discuss their recent VLDB paper (“A Cloud Native Vector Database Management System”) that describes recent updates to Milvus, as well as vector databases and vector search in general.Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.
About Ram
Dr. Ram Sriharsha held engineering, product management, and VP roles at the likes of Yahoo, Databricks, and Splunk. At Yahoo, he was both a principal software engineer and then research scientist; at Databricks, he was the product and engineering lead for the unified analytics platform for genomics; and, in his three years at Splunk, he played multiple roles including Sr Principal Scientist, VP Engineering and Distinguished Engineer.
Links Referenced:
Pinecone: https://www.pinecone.io/
XKCD comic: https://www.explainxkcd.com/wiki/index.php/1425:_Tasks
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: This episode is sponsored in part by our friends at Chronosphere. Tired of observability costs going up every year without getting additional value? Or being locked into a vendor due to proprietary data collection, querying, and visualization? Modern-day, containerized environments require a new kind of observability technology that accounts for the massive increase in scale and attendant cost of data. With Chronosphere, choose where and how your data is routed and stored, query it easily, and get better context and control. 100% open-source compatibility means that no matter what your setup is, they can help. Learn how Chronosphere provides complete and real-time insight into ECS, EKS, and your microservices, wherever they may be, at snark.cloud/chronosphere. That's snark.cloud/chronosphere.
Corey: This episode is brought to you in part by our friends at Veeam. Do you care about backups? Of course you don't. Nobody cares about backups. Stop lying to yourselves!
You care about restores, usually right after you didn't care enough about backups. If you're tired of the vulnerabilities, costs, and slow recoveries when using snapshots to restore your data, assuming you even have them at all living in AWS-land, there is an alternative for you. Check out Veeam, that's V-E-E-A-M, for secure, zero-fuss AWS backup that won't leave you high and dry when it's time to restore. Stop taking chances with your data. Talk to Veeam. My thanks to them for sponsoring this ridiculous podcast.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted guest episode is brought to us by our friends at Pinecone and they have given their VP of Engineering and R&D over to suffer my various slings and arrows, Ram Sriharsha. Ram, thank you for joining me.
Ram: Corey, great to be here. Thanks for having me.
Corey: So, I was immediately intrigued when I wound up seeing your website, pinecone.io, because it says right at the top—at least as of this recording—in bold text, “The Vector Database.” And if there's one thing that I love, it is using things that are not designed to be databases as databases, or inappropriately referring to things—be they JSON files or senior engineers—as databases as well. What is a vector database?
Ram: That's a great question. And we do use this term correctly, I think. You can think of customers of Pinecone as having all the data management problems that they have with traditional databases; the main difference is twofold. One is there is a new data type, which is vectors. Vectors, you can think of them as arrays of floats, floating point numbers, and there is a new pattern of use cases, which is search. And what you're trying to do in vector search is you're looking for the nearest, the closest vectors to a given query. So, these two things fundamentally put a lot of stress on traditional databases. So, it's not like you can take a traditional database and make it into a vector database.
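Ram's definition, vectors as arrays of floats and vector search as finding the closest vectors to a query, can be sketched in a few lines of plain Python. This is a toy, exact brute-force illustration of the idea, not Pinecone's engine (which relies on approximate indexes to scale); the document IDs and vectors are invented for the example:

```python
import math

def cosine_similarity(a, b):
    # Similarity of two equal-length float vectors: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, corpus, k=1):
    # Exact (brute-force) nearest-neighbor search: score every vector.
    # Real vector databases replace this O(n) scan with approximate indexes.
    scored = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.2],
    "doc-c": [0.85, 0.2, 0.1],
}
print(nearest([1.0, 0.0, 0.0], corpus, k=2))  # ['doc-a', 'doc-c']
```

Real embeddings have hundreds to thousands of dimensions, which is exactly why the naive scan above stops being viable and a dedicated index is needed.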
That is why we coined this term vector database and we are building a new type of vector database. But fundamentally, it has all the database challenges on a new type of data and a new query pattern.
Corey: Can you give me an example of what, I guess, an idealized use case would be, of what the data set might look like and what sort of problem a vector database would solve?
Ram: A very great question. So, one interesting thing is there's many, many use cases. I'll just pick the most natural one, which is text search. So, if you're familiar with Elastic or any other traditional text search engines, you have pieces of text, you index them, and the indexing that you do is traditionally an inverted index, and then you search over this text. And what this sort of search engine does is it matches for keywords. So, if it finds a keyword match between your query and your corpus, it's going to retrieve the relevant documents. And this is what we call text search, right, or keyword search. You can do something similar with technologies like Pinecone, but what you do here is instead of searching over text, you're searching over vectors. Now, where do these vectors come from? They come from taking deep-learning models, running your text through them, and these generate these things called vector embeddings. And now, you're taking a query as well, running it through deep-learning models, generating these query embeddings, and looking for the closest record embeddings in your corpus that are similar to the query embeddings. This notion of proximity in this space of vectors tells you something about semantic similarity between the query and the text. So suddenly, you're going beyond keyword search into semantic similarity. An example is if you had a whole lot of text data, and maybe you were looking for ‘soda,' and you were doing keyword search. Keyword search will only match on variations of soda.
It will never match ‘Coca-Cola' because Coca-Cola and soda have nothing to do with each other.
Corey: Or Pepsi, or pop, as they say in the American Midwest.
Ram: Exactly.
Corey: Yeah.
Ram: Exactly. However, semantic search engines can actually match the two because they're matching for intent, right? If they find in this piece of text enough intent to suggest that soda and Coca-Cola or Pepsi or pop are related to each other, they will actually match those and score them higher. And you're very likely to retrieve those sorts of candidates that traditional search engines simply cannot. So, this is a canonical example of what's called semantic search, and it's known to be done better by these other vector search engines. There are also other examples in, say, image search. Just if you're looking for near-duplicate images, you can't even do this today without a technology like vector search.
Corey: What is the, I guess, translation or conversion process of an existing dataset into something that a vector database could use? Because you mentioned an array of floats was the natural vector datatype. I don't think I've ever seen even the most arcane markdown implementation that expected people to wind up writing in arrays of floats. What does that look like? How do you wind up, I guess, internalizing or ingesting existing bodies of text for your example use case?
Ram: Yeah, this is a very great question. This used to be a very hard problem and what has happened over the last several years in deep-learning literature, as well as in deep-learning as a field itself, is that there have been these large, publicly trained models, examples will be OpenAI, examples will be the models that are available in Hugging Face like Cohere, and a large number of these companies have come forward with very well trained models through which you can pass pieces of text and get these vectors.
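The soda/Coca-Cola contrast above can be made concrete. In the sketch below, the "embeddings" are hand-picked three-float vectors invented for illustration (a real system would get them from a trained model, as Ram describes next); keyword search finds no overlap between the two terms, while vector similarity ranks them close:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def keyword_match(query, document):
    # Keyword search reduced to its essence: exact token overlap only.
    return bool(set(query.lower().split()) & set(document.lower().split()))

# Toy "embeddings": invented vectors standing in for a real model's output,
# arranged so that related drinks land near each other in the space.
embeddings = {
    "soda":      [0.9, 0.8, 0.1],
    "coca-cola": [0.85, 0.75, 0.15],
    "laptop":    [0.1, 0.05, 0.9],
}

print(keyword_match("soda", "coca-cola on sale"))           # False: no shared token
print(cosine(embeddings["soda"], embeddings["coca-cola"]))  # high (~0.99)
print(cosine(embeddings["soda"], embeddings["laptop"]))     # low (~0.20)
```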
So, you no longer have to actually train these sorts of models, you don't have to really have the expertise to deeply figure out how to take pieces of text and build these embedding models. What you can do is just take a stock model: if you're familiar with OpenAI, you can just go to OpenAI's homepage and pick a model that works for you, Hugging Face models, and so on. There's a lot of literature to help you do this. Sophisticated customers can also do something called fine-tuning, which is built on top of these models to fine-tune for their use cases. The technology is out there already, there's a lot of documentation available. Even Pinecone's website has plenty of documentation to do this. Customers of Pinecone do this [unintelligible 00:07:45], which is they take pieces of text, run them through either these pre-trained models or through fine-tuned models, get the series of floats which represent them, the vector embeddings, and then send it to us. So, that's the workflow. The workflow is basically a machine-learning pipeline that either takes a pre-trained model and passes these pieces of text or images or what have you through it, or actually has a fine-tuning step in it.
Corey: Is that ingest process something that not only benefits from but also requires the use of a GPU or something similar to that to wind up doing the in-depth, very specific type of expensive math for data ingestion?
Ram: Yes, very often these run on GPUs. Sometimes, depending on budget, you may have compressed models or smaller models that run on CPUs, but most often they do run on GPUs. Most often, we actually find people make just API calls to services that do this for them. So, very often, people are actually not deploying these GPU models themselves, they are maybe making a call to Hugging Face's service, or to OpenAI's service, and so on. And by the way, these companies also democratized this quite a bit. It was much, much harder to do this before they came around.
Corey: Oh, yeah.
I mean, I'm reminded of the old XKCD comic from years ago, which was, “Okay, I want to give you a picture. And I want you to tell me it was taken within the boundaries of a national park.” Like, “Sure. Easy enough. Geolocation information is attached. It'll take me two hours.” “Cool. And I also want you to tell me if it's a picture of a bird.” “Okay, that'll take five years and a research team.” And sure enough, now we can basically do that. The future is now and it's kind of wild to see that unfolding in a human-perceivable timespan on these things. But I guess my question now is, so that is what a vector database does? What does Pinecone specifically do? It turns out that as much as I wish it were otherwise, not a lot of companies are founded on, “Well, we have this really neat technology, so we're just going to be here, well, in a foundational sense to wind up ensuring the uptake of that technology.” No, no, there's usually a monetization model in there somewhere. Where does Pinecone start, where does it stop, and how does it differentiate itself from typical vector databases? If such a thing could be said to exist yet.
Ram: Such a thing doesn't exist yet. We were the first vector database, so in a sense, building this infrastructure, scaling it, and making it easy for people to operate it in a SaaS fashion is our primary core product offering. On top of that, we very recently started also enabling people who actually have raw text to not just be able to get value from these vector search engines and so on, but also be able to take advantage of traditional what we call keyword search or sparse retrieval and do a combined search better, in Pinecone. So, there's value-add on top of this that we do, but I would say the core of it is building a SaaS managed platform that allows people to actually easily store data, scale it, query it in a way that's very hands-off and doesn't require a lot of tuning or operational burden on their side.
This is, like, our core value proposition.
Corey: Got it. There's something to be said for making something accessible when previously it had only really been available to people who completed the Hello World tutorial—which generally resembled a doctorate at Berkeley or Waterloo or somewhere else—and turning it into something that's fundamentally, click the button. Where on that, I guess, spectrum of evolution do you find that Pinecone is today?
Ram: Yeah. So, you know, prior to Pinecone, we didn't really have this notion of a vector database. For several years, we've had libraries that are really good that you can pre-train on your embeddings, generate this thing called an index, and then you can search over that index. There is still a lot of work to be done even to deploy that and scale it and operate it in production and so on. Even that was not being, kind of, offered as a managed service before. What Pinecone does which is novel is you no longer have to have this pre-training be done by somebody, you no longer have to worry about when to retrain your indexes, what to do when you have new data, what to do when there are deletions, updates, and the usual data management operations. You can just think of this as, like, a database that you just throw your data in. It does all the right things for you, you just worry about querying. This has never existed before, right? This is—it's not even like we are trying to make the operational part of something easier. It is that we are offering something that hasn't existed before, at the same time, making it operationally simple. So, we're solving two problems, which is we're building a better database that hasn't existed before. So, if you really had these sorts of data management problems and you wanted to build an index that was fresh, that you didn't have to super manually tune for your own use cases, that simply couldn't have been done before.
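The upsert/delete/query lifecycle Ram describes, a database you throw vectors into and it stays fresh under mutation, can be sketched as a toy in-memory index. This is an illustration of the interface, not Pinecone's actual API or architecture: here every query is an exact scan of the live vectors, where a real vector database keeps an approximate index fresh under these mutations. The metadata filter (`where`) mirrors the "only the Fall shopping line" kind of constraint mentioned later:

```python
import math

class ToyVectorIndex:
    """In-memory stand-in for a vector index supporting upsert, delete,
    and filtered nearest-neighbor query."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, doc_id, vector, metadata=None):
        # Insert or overwrite: the very next query sees this vector.
        self._vectors[doc_id] = (vector, metadata or {})

    def delete(self, doc_id):
        self._vectors.pop(doc_id, None)

    def query(self, vector, k=1, where=None):
        # Optional metadata filter narrows candidates before ranking.
        def sim(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(x * x for x in b)))
        candidates = [
            (doc_id, sim(vector, vec))
            for doc_id, (vec, meta) in self._vectors.items()
            if where is None or all(meta.get(f) == v for f, v in where.items())
        ]
        candidates.sort(key=lambda kv: kv[1], reverse=True)
        return [doc_id for doc_id, _ in candidates[:k]]

index = ToyVectorIndex()
index.upsert("a", [1.0, 0.0], {"line": "fall"})
index.upsert("b", [0.9, 0.1], {"line": "spring"})
index.delete("b")
print(index.query([1.0, 0.0], k=1, where={"line": "fall"}))  # ['a']
```

The hard part Ram points at is precisely that approximate indexes, unlike this exact scan, do not trivially stay correct under constant upserts, deletes, and filters.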
But at the same time, we are doing all of this in a cloud-native fashion; it's easy for you to just operate and not worry about.
Corey: You've said that this hasn't really been done before, but this does sound like it is more than passingly familiar specifically to the idea of nearest neighbor search, which has been around since the '70s in a bunch of different ways. So, how is it different? And let me of course ask my follow-up to that right now: why is this even an interesting problem to start exploring?
Ram: This is a great question. First of all, nearest neighbor search is one of the oldest forms of machine learning. It's been known for decades. There's a lot of literature out there, there are a lot of great libraries, as I mentioned in passing before. All of these problems have primarily focused on static corpuses. So basically, you have a set of some amount of data, you want to create an index out of it, and you want to query it. A lot of literature has focused on this problem. Even there, once you go from a small number of dimensions to a large number of dimensions, things become computationally far more challenging. So, traditional nearest neighbor search actually doesn't scale very well. What do I mean by a large number of dimensions? Today, deep-learning models that produce image representations typically operate in 2048 dimensions of photos [unintelligible 00:13:38] dimensions. Some of the OpenAI models are even 10,000 dimensional and above. So, these are very, very large dimensions. Most of the literature prior to maybe even less than ten years back has focused on less than ten dimensions. So, it's like a scale apart in dealing with small dimensional data versus large dimensional data. But even as of a couple of years back, there hasn't been enough, if any, focus on what happens when your data rapidly evolves. For example, what happens when people add new data? What happens if people delete some data? What happens if your vectors get updated?
These aren't just theoretical problems; they happen all the time. Customers of ours face this all the time. In fact, the classic example is in recommendation systems, where user preferences change all the time, right, and you want to adapt to that, which means your user vectors change constantly. When even these sorts of things change constantly, you want your index to reflect it because you want your queries to catch on to the most recent data. [unintelligible 00:14:33] have to reflect the recency of your data. This is a solved problem for traditional databases. Relational databases are great at solving this problem. A lot of work has been done for decades to solve this problem really well. This is a fundamentally hard problem for vector databases and that's one of the core focus areas [unintelligible 00:14:48] painful. Another problem that is hard for these sorts of databases is simple things like filtering. For example, you have a corpus of, say, product images and you want to only look at images that maybe are for the Fall shopping line, right? Seems like a very natural query. Again, databases have known and solved this problem for many, many years. The moment you do nearest neighbor search with these sorts of constraints, it's a hard problem. So, it's just the fact that nearest neighbor search and lots of research in this area has simply not focused on what happens to those sorts of techniques when combined with data management challenges, filtering, and all the traditional challenges of a database. So, when you start doing that, you enter a very novel area to begin with.
Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open-source database. If you're tired of managing open-source Redis on your own, or if you are looking to go beyond just caching and unlocking your data's full potential, these folks have you covered.
Redis Enterprise is the go-to managed Redis service that allows you to reimagine how your geo-distributed applications process, deliver, and store data. To learn more from the experts in Redis how to be real-time, right now, from anywhere, visit snark.cloud/redis. That's snark dot cloud slash R-E-D-I-S.
Corey: So, where's this space going, I guess, is sort of the dangerous but inevitable question I have to ask. Because whenever you talk to someone who is involved in a very early stage of what is potentially a transformative idea, it's almost indistinguishable from someone who is whatever the polite term for being wrapped around their own axle is, in a technological sense. It's almost a form of reverse Schneier's Law: anyone can create an encryption algorithm that they themselves cannot break. So, the possibility that this may come back to bite us in the future if it turns out that this is not potentially the revelation that you see it as, where do you see the future of this going?
Ram: Really great question. The way I think about it is, and the reason why I keep going back to databases and these sorts of ideas is, we have a really great way to deal with structured data and structured queries, right? The evolution of the last maybe 40, 50 years has been to come up with relational databases, come up with SQL engines, come up with scalable ways of running structured queries on large amounts of data. What I feel like this sort of technology does is it takes it to the next level, which is you can actually ask unstructured questions on unstructured data, right? So, even the couple of examples we just talked about, doing near-duplicate detection of images, that's a very unstructured question. What does it even mean to say that two images are nearly duplicates of each other? I couldn't even phrase it as kind of a concrete thing.
I certainly cannot write a SQL statement for it; I cannot even phrase it properly. With these sorts of technologies, with the vector embeddings, with deep learning and so on, you can actually mathematically phrase it, right? The mathematical phrasing is very simple once you have the right representation that understands your image as a vector. Two images are nearly duplicate if they are close enough in the space of vectors. Suddenly you've taken a problem that was even hard to express, let alone compute, and made it precise to express, precise to compute. This is going to happen not just for images, not just for semantic search, it's going to happen for all sorts of unstructured data, whether it's time series, whether it's anomaly detection, whether it's security analytics, and so on. I actually think that fundamentally, a lot of fields are going to get disrupted by this sort of way of thinking about things. We are just scratching the surface here with semantic search, in my opinion.
Corey: What is, I guess, your barometer for success? I mean, if I could take a very cynical point of view on this, it's, “Oh, well, whenever there's a managed vector database offering from AWS.” They'll probably call it Amazon Basics Vector or something like that. Well, it used to be a snarky observation that, “Oh, we're not competing, we're just validating their market.” Lately, with some of their competitive database offerings, there's a lot more truth to that than I suspect AWS would like. Their offerings are nowhere near as robust as what they pretend to be competing against. How far away do you think we are from the larger cloud providers starting to say, “Ah, we got the sense there was money in here, so we're launching an entire service around this?”
Ram: Yeah. I mean, first of all, this is a great question. There's always something that any innovator or disrupter has to be constantly thinking about, especially these days.
I would say that having a multi-year head start in the use cases, in thinking about how this system should even look, what sort of use cases it should [unintelligible 00:19:34], what the operating points for the [unintelligible 00:19:37] database even look like, and how to build something that's cloud-native and scalable, is very hard to replicate. Meaning if you look at what we have already done and kind of tried to base the architecture off of that, you're probably already a couple of years behind us in terms of just where we are at, right, not just in the architecture, but also in the use cases and where this is evolving forward. That said, I think for all of these companies—and Snowflake is a great example of this—Snowflake needn't have existed if Redshift had done a phenomenal job of being cloud-native, right, and kind of done that before Snowflake did it. In hindsight, it seems like it's obvious, but when Snowflake did this, it wasn't obvious that that's where everything was headed. And Snowflake built something that's very technologically innovative, in the sense that it's even now hard to replicate. Plus, it takes a long time to replicate something like that. I think that's where we are at. If Pinecone does its job really well and if we simply execute efficiently, it's very hard to replicate that. So, I'm not super worried about cloud providers in this space, to be honest; I'm more worried about our execution.
Whenever I find folks who are relatively early along in their technological journey being very concerned about, oh, the large cloud provider is going to come crashing in, it feels on some level like their perspective is that they have one weird trick, and they were able to crack that, but they have no defensive moat, because once someone else figures out the trick, well, okay, now we're done. The idea of sustained and lasting innovation in a space, I think, is the more defensible position to take, with the counterargument, of course, that that's a lot harder to find.

Ram: Absolutely. And I think for technologies like this, that's the only solution: if you really want to avoid being disrupted by cloud providers, I think that's the way to go.

Corey: I want to talk a little bit about your own background. Before you wound up as the VP of R&D over at Pinecone, you were in a bunch of similar… I guess, similarly styled roles—if we'll call it that—at Yahoo, Databricks, and Splunk. I'm curious as to what your experience in those companies wound up impressing on you that made you say, “Ah, that's great and all, but you know what's next? That's right, vector databases.” And off you went to Pinecone. What did you see?

Ram: So, first of all, in some way or another, I have been involved in machine learning and systems and the intersection of the two for maybe the last decade and a half. So, it's always been something in between the two, and that's been personally exciting to me. I'm very excited by trying to think about new types of databases, new types of data platforms that really leverage machine learning and data. This has been personally exciting to me.
I obviously learned very different things from different companies. I would say that Yahoo was just learning the cloud to begin with, because prior to joining Yahoo, I wasn't familiar with Silicon Valley cloud companies at that scale, and Yahoo is a big company and there's a lot to learn from there. It was also my first introduction to Hadoop, Spark, and even machine learning, where I really got into machine learning at scale, in online advertising and areas like that, which was a massive scale. I got into that at Yahoo, and it was personally exciting to me because there are very few opportunities where you can work on machine learning at that scale, right?

Databricks was very exciting to me because it was an earlier-stage company than I had been at before. Extremely well run, and I learned a lot from Databricks: just the team, the culture, the focus on innovation, and the focus on product thinking. I joined Databricks as a product manager. I hadn't worn the product manager hat before that, so it was very much a learning experience for me, and I think I learned from some of the best in that area. And even at Pinecone, I carry that forward, thinking about how my learnings at Databricks inform how we should be thinking about products at Pinecone, and so on. So, if I had to pick one company I learned a lot from, I would say it's Databricks. The most [unintelligible 00:23:50].

Corey: I would also like to point out, normally when people say, “Oh, the one company I've learned the most from,” and they pick one of them out of their history, it's invariably the most recent one, but you left there in 2018—

Ram: Yeah.

Corey: —then went to go spend the next three years over at Splunk, where you were a Senior Principal Scientist, a Senior Director and Head of Machine Learning, and then you decided, okay, that's enough hard work. You're going to do something easier and be the VP of Engineering, which is just wild at a company of that scale.

Ram: Yeah.
At Splunk, I learned a lot about management. Managing large teams, managing multiple different teams working on very different areas is something I learned at Splunk. You know, I was at the point in my career where I was right around trying to start my own company. Basically, I had taken enough learnings and I really wanted to do something myself.

That's when Edo—you know, the CEO of Pinecone—and I started talking. We had worked together for many years; we started working together at Yahoo and kept in touch with each other. And we started talking about the sort of problems I was excited about working on, and then I came to realize what he was working on and what Pinecone was doing. And we thought it was a very good fit for the two of us to work together. So, that is how it happened. It happened by chance, as many things do in Silicon Valley, where a lot of things just happen by network and chance. That's what happened in my case: I was just thinking of starting my own company when a chance encounter with Edo led me to Pinecone.

Corey: It feels, from my admittedly uninformed perspective, that a lot of what you're doing right now in the vector database area follows the trajectory of machine learning, in that for a long time, the only people really excited about it were either sci-fi authors or folks who had trouble explaining it to someone without a degree in higher math. And then it turned into—a couple of big stories from the mid-2010s stick out at me, when people were trying to sell this to me in a variety of different ways. One of them was, “Oh, yeah, if you're a giant credit card processing company trying to detect fraud with this kind of transaction volume—” it's, yeah, there are maybe three companies in the world that fall into that exact category. The other was WeWork, where they did a lot of computer vision work.
And they used this to determine that at certain times of day there was congestion in certain parts of the buildings, and that this was best addressed by hiring a second barista. Which distilled down to, “Wait a minute, you're telling me that you spent how much money on machine learning and advanced analyses and data scientists and the rest to figure out that people like to drink coffee in the morning?” Like, that is a little on the ridiculous side.

Now, I think that it is past the time for skepticism around machine learning, when you can go to a website and type in a description of something and it paints a picture of the thing you just described. Or you can show it a picture and it describes what is in that picture fairly accurately. At this point, the only people who are skeptics, from my position on this, seem to be holding out for some sort of next-generation miracle or are just being bloody-minded. Do you think that there's a tipping point for vector search where it's going to become blindingly obvious to, if not the mass market, at least the more run-of-the-mill, more prosaic level of engineers who haven't specialized in this?

Ram: Yeah. It's already, frankly, started happening. Two years back, I wouldn't have suspected this fast an adoption of this new a technology across this varied a set of use cases. I just wouldn't have suspected it, because I still thought it was going to take some time for this field to mature and for everybody to really start taking advantage of this. This has happened much faster than even I assumed.

So, to some extent, it's already happening. A lot of it is because the barrier to entry is quite low right now, right? It's very easy and cost-effective for people to create these embeddings. There is a lot of documentation out there; things are getting easier and easier, day by day. Some of it is by Pinecone itself, by a lot of work we do.
Some of it is by companies like the ones I mentioned before, who are building better and better models, making it easier and easier for people to take these machine-learning models and use them without having to even fine-tune anything. And as technologies like Pinecone really mature and become dramatically cost-effective, the barrier to entry gets very low.

So, what we tend to see people do, it's not so much about confidence in this new technology; it is identifying something simple they need this sort of value out of, and finding the least critical path, or the simplest way, to get going on this sort of technology. And as long as we can make that barrier to entry very small and make this cost-effective and easy for people to explore, this is going to start exploding. And that's what we are seeing. A lot of Pinecone's focus has been on ease of use, on simplicity, on shortening the zero-to-one journey, for precisely this reason. Because not only do we strongly believe in the value of this technology, it's becoming more and more obvious to the broader community as well. The remaining work to be done is just the ease of use and making things cost-effective. And cost-effectiveness is also something we focus on a lot. This technology can be even more cost-effective than it is today.

Corey: I think that it is one of those never-mistaken ideas to wind up making something more accessible to folks than keeping it in a relatively rarefied environment. We take a look throughout the history of computing in general and cloud in particular, where formerly very hard things have largely been reduced down to click the button. Yes, yes, and then get yelled at because you haven't done infrastructure-as-code, but click the button is still possible. I feel like this is on that trendline based upon what you're saying.

Ram: Absolutely.
And the more we can do here, both Pinecone and the broader community, the better and the faster the adoption of this sort of technology is going to be.

Corey: I really want to thank you for spending so much time talking me through what it is you folks are working on. If people want to learn more, where's the best place for them to go to find you?

Ram: Pinecone.io. Our website has a ton of information about Pinecone, as well as a lot of standard documentation. We have a free tier as well where you can play around with small data sets and really get a feel for vector search. It's completely free. And you can reach me at Ram at Pinecone. I'm always happy to answer any questions. Once again, thanks so much for having me.

Corey: Of course. I will put links to all of that in the show notes. This promoted guest episode is brought to us by our friends at Pinecone. Ram Sriharsha is their VP of Engineering and R&D. And I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that I will never read, because the search on your podcast platform is broken because it's not using a vector database.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
The optimal format for storage and retrieval of data depends on how it is going to be used. For analytical systems, there are decades of investment in data warehouses and various modeling techniques. For machine learning applications, relational models require additional processing to be directly useful, which is why there has been a growth in the use of vector databases. These platforms store direct representations of the vector embeddings that machine learning models rely on for computing relevant predictions, so that there is no additional processing required to go from input data to inference output. In this episode Frank Liu explains how the open source Milvus vector database is implemented to speed up machine learning development cycles, how to think about proper storage and scaling of these vectors, and how data engineering and machine learning teams can collaborate on the creation and maintenance of these data sets.
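The "no additional processing" point above can be sketched with a toy in-memory store: embeddings go in as-is and come back out by similarity. This is an illustration only (the class and method names are invented, not the Milvus API), and a real vector database replaces the brute-force scan with approximate indexes such as IVF or HNSW to scale:

```python
import math

class TinyVectorStore:
    """A deliberately minimal in-memory vector store (illustrative only)."""

    def __init__(self):
        self._items = {}  # id -> embedding vector

    def upsert(self, item_id, vector):
        # Embeddings are stored directly; no further transformation needed.
        self._items[item_id] = vector

    def query(self, vector, top_k=3):
        # Brute-force nearest neighbours by cosine similarity.
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)
        scored = sorted(self._items.items(),
                        key=lambda kv: cos(vector, kv[1]),
                        reverse=True)
        return [item_id for item_id, _ in scored[:top_k]]

store = TinyVectorStore()
store.upsert("doc-a", [1.0, 0.0, 0.0])
store.upsert("doc-b", [0.9, 0.1, 0.0])
store.upsert("doc-c", [0.0, 0.0, 1.0])
```

Querying with an embedding close to `doc-a` returns `doc-a` first, then `doc-b`: the inference-time path is just a similarity lookup over stored vectors.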
MLOps Coffee Sessions #111 with Samuel Partee, Principal Applied AI Engineer of Redis, More than a Cache: Turning Redis into a Composable, ML Data Platform, co-hosted by Mihail Eric. This episode is sponsored by Redis.

// Abstract
Pushing forward the Redis platform to be more than just the web-serving cache that we've known it as up to now. It seems like a natural progression for the platform; we see how they're evolving to be this AI-focused, AI-native serving platform that does vector similarity and feature store, providing those kinds of functionalities.

// Bio
A Principal Applied AI Engineer at Redis, Sam helps guide the development and direction of Redis as an online feature store and vector database. Sam's background is in high-performance computing, including ML-related topics such as distributed training, hyperparameter optimization, and scalable inference.

// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/

// Related Links
https://partee.io
Redis VSS demo: https://github.com/Spartee/redis-vector-search
Redis Stack: https://redis.io/docs/stack/
Github - https://github.com/Spartee
OSS org Sam co-founded at HPE/Cray - https://github.com/CrayLabs
This paper last year was some of the best research and collaborations Sam has been a part of. The paper is published here: https://www.sciencedirect.com/science/article/pii/S1877750322001065?via%3Dihub
Do you really need an extra database for vectors?
https://databricks.com/dataaisummit/session/emerging-data-architectures-approaches-real-time-ai-using-redis Blink: The Power of Thinking Without Thinking by Malcolm Gladwell, Barry Fox, Irina Henegar (Translator): https://www.goodreads.com/book/show/40102.Blink --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/ Connect with Sam on LinkedIn: www.linkedin.com/in/sam-partee-b04a1710a Timestamps: [00:00] Introduction to Samuel Partee [00:24] Takeaways [02:46] Updates on the Community [05:17] Start of Redis [08:10] Vision for Vector Search [11:05] Changing the narrative going from the "Cache" for all servers and web endpoints [14:35] Clear value prop on demos [20:17] Vector Database [26:26] Features with benefits [28:41] AWS Spend [30:39] Vector Database upsell model and bureaucratic convenience [32:08] Distributed training hyperparameter optimization and scalable inference [35:03] Core infrastructural advancement [36:55] Tools movement to help [39:00] Using Machine Learning at scale in numerical simulations with SmartSim: An application to ocean climate modeling (published paper) [42:52] Future applications of tech to get excited with [44:20] Lightning round [47:48] Wrap up
Order your Milvus t-shirt / hoodie! https://milvus.typeform.com/to/IrnLAgui Thanks Filip for arranging.

Show notes:
- Milvus DB: https://milvus.io/
- Not All Vector Databases Are Made Equal: https://towardsdatascience.com/milvus...
- Milvus talk at Haystack: https://www.youtube.com/watch?v=MLSMs...
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models: https://arxiv.org/abs/2104.08663
- End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network: https://arxiv.org/abs/1904.08990
- What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models: https://arxiv.org/abs/1907.13528
- NVIDIA Triton Inference Server: https://developer.nvidia.com/nvidia-t...
- Towhee -- ML / Embedding pipeline making steps before Milvus easier: https://github.com/towhee-io/towhee
- Being at the leading edge: http://paulgraham.com/startupideas.html
Pinecone Systems' new vector database provides similarity search as a cloud service. Use cases include recommendations, personalization, image search, and deduplication of records. A vector, or vector embedding, is a string of numbers that represents documents, images, or other data. Vectors are used in the development of machine learning applications. A vector database stores, searches, and retrieves the representations by similarity or by relevance.

Pinecone's vector database is accessed through an API. Early adopters range from startups to large companies with machine learning initiatives that need to scale. Pinecone Systems' lead investor was also an early investor in Snowflake, and the similarities don't stop there.
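The deduplication use case above can be made concrete in a few lines: two records are near-duplicates when their embedding vectors are close enough. A minimal sketch, with made-up embeddings and an illustrative 0.95 threshold; a real application would get the vectors from a trained model:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearly_duplicate(vec_a, vec_b, threshold=0.95):
    # Two items are near-duplicates if their embeddings are close enough
    # in vector space -- here, cosine similarity above a chosen threshold.
    return cosine_similarity(vec_a, vec_b) >= threshold

img1 = [0.12, 0.98, 0.33, 0.51]  # embedding of an image (hypothetical values)
img2 = [0.11, 0.97, 0.35, 0.50]  # embedding of a slightly edited copy
img3 = [0.90, 0.05, 0.70, 0.02]  # embedding of an unrelated image
```

With these values, `img1` and `img2` fall above the threshold while `img1` and `img3` do not; the hard-to-express question "are these the same picture, more or less?" becomes a single numeric comparison.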
Machine learning models use vectors as the natural mechanism for representing their internal state. The problem is that in order for the models to integrate with external systems, their internal state has to be translated into a lower dimension. To eliminate this impedance mismatch, Edo Liberty founded Pinecone to build a database that works natively with vectors. In this episode he explains how this technology will allow teams to accelerate the speed of innovation, how vectors make it possible to build more advanced search functionality, and how Pinecone is architected. This is an interesting conversation about how reconsidering the architecture of your systems can unlock impressive new capabilities.
Vectors are the foundational mathematical building blocks of Machine Learning. Machine Learning models must transform input data into vectors to perform their operations, creating what is known as a vector embedding. Since data is not stored in vector form, an ML application must perform significant work to transform data in different formats into a form that ML models can understand. This can be computationally intensive and hard to scale, especially for the high-dimensional vectors used in complex models.

Pinecone is a managed database built specifically for working with vector data. Pinecone is serverless and API-driven, which means engineers and data scientists can focus on building their ML application or performing analysis without worrying about the underlying data infrastructure.

Edo Liberty is the founder and CEO of Pinecone. Prior to Pinecone, he led the creation of Amazon SageMaker at AWS. He joins the show today to talk about the fundamental importance of vectors in machine learning, how Pinecone built a vector-centric database, and why data infrastructure improvements are key to unlocking the next generation of AI applications.
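The "input data to vector embedding" transformation described above can be illustrated with a deliberately simple text embedder. Real embeddings come from trained models; this hypothetical `embed` function only shows the shape of the problem, mapping arbitrary input to a fixed-dimension, normalized vector:

```python
def embed(text, dim=8):
    """Toy embedder: hash each word into a fixed-size vector (illustrative only)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # A stable toy hash (Python's built-in hash() varies per process).
        h = sum(ord(c) for c in word)
        vec[h % dim] += 1.0
    # L2-normalize so similarity comparisons are scale-free.
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm else vec

v = embed("vectors are the building blocks of machine learning")
```

Whatever the input length, the output is always a unit vector of the same dimension, which is exactly the property that lets a vector database index and compare heterogeneous data uniformly.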
Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis
Vectors are foundational for machine learning applications. Pinecone, a specialized cloud database for vectors, has secured significant investment from the people who brought Snowflake to the world. Could this be the next big thing? Article published on ZDNet