In this episode, Mark speaks with Cate Lochead. Cate is CMO of Snorkel AI, which enables enterprises to develop AI that works for their unique workloads. Cate is a master of marketing and building world-class teams. With over 25 years of experience building high-growth organizations, she is now a leader in the AI space. In this episode we discuss how Cate got started in tech, her background, marketing, and AI. Cate also shares how leveraging events can elevate your brand. Cate Lochead: https://www.linkedin.com/in/clochead/ https://snorkel.ai/ Mark Testa: https://www.markstephenagency.com info@markstephenagency.com https://www.linkedin.com/company/mark-stephen-design-&-production/ https://www.instagram.com/markstephenea/ https://www.youtube.com/channel/UCK13o22i4RxQvbAgwwlh9tQ?view_as=subscriber Thanks for tuning in. Check us out at https://www.instagram.com/markstephenea/
Throughout the fourth season of Theory and Practice, we explored emerging human-like artificial intelligence and robots. We asked if we could learn as much about ourselves as we do about the machines we use. The series has covered safety guardrails for AI, empathic AI communication, communication between minds and machines, robotic surgery, computers that smell, and using AI to understand human vision. The most recent episode with Google DeepMind's Dr. Clément Farabet illuminates how computers might demonstrate understanding and reasoning on par with humans. In the final episode, we reflect on investing in artificial intelligence's future with the leader of GV's Digital Investing Team, Dave Munichiello, who has a long-standing history with AI and robotics. Dave was an early technologist at Kiva Systems, which was purchased by Amazon and ultimately became Amazon Robotics. Over the past decade-plus at GV, Dave has led investments across two major categories: Platforms Empowering Developers (GitLab, Segment, Slack, RedPanda, etc.) and Platforms Powering AI Systems (Determined, Modular, SambaNova, Snorkel AI, etc.), along with others. Dave's first AI investment, Lattice (bought by Apple's Siri team), came seven years before the hype of generative AI. We asked: from a seasoned AI investor's perspective, where does AI hold the most promise? To answer this, Dave returns to the themes we've investigated over the last eight weeks, including AI trust and safety, which Google Health's Greg Corrado raised in the first episode. Together, we explore how AI will change how we work, the nature of jobs, and how an investing team with a culture focused on having more questions than answers is well positioned for AI's future. Dave rounds out the discussion with a picture of how artificial intelligence, with real-life use cases, will move research lab theory to real-world practice.
He also walks us through his hopes for AI, including a world where humans and computers exist as co-pilots. Ultimately, Dave shares an optimistic and rational view of AI's future. "AI has the potential to democratize the very creation of technology," he reflects. "With AI assistance, folks across the country will no longer need to rely on software programmers to solve everyday digital problems – they'll be able to create these tools themselves. That is incredibly exciting, and I'm honored to be a part of that journey."
Alex Ratner is the CEO of Snorkel AI, a platform that provides programmatic data labeling and foundation models to enable companies to build AI applications. They've raised $135M so far from amazing investors such as Addition, Greylock, Google Ventures, and Lightspeed. He was previously the cofounder and CEO of SiftPage. He has a bachelor's degree from Harvard and a PhD from Stanford. In this episode, we cover a range of topics including: - Making AI data development first-class and programmatic - The data-centric step for every model-centric step - The false dichotomy of fine-tuning vs. RAG - Foundation model dynamics: winner-take-all vs. diverse models - Training compute-optimal LLMs - Designing multimodal datasets (DataComp) - Distilling Step-by-Step - A 'GPT-You' for every enterprise. Alex's favorite books: the Foundation series (author: Isaac Asimov) -------- Where to find Prateek Joshi: Newsletter: https://prateekjoshi.substack.com Website: https://prateekj.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 Twitter: https://twitter.com/prateekvjoshi
MLOps Coffee Sessions #176 with the MLOps vs. LLMOps panel: Willem Pienaar, Chris Van Pelt, Aparna Dhinakaran, and Alex Ratner, hosted by Richa Sachdev. // Abstract What do MLOps and LLMOps have in common? What has changed? Are these just new buzzwords, or is there validity in calling this ops something new? // Bio Richa Sachdev A passionate and impact-driven leader whose expertise spans leading teams, architecting ML and data-intensive applications, and driving enterprise data strategy. Richa has worked for a tier-A start-up developing feature platforms and in financial companies, leading ML engineering teams to drive data-driven business decisions. Richa enjoys reading technical blogs focused on system design and plays an active role in the MLOps Community. Willem Pienaar Willem is the creator of Feast, the open-source feature store, and a builder in the generative AI space. Previously, Willem was an engineering manager at Tecton, where he led teams in both their open-source and enterprise initiatives. Before that, Willem built the core ML systems and created the ML platform team at Gojek, the Indonesian decacorn. Chris Van Pelt Chris Van Pelt is a co-founder of Weights & Biases, a developer MLOps platform. In 2009, Chris founded Figure Eight/CrowdFlower. Over the past 12 years, Chris has dedicated his career to optimizing ML workflows and teaching ML practitioners, making machine learning more accessible to all. Chris has worked as a studio artist, computer scientist, and web engineer. He studied both art and computer science at Hope College. Aparna Dhinakaran Aparna Dhinakaran is the Co-Founder and Chief Product Officer at Arize AI, a pioneer and early leader in machine learning (ML) observability. A frequent speaker at top conferences and thought leader in the space, Dhinakaran was recently named to the Forbes 30 Under 30. Before Arize, Dhinakaran was an ML engineer and leader at Uber, Apple, and TubeMogul (acquired by Adobe).
During her time at Uber, she built several core ML Infrastructure platforms, including Michelangelo. She has a bachelor's from UC Berkeley's Electrical Engineering and Computer Science program, where she published research with Berkeley's AI Research group. She is on a leave of absence from the Computer Vision Ph.D. program at Cornell University. Alex Ratner Alex Ratner is the co-founder and CEO at Snorkel AI, and an Affiliate Assistant Professor of Computer Science at the University of Washington. Prior to Snorkel AI and UW, he completed his Ph.D. in CS advised by Christopher Ré at Stanford, where he started and led the Snorkel open source project, and where his research focused on defining and advancing the concept of "data-centric AI", the idea that labeling and developing data is the new center of the AI development workflow. His academic work focuses on data-centric AI and related topics in data management and statistical learning techniques, and applications to real-world problems in medicine, science, and more. Previously, he earned his A.B. in Physics from Harvard University. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Richa on LinkedIn: https://www.linkedin.com/in/richasachdev/ Connect with Willem on LinkedIn: https://www.linkedin.com/in/willempienaar/ Connect with Chris on LinkedIn: https://www.linkedin.com/in/chrisvanpelt/ Connect with Aparna on LinkedIn: https://www.linkedin.com/in/aparnadhinakaran/ Connect with Alex on LinkedIn: https://www.linkedin.com/in/alexander-ratner-038ba239/
Jacob sits down with Alex to discuss how Snorkel grew from an open-source project in a Stanford AI lab to a $1B company. Alex shares his thoughts on why data development is at the heart of AI development, why enterprises are slow to deploy LLM applications, and the importance of academia in the future of AI development. 00:00 intro 01:03 moving from academia to Snorkel 05:08 the evolution of Snorkel 18:33 improving pre-training 21:37 avoiding hallucinations and other errors 33:00 barriers to enterprises deploying AI 36:59 the Snorkel footprint of the future 39:37 the role of academia in AI development 42:57 over-hyped/under-hyped 44:50 how should AI regulation change going forward? With your co-hosts: @jasoncwarner - Former CTO GitHub, VP Eng Heroku & Canonical @ericabrescia - Former COO Github, Founder Bitnami (acq'd by VMWare) @patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn @jacobeffron - Partner at Redpoint, Former PM Flatiron Health @jordan_segall - Partner at Redpoint
Today's guest, Devang Sachdev, is the Vice President of Marketing at Snorkel AI. Devang started his career engineering hardware and managing products before transitioning and developing his expertise in strategic marketing planning, new product launches, and identifying emerging opportunities. Devang shares how he uses generative AI models to conduct market research and keep a consistent tone. The post "Enhancing your marketing with Chat GPT" appeared first on WebMechanix.
MLOps Coffee Sessions #139 with Alex Ratner, Putting Foundation Models to Use for the Enterprise, co-hosted by Abi Aryan and sponsored by Snorkel AI. // Abstract Foundation models are rightfully being compared to other game-changing industrial advances like steam engines or electric motors. They're core to the transition of AI from a bespoke, less predictable science to an industrialized, democratized practice. Before they can achieve this impact, however, we need to bridge the cost, quality, and control gaps. Snorkel Flow Foundation Model Suite is the fastest way for AI/ML teams to put foundation models to use. For some projects, this means fine-tuning a foundation model for production dramatically faster by programmatically labeling training data. For others, the optimal solution will be using Snorkel Flow's distill, combine, and correct approach to extract the most relevant knowledge from foundation models and encode that value into right-sized models for your use case. AI/ML teams can determine which Foundation Model Suite capabilities to use (and in what combination) to optimize for cost, quality, and control using Snorkel Flow's integrated workflow for programmatic labeling, model training, and rapid, guided iteration. // Bio Alex Ratner is the Co-founder and CEO of Snorkel AI and an Assistant Professor of Computer Science at the University of Washington. Prior to Snorkel AI and UW, he completed his Ph.D. in CS advised by Christopher Ré at Stanford, where he started and led the Snorkel open-source project, and where his research focused on applying data management and statistical learning techniques to emerging machine learning workflows such as creating and managing training data and applying this to real-world problems in medicine, knowledge base construction, and more. Previously, he earned his A.B. in Physics from Harvard University.
// MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: www.snorkel.ai Huge "foundation models" are turbo-charging AI progress: https://www.economist.com/interactive/briefing/2022/06/11/huge-foundation-models-are-turbo-charging-ai-progress Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming: https://arxiv.org/abs/2203.01382 The Principles of Data-Centric AI Development: https://snorkel.ai/principles-of-data-centric-ai-development/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Abi on LinkedIn: https://www.linkedin.com/in/abiaryan/ Connect with Alex on LinkedIn: https://www.linkedin.com/in/alexander-ratner-038ba239/ Timestamps: [00:00] Alex's preferred coffee [01:20] Introduction to Alex Ratner [02:34] Takeaways [04:04] Huge shoutout to our Sponsor, Snorkel AI! [04:39] Comment, rate us, and share our podcasts with your friends! [04:50] Transfer Learning / Active Learning [11:30] Labeling Heuristics paper on Nemo [18:14] Data-Centric AI [21:48] Enterprise use cases for Foundation Models [32:45] Foundation Models in the different Google products [38:36] Progress in Foundation Models [43:55] AutoML Models Baseline Accuracy [44:40] Hosting Infrastructure: Snorkel Flow vs GCP [46:53] Chris Ré's venture capital firm / incubator / machine [51:00] Wrap
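The "distill" step described in the abstract above, where knowledge is extracted from a large foundation model and encoded into a right-sized model, can be sketched roughly as follows. This is only an illustrative toy, not Snorkel Flow's implementation: `teacher_predict` is a stand-in stub rather than a real foundation-model call, and the "student" is a naive keyword counter.

```python
# Rough sketch of distilling a foundation model's knowledge into a small,
# right-sized model: query the (expensive) teacher for pseudo-labels on
# unlabeled text, then fit a cheap student model on those pseudo-labels.
# `teacher_predict` is a hypothetical stub, not a real foundation-model API.

from collections import Counter

def teacher_predict(text):
    """Stub standing in for an expensive foundation-model call."""
    return "positive" if "great" in text or "love" in text else "negative"

unlabeled = [
    "great product, love it",
    "love the new release",
    "terrible experience",
    "would not recommend",
]

# 1) Distill: collect the teacher's pseudo-labels on unlabeled data.
pseudo_labeled = [(t, teacher_predict(t)) for t in unlabeled]

# 2) Train a right-sized student: per-word class counts (a naive keyword model).
word_class_counts = Counter()
for text, label in pseudo_labeled:
    for word in text.split():
        word_class_counts[(word, label)] += 1

def student_predict(text):
    """Cheap student: score each class by matched keyword counts."""
    scores = {"positive": 0, "negative": 0}
    for word in text.split():
        for label in scores:
            scores[label] += word_class_counts[(word, label)]
    return max(scores, key=scores.get)

print(student_predict("love this"))      # positive
print(student_predict("terrible idea"))  # negative
```

Once trained, the student can serve predictions without any further calls to the teacher, which is the cost and control benefit the episode describes.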
MLOps Coffee Sessions #133 {Podcast BTS} with Chip Huyen, Real-time Machine Learning with Chip Huyen, co-hosted by Vishnu Rachakonda. // Abstract Forcing functions, and how you can supercharge your learning by putting yourself into a situation where you either have a responsibility to others to learn or accountability on you, so that you have to learn. Streaming machine learning is not that hard when you think about it; it's not that big of a mental barrier to cross. It is simple in theory, but maybe more complicated in practice, and that is exactly where Chip's perspective comes in. // Bio Chip Huyen is a co-founder of Claypot AI, a platform for real-time machine learning. Previously, she was with Snorkel AI and NVIDIA. She teaches CS 329S: Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Landing page: https://claypot.ai Designing Machine Learning Systems book: https://www.amazon.com/Designing-Machine-Learning-Systems-Production-Ready/dp/1098107969 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Chip on LinkedIn: https://www.linkedin.com/in/chiphuyen/
Greylock general partner Saam Motamedi talks with Snorkel AI CEO and co-founder Alex Ratner. As more enterprise organizations have recognized the utility of artificial intelligence technology, there's been a major push to invest in and adopt new AI and ML infrastructure to drive insights and make predictions for businesses. However, many of these solutions lack the mechanism to unlock and operationalize the data needed to train and deploy models for high-quality AI projects. That pain point spawned the creation of Snorkel AI, which has developed an end-to-end data-centric machine learning platform for the enterprise. Putting the capabilities to build impactful AI in the hands of more people has been Snorkel's goal since its inception. The company spun out of Stanford's AI Lab in 2019, has been partnered with Greylock since 2020, and just released a new set of tools that enables enterprise organizations to put foundation models to use. You can read a transcript of this conversation here: https://greylock.com/greymatter/jumpstarting-data-centric-ai/
Summary Machine learning is a data-hungry approach to problem solving. Unfortunately, there are a number of problems that would benefit from the automation provided by artificial intelligence capabilities that don’t come with troves of data to build from. Christopher Nguyen and his team at Aitomatic are working to address the "cold start" problem for ML by letting humans generate models by sharing their expertise through natural language. In this episode he explains how that works, the various ways that we can start to layer machine learning capabilities on top of each other, as well as the risks involved in doing so without incorporating lessons learned in the growth of the software industry. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out! Your host is Tobias Macey and today I’m interviewing Christopher Nguyen about how to address the cold start problem for ML/AI projects Interview Introduction How did you get involved in machine learning? Can you describe what the "cold start" or "small data" problem is and its impact on an organization’s ability to invest in machine learning? What are some examples of use cases where ML is a viable solution but there is a corresponding lack of usable data? How does the model design influence the data requirements to build it? (e.g. 
statistical model vs. deep learning, etc.) What are the available options for addressing a lack of data for ML? What are the characteristics of a given data set that make it suitable for ML use cases? Can you describe what you are building at Aitomatic and how it helps to address the cold start problem? How have the design and goals of the product changed since you first started working on it? What are some of the education challenges that you face when working with organizations to help them understand how to think about ML/AI investment and practical limitations? What are the most interesting, innovative, or unexpected ways that you have seen Aitomatic/H1st used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Aitomatic/H1st? When is a human/knowledge driven approach to ML development the wrong choice? What do you have planned for the future of Aitomatic? Contact Info LinkedIn @pentagoniac on Twitter Google Scholar Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Aitomatic Human First AI Knowledge First World Symposium Atari 800 Cold start problem Scale AI Snorkel AI Podcast Episode Anomaly Detection Expert Systems ICML == International Conference on Machine Learning NIST == National Institute of Standards and Technology Multi-modal Model SVM == Support Vector Machine Tensorflow Pytorch Podcast.__init__ Episode OSS Capital DALL-E The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
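The "knowledge-first" idea discussed in this episode, where human expertise expressed as executable rules stands in for training data during the cold start, can be illustrated with a minimal sketch. All rule names, field names, and thresholds below are hypothetical examples, not Aitomatic's actual API:

```python
# Minimal sketch of the "cold start" pattern: encode expert knowledge as
# executable rules, run those rules as a first model, and log their outputs
# as weak labels to bootstrap a learned model later.
# Every threshold and field name here is a hypothetical illustration.

def expert_rule_overheat(reading):
    """Domain expert: sustained temperatures above 90 C indicate a fault."""
    return reading["temp_c"] > 90

def expert_rule_vibration(reading):
    """Domain expert: vibration above 7 mm/s with elevated temperature is abnormal."""
    return reading["vibration_mm_s"] > 7 and reading["temp_c"] > 70

RULES = [expert_rule_overheat, expert_rule_vibration]

def knowledge_first_model(reading):
    """Flag an anomaly if any expert rule fires -- no training data required."""
    return any(rule(reading) for rule in RULES)

# The rule model works on day one; its decisions can later serve as weak
# labels to train a statistical model once enough data accumulates.
readings = [
    {"temp_c": 95, "vibration_mm_s": 3},  # overheat rule fires
    {"temp_c": 72, "vibration_mm_s": 8},  # vibration rule fires
    {"temp_c": 60, "vibration_mm_s": 2},  # normal
]
weak_labels = [knowledge_first_model(r) for r in readings]
print(weak_labels)  # [True, True, False]
```

The design point is that the rules themselves are the initial "model", so an organization gets working automation, and a stream of labeled data, before it has any training set at all.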
Braden Hancock, Co-founder and Head of Technology at Snorkel AI, joins me to talk about his path from academia to start-up co-founder and his vision to make AI more accessible to both traditional and no-code development. In this episode, Braden and I explore the journey he and his co-founders took to go from having an interesting idea to forming a company and the strategic business decisions they made along the way, such as why they opted not to use an open-source business model and the educational marketing strategy they've adopted. Highlights: Braden discusses his role as co-founder of Snorkel AI. (00:25) An introduction to Snorkel Flow, Snorkel AI's data-centric AI development platform, and the challenges they solve for. (01:49) Snorkel AI's relationship with open source. (06:30) Why Snorkel AI decided not to use an open-source business model in order to lower the barrier to entry. (09:01) Snorkel AI's trajectory coming from academia to the world of start-ups. (12:50) The unexpected challenges of building Snorkel AI. (17:50) Taking an educational approach to the marketing at Snorkel AI. (22:27) Braden discusses the meaningful applications of AI as well as where he sees AI being used as more of a buzzword. (27:27) Links: Braden LinkedIn: https://www.linkedin.com/in/bradenhancock/ Twitter: https://twitter.com/bradenjhancock Company: snorkel.ai Snorkel AI Twitter: https://twitter.com/SnorkelAI
Summary Machine learning is a data-hungry activity, and the quality of the resulting model is highly dependent on the quality of the inputs that it receives. Generating sufficient quantities of high-quality labeled data is an expensive and time-consuming process. In order to reduce that time and cost, Alex Ratner and his team at Snorkel AI have built a system for powering data-centric machine learning development. In this episode he explains how the Snorkel platform allows domain experts to create labeling functions that translate their expertise into reusable logic that dramatically reduces the time needed to build training data sets and drives down the total cost. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started! Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more Excel sheets or ad-hoc Python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. 
Galileo is offering listeners a free 30-day trial and a 30% discount on the product thereafter. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today! Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out! Your host is Tobias Macey and today I’m interviewing Alex Ratner about Snorkel AI, a platform for data-centric machine learning workflows powered by programmatic data labeling techniques Interview Introduction How did you get involved in machine learning? Can you describe what Snorkel AI is and the story behind it? What are the problems that you are focused on solving? Which pieces of the ML lifecycle are you focused on? How did your experience building the open source Snorkel project and working with the community inform your product direction for Snorkel AI? How has the underlying Snorkel project evolved over the past 4 years? What are the deciding factors that an organization or ML team need to consider when evaluating existing labeling strategies against the programmatic approach that you provide? What are the features that Snorkel provides over and above managing code execution across the source data set? Can you describe what you have built at Snorkel AI and how it is implemented? What are some of the notable developments of the ML ecosystem that had a meaningful impact on your overall product vision/viability? 
Can you describe the workflow for an individual or team who is using Snorkel for generating their training data set? How does Snorkel integrate with the experimentation process to track how changes to labeling logic correlate with the performance of the resulting model? What are some of the complexities involved in designing and testing the labeling logic? How do you handle complex data formats such as audio, video, images, etc. that might require their own ML models to generate labels? (e.g. object detection for bounding boxes) With the increased scale and quality of labeled data that Snorkel AI offers, how does that impact the viability of autoML toolchains for generating useful models? How are you managing the governance and feature boundaries between the open source Snorkel project and the business that you have built around it? What are the most interesting, innovative, or unexpected ways that you have seen Snorkel AI used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Snorkel AI? When is Snorkel AI the wrong choice? What do you have planned for the future of Snorkel AI? Contact Info LinkedIn Website @ajratner on Twitter Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Snorkel AI Data Engineering Podcast Episode University of Washington Snorkel OSS Natural Language Processing (NLP) Tensorflow PyTorch Podcast.__init__ Episode Deep Learning Foundation Models MLFlow SHAP Podcast.__init__ Episode The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
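The programmatic-labeling workflow described in this episode, where domain experts write labeling functions whose noisy votes are aggregated into training labels, can be sketched in plain Python. This is a toy illustration with made-up heuristics, using a simple majority vote in place of the learned label model that Snorkel actually uses:

```python
# Toy sketch of programmatic labeling: each labeling function (LF) encodes
# one expert heuristic and may abstain; a majority vote over the non-abstaining
# votes produces a training label. (Snorkel proper replaces the majority vote
# with a learned label model; this shows only the core idea.)

SPAM, HAM, ABSTAIN = 1, 0, -1

def lf_contains_link(text):
    # Heuristic: messages with links are probably spam.
    return SPAM if "http" in text else ABSTAIN

def lf_all_caps(text):
    # Heuristic: all-caps messages are probably spam.
    words = text.split()
    return SPAM if words and all(w.isupper() for w in words) else ABSTAIN

def lf_short_greeting(text):
    # Heuristic: friendly greetings are probably legitimate.
    return HAM if text.lower().startswith(("hi", "hello", "thanks")) else ABSTAIN

LFS = [lf_contains_link, lf_all_caps, lf_short_greeting]

def majority_label(text):
    """Aggregate LF votes, ignoring abstentions; ties or no votes -> ABSTAIN."""
    votes = [lf(text) for lf in LFS if lf(text) != ABSTAIN]
    if not votes:
        return ABSTAIN
    spam, ham = votes.count(SPAM), votes.count(HAM)
    if spam == ham:
        return ABSTAIN
    return SPAM if spam > ham else HAM

docs = ["CLICK NOW http://win.example", "hello, see you tomorrow", "meeting notes attached"]
labels = [majority_label(d) for d in docs]
print(labels)  # [1, 0, -1]
```

Because each labeling function is ordinary code, the "reusable logic" the summary mentions can be versioned, tested, and rerun over new data at essentially zero marginal labeling cost.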
In this episode, Dr Bahijja Raimi-Abraham discusses artificial intelligence (AI) and healthcare with Brandon Yang, Machine Learning Engineer at Snorkel AI. Additional information: Snorkel Open Source - https://www.snorkel.org/ Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Thank you for listening! If you liked the episode, please give us a five-star rating and review. Buy a Coffee for Monday Science. Subscribe, follow, comment, leave a review, and get in touch! Submit your questions or send your voice note questions (up to 30 seconds) here. See acast.com/privacy for privacy and opt-out information.
In episode 32 of The Gradient Podcast, Andrey Kurenkov speaks to Chip Huyen. Chip Huyen is a co-founder of Claypot AI, a platform for real-time machine learning. Previously, she was with Snorkel AI and NVIDIA. She teaches CS 329S: Machine Learning Systems Design at Stanford. She has also written four bestselling Vietnamese books, and more recently her new O'Reilly book Designing Machine Learning Systems has just come out! Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter. She also maintains a Discord server with a focus on machine learning systems. Outline: (00:00) Intro (01:30) 3-year trip through Asia, Africa, and South America (04:00) Getting into AI at Stanford (11:30) Confession of a so-called AI expert (16:40) Academia vs Industry (17:40) Focus on ML Systems (20:00) ML in Academia vs Industry (28:15) Maturity of AI in Industry (31:45) ML Tools (37:20) Real Time ML (43:00) ML Systems Class and Book Links: Chip's website, MLOps Discord server, Confession of a so-called AI expert, What I learned from looking at 200 machine learning tools, CS 329S: Machine Learning Systems Design, Designing Machine Learning Systems. Get full access to The Gradient at thegradientpub.substack.com/subscribe
In this episode of Founded and Funded, we spotlight IIA40 winner Snorkel AI. Managing Director Tim Porter not only talks with Snorkel Co-founder and CEO Alex Ratner all about data-centric AI and programmatic data labeling and development, but they also dive into the importance of culture — especially now — and how to take advantage of what Alex calls "one of the most historic opportunities for growth in AI."
This episode's topic may be a bit hardcore: we connect with frontline Silicon Valley tech companies to talk about the machine learning tool stack, ML infra (more precisely, MLOps). This is a replay of last month's livestream from Miss M's WeChat official account, on the theme "Where is the next $10B infra battleground? The new era of MLOps in the eyes of top open-source companies." Hello World, who is OnBoard?! Welcome to OnBoard!: real frontline experience and thoughtful investment reflections, talking about how software changes the world. If the past five years were the golden age of enterprise software/SaaS (especially in the US), then 2021 was without question a breakout year for infrastructure software (mainly PaaS). But in 2022, with market sentiment turning sharply downward and data infra becoming intensely crowded, a new infra battleground is already emerging. That is the topic of this episode: MLOps (DevOps tooling for machine learning). A widely accepted definition of MLOps (from NVIDIA) is: MLOps is an end-to-end set of processes and the corresponding toolchain combining ML (machine learning), application development, and IT infrastructure. It spans the entire prepare-develop-deploy lifecycle of ML development: data collection, model development, model training, experiment management, CI/CD, production deployment, monitoring, and more. But to understand how this frontier trend is actually playing out in Silicon Valley, the context behind this inflection point, what opportunities remain, and how Chinese founders and startups can join this wave or even leapfrog it, Monica, as is her habit, of course invited the most frontline practitioners in Silicon Valley to chat! Our guests' experience spans large companies such as Apple and Databricks as well as startups such as Snorkel AI and BentoML, from founders to product managers to developers, so their perspectives are well rounded, and the hour-plus conversation is packed with substance. The livestream was very well received and many people asked for a replay, so we are publishing it on the podcast for everyone to revisit. Because of connection issues with the Tencent Channels livestream, the audio quality in the later part may suffer; please bear with us. Also, our guests work in English-speaking environments day to day, so some English (especially technical terms) is unavoidable; we have included a glossary in the show notes. We really did our best! For more background and guest introductions, see our in-depth announcement post! Livestream guests: Yifan Cao, PM @Cruise ML platform, ex-PM @Databricks, ex-PM @Apple ML; Quinn, BizOps @Snorkel AI, ex-PM @Moveworks; Chaoyu Yang, Co-founder & CEO @BentoML (github), ex-software engineer @Databricks. What we discussed: 03:32 Introductions and fun facts from Monica and the guests: the MLOps startups we are watching 14:02 How should we understand MLOps? Why should companies care about it? 23:40 Why is MLOps getting attention now, and what are the main drivers? 37:01 Who are the early adopters of new MLOps products? 47:38 Having moved from the vendor side (Databricks) to the buyer side (Apple, Cruise), what new thoughts are there on how companies should choose MLOps products? 53:31 Why did BentoML choose model serving as its entry point? 63:52 How does a new MLOps company get adopted inside an enterprise? What does the decision chain look like? 68:09 The billion-dollar question: should MLOps tools be point solutions or platforms? How will the space consolidate? 77:54 How should open-source MLOps companies design their commercialization path? 85:22 Yifan's take on point solutions vs. platforms and open-source commercialization (the connection dropped in the middle, so we recorded this part afterward...) 90:29 In the eyes of these frontline practitioners, what challenges remain in MLOps, and what are the most exciting opportunities? 97:23 How can early-stage startups in a new field recruit excellent talent? Companies we mentioned: Yifan: Outerbounds, the commercial company behind Metaflow, Netflix's open-source project; Chaoyu: Prefect.io, workflow orchestration (not actually MLOps-specific; its use cases are closer to Airflow's); Quinn: Arthur.ai, ML model monitoring; AWS SageMaker; Fiddler AI: an ML monitoring and explainability platform; Arize AI: ML infra observability and monitoring tools. Recommended reading: a substantive livestream introduction, "Where is the next $10B infra battleground," and a Chinese-language MLOps primer. Follow Miss M's WeChat official account for more US-China conversations!
M小姐研习录 (WeChat ID: MissMStudy). Your likes, comments, and shares are the best encouragement we could ask for! Please share this with friends interested in the topic. If there are topics you would like us to cover, or guests you would like us to invite, let us know in the comments!
Your product's story is really the story of your customers. Listen to Devang's straightforward framework for successful product storytelling.
Mentioned in this episode:
Sign up for OpenView's weekly newsletter
Devang Sachdev, Vice President of Marketing at Snorkel AI
Snorkel AI
Follow Blake Bartlett on LinkedIn.
Podcast produced by OpenView.
View our blog for more context/inspiration.
OpenView on LinkedIn
OpenView on Twitter
OpenView on Instagram
OpenView on Facebook
Everyone wants to be a platform, but be careful what you wish for. Platform go-to-market is radically different from application go-to-market. It requires a whole different framework and a special focus on solutions. Devang describes the platform GTM frameworks he developed at Twilio and Snorkel AI.
Mentioned in this episode:
Sign up for OpenView's weekly newsletter
Devang Sachdev, Vice President of Marketing at Snorkel AI
Snorkel AI
Follow Blake Bartlett on LinkedIn.
Podcast produced by OpenView.
View our blog for more context/inspiration.
OpenView on LinkedIn
OpenView on Twitter
OpenView on Instagram
OpenView on Facebook
Timestamps
(02:00) Aarti shared her upbringing growing up in India and moving to New York for undergraduate studies.
(04:47) Aarti recalled her academic experience earning dual degrees in Computer Science and Computer Engineering at New York University.
(07:17) Aarti shared details about her involvement with the ACM chapter and the Women in Computing club at NYU.
(10:46) Aarti shared valuable lessons from her research internships.
(14:16) Aarti discussed her decision to pursue an MS degree in Computer Science at Stanford University.
(20:27) Aarti reflected on her learnings as the Head Teaching Assistant for CS 230, one of Stanford's most popular Deep Learning courses.
(23:59) Aarti shared her thoughts on ML applications in both clinical and administrative healthcare settings.
(26:47) Aarti unpacked the motivation and empirical work behind CheXNet, an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists.
(29:39) Aarti went over the implications of MURA, a large dataset of musculoskeletal radiographs containing over 40,000 images from close to 15,000 studies, for ML applications in radiology.
(32:50) Aarti went over her experience working briefly as an ML engineer at Andrew Ng's startup Landing AI, applying ML to visual inspection tasks in manufacturing.
(36:56) Aarti talked about her participation in external entrepreneurial initiatives such as the Threshold Venture Fellowship and the Greylock X Fellowship.
(43:41) Aarti reminisced about her time in a hybrid ML engineer/product manager/VC associate role at AI Fund, which works intensively with entrepreneurs during their startups' most critical and risky phase from 0 to 1.
(48:43) Aarti shared advice that AI Fund companies tended to receive regarding product-market fit and go-to-market strategy.
(54:04) Aarti walked through her decision to join Snorkel AI, the startup behind the popular Snorkel open-source project, which can quickly generate training data with weak supervision.
(56:36) Aarti reflected on the difference between being an ML researcher and an ML engineer.
(01:00:18) Closing segment.

Aarti's Contact Info: LinkedIn | Twitter | Google Scholar
People: Andrew Ng, John Langford, David Sontag
Books and Papers:
"The Art of Doing Science & Engineering" (by Richard Hamming)
"Deep Medicine: How AI Can Make Healthcare Human Again" (by Eric Topol)
"CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning" (Dec 2017)
"MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs" (May 2018)

About the show
Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models ("the WHY and the HOW") behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts, or click one of the links below:
Listen on Spotify
Listen on Apple Podcasts
Listen on Google Podcasts
If you're new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
Today's guest is Aarti Bagul, Machine Learning Engineer at Snorkel AI in San Francisco, CA. Founded in 2019, Snorkel AI is a technology startup that empowers data scientists and developers to quickly turn data into accurate and adaptable AI applications with Snorkel Flow, a first-of-its-kind data-centric development platform powered by programmatic labeling. Snorkel Flow reduces the time, cost, and friction of labeling training data so data science and development teams can more easily build and scale AI models to deploy more meaningful applications. Incorporating human judgment into the AI process through subject-matter experts becomes more efficient and scalable, leading to more ethical, responsible outcomes. Two of the top three US banks, several government agencies, and Fortune 500 companies use Snorkel Flow. Snorkel's core research was developed at the Stanford AI Lab and is deployed at Google, Intel, Apple, IBM, DARPA, and other trailblazing organizations. In the episode, Aarti will discuss: the interesting work they do at Snorkel AI, the problems they are solving in unlocking training data, her role and the interesting projects the team is working on, transitioning from a research-focused role into the startup world, and why Snorkel AI is a great place to work.
Why and where do companies fail at productionizing ML models? Watch the full podcast with Aarti here: https://youtu.be/VWJXiszQpTU
Aarti is a machine learning engineer at Snorkel AI. Prior to that, she worked closely with Andrew Ng in various capacities. She graduated with a master's in CS from Stanford and a bachelor's in CS and Computer Engineering from New York University, and worked at Microsoft Research as a research intern for John Langford, where she contributed to Vowpal Wabbit, an open-source project.
About the Host: Jay is a Ph.D. student at Arizona State University, doing research on building interpretable AI models for medical diagnosis.
Jay Shah: https://www.linkedin.com/in/shahjay22/
You can reach out at https://www.public.asu.edu/~jgshah1/ for any queries. Stay tuned for upcoming webinars!
***Disclaimer: The information contained in this video represents the views and opinions of the speaker and does not necessarily represent the views or opinions of any institution. It does not constitute an endorsement by any institution or its affiliates of such video content.***
Source: https://www.thecloudcast.net/2021/06/automated-data-labeling-for-ai-apps.html
See also: https://softwareengineeringdaily.com/2020/04/09/snorkel-training-dataset-management-with-braden-hancock/
Software 2.0 is Andrej Karpathy's idea that instead of coding business logic by hand, the applications of the future will be trained by data — in other words, machine learning. But ML is limited by the quality of data available, and there is a lot of unstructured, unlabeled data out there that is still being manually labeled today. Scale AI is a well-known startup that has done very well offering a scalable manual-labeling workforce; however, they are still bottlenecked by the number of subject matter experts available for labeling critically important data, like cancer diagnoses and drug-trafficking rings. In order to get labels from subject matter experts, you typically have to put them through a very tedious labeling process to build up a useful structured dataset upfront, before any useful machine learning can be done. I did some very minor ML work about five years ago and found Christopher Ré's work on DeepDive at Stanford. It takes a revolutionary approach by making it easy to write the labeling functions themselves. This turns the labeling process into an iterative, REPL-like experience where subject matter experts can suggest a function, see its impact right away, and continue refining it, assisted by AI. This work is now commercialized in a startup called Snorkel AI, so I was very excited to find a clear explanation of Snorkel Flow from its CEO, Alex Ratner. Here it is!
Transcript
[00:01:15] Alex Ratner: Snorkel Flow is a platform that's meant to take this process of building machine learning models and AI applications — starting with building the data that they rely on and that fuels them — and make it, in a nutshell, look more like an iterative software development process.
Rather than, you know, this kind of 80, 90% upfront hand-labeling exercise.
[00:01:34] And so Snorkel Flow supports that entire iterative loop of actually labeling data. It can be done by hand in the platform, but also, most centrally, programmatically, by letting users write what we call labeling functions. The basic idea is that rather than, say, asking your legal associate at a bank, or your doctor friends, to sit down and label a hundred thousand contracts or a hundred thousand electronic health records, you have them write
[00:02:00] heuristics — bits of their expertise: look for this keyword, or look for this pattern, et cetera. It's like a bridge from old, expert-knowledge-type input to modern machine learning models, using one to power the other. So Snorkel Flow is an IDE, basically, and it has a no-code UI component as well, that lets people — either via code or by pushing buttons, even for non-developer subject matter experts —
[00:02:24] programmatically label their data by writing these labeling functions. It then uses a bunch of modeling techniques — a lot of which was actually the work that the co-founding team and I did in our thesis work — around how you take a bunch of programmatic labels and clean them up and turn them into a final
[00:02:41] set of clean training data for machine learning models. And then, actually in Snorkel Flow, you can basically push-button train best-in-class open-source models. You can then analyze where they're succeeding or failing, and use that to go back and iterate on your data.
[00:02:54] And there's a Python SDK throughout the whole thing. So many of our customers will mix and match — they'll use Snorkel Flow to create the training data set and then train the model on some other system, et cetera. But what Snorkel Flow aims to support
is a basic iterative development process where, rather than just spending months to label a training set once and then being stuck with it — having to throw it out and start all over again any time anything in the world changes, your upstream input data changes, or your downstream objectives
[00:03:18] change — it's again more like an iterative process where you push some buttons or write some code that labels the data. You compile a model — or train it, but you can think of it like compiling — and then you go back and debug by iterating on your data. Everything in Snorkel Flow centers around looking at your data and iterating on how it's labeled to improve models.
[00:03:38] Brian Gracely: I'm curious. You mentioned there's a Python SDK, which, for anybody who works in data science and data modeling — Python is frankly sort of the language you use, how you do your programming. But I'm curious: in today's world, do data scientists consider themselves programmers? Or is it still, "Hey, look, I work on the numbers; I'm good at building models and the numbers, but I don't think of myself as a programmer"?
[00:04:08] How do you bridge those two worlds together, or do you not really have to bridge them? How much does the data scientist have to focus on numbers and models versus programming something to do stuff? What does their world look like?
[00:04:21] Alex Ratner: It's a great question. I have been, or currently am, part of four or five different data science institutes or something, and I still don't even know. I mean, data science is such a broad umbrella term — there are so many different varietals and types of us.
[00:04:35] And so I do think there's a very broad spectrum of data scientists,
from the ML engineer who just loves writing code to the one who, to your point, really just wants to push some buttons and get back to the numbers and the modeling and the outcome. And we definitely try to support that range through a layered approach.
[00:04:50] We have the Python SDK, but on top of that we have a no-code UI that allows you to write these labeling functions without writing code. So for example, if you're trying to train a contract classifier in Snorkel Flow, you can write labeling functions based on clicking on keywords or pressing buttons, with templates for the types of patterns or signals you want to look for.
[00:05:11] So we try to support, basically: if you want to move fast and you're a non-developer, or you're just not looking to spend time there, you can do it in a push-button way. But if you want to customize, or inject custom logic, or really get creative, you can always fall back to the Python SDK.
[00:05:27] And so, I mean, a lot of what we've been trying to accomplish from the very beginning, right, is raising the abstraction level at which you're interfacing with and programming your machine learning model or your AI application. And the first step is the hardest, right?
[00:05:39] If you think of the way that hand-labeled training data is, it's like the machine code — or really, actually, I think of it as like the ones and zeros, literally, for binary classification cases. A lot of the effort behind the Snorkel project and the company was just getting from that layer to the layer of, say, assembly language.
[00:05:57] But once you get there, you can build all those layers on top, and you can go up the stack and down the stack according to the application and the user type, right?
Actually, my co-founder Braden, who also did his PhD around Snorkel-related stuff, had a paper on how you could use natural language inputs.
[00:06:12] You could explain in natural language — just speaking to the computer — why a certain data point should be labeled a certain way, and then use off-the-shelf semantic parsers to parse that down to code, which then would get dumped into Snorkel. So basically, once you make this leap from labeling data by hand — kind of zeros and ones — to labeling your training data with code, then the sky's the limit in terms of building layers of abstraction on top of it.
[00:06:35] And that's actually a lot of what the company does and has been doing over the last two years: building a flexible interface through our platform, Snorkel Flow, for different data types and use-case types and user types.
[00:06:45] Brian Gracely: Yep. Well, I think you really answered my question in there.
[00:06:49] The reason I brought it up was: on one hand you have this language-level SDK in terms of Python, where you can get into some pretty granular-level stuff. And then on the other end you've got Application Studio, which, like you said, is this sort of low-code, graphical way of building templates and building applications.
[00:07:08] And I think sometimes there's just a perspective that there's one profile of a data scientist. And what you really highlighted is that, like a lot of things, there's a spectrum — those that specialize in one part of the job, and others that don't care about it and just want certain things to be easy.
[00:07:25] And that was useful, because I think sometimes in my head I'm thinking, okay, a data scientist serves a certain sort of task, the same way you might say, okay, they're a Java developer, so there's a tool set that they always use. So that was super helpful.
[00:07:39] Alex Ratner: Yeah.
And it depends on what the problem is, too. I mean, there's another thing that I think goes under-emphasized in the AI space. Point number one — and I don't think it's that avant-garde to say it today; it was maybe more so back in 2015 — is: hey, AI is about the data, not the models or the algorithms. I think few people will find that a controversial statement today,
[00:07:57] even if it's phrased in a somewhat reductive way. But the other thing that I still think is under-emphasized in practice is the necessity of looping what we often refer to as subject matter experts into the process. And I won't ramble here too long, but just for some perspective — the very first funding that the Snorkel project ever had was specifically about looping in what they call SMEs in the government: subject matter experts.
[00:08:20] Our original partners were some genomicists at Stanford. How do you loop them into the process of AI in a better way than just saying, "Hey, go label data for eight months for me, please"? And this idea of how you get subject matter expertise from a human's head into a scalable machine format has been the focus of AI for decades. But the answer of modern machine learning for the last five, ten years has been:
[00:08:44] okay, just sit them down, have them label data points one by one, nothing else. They've got all of this rich domain knowledge — a doctor, a lawyer, a cyber analyst, a network technician, an underwriter — throw that all away; just have them literally give zeros and ones, labeling data. And that's a nice abstraction.
[00:09:01] And it has actually been a very productive one for the field, because it means the ML engineers can totally abstract away the messy realities of real-world data and real-world subject matter experts, and just focus on optimizing a fancier model architecture.
But I think we've reached a point where it starts to become silly and impractical to have this wall
[00:09:19] between the subject matter expert and the data scientist. So, to loop back: a big focus of Snorkel Flow is about making these interfaces and this process accessible to a non-developer — a legal associate or an underwriter or a network technician — and making them part of the process too. And that's another motivation behind the layers, including the no-code UI.
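The labeling functions Alex describes — small heuristics written by subject-matter experts, whose noisy, conflicting votes get aggregated into training labels — can be sketched in plain Python. This is an illustrative toy, not Snorkel Flow's API: the heuristics and class names are invented, and a simple majority vote stands in for Snorkel's actual label model, which additionally learns per-function accuracies and correlations.

```python
# A minimal, library-free sketch of programmatic labeling: write small
# heuristic "labeling functions", apply them to unlabeled text, and
# aggregate their (possibly conflicting) votes into a training label.
from collections import Counter

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

def lf_keyword_urgent(text):
    # Heuristic: "urgent" suggests the positive class (e.g. spam).
    return POSITIVE if "urgent" in text.lower() else ABSTAIN

def lf_keyword_thanks(text):
    # Heuristic: polite boilerplate suggests the negative class.
    return NEGATIVE if "thank" in text.lower() else ABSTAIN

def lf_all_caps(text):
    # Heuristic: all-caps shouting suggests the positive class.
    return POSITIVE if text.isupper() else ABSTAIN

LABELING_FUNCTIONS = [lf_keyword_urgent, lf_keyword_thanks, lf_all_caps]

def label(text):
    """Aggregate labeling-function votes by majority; abstains are ignored."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN  # no function fired: leave the example unlabeled
    return Counter(votes).most_common(1)[0][0]

examples = ["URGENT WIRE FUNDS", "Thanks for the update", "quarterly report attached"]
labels = [label(t) for t in examples]
```

The point of the sketch is the interface, not the aggregator: experts contribute reusable functions instead of one-off labels, so when requirements change you edit a function and relabel the whole corpus in seconds, which is the iterative "compile and debug" loop described above.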
On this podcast I am joined by Braden Hancock, co-founder and Head of Technology at Snorkel AI. Snorkel AI is unlocking a better, faster way to build applications with Snorkel Flow, the first truly data-centric AI platform. To date they have raised over $50 million from top VC firms such as Greylock, Lightspeed, GV, and others. Snorkel AI is solving a real problem that has been holding back AI adoption: by simplifying data labeling and making AI projects look more like software development efforts, it removes some of the barriers and hassles that make "doing AI" so difficult, encouraging companies to more fully embrace AI. On this podcast we talk about Braden's journey to entrepreneurship, the origin story of Snorkel, how their approach to data labeling works and how it helps unlock the power of AI in the enterprise, why SMEs are the key to data labeling and why Snorkel's approach empowers them, why you should drop what you are doing and immediately send in a resume, and much more. I am excited for this conversation because I am just really impressed with their team. In my opinion this is definitely a company to keep an eye on. Let's get to it!
Developers of AI applications face many obstacles, but the chief challenge is simply that these are different from traditional software development projects. 85% of businesses say they are looking to adopt AI, yet a similar percentage of data science projects never reach production. Too many organizations approach AI application development the way they approach other software projects. Another issue is focusing on the machine learning model rather than the data set it will be trained on; Devang Sachdev of Snorkel AI suggests being data-focused instead, reducing and optimizing models rather than continually expanding parameter counts. Another issue is the manual process of developing training data, which is time-consuming and error-prone. Finally, we must consider a process of iteration over models and training data to ensure quality. Machine learning is an excellent tool, but it requires a rethink of how a company approaches software development.
Three Questions:
Is it possible to create a truly unbiased AI?
Can you think of an application for ML that has not yet been rolled out but will make a major impact in the future?
How big can ML models get? Will today's hundred-billion-parameter models look small tomorrow, or have we reached the limit?
Guests and Hosts:
Devang Sachdev, VP of Marketing at Snorkel AI. Connect with Devang on LinkedIn or on Twitter at @DevangSachdev.
Chris Grundemann, Gigaom Analyst and Managing Director at Grundemann Technology Solutions. Connect with Chris at ChrisGrundemann.com or on Twitter at @ChrisGrundemann.
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.
Date: 6/08/2021
Tags: @SFoskett, @ChrisGrundemann, @SnorkelAI, @DevangSachdev
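The data-centric iteration described above — train, analyze where the model fails, then refine the training data rather than endlessly growing the model — can be sketched as a toy loop. Everything here is illustrative and hypothetical (a keyword-rule "model", a three-example hand-made dev set), not Snorkel's implementation; it only shows the shape of the feedback cycle.

```python
# Toy data-centric iteration: error analysis drives edits to the labeling
# rules (the "data"), not to the model architecture.

def train(rules):
    """'Training' here just builds a classifier from the current keyword rules."""
    kws = set(rules)  # snapshot so later edits to `rules` don't leak in
    def model(text):
        return 1 if any(kw in text.lower() for kw in kws) else 0
    return model

# Hand-made dev set of (text, true_label) pairs for error analysis.
dev_set = [("urgent wire transfer", 1),
           ("lottery winner claim", 1),
           ("meeting at noon", 0)]

rules = {"urgent"}                 # iteration 1: a single heuristic
model = train(rules)
errors = [(t, y) for t, y in dev_set if model(t) != y]

# Error analysis reveals the "lottery" slice is missed, so we refine the
# data-generating rules and retrain, leaving the model code untouched.
rules |= {"lottery"}               # iteration 2: refined heuristics
model = train(rules)
errors_after = [(t, y) for t, y in dev_set if model(t) != y]
```

The design point is that each pass through the loop changes the inputs (rules/labels), so quality improvements compound in the data and survive any later model swap.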
In this episode Dr Bahijja Raimi-Abraham discusses artificial intelligence (AI) and healthcare with Brandon Yang (bio available here - https://mondayscience.wixsite.com/podcast/episode26), Machine Learning Engineer at Snorkel AI. Episode image credit: https://unsplash.com/
Additional Information
Snorkel Open Source - https://www.snorkel.org/
Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension
Previous episodes discussing AI, AI and healthcare, and AI and ethics: Episode 6; Episode 23 (Interview with Dr David Leslie, The Alan Turing Institute)
Episode summary available at MondayScience.Medium.com
Let us know what you thought of the episode. Subscribe, follow, comment and get in touch! Submit your questions or send your voice note questions (up to 30 seconds) via www.mondaysciencepodcast.com e. MondayScience2020@gmail.com --- Send in a voice message: https://anchor.fm/mondayscience/message