Automatic conversion of spoken language into text
Gideon Mendels is the Chief Executive Officer at Comet, the leading solution for managing machine learning workflows. How to Systematically Test and Evaluate Your LLMs Apps // MLOps Podcast #269 with Gideon Mendels, CEO of Comet. // Abstract When building LLM Applications, Developers need to take a hybrid approach from both ML and SW Engineering best practices. They need to define eval metrics and track their entire experimentation to see what is and is not working. They also need to define comprehensive unit tests for their particular use-case so they can confidently check if their LLM App is ready to be deployed. // Bio Gideon Mendels is the CEO and co-founder of Comet, the leading solution for managing machine learning workflows from experimentation to production. He is a computer scientist, ML researcher and entrepreneur at his core. Before Comet, Gideon co-founded GroupWize, where they trained and deployed NLP models processing billions of chats. His journey with NLP and Speech Recognition models began at Columbia University and Google where he worked on hate speech and deception detection. // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.comet.com/site/ All the Hard Stuff with LLMs in Product Development // Phillip Carter // MLOps Podcast #170: https://youtu.be/DZgXln3v85s Opik by Comet: https://www.comet.com/site/products/opik/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Gideon on LinkedIn: https://www.linkedin.com/in/gideon-mendels/ Timestamps: [00:00] Gideon's preferred coffee [00:17] Takeaways [01:50] A huge shout-out to Comet ML for sponsoring this episode! 
[02:09] Please like, share, leave a review, and subscribe to our MLOps channels! [03:30] Evaluation metrics in AI [06:55] LLM Evaluation in Practice [10:57] LLM testing methodologies [16:56] LLM as a judge [18:53] OPIC track function overview [20:33] Tracking user response value [26:32] Exploring AI metrics integration [29:05] Experiment tracking and LLMs [34:27] Micro Macro collaboration in AI [38:20] RAG Pipeline Reproducibility Snapshot [40:15] Collaborative experiment tracking [45:29] Feature flags in CI/CD [48:55] Labeling challenges and solutions [54:31] LLM output quality alerts [56:32] Anomaly detection in model outputs [1:01:07] Wrap up
About the speaker: Katarzyna is a computational linguist with over 10 years of experience in NLP and speech recognition. She has developed language models for automotive brands like Audi and Porsche and specializes in phonetics, morpho-syntax, and sentiment analysis. Kasia also teaches at the University of Warsaw and is passionate about human-centered AI and multilingual NLP. Join our slack: https://datatalks.club/slack.html
In this episode of The Valley Current®, host Jack Russo sits down with tech visionary Ronjon Nag, whose journey through the world of AI and technology is nothing short of inspiring. From his days at Cambridge and MIT to founding a groundbreaking handwriting recognition company with just $500, Ronjon has always been ahead of the curve. His insights into the evolution of mobile technology, including his role in launching the first mobile app store, will captivate anyone interested in the tech industry's rapid transformation. But it doesn't stop there. Ronjon's current venture, R.42, is pushing the boundaries of AI, longevity, and healthcare, and he's got some bold predictions about the future of these fields. As a Stanford professor, he's not just shaping technology—he's shaping minds, teaching the next generation about the ethical and practical implications of AI. Listen in as Ronjon educates Jack with fascinating stories, forward-thinking ideas, and a deep dive into the cutting-edge innovations that could change our lives. Whether you're a tech enthusiast or just curious about the future, this conversation is one you won't want to miss. Check out these links to learn more about Ronjon Nag or even chat with him via his avatar! https://www.r42group.com/ https://www.superbio.ai/ https://app.mimio.ai/ronjon/chat Jack Russo Managing Partner Jrusso@computerlaw.com www.computerlaw.com https://www.linkedin.com/in/jackrusso "Every Entrepreneur Imagines a Better World"®️
Today, Madrona Managing Director Karan Mahandru is joined by Scott Stephenson, Co-Founder and CEO of Deepgram, a foundational AI company building a voice AI platform that provides APIs for speech-to-text and text-to-speech. From medical transcription to autonomous agents, Deepgram is the go-to for developers of voice AI experiences, and they're already working with over 500 companies, including NASA, Spotify, and Twilio. Today, Scott and Karan dive into the realities of building a foundational AI company, meaning they're building models and modalities from scratch. They discuss the challenges of moving from prototype to production, how startups need to out-fox the hyperscalers while also partnering with them, and, of course, how Scott went from being a particle physicist working on detecting dark matter to building large language models for speech recognition. This is a must-listen for anyone building in AI. Full Transcript: http://www.madrona.com/founded-funded-deepgram-scott-stephenson Chapters: (00:00) Introduction (01:15) From Particle Physics to Voice AI (03:16) The Birth of Deepgram (03:40) Building a Developer-Centric AI Company (06:11) Challenges and Early Decisions (09:49) Navigating the AI Market (13:33) OpenAI's Whisper and Deepgram's Response (17:30) The Future of AI and Speech Recognition (21:59) Deepgram's Real-World Applications (31:19) From Prototype to Production
In this episode of the Eye on AI podcast, we explore the forefront of voice-powered AI technology with Trevor Back, Chief Product Officer at Speechmatics. Discover how Speechmatics is pushing the boundaries of speech recognition and conversational AI with their latest innovation, Flow. Trevor shares his journey from a background in computational astrophysics to becoming a key figure in AI at DeepMind and now Speechmatics. He delves into the development and potential of Flow, a groundbreaking tool combining automatic speech recognition (ASR), large language models (LLMs), and text-to-speech synthesis, aimed at creating seamless and responsive voice interactions. We explore the wide-ranging applications of Speechmatics' technology across industries, including media, call centers, and education. Trevor discusses the challenges of achieving high accuracy in speech recognition, especially in diverse and noisy environments, and how Speechmatics addresses these challenges with their unique approach to training models. Listen in as we uncover the intricacies of handling multiple languages, improving diarization, and the future goals of understanding complex audio cues like emotion and sarcasm. Learn about the company's vision for integrating voice technology into everyday products, making technology more accessible and user-friendly. Don't miss this insightful conversation on the future of voice technology, AI in business, and its role in the evolving landscape of AI. Like, subscribe, and hit the notification bell for more expert discussions on cutting-edge advancements in AI. This episode is sponsored by Shopify. Shopify is a commerce platform that allows anyone to set up an online store and sell their products. Whether you're selling online, on social media, or in person, Shopify has you covered on every base. With Shopify you can sell physical and digital products. You can sell services, memberships, ticketed events, rentals and even classes and lessons. 
Sign up for a $1 per month trial period at http://shopify.com/eyeonai Check out Speechmatics, the most accurate AI speech technology - with AI transcription & real-time translation components: https://www.speechmatics.com/ Stay Updated: Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI (00:00) Introduction and Background (01:49) Trevor Back's Journey into AI (04:02) DeepMind and Early AI Applications (07:30) Speechmatics' Mission and Focus (12:06) Key Applications of Speechmatics Technology (14:25) Achieving High Accuracy and Low Latency (17:52) Language Coverage and Challenges (21:27) Future of Voice Technology and AGI (24:52) Integrating Large Language Models (27:31) Handling Multiple Voices and Diarization (29:32) Real-world Applications and Challenges (35:20) Demonstration of Flow and Capabilities (41:14) Endpoint Prediction and Interruption (43:53) Real-time Interactions and Future Prospects (45:34) Launch Event and Future Plans (50:13) New Language Releases and Compliance
Voice recognition is being integrated into nearly all facets of modern living, but a big gap remains: speakers of minority languages, and those with thick accents or speech disorders like stuttering, are typically less able to use speech recognition tools that control applications, transcribe, or automate tasks, among other functions.
An airhacks.fm conversation with Bruce Hopkins about: transition from Basic to Java, work on Bluetooth technology and writing a book on Bluetooth for Java, involvement with Sun Microsystems and Java ME, becoming a Java Champion, shift to AI and natural language processing research, development of speech recognition and hands-free web navigation systems using pure Java, use of Hugging Face libraries for NLP in 2016, writing for Linux Magazine about mesh VPNs, discovery and exploration of ChatGPT, writing a book on integrating ChatGPT with Java, shared experiences and parallel paths in Java development, discussion about Sun Microsystems vs Oracle's approach to Java, mention of various Java-related technologies like JXTA, Sphinx, FreeTTS, and Dalvik, brief explanation of mesh VPNs and Tailscale, plans for a future podcast episode focused on Bruce's JavaChatGPT book
OpenAI has released their newest model, GPT-4o mini, which is more cost-efficient and excels in mathematical reasoning and coding tasks. NVIDIA's Mistral NeMo 12B is a state-of-the-art language model with unprecedented accuracy and enterprise-grade support. A new speech recognition keyboard and service for Android called Transcribro has been developed, which is private and on-device. Research papers explore the impact of vocabulary size on language model scaling, the use of large datastores for retrieval-based language models, and a method for generating long sequences of views of a cityscape using AI and computer vision. Contact: sergi@earkind.com Timestamps: 00:34 Introduction 01:40 OpenAI Announces GPT 4o mini 03:11 Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model 05:28 Transcribro: Private and on-device speech recognition keyboard and service for Android 06:43 Fake sponsor 08:49 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies 10:19 Scaling Retrieval-Based Language Models with a Trillion-Token Datastore 11:49 Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion 13:26 Outro
Most people have encountered speech recognition software in their day-to-day lives, whether through personal digital assistants, auto transcription, or other such modern marvels. As the technology advances, though, it still fails to understand speakers of African American English (AAE). In this episode, we talk to Michelle Cohn (Google Research and University of California Davis) and Zion Mengesha (Google Research and Stanford University) about their research into why these problems with speech recognition software seem to persist and what can be done to make sure more voices are understood by the technology. Associated paper: Michelle Cohn, Zion Mengesha, Michal Lahav, and Courtney Heldreth. "African American English speakers' pitch variation and rate adjustments for imagined technological and human addressees." JASA Express Letters 4, 047601 (2024). https://doi.org/10.1121/10.0025484. Read more from JASA Express Letters. Learn more about Acoustical Society of America Publications. Music: Min 2019 by minwbu from Pixabay.
Amir Haramaty shares his journey from cybersecurity to pioneering AI solutions, focusing on the power of voice-driven technology to revolutionize business processes. He discusses the challenges of integrating AI into traditional industries and highlights how his team's innovative solutions are making complex tasks simpler and more efficient. Amir's insights reveal the practical applications of AI, offering a glimpse into a future where technology enhances everyday work and drives measurable value in various fields.
Ricardo Herreros Symons is the VP Corporate Development and Strategy at AI speech to text company, Speechmatics.
KPMG has revealed the esteemed judging panel for the Irish round of its fourth annual Global Tech Innovator competition. The winner will represent Ireland, competing against 22 other countries in the global final in November. The competition is accepting applications from tech start-ups in the Republic of Ireland and Northern Ireland until midnight on Friday, May 31st, 2024. This year's KPMG Global Tech Innovator competition will be hosted by award-winning tech journalist and broadcaster Jess Kelly. As the host of Tech Talk on Newstalk, Ireland's only national radio show dedicated to technology, Jess brings a wealth of knowledge and insight to the competition. The Judges The judging panel comprises industry leaders, including investors, founders, and advisors, all with significant expertise in the tech sector. Alan Bromell, Head of Private Enterprise at KPMG, said: "We're thrilled to have such an outstanding panel of judges for this year's competition. Their diverse expertise, spanning investment, leadership, and advisory roles, ensures we are well-equipped to identify Ireland's next top tech innovator." Will Prendergast, Partner and Co-founder, Frontline Will is a Partner on Frontline Ventures' European Seed fund and co-founded the firm in 2012. He has since led over 25 investments at Frontline and has a particular interest in backing mission-driven founders building in Ireland, the Netherlands and DACH region, and supporting them to expand stateside. He has lived and worked in the U.S. himself and continues to spend significant time there to develop investor and corporate relationships for the benefit of Frontline's portfolio. Dr. Patricia Scanlon, Founder, SoapBox Labs Patricia holds a PhD in Artificial Intelligence (AI) and Speech Recognition and a bachelor's degree in electronic engineering. Her 25 years' experience in AI spans both academia and industry including roles at Columbia University, Bell Labs and IBM.
In 2013, she founded SoapBox Labs, the world's leading provider of ethical Voice AI for children, which was acquired by US-based Curriculum Associates in 2023. In 2022, Dr Scanlon was appointed Ireland's first Artificial Intelligence Ambassador, to lead a national conversation on Artificial Intelligence, working with the Department of Enterprise, Trade and Employment. In 2023 she was appointed Chairperson of Ireland's new expert AI Advisory Council providing the government with early foresight of emerging trends, challenges, risks and opportunities. Eimear Hennessy, Global Head of Enterprise Services, Stripe Eimear leads Stripe's Global Enterprise Services, based in Dublin, Ireland. This team delivers Stripe's proactive, preventative technical account management partnerships to Stripe's largest users. Her career spans 20 years across technology and engineering consulting. Building is the common thread in Eimear's career, from the start of her career as a Civil Engineer through to joining Google in the second decade of its existence, building businesses, solutions and teams. Now in Stripe she is heavily invested in building Stripe as it too enters its second decade of existence. Eimear's experience is hugely diverse, having led organisations across Engineering (working in sectors such as Oil, Pharma, Nuclear power, Commercial construction), Advertising, Cloud, Publishing and now Fintech, working across Sales, Support, Partnerships, Project management and Strategy. Barry Napier, CEO, Cubic Telecom Barry has extensive experience and a proven track record in building innovative technology companies. Barry possesses a rare combination of leadership and skills in strategy, business and corporate development, well suited to establishing high performance teams to grow and transform technology organisations. Most recently Barry led Cubic in a strategic partnership with Softbank Corp whereby they invested €473 million, valuing the company at over €900 million.
Barry also holds board seats on a range of high-growth companies ...
In this episode of the Sustainable Supply Chain Podcast, I'm joined by Amir Haramaty, CEO of aiOla, to explore how artificial intelligence is reshaping business processes with high accuracy in speech recognition. Amir delves into the core functions of aiOla, a tool designed to bridge the gap between human speech and actionable data, thereby streamlining operations across industries. Amir outlines how aiOla not only captures spoken language but converts it into structured, usable data that integrates seamlessly with existing ERP and CRM systems. This transformation is particularly crucial in sectors where precision and speed are paramount, such as logistics, pharmaceuticals, and manufacturing. By enhancing data capture, aiOla facilitates more informed decision-making and operational efficiency. A key focus of our discussion centres on sustainability: how aiOla's technology minimises waste and optimises resource use by eliminating paper processes and improving data accuracy. These enhancements have tangible impacts on the bottom line and environmental sustainability. Tune in to hear how Amir's technology is making significant strides in making business processes smarter, safer, and more sustainable. Whether it's improving pre-op inspections in food processing or ensuring compliance in pharmaceuticals, aiOla is setting a new standard for integrating AI into daily operations. Join us to discover how integrating AI into your supply chain can lead to substantial efficiency gains and a more sustainable future. Don't forget to check out the video version of this episode at https://youtu.be/MQtz0fP3ytk Elevate your brand with the 'Sustainable Supply Chain' podcast, the voice of supply chain sustainability. Last year, this podcast's episodes were downloaded over 113,000 times by senior supply chain executives around the world. Become a sponsor. Lead the conversation. Contact me for sponsorship opportunities and turn downloads into dialogues. Act today. 
Influence the future. Support the Show. Podcast supporters: I'd like to sincerely thank this podcast's generous supporters: Lorcan Sheehan, Olivier Brusle, Alicia Farag, Luis Olavarria, Alvaro Aguilar. And remember you too can Support the Podcast - it is really easy and hugely important as it will enable me to continue to create more excellent Digital Supply Chain episodes like this one. Podcast Sponsorship Opportunities: If you/your organisation is interested in sponsoring this podcast - I have several options available. Let's talk! Finally, if you have any comments/suggestions or questions for the podcast - feel free to just send me a direct message on Twitter/LinkedIn. If you liked this show, please don't forget to rate and/or review it. It makes a big difference to help new people discover it. Thanks for listening.
Peak Mandarin Newsletter: https://www.peakmandarin.com/free-ebook Isaac's Mandarin from the Ground Up podcast: https://www.mftgu.com/ -- On today's podcast, I speak to language teacher and podcast host, Isaac Myers. Isaac has a fascinating Mandarin learning backstory which involves extensive travel around Taiwan and China and working at Google training language models for speech recognition systems. Through his podcast, Mandarin from the Ground Up, he teaches Chinese using the same imitation techniques we all used to learn our first language. All of which gives him a unique perspective and makes his insights on learning Chinese well worth listening to!
The privacy theme rolls on as Chuck Joiner, Brian Flanigan-Arthurs, Eric Bolden, Marty Jencius, Jim Rea, Jeff Gamet, and David Ginsburg look at the almost unbelievable Terms of Service in the most recent Roku update and how it applies to which Roku device/channel/app. Then, the MacVoices panel delivers some initial thoughts on what appears to be a relationship between Apple and Google over AI capabilities, and yet another lawsuit targeting AirTags. No other Bluetooth or GPS tracker, just AirTags. This edition of MacVoices is supported by The MacVoices Slack. Available to all Patrons of MacVoices. Sign up at Patreon.com/macvoices. Show Notes: Chapters: 01:12 Roku's Onerous Terms 04:15 Audience Experiences with Roku Software 08:34 AI Collaboration Between Apple and Google 13:40 Speculations on Apple and Google AI Partnership 16:46 Targeted AI Models and Speech Recognition 20:28 AirTags Anti-Stalking Lawsuit 24:20 Discussion on AirTag Lawsuit and Tracking Devices Links: Your Roku will stop working unless you agree to its new terms — what to know and how to get around it https://www.tomsguide.com/tvs/your-roku-will-stop-working-unless-you-agree-to-its-new-terms-what-to-know-and-how-to-get-around-it Roku Dispute Resolution Terms https://docs.roku.com/published/disputeresolution/en/ca?cjdata=MXxOfDB8WXww&Ref=CJ&utm_source=cj&utm_medium=affiliate&utm_campaign=cj_affiliate_sale_6361382&utm_content=3486349_Future+Publishing+Limited&utm_term=13571892&cjevent=0a7baf1ee65511ee819000020a82b820&AID=13571892&PID=6361382&SID=trd-us-9097961795919355163 Apple might use Google Gemini to power some AI features on the iPhone https://9to5google.com/2024/03/17/gemini-apple-iphone-talks/ AirTag anti-stalking class-action lawsuit given the green light https://appleinsider.com/articles/24/03/17/airtag-anti-stalking-class-action-lawsuit-given-the-green-light Guests: Eric Bolden is into macOS, plants, sci-fi, food, and is a rural internet supporter. 
You can connect with him on Twitter, by email at embolden@mac.com, on Mastodon at @eabolden@techhub.social, and on his blog, Trending At Work. Brian Flanigan-Arthurs is an educator with a passion for providing results-driven, innovative learning strategies for all students, but particularly those who are at-risk. He is also a tech enthusiast who has had a particular affinity for Apple since he first used the Apple IIGS as a student. You can contact Brian on Twitter as @brian8944. He also recently opened a Mastodon account at @brian8944@mastodon.cloud. Jeff Gamet is a technology blogger, podcaster, author, and public speaker. Previously, he was The Mac Observer's Managing Editor, and the TextExpander Evangelist for Smile. He has presented at Macworld Expo, RSA Conference, several WordCamp events, along with many other conferences. You can find him on several podcasts such as The Mac Show, The Big Show, MacVoices, Mac OS Ken, This Week in iOS, and more. Jeff is easy to find on social media as @jgamet on Twitter and Instagram, jeffgamet on LinkedIn, @jgamet@mastodon.social on Mastodon, and on his YouTube Channel at YouTube.com/jgamet. David Ginsburg is the host of the weekly podcast In Touch With iOS where he discusses all things iOS, iPhone, iPad, Apple TV, Apple Watch, and related technologies. He is an IT professional supporting Mac, iOS and Windows users. Visit his YouTube channel at https://youtube.com/daveg65 and find and follow him on Twitter @daveg65 and on Mastodon at @daveg65@mastodon.cloud Dr. Marty Jencius has been an Associate Professor of Counseling at Kent State University since 2000. He has over 120 publications in books, chapters, journal articles, and others, along with 200 podcasts related to counseling, counselor education, and faculty life. 
His technology interest led him to develop the counseling profession ‘firsts,' including listservs, a web-based peer-reviewed journal, The Journal of Technology in Counseling, teaching and conferencing in virtual worlds as the founder of Counselor Education in Second Life, and podcast founder/producer of CounselorAudioSource.net and ThePodTalk.net. Currently, he produces a podcast about counseling and life questions, the Circular Firing Squad, and digital video interviews with legacies capturing the history of the counseling field. Generally, Marty is chasing the newest tech trends, which explains his interest in A.I. for teaching, research, and productivity. Marty is an active presenter and past president of the NorthEast Ohio Apple Corp (NEOAC). Jim Rea built his own computer from scratch in 1975, started programming in 1977, and has been an independent Mac developer continuously since 1984. He is the founder of ProVUE Development, and the author of Panorama X, ProVUE's ultra fast RAM based database software for the macOS platform. He's been a speaker at MacTech, MacWorld Expo and other industry conferences. Follow Jim at provue.com and via @provuejim@techhub.social on Mastodon. Support: Become a MacVoices Patron on Patreon http://patreon.com/macvoices Enjoy this episode? 
Make a one-time donation with PayPal Connect: Web: http://macvoices.com Twitter: http://www.twitter.com/chuckjoiner http://www.twitter.com/macvoices Mastodon: https://mastodon.cloud/@chuckjoiner Facebook: http://www.facebook.com/chuck.joiner MacVoices Page on Facebook: http://www.facebook.com/macvoices/ MacVoices Group on Facebook: http://www.facebook.com/groups/macvoice LinkedIn: https://www.linkedin.com/in/chuckjoiner/ Instagram: https://www.instagram.com/chuckjoiner/ Subscribe: Audio in iTunes Video in iTunes Subscribe manually via iTunes or any podcatcher: Audio: http://www.macvoices.com/rss/macvoicesrss Video: http://www.macvoices.com/rss/macvoicesvideorss
Open Tech Talks : Technology worth Talking| Blogging |Lifestyle
Integrating Artificial Intelligence (AI) in the educational technology sector is revolutionizing the learning and teaching landscape. As AI continues to advance, its application within education is not just enhancing educational experiences but is fundamentally transforming the industry. From personalized learning paths to intelligent tutoring systems, AI in education unlocks unprecedented opportunities for students and educators alike, setting the stage for a future where education is more accessible, efficient, and tailored to individual needs. In our earlier podcast session 126, we talked to the founders of AI Teacher about how they are using Generative AI for secondary school teachers. In today's enlightening episode, we had the privilege of interviewing the founder of Plabook, a revolutionary educational tool designed to transform how we approach oral reading fluency and comprehension in students. Plabook stands out by leveraging advanced speech recognition technology to listen to students as they read, providing instant, personalized feedback and smart recommendations. This innovative platform assesses reading abilities and delves deep into understanding each student's unique strengths and weaknesses through diagnostic reports. These reports are powerful tools for teachers and parents, offering insights to support and enhance the learning journey. Throughout our discussion, we explored Plabook's inception, development, and impact on the educational landscape, uncovering the vision and challenges behind integrating AI into education. Episode # 131 Today's Guest: Dr. Phil Hickman, Founder / CEO PlaBook Website: PlaBookEducation Linkedin: Dr. Phil This podcast offers invaluable insights into how AI and technology can bridge educational gaps, challenge traditional learning methods, and pave the way for a future where every student has the tools to succeed. 
What Listeners Will Learn: In today's podcast, you'll gain insights into several key areas: Exploring Use of Speech Recognition for Feedback: Discover the role of advanced speech recognition in providing real-time, tailored feedback to students. Understanding the Impact of Plabook on English Reading Comprehension: Gain insights into how Plabook is transforming the approach to teaching and assessing reading fluency. The Journey from Pilot to Validation: Learn about Plabook's initial testing phases and the validation of its innovative concept. Adapting to Diverse Dialects and Languages: Find out how Plabook accommodates the linguistic diversity of its users. The Educational Sector's Response to AI Innovations: Delve into how new technologies like Plabook are being received and adopted in educational settings. Anticipating Future Educational Innovations: Get a glimpse of potential advancements and changes in the education sector influenced by AI technology. Resources: Website: PlaBookEducation LinkedIn: Dr. Phil Meet AI Teacher: The Future of AI in Education Unveiled with Dr Pauldy Otermans and Dev Aditya
In this podcast, nVoq Chief Revenue Officer Chris Moran discusses the transformative role of technology in the home health industry. nVoq, a cloud-based, medically relevant, and HIPAA-compliant speech recognition platform, is enabling in-home healthcare caregivers. The platform ensures accurate transcriptions, maintains patient information security, and keeps home health agencies compliant with healthcare regulations. It also facilitates interoperability with other healthcare systems and enhances the accuracy and completeness of documentation, ultimately giving clinicians more time back in their day. nVoq focuses on organizational readiness and easy adoption for effective use of the technology. To learn more, visit nVoq's website: https://www.nvoq.com/ This podcast is brought to you by HealthRev Partners, offering revenue cycle management services powered by Velocity, the most advanced coding and billing software in the market.
In this episode, Nathan sits down with Dan O'Connell, Chief Strategy Officer at Dialpad. They discuss building their own language models using 5 billion minutes of business calls, custom speech recognition models for every customer, and the challenges of bringing AI into business. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period. We're sharing a few of Nathan's favorite AI scouting episodes from other shows. Today, Shane Legg, Cofounder at DeepMind and its current Chief AGI Scientist, shares his insights with Dwarkesh Patel on AGI's timeline, the new architectures needed for AGI, and why multimodality will be the next big landmark. We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com. --- SPONSORS: Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and 1,000,000s of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform, to their in-person POS system – wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts – from blog posts to product descriptions using AI. Sign up for $1/month trial period: https://shopify.com/cognitive Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. 
Mention "Cog Rev" for 10% off www.omneky.com NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist. X/SOCIAL: @labenz (Nathan) @dialdoc (Dan) @dialpad @CogRev_Podcast (Cognitive Revolution) TIMESTAMPS: (00:00) - Introduction and Welcome (06:50) - Interview with Dan O'Connell, Chief AI and Strategy Officer at Dialpad (07:13) - The Functionality and Utility of Dialpad (17:20) - The Development of Dialpad's Large Language Model Trained on 5 Billion Minutes of Calls (19:56) - The Future of AI in Business (22:21) - Sponsor Break: Shopify (23:56) - The Challenges and Opportunities of AI Development (31:17) - Prioritizing latency, capacity, and cost when evaluating AI (39:41) - Most Loved AI Features in Dialpad (42:01) - The Role of AI in Quality Assurance (43:10) - The Future of Transcription Accuracy (44:06) - The Importance of Speech Recognition in Business (46:59) - Personalizing AI for Better Business Interactions (47:01) - The Role of AI in Content Generation (52:47) - The Challenges and Opportunities of AI in Sales and Support
Bard is our creative collaborator. It's a place where you can come in and have a conversation with the large language model which really helps you to boost your productivity and bring your ideas to life. –Yury Pinsky, 02:16 Step behind the curtain and into the world of Google's Bard with its Director of Product Management, Yury Pinsky, in an exclusive conversation with SEJ Editor-in-Chief Amanda Zantal-Wiener. Hear about the origins and journey to Bard's unveiling, and discover how the team behind it envisions a collaborative future with AI. SEO pros and seasoned digital marketers alike will get an up-close look at the nuances of generative AI and a glimpse at predictions for what's next. So prep your popcorn -- grab your notetaking method of choice -- and tune in to learn how to incorporate Google's current and forward-looking AI initiatives into your own business innovation. [07:11] - The origin of Bard and its market niche. [13:23] - Impact of generative AI and Bard on SEO and content creation. [17:37] - Using Bard for audience evaluation in content creation. [23:54] - Distinctive features of Bard compared to other AI models. [28:33] - Most interesting prompts seen in Bard. [33:29] - Future vision for Bard and generative AI. I'm inspired by this idea that technology can work together with us, and we can bring Bard in as a creative partner in your editorial work or when we're trying to write a document for work or something in our personal lives. –Yury Pinsky, 09:46 It's a very vibrant, fast-paced, fast moving industry right now. I think some of the unique things we have with Bard are things like the ability to plug into Google tools. –Yury Pinsky, 23:54 In the sciences and the medical field, there could be lots of interesting breakthroughs in drug discovery in climate applications. How can they use the power of these foundational models to really benefit all of us in some way? 
–Yury Pinsky, 25:44 Your ideas still have to be your own in order for AI to work with you best and work for you best. –Yury Pinsky, 28:33 It is not the end of search. Bard is an experiment. It's complementary to search. It's this conversational collaborator. –Yury Pinsky, 32:47 Connect with Yury Pinsky: Yury is a Product Manager for Bard, leading areas including Extensions, Factuality, and multi-modality. Yury is passionate about cutting edge technology and finding ways to bring it to users around the world. Prior to serving in his current role, Yury led product teams around Natural Language and Speech Recognition for the Google Assistant, spent time building wearables in Google [X], and helped build out Google Search on mobile devices. Outside of work, Yury enjoys spending time with his family, planning his next vacation, and the daily logistics of kids' extracurricular activities. Connect on LinkedIn: https://www.linkedin.com/in/ypinsky Connect with Amanda Zantal-Wiener: Follow her on Twitter: https://twitter.com/Amanda_ZW Connect with her on LinkedIn: https://www.linkedin.com/in/amandazantalwiener/
The Personal Brain Trainer Podcast: Embodying Executive Functions
In this episode of The Personal Brain Trainer, Darius and Erica explore the world of voice typing and how it helps with executive function. They discuss the cognitive benefits of voice typing, its applications for individuals with dyslexia and ADHD, and practical strategies for using voice typing to enhance productivity. Discover the power of assistive technology, the future of note-taking apps, and how to streamline your writing process. Tune in for valuable insights and tips to unlock your mind's potential in this engaging and informative discussion. Links: 1. Built-in Software Speech-to-Text Functions: Siri: https://www.apple.com/siri/ Apple Dictation: https://tinyurl.com/4rvxumdt Windows 10 Speech Recognition: https://tinyurl.com/37383efk Google Voice Typing: https://tinyurl.com/3d2susur 2. Speech-to-Text Apps: Otter: https://otter.ai/ Google Docs Voice Typing: https://tinyurl.com/2fa85tuc Gboard: https://tinyurl.com/58mff5r2 3. Desktop Software: Read Write Gold: https://tinyurl.com/4wa5drzp Dragon by Nuance: https://www.nuance.com/dragon.html SpeechTexter: https://www.speechtexter.com/ 4. Other: Executive Functioning Competency Screener: https://mymemorymentor.com/efcs/ One to one sessions with Dr. Warren: https://learningtolearn.biz/ One to one sessions with Darius: www.dyslexiawork.com Executive functions and Study Skills Course: https://tinyurl.com/n86mf2bx BulletMap Academy: https://bulletmapacademy.com/ Learning Specialist Courses: https://www.learningspecialistcourses.com/ Good Sensory Learning: https://goodsensorylearning.com/ Dyslexia at Work: www.dyslexiawork.com Brought to you by: https://goodsensorylearning.com https://learningspecialistcourses.com https://bulletmapacademy.com https://www.dyslexiaproductivitycoaching.com
The Generative AI News (GAIN) rundown is back for August 24, 2023. Special segments this week include: What does the market data say about generative AI adoption? We look at 10 charts that explain a lot about what is happening, why it is happening, and where we are headed. Meta challenges OpenAI with an open-source automated speech recognition and translation system. Game on! Generative AI winners and losers of the week. Eric Schwartz, Voicebot.ai's head writer, and Bret Kinsella gathered again this week to break down the top generative AI stories and a few other useful pieces of information. Generative AI News Links to the stories we covered this week are included below. Like Perplexity AI, we give you source links! Top Stories of the Week - Market Data
Speech recognition on an FPGA? That doesn't sound like the most effective path, but Bill Jenkins of Achronix has a different opinion. Hear his take on why the FPGA is the right way to go for this application on this week's Embedded Executives podcast.
Guest/s Name ✨ Nigel Cannings, CTO at Intelligent Voice [@intelligentvox] Bio ✨ Nigel Cannings is the CTO at Intelligent Voice. He has over 25 years' experience in both Law and Technology, is the founder of Intelligent Voice Ltd and a pioneer in all things voice. Nigel is also a regular speaker at industry events such as NVIDIA GTC and holds multiple patents in Speech, NLP and Confidential Computing technologies. He is an Industrial Fellow at the University of East London. On LinkedIn | https://www.linkedin.com/in/nigelcannings/?originalSubdomain=uk Google Scholar | https://scholar.google.co.uk/citations?user=zHL1sngAAAAJ&hl=en Host: Marco Ciappelli, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining Society Podcast On ITSPmagazine | https://www.itspmagazine.com/itspmagazine-podcast-radio-hosts/marco-ciappelli This Episode's Sponsors: BlackCloak
This episode is the entire conversation we had with Matt MacNeil and Ed Casagrande from the Canadian Down Syndrome Society concerning their collaboration with Google AI to create a database that can help train Google's speech recognition technology to better understand people with Down syndrome. Donate your voice at: https://projectunderstood.ca Learn more about the CDSS: https://cdss.ca Episode Transcript: https://ifweknewthen701833686.wordpress.com/2023/08/13/152-revisiting-the-canadian-down-syndromes-project-understood-training-speech-technology/2/ Please follow us on Twitter @ifweknewthenPOD. You can drop us a line on our Facebook page @ifweknewthenPOD or visit our website https://www.IfWeKnewThen.com to send us an email with questions and comments. You can join our mailing list there and get alerts about future podcast episodes. Thank you again and we look forward to you joining us on the next episode of IF WE KNEW THEN.
Speech recognition technology has been around for longer than you might think. Discover how it has evolved and advanced over the years from Thomas Schaaf, principal research scientist at 3M HIS. He started his career in the speech recognition environment in the 1990s, working for companies like Amazon and Toshiba. Listen as he shares his insights into the future of the technology for health care and beyond.
This episode covers CM3leon, a new, more efficient, state-of-the-art generative model for text and images. OpenAI researcher Jason Wei is also featured, offering an "Ask Me Anything" document on AI research. Additionally, Sumformer, a linear-complexity alternative to self-attention for speech recognition, and DreamTeacher, a self-supervised feature representation learning framework that uses generative networks for pre-training downstream image backbones, are discussed. Contact: sergi@earkind.com Timestamps: 00:34 Introduction 02:16 Introducing CM3leon, a more efficient, state-of-the-art generative model for text and images 03:40 OpenAI product leader denies claims GPT-4 has gotten ‘lazier and dumber' 05:12 Jason Wei (OpenAI Researcher) tweets 06:04 Fake sponsor 07:54 Sumformer: A Linear-Complexity Alternative to Self-Attention for Speech Recognition 09:05 NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis 10:19 DreamTeacher: Pretraining Image Backbones with Deep Generative Models 12:15 Outro
Episode: 2719 The Mathematics of Language. Today, let's see what mathematics can tell us about language.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms Cloud ML, On-Premise, Edge Device, and Machine Learning-as-a-Service (MLaaS), and explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Training Data, Epoch, Batch, Learning Curve Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Bias, Weight, Activation Function, Convergence, ReLU Glossary Series: Perceptron Glossary Series: Hidden Layer, Deep Learning Glossary Series: Loss Function, Cost Function & Gradient Descent Glossary Series: Backpropagation, Learning Rate, Optimizer Glossary Series: Feed-Forward Neural Network Glossary Series: OpenAI, GPT, DALL-E, Stable Diffusion Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition AI Glossary Series – Machine Learning, Algorithm, Model AI Today Podcast: AI Glossary Series – Batch Prediction, Microservice, Real-time Prediction, Stream Learning, Cold-Path Analytics, Hot-Path Analytics This episode is sponsored by Algolia: Algolia Powers Discovery. Continue reading AI Today Podcast: AI Glossary Series – Cloud ML, On-Premise, Edge Device, Machine Learning-as-a-Service (MLaaS) at AI & Data Today.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms Batch Prediction, Microservice, Real-time Prediction, Stream Learning, Cold-Path Analytics, and Hot-Path Analytics, explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Training Data, Epoch, Batch, Learning Curve Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Bias, Weight, Activation Function, Convergence, ReLU Glossary Series: Perceptron Glossary Series: Hidden Layer, Deep Learning Glossary Series: Loss Function, Cost Function & Gradient Descent Glossary Series: Backpropagation, Learning Rate, Optimizer Glossary Series: Feed-Forward Neural Network Glossary Series: OpenAI, GPT, DALL-E, Stable Diffusion Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition AI Glossary Series – Machine Learning, Algorithm, Model AI Glossary Series – Model Tuning and Hyperparameter AI Glossary Series: Overfitting, Underfitting, Bias, Variance, Bias/Variance Tradeoff AI Glossary Series: Operationalization Interview with Alex Measure, BLS This episode is sponsored by Algolia: Algolia Powers Discovery. Continue reading AI Today Podcast: AI Glossary Series – Batch Prediction, Microservice, Real-time Prediction, Stream Learning, Cold-Path Analytics, Hot-Path Analytics at AI & Data Today.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the term Operationalization, and explain how this term relates to AI and why it's important to know about it. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Training Data, Epoch, Batch, Learning Curve Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Bias, Weight, Activation Function, Convergence, ReLU Glossary Series: Perceptron Glossary Series: Hidden Layer, Deep Learning Glossary Series: Loss Function, Cost Function & Gradient Descent Glossary Series: Backpropagation, Learning Rate, Optimizer Glossary Series: Feed-Forward Neural Network Glossary Series: OpenAI, GPT, DALL-E, Stable Diffusion Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition AI Glossary Series – Machine Learning, Algorithm, Model For more information, visit the Algolia website FREE CPMAI Intro Course Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition Glossary Series: Tokenization, Vectorization This episode is sponsored by Algolia: Algolia Powers Discovery. Continue reading AI Today Podcast: AI Glossary Series – Operationalization at AI & Data Today.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms Digital Transformation, Return on Investment (ROI), and Key Performance Indicator (KPI), and explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary For more information, visit the Algolia website Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition Glossary Series: Tokenization, Vectorization This episode is sponsored by Algolia: Algolia Powers Discovery. Continue reading AI Today Podcast: AI Glossary – Digital Transformation, Return on Investment (ROI), Key Performance Indicator (KPI) at AI & Data Today.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms Confusion Matrix, Accuracy, Precision, F1, Recall, Sensitivity, Specificity, Receiver-Operating Characteristic (ROC) Curve, explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Training Data, Epoch, Batch, Learning Curve Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Bias, Weight, Activation Function, Convergence, ReLU Glossary Series: Perceptron Glossary Series: Hidden Layer, Deep Learning Glossary Series: Loss Function, Cost Function & Gradient Descent Glossary Series: Backpropagation, Learning Rate, Optimizer Glossary Series: Feed-Forward Neural Network Glossary Series: OpenAI, GPT, DALL-E, Stable Diffusion Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition AI Glossary Series – Machine Learning, Algorithm, Model AI Glossary Series – Model Tuning and Hyperparameter AI Glossary Series: Overfitting, Underfitting, Bias, Variance, Bias/Variance Tradeoff Glossary Series: Classification & Classifier, Binary Classifier, Multiclass Classifier, Decision Boundary Continue reading AI Today Podcast: AI Glossary Series – Confusion Matrix, Accuracy, Precision, F1, Recall, Sensitivity, Specificity, Receiver-Operating Characteristic (ROC) Curve at AI & Data Today.
Discover the fascinating world of Hidden Markov Models (HMMs) in this episode of "The AI Frontier" podcast. Explore the fundamentals of HMMs, their applications in fields like speech recognition, bioinformatics, and finance, and learn about their limitations and alternatives. Gain insights into the theoretical concepts and real-world use cases, and stay up-to-date with emerging trends in sequential data analysis. Join us on this journey to uncover the power and potential of HMMs in artificial intelligence and machine learning. Support the Show. Keep AI insights flowing – become a supporter of the show! Click the link for details
AssemblyAI Founder & CEO, Dylan Fox joined FirstMark Managing Partner, Matt Turck for Data Driven NYC! AssemblyAI is the fastest way to build with AI for audio. With a simple API, get access to production-ready AI models to transcribe and understand speech. AssemblyAI has raised $63M+.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms CPU, GPU, TPU, and Federated Learning, explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Artificial Intelligence AI Glossary Series – Machine Learning, Algorithm, Model Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition Continue reading AI Today Podcast: AI Glossary Series – CPU, GPU, TPU, and Federated Learning at AI & Data Today.
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define the terms Tokenization and Vectorization, and explain how these terms relate to AI and why it's important to know about them. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary Glossary Series: Artificial Intelligence AI Glossary Series – Machine Learning, Algorithm, Model Glossary Series: (Artificial) Neural Networks, Node (Neuron), Layer Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition Continue reading AI Today Podcast: AI Glossary Series – Tokenization and Vectorization at AI & Data Today.
Fr. Brendan McGuire - Podcasts that Break open the Word of God
Today's gospel tells us that the Good Shepherd is the voice we are called to listen to because he will bring us to all truth; all goodness; and all beauty. We need to make sure that what we are listening to has that voice. (Read more…) Here is my homily from the Fourth Sunday of Easter. I hope you are enjoying this Easter Season. Alleluia, He is Risen Indeed!
Nigel Cannings, CTO at Intelligent Voice, spoke to Rudolf Falat, founder of the Voice of FinTech podcast, about leveraging AI to ensure banks and others trade responsibly and in line with regulations. Here is what they talked about: Nigel's backstory; What problem is Intelligent Voice solving and why is it worth solving?; Key clients; Tech angle: How does Intelligent Voice (IV) keep improving? Is human monitoring needed?; Why is this solution better than the competition or incumbent solutions?; How IV addresses the privacy concerns of individuals and companies; How do they ensure their AI is ethical and responsible; Interoperability of Intelligent Voice: how easy is it to plug into the client's enterprise systems; Success stories in Financial Services, e.g., Daiwa; What are your plans for the rest of the year? Hint: international expansion?; Recommended info channels: Medium; What's the best way to reach out? LinkedIn and website: Nigel Cannings and Intelligent Voice.
Dr Simon Wallace is Nuance's UK and Ireland Chief Clinical Information Officer for Healthcare (CCIO). Nuance is at the forefront of developing clinical understanding solutions that improve healthcare through more informed and timely decisions. Dr Simon Wallace joins Pete in this episode to discuss the importance of speech recognition technology in clinical documentation. If you work in healthcare, have you ever considered the potential applications of speech recognition technology? In this segment, you'll find out how Nuance's Dragon Medical One can enhance your organisation's clinical documentation processes. Find out how AI can help you and learn about the advantages of speech recognition. Check out the episode and full show notes here. To see the latest information, news, events and jobs on offer at Nuance, visit their Talking HealthTech Directory here. Loving the show? Leave us a review, and share it with someone who might get some value from it. Keen to take your healthtech to the next level? Become a THT+ Member for access to our online community forum, quarterly summits and more exclusive content. For more information visit here.
Drawing on his research outcomes and clinical experience, Dr. Michael Canfarotta shares what the minimal angle of insertion of a CI electrode should be in order to optimize hearing outcomes.
Games like Disney Friends are changing how people communicate and relate to video game characters. You can now tell Disney favorites like Stitch that you love him. Today's guest helped develop Disney Friends and is the Director of User Experience at Riot Games. Join Cheryl Platz as she discusses speech recognition technology in the video game industry. Cheryl also talks about what it's like to work at Riot Games and how everything has to be consistent now that they have more games. You can also learn about the latest game technologies from her book, Design Beyond Devices: Creating Multimodal, Cross-Device Experiences. Discover the future of the video game industry today! Love the show? Subscribe, rate, review & share! https://www.acrolinx.com/wordbirds
AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
In this episode of the AI Today podcast hosts Kathleen Walch and Ron Schmelzer define Natural Language Processing (NLP), Natural Language Understanding (NLU), Natural Language Generation (NLG), Speech-to-Text, Text-to-Speech, and (Automated) Speech Recognition. We share how these terms are related and how they fit into AI. Show Notes: FREE Intro to CPMAI mini course CPMAI Training and Certification AI Glossary AI Glossary Series – Content Summarization & Analysis, Sentiment Analysis AI Glossary Series – Conversational Systems, Chatbots, Voice Assistants, Machine Translation AI Today Podcast #104: Patterns of AI – Conversation / Human Interaction Continue reading AI Today Podcast: AI Glossary Series: Natural Language Processing (NLP), NLU, NLG, Speech-to-Text, TTS, Speech Recognition at Cognilytica.
Recently I returned to Kdenlive after about a 10-year break, and was pleased to discover the speech recognition feature. https://docs.kdenlive.org/en/effects_and_compositions/speech_to_text.html#install-python
In this episode, AI evangelist Juggy Jagannathan, PhD, discusses the advancement of speech recognition technology with Detlef Koll, global vice president of research and development at 3M Health Information Systems. Travel along the timeline of speech recognition history, starting with isolated word speech recognition all the way to continuous word speech recognition and automatic transcription technologies that create time to care for physicians.
What's your secret to superb audio recognition? Whisper it. We mean that literally—Whisper is the latest in OpenAI's growing suite of models aimed to benefit humanity. On this episode of Five-Minute Friday, host Jon Krohn reviews OpenAI's latest model, Whisper. This tool will vastly improve the way human speech is recognized and converted to text. Jon gets under the hood to show how the team managed to get such a powerfully accurate recognition model. Listen to the episode and find out how you can try it yourself, for free! Additional materials: www.superdatascience.com/620 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@jonkrohn.com for sponsorship information.
MLOps Coffee Sessions #129 {Podcast BTS} with Catherine Breslin, Voice and Language Tech co-hosted by Adam Sroka. // Abstract Back in the day, Speech Recognition was its own thing. It's a very different flavor of Data Science. You could not use a lot of the tools. It wouldn't cross over to this type of machine learning. Now, with the advancements, Speech Recognition and Machine Learning are coming together. It's interesting to hear right from someone with a Ph.D. level working with some of the biggest companies in the world doing it. The fact that something like Alexa is lots of models back to back and just fathom the complexity of that is quite cool! // Bio Catherine is a machine learning scientist and consultant based in Cambridge UK, and the founder of Kingfisher Labs consulting. Since completing her Ph.D. at the University of Cambridge in 2008, Catherine has commercial and academic experience in automatic speech recognition, natural language understanding, and human-computer dialogue systems, having previously worked at Cambridge University, Toshiba Research, Amazon Alexa, and Cobalt Speech. Catherine has been excited by the application of research to real-world problems involving speech and language at scale. 
// MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links www.catherinebreslin.co.uk https://catherinebreslin.medium.com/ MLOps Community Newsletter: https://airtable.com/shrx9X19pGTWa7U3Y Twitter: https://twitter.com/catherinebuk --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka Connect with Catherine on LinkedIn: https://www.linkedin.com/in/catherine-breslin-0592423a/ Timestamps: [00:00] Catherine's preferred coffee [01:50] Takeaways [03:59] Introduction to Catherine Breslin [05:04] Subscribe to our newsletter! [06:25] Catherine's background [08:13] Speech Recognition trajectory [09:36] Challenges around technologies and tools [11:34] Reflective trend [13:02] Developer experiences hiccups [15:09] Speech Recognition use case backup [16:56] Toshiba research [17:48] Transition from a research lab to working in the industry [20:01] Unit test of Speech Recognition [20:56] Alexa [22:33] Maturity process of Speech Recognition [26:48] Speech Recognition unrecognizing challenges [30:38] Mechanical Turk [33:00] Social media listening [34:05] Pipeline models and speed of Speech Recognition [36:48] Development of Speech Recognition excited about [37:23] Data from people for the Speech Recognition system vs Scouring news vs watching Youtube for a long time [40:00] Disappearing Languages [41:30] Future of an online practice partner [43:17] Speech-to-speech translation [44:04] Interesting ways to use unfamiliar models to achieve a result [45:40] Meeting transcriptions [48:37] First toy problem of a new Speech Recognition learner [51:37] 
Kingfisher Labs' problems to tackle [52:18] Off-the-shelf solution [53:38] Translation layer [54:15] Connect with Catherine on Twitter and LinkedIn for available jobs [54:43] Wrap up
Summary The increasing sophistication of machine learning has enabled dramatic transformations of businesses and introduced new product categories. At Assembly AI they are offering advanced speech recognition and natural language models as an API service. In this episode founder Dylan Fox discusses the unique challenges of building a business with machine learning as the core product. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out! Your host is Tobias Macey and today I’m interviewing Dylan Fox about building and growing a business with ML as its core offering Interview Introduction How did you get involved in machine learning? Can you describe what Assembly is and the story behind it? For anyone who isn’t familiar with your platform, can you describe the role that ML/AI plays in your product? What was your process for going from idea to prototype for an AI powered business? Can you offer parallels between your own experience and that of your peers who are building businesses oriented more toward pure software applications? How are you structuring your teams? On the path to your current scale and capabilities how have you managed scoping of your model capabilities and operational scale to avoid getting bogged down or burnt out? 
How do you think about scoping of model functionality to balance composability and system complexity? What is your process for identifying and understanding which problems are suited to ML and when to rely on pure software? You are constantly iterating on model performance and introducing new capabilities. How do you manage prototyping and experimentation cycles? What are the metrics that you track to identify whether and when to move from an experimental to an operational state with a model? What is your process for understanding what’s possible and what can feasibly operate at scale? Can you describe your overall operational patterns delivery process for ML? What are some of the most useful investments in tooling that you have made to manage development experience for your teams? Once you have a model in operation, how do you manage performance tuning? (from both a model and an operational scalability perspective) What are the most interesting, innovative, or unexpected aspects of ML development and maintenance that you have encountered while building and growing the Assembly platform? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Assembly? When is ML the wrong choice? What do you have planned for the future of Assembly? Contact Info @YouveGotFox on Twitter LinkedIn Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Assembly AI Podcast.__init__ Episode Learn Python the Hard Way NLTK NLP == Natural Language Processing NLU == Natural Language Understanding Speech Recognition Tensorflow r/machinelearning SciPy PyTorch Jax HuggingFace RNN == Recurrent Neural Network CNN == Convolutional Neural Network LSTM == Long Short Term Memory Hidden Markov Models Baidu DeepSpeech CTC (Connectionist Temporal Classification) Loss Model Twilio Grid Search K80 GPU A100 GPU TPU == Tensor Processing Unit Foundation Models BLOOM Language Model DALL-E 2 The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Brent sits down with Dr Quentin Stafford-Fraser, computer scientist, serial-entrepreneur, inventor (perhaps) of the webcam, Augmented Reality Ph.D. who ran the very first web server at the University of Cambridge, among much more. We explore topics including computer science as an art-form, the origins of the Raspberry Pi and T9 predictive text, philosophies around innovation and invention, challenging the patent system, and more. Special Guest: Quentin Stafford-Fraser.