Mark and Allen talk about the latest news in the VoiceFirst world from a developer point of view.
Mark Tucker and Allen Firstenberg celebrate 200 episodes and four years of Two Voice Devs! In this special episode, they reflect on the journey so far, the evolution of the AI landscape, and what excites them most about the future of development. Join them as they discuss: 00:00 Four years ago... 00:10 The evolution of large language models (LLMs) and how the landscape has shifted over the past year. 03:10 The emergence of new players in the AI model space and how Google, Microsoft, and Amazon are vying for dominance. 05:30 The growing trend of smaller and locally deployable models and the future of AI development. 08:00 The ongoing quest for seamless integration of conversational AI with web experiences. 10:30 The need for a convergence of traditional NLU concepts with modern AI approaches. 11:30 The pressing need for sustainability and responsible development in the AI space. 14:00 The importance of integrating AI tools with existing methods and workflows. 16:00 An open invitation for developers to join Mark and Allen as co-hosts and share their perspectives on AI development. 18:00 A reminder that learning is at the heart of the developer experience and the importance of community. 20:00 The highlights from their favorite episodes over the past four years. 23:00 The value of connection and friendship within the developer community. 26:09 Four years ago... Don't miss this milestone episode as Two Voice Devs look back and look forward!
Join Allen Firstenberg and Roger Kibbe as they delve into the exciting world of local, embedded LLMs. We navigate some technical gremlins along the way, but that doesn't stop us from exploring the reasons behind this shift, the potential benefits for consumers and vendors, and the challenges developers will face in this new landscape. We discuss the "killer features" needed to drive adoption, the role of fine-tuning and LoRA adapters, and the potential impact on autonomous agents and an appless future. Resources: * https://developer.android.com/ai/aicore * https://machinelearning.apple.com/research/introducing-apple-foundation-models Timestamps: 00:20: Why are vendors embedding LLMs into operating systems? 04:40: What are the benefits for consumers? 09:40: What opportunities will this open up for app developers? 14:10: The power of LoRA adapters and fine-tuning for smaller models. 17:40: A discussion about Apple, Microsoft, and Google's approaches to local LLMs. 20:10: The challenge of multiple LLM models in a single browser. 23:40: How might developers handle browser compatibility with local LLMs? 24:10: The "three-tiered" system for local, cloud, and third-party LLMs. 27:10: The potential for an "appless" future dominated by browsers and local AI. 28:50: The implications of local LLMs for autonomous agents.
Join us on Two Voice Devs as we welcome back Roger Kibbe. Fresh off emceeing the developer track at the Unparsed Conference in London, Roger shares his insights on the biggest takeaways, trends, and challenges facing #GenAI, #VoiceFirst and #ConversationalAI developers today. Get ready for a dose of reality as Roger emphasizes the need to view LLMs as powerful tools – think hammers – rather than magical solutions. We dive deep into: Timestamps: * 0:00 - Intro * 1:56 - Exploring the Unparsed Conference * 4:47 - LLMs: The hype vs. the reality for developers * 6:37 - The underappreciated power of LLMs for "understanding", not just generating * 11:03 - The right tool for the job: Why a toolbox approach is essential for conversational AI * 13:52 - Beyond the chatbot: Detecting emotion and the future of human communication * 20:28 - Hackathon highlights and the need for more realistic QA approaches * 28:55 - Navigating the shift from deterministic to stochastic systems * 31:59 - Will AI replace junior developers? * 36:30 - How senior developers can (and can't) benefit from AI coding assistants * 39:04 - Final thoughts: The value of cutting through the hype Don't miss this insightful conversation about the future of conversational AI development – grab your toolbox and hit play!
What should people developing with LLMs learn from a decade of experience building Alexa skills? How will Alexa skill developers leverage the latest #GenerativeAI and #ConversationalAI tools as they continue to build #VoiceFirst and multimodal skills? Join Allen and Mark on Two Voice Devs as they delve into the evolving landscape of Alexa skill development in the era of large language models (LLMs). Sparked by a thought-provoking discussion on the Alexa forums, they explore the potential benefits and challenges of integrating LLMs into skills. Key topics and timestamps: (0:00:00) Introduction (0:02:00) LLMs and the Future of Alexa Skills (0:04:00) Limitations of Current Alexa Skill Model with LLMs (0:07:00) Benefits and Drawbacks of Developing for Alexa (0:10:30) Overlooked Potential of Multimodality with LLMs (0:14:50) Lessons from Early Voice Experiences (0:17:00) Intents vs. Tool/Function Calling (0:21:30) Handling Hallucinations and Off-Topic Requests (0:22:00) LLMs' Ability to Handle Nuanced Intents (0:28:00) Cost Considerations of LLMs (0:32:00) Monetizing LLM-Powered Alexa Skills (0:39:40) The Future of Alexa Skill Development: A Hybrid Approach? (0:40:00) Outro Tune in as they discuss the need for hybrid models, the importance of conversation design, and the uncertain future of monetization in this rapidly changing landscape. Don't forget to join the conversation on the Alexa Slack channel or leave your thoughts in the comments below!
OpenAI's ChatGPT and GPT-4o announcements have sent shockwaves through the developer community! In this episode of Two Voice Devs, Mark and Allen dive into the implications of these new models, comparing them to Google's Gemini. We discuss: [00:00:10] Initial takeaways from the OpenAI presentations. [00:02:29] The impressive voice capabilities of GPT-4o in ChatGPT. [00:04:49] Concerns about OpenAI's ambitions for conversational AI. [00:07:30] The difference between "doing" and "knowing" AI systems. [00:14:15] A detailed breakdown of GPT-4o, including its strengths and weaknesses. [00:17:43] Comparison with Gemini and implications for developers. [00:19:41] The importance of competition in driving innovation and lowering prices. [00:21:48] The future of AI assistants and the role of developers. Let us know what you think about GPT-4o and Gemini! Have you used them? Share your experiences and thoughts in the comments below.
Allen Firstenberg chats with fellow Google Developer Expert (GDE) Mike Wolfson about his career, the evolution of Android, and his new interest in generative AI. Mike shares his thoughts on the future of AI with agents, Large Action Models (LAMs), and the potential of the "Rabbit," a new AI-powered device. Does the Rabbit live up to its promise? If not - what could? Timestamps: 00:00:00 - Introduction 00:01:32 - Mike's career journey 00:04:15 - Transition from enterprise Java to Android development 00:05:04 - Creating "Droid of the Day" app 00:06:49 - Becoming an Android developer and Google Developer Expert 00:09:23 - Shift in focus from Android to generative AI 00:10:57 - Generative AI as a platform 00:11:47 - The Rabbit and its potential 00:14:59 - Mike's take on the Rabbit as a developer 00:17:31 - Current integrations with the Rabbit 00:19:52 - The future of AI and the Rabbit 00:24:46 - Edge AI and its potential 00:27:16 - The capabilities of the Rabbit and its future 00:32:17 - The Rabbit vs. other devices like meta glasses 00:34:28 - Conclusion and call to action
Join Allen and Roya as they dissect the major AI announcements from Google I/O 2024. From Gemini updates and new models to responsible AI and groundbreaking projects like ASTRA, this episode dives into the future of AI development. Timestamps: [00:00:00] Introduction and Google I/O Overview [00:02:00] Gemini 1.5 Flash & Gemini 1.5 Pro: New Models and Features [00:04:30] AI Studio Access Expansion for Europe, UK & Switzerland [00:06:20] Choosing the Right AI Model for Your Project [00:06:50] Gemini Nano in Google Chrome: Bringing AI to the Browser [00:08:00] PaliGemma: Open Source Model with Image & Text Input [00:08:50] AI Red Teaming & Model Safety Tools [00:09:50] Parallel Function Calling for Developers [00:10:30] Video Frame Extraction: Easier Multimodal Development [00:11:20] Genkit: Firebase's Generative AI Integration [00:12:00] Gems: Customizable Gemini for Developers [00:12:50] Semantic Embeddings: Understanding & Creating Images [00:13:50] Imagen 3: API Access for Image Generation [00:14:20] Veo: Video Generation with Lumiere Architecture [00:14:50] SynthID: Watermarking & Identifying Generated Content [00:16:30] Responsible AI & Inclusivity [00:18:00] Gemini Developer Competition: Win a DeLorean & Cash Prizes! [00:19:30] Project ASTRA: Multimodal AI with Contextual Memory [00:21:00] Google Glasses & Project ASTRA Integration [00:22:00] Closing Thoughts: AI for Everyone
Join Allen and Mark as they delve into Voiceflow's groundbreaking new feature: intent classification using a hybrid of LLMs and classic NLU models. Discover how this innovative approach leverages the strengths of both technologies to achieve greater accuracy and flexibility in understanding user intent. How they're doing it just may blow your mind!
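For readers who want a feel for the hybrid idea before watching, here is a purely illustrative TypeScript sketch (not Voiceflow's actual implementation): a fast, deterministic NLU classifier handles confident matches, and an LLM is only consulted for low-confidence utterances, constrained to the known intent list.

```typescript
// Illustrative only: the interfaces and threshold are assumptions, not Voiceflow's API.
type IntentResult = { intent: string; confidence: number };

async function classifyIntent(
  utterance: string,
  nlu: (u: string) => Promise<IntentResult>,
  llm: (u: string, intents: string[]) => Promise<string>,
  intents: string[],
  threshold = 0.8
): Promise<string> {
  const nluResult = await nlu(utterance);
  if (nluResult.confidence >= threshold) {
    return nluResult.intent; // cheap, predictable path
  }
  // Ambiguous utterance: let the LLM pick from the known intent list,
  // constraining it so it cannot invent a new intent.
  return llm(utterance, intents);
}
```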
Join Allen Firstenberg and guest host Stefania Pecore on Two Voice Devs as they delve into the exciting announcements and highlights from Google Cloud Next 2024! This episode focuses on the latest advancements in AI and their impact on the healthcare industry, providing valuable insights for developers and tech enthusiasts. Learn more: * https://cloud.google.com/blog/topics/google-cloud-next/google-cloud-next-2024-wrap-up Timestamps: 00:00:00: Introduction 00:01:02: Stefania's background and journey into AI 00:07:20: Stefania's overall experience at Google Cloud Next 00:11:59: Focus on Healthcare and AI applications, including Mayo Clinic's Solution Studio 00:15:38: Exploring the new Gemini product suite and its features like code assistance and data analysis 00:20:44: Discussing Gemini API updates, including the 1.5 public preview with 1M token context window and grounding tools 00:26:06: Vertex AI Agent Builder and its no-code approach to chatbot development 00:33:02: Hardware announcements, including the A3 VM with NVIDIA H100 GPUs 00:35:24: Stefania's reflections on Cloud Next and the value of attending Tune in to discover the future of AI and its transformative potential, especially in the healthcare sector. Share your thoughts on the Google Cloud Next announcements in the comments below!
This episode of Two Voice Devs takes a closer look at BERT, a powerful language model with applications beyond the typical hype surrounding large language models (LLMs). We delve into the specifics of BERT, its strengths in understanding and classifying text, and how developers can utilize it for tasks like sentiment analysis, entity recognition, and more. Timestamps: 0:00:00: Introduction 0:01:04: What is BERT and how does it differ from LLMs? 0:02:16: Exploring Hugging Face and the BERT base uncased model. 0:04:17: BERT's pre-training process and tasks: Masked Language Modeling and Next Sentence Prediction. 0:11:11: Understanding the concept of masked language modeling and next sentence prediction. 0:19:45: Diving into the original BERT research paper. 0:27:55: Fine-tuning BERT for specific tasks: Sentiment Analysis example. 0:32:11: Building upon BERT: Exploring the RoBERTa model and its applications. 0:39:27: Discussion on BERT's limitations and its role in the NLP landscape. Join us as we explore the practical side of BERT and discover how this model can be a valuable tool for developers working with text-based data. We'll discuss its capabilities, limitations, and potential use cases to provide a comprehensive understanding of this foundational NLP model.
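As a taste of the masked language modeling discussed above, here is a minimal sketch in TypeScript using the transformers.js port of Hugging Face pipelines (the episode itself explores the model on Hugging Face directly; the checkpoint name matches the BERT base uncased model, and the result shape is an assumption based on the library's fill-mask output):

```typescript
import { pipeline } from "@xenova/transformers";

// Masked language modeling: BERT predicts the hidden token, which shows how it
// "understands" text rather than generating free-form prose.
const unmasker = await pipeline("fill-mask", "Xenova/bert-base-uncased");

const predictions = (await unmasker(
  "The goal of this episode is to [MASK] BERT."
)) as Array<{ token_str: string; score: number }>;

for (const p of predictions) {
  console.log(p.token_str, p.score.toFixed(3));
}
```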
Embark on a wild race with Gemma as we explore the exciting (and sometimes slow) world of running Google's open-source large language model! We'll test drive different methods, from the leisurely pace of Ollama on a local machine to the speedier Groq platform. Join us as we compare these approaches, analyzing performance, costs, and ease of use for developers working with LLMs. Will the tortoise or the hare win this race? Learn more: * Model card: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335 * Ollama: https://ollama.com/ * LangChain.js with Ollama: https://js.langchain.com/docs/integrations/llms/ollama * Groq: https://groq.com/ Timestamps: 0:00:00 - Introduction 0:03:05 - Getting to Know Gemma: Exploring the Model Card 0:05:30 - Vertex AI Endpoint: Fast Deployment, But at What Cost? 0:13:40 - Ollama: The Tortoise of Local LLM Hosting 0:17:40 - LangChain Integration: Adding Functionality to Ollama 0:21:44 - Groq: The Hare of LLM Hardware 0:26:06 - Comparing Approaches: Speed vs. Cost vs. Control 0:27:35 - Future of Open LLMs and Google Cloud Next #GemmaSprint This project was supported, in part, by Cloud Credits from Google
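If you want to follow along at home, here is a minimal sketch of the Ollama plus LangChain.js path from the episode, assuming Ollama is installed locally and `ollama pull gemma` has already been run:

```typescript
import { Ollama } from "@langchain/community/llms/ollama";

// Points at the default local Ollama endpoint and the Gemma model pulled earlier.
const model = new Ollama({
  baseUrl: "http://localhost:11434",
  model: "gemma",
});

// The prompt is illustrative; expect the "tortoise" pace described in the episode
// when running on a typical laptop.
const answer = await model.invoke(
  "In one sentence, what is the difference between a tortoise and a hare?"
);
console.log(answer);
```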
The Alexa Developer Rewards Program (ADR) is shutting down, leaving many developers wondering about the future of Alexa skills. Mark and Allen discuss the implications of this change, explore alternative monetization options, and share their thoughts on the future of skill development. Timestamps: 0:00 - Intro and announcement of the ADR program ending 1:45 - History of the ADR program and its impact on skill development 7:13 - Discussion of the Skill Developer Accelerator Program (SDAP) and Skill Coach 14:04 - Status of AWS credits for skill developers 15:10 - Incentives for building skills in the absence of the ADR program 21:30 - Cost-benefit analysis and the future of skill development 25:48 - Call to action: Share your thoughts on the ADR program ending and the future of skills Join the conversation and let us know what you think!
As large language models (LLMs) become increasingly powerful, ensuring their responsible use is crucial. In this episode of Two Voice Devs, Allen and Mark delve into Google's Gemini LLM, specifically its built-in safety features designed to prevent harmful outputs like harassment, hate speech, sexually explicit content, and dangerous information. Join them as they discuss: (00:01:55) The importance of safety features in LLMs and Google's approach to responsible AI. (00:03:08) A walkthrough of Gemini's safety settings in AI Studio, including the four categories of evaluation and developer control options. (00:06:51) Examples of how Gemini flags potentially harmful prompts and responses, and how developers can adjust settings to control output. (00:08:55) A deep dive into the API, exploring the parameters and responses related to safety features. (00:19:38) The challenges of handling incomplete responses due to safety violations and the need for better recovery strategies. (00:26:47) The importance of industry standards and finer-grained control for responsible AI development. (00:29:00) A call to action for developers and conversation designers to discuss and collaborate on best practices for handling safety issues in LLMs. This episode offers valuable insights for developers working with LLMs and anyone interested in the future of responsible AI. Tune in and share your thoughts on how we can build safer and more ethical AI systems!
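As a companion to the API walkthrough, here is a minimal sketch using the Google Generative AI Node SDK showing where the safety settings and safety ratings discussed in the episode appear; the category and threshold chosen here are just examples:

```typescript
import {
  GoogleGenerativeAI,
  HarmCategory,
  HarmBlockThreshold,
} from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);

// Tighten one of the four safety categories beyond its default.
const model = genAI.getGenerativeModel({
  model: "gemini-pro",
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});

const result = await model.generateContent("Write a friendly greeting.");

// Each candidate carries safetyRatings; a blocked response reports a SAFETY
// finish reason, and promptFeedback explains prompts that were blocked outright.
console.log(result.response.text());
console.log(result.response.candidates?.[0]?.safetyRatings);
```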
In this episode of Two Voice Devs, Mark and Allen discuss how developers can leverage AI tools like ChatGPT to improve their workflow. Mark shares his experience using ChatGPT to generate an OpenAPI specification from TypeScript types, saving him significant time and effort. They discuss the benefits and limitations of using AI for code generation, emphasizing the importance of understanding the generated code and maintaining healthy skepticism. Timestamps: 00:00:00 Introduction 00:00:49 Using AI as a developer tool 00:01:17 Generating OpenAPI specifications with ChatGPT 00:04:02 Mark's prompt and TypeScript types 00:05:37 Reviewing the generated OpenAPI specification 00:07:12 Adding request examples with ChatGPT 00:10:11 Benefits and limitations of AI code generation 00:13:43 Using AI tools for learning and understanding code 00:17:39 Trusting AI-generated code and potential for bias 00:19:04 Integrating AI tools into the development workflow 00:22:38 The future of AI in software development 00:23:17 Programmers as problem solvers, not just code writers 00:25:41 AI as a tool in the developer's toolbox 00:26:07 Call to action: Share your experiences with AI tools This episode offers valuable insights for developers interested in exploring the potential of AI to enhance their productivity and efficiency.
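To give a flavor of the workflow, here is an illustrative sketch using the OpenAI Node SDK; the TypeScript type and the prompt are stand-ins, not Mark's actual prompt or types from the episode:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// A hypothetical type, standing in for the real ones discussed in the episode.
const typeSource = `
export interface Order {
  id: string;
  items: { sku: string; quantity: number }[];
  total: number;
}`;

const completion = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [
    {
      role: "user",
      content:
        "Generate an OpenAPI 3.0 YAML specification for a REST API that " +
        "returns the following TypeScript type from GET /orders/{id}. " +
        "Include a request example.\n\n" + typeSource,
    },
  ],
});

// As the episode stresses: review the generated spec by hand before using it.
console.log(completion.choices[0].message.content);
```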
Join us on Two Voice Devs as we chat with Xavi, Head of Cloud Infrastructure at Voiceflow, about the exciting new Voiceflow Functions feature and the future of conversational AI development. Xavi shares his journey into the world of bots and assistants, dives into the technology behind Voiceflow's infrastructure, and explains how functions empower developers to create custom, reusable components for their conversational experiences. Timestamps: 00:00:00 Introduction 00:00:49 Xavi's journey into conversational AI 00:06:08 Voiceflow's infrastructure and technology 00:09:29 Voiceflow's evolution and direction 00:13:28 Introducing Voiceflow Functions 00:16:05 Capabilities and limitations of functions 00:20:35 Future of Voiceflow Functions 00:21:02 Sharing and contributing functions 00:24:02 Technical limitations of functions 00:25:35 Closing remarks and call to action Whether you're a seasoned developer or just getting started with conversational AI, this episode offers valuable insights into the evolving landscape of bot development and the powerful capabilities of Voiceflow.
In this episode of Two Voice Devs, Allen Firstenberg and Roger Kibbe explore the rising trend of local LLMs, smaller language models designed to run on personal devices instead of relying on cloud-based APIs. They discuss the advantages and disadvantages of this approach, focusing on data privacy, control, cost efficiency, and the unique opportunities it presents for developers. They also delve into the importance of fine-tuning these smaller models for specific tasks, enabling them to excel in areas like legal contract analysis and mobile app development. The conversation dives into various popular local LLM models, including: Mistral: Roger's favorite, lauded for its capabilities and ability to run efficiently on smaller machines. Phi-2: A tiny model from Microsoft ideal for on-device applications. Llama: Meta's influential model, with Llama 2 currently leading the pack and Llama 3 anticipated to be comparable to ChatGPT 4. Gemma: Google's new open-source model with potential, but still under evaluation. Learn more: Ollama: https://ollama.com/ Ollama source: https://github.com/ollama/ollama LM Studio: https://lmstudio.ai/ Timestamps: 00:00:00: Introduction and welcome back to Roger Kibbe. 00:01:31: Roger discusses his career path and his passion for voice and AI. 00:06:33: The discussion turns to the larger vs. smaller LLMs. 00:13:52: Understanding key terminology like quantization and fine-tuning. 00:20:58: Roger shares his favorite local LLM models. 00:25:14: Discussing the strengths and weaknesses of smaller models like Gemma. 00:30:32: Exploring the benefits and challenges of running LLMs locally. 00:39:15: The value of local LLMs for developers and individual learning. 00:40:29: The impact of local LLMs on mobile devices and app development. 00:49:27: Closing thoughts and call for audience feedback. Join Allen and Roger as they explore the exciting potential of local LLMs and how they might revolutionize the development landscape!
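For a sense of how simple it is to talk to a locally hosted model once a runtime like Ollama is installed, here is a minimal sketch against Ollama's local REST endpoint; the model choice and prompt are illustrative, and the endpoint shape follows the Ollama docs linked above:

```typescript
// Assumes Ollama is running locally and `ollama pull mistral` has been done.
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mistral",
    prompt: "Summarize the key clauses to check in an NDA.",
    stream: false, // return a single JSON object instead of a token stream
  }),
});

const data = await res.json();
console.log(data.response);
```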
Join Allen and Mark on Two Voice Devs as they dive into the world of Large Action Models (LAMs) and explore their potential to revolutionize how we build chatbots and voice assistants. Inspired by Braden Ream's article "How Large Action Models Work and Change the Way We Build Chatbots and Agents," the discussion dissects the core functions of conversational AI - understand, decide, and respond - and examines how LAMs might fit into this framework. Allen and Mark also compare and contrast LAMs with Large Language Models (LLMs) and Natural Language Understanding (NLU), highlighting the strengths and limitations of each approach. Tune in to hear their insights on: The evolution of Voiceflow and its shift towards LLMs (03:20) Understanding the core functions of conversational AI (05:40) Clippy as an example of a deterministic agent (06:15) The differences between deterministic and probabilistic models (07:50) NLU vs. LLMs for understanding user input (09:20) How LAMs might fit into the "decide" stage of conversational AI (18:50) The challenges of training LAMs and avoiding hallucinations (20:00) The potential of LAMs to improve response generation (29:30) Cost considerations of using LLMs vs. NLUs (37:00) Whether you're a seasoned developer or just curious about the future of conversational AI, this episode offers a thought-provoking discussion on the potential of LAMs and the challenges that lie ahead. Be sure to share your thoughts in the comments below! Additional Info: https://www.voiceflow.com/blog/large-action-models-change-the-way-we-build-chatbots-again
Google's Gemini 1.5 is here, boasting a mind-blowing 1 million token context window!
In this episode of Two Voice Devs, hosts Allen Firstenberg and Mark Tucker discuss Gemini, Google's latest name for its Generative AI... stuff. Originally known as separate products including Bard and Duet AI, Gemini encompasses a suite of AI tools, including chatbots, product-specific assistants, models, and APIs that developers can use for various tasks. The discussion covers how Gemini compares with offerings from other companies such as OpenAI and Microsoft, including visible similarities and differences. The show concludes by answering the question about why developers should care about this rename with a call to explore possibilities with AI tools like Gemini to let us create more natural and user-friendly interfaces. Learn more: https://blog.google/technology/ai/google-gemini-update-sundar-pichai-2024/ https://blog.google/products/gemini/bard-gemini-advanced-app/ 00:04 Introduction and Catching Up 00:55 Exploring the Gemini Model 04:09 Gemini vs OpenAI: A Comparison 10:20 Understanding the Gemini Branding 12:00 The Developer's Perspective on Gemini 17:46 Closing Thoughts and Future Discussions
In this episode of Two Voice Devs, hosts Allen Firstenberg and Mark Tucker discuss the CSS Speech Module Level 1 Candidate Recommendation Draft, a standard that enables webpages to talk, developed in collaboration with the voice browser activity. They explore its features including the 'aural' box model concept, voice families, earcons and more, drawing parallels with SSML and highlight its innovative approach to web accessibility complementing screen readers. Despite acknowledging its potential, they address some of its key omissions such as phonemes and the lack of a background audio feature. 00:04 Introduction and Welcome 01:14 Exploring the Concept of Webpages Talking 03:00 Deep Dive into CSS Speech Module 03:48 Understanding the Scope of CSS Speech Module 04:27 The Evolution of Voice Interaction 05:22 Comparing CSS Speech with SSML 07:13 The Power of CSS in Voice Development 22:49 The Impact of Voice Balance Property 29:20 The Limitations of CSS Speech 39:37 The Future of CSS Speech 42:50 Conclusion and Final Thoughts
Forget Apps! Talking to this Orange Cube Could Change Everything. Is the app model broken? The creators of Rabbit R1, a new voice-first device, certainly think so. In this episode of Two Voice Devs, Mark and Allen break down this innovative device and its potential to change how we interact with technology. What do developers think about the technology underlying RabbitOS? You may be surprised! Key topics: 00:02:00 - What is the Rabbit R1? Rabbit R1 is a new type of device that prioritizes voice input and output. It aims to shift users away from apps and toward a more conversational way of interacting with technology. 00:05:17 - AI models: Rabbit uses a unique "large action model" to understand and complete tasks. It claims to do this faster and more intuitively than existing voice assistants. 00:14:14 - Teach Me mode: See how Rabbit can be trained to interact with new websites and applications. What implications does this have for the future? 00:18:41 - Can it replace apps? While that's a bold claim, Rabbit's conversational approach and innovative features show promise. Could this be the first step towards a new era in human-computer interaction? Additional thoughts: 00:25:06 - Hybrid approach: Rabbit smartly combines intent-based and language-based AI models, potentially offering speed and accuracy. 00:32:56 - Asynchronous interactions: It breaks away from the traditional request-response model, offering a more natural conversational experience that aligns with the Star Trek computer vision. 00:07:48 - Price: At just $199, many people are willing to check it out, and this could accelerate interest in voice-driven interfaces. Is Rabbit R1 a game-changer or just a gimmick? Let us know your thoughts in the comments!
In this episode of 'Two Voice Devs', hosts Allen Firstenberg and Mark Tucker discuss updates made to Alexa Presentation Language (APL) version 2023.3. They highlight conditional imports, updates made for animations, and more, including APL support for different devices and how to "handle" backward compatibility. Learn More: https://developer.amazon.com/en-US/docs/alexa/alexa-presentation-language/apl-latest-version.html 00:08 Introduction and Welcome 00:17 Alexa Presentation Language (APL) Overview 01:02 Understanding APL and its Components 03:23 Exploring APL's Functionality and Usage 05:22 APL's Versioning Strategy and Device Compatibility 09:23 New Features in APL 2023.3: Conditional Imports 15:22 New Features in APL 2023.3: Item Insertion and Removal Commands 18:05 New Features in APL 2023.3: Control Over Scrolling and Paging 19:43 New Features in APL 2023.3: Accessibility Improvements 20:36 New Features in APL 2023.3: Frame Component Deprecation 22:23 New Features in APL 2023.3: Data Property for Sequential and Parallel Commands 25:07 New Features in APL 2023.3: Support for Variable Sized Viewports 26:47 New Features in APL 2023.3: Support for Lottie Files 28:33 New Features in APL 2023.3: String Functions and Vector Graphic Improvements 30:11 New Features in APL 2023.3: Extensions and APL Cheat Sheets 37:26 Strategies for Backwards Compatibility in APL 38:40 Conclusion and Farewell
In their New Year's discussion, Mark and Allen explore their hopes and predictions for technological advancements in 2024. They discuss the future of Large Language Models (and if that's the right name for them now), expressing anticipation for improvements in latency issues and the potential for models to be hosted on devices rather than cloud-based platforms. The conversation also ventures into the world of AI agents, function calling, and the importance of developers in ensuring safety measures are integrated in AI systems. Finally, they exude excitement about the possibility of AI in multimedia formats, where tools can generate differing output forms like text, video, images, and possibly even audio directly. They explore potential developer opportunities and challenges, emphasizing the importance of understanding regulations and ensuring user privacy and safety. 00:04 Introduction and New Year Reflections 02:05 Looking Forward: Predictions for 2024 02:14 The Future of Large Language Models (LLMs) 03:08 The Impact of LLMs on Voice Assistants 07:44 The Potential of On-Device AI Models 10:14 The Role of Developers in the AI Landscape 20:11 The Future of Multimodal AI Models 26:35 The Importance of Regulations in AI 29:22 Conclusion: Exciting Times Ahead
Allen Firstenberg and Mark Tucker, hosts of Two Voice Devs, reflect on the year 2023, discussing significant changes and trends in the #VoiceFirst and #GenerativeAI industry and where their predictions from last year were accurate... or fell short. They discuss the transformation and challenges Amazon faced, gleaning predictions from hints at large language models (LLMs) from Google, Amazon, Microsoft, and Apple. They also mention the shift of Voiceflow towards LLMs and recall the notion of retrieval augmented generation. 00:04 Introduction and Welcome 00:12 Reflecting on the Past Year 01:13 Amazon's Progress and Challenges 01:59 Exploring Amazon's Monetization and Widgets 08:45 Google's Journey and the End of Conversational Actions 11:53 The Rise of Large Language Models (LLMs) 17:04 The Impact of Voiceflow and Dialogflow 20:48 Closing Remarks and New Year Wishes
Mark and Allen get into the Tech-mas spirit, with a little help from Bard. Hoping you all have the happiest of holiday seasons. #GenerativeAI #VoiceFirst #ConversationalAI #HappyHolidays
In this in-depth chat between Allen Firstenberg and Linda Lawton, they dive into the functionalities and potential of Google's newly released Gemini model. From their initial experiences to exciting possibilities for the future, they discuss the Gemini Pro and Gemini Pro Vision models, how to #BuildWithGemini, its focus on both text and images, and speedier and more cohesive responses compared to older models. They also delve into its potential for multi-modal support, unique reasoning capabilities, and the challenges they've encountered. The conversation draws interesting insights and sparks exciting ideas on how Gemini could evolve in the future. 00:04 Introduction and Welcome 00:23 Discussing the New Gemini Model 01:33 Comparing Gemini and Bison Models 02:07 Exploring Gemini's Vision Model 03:03 Gemini's Response Quality and Speed 03:53 Gemini's Token Length and Context Window 05:05 Gemini's Pricing and Google AI Studio 05:33 Upcoming Projects and Previews 06:16 Gemini's Role in Code Generation 07:54 Gemini's Model Variants and Limitations 12:01 Creating a Python Desktop App with Gemini 14:07 Gemini's Potential for Assisting the Visually Impaired 18:35 Gemini's Ability to Reason and Count 20:15 Gemini's Multi-Step Reasoning 20:33 Testing Gemini with Multiple Images 21:52 Exploring Image Recognition Capabilities 22:13 Discussing the Limitations of 3D Object Recognition 23:53 Testing Image Recognition with Personal Photos 24:52 Potential Applications of Image Recognition 25:45 Exploring the Multimodal Capabilities of the AI 26:41 Discussing the Challenges of Using the AI in Europe 27:26 Exploring the AQA Model and Its Potential 33:37 Discussing the Future of AI and Image Recognition 37:12 Wishlist for Future AI Capabilities 40:11 Wrapping Up and Looking Forward
Join Allen Firstenberg and guest host Noble Ackerson at the Voice and AI 2023 conference. They discuss the growth of AI and how LLMs (large language models) are affecting the tech world and delve deep into topics like LangChain, generative AI, and how to optimize AI operations to tackle network latency. There are also plenty of audience questions, exploring the current challenges in AI and potential solutions. 00:03 Introduction and Background of Two Voice Devs 00:31 The Evolution of Voice Technology and AI 01:50 Interactive Q&A Session Begins 01:58 Discussion on Open Source Software and Generative AI 02:59 Deep Dive into LangChain 05:43 Audience Participation and Questions 06:00 Challenges with LangChain and Overhead 08:14 Exploring the Intersection of Voice Technology and Generative AI 12:51 Addressing Network Latency in Voice Technology 19:49 The Future of AI and Voice Technology 26:53 Addressing the Challenges of Network Latency 37:13 Closing Remarks and Future Engagements
Join Mark Tucker and Allen Firstenberg on Thanksgiving Day for a sincere heart-to-heart on the highs and lows of their tech industry journey. Expressing their gratitude for their family, friends, and colleagues in the tech industry and beyond, they acknowledge the challenging times faced by many. They call on their viewers to remember how unique and important they are and invite them to express their thoughts and emotions openly by reaching out to them. 00:04 Introduction and Thanksgiving Greetings 00:28 Reflecting on the Past Year 02:19 Gratitude for Personal Relationships 03:54 Acknowledging Industry Challenges and Layoffs 05:59 Importance of Community and Support 07:59 Encouragement and Closing Remarks
Mark Tucker and Allen Firstenberg delve into the recent changes made by Voiceflow. We explore how Voiceflow, originally a design resource for Alexa Skills and Google Assistant Actions, has evolved and shifted to include chatbot roles and generative AI responses. Highlighted too are the implications of Voiceflow's decoupling and transition to 'bot logic as a service'. We look at the necessary technical adjustments and solutions required in the aftermath of these changes, and Mark shares how he created a Jovo plugin as a hassle-free 'integration layer' for handling multiple platforms, taking advantage of Jovo's generic input/output. More info: https://github.com/jovo-community/jovo4-voiceflowdialog-app 00:04 Introduction 00:54 Introducing Voiceflow 01:44 Exploring Voiceflow's Evolution 03:13 Understanding Voiceflow's Changes 05:39 Explaining the Voiceflow Integration 14:39 Discussing the Voiceflow Dialog API 25:42 Conclusion
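For context, here is a minimal sketch of the kind of Dialog API call the Jovo plugin wraps; the endpoint and request shape follow the Voiceflow Dialog API, while the user ID and utterance are placeholders:

```typescript
const VF_API_KEY = process.env.VF_API_KEY!; // Voiceflow Dialog API key
const userId = "demo-user";

const res = await fetch(
  `https://general-runtime.voiceflow.com/state/user/${userId}/interact`,
  {
    method: "POST",
    headers: {
      Authorization: VF_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      action: { type: "text", payload: "Hello there" },
    }),
  }
);

// The response is a list of traces (speak, text, choice, ...) that an
// integration layer maps back to platform-specific output.
const traces = await res.json();
for (const trace of traces) {
  if (trace.type === "text" || trace.type === "speak") {
    console.log(trace.payload?.message);
  }
}
```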
On this episode, Mark Tucker and Allen Firstenberg dive deep into the latest announcements by OpenAI. They discuss various developments including the launch of GPTs (collections of prompts and documents with configuration settings), the new text-to-speech model, upcoming GPT-4 Turbo, reproducible outputs, and the introduction of the Assistant API. While they express excitement for what these developments could mean for #VoiceFirst, #ConversationalAI, and #GenerativeAI, they also voice concerns about discovery solutions, monetization, and the reliance on platform-based infrastructure. Tune in and join the conversation. More info: https://openai.com/blog/new-models-and-developer-products-announced-at-devday 00:04 Introduction and OpenAI Announcements Edition 00:52 Discussion on OpenAI's New Text to Speech Model 02:15 Exploring the Pricing and Quality of OpenAI's Text to Speech Model 02:52 Concerns and Limitations of OpenAI's Text to Speech Model 06:24 Introduction to GPT 4 Turbo 06:48 Benefits and Limitations of GPT 4 Turbo 09:27 Exploring the Features of GPT 4 Turbo 18:52 Introduction to GPTs and Their Potential 22:22 Concerns and Questions About GPTs 32:14 Discussion on the Assistant API 37:32 Final Thoughts and Wrap Up
Allen and Mark discuss the practical uses and advantages offered by MakerSuite, an API currently available for Google's PaLM #GenerativeAI model. We look at its unique feature that treats prompts like templates, allowing for versatile manipulation of these templates for varying results. We further delve into how it saves these prompts in Google Drive and how this can be linked to LangChain's new hub concept, leading to an effective 'MakerSuite hub.' Finally, we explore if prompts are more like code or content, and how that fits into the development process. What do you think? More info: MakerSuite: https://makersuite.google.com/ MakerSuite Hub in LangChain JS: https://js.langchain.com/docs/ecosystem/integrations/makersuite
Mark and Allen explore TypeChat - a new library from Microsoft that makes prompt engineering for function-like operations in #ConversationalAI easier and more robust. Is this a replacement for Intents? Does it go beyond what we could do with Intent-based systems? Is it lacking something? Let's explore! Learn more: https://github.com/microsoft/TypeChat
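Here is a minimal sketch of the TypeChat pattern, adapted from the library's published sentiment sample; the exact API surface has shifted between TypeChat releases, so treat this as illustrative:

```typescript
import { createJsonTranslator, createLanguageModel } from "typechat";

// The "schema" is plain TypeScript: the model is asked to produce JSON
// that type-checks against it (mirroring TypeChat's sentiment sample).
interface SentimentResponse {
  sentiment: "negative" | "neutral" | "positive";
}
const schema = `
export interface SentimentResponse {
  sentiment: "negative" | "neutral" | "positive";
}`;

const model = createLanguageModel(process.env); // e.g. OPENAI_API_KEY
const translator = createJsonTranslator<SentimentResponse>(
  model,
  schema,
  "SentimentResponse"
);

const response = await translator.translate("TypeChat is pretty neat!");
if (response.success) {
  console.log(response.data.sentiment); // a typed result, not free-form text
} else {
  console.log(response.message);
}
```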
What started as a casual conversation between Mark and Allen turned into a brief exploration of what Retrieval Augmented Generation (RAG) means in the #GenerativeAI and #ConversationalAI world. Toss in some discussion about Voiceflow and Google's Vertex AI Search and Conversation and we have another dive into the current hot method to bridge the Fuzzy Human / Digital Computer divide.
Last week, before Google's annual hardware event, Allen teased part of his prediction about Google Assistant and Bard. This week, we'll show the full clip of Allen's prediction and see just how close he was. Then Mark and Allen discuss how recent announcements from OpenAI, Amazon Alexa, and Google compare to each other and, more importantly, what they each mean for developers in a #GenerativeAI, #ConversationalAI, and perhaps even a #VoiceFirst world, and perhaps make a few more predictions about what we'll hear next. More info: Blog post about Assistant With Bard: https://blog.google/products/assistant/google-assistant-bard-generative-ai/ Announcement at the Made By Google event: https://www.youtube.com/live/pxlaUCJZ27E?si=I1noN-l3LQHgBktp&t=2941
The Google Cloud Next conference is a massive display of the latest technologies and products available from Google Cloud - from AI to Zero-Trust solutions. Unsurprisingly, #MachineLearning was prominent in this year's show, so Mark and Allen take a look at some of the biggest #GenerativeAI and #ConversationalAI announcements this year. More info: https://cloud.google.com/blog/topics/google-cloud-next/next-2023-wrap-up
Mark shares the exciting news that Amazon Alexa will soon have a #VoiceFirst #ConversationalAI LLM chat mode! While Allen agrees that this is very exciting news, he still has quite a few questions about how #GenerativeAI technology will fit into Alexa skills. We ask the difficult questions and see what answers are currently out there. What do you think about this announcement from Alexa? More info: LLM feature description: https://developer.amazon.com/en-US/blogs/alexa/alexa-skills-kit/2023/09/alexa-llm-fall-devices-services-sep-2023 Event video: https://youtu.be/_JcP7N0QPOk
Noble and Allen take a look back at our experiences at this year's VOICE + AI conference. What were the big topics being discussed? The amusing moments? And what do we want to see next year? #GenerativeAI #ConversationalAI #VoiceFirst
Allen and guest host Linda have a wide ranging conversation, from Linda's career path and her experiences as a Google Developer Expert for Google Analytics, to how she leveraged that knowledge while trying out something new with Google's #GenerativeAI tool, MakerSuite and the PaLM API. We take a close look at how developers can use prompts (more than one!) to help turn a user's request into actionable data structures that feed into an API and get results. More from Linda: https://LindaLawton.DK https://daimto.com #MakerSuiteSprint #LargeLanguageModel
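As a rough illustration of the prompt-to-structured-data idea (not Linda's actual prompts), here is a sketch using the PaLM API Node client; the field names and example request are made up for this example:

```typescript
import { TextServiceClient } from "@google-ai/generativelanguage";
import { GoogleAuth } from "google-auth-library";

const client = new TextServiceClient({
  authClient: new GoogleAuth().fromAPIKey(process.env.PALM_API_KEY!),
});

// First prompt: turn a fuzzy user request into a structured JSON object
// that downstream code can feed into an API.
const userRequest =
  "I need a table for four somewhere quiet next Friday evening";
const prompt = `Extract a JSON object with fields "partySize", "date", and
"preferences" from this restaurant request. Respond with JSON only.

Request: ${userRequest}`;

const [result] = await client.generateText({
  model: "models/text-bison-001",
  prompt: { text: prompt },
});

console.log(result.candidates?.[0]?.output);
```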
We're just days away from the annual VOICE+AI conference, hosted this year in Washington, DC. Both Allen and Noble will be speaking (and hosting a live and in person recording of a future episode!), so we'll give a little preview of what you can hear if you're attending.
Allen and Mark revisit a conversation from episode 146 where they discovered Google had a Vector Database. Now, several months later, Allen has done some work with the Google Cloud Vertex AI Matching Engine and incorporated it into LangChain JS. We discuss why this is important, and how it fits into the overall landscape of LLMs and MLs today. (And Allen has a little announcement towards the end.) More info: * Matching Engine: https://cloud.google.com/vertex-ai/docs/matching-engine/overview * LangChain JS: https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/googlevertexai
This seems like an easy question, right? If you want to do #ConversationalAI or #GenerativeAI on your own machine with a model such as Llama 2, you can just download the model and... well... then what? This is the question posed to guest host Noble Ackerson - and the answer was both more complicated and simpler than Allen could imagine!
Amazon has made some changes to the Alexa Presentation Language, dubbing this version 2023.2, and Allen is a bit confused about what these updates bring. Mark, however, clarifies what's new, how it relates to what was previously available, and why some users can benefit from this latest APL release.
One of the neat features we've seen come out of the #GenerativeAI and #ConversationalAI explosion recently has been the attention being paid to text embeddings and how they can be used to radically change how we index and search for things. Allen, however, has recently been working with an image embedding model from Google, including incorporating it into LangChain JS. Mark asks about what that process was like, what this new model lets us do, and starts to explore some of the potential of this new tool that is available for everyone. References: LangChain JS module: https://js.langchain.com/docs/modules/data_connection/experimental/multimodal_embeddings/google_vertex_ai Information from Google: https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-image-embeddings Google Model Garden info: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/5 XKCD: https://xkcd.com/1425/
Three years of Two Voice Devs! There's no doubt that the #VoiceFirst industry has changed over that time, with the rise of #GenerativeAI and #ConversationalAI taking the world by storm. Mark and Allen look back at how the show has evolved over this time, and why we hope you'll be joining us as we continue forward on our journey!
Guest Host Xavier Portilla returns to chat with Allen about some of the latest additions to Dialogflow CX. New system functions make some of the processing you can do on inputs easier and faster, while prebuilt flows and flow scoped parameters make it easier to have clearly defined, and reusable, components in your conversation design. More info: https://cloud.google.com/dialogflow/docs/release-notes#July_05_2023
Guest host Xavier Portilla joins Allen to take a look at a new slot type that the Alexa team has in public beta. How can this new type be used? How does it differ from previous slot types? And what is a slot type anyway?
Guest Host Leslie Pound joins Allen to discuss her perspective on software development and #GenerativeAI: rather than trying to translate our fuzzy side, developers should think about how these tools can make us more aware of how users are seeking to be inspired or creative.
Noble Ackerson returns to discuss a recent presentation that Allen made to the Google Developer Group NYC chapter, where he illustrates how #GenerativeAI can be used as a bridge between the discrete nature of computers and the "fuzzy" nature of humans. He and Noble discuss how Large Language Models, such as OpenAI's GPT models and Google's PaLM 2, along with libraries like LangChain, become a powerful tool in every developer's toolbox.
Allen is joined by Noble Ackerson to discuss the latest feature that OpenAI has included with its GPT models. Functions provide a well-defined way for developers to turn unstructured human input into a more structured format that can be processed by your code or by a library such as LangChain. We take a look at how they can be used, as well as some of the open questions that remain about their use. More info: - https://platform.openai.com/docs/guides/gpt/function-calling
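Here is a minimal sketch of the function-calling flow using the OpenAI Node SDK; the get_weather function is a hypothetical example, not something from the episode:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Describe a (hypothetical) function the model may ask us to call.
const functions = [
  {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. Boston" },
        unit: { type: "string", enum: ["celsius", "fahrenheit"] },
      },
      required: ["city"],
    },
  },
];

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "user", content: "Is it jacket weather in Boston today?" },
    ],
    functions,
    function_call: "auto",
  });

  const message = completion.choices[0].message;
  if (message.function_call) {
    // The model returns the function name plus JSON arguments as a string;
    // our code is responsible for parsing and actually executing it.
    const args = JSON.parse(message.function_call.arguments);
    console.log("Model wants to call:", message.function_call.name, args);
  }
}

main();
```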
This week, Google completed the "sunset" of Conversational Actions for the Google Assistant. Mark and Allen discuss the ups and downs of Actions on Google, how it fit into the #VoiceFirst landscape, and what may come next.
Another milestone episode! Mark and Allen take advantage of the event to look back at our predictions from episode 100, look back at how #VoiceFirst development has changed over the past 50 episodes (and several years), and look forward to what we'll be talking about in the next 50 episodes.