Podcasts about text analytics

Process of analysing text to extract information from it

  • 38 PODCASTS
  • 66 EPISODES
  • 24m AVG DURATION
  • ? INFREQUENT EPISODES
  • Sep 11, 2024 LATEST
[Popularity chart for "text analytics", 2017–2024]


Best podcasts about text analytics

Latest podcast episodes about text analytics

Cloud Wars Live with Bob Evans
Teradata's Hillary Ashton on Open Table Formats Driving Value for Customers | Cloud Wars Live

Cloud Wars Live with Bob Evans

Sep 11, 2024 · 16:05


Creating Value with Teradata

The Big Themes:
- Open Table Formats: Datasets are one of the most common ways that organizations use open table formats, since they enable organizations to combine several types of data and access that data. Open table formats are improving performance, which will help drive a more flexible, low-cost storage option for enterprises. This level of customer choice will drive greater adoption and better outcomes for customers deploying open table formats over time.
- Trusted AI: Trusted systems require access to data, but it must be properly managed data. Open table formats can help with trusted data progression, for example by reducing data duplication through consolidation and by providing a single place of oversight. As open table formats match agility and flexibility with the appropriate levels of governance, they can deliver trusted outcomes that carry over into trusted AI.
- Driving Customer Value and Success: Providing customers with opportunities for success is the ultimate driver. For example, Teradata supported a call center that wanted to improve customer satisfaction and outcomes based on its call center data. Teradata provided the organization with text analytics, large language models, and more, driven through a Teradata analytic model engine that delivers real-time advice to agents.

The Big Quote: "We've always said, 'Whoever has access to the most data can win in the analytics space.' So, open table formats are a key component of really helping companies create an environment of trust around the data because without trusted data, you can't have trusted AI."

Data Transforming Business
Maximizing Data Relationships with Text Analytics: LLMs and Knowledge Graphs

Data Transforming Business

May 24, 2024 · 13:38


Maximising data relationships through text analytics, particularly with tools like LLMs and knowledge graphs, offers organisations unprecedented insights and capabilities. By leveraging these advanced technologies, businesses can unlock hidden connections and patterns within their data, leading to more informed decision-making and strategic planning. Integrating Ontotext's solutions is a game-changer, empowering organisations to extract, organise, and visualise complex information from unstructured data sources. With Ontotext's expertise in semantic technology, businesses can construct robust knowledge graphs that offer a comprehensive understanding of their data landscape. This approach not only facilitates better analysis and interpretation of data but also ignites innovation and propels business growth in today's increasingly data-driven world.

In this episode of the EM360 Podcast, Paulina Rios Maya, Head of Industry Relations, speaks to Doug Kimball, Chief Marketing Officer at Ontotext, to discuss:
- AI in Enterprise Knowledge
- LLMs
- Knowledge Graphs

Chapters:
00:00 - Challenges of Integrating LLMs into Enterprise Knowledge Management Systems
04:35 - Enhancing Compatibility and Efficacy with Knowledge Graphs
07:21 - Innovative Strategies for Integrating LLMs into Knowledge Management Frameworks
11:07 - The Future of LLM-Driven Knowledge Management Systems: Intelligent Question Answering and Insight Enablement

Now that's Significant
Text analytics that transform market research with Tovah Paglaro

Now that's Significant

Jan 23, 2024 · 47:23


On this episode of Now that's Significant, we talk about the text analytics transforming market research. Host Michael Howard, the Head of Marketing at Infotools, is joined by Tovah Paglaro, Co-founder and Chief Operations Officer at Fathom That. Fathom is an organization that seeks to be the most trusted text analytics software, to drive empathetic decision making based on open-ended listening, and to become the new standard for employee, customer & public opinion research.

In this episode, we dive into:
- Why text analytics will play a big role in market research, with some background into the story behind Fathom.
- The main kinds of problems that Fathom helps insights teams to overcome.
- The general feeling that market researchers have towards text analytics.
- The potential causes of skepticism toward adopting new technology.
- And what is potentially needed to help people see the value and use newer technologies like next-generation text analytics platforms.

We hope you enjoy the show.

***

Infotools Harmoni is a fit-for-purpose market research analysis, visualization, and reporting platform that gives the world's leading brands the tools they need to better understand their consumers, customers, organization, and market. Established in 1990, we work with some of the world's top brands, including Coca-Cola, Orange, Samsung, and Mondelēz. Our powerful cloud-based platform, Harmoni, is purpose-built for market research. From data processing to investigation, dashboards to collaboration, Harmoni is a true "data-to-decision-making" solution for in-house corporate insights teams and agencies. While we don't facilitate market research surveys, we make it easy for you to find and share compelling insights that go over and above what stakeholders want, inspiring them to act decisively.

One of the most powerful features of Harmoni is Discover, a time-tested, time-saving, and investigative approach to data analysis. Using automated analyses to reveal patterns and trends, Discover minimizes potential research bias by removing the need for requesting and manually analyzing scores of cumbersome crosstabs, often seeing what you can't. Discover helps you easily find what differentiates groups that matter to you, uncover what makes them unique, and deliver data points that are interesting, relevant, and statistically significant, plus see things others can't. Add to all this an upcoming GenAI feature, and you have an extremely powerful, future-proofed tool. Feel free to check out our platform and services at www.infotools.com

Delighted Customers Podcast
The Art and Science of Customer Surveys - With Martha Brooke, Founder and Program Director, Interaction Metrics

Delighted Customers Podcast

Jun 15, 2023 · 56:35


We've all taken surveys. Clearly some are better than others. Well-designed and well-implemented surveys can provide business leaders with actionable insights that improve retention, reduce cost to serve, and drive customer loyalty - if acted upon. They can help companies serve their customers better and provide them with more value. Poorly designed surveys do just the opposite. Martha Brooke is an expert when it comes to customer research. Martha shares some of the secrets of best-in-class survey design with power and passion.

In this episode we discuss:
- What's the primary problem with surveys today?
- What are the 3 areas of differentiation for scientific research?
- How to ask provocative questions to increase market share
- The art and science of surveys
- The problem with most CX programs
- The power of 3rd-party surveys

MEET MARTHA
Martha Brooke is a Certified Customer Experience Professional (CCXP) and holds a Black Belt in Six Sigma. To dramatically improve the customer experience, Martha founded Interaction Metrics, a Customer Experience (CX) consulting company, in 2004. Clients such as Yaskawa America, Bosch, and Synchrony Financial choose Interaction Metrics for its pairing of scientific CX research with software such as Power BI, Alchemer, and Qualtrics. With her team of analysts, Martha uses qualitative and quantitative research methods to pinpoint customer experience gaps and improvement opportunities. Methods include Text Analytics, Surveys, and Customer Service Evaluations. To spur dialogue about the customer experience, Martha leads top-rated conference sessions and workshops. Some of the organizations where Martha has been asked to speak include Project Management Institute (PMI), American Society of Plastic Surgeons (ASPS), HDI, The Score Conference, Customer Solutions Expo, American Marketing Association (AMA), NICSA at the Harvard Club, Operations Summit, Association of Support Professionals, and SOCAP International. Prior to Interaction Metrics, Martha worked for two dotcoms, Lucy.com and Food.com, where she drove improvements to the customer experience. She also held operations positions in areas that touch on aspects of CX, such as quality assurance and customer service.

Subscribe to The Delighted Customer Podcast so you don't miss an episode: https://www.empoweredcx.com/podcast
Subscribe to The Delighted Customer Newsletter for practical tips and insights: https://www.empoweredcx.com/delightedcustomersnewsletter

Anshuman's Podcast
Visitor Experience Enhancement in ISKCON using Text Analytics

Anshuman's Podcast

Dec 6, 2022 · 5:53


Today we are going to learn about one of the most prominent charitable and spiritual societies in the world, the International Society for Krishna Consciousness (ISKCON), and the obstacles it faces in enhancing its visitor experience in such a data-driven world. I hope you enjoy this episode. Hare Krishna!

Discovering Data
Korbinian Spann: Text analytics to get in the head of your customers

Discovering Data

Nov 16, 2022 · 46:22


How do you leverage unstructured data to create better and more sustainable products, increase margins, and grow the business? Today I learn from Korbinian Spann, CEO and founder of Insaas.ai. You can follow Korbinian on LinkedIn.

For Brands: Do you want to showcase your thought leadership with great content and build trust with a global audience of data leaders? We publish conversations with industry leaders to help practitioners create more business outcomes. Explore all the ways to tell your data story at https://www.discoveringdata.com/brands.

For Sponsors: Want to help educate the next generation of data leaders? As a sponsor, you get to hang out with the very best in the industry. Want to see if you are a match? Apply now: https://www.discoveringdata.com/sponsors

For Guests: Do you enjoy educating an audience? Do you want to help data leaders build indispensable data products? That's awesome! Great episodes start with a clear transformation. Pitch your idea at https://www.discoveringdata.com/guest.

Rock n' Roll Research Podcast
Episode #60: Tom Anderson - Text Analytics Pioneer, Entrepreneur, Founder Next Gen Market Research

Rock n' Roll Research Podcast

Apr 14, 2022 · 29:20


As the founder of OdinAnswers and StoryHub, Tom Anderson has done much to advance the veracity and application of text analytics, AI and big data within the disciplines of customer experience, insights and business strategy. But he also vaulted the market research industry into the modern world by creating Next Gen Market Research, enabling a network of tens of thousands of researchers to discuss important research topics with peers across the globe. He stimulated discussions, often passionate and contentious ones, and Tom has never been known to shy away from a good debate.

Tom discusses how he got interested in analytics, his experience with NGMR and where analytics is taking him now. He also shares his take on missed opportunities for corporate researchers and the persistent research problem yet to be solved. I wish for nothing more than to be singing karaoke at an industry conference with Tom, but a podcast is the next best thing!

CX Chronicles Podcast
CXChronicles Podcast 157 with Ryan Stuart, CEO at Kapiche

CX Chronicles Podcast

Feb 15, 2022 · 52:25 (transcription available)


Hey CX Nation,

In episode #157 of The CXChronicles Podcast we welcomed Ryan Stuart, CEO & Founder at Kapiche, based in Brisbane, Australia and Salt Lake City, Utah. Kapiche is a new breed of customer insights platform that delivers deep, contextual understanding of your customers' experience, without manual coding or hand-reading thousands of customer comments. The richest insights are found at the intersection of qualitative and quantitative data from every stage of the customer journey. Kapiche can help your team combine data from any source to make laser-focused business decisions for your organization.

Listen to Ryan and Adrian chat through The Four CX Pillars: Team, Tools, Process & Feedback, and share some of the tips & tricks that have worked for Kapiche as they've built & grown their business to improve the future of the customer experience & success space.

Episode #157 Highlight Reel:
1. The emergence of Voice of Customer (VOC) reporting & roles in the CX/CS landscape
2. Understanding the difference between sales-led cultures and product-led cultures
3. Marrying quantitative & qualitative feedback data to provide crystal-clear views
4. Capturing, assessing & leveraging feedback data & insights to grow your business
5. Why you must re-think how you collect customer surveys in 2022

Huge thanks to Ryan for coming on The CXChronicles Podcast and featuring his team's work and efforts in pushing the customer experience & technology space into the future.

Click here to learn more about Ryan Stuart
Click here to learn more about Kapiche

If you enjoy The CXChronicles Podcast, please stop by your favorite podcast player and leave us a review. This is the easiest way for us to find new listeners, guests and future customer-focused business leaders to tune into our weekly podcast. Be sure to grab a copy of our book "The Four CX Pillars To Grow Your Business Now" on Amazon, and check out the CXChronicles YouTube channel with all of our video episodes & content!

Reach out to CXC at INFO@cxchronicles.com for more information about how we can help your team make customer happiness a habit!

Support the show (https://cxchronicles.com/)

20 Minute Leaders
Ep699: Ezra Daya | Founder & CEO, Aspectiva

20 Minute Leaders

Jan 3, 2022 · 22:06


Ezra is the co-founder and CEO of Aspectiva, a startup company acquired by Walmart in 2019 to establish its first R&D and innovation center in Israel. Ezra is an entrepreneur at heart, constantly striving to connect the dots between technology and business, with a special focus on AI for eCommerce and consumer applications. Prior to founding Aspectiva, Ezra managed the Text Analytics group at NICE Systems. He holds an M.Sc. in Computer Science and is the author of publications and patents in the field of Natural Language Processing (NLP).

Tech.eu
Everything you need to know about the news and text analytics platforms in the financial sector — with Sjoerd Leemhuis, Owlin

Tech.eu

Dec 28, 2021 · 23:06


In today's episode, listen to an interview with Sjoerd Leemhuis, CEO and co-founder at Owlin. In this conversation we discussed the differences between doing business in Europe and the United States, the mistakes entrepreneurs make when raising funds, the competition and partnership with Bloomberg and Reuters, and much more. We hope you enjoy(ed) the podcast! Please feel free to email any questions, suggestions, and opinions to podcast@tech.eu or tweet at us @tech_eu.

Channel 9
What's new in Text Analytics for Health | AI Show

Channel 9

Jun 11, 2021 · 11:45


On this week's episode, Ashly Yeo will demo the exciting announcement made at #MSBuild about what's new in Text Analytics for Health!

Jump to:
[00:15] Seth welcomes Ashly
[01:02] #MSBuild announcements on Text Analytics for Health
[02:48] How to access Text Analytics for Health
[03:40] API output visualization
[05:38] Demo in Visual Studio Code
[08:54] Demo in Python SDK - see sample link below

Learn more:
How-to: Text Analytics for health https://aka.ms/AIShow/TextAnalyticsforHealth/HowTo
Text Analytics for health sample https://aka.ms/AIShow/TextAnalytics/Sample
Text Analytics Tech Community blog https://aka.ms/AIShow/TechCommunity/TextAnalytics
AI Show https://aka.ms/AIShow
Create a free account (Azure) https://aka.ms/aishow-seth-azurefree
Get started with Machine Learning https://aka.ms/AIShow/StartML
AI for Developers https://aka.ms/AIShow/AIforDevelopers
Azure Machine Learning https://aka.ms/AIShow/AzureML
Follow Seth https://twitter.com/sethjuarez

Don't miss new episodes, subscribe to the AI Show https://aka.ms/AIShowsubscribe
Join us every other Friday for an AI Show livestream on Learn TV and YouTube https://aka.ms/LearnTV - https://aka.ms/AIShowLive
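The episode's Python SDK demo isn't reproduced here, but a rough sketch of calling Text Analytics for Health with the azure-ai-textanalytics package looks like this (the resource endpoint, key, and sample sentence are placeholders, not the show's demo code):

```python
# Sketch of a Text Analytics for Health call via the Python SDK
# (azure-ai-textanalytics >= 5.1). Endpoint/key below are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["Patient was prescribed 100mg ibuprofen twice daily."]

# Health analysis is a long-running operation; the poller waits for it.
poller = client.begin_analyze_healthcare_entities(documents)
for doc in poller.result():
    if doc.is_error:
        continue
    for entity in doc.entities:
        print(entity.text, entity.category, entity.confidence_score)
```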

GeekSprech Podcast
#61 - GeekSprech - Microsoft Build Recap

GeekSprech Podcast

Jun 4, 2021 · 26:41


After Build comes the recap podcast :-) so in the current episode we dig into the announcements and news from Microsoft Build. If you missed Build (like one or two of us did), or just want to revisit the most exciting news from Build, then this is exactly the podcast for you. And don't forget the links to further information in the show notes.

Blog: https://geeksprech.de/geeksprech-podcast-folge-61-microsoft-build-recap

Show notes:
- Build 2021 Book of News - https://news.microsoft.com/build-2021-book-of-news
- Text Analytics for Health - https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/how-tos/text-analytics-for-health
- Document Translation - https://techcommunity.microsoft.com/t5/azure-ai/translator-announces-document-translation-preview/ba-p/2144185
- PowerShell Durable Functions - https://docs.microsoft.com/en-us/azure/azure-functions/durable/quickstart-powershell-vscode
- Bicep 0.4 Launch - https://www.youtube.com/watch?v=3xLLFuWhmdQ

Den of Rich
David Yang | Давид Ян

Den of Rich

Apr 13, 2021 · 103:10


David Yang, Ph.D., is the co-founder of Yva.ai, Founder and Board Director at ABBYY, a member of the Band of Angels, and a Silicon Valley-based serial entrepreneur specializing in AI who has founded 12 companies. David is a globally recognized thought leader, speaker, and advisor in the areas of AI, Artificial Neural Networks, People Analytics, Smart 360, Peer-to-peer Continuous Listening, Organisation Network Analytics (ONA), HR metrics, talent measurement, workforce analytics, performance management, and Organizational Change.

He's the Founder & Chairman of the Board of ABBYY, a world-leading developer of AI, Content Intelligence, and Text Analytics with 1,300+ employees in 14 offices in 11 countries. More than 50 million users and thousands of enterprise customers in 200 countries rely on ABBYY's solutions, including PwC, McDonald's, Xerox, Toyota, Yum! Restaurants, Deloitte, PepsiCo, Volkswagen, and UCSF. World-leading RPA vendors, including UiPath, BluePrism, and Robiquity, rely on ABBYY's AI technologies.

Currently David is dedicated to Yva.ai, the company that develops an Ethical People Analytics, Continuous Engagement and Performance Management Platform which helps organizations save millions of dollars by detecting burnout and predicting resignations of key employees. Yva makes 360s scalable, accurate, and real-time. Each employee receives personal dashboards and recommendations. The Smart ONA helps executives find informal leaders to lead cross-functional collaboration, agile transformation, and customer-focused experience.

David created Cybiko, the world's first hand-held wireless communication computer for teenagers; co-founded iiko, a new-generation AI-powered restaurant and hospitality industry solution; co-founded Plazius, a customer loyalty and mobile payment platform; founded a number of creative art-based ventures, including FAQ-Café studio and DeFAQto; and co-founded the Ayb Educational Foundation and Ayb School.

David is also a frequent keynote speaker at conferences and corporate events and the author of numerous patents and scientific publications.

FIND DAVID ON SOCIAL MEDIA: LinkedIn | Facebook | Twitter | Instagram

© Copyright 2022 Den of Rich. All rights reserved.

AI in Banking Podcast
Text Analytics and NLP in Financial Services - with Ram Sukumar of IndiumSoft

AI in Banking Podcast

Feb 1, 2021 · 28:57


Today's guest is the great and brilliant Ram Sukumar, Co-Founder and CEO of Indium Software, a technology solutions company with deep expertise in digital and QA services, such as AI, Advanced Analytics, a Text Analytics Center of Excellence, Big Data, Data Engineering, Stream Processing & Data Virtualization, and low-code development across platforms. Ram speaks with me today about particular workflows for text analytics. What does it look like in operation? Where do people get it wrong, where can it come into play, and where is the potential value for text analytics when it comes to ideas for use cases? What are the practical realities of what you need to expect when deploying these technologies? Are you interested in more use cases and best practices around AI adoption and ROI? Check out Emerj Plus at emerj.com/p1

Silicon Valley Tech And AI With Gary Fowler
David Yang: Emotion AI and Going Global

Silicon Valley Tech And AI With Gary Fowler

Dec 14, 2020 · 52:26


Gary Fowler has founded, and led to success, a number of startups as well as GSD Venture Studios. His show revolves around interviews with thought leaders, bringing them together to speak about trends and directions in global technology and AI. This first episode offers a peek into the future of Emotion AI and going global, featuring AI serial entrepreneur Dr. David Yang, CEO and Co-Founder of Yva.ai and Founder of ABBYY.

Based in Silicon Valley, David is a member of the Band of Angels and has a most captivating journey that brought him to where he is today. He started his first company, ABBYY, back in 1989 when he was in his 4th year as a student at MIPT. The company became a global success over the years; today, ABBYY has over 1,000 employees and is a leading developer of Artificial Intelligence, Content Intelligence, Optical Character Recognition, and Text Analytics software with offices in 11 countries. Thousands of companies and more than 50 million users in 200 countries rely on ABBYY applications and solutions. David has become a leading authority on the use of AI for employee engagement. His current company, Yva.ai, explores new applications of machine learning and AI to revolutionize employee engagement, happiness, and productivity in the current context of global companies, remote teams, and the increased need for solutions that accommodate the changing landscape of business and people operations.

About GSD Venture Studios: We travel the world investing in resilient teams bold enough to #GoGlobal. For too long, self-motivated entrepreneurs have navigated the minefield of challenges to launching a global company with very little support. The last thing you should bet on in this situation is an unproven team that you don't trust. GSD Venture Studios travels to every corner of the globe inviting resilient teams to establish partnerships that ensure organizations grow the right way, without games or gimmicks. Unlike traditional investors, we take senior operational (often co-founder) roles in these companies, capitalizing on our trusted reputation, experiences, and network to drive explosive growth. More information can be found at: https://www.gsdvs.com/post/interview-with-derek-everything-you-need-to-know-about-gsd

About Gary Fowler: Gary has 30 years of operational, marketing, sales, and executive leadership experience, including a $1.35 billion exit and a successful Nasdaq IPO. He has founded 15 companies: DY Investments, Yva.ai, GVA LaunchGurus Venture Fund, GSD Venture Studios, Broadiant, etc. Under his leadership, Yva.ai was named one of the Top 10 AI HR Tech companies globally. Gary was recently named one of the top 10 Most Influential AI Executives to Watch in 2020. He is a writer at Forbes Magazine and has published over 60 articles on AI and technology over the last year. More information can be found at: https://www.gsdvs.com/post/meet-gary-fowler

AI Show  - Channel 9
What’s New in Text Analytics: Opinion Mining and Async API

AI Show - Channel 9

Dec 1, 2020 · 17:56


In today's AI Show, you will learn about the recently launched Opinion Mining and Async offerings of Text Analytics. In the first half, we discuss how Opinion Mining (an extension of Sentiment Analysis) helps explore customers' perception of aspects/opinions, such as specific attributes of products or services, in text. In the second half, we learn about the new Async capabilities of Text Analytics, which allow bundling various skills of Text Analytics and also allow large amounts of text, up to 125K characters, to be sent to Text Analytics via the /analyze endpoint.

Jump to:
[00:55] - Introduction to Cognitive Services
[01:50] - Introduction to Text Analytics
[02:51] - Introduction to Opinion Mining
[04:39] - Introduction to Async API (/analyze)
[06:43] - Demo of Opinion Mining
[11:30] - Demo of Async API (/analyze)
[14:19] - Common scenarios for using Text Analytics
[16:57] - Getting started with Text Analytics

More information:
QuickStart: Use the Text Analytics
Sentiment analysis and Opinion Mining
Getting Started with Text Analytics
Create a Free account (Azure)
Deep Learning vs. Machine Learning
Get Started with Machine Learning

Don't miss new episodes, subscribe to the AI Show
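As a rough sketch of the Opinion Mining capability described above (using the azure-ai-textanalytics Python SDK; the endpoint, key, and review sentence are placeholders, and this is not the episode's demo code):

```python
# Sentiment analysis with opinion mining: per-target ("aspect") sentiment.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

docs = ["The food was great, but the service was painfully slow."]

results = client.analyze_sentiment(docs, show_opinion_mining=True)
for doc in results:
    if doc.is_error:
        continue
    for sentence in doc.sentences:
        for opinion in sentence.mined_opinions:
            target = opinion.target  # e.g. "food", "service"
            assessments = [a.text for a in opinion.assessments]
            print(target.text, target.sentiment, assessments)
```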

AI Show  - Channel 9
Introducing Text Analytics for Health

AI Show - Channel 9

Nov 24, 2020 · 14:58


Text Analytics for health is a preview feature of Text Analytics which enables developers to process and extract insights from unstructured clinical and biomedical text. Through a single API call, using NLP techniques such as named entity recognition, entity linking, relation extraction, and entity negation, Text Analytics can extract critical and relevant medical information without the need for time-intensive, manual development of custom models. We will demonstrate how to make API calls to the synchronous operation offered in a downloadable container and also to the asynchronous operation offered in the hosted web API.

Jump to:
[01:49] About Text Analytics
[04:20] Demo: Text Analytics for Health
[08:27] Demo: API in Postman
[12:51] Data privacy
[14:14] Find more

More information:
Text Analytics for Health - Blog
Text Analytics for Health - Documentation
Create a Free account (Azure)
Deep Learning vs. Machine Learning
Get Started with Machine Learning

Don't miss new episodes, subscribe to the AI Show
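For the asynchronous hosted web API mentioned above, the general shape is a job submission followed by polling. A sketch (the v3.1 health jobs route is our assumption of the relevant API version; the endpoint, key, and text are placeholders):

```python
# Rough sketch of the async job flow against the hosted API. The route
# below assumes the v3.1 health jobs endpoint; adjust to your version.
import time
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>"}

body = {"documents": [
    {"id": "1", "language": "en",
     "text": "Patient reported chest pain and was given aspirin."}
]}

resp = requests.post(
    f"{ENDPOINT}/text/analytics/v3.1/entities/health/jobs",
    headers=HEADERS, json=body,
)
resp.raise_for_status()
job_url = resp.headers["operation-location"]  # URL to poll for results

while True:
    job = requests.get(job_url, headers=HEADERS).json()
    if job.get("status") in ("succeeded", "failed"):
        break
    time.sleep(2)

print(job)  # entities, relations, etc. appear under the job's results
```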

Datenbusiness Podcast
#39 Datenchefs #30 with Dr. Gerhard Rolletschek | Co-Founder & MD Glanos | Text Analytics & Business Monitoring

Datenbusiness Podcast

Sep 20, 2020 · 65:43


Dr. Gerhard Rolletschek is one of the two founders of Glanos GmbH and an expert in the field of semantic data-mining technology, with a special focus on company data extraction and aggregation. He holds a doctorate in computational linguistics and has carried out numerous projects in information extraction and search technologies with renowned international clients since 2003.

The main topics at a glance:
- The origins of hyperlinks and hypertext (from 02:43)
- What does Glanos do? (from 07:26)
- The problem of re-identification (from 10:03)
- Competitor analysis using text mining (from 16:20)
- How much linguistics is still in today's text analytics solutions? (from 21:52)
- Competitor analysis as a concrete example (from 25:33)
- Classification of job postings as another concrete example (from 32:05)
- Detecting fake news (from 38:46)
- Challenges with social media (from 46:28)
- An assessment of GPT-3 (from 54:34)

Ad Addict Podcast
2BT EP.49 | Analyzing Language and Images on Social Media - หมีเรื่องมาเล่า

Ad Addict Podcast

Aug 30, 2020 · 54:24


In this episode, I invite the Machine Learning Engineer crew from Wisesight to talk about NLP, text analytics, and image analysis on social media: how easy or hard this work is, and what it can be used for.

Valuewalk Soundcloud RSS feed
Using Natural Language Processing with Nate Storch, CEO, Co-Founder of Amenity Analytics

Valuewalk Soundcloud RSS feed

Apr 10, 2020 · 33:33


Hello Podcast Listeners, Today is a very special episode with Nate Storch, co-founder and CEO at Amenity Analytics, a provider of big data analytical solutions and software to the investment management industry. Nate has over 15 years of investment management experience. Before founding Amenity, he was Managing Partner at Pilgrim Hill Capital and was Partner and Portfolio Manager at Talpion Fund Management. He was also Partner at One East Holdings. Nate earned his B.A. in Government from Harvard. In today’s episode we discuss Natural Language Processing, Text Analytics, and how investors can use technology to enhance their tasks. Enjoy and thanks for the listen!

The Top Entrepreneurs in Money, Marketing, Business and Life
1702 She's About to Pass $1m in ARR in AI Text Analytics Feedback

The Top Entrepreneurs in Money, Marketing, Business and Life

Mar 22, 2020 · 21:47


Dr. Narjès Boufaden is a pioneer and thought leader in the Natural Language Processing field who transitioned from the academic world to running her own company. She founded Keatext in 2010 as a professional services firm in AI, then pivoted it into a product company in 2015. Narjès is also a contributor at Forbes and a dedicated mentor at Techstars AI.

Analytics and Data Science Pulse
Analytics and Data Science Pulse - #015. Q&A with Fredrik Olsson of RISE in Stockholm

Analytics and Data Science Pulse

Dec 3, 2019 · 31:48


In this week's episode, I was joined by Fredrik Olsson, former Chief Data Officer and Partner at text analytics company Gavagai, now at RISE (Research Institutes of Sweden). Fredrik has an impressive background within the sphere of Language Technology (a PhD in Computational Linguistics), and he offers a unique insight into what life is like when you move from the commercial sector back into research (where he is now). I think Fredrik is an excellent example of someone who is comfortable with being uncomfortable and is always open to stepping outside of his comfort zone. I always enjoy having a chat with him. Thank you for recording this with me, Fredrik!

Customer Experience Talks
Don't Be Afraid to Ask Why: Text Analytics in Action

Customer Experience Talks

Aug 21, 2019 · 18:32


In this podcast, we discussed how businesses can reveal the value behind unstructured data. Here are some of our discussion topics:
- What to expect and not to expect from text analytics
- Challenges organizations face when implementing text analytics
- Latest trends and inspiring best practices

HRExaminer Radio Hour #HRRH
HRExaminer Executive Conversations w/ Andrew Marritt, OrganizationView | Aug 16

HRExaminer Radio Hour #HRRH

Aug 16, 2019 · 30:00


Andrew is the Founder of OrganizationView, one of the earliest European People Analytics practices, now specialising in workforce text analytics with their Workometry service. Before starting OrganizationView in 2010, Andrew spent a decade in corporate HR departments managing global technology and analytics projects, especially in the areas of global resourcing and employee experience. His earlier background was in the management consultancy sector. A hands-on data scientist, he learnt to program in the 70s on a kit-built home computer. He is a member of the CIPD's advisory group on People Analytics and teaches HR Analytics as part of the leading Swiss HR Masters degree program. Andrew lives with his family in St. Moritz, Switzerland. Outside work he enjoys skiing, skeleton, and long-distance swimming.

Experiencing Data with Brian O'Neill
010 – Carl Hoffman (CEO, Basis Technology) on text analytics, NLP, entity resolution, and why exact match search is stupid

Experiencing Data with Brian O'Neill

Apr 9, 2019 · 45:04


My guest today is Carl Hoffman, the CEO of Basis Technology, and a specialist in text analytics. Carl founded Basis Technology in 1995, and in 1999, the company shipped its first products for website internationalization, enabling Lycos and Google to become the first search engines capable of cataloging the web in both Asian and European languages. In 2003, the company shipped its first Arabic analyzer and began development of a comprehensive text analytics platform. Today, Basis Technology is recognized as the leading provider of components for information retrieval, entity extraction, and entity resolution in many languages. Carl has been directly involved with the company's activities in support of U.S. national security missions and works closely with analysts in the U.S. intelligence community.

Many of you work all day in the world of analytics: numbers, charts, metrics, data visualization, etc. But today we're going to talk about one of the other ingredients in designing good data products: text! As an amateur polyglot myself (I speak decent Portuguese, Spanish, and am attempting to learn Polish), I really enjoyed this discussion with Carl. If you are interested in languages, text analytics, search interfaces, entity resolution, and are curious to learn what any of this has to do with offline events such as the Boston Marathon bombing, you're going to enjoy my chat with Carl.

We covered:
- How text analytics software is used by border patrol agencies and its limitations.
- The role of humans in the loop, even with good text analytics in play.
- What actually happened in the case of the Boston Marathon bombing?
- Carl's article "Exact Match" Isn't Just Stupid. It's Deadly.
- The 2 lessons Carl has learned regarding working with native-tongue source material.
- Why Carl encourages Unicode compliance when working with text, why having a global perspective is important, and how Carl actually implements this at his company.
- Carl's parting words on why hybrid architectures are a core foundation to building better data products involving text analytics.

Resources and Links:
Basis Technology
Carl's article: "Exact Match" Isn't Just Stupid. It's Deadly.
Carl Hoffman on LinkedIn

Quotes from Today's Episode

"One of the practices that I've always liked is actually getting people that aren't like you, that don't think like you, in order to intentionally tease out what you don't know. You know that you're not going to look at the problem the same way they do…" — Brian O'Neill

"Bias is incredibly important in any system that tries to respond to human behavior. We have our own innate cultural biases that we're sometimes not even aware of. As you [Brian] point out, it's impossible to separate human language from the underlying culture and, in some cases, geography and the lifestyle of the people who speak that language…" — Carl Hoffman

"What I can tell you is that context and nuance are equally important in both spoken and written human communication… Capturing all of the context means that you can do a much better job of the analytics." — Carl Hoffman

"It's sad when you have these gaps like what happened in this border crossing case where a name spelling is responsible for not flagging down [the right] people. I mean, we put people on the moon and we get something like a name spelling [entity resolution] wrong. It's shocking in a way." — Brian O'Neill
"We live in a world which is constantly shades of gray and the challenge is getting as close to yes or no as we can." — Carl Hoffman

Episode Transcript

Brian: Hey everyone, it's Brian here, and we have a special edition of Experiencing Data today. Today, we are going to be talking to Carl Hoffman, who's the CEO of Basis Technology. Carl is not necessarily a traditional, what I would call, Data Product Manager or someone working in the field of creating custom decision support tools. He is an expert in text analytics, and specifically Basis Technology focuses on entity resolution and resolving entities across different languages. If your product, or service, or the software tool that you're using is going to be dealing with inputs and outputs or search with multiple languages, I think you're going to find my chat with Carl really informative. Without further ado, here's my chat with Mr. Carl Hoffman.

Brian: All right. Welcome back to Experiencing Data. Today, I'm happy to have Carl Hoffman on the line, the CEO of Basis Technology, based out of Cambridge, Massachusetts. How's it going, Carl?

Carl: Great. Good to talk to you, Brian.

Brian: Yeah, me too. I'm excited. This episode's a little bit different. Basis Tech primarily focuses on providing text analytics more as a service as opposed to a data product. There are obviously some user experience ramifications on the downstream side for the companies, software, and services that are leveraging some of your technology. Can you tell people a little bit about the technology of Basis and what you guys do?

Carl: There are many companies who are in the business of extracting actionable information from large amounts of dirty, unstructured data, and we are one of them. But what makes us unique is our ability to extract what we believe is one of the most difficult forms of big data, which is text in many different languages from a wide range of sources. You mentioned text analytics as a service, which is a big part of our business, but we actually provide text analytics in almost every conceivable form: as a service, as an on-prem cloud offering, as conventional enterprise software, and also as the data fuel to power your in-house text analytics. There's another half of our business as well, which is focused specifically on one of the most important sources of data, which is what we call digital forensics or cyber forensics. That's the challenge of getting data off of digital media that may be either still in use or dead.

Brian: Talk to me about dead. Can you unpack that a little bit?

Carl: Yes. Dead basically means powered off or disabled. The primary application there is for corporate investigators or for law enforcement who are investigating captured devices or digital media.

Brian: Got it. Just to help people understand some of the use cases where someone would be leveraging some of the capabilities of your platforms, especially the stuff around entity resolution: my understanding is that one use case for your software is obviously border crossings, where your information, your name, is going to be looked up to make sure that you should be crossing whatever particular border you're at. Can you talk to us a little bit about what's happening there and what's going on behind the scenes with your software? Like what is that agent doing, and what's happening behind the scenes? What kind of value are you providing to the government in that instance?
Carl: Border crossings, or the software used by border control authorities, is a very important application of our software. From a data representational challenge, it's actually not that difficult, because for the most part, border authorities work with linear databases of known individuals or partially known individuals, and queries. Queries may be in a form manually typed by an officer or may be a scan of a passport. The complexity comes in when a match must be scored, where a decision must be rendered as to whether a particular query or a particular passport scan matches any of the names present on a watch list. Those watch lists can be in many different formats. They can come from many different sources. Our software excels at performing that match at very high accuracy, regardless of the nature of the query and regardless of the source of the underlying watch list.

Brian: I assume those watch lists may vary in the level of detail around, for example, aliases, spelling, and which alphabet they were printed in. Part of the value of what your service is doing is helping to say, "At the end of the day, entity number seven on the list is one human being who may have many ways of being represented with words on a page or a screen," so the goal obviously is to make sure that you have the full story of that one individual. Am I correct that you may get that in various formats and different levels of detail? And part of what your system is doing is actually trying to match up that person, or give, as you say, a non-binary response, a match score or something that's more of a gray response that says, "This person may also be this person." Can you unpack that a little bit for us?

Carl: Your remarks are exactly correct. First, what you said about gray is very important. These decisions are rarely 100% yes or no. We live in a world which is constantly shades of gray, and the challenge is getting as close to yes or no as we can. But the quality of the data in watch lists can vary pretty wildly, based on the prominence and the number of sources. The US border authorities must compile information from many different sources: from the UN, from the Treasury Department, from the National Counterterrorism Center, from various states, and so on. The amount of detail and the degree of our certainty regarding that data can vary from name to name.

Brian: We talked about this when we first were chatting about this episode. Am I correct that one of the overall values here is that we're offloading some of the labor of doing this kind of entity resolution or analysis onto software and then picking up the last mile with a human, to say, "Hey, are these recommendations correct? Maybe I'll go in and do some manual labor." Is that how you see it, that the software does some of the initial grunt work and presents an almost finished story, and then the human comes in and needs to really provide that final decision at the endpoint? Are we doing enough of the help with the software? At what point should we say, "That's no longer a software job to give you a better score about this person. We think that really requires a human analysis at this point." Is there a way to evaluate that, or is that what you think about, like, "Hey, we don't want to go past that point. We want to stop here because the technology is not good enough or the data coming in will never be accurate enough, and we don't want to go past that point." I don't know if that makes sense.

Carl: It does make sense.
I can't speak for all countries, but I can say that in the US, the decision to deny an individual entry, or certainly the decision to apprehend an individual, is always made by a human. We designed our software to assume a human in the loop for the most critical decisions. Our software is designed to maximize the value of the information that is presented to the human so that nothing is overlooked. Really, the two biggest threats to our national security are, one, having very valuable information overlooked, which is exactly what happened in the case of the Boston Marathon bombing. We had a great deal of information about Tamerlan and Dzhokhar Tsarnaev, yet that information was overlooked because the search engines failed to surface it in response to queries by a number of officials. And secondly, detaining or apprehending innocent individuals, which hurts our security as much as allowing dangerous individuals to pass.

Brian: This has been in the news somewhat, but talk about the "glitch" and what happened in that Boston Marathon bombing in terms of maybe some of these tools, and what you understand was going on there such that there was a gap in this information.

Carl: I am always very suspicious when anyone uses the word 'glitch' with regard to any type of digital equipment, because if that equipment is executing its algorithm as it has been programmed to do, then you will get identical results for identical inputs. In this case, the software that was in use at the time by US Customs and Border Protection was executing a very naive name-matching algorithm, which failed to match two different variant spellings of the name Tsarnaev. If you look at the two variations, for any human it would seem almost obvious that the two variations are related and are in fact connected to the same name that's natively written in Cyrillic. What really happened was a failure on the part of the architects of that name-matching system to innovate by employing the latest technology in name matching, which is what my company provides. In the aftermath of that disaster, our software was integrated into the border control workflow, first with the goal of redacting false positives, and then later with the secondary goal of identifying false negatives. We've been very successful on both of those challenges.

Brian: What were the two variants? Are you talking about the fact that one was spelled in Cyrillic and one was spelled in a Latin alphabet? They didn't bring back data points A and B because they looked like separate individuals? What was it, a transliteration?

Carl: They were two different transliterations of the name Tsarnaev. In one instance, the final letters in the name are spelled -naev, and in the second instance it's spelled -nayev. The presence or absence of that letter y was the only difference between the two. That's a relatively simple case, but there are many similar stories for more complex names. For instance, the 2009 Christmas bomber, who successfully boarded a Northwest Delta flight with a bomb in his underwear, again because of a failure to match two different transliterations of his name. But in his case, his name is Umar Farouk Abdulmutallab. There was much more opportunity for divergent transliterations.
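The transliteration gap Carl describes is easy to reproduce. Below is a minimal sketch of scored matching on the pair he cites; this is not Basis Technology's matcher, just Python's standard difflib, and the 0.85 threshold is an arbitrary example:

```python
# A minimal illustration of scored name matching on the transliteration
# pair discussed above. NOT Basis Technology's algorithm; just difflib.
from difflib import SequenceMatcher

watchlist = ["Tsarnaev"]
query = "Tsarnayev"  # same Cyrillic name, alternate transliteration

print(query in watchlist)  # exact match says False -- the fatal "glitch"

def score(a: str, b: str) -> float:
    """Similarity in [0, 1] based on longest matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for name in watchlist:
    s = score(query, name)
    print(f"{name}: {s:.3f}", "-> likely match" if s > 0.85 else "")
```

Running this prints a similarity of about 0.94 for the pair, exactly the kind of "shades of gray" score, rather than a binary yes/no, that the conversation keeps returning to.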
Brian: On this kind of topic, you wrote an interesting article called "Exact Match" Isn't Just Stupid. It's Deadly. You've talked a little bit about this particular example with the Boston Marathon bombing, and you mentioned thinking globally when building a product out. Can you talk to us a little about what it means to think globally?

Carl: Sure. Thinking globally is really a mindset and an architectural philosophy in which systems are built to accommodate multiple languages and cultures. This is an issue not just with the spelling of names but with support for multiple writing systems, different ways of rendering and formatting personal names, different ways of rendering, formatting, and parsing postal addresses, telephone numbers, dates, times, and so on. The format of a questionnaire in Japanese is quite different from the format of a questionnaire in English. If you look at any complex global software product, there's a great deal of work that must be done to accommodate the needs of a worldwide user base.

Brian: Sure, and you're a big fan of Unicode-compliant software, am I correct?

Carl: Yes. Building Unicode compliance is equivalent to building a solid, stable foundation for an office tower. It only gets you to the ground floor, but without it, the rest of the tower starts to lean like the one that's happening in San Francisco right now.

Brian: I haven't heard about that.

Carl: There's a whole tower that's tipping over. You should read about it. It's a great story.

Brian: Foundation's not so solid.

Carl: Big lawsuit's going on right now.

Brian: Not the place you want to have a sagging tower either.

Carl: Not the place. But frankly, it's really quite comparable, because I've seen some large systems that will go unnamed, where there's legacy technology and people are unaware perhaps why it's so important to move from Python version 2 to Python version 3. One of the key differences is Unicode compliance. So if I hear about a large-scale enterprise system that's based on Python version 2, I'm immediately suspicious about whether it's going to be suitable for a global audience.
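Carl's point about Python 2 versus 3 and eight-bits-per-character assumptions can be shown in a couple of lines of Python 3 (the Cyrillic string is the name from the earlier example):

```python
# Why "one byte per character" thinking breaks on non-Latin names.
name = "Царнаев"  # "Tsarnaev" written natively in Cyrillic

utf8 = name.encode("utf-8")
print(len(name), len(utf8))  # 7 characters, 14 bytes: not 1 byte/char

try:
    name.encode("latin-1")  # the 8-bits-per-character assumption
except UnicodeEncodeError as err:
    print("latin-1 cannot represent this name:", err)

# A Unicode-clean pipeline round-trips the native form losslessly.
assert utf8.decode("utf-8") == name
```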
Brian: I think about, from an experience standpoint, inputs: when you're providing inputs into forms and understanding what people are typing in. If it's a query form, obviously giving people back what they wanted and not necessarily what they typed in. We all take for granted things like spelling correction, and not just spelling correction, but in Google, when you type in something, it sometimes gives you something that's beyond a spelling fix: "Did you mean X, Y, and Z?" Being informed about what people are typing into your form fields and mining your query logs is something I do sometimes with clients when they're trying to learn something. I actually just read an article today about dell.com, and the top query term on dell.com is 'Google,' which is a very interesting thing. I would be curious to know why people are typing that in. Are people really trying to access Google, or are they trying to get some information? But the point is to understand the input side and to try to return some kind of logical output. Whether it's text analytics that's providing that or it's name matching, it's being aware of that. And it's sad when you have these gaps like what happened in this border crossing case, where a name spelling is responsible for not flagging down these people. I mean, we put people on the moon and we get something like a name spelling wrong. It's shocking in a way. I guess for those of us working in tech, we can understand how it might happen, but it's scary that that's still going on today. You've probably seen many others. Are you able to talk about it? Obviously, you have some clients in the intelligence field and probably government where you can't talk about the work, but are there other examples of learning that's happened, even if it's not necessarily entity resolution, where you've put dots together with some of your platform?

Carl: I'll say the biggest lesson that I've learned from nearly two decades of working on government applications involving multilingual data is the importance of retaining as much of the information in its native form as possible. For example, there is a very large division of the CIA which is focused on collecting open source intelligence in the form of newspapers, magazines, the digital equivalents of those, radio broadcasts, TV broadcasts, and so on. It's a unit which used to be known as the Foreign Broadcast Information Service, going back to World War II times, and today it's called the Open Source Enterprise. They have a very large collection apparatus and they produce some extremely high quality products which are summaries and translations from sources in other languages. In their workflow, previously they would collect information, say in Chinese or in Russian, and then do a translation or summary into English, but then the original would be discarded or hidden from their enterprise architecture for query purposes. I believe that is no longer the case, but retaining the pre-translation original, whether it's open source, closed source, commercial, enterprise information, or government-related information, is really very important. That's one lesson. The other lesson is appreciating the limits of machine translation. We're increasingly seeing machine translation integrated into all kinds of information systems, but there needs to be a very sober appreciation of what is and what is not achievable and scalable by employing machine translation in your architecture.
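Carl's first lesson, retaining the pre-translation original, amounts to a schema rule. A sketch of what that might look like (the field names are hypothetical, not any real system's schema):

```python
# Hypothetical record type: the native-form original is kept verbatim,
# and the English summary/translation is stored only as a derived field.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CollectedDocument:
    original_text: str                     # never discarded
    original_language: str                 # e.g. "zh", "ru"
    english_summary: Optional[str] = None  # derived, never a replacement

doc = CollectedDocument(
    original_text="Это исходный текст.",
    original_language="ru",
)
doc.english_summary = "This is the source text."  # added later in the workflow
print(doc)
```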
Brian: Can you talk at all about translation? We have so much power now with NLP and what’s possible with the technology today. As I understand it, when we talk about translation, we’re talking about documents and things in the written word being translated from one language to another. But consider the spoken word: we’re communicating right now, and I’m going to ask you two questions. What do you know about NLP? And, what do you know about NLP? In the first one, I had a little bit of attitude, which assumes you don’t know too much about it, and in the second one, I was treating you as an expert. When this gets transcribed into text, it loses that context. Where are we with the ability to look at the context, the tone, the sentiment behind the words? I would imagine that’s partly why you’re talking about saving the original source. It might provide some context: what the headlines in the paper were, which paper wrote it, whether that paper has a bias. Having the full article a report came from can provide additional context. Humans are probably better at doing some of that initial eyeball analysis, or at having some idea of where an article is coming from historically, so that they can put it in context, as opposed to just seeing the words in a native language on a computer screen. Can you talk a little bit about where we are with that? And am I incorrect that we’re not able to look at that sentiment? I don’t even know how that would carry through translation, unless you had a recording of someone saying the words played back. You have translation on top of the sentiment; now you’ve got two factors of difficulty right there in getting it accurate.

Carl: My knowledge of voice and speech analysis is very naive. I do know it’s an area of huge investment, and the technology is progressing very rapidly. I suspect that voice models are already being built that can distinguish between the two different intonations you used in asking that question and are able to match those against knowledge bases separately. What I can tell you is that context and nuance are equally important in both spoken and written human communication. My knowledge is stronger when it comes to the written form. Capturing all of the context means that you can do a much better job of the analytics. That’s why, say, when we’re analyzing a document, we’re looking not only at the individual word but at the sentence, the paragraph, and where the text appears. Is it in the body? Is it in a heading? Is it in a caption? Is it in a footnote? Or if we’re looking at, say, human-typed input (I think this is where your audience would care, if you’re designing forms or search boxes), there’s a lot that can be determined from how the input is typed. Again, especially when you’re thinking globally. We’re familiar with typing English and typing queries or completing forms with the letters A through Z and the numbers 0 through 9, but the fastest-growing new orthography today is emoji, and emoticons and emoji offer a lot of very valuable information about the mindset of the author. Or look at Chinese or Japanese, which are basically written with thousand-year-old emoji, where an individual must type a sequence of keys in order to create each of the Kanji or Hanzi that appears. There’s a great deal of information we can capture there. For instance, say I’m typing a form in Japanese, filling out my last name, and my last name is Tanaka. I’m going to type phonetically some characters that represent Tanaka, either in Latin letters or in one of the Japanese phonetic writing systems, and then I’m going to pick from a menu, or the system is going to automatically pick for me, the Japanese characters that represent Tanaka. But any really capable input system is going to keep both whatever I typed phonetically and the Kanji that I selected, because both of those have value, and the association between the two is not always obvious. There are similar ways of capturing context and meaning in other writing systems. For instance, let’s say I’m typing Arabic, not in Arabic script, but with Roman letters. How I transform those Roman letters into the Arabic alphabet may vary depending on whether I’m using Gulf Arabic, Levantine Arabic, or Cairene Arabic, and, say, the IP address of the person doing the typing may factor into how I do that transformation and how I interpret those letters. There are examples for many other writing systems beyond the Latin alphabet.
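A minimal sketch of the form-design point: the record keeps both what the user typed phonetically and the characters that were selected. The field names are hypothetical:

from dataclasses import dataclass

@dataclass
class JapaneseNameField:
    yomi: str    # phonetic input as typed, e.g. "たなか" or "tanaka"
    kanji: str   # characters the user or IME selected, e.g. "田中"

surname = JapaneseNameField(yomi="たなか", kanji="田中")

# Both forms carry information: the kanji alone do not fully determine the
# reading, and the phonetic form alone does not determine which kanji were
# meant, so a capable input system stores the pair rather than either half.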
Brian: I meant to ask you: do you speak or study any other languages?

Carl: I studied Japanese for a few years in high school. That’s really what got me into using computers to facilitate language understanding. I just never had the ability to quickly memorize all of the Japanese characters, the radical components, and the variant pronunciations. After spending countless hours combing through paper dictionaries, I got very interested in building electronic dictionaries. My interest in electronic dictionaries eventually led to search engines, to lexicons and algorithms powered by lexicons, and then ultimately to machine learning and deep learning.

Brian: I’m curious. I assume you need to employ either linguists or at least people who speak multiple languages. One concern with advanced analytics right now, especially anything involving prediction, is bias. I speak a couple of different languages, and I think one of the coolest things about learning another language is seeing the world through another context. Right now I’m learning Polish, and there’s the concept of case. It doesn’t just come down to learning the prefixes and suffixes that are added to words; effectively, that’s the output, but it’s understanding the nuance of when you would use a case and what you’re trying to convey. And when you relate it back to your own language, we don’t even have an equivalent; we would never divide a verb into two different senses that way. So you start to learn what you didn’t even know to think about. I guess what I’m asking is: how do you capture those things? Say, in our case, I assume you’re an American and I am too, so we have the English we grew up with and our context for that. How do you avoid bias? Do you think about bias? How do you build these systems when you’re approaching them from a single language? Ultimately, this code is probably written in English, I assume. Not to say the code would be written in a different language, but when you’re thinking about all these systems that have to do with language, where does integrating people who speak other languages come in? Can you talk about that a little bit?

Carl: Bias is incredibly important in any system that tries to respond to human behavior. We have our own innate cultural biases that we’re sometimes not even aware of. As you point out, it’s impossible to separate human language from the underlying culture and, in some cases, the geography and lifestyle of the people who speak that language. Yes, this is something that we think about. I disagree with your remark about code being written in English. The most important pieces of code today are the frameworks for implementing various machine learning and deep learning architectures. These architectures are, for the most part, language- and domain-agnostic. The language bias tends to creep in as an artifact of the data that we collect. If I were to, say, harvest a million pages randomly on the internet, a very large percentage of those pages would be in English, out of proportion to the share of the planet’s population that speaks English, just because English is a common language for commerce, science, and so on. The bias comes in from the data, or it comes in from the mindset of the architect, who may do something as simple-minded as allocating only eight bits per character or deciding that Python version 2 is an acceptable development platform.

Brian: Sure. I should say, I wasn’t so much speaking about the script, the code, as much as the humans behind it, their backgrounds, and the languages they speak, because the kinds of choices you’re talking about are informed by that person’s perspective. But thank you for clarifying.

Carl: I agree with that observation as well. You’re certainly right.
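That data-side bias can be audited mechanically. A minimal sketch, assuming the third-party langdetect package (any language identifier would serve the same purpose), that counts detected languages so English skew in a harvested corpus is visible before any model is trained:

from collections import Counter
from langdetect import detect  # pip install langdetect

def language_distribution(documents):
    # Tally detected languages; very short or ambiguous texts can fail
    # detection, so they are counted separately rather than dropped.
    counts = Counter()
    for text in documents:
        try:
            counts[detect(text)] += 1
        except Exception:
            counts["unknown"] += 1
    return counts

docs = ["The quick brown fox.", "El zorro es rápido.", "素早い茶色の狐。"]
print(language_distribution(docs))  # e.g. Counter({'en': 1, 'es': 1, 'ja': 1})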
Brian: Do you have a way? You’re experts in this area and you’re obviously heavily invested in it. Are there things you have to do to prevent that bias? In the sense of, “We know what we don’t know, so we have a checklist we go through to make sure we’re checking ourselves to avoid these things”? Or is it more in the data collection phase that you worry, rather than in the code that’s actually going to take the data and generate the software’s value at the other end? Is it more on the collection side that you’re thinking about it? How do you prevent it? How do you check yourself, or tell a client or customer, “Here’s how we’ve tried to make sure that the quality of what we’re giving you is good. We did A, B, C, and D”? Maybe I’m making a bigger issue out of this than it is. I’m not sure.

Carl: No, it is a big issue. The best way to minimize that cultural bias is by building global teams. That’s something we’ve done from the very beginning days of our company. We have a company in which, collectively, the team speaks over 20 languages and comes from many different countries around the world, and we do business in many countries around the world. That’s just been an absolute necessity, because we produce products that are proficient in 40 different human languages. If you’re a large enterprise, more than 500 people, and you’re targeting markets globally, then you need to build a global team. That applies to all the different parts of the organization, including the executive team. It’s rare that you will see individuals who are, say, of American culture with no meaningful international experience being successful in any kind of global expansion.

Brian: That’s pretty awesome that you have that many languages among the staff working at the company. That’s cool, and I think it does provide a different perspective. We talk about this even in the design field. Sometimes early design managers will want to hire a lot of people who look like they do, not necessarily physically, but in terms of skill set. One of the practices I’ve always liked is actually getting people who aren’t like you, who don’t think like you, in order to intentionally tease out what you don’t know. You know they’re not going to look at the problem the same way you do, and you don’t necessarily know what the output will be, but you learn that there are other perspectives to have, so too many like-minded individuals doesn’t necessarily make things better. I think that’s cool. Can you talk to me about one of the fun little nuggets that stuck in my head, which I think you attributed to somebody else: getting insights from medium data?

Carl: Sure. I should first start by crediting the individual who planted that idea in my head, which is Dr. Catherine Havasi of the MIT Media Lab, who’s also a cofounder of a company called Luminoso, which is a partner of ours. They do common sense understanding. The challenge with building truly capable text analytics from large amounts of unstructured text is obtaining sufficient volume. If you are a company on the scale of Facebook or Google, you have access to truly enormous amounts of text. I can’t quantify it in petabytes or exabytes, but it is a scale much greater than that of the typical global enterprise or Fortune 2000 company, who themselves may have very massive data lakes. But still, those data lakes are probably three to five orders of magnitude smaller than what Google or Facebook may have under their control. That intermediate-sized data, which is sloppily referred to as big data, we think of as medium data. We think about the challenge of allowing companies with medium data assets to obtain big data quality results, or business intelligence comparable to something that Google or Facebook might be able to obtain. We do that by building models that are hybrid, that combine knowledge graphs or semantic graphs derived from very large open sources with the information that they can extract from their proprietary data lakes, using the open sources and the models that we build as amplifiers for their own data.
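A minimal sketch of that amplifier idea: represent each proprietary document with word vectors distilled from very large open corpora, then fit a simple model on the medium-sized labeled set. The loaders are hypothetical stand-ins (open-source vectors such as GloVe or ConceptNet Numberbatch on one side, in-house examples on the other); this illustrates the general technique, not Basis Technology’s or Luminoso’s actual models:

import numpy as np
from sklearn.linear_model import LogisticRegression

def embed(text, vectors, dim=300):
    # Average the pretrained vectors of known words; the open-source
    # vocabulary does the heavy lifting for words the small in-house
    # dataset has rarely or never seen.
    words = [vectors[w] for w in text.lower().split() if w in vectors]
    return np.mean(words, axis=0) if words else np.zeros(dim)

vectors = load_open_source_vectors()         # hypothetical: word -> np.ndarray
texts, labels = load_proprietary_examples()  # hypothetical: the "medium data"

X = np.stack([embed(t, vectors) for t in texts])
model = LogisticRegression(max_iter=1000).fit(X, labels)

The classifier only ever sees thousands of in-house examples, but every feature encodes word meaning learned from billions of open tokens, which is the amplification described above.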
Brian: I believe when we were talking, you mentioned a couple of companies that are building products on top of you. Diffeo, I think, was one, and Tamr, and Luminoso. Is that related to what these companies are doing?

Carl: Yes, it absolutely is related. Luminoso, in particular, is using this process of synthesizing results from their customers’ proprietary data with their own models. The Luminoso team grew out of the team at MIT that built something called ConceptNet, which is a very large semantic graph in multiple languages. But Diffeo as well is using this approach of federating both open and closed source repositories, by integrating a large number of connectors into their architecture. They have access to web content. They have access to various social media firehoses. They have access to proprietary data feeds from financial news providers. But then they fuse that with internal sources of information that may come from places like SharePoint, or Dropbox, or Google Drive, or OneDrive, or your local file servers, and give you a single view into all of this data.

Brian: Awesome. I don’t want to keep you too long. This has been super informational for me, learning about the space that you’re in. Can you give us any closing thoughts, advice for product managers or analytics practitioners? We talked a little about thinking globally and some of those areas. Any other closing thoughts about delivering good experiences, leveraging text analytics, or other things to watch out for?

Carl: Sure. I’ll close with a few thoughts. One is repeating what I’ve said before about Unicode compliance. The fact that I have to state that again is somewhat depressing; it still isn’t taken as the absolute requirement it is today, and it continues to be overlooked. Secondly, think globally: with anything you’re building, you’ve got to think about a global audience. I’ll share an anecdote. My company gives a lot of business to Eventbrite, which I would expect by now to have a fully globalized platform, but it turns out their utility for sending an email to everybody who signed up for an event doesn’t work in Japanese. I found that out the hard way when I needed to send an email to everybody signed up for our conference in Tokyo. That was very disturbing, and I’m not afraid to say that live on a podcast. They need to fix it. You really don’t want customers finding out about something like that during a time of high stress and high pressure, and there’s just no excuse for it. Then my third point, with regard to natural language understanding:
This is a really incredibly exciting time to be involved with natural language, with human language, because the technology is changing so rapidly and the space of what is achievable is expanding so rapidly. My final point of advice is that hybrid architectures have been and continue to be the best. There’s a real temptation to say, “Just throw all of my text into a deep neural net and magic is going to happen.” That can be true if you have sufficiently large amounts of data, but most people don’t. Therefore, you’re going to get better results by using hybrids of simpler algorithmic machine learning approaches together with deep neural nets.

Brian: That last tip, can you take that down one more notch? I assume you’re talking about a level of quality at the tail end of the technology implementation, some higher-quality output. Can you translate what a hybrid architecture means in terms of a better product at the other end? What would be an example of that?

Carl: Sure. It’s hard to do without getting too technical, but I’ll try, and I’ll try to use some examples in English. The traditional way of approaching deep nets has very much been to take a very simple, potentially deep and recursive neural network architecture and just throw data at it, especially images or audio waveforms. I throw my images in, and I want to classify which ones were taken outdoors and which ones were taken indoors, with no traditional signal processing or image processing added before or after. In the image domain, my understanding is that that kind of purist approach has delivered the best results. That’s what I’ve heard; I don’t have first-hand information about it. However, when it comes to human language in its written form, there’s a great deal of traditional processing of the text that boosts the effectiveness of the deep learning. That falls into a number of layers that I won’t go into, but to give you one example, let’s talk about what we call orthography. The English language is relatively simple in that its orthography is generally quite simple. We’ve got the letters A through Z, in uppercase and lowercase, and that’s about it. But if you look inside, say, a PDF of English text, you’ll sometimes encounter things like ligatures: a lowercase F followed by a lowercase I, or two lowercase Fs together, will be replaced with a single glyph to make it look good in that particular typeface. If I keep those glyphs and just throw them in with all the rest of my text, that actually complicates the job of the deep learning. If I take that FI ligature and convert it back to a separate F followed by an I, or the FF ligature and convert it back to FF, my deep learning doesn’t have to figure out what those ligatures are about. Now, that seems pretty obscure in English, but in other writing systems, especially Arabic, in which there’s an enormous number of ligatures, or Korean, or languages that have diacritical marks, processing those diacritical marks, ligatures, and orthographic variations using conventional means will make your deep learning run much faster and give you better results with less data. That’s just one example, but there’s a whole range of other text-processing steps, using algorithms that have been developed over many years, that simply make the deep learning work better, and that results in what we call a hybrid architecture.
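That ligature step corresponds to Unicode compatibility normalization, which any Unicode-compliant stack can apply before the text reaches a model; a minimal sketch:

import unicodedata

raw = "ﬁnancial eﬀects"  # contains the 'fi' (U+FB01) and 'ff' (U+FB00) ligatures
clean = unicodedata.normalize("NFKC", raw)

print(clean)             # -> "financial effects"
assert "ﬁ" not in clean and "ﬀ" not in clean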
Brian: So it sounds like, as opposed to throwing it all in a pot and stirring, it’s, “Well, maybe I’m going to cut the carrots neatly to the right size and then throw them in the soup.”

Carl: Exactly.

Brian: You’re kind of helping the system do a better job at its work.

Carl: That’s right, and it’s really about thinking about your data and understanding something about it before you throw it into the big brain.

Brian: Exactly. Cool. Where can people follow you? I’ll put a link up to Basis in the show notes, but are you on Twitter or LinkedIn somewhere? Where can people find you?

Carl: LinkedIn tends to be my preferred social network. I was just never really good at summarizing complex thoughts into 140 characters, so that’s the best place to connect with me. Basistech.com will tell you all about Basis Technology, and rosette.com is our text analytics platform, which is free for anybody to explore, and to the best of my knowledge, it is the most capable text analytics platform with the largest number of languages that you will find anywhere on the public internet.

Brian: All right, I will definitely put those up in the show notes. This has been fantastic, I’ve learned a ton, and thanks for coming on Experiencing Data.

Carl: Great talking with you, Brian.

Brian: All right. Cheers.

Carl: Cheers.

Alumni Aloud
Linguistics in AI-Powered Text Analytics (feat. Michelle McSweeney)

Alumni Aloud

Play Episode Listen Later Apr 3, 2019 30:35


Michelle McSweeney is director of data quality at Converseon, a social media analytics and consulting agency. Michelle earned her PhD in linguistics at the Graduate Center in 2016.

Macro Musings with David Beckworth
Ryan Avent on Hyperinflation and the Fed’s New Dovish Direction

Macro Musings with David Beckworth

Play Episode Listen Later Mar 10, 2019 55:30


Ryan Avent is an economics columnist with The Economist magazine and is a previous guest of Macro Musings. He joins the show today to talk about some of his recent columns including work on hyperinflation, the Green New Deal, and Fed policy. David and Ryan also discuss the growing popularity of Modern Monetary Theory, the Fed’s dovish change in direction, and why hyperinflation is so devastating to a nation’s economy.   Transcript for the episode: https://www.mercatus.org/bridge/podcasts/03112019/hyperinflation-and-mmt   Ryan’s Twitter: @ryanavent Ryan’s Economist profile: http://mediadirectory.economist.com/people/ryan-avent/   Related Links:   *Hyperinflations Can End Quickly, Given the Right Sort of Regime Change* by Ryan Avent https://www.economist.com/finance-and-economics/2019/01/31/hyperinflations-can-end-quickly-given-the-right-sort-of-regime-change   *Taking the Fed at its Word: Direct Estimation of Central Bank Objectives using Text Analytics* by Adam Shapiro & Daniel Wilson https://www.frbsf.org/economic-research/files/wp2019-02.pdf   David’s blog: macromarketmusings.blogspot.com David’s Twitter: @DavidBeckworth

Data in Higher Education
The Data Science behind Text Analytics: Building a Bridge for Untapped Data

Data in Higher Education

Play Episode Listen Later Feb 25, 2019 22:54


What can you do with the power of text? JD White, PhD, Vice President of Product Management, and Tyler Rinker, PhD, Lead Data Scientist, discuss the buzz around text analytics, its increasing use on campus, and the power it can give you when assessing your qualitative data.

Department 12: An I-O Psychology Podcast
Sy Islam and Mike Chetta on Text Analytics

Department 12: An I-O Psychology Podcast

Play Episode Listen Later Sep 17, 2017 17:56


In this episode, we talk to Sy Islam and Mike Chetta about how they use text analytics in their consulting practice. Lots of great links to share this time around.
Episode Links: Talent Metrics (Website, Twitter, LinkedIn); Sy Islam on Twitter; Sy Islam on LinkedIn; Mike Chetta on Twitter; Mike Chetta on LinkedIn; Tools: Text Analysis with R, Basic Text Mining in R, Tropes; Farmingdale State College Department of Psychology; Farmingdale State College Expert Center; Touro College and University System

Reinventing Professionals
ILTA Feature: Definitively Deploying Text Analytics

Reinventing Professionals

Play Episode Listen Later Aug 7, 2017 6:33


I spoke with Rob Wescott, the chief revenue officer for ayfie, a text analytics company that adds structure to unstructured information. We discussed the genesis of ayfie, how the company's technology differs from others in this sector, ways that it enables technology assisted review and continuous active learning, its distribution model, and what the company will be announcing at ILTACON 2017.

Data Podcast
Hendrik Feddersen (@h_feddersen): HR Analytics & its application in the Data Science World

Data Podcast

Play Episode Listen Later Jun 9, 2017 11:58


Hendrik Feddersen leads HRIS at the European Medicines Agency (the European equivalent of the FDA) in London. Six years ago, he led the full SAP HCM project from conception to completion and was the main change manager. His current tasks include introducing further process improvements, problem solving, data cleaning, reporting, preparing predictions, and training colleagues. For more than three years he has been connecting internationally with like-minded HR professionals interested in HR Analytics, attending conferences, studying data science, and collecting and writing articles on HR Analytics. His special interests are Text Analytics, Social Network Analysis, and open source software like R. Interviewer: Rajib Bahar. Music: www.freesfx.co.uk

Linear Digressions
Feature Processing for Text Analytics

Linear Digressions

Play Episode Listen Later Apr 23, 2017 17:28


It seems like every day there are more and more machine learning problems that involve learning on text data, but text itself makes for fairly lousy input to machine learning algorithms. That's why there are text vectorization algorithms, which reformat text data so it's ready for machine learning. In this episode, we'll go over some of the most common and useful ways to preprocess text data for machine learning.

HANA.fm
HFM010: Text Analytics with SAP HANA

HANA.fm

Play Episode Listen Later Oct 28, 2016 29:47


This episode of HANA.fm is all about text analytics with SAP HANA. As soon as you step outside your own company, you frequently encounter unstructured data in the digitized world. In this episode, I therefore cover how to handle such unstructured data on SAP HANA so that it can then be put to meaningful use. More information at: https://hanafm.de/hfm010-text-analytics-mit-sap-hana/

4imprint Podcasts
Text Analytics [PODCAST]

4imprint Podcasts

Play Episode Listen Later Feb 23, 2016 35:50


This podcast explores text analytics and how businesses can use it to better understand their customers.

The Recruiting Animal
Talent Browser Search Tool with Janet Dwyer, John Harney and Andrew Gadomski

The Recruiting Animal

Play Episode Listen Later Feb 3, 2016 91:00


@TalentBrowser - @JanetDwyer2 - @AndrewGadomski - John Harney. Talent Browser is cloud-based Text Analytics and Job Matching software that automatically inventories, quantifies, and benchmarks people's skills and experience for use in recruitment, learning and development, mobility, succession planning, and other corporate initiatives.

European Speechwriter Network's Podcast
"Who Needs Copywriters?" Chris West: Who Likes Cunning Linguists?

European Speechwriter Network's Podcast

Play Episode Listen Later Jun 3, 2013 37:20


A conference organised by A Thousand Monkeys & UK Speechwriters' Guild. Hosted by Bournemouth University Media School, Thursday 18th & Friday 19th April 2013. Chris West’s award-winning copywriting career spans Saatchi’s to Mother, and he is one of the people claiming to have written the line “You never actually own a Patek Philippe. You merely look after it for the next generation.” His company, Verbal Identity, creates language which creates value for its clients and uses Text Analytics for Marketing to discover unknown themes in consumer conversations. Chris also writes for the Sunday Times and wrote the winner of Best Film at the Barcelona Film Festival.

European Speechwriters
"Who Needs Copywriters?" Chris West: Who Likes Cunning Linguists?

European Speechwriters

Play Episode Listen Later Jun 3, 2013 37:20


A conference organised by A Thousand Monkeys & UK Speechwriters' Guild. Hosted by Bournemouth University Media School, Thursday 18th & Friday 19th April 2013. Chris West's award-winning copywriting career spans Saatchi's to Mother, and he is one of the people claiming to have written the line “You never actually own a Patek Philippe. You merely look after it for the next generation.” His company, Verbal Identity, creates language which creates value for its clients and uses Text Analytics for Marketing to discover unknown themes in consumer conversations. Chris also writes for the Sunday Times and wrote the winner of Best Film at the Barcelona Film Festival.

DiscoverText
Creating a New Project in DiscoverText

DiscoverText

Play Episode Listen Later Nov 28, 2012 1:09


DiscoverText
Uploading PST Files from Microsoft Outlook

DiscoverText

Play Episode Listen Later Jan 26, 2012 4:40


CERIAS Security Seminar Podcast
Victor Raskin & Julia Taylor, Ontological Semantic Technology for Detecting Insider Threat and Social Engineering

CERIAS Security Seminar Podcast

Play Episode Listen Later Apr 28, 2010 56:18


The paper describes a computational system, an application and implementation of the mature Ontological Semantic Technology, for detecting unintentional inferences in the casual, unsolicited, and unrestricted verbal output of individuals who are potentially responsible for leaking classified information to people with unauthorized access. Uses of the system for cases of insider threat and/or social engineering are discussed. About the speaker: Victor Raskin is a Distinguished Professor of English and Linguistics at Purdue, who is also an Associate Director for Graduate Education at CERIAS and has a courtesy appointment in CS. He holds a Ph.D. (1970) from the Lomonosov Moscow State University in mathematical and computational linguistics. A co-founder of Ontological Semantics, with his former Ph.D. advisee Sergei Nirenburg, he has authored, co-authored, or edited some 20 books and over 200 papers in the area of theoretical and computational semantics and their applications. He has also consulted for a number of businesses implementing applications of Ontological Semantics. Julia M. Taylor is a Visiting Scholar at CERIAS and in Linguistics at Purdue University and a leading designer for the Text Analytics application of the Ontological Semantic Technology at RiverGlass, Inc. She holds a Ph.D. in Computer Science and Engineering from the University of Cincinnati (2008). She has published over 30 papers on knowledge representation, fuzzy logic, and, of course, the Ontological Semantic Technology and its applications, and is working on a book about a computational joke detection system.

DiscoverText
Smoking Hot Data and Text Analytics

DiscoverText

Play Episode Listen Later 4:13

