Hihi, this is Alex from Weights & Biases, coming to you live from Yosemite! Well, actually I'm writing these words from a fake virtual Yosemite that appears above my kitchen counter, as I'm now a Vision Pro user, and I will force myself to work inside this thing and tell you if it's worth it. I will also be on the lookout for anything AI related in this new spatial computing paradigm, like THIS for example! But back to reality for a second, we had quite the show today! We had the pleasure of having Junyang Justin Lin, a dev lead at Alibaba, join us and talk about Qwen 1.5 and QwenVL, and then we had a deep dive into quite a few acronyms I've been seeing on my timeline lately, namely DSPy, ColBERT and (the funniest one) RAGatouille, and we had a chat with Connor from Weaviate and Benjamin, the author of RAGatouille, about what it all means! Really really cool show today, hope you don't only read the newsletter but listen on Spotify, Apple or right here on Substack.

TL;DR of all topics covered:

* Open Source LLMs
  * Alibaba releases a BUNCH of new Qwen 1.5 models, including a tiny .5B one (X announcement)
  * Abacus fine-tunes Smaug, top of the HF leaderboard, based on Qwen 72B (X)
  * LMsys adds more open source models, sponsored by Together (X)
  * Jina Embeddings fine-tune for code
* Big CO LLMs + APIs
  * Google rebrands Bard to Gemini and launches Gemini Ultra (Gemini)
  * OpenAI adds image metadata (Announcement)
  * OpenAI keys are now restricted per key (Announcement)
* Vision & Video
  * Bria - RMBG 1.4 - open source background removal that runs in your browser (X, DEMO)
* Voice & Audio
  * MetaVoice, a new Apache-2-licensed TTS (Announcement)
* AI Art & Diffusion & 3D
  * Microsoft added DALL-E editing with "Designer" (X thread)
  * Stability AI releases an update to SVD - video 1.1 launches with a web UI, much nicer videos
* Deep dive with Benjamin Clavié and Connor Shorten show notes:
  * Benjamin's announcement of RAGatouille (X)
  * Connor's chat with Omar Khattab (author of DSPy and ColBERT) - Weaviate Podcast
* Very helpful intro to ColBERT + RAGatouille - Notion

Open Source LLMs

Alibaba releases Qwen 1.5 - ranges from .5B to 72B (DEMO)

With 6 sizes, including 2 novel new ones - from as little as a .5B-parameter model, to an interesting 4B, all the way to a whopping 72B - Alibaba open sources additional Qwen checkpoints. We had the honor of having friend of the pod Junyang Justin Lin on again; he talked to us about how these sizes were selected, noted that even though this model beats Mistral Medium on some benchmarks, it remains to be seen how well it performs on human evaluations, and shared a bunch of details about open sourcing it.

The models were released with all the latest and greatest quantizations, significantly improved context length (32K) and support for both Ollama and LM Studio (which I helped make happen, and I'm very happy with the way the ThursdAI community is growing and connecting!)

We also had a chat about QwenVL Plus and QwenVL Max, their API-only versions of their best vision-enabled models, and had the awesome Piotr Skalski from Roboflow on stage to chat with Junyang about those models!

To me, a success for ThursdAI is when the authors of things we talk about come on the show. This is Junyang's second appearance, which he joined at midnight at the start of the Chinese New Year, so it's greatly appreciated - definitely give him a listen!

Abacus Smaug climbs to the top of the Hugging Face leaderboard

Junyang also mentioned that Smaug is now at the top of the leaderboards. Coming from Abacus, this is a finetune of the previous Qwen-72B, not even this new one. It's the first model to achieve an average score of 80 - an impressive appearance from Abacus. They haven't released any new data, though they said they are planning to!
They also said that they are planning to finetune Miqu, which we covered last time - the leak from Mistral that was acknowledged by Arthur Mensch, the CEO of Mistral. The techniques Abacus used to finetune Smaug will be released in an upcoming paper!

Big CO LLMs + APIs

Welcome Gemini Ultra (bye bye Bard)

Bard is no longer; get ready to meet Gemini. It's really funny, because we keep getting confusing naming from huge companies like Google and Microsoft. Just a week ago, Bard with Gemini Pro shot up the LMSYS charts (after the regular Gemini Pro API didn't score as close), and now we're supposed to forget that Bard ever existed?
Recorded at the Øredev 2022 developer conference, Fredrik chats with Michele Riva about writing a full-text search engine, maintaining 8% of all Node modules, going to one conference per week, refactoring, the value of a good algorithm, and a lot more. Michele highly recommends writing a full-text search engine. He created Lyra - later renamed Orama - and encourages writing your own in order to demystify subjects. Since the podcast was recorded, Michele has left his then employer Nearform and founded Oramasearch to focus on the search engine full time. We also discuss working for product companies versus consulting, versus open source. It's more about differences between companies than anything else. Open source teaches you to deal with more and more different people. Writing code is never just writing code. Should we worry about taking on too many dependencies? Michele is in favour of not fearing dependencies, but of ensuring you understand how the important parts of your application work. Writing books is never convenient, but it can open many doors. When it comes to learning, there are areas where a whole level of tutorials is missing - where there are only surface-level tutorials and perhaps deep papers, but nothing in between. Michele works quite a bit on bridging such gaps through his presentations. Thank you Cloudnet for sponsoring our VPS! Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.
Links:
* Michele
* Michele's Øredev 2023 presentations
* Nearform
* TC39 - the committee which evolves Javascript as a language
* Matteo Collina - worked at Nearform, works with the Node technical steering committee
* Lyra - the full-text search engine - has been renamed Orama
* Lucene
* Solr
* Elasticsearch
* Radix tree
* Prefix tree
* Inverted index
* Thoughtworks
* McKinsey
* Daniel Stenberg
* Curl
* Deno
* Express
* Fastify
* Turbopack
* Turborepo from Vercel
* Vercel
* Fast queue
* Refactoring
* Michele's refactoring talk
* Real-world Next.js - Michele's book
* Next.js
* Multitenancy
* Create React app
* Nuxt
* Vue
* Sveltekit
* TF-IDF - "term frequency–inverse document frequency"
* Cosine similarity
* Michele's talk on building Lyra
* Explaining distributed systems like I'm five
* Are all programming languages in English?
* 4th dimension
* Prolog
* Velato - programming language using MIDI files as source code

Titles:
* For foreign people, it's Mitch
* That kind of maintenance
* A very particular company
* A culture around open source software
* Now part of the 8%
* Nothing more than a radix tree
* One simple and common API
* Multiple ways of doing consultancy
* What you're doing is hidden
* You can't expect to change people
* A problem we definitely created ourselves
* Math or magic
* Writing books is never convenient
* Good for 90% of the use cases
* (When I can choose,) I choose computer science
Maarten Grootendorst is a data scientist at IKNL, and more importantly, he's the author of two open source libraries that I've come to love: BERTopic (topic modeling with transformers and c-TF-IDF) and PolyFuzz (fuzzy string matching). Both these projects bring the power of transformers and other leading-edge models and package them with simple APIs, clear documentation, and visualization tools.

Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/

Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.

Detailed show notes can be found on The Data Exchange web site.
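The c-TF-IDF idea at the heart of BERTopic can be sketched in plain Python: treat all documents of a class (topic) as one big document, then weight each term by its in-class frequency times an inverse frequency across classes. This is a simplified sketch of the weighting scheme, not BERTopic's actual implementation:

```python
import math
from collections import Counter

def c_tf_idf(docs_per_class):
    """Simplified class-based TF-IDF (c-TF-IDF), in the spirit of BERTopic.

    docs_per_class: dict mapping a class/topic label to a list of documents.
    Returns a dict mapping each class to {term: weight}.
    """
    # Join all documents of a class into one "class document".
    class_tf = {c: Counter(" ".join(docs).lower().split())
                for c, docs in docs_per_class.items()}
    # Frequency of each term across all classes combined.
    total_tf = Counter()
    for tf in class_tf.values():
        total_tf.update(tf)
    # A = average number of words per class.
    avg_words = sum(sum(tf.values()) for tf in class_tf.values()) / len(class_tf)
    weights = {}
    for c, tf in class_tf.items():
        # Terms frequent in this class but rare overall score highest.
        weights[c] = {t: f * math.log(1 + avg_words / total_tf[t])
                      for t, f in tf.items()}
    return weights
```

Terms that dominate one class but are rare elsewhere end up with the highest weights, which is what makes the per-topic keyword lists readable.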
How often should a certain target keyword appear in an article? With the TF*IDF tool you can determine the term frequency while doing a competitive analysis in content marketing. Agency: SavaDigital.io Contact: augustine@savadigital.io
What is Natural Language Processing? And what does it have to do with SEO? Natural Language Processing aims to let machines understand human language, and it is notably used in information retrieval. It is a technology Google uses to process, understand and rank the content of web pages, as well as users' searches. Historically, a major theory was developed in the mid-1950s: the distributional hypothesis. It posits that words which occur in similar contexts tend to have similar meanings. Several decades later, in 1983, Gerard Salton proposed the vector model, which consists of representing text documents or lists of words as vectors, that is, as numeric values. In parallel, the same Gerard Salton proposed using a statistical weighting method, called TF-IDF, to evaluate the importance of a term, now a numeric value, within a document. From the 2010s onward, artificial neural networks began to be used in NLP. In 2013, algorithms trained with neural networks and developed by Google's teams produced Word2Vec, a word-embedding algorithm capable of identifying the relationships between words by taking into account the context in which those words, transformed into vectors, appear. Since 2013, Google has kept pushing the frontiers of natural language processing. One example is BERT, its algorithm at work since 2019 to understand users' queries even more precisely. At the end of 2020, Google announced that its "passage indexing" update lets it identify the specific passage of a piece of content that, in its view, precisely answers the user's query.
In this way, Google can return to the user an excerpt of a piece of content in response to their search, even if the page's overall content is only loosely related to the request. As you can see, Google's understanding of your content is precise. Advances in natural language processing show that today it is completely counterproductive to stuff your content with the keyword you want to rank for. Likewise, long, diluted texts serve no purpose; quite the opposite. Google wants to surface precise texts that get to the point and are clear about their goal of answering a given problem, both overall and in each of the sub-topics covered. Always keep this in mind: what Google wants is to display the most relevant answers to the user's query. To optimize a piece of content, you must therefore first and foremost be clear about your intention to answer a problem your users face. And rather than stuffing your page with the same keyword you want to rank for, ask yourself instead which terms and themes surround it and are regularly mentioned when people discuss the subject you want to address. Structure your content accordingly. Each topic related to your main subject can get its own sub-section or dedicated paragraph. This way of structuring your content will please readers and the search engine alike, and that's the winning combination to move you closer to the top spots in Google's results pages. Find the podcast "Qu'est-ce que le Natural Language Processing" on YouTube: https://www.youtube.com/watch?v=5eGX_aturVM
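The vector model and TF-IDF weighting described above reduce to a simple mechanic: turn texts into numeric vectors, then compare them with cosine similarity. A minimal illustration with raw term counts (a real system would apply TF-IDF weights first):

```python
import math
from collections import Counter

def vectorize(text, vocab):
    """Represent a text as a term-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[t] for t in vocab]

def cosine(u, v):
    """Cosine similarity: 1.0 = same direction, 0.0 = no shared terms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vocab = ["search", "engine", "ranking", "recipe", "cake"]
d1 = vectorize("search engine ranking", vocab)
d2 = vectorize("ranking in a search engine", vocab)
d3 = vectorize("cake recipe", vocab)
```

Here d1 and d2 point in the same direction (similar meaning under the distributional view), while d3 shares no vocabulary with d1 and scores zero.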
In this podcast episode we talk about the TF-IDF model in Natural Language Processing. TF-IDF stands for term frequency-inverse document frequency. We use the TF-IDF model to give more weight to important words compared with common words like the, a, in, there, where, etc. To learn Python programming, visit www.stacklearn.org. See you in the next podcast episode!
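The idea fits in a few lines of code. A sketch of one common variant of the weighting (libraries differ in smoothing and normalization details):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of whitespace-tokenized documents.

    TF  = count of a term in a document / number of terms in that document
    IDF = log(N / number of documents containing the term)
    A word that appears in every document gets IDF = log(1) = 0.
    """
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights
```

On a toy corpus like `["the cat sat", "the dog ran", "the cat ran"]`, "the" scores exactly zero in every document while rarer, more informative words score higher, which is the behaviour the episode describes.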
DEBUNK SEO MYTHS AND LEARN PROPER SEO WITH LAURENT BOURRELLY & DIXON JONES
Since our industry, Search Engine Optimization and Digital Marketing, does not have any standards, anyone can pretend anything. This podcast was launched with the will to tell MY truth about Google SEO and Digital Marketing in general. When it comes to Google algorithms, there are some insane theories going around. Domain Authority is a big one we treated twice on SEO Conspiracy. LSI Keywords, TF-IDF and a couple more are SEO urban legends born from who knows where. LSI keywords in particular are not a good idea to use in your SEO strategy. Google does not use Latent Semantic Indexing or Latent Semantic Analysis. These algorithms are interesting, but it's useless and careless to play with them for Google SEO. Beyond using bad data, don't use SEO myths if you want to be taken seriously by SEOs who know what's up. As always, this is only our opinion. Please don't hesitate to share your point of view on the topics covered in this video. I never take for granted the time you spend watching our content. --- The program of the SEO Conspiracy Podcast is the following: Monday ➡ SEO myth-busting with my exclusive co-host, the one and only Dixon Jones; Tuesday ➡ Your fix of alternative SEO news. I review every important piece of news about the Search/Digital Marketing industry from the week before; Wednesday ➡ SEO stories. With or without a guest, I take the time to dissect Search Engine Optimization and Digital Marketing topics; Thursday ➡ this day is reserved for Semantic SEO and my strategy called the Topical Mesh. In a series of 52 videos, I lay out the complete plan. This is the most advanced free SEO tutorial in the world; Friday ➡ Q&A. I have tons of questions in stock, asked by my students and clients. To start off the series, I will dig into this pool of SEO and Digital Marketing related questions. To continue the series, please contact me (contact info in the about page on the Youtube channel) or via social media (links below). Ask me any questions.
My answer will be 100% BS Free Guaranteed or your money back (just kidding, I'm giving out everything for free); Weekends ➡ and/or sometimes during the week, live sessions will take place. Among other ideas, I will be performing live SEO audits. I want to help you achieve better results; I don't want to hold back anything. I've always been known to lay it all out like it is. There is way too much BS talk in the SEO industry. Let's cut through the noise to have a real conversation. Thank you very much for watching Laurent Bourrelly https://www.seoconspiracy.com/ ----------- Laurent's Stuff : https://www.topicalmesh.com/ https://www.frenchtouchseo.com/ https://rank4win.com/ https://twitter.com/laurentbourelly ----------- Dixon's Stuff: https://dixonjones.com/ https://majestic.com/ https://inlinks.net/ ----------- SEO Conspiracy Social Media : - Facebook: https://www.facebook.com/seoconspiracy/ - Instagram: https://www.instagram.com/seoconspiracy/ - Twitter: https://twitter.com/seoconspiracy #SEO #Google #DigitalMarketing
This week we have a guest joining us, Francisco J. Azuaje G! He brings us the paper "How to Develop Machine Learning Models for Healthcare." Lan discusses "Animal AI Olympics," a reinforcement learning competition inspired by animal cognition. Kyle talks about WhatsApp and discusses the article "Why New Contact Tracing Apps Have A Critical WhatsApp-Sized Problem." Last but not least: George! He brings us his blog post about comparing TF-IDF and BERT vectorisation for speaker prediction. All works discussed can be found in the show notes.
Splunk [Foundations/Platform Track] 2019 .conf Videos w/ Slides
Wouldn't it be great if we had Splunk's version of Moneyball: an application where everyone comes out ahead by leveraging data to drive effective Splunk enablement and adoption? Splunk's internal logs have a wealth of information about how Splunk is being used within your organization. Let's take drinking the "Splunk Champagne" to the next level by applying statistics and machine learning to Splunk's internal logs! This session will cover segmenting users based on their search profiles - number of searches run, average response times, and recency of searches executed, among other criteria. We'll use techniques such as clustering to classify users from novice to expert, and use TF-IDF and text analytics techniques to understand commands used in search strings. Enriching this data with completed and planned Splunk Education courses, lunch & learn sessions, and other training activities will enable your users to achieve the Splunk Ninja status they're looking for! Speaker(s) Anand Ladda, Staff Solutions Engineer, Splunk Slides PDF link - https://conf.splunk.com/files/2019/slides/FN1373.pdf?podcast=1577146201 Product: Splunk Enterprise, Splunk Cloud, Splunk Machine Learning Toolkit, AI/ML Track: Foundations/Platform Level: Intermediate
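Stripped of the Splunk specifics, the segmentation step this abstract describes is just scoring each user on frequency, efficiency, and recency, then bucketing the result. A toy sketch; the field names and thresholds here are invented for illustration, not taken from the session:

```python
from datetime import date

def segment(user, today):
    """Classify a user as 'novice', 'intermediate', or 'expert'.

    user: dict with 'searches' (count), 'avg_response_s' (seconds),
          'last_search' (date). All thresholds are illustrative.
    """
    recency_days = (today - user["last_search"]).days
    score = 0
    score += 1 if user["searches"] >= 100 else 0     # runs many searches
    score += 1 if user["avg_response_s"] < 5 else 0  # writes efficient searches
    score += 1 if recency_days <= 7 else 0           # active recently
    return ["novice", "novice", "intermediate", "expert"][score]
```

A real implementation would derive these features from the internal search logs and cluster them rather than hand-pick thresholds, but the shape of the output is the same.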
Splunk [Enterprise Cloud and Splunk Cloud Services] 2019 .conf Videos w/ Slides
Splunk [AI/ML, Splunk Machine Learning Toolkit] 2019 .conf Videos w/ Slides
Britney Muller, Senior SEO Scientist at Moz, talks about machine learning, learning in general, elephants, and crazy chess stories! 0:00 Intro 0:54 2 Chess stories and why it’s not like Poker 9:00 What chess and Machine Learning have in common 11:28 How Britney fell in love with machine learning 16:06 Applying machine learning to SEO 18:47 How bad is TF-IDF really? 21:09 Google’s NLP API 25:00 How we can save elephants with machine learning 31:43 How machine learning will take Google to the next level 43:44 What is Moz planning to do with ML? 45:30 The Snapchat thief 50:55 The importance of learning * https://twitter.com/BritneyMuller * moz.com * chess.com * www.bitly.com/help-the-elephants * Book: Annie Duke - Thinking in Bets * AlphaGo documentary: https://www.netflix.com/title/80190844 * Andrew Ng's course: https://www.coursera.org/courses?query=andrew%20ng
If you follow search engine news at all, you've likely heard of TF-IDF. But what is it? See full show notes at: https://digitalcast.org/podcast129
TF*IDF was long considered an important tool for optimizing website content. But does analyzing texts and aligning them with term frequency and term distribution actually accomplish anything? We dig into that question in the current edition of 'SEO im Ohr'. As always, there's also the week's most important SEO news.
In this episode, we talk with Daniël and Emiel, software engineer and product owner in the customer support domain. In this domain, the focus is on helping our customers in the best way possible. But what if we can prevent the customer from feeling the need to contact bol.com in the first place, they asked themselves.

What this episode covers

They realized this could be possible by analysing the various customer interactions we have via the chatbot "Billie", live chat, phone and email. For these analyses, they introduced techniques from the Data Science and Machine Learning domain: Natural Language Processing is needed, as well as Comprehender techniques. As a team, they investigated models available in the open source community. In the podcast, we talk about how we adapted them for this purpose, followed by the training of these models. We talk about the four steps to get from an idea to usable information for the product specialists. The product specialists can use the information provided to enhance product details and descriptions to a level where a minimum of customer questions is needed. One of the first deliverables was the introduction of the so-called unhappy-products report: products which cause relatively many customer interactions. It presents these products but, even more importantly, their possible causes.

Guests

Daniël Heres; software engineer
Emiel Ubink; product owner

Notes

Design sprints are part of the way of working, to increase speed and shorten the feedback loop: design sprint
Determining the important words in a text is done using TF/IDF, which stands for term frequency-inverse document frequency. It's part of Natural Language Processing and is used to build the smart word clouds: TF/IDF
BigQuery is used to store the data to be analysed later on.
Predictive models are used to improve the shop, which should result in a better customer journey with fewer customer questions.
Unsupervised learning is discussed as the way the models are trained and verified.
Clustering and topic modelling: finding the latent themes (topics).

Update 4th June 2021: We received a mail from one of our listeners providing us with a link to more up-to-date info on unsupervised learning
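The smart-word-cloud step described above boils down to ranking each document's terms by TF-IDF and keeping only the top few. A minimal sketch of that idea (not bol.com's actual code, and using raw counts rather than normalized term frequencies):

```python
import math
from collections import Counter

def top_terms(docs, k=3):
    """For each document, return its k highest-scoring TF-IDF terms,
    e.g. as input for a word cloud."""
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                       # how many documents contain each term
    for toks in tokenized:
        df.update(set(toks))
    result = []
    for toks in tokenized:
        tf = Counter(toks)
        # Terms frequent here but rare across the corpus score highest.
        scored = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        result.append(sorted(scored, key=scored.get, reverse=True)[:k])
    return result
```

Run over a set of customer messages, the top terms per message (or per product's messages) are exactly the words you would want to surface in an "unhappy products" word cloud.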
In episode 15, Jochen and Dominik from the Python Podcast join us to dive into machine learning. Together we walk through, in concrete terms, the steps needed to automatically assign tags to news articles. We discuss the right tooling in the Python ecosystem - Jupyter notebooks, PyData tools such as numpy and pandas, and our favourite plotting libraries. After a first data analysis, we cover handling null values and vectorizing text with TF-IDF or word embeddings. We discuss various algorithms from the scikit-learn library and explain pipelines and hyper-parameter tuning. Finally, we check the quality of our models with a classification report and touch on scaling, deep learning, and much more. [Reuters Dataset](https://martin-thoma.com/nlp-reuters/) [Pandas Profiling](https://github.com/pandas-profiling/pandas-profiling) [Pathlib](https://docs.python.org/3/library/pathlib.html) [Modin](https://github.com/modin-project/modin) [Pandarallel](https://github.com/nalepae/pandarallel) [Dask](https://dask.org/) [Sklearn Pipelines](https://www.kaggle.com/baghern/a-deep-dive-into-sklearn-pipelines) [Machine learning model management](https://www.inovex.de/blog/machine-learning-model-management/) [kaggle](https://www.kaggle.com/)
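The workflow discussed in the episode - TF-IDF vectorization, a scikit-learn classifier, and hyper-parameter tuning - fits naturally into a scikit-learn Pipeline. A minimal sketch on a made-up four-document dataset (real tagging would use a corpus like Reuters and multi-label models):

```python
# TF-IDF vectorization + linear classifier, tuned jointly with grid search.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

texts = ["stocks fall on rate fears", "team wins the cup final",
         "central bank raises rates", "striker scores twice in derby"]
tags = ["finance", "sports", "finance", "sports"]

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),   # text -> sparse TF-IDF vectors
    ("clf", LogisticRegression()),  # linear classifier on top
])
# Hyper-parameter tuning over vectorizer and classifier in one search;
# params are addressed as step__parameter.
grid = GridSearchCV(pipe, {"tfidf__ngram_range": [(1, 1), (1, 2)],
                           "clf__C": [0.1, 1.0]}, cv=2)
grid.fit(texts, tags)
```

Because the vectorizer lives inside the pipeline, cross-validation refits it per fold, which avoids leaking vocabulary statistics from the validation split into training.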
Google is apparently considering introducing paid features for My Business. Further indexing problems at Google affect both the canonical URLs Google chooses and the index coverage report. Also in this edition: Google comments on the growing number of 'nofollow' links and on TF*IDF as a ranking factor, why chains of 301 and 302 redirects can be problematic, and what distinctions Google makes between the last-updated date in the sitemap and the date shown directly on a web page.
Why should you listen to this episode? Here is what you will learn:
* The #1 ranking factor for Google search
* What's the 'best' word count for Google?
* Heading tags for SEO
* Using LSI and TF-IDF for advanced on-page SEO
* Creating a content brief to rank higher - the process
* Ranking signals that matter
* Keyword research tools recommended by Kyle
* The site SEO audit tool recommended by Kyle
* The best ways to go about backlink building (easy wins and tough ones)
* Creating a backlink-building strategy for service/product-based websites
* Should you be using the main domain or other domain emails for outreach emails?
* How to keep yourself updated with the latest SEO techniques
* Light-hearted talk about Kyle's journey

Fun fact: Kyle was a lawyer before he started his career in the SEO industry. Kyle's unofficial punch line: "Go Data or Go Home". About Kyle Roof: Kyle is the co-founder of HVSEO and Page Optimizer Pro (POP) and an international SEO speaker
Today's episode features Marcus Tandler, who cofounded and runs the enterprise SEO software company OnPage.org, which has the goal of helping people create better websites that rank better in search engines. Marcus, a native of Munich, Germany, is a former super affiliate who was at one point among Commission Junction’s top 5 earners in all of Europe. He also runs a super-exclusive conference (or think tank, to use his term) called SEOktoberfest in Munich every year. In our chat, Marcus takes the time to share many of his work experiences with us. Find Out More About Marcus Here: Marcus Tandler on LinkedIn, @mediadonis on Twitter, Marcus Tandler on Facebook. In This Episode: [01:06] - Why did Marcus decide to switch from affiliate marketing to running a software service company? [05:18] - Stephan steps in for a moment to share an origin story of his own. [06:55] - Marcus shares his thoughts on what Stephan did, which he believes was a smart strategy. He and Stephan then go on to discuss their companies and work experiences. [11:27] - Stephan brings up the idea of surrounding yourself with people who are smarter than you. Marcus agrees and gives an example of having done this. [14:17] - We learn about how private SEOktoberfest is, and how it’s structured in terms of experts and attendees. [16:52] - Marcus and Stephan engage in a role-playing exercise by giving each other cool information, the way they might at a mastermind or think tank. [18:02] - Stephan’s first contribution involves YouTube searching on Google Trends. His second is about Christoph Cemper’s research on 302s being better at passing the SEO benefit over time (versus 301s). [20:36] - Stephan shares a story about Greg Boser and Todd Friesen. [25:01] - We hear about Marcus’ and Stephan’s thoughts on the pill-pushing game. [27:22] - Certain black hat techniques stopped working in 2007 or 2008, Marcus explains. [32:49] - Stephan and Marcus talk about the featured snippet, or instant answer, on Google.
[34:23] - We hear more about SEMrush, and a tool it offers related to featured snippets. [35:37] - Marcus takes his turn for sharing ideas. He talks about TF-IDF analysis, a major topic in German SEO circles. [38:41] - Marcus offers a reverse example of what he’s been talking about. [45:11] - We hear an example of the kind of thought process Marcus has been describing, from Stephan this time. He talks about Homesteading.com and the fact that one of their articles outranks the home page on Google. [48:31] - What does Marcus think of latent semantic indexing (LSI)? [52:42] - Are there any free tools that give some actionable insight into TF-IDF? Marcus reveals that a limited version of this tool is available in the free version of OnPage. [54:31] - Stephan and Marcus touch on the problem of using the disallow directive instead of no-indexing pages. [55:22] - Marcus talks about some common SEO screw-ups that OnPage can find. [60:14] - Marcus’ company doesn’t offer any consulting services. Here, he explains why.

Links and Resources:
* Marcus Tandler on LinkedIn
* @mediadonis on Twitter
* Marcus Tandler on Facebook
* OnPage.org
* Screaming Frog
* AltaVista
* Black hat SEO
* Fireball
* War Room Mastermind
* Christoph Cemper
* 11 More Things You Didn’t Know About Links and Redirects by Christoph Cemper
* Greg Boser
* Todd Friesen
* SEMrush
* TF-IDF
* Homesteading.com
* Marcus Tober on Marketing Speak
SEOHouse: SEO Podcast zur strategischen Suchmaschinenoptimierung – termfrequenz
With Thomas Mindnich, we have a true hero of SEO's early days in the SEOHouse, which makes it natural to talk with him about text analysis, information retrieval, NLP, search engines, TF*IDF and a bunch of other stuff... The post TF*IDF, Content und SEO – SEOHouse 75 appeared first on termfrequenz.
In this Marketing Nerds episode, Brent Csutoras sits down with Marcus Tandler of OnPage.org to talk about TF*IDF, one of Google's earliest ranking factors. The post Understanding TF*IDF: One of Google’s Earliest Ranking Factors appeared first on Search Engine Journal.
Algorithmia is a marketplace for algorithms. A software engineer who writes an algorithm for image processing or spam detection or TF-IDF can turn that algorithm into a RESTful API to be consumed by other developers. Different algorithms can be composed together to build even higher-level applications. Diego Oppenheimer is the CEO of Algorithmia. The post Algorithm Marketplace with Diego Oppenheimer of Algorithmia appeared first on Software Engineering Daily.