Podcasts about Lucene

  • 48 podcasts
  • 66 episodes
  • 54m average episode duration
  • Infrequent episodes
  • Latest episode: Apr 10, 2025




Latest podcast episodes about Lucene

The New Stack Podcast
OpenSearch: What's Next for the Search and Analytics Suite?

Apr 10, 2025 · 20:10


OpenSearch has evolved significantly since its 2021 launch, recently reaching a major milestone with its move to the Linux Foundation. This shift from company-led to foundation-based governance has accelerated community contributions and enterprise adoption, as discussed by NetApp's Amanda Katona in a New Stack Makers episode recorded at KubeCon + CloudNativeCon Europe. NetApp, an early adopter of OpenSearch following Elasticsearch's licensing change, now offers managed services on the platform and contributes actively to its development. Katona emphasized how neutral governance under the Linux Foundation has lowered barriers to enterprise contribution, noting a 56% increase in downloads since the transition and growing interest from developers. OpenSearch 3.0, featuring a Lucene 10 upgrade, promises faster search capabilities, especially relevant as data volumes surge. NetApp's ongoing investments include work on machine learning plugins and developer training resources. Katona sees the Linux Foundation's involvement as key to OpenSearch's long-term success, offering vendor-neutral governance and reassuring users seeking openness, performance, and scalability in data search and analytics.
Learn more from The New Stack about OpenSearch:
Report: OpenSearch Bests ElasticSearch at Vector Modeling
AWS Transfers OpenSearch to the Linux Foundation
OpenSearch: How the Project Went From Fork to Foundation
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
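Since OpenSearch's query engine is built on Lucene, a minimal indexing-and-search sketch shows the layer the Lucene 10 upgrade speeds up. This is an illustrative example against the plain Lucene Java API, assuming the lucene-core, lucene-analysis-common and lucene-queryparser modules are on the classpath; exact class and method names drift between major Lucene releases, so treat it as a sketch rather than version-exact code.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class LuceneHelloWorld {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory dir = new ByteBuffersDirectory(); // in-memory index, fine for a demo

        // Index a couple of tiny documents.
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            for (String text : new String[] {
                    "OpenSearch 3.0 ships a Lucene 10 upgrade",
                    "Lucene is a Java library for full-text search"}) {
                Document doc = new Document();
                doc.add(new TextField("body", text, Field.Store.YES));
                writer.addDocument(doc);
            }
        }

        // Search the "body" field and print scored hits.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new QueryParser("body", analyzer).parse("lucene");
            TopDocs hits = searcher.search(query, 10);
            for (ScoreDoc sd : hits.scoreDocs) {
                // storedFields() is the newer (Lucene 9.5+) way to load stored fields
                Document d = searcher.storedFields().document(sd.doc);
                System.out.printf("%.3f  %s%n", sd.score, d.get("body"));
            }
        }
    }
}
```

Swapping ByteBuffersDirectory for FSDirectory.open(path) persists the index on disk; everything else stays the same.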

Project Geospatial
FOSS4G NA 2024 - Searching the Spatial Data Lake: Bring GeoParquet to Apache Lucene - Wes Richardet

Oct 28, 2024 · 25:32


Wes Richardet's talk at FOSS4G NA 2024 focuses on improving search capabilities within spatial data lakes using GeoParquet and Apache Lucene. He discusses the evolution of data storage, the need for efficient search solutions, and the integration of different technologies to enhance performance.

Project Geospatial
FOSS4G NA 2024 - Optimized Geospatial Indexing for Hybrid Search and GeoAI - Nicholas Knize

Oct 28, 2024 · 31:23


Nicholas Knize discusses optimizing geospatial indexing and hybrid search using advanced data structures within the Lucene framework at FOSS4G NA 2024. He emphasizes reducing cloud infrastructure waste and improving geospatial data processing efficiency.

Software Huddle
Postgres for Search + Analytics with Philippe Noël

Jun 25, 2024 · 42:39


ParadeDB is Postgres for search and analytics. As Postgres continues to rise in popularity, the "Just Use Postgres" movement is getting stronger and stronger. Yet there are still things that standard Postgres doesn't do well, and advanced search and analytics functionality is near the top of the list. The ParadeDB team provides a pair of Postgres extensions. The first, pg_search, brings a more performant and full-featured search experience to Postgres. It uses Tantivy (think: Lucene, but in Rust) as the search engine and provides advanced ranking and querying functionality. The second, pg_lakehouse, allows you to perform large analytical queries over object store data. Together, these provide compelling new features wrapped in a familiar operational package. Philippe Noël is one of the founders of ParadeDB. In this episode, we talk about why these extensions were needed, why the "Just Use Postgres" movement exists, and where ParadeDB fits in your architecture.
Follow Philippe: https://x.com/philippemnoel
Follow Alex: https://x.com/alexbdebrie
Follow Sean: https://x.com/seanfalconer
Check out ParadeDB: https://www.paradedb.com/
Timestamps:
01:50:18 Intro
04:30:23 Where does search on Postgres fall down?
05:33:09 BM25 and TF-IDF
07:23:03 Postgres Tipping Point
10:05:08 Tantivy
11:50:14 Tantivy vs Lucene
13:07:06 vs ZomboDB
15:35:21 Just Use Postgres for Everything?
17:57:17 Developing a Postgres Extension
19:26:03 Arvid's Problem
20:27:08 Postgres and Log Data
23:28:01 Separate OLTP and Search Instances
28:32:01 Search Nodes vs OLTP Nodes
30:02:12 ParadeDB Analytics
35:27:05 Hosted Service
39:03:15 Stumbling upon the Idea
39:51:22 Community
41:01:15 Getting Started with ParadeDB
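The BM25 and TF-IDF chapter refers to the ranking functions used by engines such as Lucene and Tantivy. The sketch below computes plain BM25 scores over a toy in-memory corpus with the common default parameters k1 = 1.2 and b = 0.75. It illustrates the formula only; it is not how pg_search or Tantivy implement scoring internally, and the documents and tokenizer are made up.

```java
import java.util.Arrays;
import java.util.List;

/** Toy BM25 scorer over a tiny in-memory corpus (k1 = 1.2, b = 0.75, the usual defaults). */
public class Bm25Demo {
    static final double K1 = 1.2, B = 0.75;

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                tokenize("postgres full text search with bm25 ranking"),
                tokenize("tantivy is a rust search engine inspired by lucene"),
                tokenize("postgres analytics over parquet files"));
        List<String> query = tokenize("postgres search");

        double avgdl = docs.stream().mapToInt(List::size).average().orElse(1);
        for (int i = 0; i < docs.size(); i++) {
            System.out.printf("doc %d -> %.4f%n", i, score(query, docs.get(i), docs, avgdl));
        }
    }

    /** Sum of per-term BM25 contributions: idf * tf*(k1+1) / (tf + k1*(1 - b + b*|D|/avgdl)). */
    static double score(List<String> query, List<String> doc, List<List<String>> corpus, double avgdl) {
        double s = 0;
        for (String term : query) {
            long tf = doc.stream().filter(term::equals).count();          // term frequency in this doc
            long df = corpus.stream().filter(d -> d.contains(term)).count(); // docs containing the term
            double idf = Math.log(1 + (corpus.size() - df + 0.5) / (df + 0.5));
            s += idf * (tf * (K1 + 1)) / (tf + K1 * (1 - B + B * doc.size() / avgdl));
        }
        return s;
    }

    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }
}
```

The length normalization term is what separates BM25 from raw TF-IDF: longer documents need proportionally more matches to score as high as short ones.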

Programmers Quickie
Kibana KQL vs. Lucene

Jun 22, 2024 · 2:18


Developer Voices
Extending Postgres for High-Performance Analytics (with Philippe Noël)

May 22, 2024 · 67:33


PostgreSQL is an incredible general-purpose database, but it can't do everything. Every design decision is a tradeoff, and inevitably some of those tradeoffs get fundamentally baked into the way it's built. Take storage, for instance: Postgres tables are row-oriented, which is great for row-by-row access, but when it comes to analytics it can't compete with a dedicated OLAP database that uses column-oriented storage. Or can it? Joining me this week is Philippe Noël of ParadeDB, who's going to take us on a tour of Postgres' extension mechanism, from creating custom functions and indexes to Rust code that changes the way Postgres stores data on disk. In his journey to bring Elasticsearch's strengths to Postgres, he's gone all the way down to raw data files and back through the optimiser to teach a venerable old dog some new data-access tricks.
ParadeDB: https://paradedb.com
ParadeDB on Twitter: https://twitter.com/paradedb
ParadeDB on GitHub: https://github.com/paradedb/paradedb
pgrx (Postgres with Rust): https://github.com/pgcentralfoundation/pgrx
Tantivy (Rust FTS library): https://github.com/quickwit-oss/tantivy
PgMQ (Queues in Postgres): https://tembo.io/blog/introducing-pgmq
Apache DataFusion: https://datafusion.apache.org/
Lucene: https://lucene.apache.org/
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins
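The row-versus-column point comes down to memory layout: an aggregate over a single column touches far less memory when the values sit in one contiguous array than when they are scattered across full row records. The rough sketch below is not a proper benchmark and is not how Postgres or ParadeDB lay out pages on disk; it only illustrates the effect with made-up data and assumes Java 16+ for records.

```java
import java.util.Random;

/** Rough illustration: summing one "column" out of row objects vs. a contiguous array. */
public class RowVsColumn {
    record Row(long id, double amount, String country, String padding) {}

    public static void main(String[] args) {
        int n = 2_000_000;
        Random rnd = new Random(42);

        // Row-oriented: each record carries every field.
        Row[] rows = new Row[n];
        // Column-oriented: the "amount" column lives in one contiguous array.
        double[] amounts = new double[n];
        for (int i = 0; i < n; i++) {
            double amount = rnd.nextDouble();
            rows[i] = new Row(i, amount, "FR", "some unrelated payload ".repeat(4));
            amounts[i] = amount;
        }

        long t0 = System.nanoTime();
        double sumRows = 0;
        for (Row r : rows) sumRows += r.amount();  // chases a pointer per row, drags whole rows through the cache
        long t1 = System.nanoTime();
        double sumCols = 0;
        for (double a : amounts) sumCols += a;     // streams one dense primitive array
        long t2 = System.nanoTime();

        System.out.printf("row scan:    %.1f ms (sum=%.1f)%n", (t1 - t0) / 1e6, sumRows);
        System.out.printf("column scan: %.1f ms (sum=%.1f)%n", (t2 - t1) / 1e6, sumCols);
    }
}
```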

Can I get that software in blue?
Episode 34 | Shane Connelly | Vectara Head of Product | Semantic Search and RAG with LLMs

Feb 27, 2024 · 69:55


Episode #34 of "Can I get that software in blue?", a podcast by and for people engaged in technology sales. If you are in the technology presales, solution architecture, sales, support or professional services career paths, then this show is for you! If you want to get into building AI products, first go to school and learn about antenna design! At least, that's how our guest for episode 34, Shane Connelly, did it. Shane is a deep expert in the search and indexing space, having started his career at Autonomy working on early search indexing algorithms and setting up solutions for customers before and after the HP acquisition, and later leading product for the Elasticsearch side of the Elastic product suite. Now he's Head of Product at Vectara, building out the next generation of semantic search and retrieval-augmented generation platforms. In this episode we touch on benchmarks for gauging the relative performance of different search algorithms and how they apply to LLMs for doing things like preventing hallucinations in generative AI, what kinds of questions Elasticsearch customers were asking that led Shane to believe that vector-based algorithms were the future of next-generation semantic search, and why he believes Vectara is building the top-tier solution to these problems.
Our website: https://softwareinblue.com
Twitter: https://twitter.com/softwareinblue
LinkedIn: https://www.linkedin.com/showcase/softwareinblue
Make sure to subscribe or follow us to get notified about our upcoming episodes:
YouTube: https://www.youtube.com/channel/UC8qfPUKO_rPmtvuB4nV87rg
Apple Podcasts: https://podcasts.apple.com/us/podcast/can-i-get-that-software-in-blue/id1561899125
Spotify: https://open.spotify.com/show/25r9ckggqIv6rGU8ca0WP2
Links mentioned in the episode:
History of Lucene: https://www.elastic.co/celebrating-lucene
Attention Is All You Need: https://arxiv.org/abs/1706.03762
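Vector-based semantic search boils down to embedding text as float vectors and ranking by a similarity measure, most often cosine similarity. The sketch below computes it for made-up four-dimensional vectors; production systems obtain embeddings from a model and use approximate-nearest-neighbour indexes rather than comparing vectors pairwise, so this only illustrates the core measure.

```java
/** Cosine similarity between two embedding vectors (the numbers here are made up). */
public class CosineSimilarity {
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] queryEmbedding = {0.12f, 0.87f, 0.03f, 0.41f};
        float[] docEmbedding   = {0.10f, 0.80f, 0.05f, 0.43f};
        float[] offTopic       = {0.90f, 0.02f, 0.75f, 0.01f};
        // The semantically "closer" document gets a score near 1, the unrelated one much lower.
        System.out.printf("query vs doc:       %.3f%n", cosine(queryEmbedding, docEmbedding));
        System.out.printf("query vs off-topic: %.3f%n", cosine(queryEmbedding, offTopic));
    }
}
```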

Les Cast Codeurs Podcast
LCC 304 - Dark punk

Dec 18, 2023 · 99:41


In this episode, Katia, Arnaud and Emmanuel discuss the news of late 2023: gatherers in Java streams, exceptions, JavaScript on the JVM, vector search, cloud costs, Gemini, Llama and other fantastic beasts, plus plenty of nice tools to celebrate the end of the year. Recorded December 15, 2023. Episode download: LesCastCodeurs-Episode-304.mp3

News
Help Les Cast Codeurs and fill in a short form to guide us next year: https://lescastcodeurs.com/sondage

Languages
With JEP 461, Java 22 gets a preview of the notion of "gatherers" for streams. https://groovy.apache.org/blog/groovy-gatherers In this article, Paul King of the Groovy team shows and contrasts what Groovy has been able to do for years, such as sliding windows, and explains the gatherer approach with its intermediate operations. Gatherers are custom intermediate operations that take a state and the next element and decide what to do; they can even change the stream of downstream elements (publish new ones) via the integrate function, and some can combine intermediate results (to parallelize). Examples: fixed-size windows and sliding windows (a small sketch of the built-in window gatherers follows these show notes).
Joe Duffy, CEO of Pulumi, who previously worked at Microsoft on the Midori project (a rethought future OS), writes about the design of exceptions, errors and return codes. https://joeduffyblog.com/2016/02/07/the-error-model/ He compares error codes with checked and unchecked exceptions, separates bugs from expected errors (bugs should stop the process), recounts the history of unchecked exceptions and their problems, and of checked exceptions and why Java developers hate them (in his view). A long but interesting write-up of his experience, though I did not make it all the way to the end. :)
After Nashorn's removal from the JDK, you can turn to the Javet project. https://www.caoccao.com/Javet/index.html Javet embeds JavaScript with the V8 engine, and even Node.js outright. A great capability, since you get the two best engines; on the other hand, support outside x86 is more limited (ARM on Windows, for instance, is a no).

Libraries
Part of the Spring team is laid off after the Broadcom acquisition closes. https://x.com/odrotbohm/status/1729231722498425092?s=20 Little real information beyond this tweet, but the Broadcom acquisition does not seem to be happening in a land of care bears.
Marc Wrobel announces the release of JBanking 4.2.0. https://www.marcwrobel.fr/sortie-de-jbanking-4-2-0 Java 21 support, the ability to generate random BICs, improved IBAN generation. jbanking is a library for manipulating typical banking structures such as IBANs, BICs, currencies, SEPA and so on.
Hibernate Search 7 is out. https://in.relation.to/2023/12/05/hibernate-search-7-0-0-Final/ Support for Elasticsearch 8.10-11 and OpenSearch 2.10-11, rebased on Lucene 9.8, support for Amazon OpenSearch Serverless (experimental). Note the reduced feature set on Serverless, which is an API-first search cluster billed per request. Also linked: version 7.1 Alpha1.
Hibernate ORM 6.4 is out. https://in.relation.to/2023/11/23/orm-640-final/ Support for SoftDelete (a column marking deletion) and for vector operations (PostgreSQL support initially); vector functions are particularly used by AI/ML; dedicated JFR events.
Integration of Citrus and Quarkus for integration-testing lots of protocols and message formats. https://quarkus.io/blog/testing-quarkus-with-citrus/ It lets you test the expected inputs and outputs of messaging systems (HTTP, Kafka, mail servers, etc.), great for testing event-driven applications. Unrelated, but Quarkus 3.7 will target Java 17 (about 8% of people were using Java 11 in builds that enabled notifications).
Hibernate Search 7.1 (dev 7.1.0.Alpha1) ships the latest Lucene (9.8), and Infinispan adds support for vector search. https://hibernate.org/search/releases/7.1/ https://infinispan.org/blog/2023/12/13/infinispan-vector-search Hibernate Search now allows vector search; the latest version is integrated into Infinispan 15 (dev), which will be released soon. Vector search and vector storage turn Infinispan into an embedding store (LangChain).

Cloud
How to choose your cloud region. https://blog.scottlogic.com/2023/11/23/conscientious-cloud-pick-your-cloud-region-deliberately.html Not so simple: cost, the legal security of your data, the carbon footprint of the chosen region (France is great, Poland less so), latency versus where your customers are, and the services supported.

Web
Toward a standardization of webhooks? https://www.standardwebhooks.com/ People from Zapier, Twilio, Ngrok, Kong, Supabase and others are teaming up to try to standardize the webhook approach. The spec is open source (Apache) on GitHub: https://github.com/standard-webhooks/standard-webhooks/blob/main/spec/standard-webhooks.md The goals are security, reliability, interoperability, simplicity and (backward/forward) compatibility. Without a spec, every webhook behaves differently, so clients have to adapt to the semantics, the errors, the (meta-)structure of the payload, size limits, securing via signatures (e.g. HMAC), error reporting (via HTTP errors), and so on.

Data and Artificial Intelligence
Google announces Gemini, its new large language model. https://blog.google/technology/ai/google-gemini-ai/#sundar-note A multimodal model that takes text as input, but also images, sound and video; according to the benchmarks it is broadly as good as GPT-4. Several model sizes are available: Nano, to be embedded in phones; Pro, to be used in most cases; and Ultra, for the most advanced reasoning needs. Android will also add AICore libraries to use Gemini Nano on Pixel phones: https://android-developers.googleblog.com/2023/12/a-new-foundation-for-ai-on-android.html Gemini Pro will be available in Bard (in English and in 170 countries, but Europe will have to wait a little), and Gemini Ultra should also come to Bard in an extended version: https://blog.google/products/bard/google-bard-try-gemini-ai/ Gemini will be progressively integrated into many Google products. Google DeepMind on Gemini: https://deepmind.google/technologies/gemini/#introduction A 60-page report on Gemini: https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf Gemini also made it possible to develop a new version of the AlphaCode model, which excels in coding competitions: https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf A playlist of short YouTube videos with interviews and demos of Gemini's capabilities: https://www.youtube.com/playlist?list=PL590L5WQmH8cSyqzo1PwQVUrZYgLcGZcG Unfortunately, some of the announcements were a bit misleading, which cast (undeserved?) discredit on Gemini: for example, the "aspirational" video was presented as real footage but was not, Ultra is not yet available, and the comparison with ChatGPT on the page (initially at least) compared apples and oranges, even though the research paper was correct.
With the release of Gemini, Guillaume wrote about how to call Gemini from Java. https://glaforge.dev/posts/2023/12/13/get-started-with-gemini-in-java/ Gemini is multimodal, so you can pass text as well as images, or even video, and there is a Java SDK to interact with the Gemini API.
Facebook: Purple Llama. https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai/ Open source: https://ai.meta.com/llama/ In the spirit of open GenAI models, Facebook provides tools for making AI responsible (but not guilty ;) ), notably benchmarks to evaluate safety and a safety classifier, for example to avoid generating malicious code (or make it harder). Purple Llama will be an umbrella project. Meta, IBM, Red Hat and many others have also announced the AI Alliance for open, collaborative AI between academia and industry. Notably absent: Google, OpenAI (not open) and Microsoft. Just an announcement for now; we will see what the AI Alliance players actually deliver. There is also a user guide on responsible AI use (not read).
Apple is also getting into machine-learning libraries. https://ml-explore.github.io/mlx/build/html/index.html MLX is a Python library strongly inspired by NumPy, PyTorch, Jax and ArrayFire, developed specifically for Macs to take maximum advantage of Apple Silicon processors. One of the GitHub repos also has examples running Llama, Mistral and other models natively on macOS: https://github.com/ml-explore/mlx-examples Not only Apple Silicon but also the unified CPU/GPU memory is one of the key reasons for the Macs' speed.
Running Java in a Jupyter notebook. https://www.javaadvent.com/2023/12/jupyter-notebooks-and-java.html Max Andersen explores using Java in Jupyter notebooks instead of the classic Python. There are Java kernels for various needs, but they must be installed in the Jupyter distribution you use, and that is where JBang (installable via pip) comes to the rescue: it installs those kernels automatically in a few lines.

Tooling
Sfeir lists developer-oriented games. https://www.sfeir.dev/tendances/notre-selection-de-jeux-de-programmation/ Perfect for Christmas, for those who want to keep challenging their brain after work: logic games, puzzle games with code as the medium, a game about machine learning, assembly programming games.
Advent calendars are popular with developers, especially Advent of Code. https://adventofcode.com/ There is also the Java Advent https://www.javaadvent.com/ , a calendar for learning the basics of SVG https://svg-tutorial.com/ , the "HTML hell" calendar https://www.htmhell.dev/adventcalendar/ which covers accessibility, web components, meta tags, and all the things you can do perfectly well in HTML/CSS without needing JavaScript, and for TypeScript developers there is an Advent calendar too: https://typehero.dev/aot-2023
A great thread by Clara Dealberto on the theme of "dataviz" (data visualization). https://twitter.com/claradealberto/status/1729447130228457514 Many freely accessible tools are mentioned for all sorts of visualizations (e.g. treemaps, dendrograms, Sankey diagrams) and also for cartography, along with a few sites that give advice on choosing the right type of visualization for your problem and data, notably the Financial Times one, which fits on a single PDF page. In short, it is great but long to read.
A small list of nice tools: jc, to convert the output of Unix commands to JSON https://github.com/kellyjonbrazil/jc ; AltTab for macOS, to get the same window-switching behaviour as on Windows https://alt-tab-macos.netlify.app/ ; gron, to make JSON grep-able by turning each value into a line resembling a JSONPath https://github.com/tomnomnom/gron ; Marker, in Python, to turn PDFs into nice Markdown https://github.com/VikParuchuri/marker ; n8n, an open source workflow tool https://n8n.io/ . gron basically shows lines of assignments like jsonpath = value, and you can "ungron" afterwards to get back to JSON. Marker uses machine learning but hallucinates less than Nougat (how reassuring).
Docker acquires Testcontainers. https://techcrunch.com/2023/12/11/docker-acquires-atomicjar-a-testing-startup-that-raised-25m-in-january/ Announcement by AtomicJar: https://www.atomicjar.com/2023/12/atomicjar-is-now-part-of-docker/ Announcement by Docker: https://www.docker.com/blog/docker-whale-comes-atomicjar-maker-of-testcontainers/

Architecture
How to implement song recognition, like Shazam. https://www.cameronmacleod.com/blog/how-does-shazam-work First move to the frequency domain with Fourier transforms to obtain spectrograms, then create a kind of fingerprint that collects notable frequency peaks at various points of the song, and associate those peaks to find a matching sequence of frequency peaks over time. The author shared his implementation on GitHub: https://github.com/notexactlyawe/abracadabra/blob/e0eb59a944d7c9999ff8a4bc53f5cfdeb07b39aa/abracadabra/recognise.py#L80 There was also a very good talk on this topic by Moustapha Agack at DevFest Toulouse: https://www.youtube.com/watch?v=2i4nstFJRXU The associated peaks are hashes that can be compared, and the more hashes match, the more similar the songs are.

Methodologies
A memo from ThoughtWorks about AI-assisted coding. https://martinfowler.com/articles/exploring-gen-ai.html#memo-08 It comes with a whole list of questions to ask yourself when using a tool like Copilot. Keep in mind that, unfortunately, an AI is not right 100% of the time, and not even half the time, so adjust your expectations accordingly: it is not magic. The conclusion is interesting too, suggesting that roughly in 40 to 60% of situations you can get to 40 to 80% of the solution. Is that the level at which you can really save time and trust the AI? Do not spend too much time trying to convince the AI to do what you want, either: if you cannot, it is probably because the AI will not manage it at all, so beyond 10 minutes, go read the docs, search Google, and so on. Notably, having the AI generate the tests right after the code increases the risks, especially if you are not able to review the code carefully; if it introduces a choice of pattern (flexbox in CSS, say), or if it is a security matter, verify it (belt and braces). Is it last week's framework? The information will not be in the LLM (without RAG).
What capabilities are needed to deploy an AI/ML project? https://blog.scottlogic.com/2023/11/22/capabilities-to-deploy-ai-in-your-organisation.html This is MLOps, and there are a few end-to-end models (Google, IBM), but given the diversity of organizations it is hard to embrace those complete versions. MLOps is a job and data science is a job, so integrate those skills; know how to manage your data catalogue; build a process to test your models, and do it continually; and develop a research culture and its management (like a financial portfolio: accept stopping experiments, etc.). The research culture is not very present in engineering, which is about building things that work. This is a pre-LLM world.
Do you know the 10 dark patterns of UX? To nudge you into clicking here or there, keep you on the site, and more. https://dodonut.com/blog/10-dark-patterns-in-ux-design/ Among the dark patterns covered: confirmshaming, fake urgency and the fear of missing out, nagging, sneaking, disguised ads, intentional misdirection, the roach motel pattern, preselection, friend spam, and negative option billing or forced continuity. The article concludes with a few leads on how to avoid these dark patterns: look at the good patterns of the competition, test UX interactions, and apply plenty of common sense! Dark patterns are not accidents; they rely on psychology and are put in place deliberately.
How to choose beautiful colours for data visualization? https://blog.datawrapper.de/beautifulcolors/ Rather than thinking in RGB, it is better to work in Hue/Saturation/Brightness. Plenty of examples show how to improve certain colour choices: avoid colours that are too pure or too bright and saturated, keep good contrast, and think of colour-blind readers! I have personally always struggled with saturation versus brightness. Making sure the colours stay distinguishable in black and white (by adjusting the brightness of each colour) helps colour-blind readers; avoid colours from the four corners of the wheel and prefer complementary (close) colours; red, orange and (unsaturated) yellow plus variations of blue work well; saturated colours are aggressive and stress people.
Why should you become an engineering manager? https://charity.wtf/2023/12/15/why-should-you-or-anyone-become-an-engineering-manager/ The article discusses how the perception of engineering management has evolved: it is no longer the default career choice for ambitious engineers. It highlights the challenges engineering managers face, including growing expectations of empathy, support and technical skill, as well as the impact of the COVID-19 pandemic on the appeal of management positions. The importance of good engineering managers is emphasized, since they are seen as force multipliers for teams, contributing significantly to productivity, quality and overall success in complex organizational environments. The article gives reasons someone might consider becoming an engineering manager, including gaining a better understanding of how companies work, contributing to mentoring, and influencing positive change in team dynamics and industry practices. One perspective suggests that becoming an engineering manager can lead to personal growth and to improved life skills such as self-regulation, self-awareness, understanding others, setting boundaries, sensitivity to power dynamics, and mastering difficult conversations. The article encourages seeing management as an opportunity to develop and carry those skills for life.

Security
LogoFAIL, a boot-time flaw affecting many machines. https://arstechnica.com/security/2023/12/just-about-every-windows-and-linux-device-vulnerable-to-new-logofail-firmware-attack/ In short, changing the images shown at boot allows arbitrary code execution at the very beginning of UEFI securing (the most widely used boot process), so it is game over, because it starts before the OS. It is not remotely exploitable: you already need to be on the machine with fairly elevated privileges, but it can be the end of an attack chain, and as usual an image parser is the cause of these vulnerabilities.

Conferences
AI to the rescue of tech conferences: adding fake female tech speaker profiles to the programme to pass the online diversity check. https://twitter.com/GergelyOrosz/status/1728177708608450705 https://www.theregister.com/2023/11/28/devternity_conference_fake_speakers/ https://www.developpez.com/actu/351260/La-conference-DevTernity-sur-la-technologie-s-e[…]s-avoir-cree-de-fausses-oratrices-generees-automatiquement/ I had read the tweet from the conference's creator explaining that these were test accounts which, in the rush, they forgot to remove, but the test accounts apparently have "active" profiles on social networks, so it was carefully orchestrated. In the end, many speakers and sponsors are pulling out.
The list of conferences, from the Developers Conferences Agenda/List by Aurélie Vache and contributors:
January 31-February 3, 2024: SnowCamp - Grenoble (France)
February 1, 2024: AgiLeMans - Le Mans (France)
February 6, 2024: DevFest Paris - Paris (France)
February 8-9, 2024: Touraine Tech - Tours (France)
February 15-16, 2024: Scala.IO - Nantes (France)
March 6-7, 2024: FlowCon 2024 - Paris (France)
March 14-15, 2024: pgDayParis - Paris (France)
March 19, 2024: AppDeveloperCon - Paris (France)
March 19, 2024: ArgoCon - Paris (France)
March 19, 2024: BackstageCon - Paris (France)
March 19, 2024: Cilium + eBPF Day - Paris (France)
March 19, 2024: Cloud Native AI Day Europe - Paris (France)
March 19, 2024: Cloud Native Wasm Day Europe - Paris (France)
March 19, 2024: Data on Kubernetes Day - Paris (France)
March 19, 2024: Istio Day Europe - Paris (France)
March 19, 2024: Kubeflow Summit Europe - Paris (France)
March 19, 2024: Kubernetes on Edge Day Europe - Paris (France)
March 19, 2024: Multi-Tenancy Con - Paris (France)
March 19, 2024: Observability Day Europe - Paris (France)
March 19, 2024: OpenTofu Day Europe - Paris (France)
March 19, 2024: Platform Engineering Day - Paris (France)
March 19, 2024: ThanosCon Europe - Paris (France)
March 19-21, 2024: IT & Cybersecurity Meetings - Paris (France)
March 19-22, 2024: KubeCon + CloudNativeCon Europe 2024 - Paris (France)
March 26-28, 2024: Forum INCYBER Europe - Lille (France)
March 28-29, 2024: SymfonyLive Paris 2024 - Paris (France)
April 4-6, 2024: Toulouse Hacking Convention - Toulouse (France)
April 17-19, 2024: Devoxx France - Paris (France)
April 18-20, 2024: Devoxx Greece - Athens (Greece)
April 25-26, 2024: MiXiT - Lyon (France)
April 25-26, 2024: Android Makers - Paris (France)
May 8-10, 2024: Devoxx UK - London (UK)
May 16-17, 2024: Newcrafts Paris - Paris (France)
May 24, 2024: AFUP Day Nancy - Nancy (France)
May 24, 2024: AFUP Day Poitiers - Poitiers (France)
May 24, 2024: AFUP Day Lille - Lille (France)
May 24, 2024: AFUP Day Lyon - Lyon (France)
June 2, 2024: PolyCloud - Montpellier (France)
June 6-7, 2024: DevFest Lille - Lille (France)
June 6-7, 2024: Alpes Craft - Grenoble (France)
June 27-28, 2024: Agi Lille - Lille (France)
July 4-5, 2024: Sunny Tech - Montpellier (France)
September 19-20, 2024: API Platform Conference - Lille (France) & Online
October 7-11, 2024: Devoxx Belgium - Antwerp (Belgium)
October 10-11, 2024: Volcamp - Clermont-Ferrand (France)
October 10-11, 2024: Forum PHP - Marne-la-Vallée (France)
October 17-18, 2024: DevFest Nantes - Nantes (France)

Contact us
To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs or contact us via Twitter https://twitter.com/lescastcodeurs . Record a crowdcast or a crowdquestion. Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs . All the episodes and all the info at https://lescastcodeurs.com/
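As mentioned in the gatherer item above, fixed and sliding windows are the canonical examples of the new stream gatherers. Below is a minimal sketch using the built-in Gatherers factory; it assumes Java 22 with preview features enabled (JEP 461), and the API has since been finalized in newer JDK releases.

```java
import java.util.List;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

/** Sliding and fixed windows with Stream gatherers (preview in Java 22 under JEP 461). */
public class GathererWindows {
    public static void main(String[] args) {
        // Sliding windows of size 3: [1,2,3], [2,3,4], [3,4,5], [4,5,6]
        List<List<Integer>> sliding = Stream.of(1, 2, 3, 4, 5, 6)
                .gather(Gatherers.windowSliding(3))
                .toList();

        // Fixed windows of size 2: [1,2], [3,4], [5,6]
        List<List<Integer>> fixed = Stream.of(1, 2, 3, 4, 5, 6)
                .gather(Gatherers.windowFixed(2))
                .toList();

        System.out.println(sliding);
        System.out.println(fixed);
    }
}
```

On Java 22/23 this needs to be compiled and run with --enable-preview; custom gatherers can be built with Gatherer.ofSequential and an integrator, which is the "integrate" function the notes refer to.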

Kodsnack in English
Kodsnack 536 - I choose computer science, with Michele Riva

Aug 1, 2023 · 49:03


Recorded at the Øredev 2022 developer conference, Fredrik chats with Michele Riva about writing a full-text search engine, maintaining 8% of all Node modules, going to one conference per week, refactoring, the value of a good algorithm, and a lot more. Michele highly recommends writing a full-text search engine. He created Lyra - later renamed Orama - and encourages writing your own in order to demystify the subject. Since the podcast was recorded, Michele has left his then employer Nearform and founded Oramasearch to focus on the search engine full time. We also discuss working for product companies versus consulting, versus open source. It's more about differences between companies than anything else. Open source teaches you to deal with more and more different people. Writing code is never just writing code. Should we worry about taking on too many dependencies? Michele is in favour of not fearing dependencies, but of ensuring you understand how the parts that are important for your application work. Writing books is never convenient, but it can open many doors. When it comes to learning, there are areas where a whole level of tutorials is missing - where there are only really surface-level tutorials and perhaps deep papers, but nothing in between. Michele works quite a bit on bridging such gaps through his presentations. Thank you Cloudnet for sponsoring our VPS! Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.
Links: Michele; Michele's Øredev 2023 presentations; Nearform; TC39 - the committee which evolves JavaScript as a language; Matteo Collina - worked at Nearform, works with the Node technical steering committee; Lyra - the full-text search engine, since renamed Orama; Lucene; Solr; Elasticsearch; radix tree; prefix tree; inverted index; Thoughtworks; McKinsey; Daniel Stenberg; curl; Deno; Express; Fastify; Turbopack; Turborepo from Vercel; Vercel; fast queue; refactoring; Michele's refactoring talk; Real-world Next.js - Michele's book; Next.js; multitenancy; Create React App; Nuxt; Vue; SvelteKit; TF-IDF - "term frequency-inverse document frequency"; cosine similarity; Michele's talk on building Lyra; Explaining distributed systems like I'm five; Are all programming languages in English?; 4th Dimension; Prolog; Velato - a programming language using MIDI files as source code
Titles: For foreign people, it's Mitch; That kind of maintenance; A very particular company; A culture around open source software; Now part of the 8%; Nothing more than a radix tree; One simple and common API; Multiple ways of doing consultancy; What you're doing is hidden; You can't expect to change people; A problem we definitely created ourselves; Math or magic; Writing books is never convenient; Good for 90% of the use cases; (When I can choose,) I choose computer science
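The search-engine internals Michele demystifies (Lucene, Solr, Orama) all revolve around an inverted index: a map from each term to the documents that contain it. Here is a toy sketch in Java; it skips everything a real engine adds (stemming, ranking such as TF-IDF, a radix-tree term dictionary, compression) and exists only to make the core idea concrete.

```java
import java.util.*;

/** Toy inverted index: term -> set of document ids. No stemming, ranking, or compression. */
public class TinyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    void add(String text) {
        int docId = docs.size();
        docs.add(text);
        for (String term : text.toLowerCase().split("\\W+")) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    /** AND query: the documents containing every query term. */
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : query.toLowerCase().split("\\W+")) {
            Set<Integer> hits = postings.getOrDefault(term, Set.of());
            if (result == null) result = new TreeSet<>(hits);
            else result.retainAll(hits);
        }
        return result == null ? Set.of() : result;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lyra was renamed Orama");
        index.add("Lucene powers Solr and Elasticsearch");
        index.add("writing a search engine demystifies search");
        System.out.println(index.search("search engine")); // [2]
    }
}
```

A real term dictionary would typically be a radix or prefix tree (as the links above hint), and the postings would carry term frequencies so that TF-IDF or cosine similarity can rank the results.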

Les Cast Codeurs Podcast
LCC 298 - AI in every flavour

Jul 24, 2023 · 103:52


In this summer episode, Guillaume, Emmanuel and Arnaud go through the news of early summer: Java, Rust and Go on the languages side, Micronaut and Quarkus for frameworks, but also WebGPU, agility, DDD, surveys, plenty of tools, and above all artificial intelligence in every flavour (in databases, in cars, and more). Recorded July 21, 2023. Episode download: LesCastCodeurs-Episode-298.mp3

News

Languages
The Go 1.21 release candidate supports WASM and WASI natively. https://go.dev/blog/go1.21rc
StringBuilder or String concatenation? https://reneschwietzke.de/java/the-stringbuilder-advise-is-dead-or-isnt-it.html StringBuilder used to be the recommendation because, among other things, it created fewer objects, but the JVM has evolved and the compiler or JIT replaces concatenation with efficient code. A few exceptions remain: cold code (e.g. startup time) that is still interpreted can benefit from StringBuilder; another case is concatenation inside loops, where the JIT may not be able to optimize; a "fluent" StringBuilder is more efficient (inlined?); and these rules do not change when objects are stringified to be concatenated.
GPT-4 is not a revolution. https://thealgorithmicbridge.substack.com/p/gpt-4s-secret-has-been-revealed Rumours and lots of secrecy: not a 1-trillion-parameter model but 8 models of around 220 billion parameters cleverly combined. Researchers were expecting a breakthrough, but it is an evolution and not particularly new; the method had already been implemented by researchers at Google (now at OpenAI). They delayed the competition with these breakthrough rumours, but 8 LLaMAs together might be able to rival GPT-4.
The Google Open Source blog has an article on 5 myths (or not) about learning and using Rust. https://opensource.googleblog.com/2023/06/rust-fact-vs-fiction-5-insights-from-googles-rust-journey-2022.html "It takes more than 6 months to learn Rust": mostly false, a few weeks to 3-4 months at most. "The Rust compiler is not as fast as we would like": true! "Unsafe code and interop are the biggest challenges": false, it is rather macros, ownership/borrowing and asynchronous programming. "Rust provides great compile error messages": true. "Rust code is high quality": true.
InfoQ publishes a new guide on pattern matching for Java's switch. https://www.infoq.com/articles/pattern-matching-for-switch/ Pattern matching supports all reference types; the article covers the null case, "guarded" patterns with the when keyword, the importance of case ordering, using pattern matching with a switch's default, the scope of pattern variables, one pattern per case label, a single match-all case per switch block, exhaustiveness of type coverage, the use of generics, and error handling with MatchException. (A short example of these features follows these show notes.)

Libraries
Micronaut 4 is released. https://micronaut.io/2023/07/14/micronaut-framework-4-0-0-released/ Minimum languages: Java 17, Groovy 4 and Kotlin 1.8; support for the latest GraalVM; use of the GraalVM Reachability Metadata Repository to ease Native Image usage; Gradle 8; a new expression language, evaluated at compile time, not possible at runtime (for security and ahead-of-time-compilation reasons); virtual thread support; a new HTTP layer, eliminating reactive stack frames when not using reactive mode; experimental support for io_uring and HTTP/3; annotation-based filters; the HTTP client now uses the Java HTTP Client; generation of Micronaut clients and servers from OpenAPI files; YAML support no longer uses the SnakeYAML dependency (which had security issues); the transition to Jakarta is finished; and plenty of other module updates. InfoQ coverage: https://www.infoq.com/news/2023/07/micronaut-brings-virtual-thread/
Quarkus 3.2 and LTS. https://quarkus.io/blog/quarkus-3-2-0-final-released/ https://quarkus.io/blog/quarkus-3-1-0-final-released/ https://quarkus.io/blog/lts-releases/

Infrastructure
Red Hat now shares the sources of its distribution through its Customer Portal, and impacts the community that builds on them. https://almalinux.org/blog/impact-of-rhel-changes/ Red Hat announced another massive change affecting all rebuilds and forks of Red Hat Enterprise Linux: going forward, Red Hat will publish the source code for RHEL RPMs only behind its customer portal. Since all RHEL clones depend on the published sources, this once again disrupts the entire Red Hat ecosystem.
An analysis of Red Hat's choice on distributing the RHEL source code: https://dissociatedpress.net/2023/06/24/red-hat-and-the-clone-wars/ A response from Red Hat to the fires started by the announcement that RHEL sources would no longer be distributed publicly: https://www.redhat.com/en/blog/red-hats-commitment-open-source-response-gitcentosorg-changes And a link to one of those fires, from a prominent person in the Ansible community: https://www.jeffgeerling.com/blog/2023/im-done-red-hat-enterprise-linux
Oracle asks to keep Linux open and free. https://www.oracle.com/news/announcement/blog/keep-linux-open-and-free-2023-07-10/ Following the IBM/Red Hat announcement, Oracle calls for keeping Linux open and free. IBM does not want to publish the RHEL code because it has to pay its engineers, whereas Red Hat was able to sustain its business model for years. The article goes back over CentOS, which IBM "killed" in 2020. Oracle continues its efforts to keep Linux open and free; Oracle Linux will remain compatible with RHEL up to version 9.2, after which compatibility will be hard to maintain; Oracle is hiring Linux developers; Oracle invites IBM to take Oracle's downstream and distribute it.
SUSE forks RHEL. https://www.suse.com/news/SUSE-Preserves-Choice-in-Enterprise-Linux/ SUSE is the company behind Rancher, NeuVector and SUSE Linux Enterprise (SLE). It announces a fork of RHEL, a $10M investment in the project over the coming years, and assured compatibility with RHEL and CentOS.

Web
Google sells its domain-name service to Squarespace. https://www.reddit.com/r/webdev/comments/14agag3/squarespace_acquires_google_domains/ And it was not free, so we were not supposed to be the product ;) Squarespace is an American company specializing in website building and has long been a Google Workspace reseller. The sale should close in Q3 2023.
A short introduction to WebGPU, in French: https://blog.octo.com/connaissez-vous-webgpu/

Data
With the rise of large language models, vector databases are increasingly discussed, for storing "embeddings" (vectors of floating-point numbers semantically representing text, or even images). An article explains that vectors are the new JSON in relational databases like PostgreSQL: https://jkatz05.com/post/postgres/vectors-json-postgresql/ It covers in particular pgvector, a PostgreSQL extension that adds vectors as a column type: https://github.com/pgvector/pgvector Google Cloud announces the integration of this vector extension into Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL: https://cloud.google.com/blog/products/databases/announcing-vector-support-in-postgresql-services-to-power-ai-enabled-applications There is also a video, a Colab notebook, and a more technically detailed article using LangChain: https://cloud.google.com/blog/products/databases/using-pgvector-llms-and-langchain-with-google-cloud-databases Meanwhile, Elastic is improving Lucene to use SIMD instructions to accelerate vector computations (dot product, Euclidean distance, cosine similarity): https://www.elastic.co/fr/blog/accelerating-vector-search-simd-instructions

Tooling
The 2023 Stack Overflow survey. https://survey.stackoverflow.co/2023/ The survey covered 90,000 developers in 185 countries. More developers (+2%) than last year work on site (16% on site, 41% remote, 42% hybrid). Developers are also increasingly using AI tools, with 70% saying they use them (44%) or plan to use them (25%) in their work. The most popular programming languages are still JavaScript, Python and HTML/CSS; the most popular web frameworks are Node, React and jQuery; the most popular databases are PostgreSQL, MySQL and SQLite; the most popular operating systems are Windows, then macOS and Linux; the most popular IDEs are Visual Studio Code, Visual Studio and IntelliJ IDEA.
The different kinds of motion in Vim: https://www.barbarianmeetscoding.com/boost-your-coding-fu-with-vscode-and-vim/moving-blazingly-fast-with-the-core-vim-motions/
JetBrains also joins the trend of AI assistants in the IDE. https://blog.jetbrains.com/idea/2023/06/ai-assistant-in-jetbrains-ides/ An integration with OpenAI, but also smaller JetBrains-specific LLMs; an integrated chat to talk with the assistant, with the ability to insert code snippets at the cursor; you can select code and ask the assistant to explain what it does, suggest a refactoring, or fix potential problems; it can generate the JavaDoc of a method or class, suggest a method name (based on its content), and generate commit messages. A JetBrains AI account is required for access.
More or less well-known macOS commands. https://saurabhs.org/advanced-macos-commands caffeinate, to keep the Mac awake; pbcopy / pbpaste, to interact with the clipboard; networkQuality, to measure internet speed; sips, to manipulate or resize images; textutil, to convert Word, text and HTML files; screencapture, to take a screenshot; say, to give your commands a voice.
The ArgoCD community survey: https://blog.argoproj.io/cncf-argo-cd-rollouts-2023-user-survey-results-514aa21c21df
An open-source, cross-platform API client for GraphQL, REST, WebSockets, server-sent events and gRPC: https://github.com/Kong/insomnia

Architecture
Modernizing architecture with Domain-Driven Discovery. https://www.infoq.com/articles/architecture-modernization-domain-driven-discovery/?utm_source=twitter&utm_medium=link&utm_campaign=calendar A very detailed article on modernizing your architecture using a Domain-Driven Discovery approach in five steps: frame the problem (clarify the problem you are solving, the people affected, the desired outcomes and the solution constraints); analyze the current state (explore the existing business processes and system architecture to establish a baseline for improvement); explore the future state (design a modernized architecture based on bounded contexts, set strategic priorities, evaluate options and create future-state solutions); and create a roadmap (build a plan to modernize the architecture over time according to workflows or desired outcomes).
Recently, Sfeir launched its development blog at https://www.sfeir.dev/ Plenty of technical articles on many topics (front end, back end, cloud, data, AI/ML, mobile), plus trends and success stories; recent articles cover Alan Turing, Local Storage in JavaScript, preparing React certifications, and the impact of cybersecurity on the cloud.
Demis Hassabis says he is working on an AI named Gemini that will surpass ChatGPT. https://www.wired.com/story/google-deepmind-demis-hassabis-chatgpt/ Demis Hassabis, CEO of Google DeepMind and creator of AlphaGo and AlphaFold, is working on an AI named Gemini that would surpass OpenAI's ChatGPT; similar to GPT-4 but with techniques coming from AlphaGo; still in development, it will take several more months. A replacement for Bard?

Methodologies
Approaching agility through individuals' past (development) traumas. https://www.infoq.com/articles/trauma-informed-agile/?utm_campaign=infoq_content&utm_source=twitter&utm_medium=feed&utm_term=culture-methods We all carry development trauma that makes it hard to collaborate with others, a crucial part of working in agile software development. Leading in a trauma-informed way is not practicing unsolicited psychotherapy, and does not excuse destructive behaviour without addressing it. Being more trauma-aware in your leadership can help everyone act more maturely and be more cognitively available, especially in emotionally difficult situations. In trauma-informed workplaces, people pay more attention to their physical and emotional state; they also rely more on the power of intention, set goals in a less manipulative way, and are able to be empathetic without taking on other people's problems.

Law, society and organizations
Mercedes is adding artificial intelligence to its cars. https://azure.microsoft.com/en-us/blog/mercedes-benz-enhances-drivers-experience-with-azure-openai-service/ A three-month beta test program for now, with the "Hey Mercedes" voice assistant. It lets you talk to the car to find your way, put together a recipe, or simply have a conversation; they are working on plugins to book a restaurant or buy cinema tickets.
Free software vs. open source in the context of artificial intelligence, by Sacha Labourey. https://medium.com/@sachalabourey/ai-free-software-is-essential-to-save-humanity-86b08c3d4777 There is a lot of talk about AI and open source, but the dimension of end-user control is missing. Stallman created the FSF out of fear of the notion of humans augmented by software controlled by others (brain implants and the like), hence the GPL and its virality, which propagates the ability to see and modify the code you run. In the AI debate, it is not only open source (breaking the oligopoly) that is at stake, but also free software.
The madness of the European Cyber Resilience Act (CRA). https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act Within the EU, the Cyber Resilience Act (CRA) is now making its way through the legislative processes (with a key vote on July 19, 2023). This law will apply to a wide range of software (and hardware with embedded software) in the EU. The intention of the regulation is good (and arguably long overdue): make software much more secure. But the CRA takes a binary yes/no approach and treats everyone the same. It would regulate open source projects unless they have "a fully decentralized development model", yet OSS models are complex mixes of pure OSS and software vendors; commercial companies and open source projects would have to be much more careful about which participants can work on the code, what funding they take, and which patches they can accept. Some of the obligations are practically impossible to meet, for example the obligation to "deliver a product without known exploitable vulnerabilities". The CRA requires disclosure of serious unpatched and exploited vulnerabilities to ENISA (an EU institution) within a period measured in hours, before they are fixed (completely opposed to good security practice). Once again, a good idea at the origin but very badly implemented, which risks doing a lot of damage.
Octave Klaba, with his brother Miro and the Caisse des Dépôts, is finalizing the creation of Synfonium, which will now acquire 100% of Qwant and 100% of Shadow. Synfonium is 75% owned by Jezby Venture & Deep Code and 25% by the CDC. https://twitter.com/i/web/status/1673555414938427392 One of Synfonium's roles is to build the critical mass of B2C and B2B users and customers who will be able to use all these free and paid services: the search engine, the free services, the collaborative suite, social login, and partners' tech services. The goal is to create an EU SaaS cloud platform that respects our European values and laws.
Yann LeCun: "Artificial intelligence will amplify human intelligence." https://www.europe1.fr/emissions/linterview-politique-dimitri-pavlenko/yann-lecun-li[…]gence-artificielle-va-amplifier-lintelligence-humaine-4189120

Conferences
The list of conferences, from the Developers Conferences Agenda/List by Aurélie Vache and contributors:
September 2-3, 2023: SRE France SummerCamp - Chambéry (France)
September 6, 2023: Cloud Alpes - Lyon (France)
September 8, 2023: JUG Summer Camp - La Rochelle (France)
September 14, 2023: Cloud Sud - Remote / Toulouse (France)
September 18, 2023: Agile Tour Montpellier - Montpellier (France)
September 19-20, 2023: Agile en Seine - Paris (France)
September 19, 2023: Salon de la Data Nantes - Nantes (France) & Online
September 21-22, 2023: API Platform Conference - Lille (France) & Online
September 22, 2023: Agile Tour Sophia Antipolis - Valbonne (France)
September 25-26, 2023: BIG DATA & AI PARIS 2023 - Paris (France)
September 28-30, 2023: Paris Web - Paris (France)
October 2-6, 2023: Devoxx Belgium - Antwerp (Belgium)
October 6, 2023: DevFest Perros-Guirec - Perros-Guirec (France)
October 10, 2023: ParisTestConf - Paris (France)
October 11-13, 2023: Devoxx Morocco - Agadir (Morocco)
October 12, 2023: Cloud Nord - Lille (France)
October 12-13, 2023: Volcamp 2023 - Clermont-Ferrand (France)
October 12-13, 2023: Forum PHP 2023 - Marne-la-Vallée (France)
October 19-20, 2023: DevFest Nantes - Nantes (France)
October 19-20, 2023: Agile Tour Rennes - Rennes (France)
October 26, 2023: Codeurs en Seine - Rouen (France)
October 25-27, 2023: ScalaIO - Paris (France)
October 26-27, 2023: Agile Tour Bordeaux - Bordeaux (France)
October 26-29, 2023: SoCraTes-FR - Orange (France)
November 10, 2023: BDX I/O - Bordeaux (France)
November 15, 2023: DevFest Strasbourg - Strasbourg (France)
November 16, 2023: DevFest Toulouse - Toulouse (France)
November 23, 2023: DevOps D-Day #8 - Marseille (France)
November 30, 2023: PrestaShop Developer Conference - Paris (France)
November 30, 2023: WHO run the Tech - Rennes (France)
December 6-7, 2023: Open Source Experience - Paris (France)
December 7, 2023: Agile Tour Aix-Marseille - Gardanne (France)
December 8, 2023: DevFest Dijon - Dijon (France)
December 7-8, 2023: TechRocks Summit - Paris (France)

Contact us
To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs or contact us via Twitter https://twitter.com/lescastcodeurs . Record a crowdcast or a crowdquestion. Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs . All the episodes and all the info at https://lescastcodeurs.com/
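For the InfoQ item on pattern matching for switch mentioned above, here is a compact sketch covering the features the guide lists: type patterns, record deconstruction, a guarded pattern with when, an explicit null case, case ordering, and exhaustiveness over a sealed hierarchy. It assumes Java 21, where these features are final.

```java
/** Pattern matching for switch: type patterns, guards, null handling, exhaustiveness (Java 21). */
public class PatternMatchingSwitch {
    sealed interface Shape permits Circle, Rectangle {}
    record Circle(double radius) implements Shape {}
    record Rectangle(double width, double height) implements Shape {}

    static String describe(Object obj) {
        return switch (obj) {
            case null                          -> "nothing";                // explicit null case
            case Integer i when i > 100        -> "big number " + i;        // guarded pattern
            case Integer i                     -> "number " + i;            // order matters: after the guard
            case Circle(double r)              -> "circle of radius " + r;  // record deconstruction
            case Rectangle(double w, double h) -> "rectangle " + w + "x" + h;
            default                            -> "something else";
        };
    }

    // Over a sealed type the compiler can check exhaustiveness, so no default is needed.
    static double area(Shape s) {
        return switch (s) {
            case Circle c    -> Math.PI * c.radius() * c.radius();
            case Rectangle r -> r.width() * r.height();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(7));
        System.out.println(describe(250));
        System.out.println(describe(new Circle(2.0)));
        System.out.println(describe(null));
        System.out.println(area(new Rectangle(3, 4)));
    }
}
```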

The Find Your STRONG Podcast
106 - 5 Surefire Ways To Upgrade Your Health and Life with Nadine Shaban

Mar 9, 2023 · 53:57


This week Jenny is joined by STRONG Girl O.G. and the brainiac of the team, Nadine Shaban-Teriaky, who tells us how she started in athletics and became immersed in fitness programming and education, and how her academic background has led her to becoming a dynamic and accomplished fitness professional, all while becoming a nurse! Nadine brings us 5 science-backed, surefire ways to upgrade your health, often and with ease.
Get your Perfect Sports 20% discount here by using coupon code JVB
Save $100 off your MAXPRO Fitness here
Apply for the STRONG Formula Certification Program
Work with a Team STRONG Girls coach
STRONG Fitness Magazine subscription: use discount code STRONGGIRL
If you enjoyed this episode, make sure to give us a five star rating and leave us a review on iTunes, Podcast Addict, Podchaser and Castbox.
Resources: STRONG Fitness Magazine; STRONG Fitness Magazine on IG; Team Strong Girls; Coach JVB
Follow Jenny on social media: Instagram, Facebook, YouTube

捕蛇者说
Ep 39. Chatting with Alex about vector databases and career planning

Jan 23, 2023 · 78:45


If you like our show, you are welcome to support us through Afdian: https://afdian.net/@pythonhunter
Guest: Alex. Hosts: Xiaobai (小白), laike9m.
Timeline:
00:00:30 Opening
00:00:59 Guest self-introduction
00:02:55 [Part 1] A brief introduction to the Milvus vector database
00:07:35 What the "vectors" in a vector database represent, and their use cases
00:14:16 Converting raw data into vector data
00:15:42 Whether vectors produced by different methods share the same format and can be mixed
00:19:04 How Milvus stores vector data, and typical application scenarios
00:25:59 How vector data lookups are done with Milvus (a brute-force search sketch follows these show notes)
00:27:46 Why vector databases need to exist
00:33:56 Discussion of Milvus's path to commercialization
00:41:57 [Part 2] What brought the guest to his current company
00:52:54 What it is like to work at companies in three different stages
00:53:41 Xiaobai: an unfunded startup
00:57:39 Alex: a funded startup on a solid footing
01:04:02 laike9m: Google
01:08:30 Alex: one more thing
01:12:13 Picks
01:17:04 Closing
Related links:
00:00:47 Zilliz
00:00:53 Milvus
00:04:26 Milvus star history on GitHub
00:06:04 Facebook (Meta) Faiss
00:06:21 Elasticsearch
00:06:24 Lucene
00:06:47 Google ScaNN
00:06:50 Microsoft DiskANN
00:09:11 Embedding on Wikipedia | no authoritative Chinese explanation was found, so a blog post on the principles and practice of embeddings is cited instead
00:10:03 Reverse image search | Baidu image search | for Google, there is a camera icon next to the search bar on google.com
00:14:50 Hugging Face
00:14:58 Towhee
00:36:22 Databricks
00:45:47 Bilibili: 李自然说 (Li Ziran Says)
01:04:23 A career ending mistake
01:12:37 AnimeGANv2
01:14:48 Nintendo Switch Sports
01:16:35 古明地觉 (Komeiji Satori) on cnblogs | contains the official-account images | 古明地觉 on Zhihu
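For the segment on how vector lookups work: the baseline operation a vector database such as Milvus accelerates is "find the k stored vectors closest to a query vector". The brute-force sketch below does that with a linear scan and squared Euclidean distance over made-up low-dimensional vectors; real systems replace the scan with an approximate-nearest-neighbour index (IVF, HNSW, DiskANN and friends) and typically work with hundreds of dimensions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

/** Brute-force top-k nearest neighbours: the linear scan an ANN index replaces. */
public class BruteForceKnn {
    static double squaredDistance(float[] a, float[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    /** Returns the ids of the k stored vectors closest to the query, nearest first. */
    static int[] topK(float[] query, List<float[]> vectors, int k) {
        return IntStream.range(0, vectors.size())
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> squaredDistance(query, vectors.get(i))))
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        // Made-up 4-dimensional "embeddings"; real ones come from a model.
        List<float[]> stored = List.of(
                new float[] {0.1f, 0.9f, 0.0f, 0.3f},
                new float[] {0.8f, 0.1f, 0.7f, 0.0f},
                new float[] {0.1f, 0.8f, 0.1f, 0.5f});
        float[] query = {0.1f, 0.85f, 0.05f, 0.45f};
        System.out.println(Arrays.toString(topK(query, stored, 2))); // e.g. [2, 0]
    }
}
```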

Software Sessions
Victor Adossi on Yak Shaving

Jan 2, 2023 · 110:47


Victor is a software consultant in Tokyo who describes himself as a yak shaver. He writes on his blog at vadosware and curates Awesome F/OSS, a mailing list of open source products. He's also a contributor to the Open Core Ventures blog. Before our conversation Victor wrote a structured summary of how he works on projects. I recommend checking that out in addition to the episode. Topics covered: Most people should use Dokku or CapRover But he uses Kubernetes anyways Hosting a Database in Kubernetes Learning technology You don't really know a thing until something goes wrong History of Frontend Development Context from lower layers of the stack and historical projects Good project pages have comparisons to other products Choosing technologies Language choice affects maintainability Knowing an ecosystem Victor's preferred stack Technology bake offs Posting findings means you get free corrections Why people use medium instead of personal sites Victor VADOSWARE - Blog How Victor works on Projects - Companion post for this episode Awesome FOSS - Curated list of OSS projects NimbusWS - Hosted OSS built on top of budget cloud providers Unvalidated Ideas - Startup ideas for side project inspiration PodcastSaver - Podcast index that allows you to choose Postgres or MeiliSearch and compare performance and results of each Victor's preferred stack Docker - Containers Kubernetes - Container provisioning (Though at the beginning of the episode he suggests Dokku for single server or CapRover for multiple) TypeScript - JavaScript with syntax for types. Victor's default choice. Rust - Language he uses if doing embedded work, performance is critical, or more correctness is desired Haskell - Language he uses if correctness and type system is the most important for the project Postgresql - General purpose database that's good enough for most use cases including full text search. KeyDB - Redis compatible database for caching. Acquired by Snap and then made open source. Victor uses it over Redis because it is multi threaded and supports flash storage without a Redis Enterprise license. Pulumi - Provision infrastructure with the languages you're already using instead of a specialized one or YAML Svelte and SvelteKit - Preferred frontend stack. Previously used Nuxt. 
Search engines Postgres Full Text Search vs the rest Optimizing Postgres Text Search with Trigrams OpenSearch - Amazon's fork of Elasticsearch typesense meilisearch sonic Quickwit JavaScript build tools Babel SWC Webpack esbuild parcel Vite Turbopack JavaScript frameworks React Vue Svelte Ember Frameworks built on top of frameworks Next - React Nuxt - Vue SvelteKit - Svelte Astro - Multiple Historical JavaScript tools and frameworks Underscore jQuery MooTools Backbone AngularJS Knockout Aurelia GWT Bower - Frontend package manager Grunt - Task runner Gulp - Task runner Related Links Dokku - Open source single-host alternative to Heroku Cloud Native Buildpacks - Buildpacks created by Heroku and Pivotal and used by Dokku CapRover - An open source PaaS-like abstraction built on top of Docker Swarm Kelsey Hightower's tweet about being cautious about running databases on Kubernetes Settling the Myth of Transparent HugePages for Databases Kubernetes Container Storage Interface (CSI) Kubernetes Local Persistent Volumes Longhorn - Distributed block storage for Kubernetes Postgres docs Postgres TOAST Everything I've seen on optimizing Postgres on ZFS Kubernetes Workload Resources Kubernetes Network Plugins Kubernetes Ingress Traefik Kubernetes the Hard Way (Setting up a cluster in a way that optimizes for learning) How does TLS work Let's Encrypt Cert manager for Kubernetes Choose Boring Technology A Linux user's guide to Logical Volume Management Docker networking overview Kubernetes Scheduler Tauri - Build desktop applications with web technology and Rust ripgrep - CLI tool to recursively search directory for a regex pattern (Meant to be a rust replacement for grep) angle-grinder / ag - CLI tool to parse and process log files written in rust Object.observe ECMAScript Proposal to be Withdrawn Ruby on Rails - Ruby web framework Django - Python web framework Laravel - PHP web framework Adonis - JavaScript NestJS - JavaScript What is a NullPointerException, and how do I fix it? Mastodon Clap - CLI argument parser for Rust AWS CDK - Provision AWS infrastructure using programming languages Terraform - Provision infrastructure with terraform language URL canonicalization of duplicate pages and the use of the canonical tag - Used by dev.to to send google traffic to the original blogpost instead of dev.to Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: This episode, I talk to Victor Adossi who describes himself as a yak shaver. Someone who likes trying a whole bunch of different technologies, seeing the different options. We talk about what he uses, the evolution of front end development, and his various projects. Talking to just different people it's always good to get where they're coming from because something that works for Google at their scale is going to be different than what you're doing with one of your smaller projects. [00:00:31] Victor: Yeah, the context. Of course in direct conflict with that statement, I definitely use Google technology despite not needing to at all right? Like, you know, 99% of people who are doing like people like to call it indiehacking or building small products could probably get by with just Dokku. If you know Dokku or like CapRover. Are two projects that'll be like, Oh, you can just push your code here, we'll build it up like a little mini Heroku PaaS thing and just go on one big server, right? Like 99% of the people could just use that. But of course I'm not doing that. So I'm a bit of a hypocrite in that sense. 
I know what I should be doing, but I'm not doing that. I am writing a Kubernetes cluster with like five nodes for no reason. Uh, yeah, I dunno, people don't normally count the controllers. [00:01:24] Jeremy: Dokku and CapRover, I think those are where it's supposed to create a heroku like experience I think it's based off of the heroku buildpacks right? At least Dokku is? [00:01:36] Victor: Yeah Buildpacks has actually been spun out into like a community thing so like pivotal and heroku, it's like buildpacks.io, they're trying to build a wider standard around it so that more people can get involved. And buildpacks are actually obviously fantastic as a technology and as a a process piece. There's not much else like them and you know, that's obvious from like Heroku's success and everything. I know Dokku uses that. I don't know that Caprover does, but I haven't, I haven't really run Caprover that much. They, they probably do. Like at this point if you're going to support building from code, it seems silly to try and build your own buildpacks. Cause that's what you will do, eventually. So you might as well use what's there. Anyway, this is like just getting to like my personal opinions at this point, but like, if you think containers are a bad idea in 2022, You're wrong, you should, you should stop. Like you should, you should stop. Think about it. I mean, obviously there's not, um, I got a really great question at an interview once, which is, where are containers a bad idea? That's probably one of the best like recent interview questions I've ever gotten cause I was like, Oh yeah, I mean, like, you can't, it can't be perfect everywhere, right? Nothing's perfect everywhere. So it's like, where is it? Uh, and of course the answer was networking, right? (unintelligible) So if you need absolute performance, but like for just about everything else. Containers are kind of it at this point. Like, time has born it out, I think. So yeah, I always just like bias at taking containers at this point. So I'm probably more of a CapRover person than a Dokku person, even though I have not used, I don't use CapRover. [00:03:09] Jeremy: Well, like something that I've heard with containers, and maybe it's changed recently, but, but something that was kind of holdout was when people would host a database sometimes they would oh we just don't wanna put this in a container and I wonder if like that matches with your thinking or if things have changed. [00:03:27] Victor: I am not a database administrator right like I read postgres docs and I read the, uh, the Postgres documentation, and I think I know a bit about postgres but I don't commit right like so and I also haven't, like, oh, managed X terabytes on one server that you are making sure never goes down kind of deal. But the stickiness for me, at least from when I've run, So I've done a lot of tests with like ZFS and Postgres and like, um, and also like just trying to figure out, and I run Postgres in Kubernetes of course, like on my cluster and a lot of the stuff I found around is, is like fiddly kernel things like sort of base kernel settings that you need to have set. Like, you know, stuff like should you be using transparent huge pages, like stuff like that. But once you have that settled. Containers are just processes with name spacing and resource control, right? Like, that's it. there are some other ins and outs, but for the most part, if you're fine running a process, so people ran processes, right? And they were just completely like unprotected. 
Then people made users for the processes and they limited the users and ran the processes, right? Then the next step is now you can run a process and then do the limiting the name spaces in cgroups dynamically. Like there, there's, there's sort of not a humongous difference, unless you're hitting something very specific. Uh, but yeah, databases have been a point of contention, but I think, Kelsey Hightower had that tweet yeah. That was like, um, don't run databases in Kubernetes. And I think he called it back. [00:04:56] Victor: I don't know, but I, I know that was uh, was one of those things that people were really unsure about at first, but then after people sort of like felt it out, they were like, Oh, it's actually fine. Yeah. [00:05:06] Jeremy: Yeah I vaguely remember one of the concerns having to do with persistent storage. Like there were challenges with Kubernetes and needing to keep that storage around and I don't know if that's changed yeah or if that's still a concern. [00:05:18] Victor: Uh, I'd say that definitely has changed. Uh, and it was, it was a concern, depending on where you were. Mostly people who are running AKS or EKS or you know, all those other managed Kubernetes, they're just using EBS or like whatever storage provider is like offering for storage. Most of those people don't actually have that much of a problem with, storage in general. Now, high performance storage is obviously different, right? So like, so you'll, you're gonna have to start doing manual, like local volume management and stuff like that. it was a problem, because obviously CSI (Kubernetes Container Storage Interface) didn't exist for some period of time, and like there was, it was hard to know what to do for if you were just running a Kubernetes cluster. I think a lot of people were just using local, first of all, local didn't even exist for a bit. Um, they were just using host path, right? And just like, Oh, it's on the disk somewhere. Where do we, we have to go get it right? Or we have to like, sort of manage that. So that was something most people weren't ready for, especially if you were just, if you weren't like sort of a, a, a traditional sysadmin and used to doing that stuff. And then of course local volumes came out, but I think they still had to be, um, pre-provisioned. So that's sysadmin stuff that most people, you know, maybe aren't, aren't necessarily ready for. Uh, and then most of the general solutions were slow. So like, I used Longhorn (https://longhorn.io) for a long time and Longhorn, Longhorn's great. And super easy to set up, but it can be slower and you can have some, like, delays in mount time. it wasn't ideal for, for most people. So yeah, I, overall it's true. Databases, Databases in Kubernetes were kind of fraught with peril for a while, but it wasn't for the reason that, it wasn't for the fundamental reason that Kubernetes was just wrong or like, it wasn't the reason most people think of, which is just like, Oh, you're gonna break your database. It's more like, running a database is hard and Kubernetes hasn't solved all the hard problems. Like, cuz that's what Kubernetes does. It basically solves a lot of problems in a very generic way. Right. So it just hadn't solved all those problems yet at this point. I think it's got decent answers on a lot of them. So I, I mean, I don't know. I I do it. Don't, don't take what I'm saying to your, you know, PM meeting or your standup meeting, uh, anyone who's listening. 
But it's more like if you could solve the problems with databases in the sense before. You could probably solve 'em on Kubernetes now with a good understanding of Kubernetes. Cause at the end of the day, it's all the same stuff. Just Kubernetes makes it a little easier to, uh, do it dynamically. [00:07:50] Jeremy: It sounds like you could do it before, but some of the, I guess the tools or the ways of doing persistent storage were not quite there yet, or they were difficult to use. And so that was why people at the start were like, Okay, maybe it's not a good idea, but, now maybe there's some established practices for how you should run a database in Kubernetes. And I, I suppose the other aspect too is that, like you were saying, Kubernetes is its own thing. You gotta learn Kubernetes and all its intricacies. And then running a database is also its own challenge. So if you stack the two of them together and, and the path was not really clear then maybe at the start it wasn't the best idea. Um, uh, if somebody was going to try it out now, was there like a specific resource you looked at or a specific path to where like okay this is is how I'm going to do it. [00:08:55] Victor: I'll just say what I normally recommend to everybody. Cause it depends on which path you wanna go right? If you wanna go down like running a database path first and figure that out, fill out that skill tree. Like go read the Postgres docs. Well, first of all, use Postgres. That's the first tip there. But like, read those documents. And obviously you don't have to understand everything. You won't understand everything. But knowing the big pieces and sort of letting your brain see the mention of like a whole bunch of things, like what is toast? Oh, you can do compression on columns. Like, you can do some, some things concurrently. Um, you know, what ALTER TABLE looks like. You get all that stuff kind of in your head. Um, and then I personally really believe in sort of learning by building and just like iterating. you won't get it right the first time. It's just like, it's not gonna happen. You're get, you can, you can get better the first time, right? By being really prepared and like, and leave yourself lots of outs, but you kind of have to like, get it out there. Do do your best to make sure that you can't fail, uh, catastrophically, right? So this is like, goes back to that decision to like use ZFS as the bottom of this I'm just like, All right, well, I, I'm not a file systems expert, but if I. I could delegate some of that, you know, some of that, I can get some of that knowledge from someone else. Um, and I can make it easier for me to not fail catastrophically. For the database side, actually read documentation on Postgres or the whatever database you're going to use, make sure you at least understand that. Then start running it like locally or whatever. Again, Docker use, use Docker locally. It's, it's, it's fine. and then, you know, sort of graduate to running sort of more progressively, more complicated versions. what I would say for the Kubernetes side is actually similar. the Kubernetes docs are really good. they're very large. but they're good. So you can actually go through and know all the, like, workload, workload resources, know, like what a config map is, what a secret is, right? Like what etcd is doing in this whole situation. you know, what a kublet is versus an API server, right? 
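(An aside, not from the conversation: since Pulumi is listed in Victor's stack above, here is a minimal sketch of what a couple of those objects, a ConfigMap and a Deployment plus a Service, look like when declared from TypeScript. It assumes the @pulumi/kubernetes package and an already-configured cluster; the names and the nginx image are placeholders.)

```typescript
import * as k8s from "@pulumi/kubernetes";

// A ConfigMap: plain key/value configuration that pods can read as env vars or mount as files.
const appConfig = new k8s.core.v1.ConfigMap("app-config", {
  data: { LOG_LEVEL: "info" },
});

// A Deployment: a workload resource that keeps N replicas of a pod template running.
const appLabels = { app: "demo" };
const deployment = new k8s.apps.v1.Deployment("demo", {
  spec: {
    replicas: 2,
    selector: { matchLabels: appLabels },
    template: {
      metadata: { labels: appLabels },
      spec: {
        containers: [{
          name: "demo",
          image: "nginx:1.25", // placeholder image
          ports: [{ containerPort: 80 }],
          // Inject the ConfigMap's keys as environment variables.
          envFrom: [{ configMapRef: { name: appConfig.metadata.name } }],
        }],
      },
    },
  },
});

// A Service makes those pods reachable inside the cluster; an Ingress would expose them outside.
const service = new k8s.core.v1.Service("demo", {
  spec: { selector: appLabels, ports: [{ port: 80, targetPort: 80 }] },
});
```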
Like the, the general stuff, like if you go through all that, you should have like a whole bunch of ideas at least floating around in your head. And then once you try and start setting up a server, they will all start to pop up again, right? And they'll all start to like, you, like, Oh, okay, I need a CNI (Container Networking) plugin because something needs to make the services available, right? Or something needs to power the ingress, right? Like, if I wanna be able to get traffic, I need an ingress object. But what listens, what does that, what makes that ingress object do anything? Oh, it's an ingress controller. nginx, you know, almost everyone's heard of nginx, so they're like, okay. Um, nginx, has an ingress control. Actually there's, there used to be two, I assume there's still two, but there's like one that's maintained by Kubernetes, one that's maintained by nginx, the company or whatever. I use traefik, it's fantastic. but yeah, so I think those things kind of fall out and that is almost always my first way to explain it and to start building. And tinkering iteratively. So like, read the documentation, get a good first grasp of it, and then start building yourself because you'll, you'll get way more questions that way. Like, you'll ask way more questions, you won't be able to make progress. Uh, and then of course you can, you know, hop into slacks or like start looking around and, and searching on the internet. oh, one of the things that really helped me out early learning Kubernetes was, Kelsey Hightower's, um, learn Kubernetes the hard way. I'm also a big believer in doing things the hard way, at least knowing what you're choosing to not know, right? distributing file system, Deltas, right? Or like changes to a file system over the network is not a new problem. Other people have solved it. There's a lot of complexity there. but if you at least know the sort of surface level of what the thing does and what it's supposed to do and how it's supposed to do it, you can make a decision on, Oh, how deep am I going to go? Right? To prevent yourself from like, making a mistake or going too deep in the rabbit hole. If you have an idea of the sort of ecosystem and especially like, Oh, here, like the basics of how I can use this thing, that's generally very good. And doing things the hard way is a great way to get a, a feel for that, right? Cause if you take some chunk and like, you know, the first level of doing things the hard way, uh, or, you know, Kelsey Hightower's guide is like, get a machine, right? Like, so, like, if you somehow were like, Oh, I wanna run a Kubernetes cluster. but, you know, I don't want use necessarily EKS and you wanna learn it the hard way. You have to go get a machine, right? If you, if you're not familiar, if you run on Heroku the whole time, like you didn't manage your own machines, you gotta go like, figure out EC2, right? Or, I personally use, hetzner I love hetzner, so you have to go figure out hetzner, digital ocean, whatever. Right. And then the next thing's like, you know, the guide's changed a lot, and I haven't, I haven't looked at it in like, in years, actually a while since I, since I've sort of been, I guess living it, but it's, it's like generate certificates, right? So if you've never dealt with SSL and like, sort of like, or I should say TLS uh, and generating certificates and how that whole dance works, right? 
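(Another aside, not from the episode: to make that certificate "dance" a bit more concrete, here is a small sketch that opens a TLS connection and walks the certificate chain a server presents, using Node's built-in tls module. The host name is just a placeholder.)

```typescript
import * as tls from "node:tls";

// Connect to any HTTPS host and print the chain of certificates it presents,
// ending at a root certificate that is already trusted by the OS or runtime.
const host = "example.com"; // placeholder host
const socket = tls.connect({ host, port: 443, servername: host }, () => {
  // Passing true asks for the full chain, each certificate linking to its issuer.
  let cert = socket.getPeerCertificate(true);
  while (cert && Object.keys(cert).length > 0) {
    console.log(`subject: ${cert.subject.CN}  issued by: ${cert.issuer.CN}`);
    // A self-signed root points back at itself, so stop there.
    if (cert.issuerCertificate === cert) break;
    cert = cert.issuerCertificate;
  }
  socket.end();
});
socket.on("error", (err) => console.error("TLS error:", err.message));
```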
Which is fascinating because it's like, oh, right, nothing's secure on the internet, except that we distribute root certificates on computers that are deployed in every OS, right? Like, that's a sort of fundamental understanding you may not go deep enough to realize, but if you are fascinated by it, trying to do it manually would lead you down that path. You'd be like, Oh, what, like what is this thing? What is a CSR? Like, why, who is signing my request? Right? And it's like, why do we trust those people? Right? And it's like, you know, that kind of thing comes out and I feel like you can only get there from trying to do it, you know, answering the questions you can. Right. And again, it takes some judgment to know when you should not go down a rabbit hole. uh, and then iterating. of course there are people who are excellent at explaining. you can find some resources that are shortcuts. But, uh, I think particularly my bread and butter has been just to try and do it the hard way. Avoid pitfalls or like rabbit holes when you can. But know that the rabbit hole is there, and then keep going. And sometimes if something's just too hard, you're not gonna get it the first time. Like maybe you'll have to wait like another three months, you'll try again and you'll know more sort of ambiently about everything else. You get a little further that time. that's how I feel about that. Anyway. [00:15:06] Jeremy: That makes sense to me. I think sometimes when people take on a project, they try to learn too many things at the same time. I, I think the example of Kubernetes and Postgres is pretty good example, where if you're not familiar with how do I install Postgres on bare metal or a vm, trying to make sense of that while you're trying to into is probably gonna be pretty difficult. So, so splitting them up and learning them individually, that makes a lot of sense to me. And the whole deciding how deep you wanna go. That's interesting too, because I think that's very specific to the person right because sometimes you wanna go a little deeper because otherwise you don't understand how the two things connect together. But other times it's just like with the example with certificates, some people they may go like, I just put in let's encrypt it gives me my cert I don't care right then, and then, and some people they wanna know like okay how does the whole certificate infrastructure work which I think is interesting, depending on who you are, maybe you go ahh maybe it doesn't really matter right. [00:16:23] Victor: Yeah, and, you know, shout out to Let's Encrypt . It's, it's amazing, right? think Singlehandedly the most, most of the deployment of HTTPS that happens these days, right? so many so many of like internet providers and uh, sort of service providers will use it right? Under the covers. Like, Hey, we've got you free SSL through Let's Encrypt, right? Like, kind of like under the, under the covers. which is awesome. And they, and they do it. So if you're listening to this, donate to them. I've done it. So now that, now the pressure is on whoever's listening, but yeah, and, and I, I wanna say I am that person as well, right? Like, I use, Cert Manager on my cluster, right? So I'm just like, I don't wanna think about it, but I, you know, but I, I feel like I thought about it one time. I have a decent grasp. If something changes, then I guess I have to dive back in. I think it, you've heard the, um, innovation tokens idea, right? I can't remember the site. 
It's like, um, do, like do boring tech or something.com (https://boringtechnology.club/) . Like it shows up on sort of hacker news from time to time, essentially. But it's like, you know, you have a certain amount of tokens and sort of, uh, we'll call them tokens, but tolerance for complexity or tolerance for new, new ideas or new ways of doing things, new processes. Uh, and you spend those as you build any project, right? you can be devastatingly effective by just sticking to the stack, you know, and not introducing anything new, even if it's bad, right? and there's nothing wrong with LAMP stack, I don't wanna annoy anybody, but like if you, if you're running LAMP or if you run on a hostgator, right? Like, if you run on so, you know, some, some service that's really old but really works for you isn't, you know, too terribly insecure or like, has the features you need, don't learn Kubernetes then, right? Especially if you wanna go fast. cuz you, you're spending tokens, right? You're spending, essentially brain power, right? On learning whatever other thing. So, but yeah, like going back to that, databases versus databases on Kubernetes thing, you should probably know one of those before you, like, if you're gonna do that, do that thing. You either know Kubernetes and you like, at least feel comfortable, you know, knowing Kubernetes extremely difficult obviously, but you feel comfortable and you feel like you can debug. Little bit of a tangent, but maybe that's even a better, sort of watermark if you know how to debug a thing. If, if it's gone wrong, maybe one or five or 10 or 20 times and you've gotten out. Not without documentation, of course, cuz well, if you did, you're superhuman. But, um, but you've been able to sort of feel your way out, right? Like, Oh, this has gone wrong and you have enough of a model of the system in your head to be like, these are the three places that maybe have something wrong with them. Uh, and then like, oh, and then of course it's just like, you know, a mad dash to kind of like, find, find the thing that's wrong. You should have confidence about probably one of those things before you try and do both when it's like, you know, complex things like databases and distributed systems management, uh, and orchestration. [00:19:18] Jeremy: That's, that's so true in, in terms of you are comfortable enough being able to debug a problem because it's, I think when you are learning about something, a lot of times you start with some kind of guide or some kind of tutorial and you follow the steps. And if it all works, then great. Right? But I think it's such a large leap from that to something went wrong and I have to figure it out. Right. Whether it's something's not right in my Dockerfile or my postgres instance uh, the queries are timing out. so many things that could go wrong, that is the moment where you're forced to figure out, okay, what do I really know about this not thing? [00:20:10] Victor: Exactly. Yeah. Like the, the rubber's hitting the road it's uh you know the car's about to crash or has already crashed like if I open the bonnet, do I know what's happening right or am I just looking at (unintelligible). And that's, it's, I feel sort a little sorry or sad for, for devs that start today because there's so much. Complexity that's been built up. And a lot of it has a point, but you need to kind of have seen the before to understand the point, right? So I like, I like to use front end as an example, right? 
Like the front end ecosystem is crazy, and it has been crazy for a very long time, but the steps are actually usually logical, right? Like, so like you start with, you know, HTML, CSS and JavaScript, just plain, right? And like, and you can actually go in lots of directions. Like HTML has its own thing. CSS has its own sort of evolution sort of thing. But if we look at JavaScript, you're like, you're just writing JavaScript on every page, right? And like, just like putting in script tags and putting in whatever, and it's, you get spaghetti, you get spaghetti, you start like writing, copying the same function on multiple pages, right? You just, it, it's not good. So then people, people make jquery, right? And now, now you've got like a, a bundled set of like good, good defaults that you can, you can go for, right? And then like, you know, libraries like underscore come out for like, sort of like not dom related stuff that you do want, you do want everywhere. and then people go from there and they go to like backbone or whatever. it's because Jquery sort of also becomes spaghetti at some point and it becomes hard to manage and people are like, Okay, we need to sort of like encapsulate this stuff somehow, right? And like the new tools or whatever is around at the same timeframe. And you, you, you like backbone views for example. and you have people who are kind of like, ah, but that's not really good. It's getting kind of slow. Uh, and then you have, MVC stuff comes out, right? Like Angular comes out and it's like, okay, we're, we're gonna do this thing called dirty checking, and it's gonna be, it's gonna be faster and it's gonna be like, it's gonna be less sort of spaghetti and it's like a little bit more structured. And now you have sort of like the rails paradigm, but on the front end, and it takes people to get a while to get adjusted to that, but then that gets too heavy, right? And then dirty checking is realized to be a mistake. And then, you get stuff like MVVM, right? So you get knockout, like knockout js and you got like Durandal, and like some, some other like sort of front end technologies that come up to address that problem. Uh, and then after that, like, you know, it just keeps going, right? Like, and if you come in at the very end, you're just like, What is happening? Right? Like if it, if it, if someone doesn't sort of boil down the complexity and reduce it a little bit, you, you're just like, why, why do we do this like this? Right? and sometimes there's no good reason. Sometimes the complexity is just like, is unnecessary, but having the steps helps you explain it, uh, or helps you understand how you got there. and, and so I feel like that is something younger people or, or newer devs don't necessarily get a chance to see. Cause it just, it would take, it would take very long right? And if you're like a new dev, let's say you jumped into like a coding bootcamp. I mean, I've got opinions on coding boot camps, but you know, it's just like, let's say you jumped into one and you, you came out, you, you made it. It's just, there's too much to know. sure, you could probably do like HTML in one month. Well, okay, let's say like two weeks or whatever, right? If you were, if you're literally brand new, two weeks of like concerted effort almost, you know, class level, you know, work days right on, on html, you're probably decently comfortable with it. Very comfortable. CSS, a little harder because this is where things get hard. 
Cause if you, if you give two weeks for, for HTML, CSS is harder than HTML kind of, right? Because the interactions are way more varied. Right? Like, and, and maybe it's one of those things where you just, like, you, you get somewhat comfortable and then just like know that in the future you're gonna see something you don't understand and have to figure it out. Uh, but then JavaScript, like, how many months do you give JavaScript? Because if you go through that first like, sort of progression that I, I I, I, I mentioned everyone would have a perfect sort of, not perfect but good understanding of the pieces, right? Like, why did we start transpiling at all? Right? Like, uh, or why did you know, why did we adopt libraries? Like why did Bower exist? No one talks about Bower anymore, obviously, but like, Bower was like a way to distribute front end only packages, right? Um, what is it? Um, Uh, yes, there's grunt. There's like the whole build system thing, right? Once, once we decide we're gonna, we're gonna do stuff to files before we, before we push. So there's grunt, there's, uh, gulp, which is like grunt, but like, Oh, we're gonna do it all in memory. We're gonna pipe, we're gonna use this pipes thing to make sure everything goes fast. then there's like, of course that leads like the insanity that's webpack. And then there's like parcel, which did better. There's vite there's like, there's all this, there's this progression, but how many months would it take to know that progression? It, it's too long. So they end up just like, Hey, you're gonna learn react. Which is the right thing because it's like, that's what people hire for, right? But then you're gonna be in react and be like, What's webpack, right? And it's like, but you can't go down. You can't, you don't have the time. You, you can't sort of approach that problem from the other direction where you, which would give you better understanding cause you just don't have the time. I think it's hard for newer devs to overcome this. Um, but I think there are some, there's some hope on the horizon cuz some things are simpler, right? Like some projects do reduce complexity, like, by watching another project sort of innovate so like react. Wasn't the first component, first framework, right? Like technically, I, I think, I think you, you might have to give that to like, to maybe backbone because like they had views and like marionette also went with that. Like maybe, I don't know, someone, someone I'm sure will get in like, send me an angry email, uh, cuz I forgot you Moo tools or like, you know, Ember Ember. They've also, they've also been around, I used to be a huge Ember fan, still, still kind of am, but I don't use it. but if you have these, if you have these tools, right? Like people aren't gonna know how to use them and Vue was able to realize that React had some inefficiencies, right? So React innovates the sort of component. So Reintroduces the component based model component first, uh, front end development model. Vue sees that and it's like, wait a second, if we just export this like data object, and of course that's not the only innovation of Vue, but if we just export this data object, you don't have to do this fine grained tracking yourself anymore, right? You don't have to tell React or tell your the system which things change when other things change, right? Like you, you don't have to set up this watching and stuff, right? 
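(Aside: a rough sketch, in plain TypeScript rather than either framework, of the "just export a data object" idea being described here, where reads are tracked for you instead of you wiring up the watching by hand. This only illustrates the general technique; it is not how Vue or React are actually implemented.)

```typescript
// Hand-rolled "export a data object" reactivity: a Proxy records which render
// read which keys, then re-runs that render when one of those keys is written.
type Render = () => void;

function reactive<T extends object>(data: T) {
  const subscribers = new Map<PropertyKey, Set<Render>>();
  let activeRender: Render | null = null;

  const proxy = new Proxy(data, {
    get(target, key, receiver) {
      if (activeRender) {
        if (!subscribers.has(key)) subscribers.set(key, new Set());
        subscribers.get(key)!.add(activeRender); // track: this render depends on `key`
      }
      return Reflect.get(target, key, receiver);
    },
    set(target, key, value, receiver) {
      const ok = Reflect.set(target, key, value, receiver);
      subscribers.get(key)?.forEach((render) => render()); // re-run dependent renders
      return ok;
    },
  });

  // Run a render once while recording which keys it reads.
  const watch = (render: Render) => {
    activeRender = render;
    render();
    activeRender = null;
  };

  return { proxy, watch };
}

// Usage: after setup there are no manual "this thing changed" calls.
const { proxy: state, watch } = reactive({ count: 0 });
watch(() => console.log(`count is ${state.count}`));
state.count++; // the watched render re-runs and logs "count is 1"
```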
Um, and that's one of the reasons, like Vue is just, I, I, I remember picking up Vue and being like, Oh, I'm done. I'm done with React now. Because it just doesn't make sense to use React because they Vue essentially either, you know, you could just say they learned from them or they, they realize a better way to do things that is simpler and it's much easier to write. Uh, and you know, functionally similar, right? Um, similar enough that it's just like, oh they boil down some of that complexity and we're a step forward and, you know, in other ways, I think. Uh, so that's, that's awesome. Every once in a while you get like a compression in the complexity and then it starts to ramp up again and you get maybe another compression. So like joining the projects that do a compression. Or like starting to adopting those is really, can be really awesome. So there's, there's like, there's some hope, right? Cause sometimes there is a compression in that complexity and you you might be lucky enough to, to use that instead of, the thing that's really complex after years of building on it. [00:27:53] Jeremy: I think you're talking about newer developers having a tough time making sense of the current frameworks but the example you gave of somebody starting from HTML and JavaScript going to jquery backbone through the whole chain, that that's just by nature of you've put in a lot of time right you've done a lot of work working with each of these technologies you see the progression as if someone is starting new just by nature of you being new you won't have been able to spend that time [00:28:28] Victor: Do you think it could work? again, the, the, the time aspect is like really hard to get like how can you just avoid spending time um to to learn things that's like a general problem I think that problem is called education in the general sense. But like, does it make sense for a, let's say a bootcamp or, or any, you know, school right? To attempt to guide people through the previous solutions that didn't work, right? Like in math, you don't start with calculus, right? It just wouldn't, it doesn't make sense, right? But we try and start with calculus in software, right? We're just like, okay, here's the complexity. You've got all of it. Don't worry. Just look at this little bit. If, you know, if the compiler ever spits out a weird error uh oh, like, you're, you're, you're in for trouble cuz you, you just didn't get the. get the basics. And I think that's maybe some of what is missing. And the thing is, it is like the constraints are hard, right? No one has infinite time, right? Or like, you know, even like, just tons of time to devote to learning, learning just front end, right? That's not even all of computing, That's not even the algorithm stuff that some companies love to throw at you, right? Uh, or the computer sciencey stuff. I wonder if it makes more sense to spend some time taking people through the progression, right? Because discovering that we should do things via components, let's say, or, or at least encapsulate our functionality to components and compose that way, is something we, we not everyone knew, right? Or, you know, we didn't know wild widely. And so it feels like it might make sense to touch on that sort of realization and sort of guide the student through, you know, maybe it's like make five projects in a week and you just get progressively more complex. But then again, that's also hard cause effort, right? It's just like, it's a hard problem. 
But, but I think right now, uh, people who come in at the end and sort of like see a bunch of complexity and just don't know why it's there, right? Like, if you've like, sort of like, this is, this applies also very, this applies to general, but it applies very well to the Kubernetes problem as well. Like if you've never managed nginx on more than one machine, or if you've never tried to set up a, like a, to format your file system on the machine you just rented because it just, you know, comes with nothing, right? Or like, maybe, maybe some stuff was installed, but, you know, if you had to like install LVM (Logical Volume Manager) yourself, if you've never done any of that, Kubernetes would be harder to understand. It's just like, it's gonna be hard to understand. overlay networks are hard for everyone to understand, uh, except for network people who like really know networking stuff. I think it would be better. But unfortunately, it takes a lot of time for people to take a sort of more iterative approach to, to learning. I try and write blog posts in this way sometimes, but it's really hard. And so like, I'll often have like an idea, like, so I call these, or I think of these as like onion, onion style posts, right? Where you either build up an onion sort of from the inside and kind of like go out and like add more and more layers or whatever. Or you can, you can go from the outside and sort of take off like layers. Like, oh, uh, Kubernetes has a scheduler. Why do they need a scheduler? Like, and like, you know, kind of like, go, go down. but I think that might be one of the best ways to learn, but it just takes time. Or geniuses and geniuses who are good at two things, right? Good at the actual technology and good at teaching. Cuz teaching is a skill and it's very hard. and, you know, shout out to teachers cuz that's, it's, it's very difficult, extremely frustrating. it's hard to find determinism in, in like methods and solutions. And there's research of course, but it's like, yeah, that's, that's a lot harder than the computer being like, Nope, that doesn't work. Right? Like, if you can't, if you can't, like if you, if the function call doesn't work, it doesn't work. Right. If the person learned suboptimally, you won't know Right. Until like 10 years down the road when, when they can't answer some question or like, you know, when they, they don't understand. It's a missing fundamental piece anyway. [00:32:24] Jeremy: I think with the example of front end, maybe you don't have time to walk through the whole history of every single library and framework that came but I think at the very least, if you show someone, or you teach someone how to work with css, and you have them, like you were talking about components before you have them build a site where there's a lot of stuff that gets reused, right? Maybe you have five pages and they all have the same nav bar. [00:33:02] Victor: Yeah, you kind of like make them do it. [00:33:04] Jeremy: Yeah. You make 'em do it and they make all the HTML files, they copy and paste it, and probably your students are thinking like, ah, this, this kind of sucks [00:33:16] Victor: Yeah [00:33:18] Jeremy: And yeah, so then you, you come to that realization, and then after you've done that, then you can bring in, okay, this is why we have components. And similarly you brought up, manual dom manipulation with jQuery and things like that. I, I'm sure you could come up with an example of you don't even necessarily need to use jQuery. 
I think people can probably skip that step and just use the the, the API that comes with the browser. But you can have them go in like, Oh, you gotta find this element by the id and you gotta change this based on this, and let them experience the. I don't know if I would call it pain, but let them experience like how it was. Right. And, and give them a complex enough task where they feel like something is wrong right. Or, or like, there, should be something better. And then you can go to you could go straight to vue or react. I'm not sure if we need to go like, Here's backbone, here's knockout. [00:34:22] Victor: Yeah. That's like historical. Interesting. [00:34:27] Jeremy: I, I think that would be an interesting college course or something that. Like, I remember when, I went through school, one of the classes was programming languages. So we would learn things like, Fortran and stuff like that. And I, I think for a more frontend centered or modern equivalent you could go through, Hey, here's the history of frontend development here's what we used to do and here's how we got to where we are today. I think that could be actually a pretty interesting class yeah [00:35:10] Victor: I'm a bit interested to know you learned fortran in your PL class. I, think when I went, I was like, lisp and then some, some other, like, higher classes taught haskell but, um, but I wasn't ready for haskell, not many people but fortran is interesting, I kinda wanna hear about that. [00:35:25] Jeremy: I think it was more in terms of just getting you exposed to historically this is how things were. Right. And it wasn't so much of like, You can take strategies you used in Fortran into programming as a whole. I think it was just more of like a, a survey of like, Hey, here's, you know, here's Fortran and like you were saying, here's Lisp and all, all these different languages nd like at least you, you get to see them and go like, yeah, this is kind of a pain. [00:35:54] Victor: Yeah [00:35:55] Jeremy: And like, I understand why people don't choose to use this anymore but I couldn't take away like a broad like, Oh, I, I really wish we had this feature from, I think we were, I think we were using Fortran 77 or something like that. I think there's Fortran 77, a Fortran 90, and then there's, um, I think, [00:36:16] Victor: Like old fortran, deprecated [00:36:18] Jeremy: Yeah, yeah, yeah. So, so I think, I think, uh, I actually don't know if they're, they're continuing to, um, you know, add new things or maintain it or it's just static. But, it's, it's more, uh, interesting in terms of, like we were talking front end where it's, as somebody who's learning frontend development who is new and you get to see how, backbone worked or how Knockout worked how grunt and gulp worked. It, it's like the kind of thing where it's like, Oh, okay, like, this is interesting, but let us not use this again. Right? [00:36:53] Victor: Yeah. Yeah. Right. But I also don't need this, and I will never again [00:36:58] Jeremy: yeah, yeah. It's, um, but you do definitely see the, the parallels, right? Like you were saying where you had your, your Bower and now you have NPM and you had Grunt and Gulp and now you have many choices [00:37:14] Victor: Yeah. [00:37:15] Jeremy: yeah. I, I think having he history context, you know, it's interesting and it can be helpful, but if somebody was. Came to me and said hey I want to learn how to build websites. I get into front end development. I would not be like, Okay, first you gotta start moo tools or GWT. 
I don't think I would do that but it I think at a academic level or just in terms of seeing how things became the way they are sure, for sure it's interesting. [00:37:59] Victor: Yeah. And I, I, think another thing I don't remember who asked or why, why I had to think of this lately. um but it was, knowing the differentiators between other technologies is also extremely helpful right? So, What's the difference between ES build and SWC, right? Again, we're, we're, we're leaning heavy front end, but you know, just like these, uh, sorry for context, of course, it's not everyone a front end developer, but these are two different, uh, build tools, right? For, for JavaScript, right? Essentially you can think of 'em as transpilers, but they, I think, you know, I think they also bundle like, uh, generally I'm not exactly sure if, if ESbuild will bundle as well. Um, but it's like one is written in go, the other one's written in Rust, right? And sort of there's, um, there's, in addition, there's vite which is like vite does bundle and vite does a lot of things. Like, like there's a lot of innovation in vite that has to have to do with like, making local development as fast as possible and also getting like, you're sort of making sure as many things as possible are strippable, right? Or, or, or tree shakeable. Sorry, is is is the better, is the better term. Um, but yeah, knowing, knowing the, um, the differences between projects is often enough to sort of make it less confusing for me. Um, as far as like, Oh, which one of these things should I use? You know, outside of just going with what people are recommending. Cause generally there is some people with wisdom sometimes lead the crowd sometimes, right? So, so sometimes it's okay to be, you know, a crowd member as long as you're listening to the, to, to someone worth listening to. Um, and, and so yeah, I, I think that's another thing that is like the mark of a good project or, or it's not exclusive, right? It's not, the condition's not necessarily sufficient, but it's like a good projects have the why use this versus x right section in the Readme, right? They're like, Hey, we know you could use Y but here's why you should use us instead. Or we know you could use X, but here's what we do better than X. That might, you might care about, right? That's, um, a, a really strong indicator of a project. That's good cuz that means the person who's writing the project is like, they've done this, the survey. And like, this is kind of like, um, how good research happens, right? It's like most of research is reading what's happening, right? To knowing, knowing the boundary you're about to push, right? Or try and sort of like push one, make one step forward in, um, so that's something that I think the, the rigor isn't in necessarily software development everywhere, right? Which is good and bad. but someone who's sort of done that sort of rigor or, and like, and, and has, and or I should say, has been rigorous about knowing the boundary, and then they can explain that to you. They can be like, Oh, here's where the boundary was. These people were doing this, these people were doing this, these people were doing this, but I wanna do this. So you just learned now whether it's right for you and sort of the other points in the space, which is awesome. Yeah. 
Going to your point, I feel like that's, that's also important, it's probably not a good idea to try and get everyone to go through historical artifacts, but if just a, a quick explainer and sort of, uh, note on the differentiation, Could help for sure. Yeah. I feel like we've skewed too much frontend. No, no more frontend discussion this point. [00:41:20] Jeremy: It's just like, I, I think there's so many more choices where the, the mental thought that has to go into, Okay, what do I use next I feel is bigger on frontend. I guess it depends on the project you're working on but if you're going to work on anything front end if you haven't done it before or you don't have a lot of experience there's so many build tools so many frameworks, so many libraries that yeah, but we [00:41:51] Victor: Iterate yeah, in every direction, like the, it's good and bad, but frontend just goes in every direction at the same time Like, there's so many people who are so enthusiastic and so committed and and it's so approachable that like everyone just goes in every direction at the same time and like a lot of people make progress and then unfortunately you have try and pick which, which branch makes sense. [00:42:20] Jeremy: We've been kind of talking about, some of your experiences with a few things and I wonder if you could explain the the context you're thinking of in terms of the types of projects you typically work on like what are they what's the scale of them that sort of thing. [00:42:32] Victor: So I guess I've, I've gone through a lot of phases, right? In sort of what I use in in my tooling and what I thought was cool. I wrote enterprise java like everybody else. Like, like it really doesn't talk about it, but like, it's like almost at some point it was like, you're either a rail shop or a Java shop, for so many people. And I wrote enterprise Java for a, a long time, and I was lucky enough to have friends who were really into, other kinds of computing and other kinds of programming. a lot of my projects were wrapped around, were, were ideas that I was expressing via some new technology, let's say. Right? So, I wrote a lot of haskell for, for, for a while, right? But what did I end up building with that was actually a job board that honestly didn't go very far because I was spending much more time sort of doing, haskell things, right? And so I learned a lot about sort of what I think is like the pinnacle of sort of like type development in, in the non-research world, right? Like, like right on the edge of research and actual usability. But a lot of my ideas, sort of getting back to the, the ideas question are just things I want to build for myself. Um, or things I think could be commercially viable or like do, like, be, be well used, uh, and, and sort of, and profitable things, things that I think should be built. Or like if, if I see some, some projects as like, Oh, I wish they were doing this in this way, Right? Like, I, I often consider like, Oh, I want, I think I could build something that would be separate and maybe do like, inspired from other projects, I should say, Right? Um, and sort of making me understand a sort of a different, a different ecosystem. but a lot of times I have to say like, the stuff I build is mostly to scratch an itch I have. Um, and or something I think would be profitable or utilizing technology that I've seen that I don't think anyone's done in the same way. Right? 
So like learning Kubernetes for example, or like investing the time to learn Kubernetes opened up an entire world of sort of like infrastructure ideas, right? Because like the leverage you get is so high, right? So you're just like, Oh, I could run an aws, right? Like now that I, now that I know this cuz it's like, it's actually not bad, it's kind of usable. Like, couldn't I do that? Right? That kind of thing. Right? Or um, I feel like a lot of the times I'll learn a technology and it'll, it'll make me feel like certain things are possible that they, that weren't before. Uh, like Rust is another one of those, right? Like, cuz like Rust will go from like embedded all the way to WASM, which is like a crazy vertical stack. Right? It's, that's a lot, That's a wide range of computing that you can, you can touch, right? And, and there's, it's, it's hard to learn, right? The, the, the, the, uh, the, the ramp to learning it is quite steep, but, it opens up a lot of things you can write, right? It, it opens up a lot of areas you can go into, right? Like, if you ever had an idea for like a desktop app, right? You could actually write it in Rust. There's like, there's, there's ways, there's like is and there's like, um, Tauri is one of my personal favorites, which uses web technology, but it's either I'm inspired by some technology and I'm just like, Oh, what can I use this on? And like, what would this really be good at doing? or it's, you know, it's one of those other things, like either I think it's gonna be, Oh, this would be cool to build and it would be profitable. Uh, or like, I'm scratching my own itch. Yeah. I think, I think those are basically the three sources. [00:46:10] Jeremy: It's, it's interesting about Rust where it seems so trendy, I guess, in lots of people wanna do something with rust, but then in a lot of they also are not sure does it make sense to write in rust? Um, I, I think the, the embedded stuff, of course, that makes a lot of sense. And, uh, you, you've seen a sort of surge in command line apps, stuff ripgrep and ag, stuff like that, and places like that. It's, I think the benefits are pretty clear in terms of you've got the performance and you have the strong typing and whatnot and I think where there's sort of the inbetween section that's kind of unclear to me at least would I build a web application in rust I'm not sure that sort of thing [00:47:12] Victor: Yeah. I would, I characterize it as kind of like, it's a tool toolkit, so it really depends on the problem. And think we have many tools that there's no, almost never a real reason to pick one in particular right? Like there's, Cause it seems like just most of, a lot of the work, like, unless you're, you're really doing something interesting, right? Like, uh, something that like, oh, I need to, I need to, like, I'm gonna run, you know, billions and billions of processes. Like, yeah, maybe you want erlang at that point, right? Like, maybe, maybe you should, that should be, you know, your, your thing. Um, but computers are so fast these days, and most languages have, have sort of borrowed, not borrowed, but like adopted features from others that there's, it's really hard to find a, a specific use case, for one particular tool. Uh, so I often just categorize it by what I want out of the project, right? Or like, either my goals or project goals, right? Depending on, and, or like business goals, if you're, you know, doing this for a business, right? 
Um, so like, uh, I, I basically, if I want to go fast and I want to like, you know, reduce time to market, I use type script, right? Oh, and also I'm a, I'm a, like a type zealot. I, I'd say so. Like, I don't believe in not having types, right? Like, it's just like there's, I think it's crazy that you would like have a function but not know what the inputs could be. And they could actually be anything, right? , you're just like, and then you have to kind of just keep that in your head. I think that's silly. Now that we have good, we, we have, uh, ways to avoid the, uh, ceremony, right? You've got like hindley Milner type systems, like you have a way to avoid the, you can, you know, predict what types of things will be, and you can, you don't have to write everything everywhere. So like, it's not that. But anyway, so if I wanna go fast, the, the point is that going back to that early, like the JS ecosystem goes everywhere at the same time. Typescript is excellent because the ecosystem goes everywhere at the same time. And so you've got really good ecosystem support for just about everything you could do. Um, uh, you could write TypeScript that's very loose on the types and go even faster, but in general it's not very hard. There's not too much ceremony and just like, you know, putting some stuff that shows you what you're using and like, you know, the objects you're working with. and then generally if I wanna like, get it really right, I I'll like reach for haskell, right? Cause it's just like the sort of contortions, and again, this takes time, this not fast, but, right. the contortions you can do in the type system will make it really hard to write incorrect code or code that doesn't, that isn't logical with itself. Of course interfacing with the outside world. Like if you do a web request, it's gonna fail sometimes, right? Like the network might be down, right? So you have to, you basically pull that, you sort of wrap that uncertainty in your system to whatever degree you're okay with. And then, but I know it'll be correct, right? But and correctness is just not important. Most of like, Oh, I should , that's a bad quote. Uh, it's not that correct is not important. It's like if you need to get to market, you do not necessarily need every single piece of your code to be correct, Right? If someone calls some, some function with like, negative one and it's not an important, it's not tied to money or it's like, you know, whatever, then maybe it's fine. They just see an error and then like you get an error in your back and you're like, Oh, I better fix that. Right? Um, and then generally if I want to be correct and fast, I choose rust these days. Right? Um, these days. and going back to your point, a lot of times that means that I'm going to write in Typescript for a lot of projects. So that's what I'll do for a lot of projects is cuz I'll just be like, ah, do I need like absolute correctness or like some really, you know, fancy sort of type stuff. No. So I don't pick haskell. Right. And it's like, do I need to be like mega fast? No, probably not. Cuz like, cuz so I don't necessarily don't necessarily need rust. Um, maybe it's interesting to me in terms of like a long, long term thing, right? Like if I, if I'm think, oh, but I want x like for example, tight, tight, uh, integration with WASM, for example, if I'm just like, oh, I could see myself like, but that's more of like, you know, for a fun thing that I'm doing, right? Like, it's just like, it's, it's, you don't need it. 
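(Aside, not Victor's code: a small TypeScript sketch of the two points above, a function whose inputs are actually known, and a web request whose possible failure is wrapped into the return type rather than kept in your head. The URL and types are made up for illustration.)

```typescript
// The failure is part of the return type instead of something the caller has to remember.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

interface User {
  id: number;
  name: string;
}

// fetch() can always fail (network down, non-2xx, bad JSON), so the signature says so.
async function fetchUser(id: number): Promise<Result<User, string>> {
  try {
    const res = await fetch(`https://api.example.com/users/${id}`); // placeholder URL
    if (!res.ok) return { ok: false, error: `HTTP ${res.status}` };
    const body = (await res.json()) as User;
    return { ok: true, value: body };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Callers are pushed to handle both cases before touching the value.
async function main() {
  const result = await fetchUser(42);
  if (result.ok) {
    console.log(result.value.name);
  } else {
    console.error(`request failed: ${result.error}`);
  }
}

main();
```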
You don't, that's premature, like, you know, that's a premature optimization thing. But if I'm just like, ah, I really want the ability to like maybe consider refactoring some of this out into like a WebAssembly thing later, then I'm like, Okay, maybe, maybe I'll, I'll pick Rust. Or like, if I, if I like, I do want, you know, really, really fast, then I'll like, then I'll go Rust. But most of the time it's just like, I want a good ecosystem so I don't have to build stuff myself most of the time. Uh, and you know, type script is good enough. So my stack ends up being a lot of the time just in type script, right? Yeah. [00:52:05] Jeremy: Yeah, I think you've encapsulated the reason why there's so many packages on NPM and why there's so much usage of JavaScript and TypeScript in general is that it, it, it fits the, it's good enough. Right? And in terms of, in terms of speed, like you said, most of the time you don't need of rust. Um, and so typescript I think is a lot more approachable a lot of people have to use it because they do front end work anyways. And so that kinda just becomes the I don't know if I should say the default but I would say it's probably the most common in terms of when somebody's building a backend today certainly there's other languages but JavaScript and TypeScript is everywhere. [00:52:57] Victor: Yeah. Uh, I, I, I, another thing is like, I mean, I'm, of ignored the, like, unreasonable effectiveness of like rails Cause there's just a, there's tons of just like rails warriors out there, and that's great. They're they're fantastic. I'm not a, I'm not personally a huge fan of rails but that's, uh, that's to my own detriment, right? In, in some, in some ways. But like, Rails and Django sort of just like, people who, like, I'm gonna learn this framework it's gonna be excellent. It most, they have a, they have carved out a great ecosystem for themselves. Um, or like, you know, even php right? PHP and like Laravel, or whatever. Uh, and so I'm ignoring those, like, those pockets of productivity, right? Those pockets of like intense productivity that people like, have all their needs met in that same way. Um, but as far as like general, general sort of ecosystem size and speed for me, um, like what you said, like applies to me. Like if I, if I'm just like, especially if I'm just like, Oh, I just wanna build a backend, Like, I wanna build something that's like super small and just does like, you know, maybe a few, a couple, you know, endpoints or whatever and just, I just wanna throw it out there. Right? Uh, I, I will pick, yeah. Typescript. It just like, it makes sense to me. I also think note is a better. VM or platform to build on than any of the others as well. So like, like I, by any of the others, I mean, Python, Perl, Ruby, right? Like sort of in the same class of, of tool. So I I am kind of convinced that, um, Node is better, than those as far as core abilities, right? Like threading Right. Versus the just multi-processing and like, you know, other, other, other solutions and like, stuff like that. So, if you want a boring stack, if I don't wanna use any tokens, right? Any innovation tokens I reach for TypeScript. [00:54:46] Jeremy: I think it's good that you brought up. 
Rails and Django, because personally I've done work with Rails, and you're right that Rails has so much built in, and the ways to do things are so well established, that your ability to be productive and build something really fast is hard to compete with, at least in my experience, using what's available in the Node ecosystem. On the other hand, I also see what you mean about the runtimes. With Node you're built on top of V8, and there are so many resources being poured into making it fast and making it run pretty much everywhere. You probably don't do too much work with managed services, but if you go to a managed service to run your code, like a platform as a service, they're going to support Node. Will they support your other preferred language? Maybe, maybe not, but you know they'll be able to run Node apps. So yeah, I don't know if it will ever happen, or maybe I'm just not familiar with it, but I feel like there isn't a real Rails of JavaScript.
[00:56:14] Victor: Yeah, you're totally right. It's actually weird that there isn't, but I kind of agree with you. There are projects trying it recently. There's Adonis, and there are backends that will also do basic templating, like Nest; NestJS is really excellent, one of the best backend projects out there. But back in the day there were projects like Sails, which was very much trying to do exactly what Rails did, and it just didn't seem to take off and reach that critical mass, possibly because of the size of the ecosystem. How many alternatives to Rails are there? Not many, and maybe the rest of them sort of died out over the years. There's also hapi, which was similarly angling to be that, but it never found the traction it needed, I think, or at least never became as widely known as Rails is for the Ruby ecosystem. And people have to get to know the magic, because I feel like you're only productive in Rails once you imbibe the magic: you know all the magic context, you know the incantations, and they're comforting to you. If you're living and breathing the convention, everything's amazing; you can't beat that, you're in the zone. But you need people to get into that zone, and I don't think Node has that. People are too frazzled; there are too many options; it's hard to commit. Imagine if you'd committed to Backbone: it's over. Well, it's not over, and I don't want to disparage the Backbone project; I don't use it, and I'm sure people are still working on it. The point is that it's hard to commit and really imbibe that convention, to make yourself breathe that product, when there are ten products that are kind of similar and could be useful as well. I think that's kind of big. It's weird that there isn't a Rails for Node.js, but people are working on it, obviously; like I mentioned, Adonis, and there's more.
I'm leaving a bunch of them out, but that's part of the problem.
[00:58:52] Jeremy: On one hand it's really cool that people are trying so many different things, because hopefully they can find something that other people wouldn't have thought of if they all stuck to the same framework. But on the other hand, how much time have we spent jumping between all these different frameworks, compared with what we could have had if we had a Rails?
[00:59:23] Victor: Yeah, the sort of wasted time is crazy to think about. I do think about that from time to time. And personally I waste a lot of my own time. Like, just, just rec
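For a concrete picture of the two points Victor makes above, type inference cutting down on ceremony and wrapping the uncertainty of the outside world in the type system, here is a minimal TypeScript sketch. It is illustrative only, not code from the episode, and the function names are invented.

```typescript
// The compiler infers parsePort's return type (number | null), so annotations
// only appear where they add information rather than as ceremony.
function parsePort(raw: string) {
  const n = Number.parseInt(raw, 10);
  return Number.isNaN(n) ? null : n;
}

// "Wrapping the uncertainty": a discriminated union makes callers handle the
// failure case explicitly instead of keeping it in their heads.
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

async function fetchJson(url: string): Promise<Result<unknown>> {
  try {
    const res = await fetch(url);
    if (!res.ok) {
      return { ok: false, error: new Error(`HTTP ${res.status}`) };
    }
    return { ok: true, value: await res.json() };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err : new Error(String(err)) };
  }
}
```

Callers then branch on `ok` rather than scattering try/catch blocks through the codebase.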

Vector Podcast
Grant Ingersoll - Fractional CTO, Leading Search Consultant - Engineering Better Search


Jun 9, 2022 · 72:42


Vector Podcast Live

Topics:
00:00 Kick-off introducing co:rise study platform
03:03 Grant's background
04:58 Principle of 3 C's in the life of a CTO: Code, Conferences and Customers
07:16 Principle of 3 C's in the Search Engine development: Content, Collaboration and Context
11:51 Balance between manual tuning in pursuit to learn and Machine Learning
15:42 How to nurture intuition in building search engine algorithms
18:51 How to change the approach of organizations to true experimentation
23:17 Where should one start in approaching the data (like click logs) for developing a search engine
29:36 How to measure the success of your search engine
33:50 The role of manual query rating to improve search result relevancy
36:56 What are the available datasets, tools and algorithms, that allow us to build a search engine?
41:56 Vector search and its role in broad search engine development and how the profession is shaping up
49:01 The magical question of WHY: what motivates Grant to stay in the space
52:09 Announcement from Grant: course discount code DGSEARCH10
54:55 Questions from the audience

Show notes:
- Grant's interview at Berlin Buzzwords 2016: https://www.youtube.com/watch?v=Y13gZM5EGdc
- "BM25 is so Yesterday: Modern Techniques for Better Search": https://www.youtube.com/watch?v=CRZfc9lj7Po
- "Taming text" - book co-authored by Grant: https://www.manning.com/books/taming-text
- Search Fundamentals course - https://corise.com/course/search-fundamentals
- Search with ML course - https://corise.com/course/search-with-machine-learning
- Click Models for Web Search: https://github.com/markovi/PyClick
- Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing, book by Ron Kohavi et al: https://www.amazon.com/Trustworthy-Online-Controlled-Experiments-Practical-ebook/dp/B0845Y3DJV
- Quepid, open source tool and free service for query rating and relevancy tuning: https://quepid.com/
- Grant's talk in 2013 where he discussed the need of a vector field in Lucene and Solr: https://www.youtube.com/watch?v=dCCqauwMWFE
- CLIP model for multimodal search: https://openai.com/blog/clip/
- Demo of multimodal search with CLIP: https://blog.muves.io/multilingual-and-multimodal-vector-search-with-hardware-acceleration-2091a825de78
- Learning to Boost: https://www.youtube.com/watch?v=af1dyamySCs
- Dmitry's Medium List on Vector Search: https://medium.com/@dmitry-kan/list/vector-search-e9b564d14274
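Several of the linked talks revolve around BM25-style relevance scoring. For readers who have not run into BM25 before, here is a toy TypeScript scorer that shows the shape of the formula; this is background context added here, not code or material from the episode.

```typescript
// Toy BM25 scorer: docs and query are pre-tokenized; k1 and b are the usual
// free parameters. Assumes a non-empty document collection.
function bm25Scores(docs: string[][], query: string[], k1 = 1.2, b = 0.75): number[] {
  const N = docs.length;
  const avgdl = docs.reduce((sum, d) => sum + d.length, 0) / N;

  // Document frequency for each distinct query term.
  const df = new Map<string, number>();
  for (const term of new Set(query)) {
    df.set(term, docs.filter((d) => d.includes(term)).length);
  }
  const idf = (term: string) => {
    const n = df.get(term) ?? 0;
    return Math.log((N - n + 0.5) / (n + 0.5) + 1); // Lucene-style IDF, never negative
  };

  return docs.map((doc) => {
    let score = 0;
    for (const term of query) {
      const tf = doc.filter((t) => t === term).length; // term frequency in this doc
      if (tf === 0) continue;
      score +=
        idf(term) * (tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgdl));
    }
    return score;
  });
}
```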

Engineering Kiosk
#19 Datenbank-Deepdive (oder das Ende einer Ära): von Redis bis ClickHouse


May 17, 2022 · 64:02


The second database deep dive on the Engineering Kiosk.
We indirectly pick up where episode 8 on databases left off. This time, though, we start at the very beginning: from hierarchical databases, through object-oriented databases, on to SQL, and finally to the NoSQL and column-oriented database era. Along the way we answer questions such as what the difference between databases and files is, whether OOP databases are still a hype, what indexes are and how they work, why migrating away from Oracle can be difficult, whether Lucene is a database, and much, much more.
Bonus: what pumpkin seeds have to do with databases, and why MySQL is a better address book with a SQL interface.
Feedback to stehtisch@engineeringkiosk.dev or via Twitter to https://twitter.com/EngKiosk

Links:
IBM Mainframes: https://www.ibm.com/de-de/it-infrastructure/z
ClickHouse: https://github.com/ClickHouse/ClickHouse / https://clickhouse.com/
Oracle Cloud Free Tier: https://www.oracle.com/de/cloud/free/
Apache Lucene: https://lucene.apache.org/
Apache Solr: https://solr.apache.org/
ElasticSearch: https://github.com/elastic/elasticsearch
List of database management systems: https://de.wikipedia.org/wiki/Liste_der_Datenbankmanagementsysteme
IBM Go fork for mainframes: https://github.com/linux-on-ibm-z/go
DB4O: https://de.wikipedia.org/wiki/Db4o
Michael Stonebraker / The End of an Architectural Era (It's Time for a Complete Rewrite): http://nms.csail.mit.edu/~stavros/pubs/hstore.pdf
Percona: https://www.percona.com/
2ndquadrant: https://www.2ndquadrant.com/
OSS Names: https://github.com/EngineeringKiosk/OSS-Names
Redis: https://github.com/redis/redis
RedisLabs: https://redis.com/
antirez: http://antirez.com/
RocksDB: http://rocksdb.org/
ElasticSearch: https://github.com/elastic/elasticsearch
LevelDB: https://github.com/google/leveldb
MyRocks: http://myrocks.io/

Chapter markers:
(00:00:00) Intro
(00:00:55) Math professors and pumpkin-seed rolls
(00:02:27) Why databases are a topic close to Wolfgang's heart
(00:04:08) What is a database, and when do you use one?
(00:06:34) Are plain files also a database?
(00:07:25) Hierarchical database systems: IBM IMS
(00:09:30) IBM mainframes, Go, Docker, and horizontal scaling
(00:11:30) What would be a use case for hierarchical and object-oriented databases?
(00:16:15) Have you ever used an object-oriented database in a project?
(00:16:52) Separating data and application logic, and SQL as basic database knowledge
(00:18:55) What is the difference between SQL databases and files?
(00:19:32) Database indexes: data duplication, read and write access
(00:23:54) Is an Excel file a database?
(00:24:58) Difference between files and databases: use by multiple users
(00:28:03) Recovery, persistent and consistent storage with files and databases
(00:31:01) Relational databases are really the classic databases
(00:34:31) Proprietary databases: migrating from Oracle to PostgreSQL
(00:37:06) Oracle Cloud and the free tier
(00:38:29) MySQL was taken over by Oracle, and MariaDB as an alternative
(00:39:48) Logic in the database, Oracle migrations, and application servers
(00:41:10) Is there a killer argument for proprietary databases?
(00:43:57) Where do the names MySQL and MariaDB come from?
(00:45:19) Is ElasticSearch a database in the classic sense?
(00:46:38) Are Redis and other key-value stores databases?
(00:48:42) NoSQL is for kids, feature-itis, simple databases, and LevelDB / RocksDB and MyRocks
(00:53:19) What are column-oriented databases, and when should they be used? Analytical databases and ClickHouse from Yandex
(00:58:15) Which questions matter for finding the right database for me?
(01:01:43) Feedback on the topic of databases, and outro

Hosts:
Wolfgang Gassler (https://twitter.com/schafele)
Andy Grunwald (https://twitter.com/andygrunwald)
Engineering Kiosk Podcast: inquiries to stehtisch@engineeringkiosk.dev or via Twitter to https://twitter.com/EngKiosk
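One recurring question in the episode is what a database index actually buys you (see the 00:19:32 chapter marker above). As a toy illustration, not code from the show, the trade-off looks roughly like this: the index is a duplicated copy of the data that makes reads cheap at the cost of slightly more expensive writes.

```typescript
// Toy table with a secondary index on email (names are invented for the sketch).
type Row = { id: number; email: string };

class Table {
  private rows: Row[] = [];
  private byEmail = new Map<string, Row>(); // the "index": duplicated data

  insert(row: Row): void {
    this.rows.push(row);               // write the row itself
    this.byEmail.set(row.email, row);  // ...and maintain the index (extra write cost)
  }

  findByEmailScan(email: string): Row | undefined {
    return this.rows.find((r) => r.email === email); // O(n) full scan
  }

  findByEmailIndexed(email: string): Row | undefined {
    return this.byEmail.get(email); // O(1) lookup via the index
  }
}
```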

Screaming in the Cloud
Keeping the Chaos Searchable with Thomas Hazel


Nov 30, 2021 · 44:43


About Thomas
Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology, and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire, is a Hall of Fame Alumni Inductee, and founded both student and professional chapters of the Association for Computing Machinery (ACM).

Links:
ChaosSearch: https://www.chaossearch.io

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by my friends at ThinkstCanary. Most companies find out way too late that they've been breached. ThinkstCanary changes this, and I love how they do it. Deploy canaries and canary tokens in minutes and then forget about them. What's great is the attackers tip their hand by touching them, giving you one alert when it matters. I use it myself, and I only remember this when I get the weekly update with a "we're still here, so you're aware" from them. It's glorious! There is zero admin overhead to this; there are effectively no false positives unless I do something foolish. Canaries are deployed and loved on all seven continents. You can check out what people are saying at canary.love. And their kubeconfig canary token is new and completely free as well. You can do an awful lot without paying them a dime, which is one of the things I love about them. It is useful stuff and not an "ohh, I wish I had money." It is spectacular! Take a look; that's canary.love, because it's genuinely rare to find a security product that people talk about in terms of love. It really is a unique thing to see. Canary.love. Thank you to ThinkstCanary for their support of my ridiculous, ridiculous nonsense.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R, because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high-performance cloud compute at a price that—while sure, they claim it's better than AWS pricing—when they say that, they mean it is less money. Sure, I don't dispute that, but what I find interesting is that it's predictable. They tell you in advance, on a monthly basis, what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own.
Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive a $100 in credit. Thats v-u-l-t-r.com slash screaming.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is brought to us by our friends at ChaosSearch.We've been working with them for a long time; they've sponsored a bunch of our nonsense, and it turns out that we've been talking about them to our clients since long before they were a sponsor because it actually does what it says on the tin. Here to talk to us about that in a few minutes is Thomas Hazel, ChaosSearch's CTO and founder. First, Thomas, nice to talk to you again, and as always, thanks for humoring me.Thomas: [laugh]. Hi, Corey. Always great to talk to you. And I enjoy these conversations that sometimes go up and down, left and right, but I look forward to all the fun we're going to have.Corey: So, my understanding of ChaosSearch is probably a few years old because it turns out, I don't spend a whole lot of time meticulously studying your company's roadmap in the same way that you presumably do. When last we checked in with what the service did-slash-does, you are effectively solving the problem of data movement and querying that data. The idea behind data warehouses is generally something that's shoved onto us by cloud providers where, “Hey, this data is going to be valuable to you someday.” Data science teams are big proponents of this because when you're storing that much data, their salaries look relatively reasonable by comparison. And the ChaosSearch vision was, instead of copying all this data out of an object store and storing it on expensive disks, and replicating it, et cetera, what if we queried it in place in a somewhat intelligent manner?So, you take the data and you store it, in this case, in S3 or equivalent, and then just query it there, rather than having to move it around all over the place, which of course, then incurs data transfer fees, you're storing it multiple times, and it's never in quite the format that you want it. That was the breakthrough revelation, you were Elasticsearch—now OpenSearch—API compatible, which was great. And that was, sort of, a state of the art a year or two ago. Is that generally correct?Thomas: No, you nailed our mission statement. No, you're exactly right. You know, the value of cloud object stores, S3, the elasticity, the durability, all these wonderful things, the problem was you couldn't get any value out of it, and you had to move it out to these siloed solutions, as you indicated. So, you know, our mission was exactly that, transformed customers' cloud storage into an analytical database, a multi-model analytical database, where our first use case was search and log analytics, replacing the ELK stack and also replacing the data pipeline, the schema management, et cetera. We automate the entire step, raw data to insights.Corey: It's funny we're having this conversation today. Earlier, today, I was trying to get rid of a relatively paltry 200 gigs or so of small files on an EFS volume—you know, Amazon's version of NFS; it's like an NFS volume except you're paying Amazon for the privilege—great. And it turns out that it's a whole bunch of operations across a network on a whole bunch of tiny files, so I had to spin up other instances that were not getting backed by spot terminations, and just firing up a whole bunch of threads. 
So, now the load average on that box is approaching 300, but it's plowing through, getting rid of that data finally.And I'm looking at this saying this is a quarter of a terabyte. Data warehouses are in the petabyte range. Oh, I begin to see aspects of the problem. Even searching that kind of data using traditional tooling starts to break down, which is sort of the revelation that Google had 20-some-odd years ago, and other folks have since solved for, but this is the first time I've had significant data that wasn't just easily searched with a grep. For those of you in the Unix world who understand what that means, condolences. We're having a support group meeting at the bar.Thomas: Yeah. And you know, I always thought, what if you could make cloud object storage like S3 high performance and really transform it into a database? And so that warehouse capability, that's great. We like that. However to manage it, to scale it, to configure it, to get the data into that, was the problem.That was the promise of a data lake, right? This simple in, and then this arbitrary schema on read generic out. The problem next came, it became swampy, it was really hard, and that promise was not delivered. And so what we're trying to do is get all the benefits of the data lake: simple in, so many services naturally stream to cloud storage. Shoot, I would say every one of our customers are putting their data in cloud storage because their data pipeline to their warehousing solution or Elasticsearch may go down and they're worried they'll lose the data.So, what we say is what if you just said activate that data lake and get that ELK use case, get that BI use case without that data movement, as you indicated, without that ETL-ing, without that data pipeline that you're worried is going to fall over. So, that vision has been Chaos. Now, we haven't talked in, you know, a few years, but this idea that we're growing beyond what we are just going after logs, we're going into new use cases, new opportunities, and I'm looking forward to discussing with you.Corey: It's a great answer that—though I have to call out that I am right there with you as far as inappropriately using things as databases. I know that someone is going to come back and say, “Oh, S3 is a database. You're dancing around it. Isn't that what Athena is?” Which is named, of course, after the Greek Goddess of spending money on AWS? And that is a fair question, but to my understanding, there's a schema story behind that does not apply to what you're doing.Thomas: Yeah, and that is so crucial is that we like the relational access. The time-cost complexity to get it into that, as you mentioned, scaled access, I mean, it could take weeks, months to test it, to configure it, to provision it, and imagine if you got it wrong; you got to redo it again. And so our unique service removes all that data pipeline schema management. And because of our innovation because of our service, you do all schema definition, on the fly, virtually, what we call views on your index data, that you can publish an elastic index pattern for that consumption, or a relational table for that consumption. And that's kind of leading the witness into things that we're coming out with this quarter into 2022.Corey: I have to deal with a little bit of, I guess, a shame here because yeah, I'm doing exactly what you just described. 
I'm using Athena to wind up querying our customers' Cost and Usage Reports, and we spend a couple hundred bucks a month on AWS Glue to wind up massaging those into the way that they expect it to be. And it's great. Ish. We hook it up to Tableau and can make those queries from it, and all right, it's great.It just, burrr goes the money printer, and we somehow get access and insight to a lot of valuable data. But even that is knowing exactly what the format is going to look like. Ish. I mean, Cost and Usage Reports from Amazon are sort of aspirational when it comes to schema sometimes, but here we are. And that's been all well and good.But now the idea of log files, even looking at the base case of sending logs from an application, great. Nginx, or Apache, or [unintelligible 00:07:24], or any of the various web servers out there all tend to use different logging formats just to describe the same exact things, start spreading that across custom in-house applications and getting signal from that is almost impossible. “Oh,” people say, “So, we'll use a structured data format.” Now, you're putting log and structuring requirements on application developers who don't care in the first place, and now you have a mess on your hands.Thomas: And it really is a mess. And that challenge is, it's so problematic. And schemas changing. You know, we have customers and one reasons why they go with us is their log data is changing; they didn't expect it. Well, in your data pipeline, and your Athena database, that breaks. That brings the system down.And so our system uniquely detects that and manages that for you and then you can pick and choose how you want to export in these views dynamically. So, you know, it's really not rocket science, but the problem is, a lot of the technology that we're using is designed for static, fixed thinking. And then to scale it is problematic and time-consuming. So, you know, Glue is a great idea, but it has a lot of sharp [pebbles 00:08:26]. Athena is a great idea but also has a lot of problems.And so that data pipeline, you know, it's not for digitally native, active, new use cases, new workloads coming up hourly, daily. You think about this long-term; so a lot of that data prep pipelining is something we address so uniquely, but really where the customer cares is the value of that data, right? And so if you're spending toils trying to get the data into a database, you're not answering the questions, whether it's for security, for performance, for your business needs. That's the problem. And you know, that agility, that time-to-value is where we're very uniquely coming in because we start where your data is raw and we automate the process all the way through.Corey: So, when I look at the things that I have stuffed into S3, they generally fall into a couple of categories. There are a bunch of logs for things I never asked for nor particularly wanted, but AWS is aggressive about that, first routing through CloudTrail so you can get charged 50-cent per gigabyte ingested. Awesome. And of course, large static assets, images I have done something to enter colloquially now known as shitposts, which is great. Other than logs, what could you possibly be storing in S3 that lends itself to, effectively, the type of analysis that you built around this?Thomas: Well, our first use case was the classic log use cases, app logs, web service logs. 
I mean, CloudTrail, it's famous; we had customers that gave up on elastic, and definitely gave up on relational where you can do a couple changes and your permutation of attributes for CloudTrail is going to put you to your knees. And people just say, “I give up.” Same thing with Kubernetes logs. And so it's the classic—whether it's CSV, where it's JSON, where it's log types, we auto-discover all that.We also allow you, if you want to override that and change the parsing capabilities through a UI wizard, we do discover what's in your buckets. That term data swamp, and not knowing what's in your bucket, we do a facility that will index that data, actually create a report for you for knowing what's in. Now, if you have text data, if you have log data, if you have BI data, we can bring it all together, but the real pain is at the scale. So classically, app logs, system logs, many devices sending IoT-type streams is where we really come in—Kubernetes—where they're dealing with terabytes of data per day, and managing an ELK cluster at that scale. Particularly on a Black Friday.Shoot, some of our customers like—Klarna is one of them; credit card payment—they're ramping up for Black Friday, and one of the reasons why they chose us is our ability to scale when maybe you're doing a terabyte or two a day and then it goes up to twenty, twenty-five. How do you test that scale? How do you manage that scale? And so for us, the data streams are, traditionally with our customers, the well-known log types, at least in the log use cases. And the challenge is scaling it, is getting access to it, and that's where we come in.Corey: I will say the last time you were on the show a couple of years ago, you were talking about the initial logging use case and you were speaking, in many cases aspirationally, about where things were going. What a difference a couple years is made. Instead of talking about what hypothetical customers might want, or what—might be able to do, you're just able to name-drop them off the top of your head, you have scaled to approximately ten times the number of employees you had back then. You've—Thomas: Yep. Yep.Corey: —raised, I think, a total of—what, 50 million?—since then.Thomas: Uh, 60 now. Yeah.Corey: Oh, 60? Fantastic.Thomas: Yeah, yeah.Corey: Congrats. And of course, how do you do it? By sponsoring Last Week in AWS, as everyone should. I'm taking clear credit for that every time someone announces around, that's the game. But no, there is validity to it because telling fun stories and sponsoring exciting things like this only carry you so far. At some point, customers have to say, yeah, this is solving a pain that I have; I'm willing to pay you money to solve it.And you've clearly gotten to a point where you are addressing the needs of those customers at a pretty fascinating clip. It's bittersweet from my perspective because it seems like the majority of your customers have not come from my nonsense anymore. They're finding you through word of mouth, they're finding through more traditional—read as boring—ad campaigns, et cetera, et cetera. But you've built a brand that extends beyond just me. I'm no longer viewed as the de facto ombudsperson for any issue someone might have with ChaosSearch on Twitters. It's kind of, “Aww, the company grew up. What happened there?”Thomas: No, [laugh] listen, this you were great. We reached out to you to tell our story, and I got to be honest. A lot of people came by, said, “I heard something on Corey Quinn's podcasts,” or et cetera. 
And it came a long way now. Now, we have, you know, companies like Equifax, multi-cloud—Amazon and Google.They love the data lake philosophy, the centralized, where use cases are now available within days, not weeks and months. Whether it's logs and BI. Correlating across all those data streams, it's huge. We mentioned Klarna, [APM Performance 00:13:19], and, you know, we have Armor for SIEM, and Blackboard for [Observers 00:13:24].So, it's funny—yeah, it's funny, when I first was talking to you, I was like, “What if? What if we had this customer, that customer?” And we were building the capabilities, but now that we have it, now that we have customers, yeah, I guess, maybe we've grown up a little bit. But hey, listen to you're always near and dear to our heart because we remember, you know, when you stop[ed by our booth at re:Invent several times. And we're coming to re:Invent this year, and I believe you are as well.Corey: Oh, yeah. But people listening to this, it's if they're listening the day it's released, this will be during re:Invent. So, by all means, come by the ChaosSearch booth, and see what they have to say. For once they have people who aren't me who are going to be telling stories about these things. And it's fun. Like, I joke, it's nothing but positive here.It's interesting from where I sit seeing the parallels here. For example, we have both had—how we say—adult supervision come in. You have a CEO, Ed, who came over from IBM Storage. I have Mike Julian, whose first love language is of course spreadsheets. And it's great, on some level, realizing that, wow, this company has eclipsed my ability to manage these things myself and put my hands-on everything. And eventually, you have to start letting go. It's a weird growth stage, and it's a heck of a transition. But—Thomas: No, I love it. You know, I mean, I think when we were talking, we were maybe 15 employees. Now, we're pushing 100. We brought on Ed Walsh, who's an amazing CEO. It's funny, I told him about this idea, I invented this technology roughly eight years ago, and he's like, “I love it. Let's do it.” And I wasn't ready to do it.So, you know, five, six years ago, I started the company always knowing that, you know, I'd give him a call once we got the plane up in the air. And it's been great to have him here because the next level up, right, of execution and growth and business development and sales and marketing. So, you're exactly right. I mean, we were a young pup several years ago, when we were talking to you and, you know, we're a little bit older, a little bit wiser. But no, it's great to have Ed here. And just the leadership in general; we've grown immensely.Corey: Now, we are recording this in advance of re:Invent, so there's always the question of, “Wow, are we going to look really silly based upon what is being announced when this airs?” Because it's very hard to predict some things that AWS does. And let's be clear, I always stay away from predictions, just because first, I have a bit of a knack for being right. But also, when I'm right, people will think, “Oh, Corey must have known about that and is leaking,” whereas if I get it wrong, I just look like a fool. 
There's no win for me if I start doing the predictive dance on stuff like that.But I have to level with you, I have been somewhat surprised that, at least as of this recording, AWS has not moved more in your direction because storing data in S3 is kind of their whole thing, and querying that data through something that isn't Athena has been a bit of a reach for them that they're slowly starting to wrap their heads around. But their UltraWarm nonsense—which is just, okay, great naming there—what is the point of continually having a model where oh, yeah, we're going to just age it out, the stuff that isn't actively being used into S3, rather than coming up with a way to query it there. Because you've done exactly that, and please don't take this as anything other than a statement of fact, they have better access to what S3 is doing than you do. You're forced to deal with this thing entirely from a public API standpoint, which is fine. They can theoretically change the behavior of aspects of S3 to unlock these use cases if they chose to do so. And they haven't. Why is it that you're the only folks that are doing this?Thomas: No, it's a great question, and I'll give them props for continuing to push the data lake [unintelligible 00:17:09] to the cloud providers' S3 because it was really where I saw the world. Lakes, I believe in. I love them. They love them. However, they promote the move the data out to get access, and it seems so counterintuitive on why wouldn't you leave it in and put these services, make them more intelligent? So, it's funny, I've trademark ‘Smart Object Storage,' I actually trademarked—I think you [laugh] were a part of this—‘UltraHot,' right? Because why would you want UltraWarm when you can have UltraHot?And the reason, I feel, is that if you're using Parquet for Athena [unintelligible 00:17:40] store, or Lucene for Elasticsearch, these two index technologies were not designed for cloud storage, for real-time streaming off of cloud storage. So, the trick is, you have to build UltraWarm, get it off of what they consider cold S3 into a more warmer memory or SSD type access. What we did, what the invention I created was, that first read is hot. That first read is fast.Snowflake is a good example. They give you a ten terabyte demo example, and if you have a big instance and you do that first query, maybe several orders or groups, it could take an hour to warm up. The second query is fast. Well, what if the first query is in seconds as well? And that's where we really spent the last five, six years building out the tech and the vision behind this because I like to say you go to a doctor and say, “Hey, Doc, every single time I move my arm, it hurts.” And the doctor says, “Well, don't move your arm.”It's things like that, to your point, it's like, why wouldn't they? I would argue, one, you have to believe it's possible—we're proving that it is—and two, you have to have the technology to do it. Not just the index, but the architecture. So, I believe they will go this direction. You know, little birdies always say that all these companies understand this need.Shoot, Snowflake is trying to be lake-y; Databricks is trying to really bring this warehouse lake concept. But you still do all the pipelining; you still have to do all the data management the way that you don't want to do. It's not a lake. And so my argument is that it's innovation on why. 
Now, they have money; they have time, but, you know, we have a big head start.Corey: I remembered last year at re:Invent they released a, shall we say, significant change to S3 that it enabled read after write consistency, which is awesome, for again, those of us in the business of misusing things as databases. But for some folks, the majority of folks I would say, it was a, “I don't know what that means and therefore I don't care.” And that's fine. I have no issue with that. There are other folks, some of my customers for example, who are suddenly, “Wait a minute. This means I can sunset this entire janky sidecar metadata system that is designed to make sure that we are consistent in our use of S3 because it now does it automatically under the hood?” And that's awesome. Does that change mean anything for ChaosSearch?Thomas: It doesn't because of our architecture. We're append-only, write-once scenario, so a lot of update-in-place viewpoints. My viewpoint is that if you're seeing S3 as the database and you need that type of consistency, it make sense of why you'd want it, but because of our distributive fabric, our stateless architecture, our append-only nature, it really doesn't affect us.Now, I talked to the S3 team, I said, “Please if you're coming up with this feature, it better not be slower.” I want S3 to be fast, right? And they said, “No, no. It won't affect performance.” I'm like, “Okay. Let's keep that up.”And so to us, any type of S3 capability, we'll take advantage of it if benefits us, whether it's consistency as you indicated, performance, functionality. But we really keep the constructs of S3 access to really limited features: list, put, get. [roll-on 00:20:49] policies to give us read-only access to your data, and a location to write our indices into your account, and then are distributed fabric, our service, acts as those indices and query them or searches them to resolve whatever analytics you need. So, we made it pretty simple, and that is allowed us to make it high performance.Corey: I'll take it a step further because you want to talk about changes since the last time we spoke, it used to be that this was on top of S3, you can store your data anywhere you want, as long as it's S3 in the customer's account. Now, you're also supporting one-click integration with Google Cloud's object storage, which, great. That does mean though, that you're not dependent upon provider-specific implementations of things like a consistency model for how you've built things. It really does use the lowest common denominator—to my understanding—of object stores. Is that something that you're seeing broad adoption of, or is this one of those areas where, well, you have one customer on a different provider, but almost everything lives on the primary? I'm curious what you're seeing for adoption models across multiple providers?Thomas: It's a great question. We built an architecture purposely to be cloud-agnostic. I mean, we use compute in a containerized way, we use object storage in a very simple construct—put, get, list—and we went over to Google because that made sense, right? We have customers on both sides. I would say Amazon is the gorilla, but Google's trying to get there and growing.We had a big customer, Equifax, that's on both Amazon and Google, but we offer the same service. To be frank, it looks like the exact same product. And it should, right? Whether it's Amazon Cloud, or Google Cloud, multi-select and I want to choose either one and get the other one. 
I would say that different business types are using each one, but our bulk of the business isn't Amazon, but we just this summer released our SaaS offerings, so it's growing.And you know, it's funny, you never know where it comes from. So, we have one customer—actually DigitalRiver—as one of our customers on Amazon for logs, but we're growing in working together to do a BI on GCP or on Google. And so it's kind of funny; they have two departments on two different clouds with two different use cases. And so do they want unification? I'm not sure, but they definitely have their BI on Google and their operations in Amazon. It's interesting.Corey: You know its important to me that people learn how to use the cloud effectively. Thats why I'm so glad that Cloud Academy is sponsoring my ridiculous non-sense. They're a great way to build in demand tech skills the way that, well personally, I learn best which I learn by doing not by reading. They have live cloud labs that you can run in real environments that aren't going to blow up your own bill—I can't stress how important that is. Visit cloudacademy.com/corey. Thats C-O-R-E-Y, don't drop the “E.” Use Corey as a promo-code as well. You're going to get a bunch of discounts on it with a lifetime deal—the price will not go up. It is limited time, they assured me this is not one of those things that is going to wind up being a rug pull scenario, oh no no. Talk to them, tell me what you think. Visit: cloudacademy.com/corey,  C-O-R-E-Y and tell them that I sent you!Corey: I know that I'm going to get letters for this. So, let me just call it out right now. Because I've been a big advocate of pick a provider—I care not which one—and go all-in on it. And I'm sitting here congratulating you on extending to another provider, and people are going to say, “Ah, you're being inconsistent.”No. I'm suggesting that you as a provider have to meet your customers where they are because if someone is sitting in GCP and your entire approach is, “Step one, migrate those four petabytes of data right on over here to AWS,” they're going to call you that jackhole that you would be by making that suggestion and go immediately for option B, which is literally anything that is not ChaosSearch, just based upon that core misunderstanding of their business constraints. That is the way to think about these things. For a vendor position that you are in as an ISV—Independent Software Vendor for those not up on the lingo of this ridiculous industry—you have to meet customers where they are. And it's the right move.Thomas: Well, you just said it. Imagine moving terabytes and petabytes of data.Corey: It sounds terrific if I'm a salesperson for one of these companies working on commission, but for the rest of us, it sounds awful.Thomas: We really are a data fabric across clouds, within clouds. We're going to go where the data is and we're going to provide access to where that data lives. Our whole philosophy is the no-movement movement, right? Don't move your data. Leave it where it is and provide access at scale.And so you may have services in Google that naturally stream to GCS; let's do it there. Imagine moving that amount of data over to Amazon to analyze it, and vice versa. 2020, we're going to be in Azure. 
They're a totally different type of business, users, and personas, but you're getting asked, “Can you support Azure?” And the answer is, “Yes,” and, “We will in 2022.”So, to us, if you have cloud storage, if you have compute, and it's a big enough business opportunity in the market, we're there. We're going there. When we first started, we were talking to MinIO—remember that open-source, object storage platform?—We've run on our laptops, we run—this [unintelligible 00:25:04] Dr. Seuss thing—“We run over here; we run over there; we run everywhere.”But the honest truth is, you're going to go with the big cloud providers where the business opportunity is, and offer the same solution because the same solution is valued everywhere: simple in; value out; cost-effective; long retention; flexibility. That sounds so basic, but you mentioned this all the time with our Rube Goldberg, Amazon diagrams we see time and time again. It's like, if you looked at that and you were from an alien planet, you'd be like, “These people don't know what they're doing. Why is it so complicated?” And the simple answer is, I don't know why people think it's complicated.To your point about Amazon, why won't they do it? I don't know, but if they did, things would be different. And being honest, I think people are catching on. We do talk to Amazon and others. They see the need, but they also have to build it; they have to invent technology to address it. And using Parquet and Lucene are not the answer.Corey: Yeah, it's too much of a demand on the producers of that data rather than the consumer. And yeah, I would love to be able to go upstream to application developers and demand they do things in certain ways. It turns out as a consultant, you have zero authority to do that. As a DevOps team member, you have limited ability to influence it, but it turns out that being the ‘department of no' quickly turns into being the ‘department of unemployment insurance' because no one wants to work with you. And collaboration—contrary to what people wish to believe—is a key part of working in a modern workplace.Thomas: Absolutely. And it's funny, the demands of IT are getting harder; the actual getting the employees to build out the solutions are getting harder. And so a lot of that time is in the pipeline, is the prep, is the schema, the sharding, and et cetera, et cetera, et cetera. My viewpoint is that should be automated away. More and more databases are being autotune, right?This whole knobs and this and that, to me, Glue is a means to an end. I mean, let's get rid of it. Why can't Athena know what to do? Why can't object storage be Athena and vice versa? I mean, to me, it seems like all this moving through all these services, the classic Amazon viewpoint, even their diagrams of having this centralized repository of S3, move it all out to your services, get results, put it back in, then take it back out again, move it around, it just doesn't make much sense. And so to us, I love S3, love the service. I think it's brilliant—Amazon's first service, right?—but from there get a little smarter. That's where ChaosSearch comes in.Corey: I would argue that S3 is in fact, a modern miracle. And one of those companies saying, “Oh, we have an object store; it's S3 compatible.” It's like, “Yeah. We have S3 at home.” Look at S3 at home, and it's just basically a series of failing Raspberry Pis.But you have this whole ecosystem of things that have built up and sprung up around S3. It is wildly understated just how scalable and massive it is. 
There was an academic paper recently that won an award on how they use automated reasoning to validate what is going on in the S3 environment, and they talked about hundreds of petabytes in some cases. And folks are saying, ah, S3 is hundreds of petabytes. Yeah, I have clients storing hundreds of petabytes.There are larger companies out there. Steve Schmidt, Amazon's CISO, was recently at a Splunk keynote where he mentioned that in security info alone, AWS itself generates 500 petabytes a day that then gets reduced down to a bunch of stuff, and some of it gets loaded into Splunk. I think. I couldn't really hear the second half of that sentence because of the sound of all of the Splunk salespeople in that room becoming excited so quickly you could hear it.Thomas: [laugh]. I love it. If I could be so bold, those S3 team, they're gods. They are amazing. They created such an amazing service, and when I started playing with S3 now, I guess, 2006 or 7, I mean, we were using for a repository, URL access to get images, I was doing a virtualization [unintelligible 00:29:05] at the time—Corey: Oh, the first time I played with it, “This seems ridiculous and kind of dumb. Why would anyone use this?” Yeah, yeah. It turns out I'm really bad at predicting the future. Another reason I don't do the prediction thing.Thomas: Yeah. And when I started this company officially, five, six years ago, I was thinking about S3 and I was thinking about HDFS not being a good answer. And I said, “I think S3 will actually achieve the goals and performance we need.” It's a distributed file system. You can run parallel puts and parallel gets. And the performance that I was seeing when the data was a certain way, certain size, “Wait, you can get high performance.”And you know, when I first turned on the engine, now four or five years ago, I was like, “Wow. This is going to work. We're off to the races.” And now obviously, we're more than just an idea when we first talked to you. We're a service.We deliver benefits to our customers both in logs. And shoot, this quarter alone we're coming out with new features not just in the logs, which I'll talk about second, but in a direct SQL access. But you know, one thing that you hear time and time again, we talked about it—JSON, CloudTrail, and Kubernetes; this is a real nightmare, and so one thing that we've come out with this quarter is the ability to virtually flatten. Now, you heard time and time again, where, “Okay. I'm going to pick and choose my data because my database can't handle whether it's elastic, or say, relational.” And all of a sudden, “Shoot, I don't have that. I got to reindex that.”And so what we've done is we've created a index technology that we're always planning to come out with that indexes the JSON raw blob, but in the data refinery have, post-index you can select how to unflatten it. Why is that important? Because all that tooling, whether it's elastic or SQL, is now available. You don't have to change anything. Why is Snowflake and BigQuery has these proprietary JSON APIs that none of these tools know how to use to get access to the data?Or you pick and choose. And so when you have a CloudTrail, and you need to know what's going on, if you picked wrong, you're in trouble. So, this new feature we're calling ‘Virtual Flattening'—or I don't know what we're—we have to work with the marketing team on it. And we're also bringing—this is where I get kind of excited where the elastic world, the ELK world, we're bringing correlations into Elasticsearch. 
And like, how do you do that? They don't have the APIs?Well, our data refinery, again, has the ability to correlate index patterns into one view. A view is an index pattern, so all those same constructs that you had in Kibana, or Grafana, or Elastic API still work. And so, no more denormalizing, no more trying to hodgepodge query over here, query over there. You're actually going to have correlations in Elastic, natively. And we're excited about that.And one more push on the future, Q4 into 2022; we have been given early access to S3 SQL access. And, you know, as I mentioned, correlations in Elastic, but we're going full in on publishing our [TPCH 00:31:56] report, we're excited about publishing those numbers, as well as not just giving early access, but going GA in the first of the year, next year.Corey: I look forward to it. This is also, I guess, it's impossible to have a conversation with you, even now, where you're not still forward-looking about what comes next. Which is natural; that is how we get excited about the things that we're building. But so much less of what you're doing now in our conversations have focused around what's coming, as opposed to the neat stuff you're already doing. I had to double-check when we were talking just now about oh, yeah, is that Google cloud object store support still something that is roadmapped, or is that out in the real world?No, it's very much here in the real world, available today. You can use it. Go click the button, have fun. It's neat to see at least some evidence that not all roadmaps are wishes and pixie dust. The things that you were talking to me about years ago are established parts of ChaosSearch now. It hasn't been just, sort of, frozen in amber for years, or months, or these giant periods of time. Because, again, there's—yeah, don't sell me vaporware; I know how this works. The things you have promised have come to fruition. It's nice to see that.Thomas: No, I appreciate it. We talked a little while ago, now a few years ago, and it was a bit of aspirational, right? We had a lot to do, we had more to do. But now when we have big customers using our product, solving their problems, whether it's security, performance, operation, again—at scale, right? The real pain is, sure you have a small ELK cluster or small Athena use case, but when you're dealing with terabytes to petabytes, trillions of rows, right—billions—when you were dealing trillions, billions are now small. Millions don't even exist, right?And you're graduating from computer science in college and you say the word, “Trillion,” they're like, “Nah. No one does that.” And like you were saying, people do petabytes and exabytes. That's the world we're living in, and that's something that we really went hard at because these are challenging data problems and this is where we feel we uniquely sit. And again, we don't have to break the bank while doing it.Corey: Oh, yeah. Or at least as of this recording, there's a meme going around, again, from an old internal Google Video, of, “I just want to serve five terabytes of traffic,” and it's an internal Google discussion of, “I don't know how to count that low.” And, yeah.Thomas: [laugh].Corey: But there's also value in being able to address things at much larger volume. 
I would love to see better responsiveness options around things like Deep Archive because the idea of being able to query that—even if you can wait a day or two—becomes really interesting just from the perspective of, at that point, current cost for one petabyte of data in Glacier Deep Archive is 1000 bucks a month. That is ‘why would I ever delete data again?' Pricing.Thomas: Yeah. You said it. And what's interesting about our technology is unlike, let's say Lucene, when you index it, it could be 3, 4, or 5x the raw size, our representation is smaller than gzip. So, it is a full representation, so why don't you store it efficiently long-term in S3? Oh, by the way, with the Glacier; we support Glacier too.And so, I mean, it's amazing the cost of data with cloud storage is dramatic, and if you can make it hot and activated, that's the real promise of a data lake. And, you know, it's funny, we use our own service to run our SaaS—we log our own data, we monitor, we alert, have dashboards—and I can't tell you how cheap our service is to ourselves, right? Because it's so cost-effective for long-tail, not just, oh, a few weeks; we store a whole year's worth of our operational data so we can go back in time to debug something or figure something out. And a lot of that's savings. Actually, huge savings is cloud storage with a distributed elastic compute fabric that is serverless. These are things that seem so obvious now, but if you have SSDs, and you're moving things around, you know, a team of IT professionals trying to manage it, it's not cheap.Corey: Oh, yeah, that's the story. It's like, “Step one, start paying for using things in cloud.” “Okay, great. When do I stop paying?” “That's the neat part. You don't.” And it continues to grow and build.And again, this is the thing I learned running a business that focuses on this, the people working on this, in almost every case, are more expensive than the infrastructure they're working on. And that's fine. I'd rather pay people than technologies. And it does help reaffirm, on some level, that—people don't like this reminder—but you have to generate more value than you cost. So, when you're sitting there spending all your time trying to avoid saving money on, “Oh, I've listened to ChaosSearch talk about what they do a few times. I can probably build my own and roll it at home.”It's, I've seen the kind of work that you folks have put into this—again, you have something like 100 employees now; it is not just you building this—my belief has always been that if you can buy something that gets you 90, 95% of where you are, great. Buy it, and then yell at whoever selling it to you for the rest of it, and that'll get you a lot further than, “We're going to do this ourselves from first principles.” Which is great for a weekend project for just something that you have a passion for, but in production mistakes show. I've always been a big proponent of buying wherever you can. It's cheaper, which sounds weird, but it's true.Thomas: And we do the same thing. We have single-sign-on support; we didn't build that ourselves, we use a service now. Auth0 is one of our providers now that owns that [crosstalk 00:37:12]—Corey: Oh, you didn't roll your own authentication layer? Why ever not? Next, you're going to tell me that you didn't roll your own payment gateway when you wound up charging people on your website to sign up?Thomas: You got it. And so, I mean, do what you do well. Focus on what you do well. 
If you're repeating what everyone seems to do over and over again, time, costs, complexity, and… service, it makes sense. You know, I'm not trying to build storage; I'm using storage. I'm using a great, wonderful service, cloud object storage.Use whats works, whats works well, and do what you do well. And what we do well is make cloud object storage analytical and fast. So, call us up and we'll take away that 2 a.m. call you have when your cluster falls down, or you have a new workload that you are going to go to the—I don't know, the beach house, and now the weekend shot, right? Spin it up, stream it in. We'll take over.Corey: Yeah. So, if you're listening to this and you happen to be at re:Invent, which is sort of an open question: why would you be at re:Invent while listening to a podcast? And then I remember how long the shuttle lines are likely to be, and yeah. So, if you're at re:Invent, make it on down to the show floor, visit the ChaosSearch booth, tell them I sent you, watch for the wince, that's always worth doing. Thomas, if people have better decision-making capability than the two of us do, where can they find you if they're not in Las Vegas this week?Thomas: So, you find us online chaossearch.io. We have so much material, videos, use cases, testimonials. You can reach out to us, get a free trial. We have a self-service experience where connect to your S3 bucket and you're up and running within five minutes.So, definitely chaossearch.io. Reach out if you want a hand-held, white-glove experience POV. If you have those type of needs, we can do that with you as well. But we booth on re:Invent and I don't know the booth number, but I'm sure either we've assigned it or we'll find it out.Corey: Don't worry. This year, it is a low enough attendance rate that I'm projecting that you will not be as hard to find in recent years. For example, there's only one expo hall this year. What a concept. If only it hadn't taken a deadly pandemic to get us here.Thomas: Yeah. But you know, we'll have the ability to demonstrate Chaos at the booth, and really, within a few minutes, you'll say, “Wow. How come I never heard of doing it this way?” Because it just makes so much sense on why you do it this way versus the merry-go-round of data movement, and transformation, and schema management, let alone all the sharding that I know is a nightmare, more often than not.Corey: And we'll, of course, put links to that in the [show notes 00:39:40]. Thomas, thank you so much for taking the time to speak with me today. As always, it's appreciated.Thomas: Corey, thank you. Let's do this again.Corey: We absolutely will. Thomas Hazel, CTO and Founder of ChaosSearch. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice along with an angry comment because I have dared to besmirch the honor of your homebrewed object store, running on top of some trusty and reliable Raspberries Pie.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
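Earlier in the conversation Thomas notes that ChaosSearch keeps its S3 surface to just list, put, and get against the customer's bucket. Purely to illustrate how small that surface is, here is generic AWS SDK for JavaScript v3 usage (this is not ChaosSearch code; the bucket name, prefix, and region are placeholders):

```typescript
import { S3Client, ListObjectsV2Command, GetObjectCommand } from "@aws-sdk/client-s3";

// Generic illustration of a narrow, read-only S3 surface: list objects under a
// prefix, then fetch each one. Requires a recent SDK v3 for transformToString().
const s3 = new S3Client({ region: "us-east-1" });

async function readLogObjects(bucket: string, prefix: string): Promise<void> {
  const listed = await s3.send(
    new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, MaxKeys: 100 })
  );
  for (const obj of listed.Contents ?? []) {
    if (!obj.Key) continue;
    const res = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: obj.Key }));
    const body = await res.Body?.transformToString(); // stream-to-string helper
    console.log(obj.Key, body?.length ?? 0, "bytes");
  }
}
```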

RWpod - подкаст про мир Ruby и Web технологии
43 выпуск 09 сезона. Prettier Ruby 2.0.0, React Router v6, Async Ruby, Caffeinate, Gammo, Cytoscape.js, Vizzu и прочее

RWpod - подкаст про мир Ruby и Web технологии

Play Episode Listen Later Nov 7, 2021 46:05


Good day, dear listeners. We present a new episode of the RWpod podcast. In this episode: Ruby Rails 7 adds database-specific setup and reset tasks for multi DB configurations Async Ruby GitHub Issue-style File Uploader Using Stimulus and Active Storage Ruby Structs Rack Middlewares in Ruby on Rails Prettier Ruby 2.0.0 Caffeinate - a drip email engine for managing, creating, and sending scheduled email sequences from your Ruby on Rails application Gammo - A pure-Ruby HTML5 parser Web The New React Docs, In Progress and Now In Beta React Router v6 Photoshop's journey to the web Get started with Medusa Part 1: the open-source alternative to Shopify Record, replay and measure user flows Cytoscape.js - graph theory (network) library for visualisation and analysis Vizzu - Library for animated data visualizations and data stories Liqe - lightweight and performant Lucene-like parser and search engine RWpod Cafe 27 (04.12.2021) - gathering and voting on news topics

Modernize or Die ® Podcast - CFML News Edition
Modernize or Die® - CFML News for October 27th, 2021 - Episode 123

Modernize or Die ® Podcast - CFML News Edition

Play Episode Listen Later Oct 27, 2021 65:44


2021-10-27 Weekly News - Episode 123Watch the video version on YouTube at https://www.youtube.com/watch?v=dLQhiLcHpH0 Hosts: Brad Wood - Senior Developer for Ortus SolutionsGavin Pickin - Senior Developer for Ortus SolutionsThanks to our Sponsor - Ortus SolutionsThe makers of ColdBox, CommandBox, ForgeBox, TestBox and almost every other Box out there. A few ways  to say thanks back to Ortus Solutions: Like and subscribe to our videos on YouTube.  Sign up for a free or paid account on CFCasts, which is releasing new content every week Buy Ortus's new Book - 102 ColdBox HMVC Quick Tips and Tricks on GumRoad (http://gum.co/coldbox-tips) Patreon SupportWe have 37 patreons providing 93% of the funding for our Modernize or Die Podcasts via our Patreon site: https://www.patreon.com/ortussolutions. Now offering Annual Memberships, pay for the year and save 10% - great for businesses.News and EventsPreside Version 10.16.0 is outSee our release and upgrade notes/video:Video: https://t.co/OZo8qRURWe Release Notes: https://t.co/bSt8vA9OT3 Documentation: https://t.co/k3P3rHff6k Online CF Meetup - Using LaunchDarkly for feature flag management in CF applications, w/ Brad WoodThursday, October 28, 2021 at 9:00 AM to 10:00 AM PDTFeature flags are a system of enabling certain functionality in your app based on test groups, cross-cutting segments of users, and your internal release processes. Feature flags can be updated on the fly at any time by any user and don't require deploying new code to your servers. LaunchDarkly is a system that helps you manage your feature flags and how they respond to the users of your site. It offers detailed tracking of each user, each flag, and a robust set of rules for determining which users see which features. In this session, we'll see an overview of how to use the new LaunchDarkly SDK which can be used in ColdFusion applications. Demos will include both ColdBox apps and non-ColdBox legacy apps.https://www.meetup.com/coldfusionmeetup/events/281577538/ Adobe 1 Day Workshop - Adobe ColdFusion Workshop with Damien BruyndonckxWed, November 10, 202109:00 - 17:00 CEST EUROPEANJoin the Adobe ColdFusion Workshop to learn how you and your agency can leverage ColdFusion to create amazing web content. This one-day training will cover all facets of Adobe ColdFusion that developers need to build applications that can run across multiple cloud providers or on-premise.https://coldfusion-workshop.meetus.adobeevents.com/ ICYMI - Into the Box 2021 - Videos are now availableVideos are now available on CFCasts!https://cfcasts.com/series/into-the-box-2021Free for subscribers; Free for ITB 2021 attendees; available as a one-time purchase for $199.If you bought a ticket to Into the Box 2021 and have not received a coupon for access to the videos on CFCasts, please contact us from the CFCasts support page. https://cfcasts.com/supportICYMI - Ortus Webinar for October - Gavin Pickin - Building Quick APIs - the extended versionIn this session we will use ColdBox's built in REST BaseHandler, and with CBSecurity and Quick ORM we will set up a secure API using fluent query language - and you'll see how quick Quick development can be!https://www.ortussolutions.com/events/webinarsRecording will be posted to CFCasts soonHacktoberfest 2021Support open source throughout October!Hacktoberfest encourages participation in the open source community, which grows bigger every year. 
Complete the 2021 challenge and earn a limited edition T-shirt.GIVING TO OPEN SOURCEOpen-source projects keep the internet humming—but they can't do it without resources. Donate and support their awesome work.TREES NOT TEESRather than receive t-shirts as swag, you can choose to have a tree planted in your name and help make Hacktoberfest 2021 more carbon neutral.To win a reward, you must sign up on the Hacktoberfest site and make four pull requests on any repositories classified with the 'hacktoberfest 'topic on GitHub or GitLab by October 31. If an Ortus Solutions repo that you want to contribute to is not marked with the `hacktoberfest` topic, please let us know so we can fix it.https://hacktoberfest.digitalocean.com/ CFCasts Content Updateshttps://www.cfcasts.com Just ReleasedUp and Running with Quick Testing with Quick Step 11 Exercise Coming this week Up and Running with Quick Building Quick APIs Send your suggestions at https://cfcasts.com/supportConferences and TrainingMicrosoft IgniteNovember 2–4, 2021 Opportunity awaits, with dedicated content spotlighting Microsoft Business Applications and Microsoft Security.https://myignite.microsoft.com/homeDeploy by Digital OceanTHE VIRTUAL CONFERENCE FOR GLOBAL DEVELOPMENT TEAMSNovember 16-17, 2021 https://deploy.digitalocean.com/homeAWS re:InventNOV. 29 – DEC. 3, 2021 | LAS VEGAS, NVCELEBRATING 10 YEARS OF RE:INVENTVirtual: FreeIn Person: $1799https://reinvent.awsevents.com/ Postgres BuildOnline - FreeNov 30-Dev 1 2021https://www.postgresbuild.com/ ITB Latam 2021December 2-3, 2021Into the Box LATAM is back and better than ever! Our virtual conference will include speakers from El Salvador and all over the world, who'll present on the latest web and mobile technologies in Latin America.Registration is completely free so don't miss out!https://latam.intothebox.org/ Adobe ColdFusion Summit 2021December 7th and 8th - VirtualSpeakers are finalized and some Speakers and some session descriptions are now on the siteRegister for Free - https://cfsummit.vconfex.com/site/adobe-cold-fusion-summit-2021/1290Blog - https://coldfusion.adobe.com/2021/09/adobe-coldfusion-summit-2021-registrations-open/ Tweet from Mark Takata OK! I can finally let you all know that for the @Adobe @coldfusion #CFSummit2021 keynote we will be featuring @ashleymcnamara! Her talk will focus on the history & future of DevRel how we got here & where we're going.cfsummit.vconfex.com to register!#CFML #DevRel #conferencehttps://twitter.com/MarkTakata/status/1449063259072438277 https://twitter.com/MarkTakata jConf.devNow a free virtual eventDecember 9th starting at 8:30 am CDT/2:30 pm UTC.https://2021.jconf.dev/?mc_cid=b62adc151d&mc_eid=8293d6fdb0 More conferencesNeed more conferences, this site has a huge list of conferences for almost any language/community.https://confs.tech/Blogs, Tweets and Videos of the WeekBlog - Ben Nadel - Reading Environment (ENV) Variables From The Server Scope In Lucee CFML 5.3.7.47This is a pro-tip that I originally picked up from Julian Halliwell a few years ago. However, I sometimes talk to people who don't realize that this is possible. So, I wanted to try and amplify Julian's post. In Lucee CFML, you can read environment (ENV) variables directly out of the server scope. They are just automatically there - no dipping into the Java layer or dealing with the java.lang.System class. 
Lucee CFML brings these values to the surface for easy consumption.https://www.bennadel.com/blog/4140-reading-environment-env-variables-from-the-server-scope-in-lucee-cfml-5-3-7-47.htm Blog - Ben Nadel - Making SQL Queries More Flexible With LIKE In MySQL 5.7.32 And Lucee CFML 5.3.7.47While you might stand-up something like Elasticsearch, Lucene, or Solr in order to provide robust and flexible text-based searches in your ColdFusion application, your relational database is more than capable of performing (surprisingly fast) pattern matching on TEXT and VARCHAR fields using the LIKE operator. This is especially true if the SQL query in question is already being limited based on an indexed value. At InVision, I often use the LIKE operator to allow for light-weight text-based searches. And, as of late, I've been massaging the inputs in order to make the matches even more flexible, allowing for some slightly fuzzy matching in Lucee CFML 5.3.7.47.https://www.bennadel.com/blog/4137-making-sql-queries-more-flexible-with-like-in-mysql-5-7-32-and-lucee-cfml-5-3-7-47.htm Blog - Ben Nadel - Creating A Group-Based Incrementing Value In MySQL 5.7.32 And Lucee CFML 5.3.7.47In the past few weeks, I've been learning a lot about how I can leverage SERIALIZABLE transactions in MySQL, the scope of said transactions, and some hidden gotchas around locking empty rows. As a means to lock (no pun intended) some of that information in my head-meat, I thought it would be a fun code kata to create a Jira-inspired ticketing system in Lucee CFML 5.3.7.47 that uses an application-defined, group-based incrementing value in MySQL 5.7.32.https://www.bennadel.com/blog/4135-creating-a-group-based-incrementing-value-in-mysql-5-7-32-and-lucee-cfml-5-3-7-47.htm Blog - Ben Nadel - Creating A Group-Based Incrementing Value Using LAST_INSERT_ID() In MySQL 5.7.32 And Lucee CFML 5.3.7.47Yesterday, I took inspiration from Jira's ticketing system and explored the idea of creating a group-based incrementing value in MySQL. In my approach, I used a SERIALIZABLE transaction to safely "update and read" a shared sequence value across parallel threads. In response to that post, my InVision co-worker - Michael Dropps - suggested that I look at using LAST_INSERT_ID(expr) to achieve the same outcome with less transaction isolation. I had never seen the LAST_INSERT_ID() function used with an expression argument before. So, I wanted to revisit yesterday's post using this technique.https://www.bennadel.com/blog/4136-creating-a-group-based-incrementing-value-using-last-insert-id-in-mysql-5-7-32-and-lucee-cfml-5-3-7-47.htm Blog / Documentation - Zac Spitszer - Building and testing Lucee extensions documentationI have written up a detailed guide on how to Build and Test Lucee Extensions, using Lucee Script Runner and Apache Ant.It's a little bit complicated to setup, but I have developed a toolchain, which once set up, makes the entire process really dead simple.https://dev.lucee.org/t/building-and-testing-lucee-extensions-documentation/9053 Tweet - Mark Takata - Adobe - The CF Summit 2021 Keynote announcementOK! I can finally let you all know that for the @Adobe @coldfusion #CFSummit2021 keynote we will be featuring @ashleymcnamara! 
Her talk will focus on the history & future of DevRel how we got here & where we're going.cfsummit.vconfex.com to register!#CFML #DevRel #conferencehttps://twitter.com/MarkTakata/status/1449063259072438277 https://twitter.com/MarkTakata Tweet - Ben Nadel - Monolith DeploysIt's 10:50 AM.I work in a monolithic #Lucee #CFML codebase.And, I just started my 3rd deployment of the day.It's amazing how much work you can get done when you stop worrying about what other people think of your technology choices.
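The Ben Nadel posts summarized in this entry describe a lightweight, LIKE-based search pattern: massage the user's input so every word can match anywhere, in order. His examples use MySQL and Lucee CFML; as a rough, hypothetical illustration of the same idea, here is a minimal Python/SQLite sketch. The table, column names, and the fuzzy_like helper are invented for the example.

```python
import sqlite3

def fuzzy_like(term: str) -> str:
    # "quick fox" -> "%quick%fox%": each word must appear, in order, anywhere in the text.
    return "%" + "%".join(term.split()) + "%"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("the quick brown fox",), ("a slow red fox",), ("quick thinking",)],
)

# Parameterized LIKE query; in a real app you would also narrow by an indexed column first.
rows = conn.execute(
    "SELECT id, body FROM notes WHERE body LIKE ?",
    (fuzzy_like("quick fox"),),
).fetchall()
print(rows)  # only the row containing both words, in order, matches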

The MongoDB Podcast
Ep. 79 Atlas Search with Marcus Eagan

The MongoDB Podcast

Play Episode Listen Later Sep 27, 2021 40:33


Atlas Search makes it easy to add fast, full-text, relevant search capability into your applications. On this episode, we talk with Marcus Eagan, Senior Product Manager for Atlas Search. Marcus gives us a great overview of the search space and discusses Lucene, Solr, and how Atlas Search simplifies the process of adding search to your applications. 
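As a taste of what the episode covers, the snippet below is a minimal sketch of an Atlas Search query issued through PyMongo's aggregation pipeline. The connection string, database, collection, field names, and the index name "default" are assumptions for illustration; the $search stage only works against a MongoDB Atlas cluster with a search index defined.

```python
from pymongo import MongoClient

# Assumptions: an Atlas cluster URI, a "sample_mflix.movies"-style collection, and an
# Atlas Search index named "default" covering the "title" and "plot" fields.
client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
movies = client["sample_mflix"]["movies"]

pipeline = [
    {"$search": {"index": "default", "text": {"query": "space opera", "path": ["title", "plot"]}}},
    {"$limit": 5},
    {"$project": {"_id": 0, "title": 1, "score": {"$meta": "searchScore"}}},
]
for doc in movies.aggregate(pipeline):
    print(doc)
```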

Catch the Tornado Podcast
36. Challenge Accepted. with Thomas Payet, co-founder & COO at MeiliSearch

Catch the Tornado Podcast

Play Episode Listen Later May 13, 2021 39:44


ElasticSearch is the default search engine for anyone who wants to build a search box. Lucene is the library behind ElasticSearch and other popular search engines, but, although this technology is great, it's been around for many years. MeiliSearch, an Open Source solution co-founded by Thomas Payet, is challenging the market monopoly in terms of search and is currently providing a popular and user-friendly search-engine-building experience for developers. In today's Challenge Accepted episode, Piotr speaks to Thomas about starting at the very bottom and developing your own OS solution, challenging huge market players, growing a faithful following, and plans for monetization. Tune in!

Les Cast Codeurs Podcast
LCC 249 - Édition tu perds tes amis

Les Cast Codeurs Podcast

Play Episode Listen Later Feb 15, 2021 79:53


Emmanuel Antonio et Guillaume discutent de Java 16, de GraalVM, de micronaut, de Quarkus, de licence Elastic, de BinTray qui s’en va et d’attaque de chaine de fournisseurs. Et merci à José Paumard et Benoit Sautel pour leur crowdcast. Enregistré le 12 février 2021 Téléchargement de l’épisode LesCastCodeurs-Episode–249.mp3 News Langages Optimiser le MD5 dans la JVM dans la tête d’une optimisation du JDK optimisation proposée amène des surcharges de contentions (thread local) donc exploration de l’alternative difficulté des codes intrinseques (c’est à dire quand un pattern est détecté, le code est hardcodé par platforme. Donc tout changement du code qui sort du pattern veut dire pas mal de taf) Conversion hexadecimal en Java 17 Crowdcast de José sur Java 16 et article de Loic sur le sujet Java 16 Socket channels (Unix domain) Court circuit de la stack tcp, pas de file descriptor de mémoire Api vectorielle avec optimisation par plateforme Foreign linker api pour panama Et le support appel natif Support alpine (musl) et aarch64 pour Windows Record et pattern matching instanceof deviennent standard Illegal access passe en deny par défaut. Ça pue ;) Java sur Truffle dans GraalVM le GC reste sur la JVM hote qui peut etre hotspot ou SubstrateVM Dans le cas de SubstrateVM, ça veut dire que Java peut etre interprété dans ce mode ahead of time compiled (donc in JIT est embarqué). Pour faire tourner certains morceaux de Java “dynamique” ça peut valoir le coup Sinon c’est la vision de GraalVM de la VM universelle donc supporter Java “comme les autres langages” fait partie du puzzle Mais bon c’est dur de comprendre leur strategie Crowdcast JavaScript GraalVM de Benoit Sautel L’API Polyglot Appeler du Javascript depuis la JVM Migrer depuis Nashorn Démonstration et benchmark GraalJS avec Maven JEP 243 Java-Level JVM Compiler Interface Interview d’un responsable de GraalVM sur Nashorn vs GraalVM JBang - comment écrire des scripts en Java pourquoi les gens écrivent des scripts dans d’autre langages que Java un seul fichier, pas de structure complexe y compris dans les dependances un demarrage juste en lançant le ficher crée un environnement pour l’IDE Element worklet, rendre JavaScript preemptif Proposition de creation d’élément de code JavaScript qui peut tourner hors du thread principal by design. JS peux rendre la main mais c’est non preemptif (yield, promesses etc) et uniquement à un endroit précis Donc création de Element Worklet (un comme un runnable en Java) qui tourne dans un thread séparé, avec un message channel pour communiquer avec le reste Travaille sur un shadow dom par contre rien n’est détaillé sur le scheduler et la priorisation Librairies Driver JDBC Oracle sur Maven Central! Drivers support for Virtual Threads Extension reactives GraalVM native image (mais encore des trucs a amelioerer (allow incomplete classpath) Micronaut 2.3 support de JMS résolution de la Locale améliorations au système d’introspection bannière personnalisable Fondation pour Grails Idée des fondations était venu ensemble avec Micronaut Mais voulait apprendre de l’un avant de lancer l’autre Embrasse semver Le technical commutee va décider de la roadmap de ce que j’ai compris Intégration initiale de micronaut dans Grails 4 Plan: TX mongo dans GORM. 
Groovy server pages plus modulaire, native web socket, meilleure intégration Kafka Plan grails 5: Groovy 3, SB 2.4, gradle 6 et Java 15 Quarkus 1.11 RESTEasy Reactive Annotation scanning, metamodel generation au build, base sur vert.x route Dev UI les frameworks amènent des tâches de dev (config, list des bean CDI, database schema migration etc) Massive performance without headaches Infrastructure Les rebondissements d’Elastic vs AWS et du changement de licence Clarification d’Elastic “si vous vendez Elasticsearch directement en tant que service, vous serez impacté” entre les annonces et la licence, il y a une difference est-ce que tout competiteur sérieux à Elastic amènera un changement de licence? est-ce que Lucene est le prochain sur la liste? reflechissent à une license qui ouvre le code apres 3 à 5 ans BSL (Business Solftware License qui se transforme en ASL apres quelques années, et qui a une clause restrictive avant) La distribution Elastic d’Elasticsearch avait déjà un mix de ASL et de logiciels sous license proprietaire mais “source ouverte” AWS forks Elasticsearch Montrent les contribs ~10 sur Elasticsearch et annonce 250 contributions sur Lucene Les clients Elasticsearch resteront ASL 2.0 mais pas le client Java haut niveau qui a des dependence’s sur les classes serveur. Un nouveau client devrait arriver. Retour de l’ex CTO de Chef et sa position “pro” AWS et contre Elastic contre point de la position des gens du Message a caractere informatif 4 valeurs de l’OSS: The freedom to run the program as you wish, for any purpose The freedom to study how the program works, and change it so it does your computing as you wish The freedom to redistribute copies so you can help others The freedom to distribute copies of your modified versions to others at its heart, Open Source and Free Software are about the freedom to make the system work the way you wish au dessus est la communaute et le benefce de distribution qui fait un plus group morceau de clients potentiels Shay B - By putting the core of Elasticsearch into the open, we can presume he wanted the business value benefits of Open Source — collaboration in the commons, low friction acquisition for users, and hopefully the growth of an ecosystem around it. He got it tight open core - direct, and often critical, features are only available under a proprietary license co-mingle the source code for these features in the primary Elasticsearch repository Elastic NV creates a world where it is very, very difficult to collaborate only on the open source pieces. to whom does Elasticsearch belong? The community, or Elastic NV? Elasticsearch […] exists primarily to fuel the commercial ambitions of Elastic NV I, as a contributor, want to change the course of Elasticsearch in ways that benefit me (and perhaps others), but does so at the expense of Elastic NV, will I get that opportunity? The answer is most likely no — you will not. That truth is ultimately corrosive to sustainable communities. This is the deepest, most fundamental truth about Open Source and Free Software in action. That you, as a user, have rights. That those rights are not contingent on the ability of someone else to capture value. Companies who decide to build their business on Open Source cores need to get much more aggressive about their trademark policies. It should be clear and unambiguous that your trademark cannot be used for another product without your permission. 
If I may go further, I would make it clear that nobody but your company can create a distribution with your trademark on it at all, without your permission. Docker donne Docker Distribution à la CNCF code déjà ouvert et utilisé par certains mais avait forké c’est le coeur de DockerHub et est une container registry objectif extensibilité pour les usages particuliers des uns et des autres (systeme de stockage etc) Web Angular CLI 11.1 Support TypeScript 4.1 nouveau plugin webpack pour le compilateur Ivy (pas d’effet visible attendu) scelection des CSS critiques pour un chargement initial et inlining => opt-in pour l’instant EcmaScript 5 polyfill a été enrichi Outillage JFrog annouce que BinTray c’est fini aussi jcenter, gocenter, chartcenter etc fin des push 31 mars et fermeture de l’API REST et l’interface le 1er mai l’url jcenter continue encore un an si les projets utilisaient la synchro sur central, les pachkages seront là sinon il va falloir copier et les scripts font devoir evoluer questions sur la scalabiluté de MAven Central Brian Fox de Sonatype nous dit que tout va bien se passer Le blog officiel de Sonatype. Attaque de suply chain par squattage de nom privés chercher le nom de dépendances privées d’organisations publier une version “supérieure” sous le meme nom dans un repo public profit ! Déployer sur Maven Central avec une action GitHub Le Java action workflow fait plus que preparer Java avec clef GPG et tout JHipster Quarkus 1.0.0 contribué par Daniel Petisme et Anthony Viard JHipster Quarkus est un “blueprint” JHipster qui permet de surcharger la mécanique de génération pour obtenir un backend qui s’appuye sur Quarkus plutôt que Spring. Cela permet de généré rapidement une application fullstack (front + back). contenu Twitch d’antony Homebrew 3.0.0 est sorti support officiel de Apple M1 avec des bottles native. Pas tous les binaires installable ne supportent M1 Sécurité Dépassement de pile dans sudo introduit en juillet 2011 Loi, société et organisation Jeff Bezos ne sera plus CEO d’Amazon (juste président du directoire) Sacha Labourey aussi quitte le poste de CEO de CloudBees pour devenir Chief Strategy Officer passer de 100 a 250 M IPO Le blog de Sacha Nous contacter Soutenez Les Cast Codeurs sur Patreon https://www.patreon.com/LesCastCodeurs Faire un crowdcast ou une crowdquestion Contactez-nous via twitter https://twitter.com/lescastcodeurs sur le groupe Google https://groups.google.com/group/lescastcodeurs ou sur le site web https://lescastcodeurs.com/

BadGeek
Les Cast Codeurs n°249 du 15/02/21 - LCC 249 - Édition tu perds tes amis (80min)

BadGeek

Play Episode Listen Later Feb 15, 2021 80:31


Emmanuel Antonio et Guillaume discutent de Java 16, de GraalVM, de micronaut, de Quarkus, de licence Elastic, de BinTray qui s'en va et d'attaque de chaine de fournisseurs. Et merci à José Paumard et Benoit Sautel pour leur crowdcast. Enregistré le 12 février 2021 Téléchargement de l'épisode [LesCastCodeurs-Episode-249.mp3](https://traffic.libsyn.com/lescastcodeurs/LesCastCodeurs-Episode-249.mp3) ## News ### Langages [Optimiser le MD5 dans la JVM](https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html) * dans la tête d’une optimisation du JDK * optimisation proposée amène des surcharges de contentions (thread local) * donc exploration de l’alternative * difficulté des codes intrinseques (c’est à dire quand un pattern est détecté, le code est hardcodé par platforme. * Donc tout changement du code qui sort du pattern veut dire pas mal de taf) [Conversion hexadecimal en Java 17](http://marxsoftware.blogspot.com/2020/12/jdk17-hex-formatting-parsing.html) Crowdcast de José sur Java 16 et article de [Loic sur le sujet Java 16](https://www.loicmathieu.fr/wordpress/informatique/java-16-quoi-de-neuf/) * Socket channels (Unix domain) Court circuit de la stack tcp, pas de file descriptor de mémoire * Api vectorielle avec optimisation par plateforme * Foreign linker api pour panama * Et le support appel natif * Support alpine (musl) et aarch64 pour Windows * Record et pattern matching instanceof deviennent standard * Illegal access passe en deny par défaut. Ça pue ;) [Java sur Truffle dans GraalVM](https://medium.com/graalvm/java-on-truffle-going-fully-metacircular-215531e3f840) * le GC reste sur la JVM hote qui peut etre hotspot ou SubstrateVM * Dans le cas de SubstrateVM, ça veut dire que Java peut etre interprété dans ce mode ahead of time compiled (donc in JIT est embarqué). Pour faire tourner certains morceaux de Java “dynamique” ça peut valoir le coup * Sinon c’est la vision de GraalVM de la VM universelle donc supporter Java “comme les autres langages” fait partie du puzzle * Mais bon c’est dur de comprendre leur strategie Crowdcast JavaScript GraalVM de Benoit Sautel * [L’API Polyglot](https://www.graalvm.org/reference-manual/polyglot-programming/) * [Appeler du Javascript depuis la JVM](https://www.graalvm.org/reference-manual/js/JavaInteroperability/) * [Migrer depuis Nashorn](https://www.graalvm.org/reference-manual/js/NashornMigrationGuide/) * [Démonstration et benchmark GraalJS avec Maven](https://github.com/graalvm/graal-js-jdk11-maven-demo) * [JEP 243 Java-Level JVM Compiler Interface](https://openjdk.java.net/jeps/243) * [Interview d’un responsable de GraalVM sur Nashorn vs GraalVM](https://jaxenter.com/graalvm-nashorn-interview-147115.html) [JBang - comment écrire des scripts en Java](https://emmanuelbernard.com/blog/2021/01/18/jbang/) * pourquoi les gens écrivent des scripts dans d'autre langages que Java * un seul fichier, pas de structure complexe y compris dans les dependances * un demarrage juste en lançant le ficher * crée un environnement pour l'IDE [Element worklet, rendre JavaScript preemptif](https://jasonformat.com/element-worklet/) * Proposition de creation d’élément de code JavaScript qui peut tourner hors du thread principal by design. 
* JS peux rendre la main mais c’est non preemptif (yield, promesses etc) et uniquement à un endroit précis * Donc création de Element Worklet (un comme un runnable en Java) qui tourne dans un thread séparé, avec un message channel pour communiquer avec le reste * Travaille sur un shadow dom * par contre rien n’est détaillé sur le scheduler et la priorisation ### Librairies [Driver JDBC Oracle sur Maven Central!](https://blogs.oracle.com/developers/new-year-goodies-oracle-jdbc-21100-on-maven-central) * Drivers support for Virtual Threads * Extension reactives * GraalVM native image (mais encore des trucs a amelioerer (allow incomplete classpath) [Micronaut 2.3](https://micronaut.io/blog/2021-01-22-2-dot-3-release.html) * support de JMS * résolution de la Locale * améliorations au système d’introspection * bannière personnalisable [Fondation pour Grails](https://www.infoq.com/news/2021/01/oci-grails-foundation/) * Idée des fondations était venu ensemble avec Micronaut * Mais voulait apprendre de l’un avant de lancer l’autre * Embrasse semver * Le technical commutee va décider de la roadmap de ce que j’ai compris * Intégration initiale de micronaut dans Grails 4 * Plan: TX mongo dans GORM. Groovy server pages plus modulaire, native web socket, meilleure intégration Kafka * Plan grails 5: Groovy 3, SB 2.4, gradle 6 et Java 15 [Quarkus 1.11](https://quarkus.io/blog/quarkus-1-11-0-final-released/) * RESTEasy Reactive * Annotation scanning, metamodel generation au build, base sur vert.x route * Dev UI * les frameworks amènent des tâches de dev (config, list des bean CDI, database schema migration etc) * [Massive performance without headaches](https://quarkus.io/blog/resteasy-reactive-faq/) ### Infrastructure Les rebondissements d'Elastic vs AWS et du changement de licence * [Clarification d'Elastic](https://www.elastic.co/blog/license-change-clarification) * "si vous vendez Elasticsearch directement en tant que service, vous serez impacté" * entre les annonces et la licence, il y a une difference * est-ce que tout competiteur sérieux à Elastic amènera un changement de licence? * est-ce que Lucene est le prochain sur la liste? * reflechissent à une license qui ouvre le code apres 3 à 5 ans BSL (Business Solftware License qui se transforme en ASL apres quelques années, et qui a une clause restrictive avant) * La distribution Elastic d'Elasticsearch avait déjà un mix de ASL et de logiciels sous license proprietaire mais "source ouverte" * AWS [forks Elasticsearch](https://aws.amazon.com/fr/blogs/opensource/stepping-up-for-a-truly-open-source-elasticsearch/) * Montrent les contribs ~10 sur Elasticsearch et annonce 250 contributions sur Lucene * Les [clients Elasticsearch resteront ASL 2.0](https://twitter.com/jponge/status/1353721544997040131?s=21) * mais pas le client Java haut niveau qui a des dependence’s sur les classes serveur. Un nouveau client devrait arriver. 
* [Retour de l'ex CTO de Chef et sa position "pro" AWS et contre Elastic](https://medium.com/sustainable-free-and-open-source-communities/free-software-is-the-only-winner-in-elastic-nv-vs-aws-9416f2a0a7f5) * contre point de la position des gens du Message a caractere informatif * 4 valeurs de l'OSS: * The freedom to run the program as you wish, for any purpose * The freedom to study how the program works, and change it so it does your computing as you wish * The freedom to redistribute copies so you can help others * The freedom to distribute copies of your modified versions to others * at its heart, Open Source and Free Software are about the freedom to make the system work the way you wish * au dessus est la communaute et le benefce de distribution qui fait un plus group morceau de clients potentiels * Shay B - By putting the core of Elasticsearch into the open, we can presume he wanted the business value benefits of Open Source — collaboration in the commons, low friction acquisition for users, and hopefully the growth of an ecosystem around it. He got it * tight open core - direct, and often critical, features are only available under a proprietary license * co-mingle the source code for these features in the primary Elasticsearch repository * Elastic NV creates a world where it is very, very difficult to collaborate only on the open source pieces. * to whom does Elasticsearch belong? The community, or Elastic NV? * Elasticsearch [...] exists primarily to fuel the commercial ambitions of Elastic NV * I, as a contributor, want to change the course of Elasticsearch in ways that benefit me (and perhaps others), but does so at the expense of Elastic NV, will I get that opportunity? The answer is most likely no — you will not. * That truth is ultimately corrosive to sustainable communities. * This is the deepest, most fundamental truth about Open Source and Free Software in action. That you, as a user, have rights. That those rights are not contingent on the ability of someone else to capture value. * Companies who decide to build their business on Open Source cores need to get much more aggressive about their trademark policies. It should be clear and unambiguous that your trademark cannot be used for another product without your permission. If I may go further, I would make it clear that nobody but your company can create a distribution with your trademark on it at all, without your permission. 
[Docker donne Docker Distribution à la CNCF](https://www.docker.com/blog/donating-docker-distribution-to-the-cncf/) * code déjà ouvert et utilisé par certains mais avait forké * c'est le coeur de DockerHub et est une container registry * objectif extensibilité pour les usages particuliers des uns et des autres (systeme de stockage etc) ### Web [Angular CLI 11.1](https://blog.ninja-squad.com/2021/01/21/angular-cli-11.1/) * Support TypeScript 4.1 * nouveau plugin webpack pour le compilateur Ivy (pas d'effet visible attendu) * scelection des CSS critiques pour un chargement initial et inlining => opt-in pour l'instant * EcmaScript 5 polyfill a été enrichi ### Outillage [JFrog annouce que BinTray c'est fini](https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/) * aussi jcenter, gocenter, chartcenter etc * fin des push 31 mars et fermeture de l'API REST et l'interface le 1er mai * l'url jcenter continue encore un an * si les projets utilisaient la synchro sur central, les pachkages seront là * sinon il va falloir copier * et les scripts font devoir evoluer * questions sur la scalabiluté de MAven Central * [Brian Fox de Sonatype](https://twitter.com/Brian_Fox/status/1357414525377642496) nous dit que tout va bien se passer * [Le blog officiel de Sonatype](https://blog.sonatype.com/dear-bintray-and-jcenter-users-heres-what-you-need-to-know-about-the-central-repository). [Attaque de suply chain par squattage de nom privés](https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610) * chercher le nom de dépendances privées d'organisations * publier une version "supérieure" sous le meme nom dans un repo public * profit ! [Déployer sur Maven Central avec une action GitHub](https://bjansen.github.io/java/2021/02/03/deploying-to-maven-central-using-github-actions.html) * Le Java action workflow fait plus que preparer Java * avec clef GPG et tout [JHipster Quarkus 1.0.0](https://github.com/jhipster/generator-jhipster-quarkus) contribué par Daniel Petisme et Anthony Viard * JHipster Quarkus est un "blueprint" JHipster qui permet de surcharger la mécanique de génération pour obtenir un backend qui s'appuye sur Quarkus plutôt que Spring. Cela permet de généré rapidement une application fullstack (front + back). * contenu [Twitch d'antony](https://www.twitch.tv/avdev4j) [Homebrew 3.0.0 est sorti](https://brew.sh/2021/02/05/homebrew-3.0.0/) * support officiel de Apple M1 avec des bottles native. Pas tous les binaires installable ne supportent M1 ### Sécurité [Dépassement de pile dans sudo](https://blog.qualys.com/vulnerabilities-research/2021/01/26/cve-2021-3156-heap-based-buffer-overflow-in-sudo-baron-samedit) * introduit en juillet 2011 ### Loi, société et organisation [Jeff Bezos ne sera plus CEO d'Amazon (juste président du directoire)](https://www.journaldugeek.com/2021/02/03/amazon-jeff-bezos-quitte-son-poste-de-directeur-general/) [Sacha Labourey aussi quitte le poste de CEO de CloudBees pour devenir Chief Strategy Officer](https://finance.yahoo.com/news/cloudbees-names-stephen-dewitt-ceo-140000459.html) * passer de 100 a 250 M * IPO * [Le blog de Sacha](https://www.cloudbees.com/blog/sacha-labourey-the-next-phase-cloudbees) ## Nous contacter Soutenez Les Cast Codeurs sur Patreon [Faire un crowdcast ou une crowdquestion](https://lescastcodeurs.com/crowdcasting/) Contactez-nous via twitter sur le groupe Google ou sur le site web

DataCast
Episode 54: Information Retrieval Research, Data Science For Space Missions, and Open-Source Software with Chris Mattmann

DataCast

Play Episode Listen Later Feb 4, 2021 82:43


Timestamps(2:55) Chris went over his experience studying Computer Science at the University of Southern California for undergraduate in the late 90s.(5:26) Chris recalled working as a Software Engineer at NASA Jet Propulsion Lab in his sophomore year at USC.(9:54) Chris continued his education at USC with an M.S. and then a Ph.D. in Computer Science. Under the guidance of Dr. Nenad Medvidović, his Ph.D. thesis is called “Software Connectors For Highly-Distributed And Voluminous Data-Intensive Systems.” He proposed DISCO, a software architecture-based systematic framework for selecting software connectors based on eight key dimensions of data distribution.(16:28) Towards the end of his Ph.D., Chris started getting involved with the Apache Software Foundation. More specifically, he developed the original proposal and plan for Apache Tika (a content detection and analysis toolkit) in collaboration with Jérôme Charron to extract data in the Panama Papers, exposing how wealthy individuals exploited offshore tax regimes.(24:58) Chris discussed his process of writing “Tika In Action,” which he co-authored with Jukka Zitting in 2011.(27:01) Since 2007, Chris has been a professor in the Department of Computer Science at USC Viterbi School of Engineering. He went over the principles covered in his course titled “Software Architectures.”(29:49) Chris touched on the core concepts and practical exercises that students could gain from his course “Information Retrieval and Web Search Engines.”(32:10) Chris continued with his advanced course called “Content Detection and Analysis for Big Data” in recent years (check out this USC article).(36:31) Chris also served as the Director of the USC’s Information Retrieval and Data Science group, whose mission is to research and develop new methodology and open source software to analyze, ingest, process, and manage Big Data and turn it into information.(41:07) Chris unpacked the evolution of his career at NASA JPL: Member of Technical Staff -> Senior Software Architect -> Principal Data Scientist -> Deputy Chief Technology and Innovation Officer -> Division Manager for the AI, Analytics, and Innovation team.(44:32) Chris dove deep into MEMEX — a JPL’s project that aims to develop software that advances online search capabilities to the deep web, the dark web, and nontraditional content.(48:03) Chris briefly touched on XDATA — a JPL’s research effort to develop new computational techniques and open-source software tools to process and analyze big data.(52:23) Chris described his work on the Object-Oriented Data Technology platform, an open-source data management system originally developed by NASA JPL and then donated to the Apache Software Foundation.(55:22) Chris shared the scientific challenges and engineering requirements associated with developing the next generation of reusable science data processing systems for NASA’s Orbiting Carbon Observatory space mission and the Soil Moisture Active Passive earth science mission.(01:01:05) Chris talked about his work on NASA’s Machine Learning-based Analytics for Autonomous Rover Systems — which consists of two novel capabilities for future Mars rovers (Drive-By Science and Energy-Optimal Autonomous Navigation).(01:04:24) Chris quantified the Apache Software Foundation's impact on the software industry in the past decade and discussed trends in open-source software development.(01:07:15) Chris unpacked his 2013 Nature article called “A vision for data science” — in which he argued that four advancements are necessary 
to get the best out of big data: algorithm integration, development and stewardship, diverse data formats, and people power.(01:11:54) Chris revealed the challenges of writing the second edition of “Machine Learning with TensorFlow,” a technical book with Manning that teaches the foundational concepts of machine learning and the TensorFlow library's usage to build powerful models rapidly.(01:15:04) Chris mentioned the differences between working in academia and industry.(01:16:20) Chris described the tech and data community in the greater Los Angeles area.(01:18:30) Closing segment.His Contact InfoWikipediaNASA PageGoogle ScholarUSC PageTwitterLinkedInGitHubHis Recommended ResourcesDoug Cutting (Founder of Lucene and Hadoop)Hilary Mason (Ex Data Scientist at bit.ly and Cloudera)Jukka Zitting (Staff Software Engineer at Google)"The One Minute Manager" (by Ken Blanchard and Spencer Johnson)

LINUX Unplugged
390: Eating the License Cake

LINUX Unplugged

Play Episode Listen Later Jan 26, 2021 44:00


Successful open-source projects all seem to struggle with one major gorilla. Who it is, and what their options are now. Special Guests: Drew DeVore and Jonathan Corbet.

Bshani Radio
Millionaire Mentorship (Ep - 1707) Athena Lucene, Spiritual Thought Leader, Pt 2

Bshani Radio

Play Episode Listen Later Nov 23, 2020 59:20


Millionaire Mentorship (Ep - 1707) Athena Lucene, Spiritual Thought Leader, Pt 2

Bshani Radio
Millionaire Mentorship (EP 1707) Athena Lucene Spiritual Thought Leader

Bshani Radio

Play Episode Listen Later Nov 16, 2020 39:56


Millionaire Mentorship (EP 1707) Athena Lucene Spiritual Thought Leader

FeatherCast
Apache Solr and Lucene – Atri Sharma

FeatherCast

Play Episode Listen Later Sep 1, 2020 9:58


Apache Solr and Lucene are sister projects, in the search technology space. In this interview, Atri Sharma talks about the plan to make Solr an independent top level project …

Develomentor
Atri Sharma - Database Engineer and Open Source Committer #84

Develomentor

Play Episode Listen Later Aug 20, 2020 34:10


Welcome to another episode of Develomentor. Today's guest is Atri Sharma!As a database engineer and open source committer, Atri Sharma has built a career in tech deeply focused some of today’s most popular databases. Some of these include PostgreSQL, Lucene, Elasticsearch, Presto and HAWQ. After getting his bachelor’s from Jaypee Institute of Information Technology, Atri Sharma has held roles at Amazon, EnterpriseDB, Teradata, Barclays, Pivotal, Microsoft and Intuit. Along the way, he’s become a prolific contributor to several open source projects like Lucene, PostgreSQL and a handful of other Apache Software Foundation Projects. He’s also the author of the upcoming book “Practical Lucene 8”. Stay tuned as we take a deep dive with Atri Sharma into the world of databases and open source!-Grant IngersollIf you are enjoying our content please leave us a rating and review or consider supporting usQuotes“Google Summer of Code has been around for a while now. It is designed in a way to introduce college students and academics to the world of open source development.”“One thing that open source really gives you is the people aspect. You build a network of smart and very accomplished people who can actually mentor and help you grow on the right path.”“When I actually decided to commit myself to open source I always thought it would be in my spare time. I never actually anticipated having a job where I would be expected to contribute features back to open source projects.”—Atri SharmaKey MilestonesWhat is Google Summer of CodeHow has Atri continually found jobs in open sourceAtri has spent his career working on some of the hardest challenges in the database world like adding features, making them faster. What’s it like working deep inside some of the world’s most heavily used data stores?What are key skills necessary to work as a database engineer?Atri has worked for many of the world’s largest tech companies. Why did he choose to work in big tech?Atri’s book ‘Practical Lucene 8’Additional ResourcesOrder Atri’s book ‘Practical Apache Lucene 8’ – https://www.amazon.com/Practical-Apache-Lucene-Capabilities-Application/dp/1484263448Mike McCandless is one of Atri’s mentors; read his blog here – http://blog.mikemccandless.com/Connect with Atri SharmaLinkedInGitHubFollow DevelomentorTwitter: @develomentorConnect with Grant IngersollLinkedInTwitter

Living Wild With Em
Optimize your cycle, women in fitness with Kimberly Baba - EPS 6

Living Wild With Em

Play Episode Listen Later Aug 12, 2020 93:06


In this episode, I am joined with my amazing friend Kimberly Baba who is a fabulous trainer, women's health specialist, and Ironman athlete! Share with someone you love! -2:30 what is your journey and how did you get into women's health? •Focus has always been on male sports/performance •Degree in Exercise Physiology & Nutrition, USA Triathlon Coach •Stacy Simms - Specialist in Women's Health •Courses: Women are not small men, Book- Roar •Losing your period: low energy availability -12:15 Where does education about women's health go wrong? •#1 Research in Colleges - Men Performing and participating the studies women are not designed to be thin -#2 :18:58:- Male scope of society * It must be horrible to be a women / have a period -21:35 The four phases of the menstrual cycle Phase 1& 2- Follicular : different for everyone * Phase 1- starts with day 1 of bleeding * Hormones are lowest: most similar to male physiology * Using glucose for fuel- during exercise you don't need extra glucose for help * FSH, LH, Estrogen, Progesterone are low * More likely to get sick- white blood count low * Phase 2- ovulation * Progesterone low & estrogen rises, FSH rises & spike in LH * Estrogen in anabolic - builds muscle * Try and have a big workout- PR * Fuel is glucose - don't need added glucose * More likely for injury - joint are lax * Ovulation prediction kit If estrogen sensitive you feel bad on ovulation 31:46 Phase 3& 4 Luteal - always 14 days unless medical issue * Phase 3- estrogen & progesterone rise * Using fats for fuel * High hormone phase * Be better at endurance : using fatty acid oxidation as fuel so you can go longer * If you went to do high intensity, you will need more carbohydrates * 33:40- Women absorb 26% of fructose - any fruit during exercise = fructose * At end of phase 3, hormones are at the highest they will be * Phase 4 * 8-9 days before bleeding * Hormones decrease, causes inflammation * Things to help with PMS- for inflammation: 150-200mg magnesium & 45mg zinc & fish oil * BCCAA's- Lucene -46:32- Most optimal way of carb intake •Women use more carbs during the morning, front load •Recommends not going under 150 grams •Women need more protein - need Lucene, 30 grams post workout -53:42- Summary of the 4 Phases -56:11- Luna Endurance •Babas company •The 4 phases of the moon •New moon & Full Moon - being connected to the Earth -1:01:01- Birth Control •the pill has synthetic hormones : progestin •Causes a small spike in progestin everyday •If you have to be on the pill, take pill at night & workout in the morning •IUD- thins the lining & makes hostile environment, will still have cycle -1:06:00- best form of birth control •Copper IUD- no hormones at all, lasts 10 years •other IUDs- localized hormones •for oral- low progestin -1:10:33- How to get off the pill? •optimize each phase - see above •if you don't see results in 3 months, try to look and see if you have any problems •the pill decreases athletic performance by 10% -1:15:20- How do we fix society around male & female pressures? •If you start at 13 or older; more irregular...Before 13, more regular * girls are starting younger due to hormones being added to things •Puberty does not start with bleeding Resources: Stacy Simms- https://www.drstacysims.com ROAR book- https://www.amazon.com/ROAR-Fitness-Physiology-Optimum-Performance/dp/1623366860 No period, now what? 
Book https://www.noperiodnowwhat.com/#:~:text=Now%20What%3F%20is%20the%20most%20comprehensive%20guide%20to,on%20hypothalamic%20amenorrhea%20recovery%20through%20an%20online%20forum. App- https://www.fitrwoman.com Baba's website- https://www.lunaendurance.com

Google Cloud Platform Podcast
Lucidworks with Radu Miclaus

Google Cloud Platform Podcast

Play Episode Listen Later Jul 29, 2020 39:20


Mark Mirchandani is joined again by Priyanka Vergadia this week for an ML-filled interview with Radu Miclaus of Lucidworks. Lucidworks, a company specializing in information retrieval, strives to make data searching easier for developers and users. Building off Solr, Lucidworks created Fusion, an environment more conducive to easy AI-enhanced query capabilities, better scalability, and more. With Fusion, developers can take advantage of the highly advanced relevance tuning tools such as query rewrites, which analyze user behavior and automatically rewrite queries based on that information. On the tech side, Fusion was built with a combination of Java, Kubernetes to increase scalability, Solr management tools, and logging and reporting tools. The engineers at Lucidworks have created Fusion-specific system-enhancing pieces as well, including a machine learning service that allows data scientists to train their models elsewhere and plug them in for a completely customized experience. The team also created Smart Answers, which is a Q-And-A system built on a search engine that can connect to chatbots, virtual assistants, and others. Radu goes into detail explaining the Smart Answers system and how the layers of the project work together. We also learn about the customization capabilities and integration of Smart Answers. Radu wraps up the show with interesting use-case stories and how Fusion is working in the real world. In the future, Lucidworks will be available right in the GCP marketplace! Radu Miclaus Radu has over 12 years of experience in the data science space with applications in general machine learning architecture, search, customer analytics, risk and financial analysis. At Lucidworks, Radu focuses on low-code AI for search developers, pluggable machine learning for data scientists, and cloud managed services that offload the burden of operating search applications. Cool things of the week Week 2 sessions on productivity and collaboration site Online shopping gets more personal with Recommendations AI blog Using new traffic control features in External HTTP(S) load balancer blog Optimizing your costs on Compute Engine video Google Cloud Talks by DevRel site Giving you better cost analytics capabilities—and a simpler invoice blog GCP Podcast Episode 217: Cost Optimization with Justin Lerma and Pathik Sharma podcast Interview Lucidworks site Solr site Lucene site Fusion site Try Fusion site Smart Answers site Spark site Kubernetes site GKE site Dialogflow site Webinar: Smart Answers for Employee and Customer Support After COVID-19 site Deconstructing Chatbots video GCP Podcast Episode 227: Pandium with Cristina Flaschen and Kelly Sarabyn podcast GCP Podcast Episode 188: Conversation AI with Priyanka Vergadia podcast GCP Podcast Episode 195: Conversational AI Best Practices with Cathy Pearl and Jessica Dene Earley-Cha podcast Tip of the week We’re talking to Dale Markowitz about Prototyping Machine Learning projects. You can also hear more from Dale in GCP Podcast Episode 214: AI in Healthcare with Dale Markowitz and GCP Podcast Episode 194: ML with Dale Markowitz. What’s something cool you’re working on? Priyanka has been working on GCP Comics and Sketchnote.
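Fusion itself is a commercial platform, so no snippet for it here, but since the episode describes it as building off Solr, the sketch below shows what a plain Solr query from Python can look like using the third-party pysolr client. The URL, core name, and fields are assumptions for illustration only.

```python
import pysolr  # third-party client: pip install pysolr

# Assumption: a local Solr instance with a core named "techproducts" (the stock Solr example core).
solr = pysolr.Solr("http://localhost:8983/solr/techproducts", timeout=10)

results = solr.search("memory", rows=5)  # simple keyword query, top 5 hits
for doc in results:
    print(doc.get("id"), doc.get("name"))
```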

Distributed Data Show
Awesome Cassandra with Rahul Singh (Part 1 of 2) | Ep. 143 Distributed Data Show

Distributed Data Show

Play Episode Listen Later Apr 7, 2020 16:55


Rahul Singh of Anant talks with Cedrick Lunven about his background building a consultancy around Cassandra, why he loves Lucene, and how he got involved in maintaining Awesome Cassandra, an aggregation of the best Cassandra resources. See omnystudio.com/listener for privacy information.

ED Matters
Episode 179: Dr. Lucene Wisniewski: Borderline Personality Disorder

ED Matters

Play Episode Listen Later Mar 2, 2020 28:11


Today, Kathy welcomes Lucene Wisniewski, PhD, FAED, and they have a conversation on treatment considerations when borderline personality disorder co-occurs with an eating disorder. Lucene provides an immense amount of information in the episode, so please take notes and re-listen if you have to.

Develomentor
Simon Willnauer - Cofounder of Elastic & Berlin Buzzwords (#35)

Develomentor

Play Episode Listen Later Feb 26, 2020 40:34 Transcription Available


Welcome to another episode of Develomentor. Today's guest is Simon Willnauer. Simon is one of the main developers of Apache Lucene and a Lucene committer. He is also a PMC member and Apache Software Foundation member. Simon has led Elasticsearch's core development for over five years and co-founded the Berlin Buzzwords conference on search, store, and scalability.Click Here –> For more information about tech careersEpisode Summary"The big common denominator between concurrency problems and other big problems is that they usually are not solved in front of the computer"—Simon WillnauerIn this episode we’ll cover:The founding stories of Elasticsearch and Berlin Buzzwords!What is the future of search? Why is Java limited in its capabilities?Why the most challenging computer problems are best solved AWAY from the computer. Simon has had breakthroughs while running!Key Milestones[1:43] – Simon wrote his first line of code when he was 22. He was working at a computer security company when he got introduced to Lucene. Open source captured Simon's attention, and a few years later, he co-founded Elasticsearch[7:23] – Simon was a mechanic before going to university. He got into coding because it was creative and had endless possibilities. [9:43] – Simon mentions his early career mentors. Now, as a mentor himself, Simon points out that mentorship is bidirectional: he always learns from his mentees as well.[12:39] – Solving challenging problems involves making a mental model and solving them away from the computer. Simon solves many problems while exercising![15:15]- The founding stories of Elasticsearch and Berlin Buzzwords![18:59] – What goes into running Berlin Buzzwords? Hint: running a conference is stressful![24:22] – What were the difficulties in founding Elasticsearch? [28:09] - How to deal with founding a company and dealing with handing responsibility to others. At first. it's challenging, but success is when the company can run without you at the center of it. [31:08] - What is the future of search? Why is Java limited in its capabilities? It's great that we're increasing computing power, but we also must change the way we think about search[37:29] - The importance of Lucene and how many companies and products are inspired by it. You can find more resources and a full transcript in the show notesTo learn more about our podcast go to https://develomentor.com/To listen to previous episodes go to https://develomentor.com/blog/Follow Simon WillnauerTwitter: @s1m0nwLinkedIn: linkedin.com/in/simonwillnauer/Github: github.com/s1monwFollow Develomentor:Twitter: @develomentorFollow Grant IngersollTwitter: @gsingersLinkedIn: linkedin.com/in/grantingersoll

Python Podcast
Suchmaschinen

Python Podcast

Play Episode Listen Later Feb 24, 2020 96:13


Today's topic was full-text search engines. We talk about what they fundamentally do, how you can use them from Python, and how you could even implement one yourself. Further topics were the relevance of search results, SEO, and everything else around it. We also made good on our threat from earlier episodes and talk a bit about the pathlib module from the standard library.

Shownotes
Our email for questions, suggestions & comments: hallo@python-podcast.de
News from the scene: Ticket sales for EuroPython 2020 start soon; Python 3.8.2
Module from the standard library: pathlib
Meta topic: WDR 5 Das philosophische Radio
Full-text search engines: Lucene - by now the standard library for full-text search; Solr - a search server built on Lucene; Elasticsearch - another search server built on Lucene; xapian; Sphinx; whoosh
Full-text search in Python: FTS5 full-text search extension for sqlite; PostgreSQL full-text feature; MariaDB full-text feature; zombodb; variable byte encoding; TREC Conference series; BM25 / Okapi; PageRank; RediSearch full-text extension for redis; learning to rank; NDCG, MAP, ERR; Django Postgres full text search
Picks: The Algorithms - Python; read JSON directly in Python: Armin Ronacher's tweet; Python development environment on Windows tutorial: pyenv installation with PowerShell
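
The shownotes list BM25/Okapi as the relevance formula most of these engines lean on. As a rough, engine-agnostic illustration (not taken from the episode), here is a minimal sketch of the score a single query term contributes to a document, using the common defaults k1 = 1.2 and b = 0.75 and a Lucene-style non-negative IDF:

```java
public final class Bm25 {
    private static final double K1 = 1.2; // term-frequency saturation
    private static final double B = 0.75; // strength of document-length normalization

    /**
     * BM25 contribution of one query term to one document's score.
     *
     * @param termFreq     occurrences of the term in the document
     * @param docLength    document length in tokens
     * @param avgDocLength average document length in the collection
     * @param docCount     total number of documents
     * @param docFreq      number of documents containing the term
     */
    public static double score(int termFreq, int docLength, double avgDocLength,
                               long docCount, long docFreq) {
        // Lucene-style IDF: the +1 inside the log keeps the value non-negative.
        double idf = Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
        double lengthNorm = K1 * (1 - B + B * docLength / avgDocLength);
        return idf * (termFreq * (K1 + 1)) / (termFreq + lengthNorm);
    }

    public static void main(String[] args) {
        // A term occurring 3 times in a 120-token document, in a corpus of 10,000
        // documents (average length 100) where 50 documents contain the term.
        System.out.println(score(3, 120, 100.0, 10_000, 50));
    }
}
```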

Develomentor
Ep. 1 Origins

Develomentor

Play Episode Listen Later Sep 10, 2019 13:14


Welcome to the first episode of our brand new podcast Develomentor. We are a career based podcast about technology and we interview people working all sorts of jobs in tech, not just software engineers! Each interview will focus on helping you learn more about the role, what’s required to be successful in it and how that role is changing. More importantly, we hope you’ll see that there are a lot of different paths to be successful in your own career. If you stick with us, you’ll meet IT product managers with social work degrees and heads of sales with engineering degrees. You’ll meet engineers who “got all the degrees” and others who did a six week coding bootcamp and are running the show. You’ll meet academics and practitioners across a wide range of industries. You might even meet a few artists. Most importantly, we hope you’ll meet your calling. For full episode show notes click here

Python Podcast
PP05 - Datenbanken

Python Podcast

Play Episode Listen Later Feb 24, 2019 190:46


This time we sat down to talk about databases and Python. Databases are a wide field, which is why this episode also turned out a bit longer.

Shownotes
Databases: Postgres, MySQL, MariaDB, MongoDB, CouchDB, Dgraph, Neo4j, Redis, InfluxDB, TimescaleDB, Lucene, Solr, Elasticsearch
Python ORM: Django, SQLAlchemy, Pony, peewee
"Big Data": Ibis, Arrow, pyspark
Papers: A Relational Model of Data for Large Shared Data Banks; C-Store: A Column-oriented DBMS
Picks: Sqlite, Datasette, async binary driver for Postgres, Pickle
Sources: Data serialization formats; Taking a tour of Postgres; Everything is miscellaneous; Method Chaining; Implementing faceted search with Django and PostgreSQL; Data Warehousing for Cavemen

Data Driven
Jake Mannix on Recommendation Systems, Physics, and the Best Answer to Our Questions Yet

Data Driven

Play Episode Listen Later Sep 5, 2018 47:01


In this episode, Frank and Andy talk to Jake Mannix, an accomplished data engineer who has had an amazing career. Links Sponsor: Audible.com (http://thedatadrivenbook.com) – Get a free audio book when you sign up for a free trial! Notable Quotes Frank heartily recommends Head Strong (https://www.audible.com/pd/Head-Strong-Audiobook/B01N7TZPXR) ([00:50]) It takes a lot of time to fly from Farmville to India… ([2:40]) Audible (http://thedatadrivenbook.com) is a sponsor! ([04:00]) Jake is at LucidWorks (https://lucidworks.com/) ([06:15]) Jake describes a cool Systems Development Cycle… ([10:00]) … and compares and contrasts with Customer Service and Product Engineering. ([12:00]) Sometimes this cycle works “too good!” ([13:10]) Learn more about Lucene (http://lucene.apache.org/core/). ([15:00]) We now search by typing questions. Not terms. ([15:40]) More about Query-Intent Classification (https://en.wikipedia.org/wiki/User_intent)… ([17:10]) “How Do I Do Brain Surgery (https://www.google.com/search?q=how+do+I+do+brain+surgery&oq=how+do+I+do+brain+surgery)” – Google Search ([18:25]) Smart search is based on closed-loop analysis of “who clicks on what.” ([19:30]) One example of public search datasets: Google Trends (https://trends.google.com/trends/?geo=US). ([21:30]) What is a garage-door opener? ([22:00]) “We expect a lot from search these days.” ([26:10]) Searching Amazon for garage door (https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=garage+door). ([26:40]) “Data is their foundation.” ([28:00]) “I’m a data engineer.” ([31:00]) “Math is fun but it’s 5% of the work.” What is most of the work? Data cleansing. ([31:20]) On tensors (https://www.grc.nasa.gov/www/k-12/Numbers/Math/documents/Tensors_TM2002211716.pdf)… ([31:40]) Jake likes to run long distances over mountains! ([33:10]) Jake thinks self-driving cars will be cool. ([37:30]) Jake shares his encounter with the Moonies, which wins the Something Different About Yourself segment (to date). ([39:35]) Jake and his daughter like to listen to audio books (http://thedatadrivenbook.com). ([43:20]) Learn more about Jake on LinkedIn (https://www.linkedin.com/in/jakemannix/)! ([44:00]) Jake helped build LinkedIn (https://www.linkedin.com)‘s search engine! ([44:44]) Check out the LucidWorks blog (https://lucidworks.com/blog/). ([45:00])

The Cloudcast
The Cloudcast #347 - The Critical Skills for AI and ML

The Cloudcast

Play Episode Listen Later May 17, 2018 22:52


Aaron and Brian talk with Chao Han (VP & Head of R&D at @Lucidworks) about critical Data Science skills for AI and ML, how Data Scientists engage with Business Leaders, the Lucidworks AI platform, and emerging research in the AI/ML space. Show Links: Lucidworks Homepage Apache Lucene Apache Solr Kaggle A Cloud Guru - Big Data Certification Python Programming for Beginners R Programming for Beginners [PODCAST] @PodCTL - Containers | Kubernetes | OpenShift - RSS Feed, iTunes, Google Play, Stitcher, TuneIn and all your favorite podcast players [A CLOUD GURU] Get The Cloudcast Alexa Skill [A CLOUD GURU] A Cloud Guru Membership - Start your free trial. Unlimited access to the best cloud training and new series to keep you up-to-date on all things AWS. [A CLOUD GURU] FREE access to AWS Certification Exam Prep Guide - At A Cloud Guru, the #1 question received from students is "I want to pass the AWS cert exam, so where do I start?" This course is your answer. [FREE] eBook from O'Reilly Show Notes Topic 1 - Welcome to the show. Let’s talk about your background, as it’s very rich both in the breadth of technology as well as the industries where you’ve applied your skills. Topic 2 - Let’s start with the basics of what companies do when they hire or engage data scientists. At what point do they typically shift their thinking (or problem solving) from needing Statistical Analysis to understanding when and where AI or ML are a better fit? Topic 3 - Lucidworks specializes in Enterprise Search, and is one of the primary sponsors of both Apache Lucene and Solr. How do those technologies fit into the Enterprise today, and how are newer AI/ML frameworks being built in conjunction with them? Topic 4 - As you’re working with Data Scientists, where do you see their interests evolving, or what types of problems interest them today? How do you connect and communicate this to business challenges and business leaders? Topic 5 - Can you share some of the research that you’ve recently been focused on that begins to make it easier for non-data scientists to start answering questions in a faster or easier manner? Feedback? Email: show at thecloudcast dot net Twitter: @thecloudcastnet and @ServerlessCast

The Eating Disorder Recovery Podcast
DBT for Eating Disorder Recovery

The Eating Disorder Recovery Podcast

Play Episode Listen Later Mar 21, 2017 41:39


In this podcast, Tabitha talks to Dr Lucene Wisniewski on how DBT can be implemented in treatment for eating disorders such as Bulimia Nervosa, and Binge Eating Disorder. Lucene Wisniewski, PhD, FAED is a clinician, trainer and researcher whose interests center around using empirically founded treatments to inform clinical practice.  Lucene, an Adjunct Assistant Professor of Psychological Sciences at Case Western Reserve University, has taught more than 150 workshops, lectures and presentations on Cognitive Behavioral and Dialectical Behavior Therapies internationally and has over 40 publications in peer reviewed journals and invited book chapters.  From 2006-2014, Lucene was Clinical Director and co-founder of the Cleveland Center for Eating Disorders, a comprehensive eating disorder treatment program offering evidenced based care. She served as the Chief Clinical Integrity Officer of the Emily Program, a multi-state eating disorder program from 2014-2017.   Lucene has been elected fellow, has served on the board of directors, and as the co-chair of the Borderline Personality Disorder special interest group of the Academy for Eating Disorders (AED).  In 2013 the AED awarded Lucene the Outstanding Clinician Award to acknowledge her contribution to the field of eating disorder treatment. She is currently the owner and founder of Lucene Wisniewski, PhD, LLC and DBTOHIO.   www.lucenewisniewski.com   We want your feedback! Please take a second to fill out this survey with feedback so we can make these podcasts even better: https://www.surveymonkey.co.uk/r/BSQ7BBM Cheers!

Search Disco
Trey Grainger Chats about Solr and Lucene Revolution Part. 2

Search Disco

Play Episode Listen Later Mar 6, 2017 19:56


Search Disco
Trey Grainger Chats about Solr and Lucene Revolution Part. 1

Search Disco

Play Episode Listen Later Feb 4, 2017 45:40


UC Berkeley School of Information
Challenges for the Data Ecosystem (Doug Cutting, Chief Architect, Cloudera)

UC Berkeley School of Information

Play Episode Listen Later Dec 18, 2015 56:15


Use of new data technologies now pervades our institutions, both private and government. But this data-driven revolution is far from complete. We can still influence where it takes us. I will discuss some of the current challenges we face, both technical and social, and how we might address them. Doug Cutting (@cutting) is the founder of numerous successful open-source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera in 2009 from Yahoo!, where he was a key member of the team that built and deployed a production Hadoop storage and analysis cluster for mission-critical business analytics. Doug holds a bachelor’s degree from Stanford University and sits on the board of the Apache Software Foundation.

Les Cast Codeurs Podcast
LCC 137 - si tu chiffres quand je déchiffres

Les Cast Codeurs Podcast

Play Episode Listen Later Dec 1, 2015 97:03


The Cast Codeurs discuss the news and dig into the substance in this episode. To name just a few topics, they talk about Devoxx, the modus operandi of the Apache and Eclipse foundations, code coverage, hybrid web development, tooling, security, and status pages. Recorded on November 26, 2015. Episode download: LesCastCodeurs-Episode–137.mp3

News
Devoxx: Discussion about the Devoxx conferences
Languages: Java the missing features on InfoQ by Ben Evans; Ceylon 1.2; JavaScript for Java developers; Groovy accepted as an Apache TLP; Groovy doubling downloads; The perverse side of code coverage
Infrastructure, Middleware and Cloud: GORM 5 with support for Hibernate ORM 5; Lucene the good parts; Vert.x @ Eclipse; Red Hat and Microsoft, what?!; Fedora 23; Docker Compose + Swarm vs Kubernetes; ECC memory or not; Raspberry Pi Zero
Web and mobile: CodeLabs Android; BaseCamp's native app over time; Android Studio version 2.0
Data: Bolt, Neo4j's binary protocol; Google TensorFlow: I understood nothing, faster than lightning; MongoDB 3.2, with left outer join
Tooling: Visual Studio Code is open sourced; More memory for IntelliJ makes the difference; Maven requires JDK 7 (since 3.3.x actually :-) ); FYI: statistics on the Java versions used to deploy Jenkins; Maven Central on Google Storage; npm for Eclipse; Red Hat acquires Ansible
Security: The CNIL calls out poor security; Chrome extensions that unblock (ads); Encryption in Azure; The commons-logging vulnerability and the JBoss and WildFly products
Debate: A status page for your services
Beginner's corner: Stack Overflow
Tool of the episode: Xip.io
Conferences: Codeurs en Seine - Rouen - November 26, 2015; Snowcamp - snow - January 21–22; Breizhcamp March 23–26; Devoxx France April 20–22; Mix-IT April 21 and 22
Contact us: Reach us on Twitter https://twitter.com/lescastcodeurs, on the Google group https://groups.google.com/group/lescastcodeurs or on the website https://lescastcodeurs.com/ Flattr us (donations) at https://lescastcodeurs.com/ Want to know more about sponsoring? sponsors@lescastcodeurs.com

dotNETpodcast
Elasticsearch - Gian Maria Ricci

dotNETpodcast

Play Episode Listen Later Jul 13, 2015 70:55


The topic of this episode, Elasticsearch, is a "sui generis" subject compared with the other topics we have covered. It is an open source search engine developed in Java and based on another open source project, Lucene (which also has a .NET port, namely Lucene.NET). Since it is now a product with very wide adoption in medium and large companies, and developers working with Microsoft technologies may well find themselves having to use it, we thought we would ask an expert like Gian Maria Ricci to walk us through some of the peculiarities of this server and how we can use it, for example, from Visual Studio.

Enterprise Java Newscast
Episode 28 – Jul 2015

Enterprise Java Newscast

Play Episode Listen Later Jul 12, 2015


Kito and Daniel discuss new releases from PrimeFaces, OpenWebBeans, DeltaSpike, Spring Boot, Polymer, AngularJS, WebAssembly, Play, Lucene, new JSF extensions, and more.

Enterprise Java Newscast
Episode 28 - Jul 2015

Enterprise Java Newscast

Play Episode Listen Later Jul 12, 2015 65:18


Kito and Daniel discuss new releases from PrimeFaces, OpenWebBeans, DeltaSpike, Spring Boot, Polymer, AngularJS, WebAssembly, Play, Lucene, new JSF extensions, and more. They also discuss Microsoft’s open-source strategy and Visual Studio Code. Keep up with the alphabet soup of product names. Check out our technical glossary! UI Tier OmniFaces 2.1 released! New F12 Developer Tools for the New Microsoft Edge Liferay Faces Project News - May 2015 AngularJS + CDI = AngularBeans Web framework: Introducing Juzu version 1.0 and its brand new website Apache Tobago 2.0.8 Release Polymer 1.0 PrimeFaces introduces Rio theme and layout New Releases for PrimeFaces Layouts PrimeUI 2.0 Released Recent Ripple of JSF Extensions ButterFaces Material Prime Generjee JSR 378: Portlet 3.0 Bridge for JavaServerTM Faces 2.2 Specification WebAssembly: A Universal Binary and Text Format for the Web Play 2.4.0 “Damiya” released, adds new DI support and test APIs Persistence Tier [ANNOUNCE] Apache Lucene 5.2.0 released Services (Middleware & Microservices) Tier Oracle Developer Cloud Service 15.2.2 Released Spring for Apache Hadoop 2.2 GA released [ANN] End of life for Apache Tomcat 6.0.x [ANNOUNCEMENT] HttpComponents Client 4.5 GA Released [ANNOUNCE] Apache OpenWebBeans 1.6.0 [ANNOUNCE] Release of Apache DeltaSpike 1.4.0 Apache Allura 1.3.0 released [ANNOUNCE] Apache Flume 1.6.0 released [ANNOUNCE] Apache Calcite 1.3.0 (incubating) released [ANNOUNCE] Commons Email version 1.4 released Misc   CRaSH Spring Boot 1.2.4 released Spring Social 1.1.2 Released JBoss Fuse 6.2 is out! Infinispan 7.2.3.Final Discussion JavaEE or Spring? Neither! We Call Out For a Fresh Competitor! Visual Studio Code, and Microsoft OSS will it affect Java Enterprise ? https://code.visualstudio.com/Download MS fork of NodeJS Events Javazone, Oslo - September 9-10, 2015 No Fluff Just Stuff Austin July 10 - 12, 2015 ÜberConf, Denver July 21 - 24, 2015 Raleigh August 21 - 22, 2015 SpringOne 2GX, Washington DC September 7-14, 2015 Atlanta September 18 - 20, 2015

Tech Talk Radio Podcast
June 20, 2015 Tech Talk Radio Show

Tech Talk Radio Podcast

Play Episode Listen Later Jun 20, 2015 58:55


OPM security breach (identity theft protection), HD calling on Verizon (setting up simultaneous voice and data for iPhone 6), laptop backup options (Carbonite and others, Dropbox and others), 3D printer options (from budget to semi-pro), Profiles in IT (Doug Cutting, creator of the Hadoop, Lucene, and Nutch open-source projects), Stupid Idea of the Week (text walking lanes), ballot machines are easy hacking targets (Wi-Fi links with default passwords, many other vulnerabilities), Internet of Things (needs standards now, an if-this-then-that platform can provide linkage), and optimizing iPhone storage (delete space-hogging apps, stream rather than store music, store full-resolution pics in iCloud, delete old texts). This show originally aired on Saturday, June 20, 2015, at 9:00 AM EST on WFED (1500 AM).

Enterprise Java Newscast
Episode 23 – Dec 2014

Enterprise Java Newscast

Play Episode Listen Later Dec 18, 2014


Kito, Ian, and Daniel cover new releases of AngularJS, PrimeFaces, MyFaces, Bootstrap, Hadoop, Spring Roo, Tomcat, Arquillian, Spring Framework, Spring Integration, Akka, Solr, Lucene, and more.

Enterprise Java Newscast
Episode 23 - Dec 2014

Enterprise Java Newscast

Play Episode Listen Later Dec 18, 2014 65:02


Kito, Ian, and Daniel cover new releases of AngularJS, PrimeFaces, MyFaces, Bootstrap, Hadoop, Spring Roo, Tomcat, Arquillian, Spring Framework, Spring Integration, Akka, Solr, Lucene, and more. They also discuss the forking of Node.js, microservices vs app servers, and a recent blog post about why you shouldn’t use JSF. UI Tier RichFaces 4.5.1.Final released MyFaces Core 2.2.6 released PrimeFaces Elite triple released AngularJS 1.3.0 Aria support Bootstrap 3.3.1 released Persistence Tier Apache Hadoop 2.5.0 is released Spring Roo 1.3.0 introduces JDK 8 support Spring for Apache Hadoop 2.0.3 Released Testing Arquillian OSGi 1.1.1.Final released Arquillian Cube Extension 1.0.0.Alpha1 Misc Spring Integration Java DSL 1.0 GA Released Spring Security OAuth 2.0.4.RELEASE Available Now Spring Framework 4.1.2 & 4.0.8 & 3.2.12 released Akka 2.3.7 Maintenance Release Apache Tomcat 6.0.43 released Release of Apache DeltaSpike 1.1.0 Apache Camel 2.13.3 Released Apache Solr 4.10.2 released Apache Lucene 4.10.2 released Node.js forked Discussion Is Middleware done for in favor of Microservices? Why You Should Avoid JSF Ian’s slides on integration JSF with front-end frameworks Events No Fluff Just Stuff Jfokus - Stockholm, Sweden Feb 2-4th, 2015 DevNexus - Atlanta, GA, USA (call for papers is open; Kito and Daniel will both be speaking) Mar 10-12th, 2015

Fréquence Valtech
Fréquence Valtech - épisode 10 : Classification de textes avec Apache Lucene/SOLR et LIBSVM

Fréquence Valtech

Play Episode Listen Later Jan 22, 2014 52:59


Majirus Fansi is the guest of this 10th episode of the Fréquence Valtech podcast, on the theme of text classification with Apache Lucene/SOLR and LibSVM. The problem is: "How do you automatically assign one or more categories to a given text?" The simplest but non-automated way is to ask a domain expert to read the text and decide which categories it belongs to. In this episode, Majirus explains the unique approach he used to tackle this problem.
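
The episode description does not spell out the exact pipeline, but the usual bridge between a Lucene/Solr-analyzed corpus and LibSVM is to turn each document into a sparse bag-of-words vector and write it in LibSVM's `label index:value` line format. A minimal, hypothetical sketch follows (raw term counts stand in for the TF-IDF weights a real pipeline would likely use, and the labels and sample texts are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class LibSvmExport {
    // Vocabulary: term -> 1-based feature index (LibSVM indices start at 1).
    private final Map<String, Integer> vocab = new LinkedHashMap<>();

    private int featureIndex(String term) {
        // computeIfAbsent sees the size *before* insertion, so +1 yields 1, 2, 3, ...
        return vocab.computeIfAbsent(term, t -> vocab.size() + 1);
    }

    /** One LibSVM training line: "<label> idx:value idx:value ..." with ascending indices. */
    String toLine(int label, String text) {
        TreeMap<Integer, Integer> counts = new TreeMap<>(); // TreeMap keeps indices sorted
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(featureIndex(token), 1, Integer::sum);
            }
        }
        StringBuilder line = new StringBuilder().append(label);
        counts.forEach((idx, tf) -> line.append(' ').append(idx).append(':').append(tf));
        return line.toString();
    }

    public static void main(String[] args) {
        LibSvmExport export = new LibSvmExport();
        // Hypothetical two-class sample: 1 = sports, -1 = finance.
        System.out.println(export.toLine(1, "the match was won in extra time"));
        System.out.println(export.toLine(-1, "the market closed lower on Friday"));
    }
}
```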

Unsupported Operation

Unsupported Operation Episode 82

Misc
JavaBlogs.com closed by Atlassian.
Latest Dart VM beats Java 64-bit VM in the Delta Blue benchmark.
Java version number scheme officially updated to make managing all these security fixes easier.
Newly minted just in time for more unmaintainable build messes: Gradle 1.6? Progress being made on the Android build system; Facebook builds their own Buck.
IntelliJ IDEA 12.1.3 - moar upgrades!
YourKit Java Profiler 12.0.5.
Typesafe release Typesafe Activator - part of the Typesafe Platform.
JaCoCoverage: Java 7 code coverage for NetBeans.
Before Google I/O, Square announce Seven Days of Open Source and release (thus far) OkHttp, Dagger, MimeCraft, ProtoParser, JavaWriter, Robolectric 2.0, and IntelliJ plugins - a lot focused around Android.
Java gets a REPL - scary, and awesome, but very, very scary.
Sonar Groovy.
Cassaforte - Clojure client API for Apache Cassandra 1.2+.

Apache
Apache Camel 2.11 - new Camel CMIS module for integration with CMS systems, new camel-couchdb and camel-elasticsearch modules.
Three major defects found in Tomcat (groan, yes again) - chunked transfer encoding extension size is not limited, session fixation with the FORM authenticator, request mix-up if an AsyncListener method throws RuntimeException. Tomcat 6.0.37 released. 7.something too.
Apache Curator 2.0.0-incubator - A ZooKeeper Keeper - utilities making ZooKeeper easier to use.
Apache Gora 0.3 - in-memory and persistence for Big Data using Hadoop.
Apache HttpComponents HttpCore 4.3-beta2.
Apache Buildr 1.4.12.
Apache Giraph 1.0 - first release out of incubation - Apache Giraph is a scalable and distributed iterative graph processing system that is inspired by BSP (bulk synchronous parallel) and Google's Pregel. Giraph distinguishes itself from those projects by being open source, running on Hadoop infrastructure, and going beyond the Pregel model with features such as master computation, sharded aggregators, out-of-core support, a no-single-point-of-failure design, and more.
Lucene and Solr 4.3.
Commons Codec 1.8.
Apache Marmotta 3.0 (incubating) - Apache Marmotta is an extensible and modular implementation of a Linked Data Platform and contains a server component as well as many useful libraries for accessing Linked Data resources. Apache Marmotta is a continuation of the Linked Media Framework (LMF) originally developed by Salzburg Research, but considerably improved with respect to modularity, reliability, performance and also licensing. Since the last LMF release was the 2.x series, Apache Marmotta starts with the version number 3.0.0-incubating.
OpenJPA 2.2.2.

Other news
Greg's favourite NoSQL database gets incremental backup.

Software Engineering Radio - The Podcast for Professional Software Developers
Episode 187: Grant Ingersoll on the Solr Search Engine

Software Engineering Radio - The Podcast for Professional Software Developers

Play Episode Listen Later Jul 18, 2012 51:58


Recording Venue: Lucene Revolution 2012 (Boston) Guest: Grant Ingersoll Grant Ingersoll, a committer on Apache Solr and Lucene, talks with Robert about the problems of full-text search and why applications are taking control of their own search, and then continues with a dive into the architecture of the Solr search engine. The architecture portion of the […]
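
For readers who want to poke at Solr before listening, the quickest way in is its HTTP search handler rather than the internals discussed in the episode. The sketch below is assumption-laden — a local Solr on the default port 8983 with a core named `articles` is made up for illustration — and simply issues a query and prints the raw JSON response:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SolrQueryDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical local Solr instance with a core named "articles".
        String core = "articles";
        String q = URLEncoder.encode("title:lucene", StandardCharsets.UTF_8);
        URI uri = URI.create("http://localhost:8983/solr/" + core
                + "/select?q=" + q + "&rows=5&wt=json");

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // Solr answers with a JSON envelope: responseHeader plus response.docs.
        System.out.println(response.body());
    }
}
```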

Unsupported Operation

Unsupported Operation 72

Google/Oracle trial. APIs are apparently copyright. Huh? So Harmony was infringing copyright all along? The Judge has made the call that HE will decide - I don’t think he has yet done so…?
Google acquires Unisys patents, including Java API patents: Generation of Java language application programming interface for an object-oriented data store (JDBC? EJB ‘repositories’?, JCR content repositories?…); Common gateway which allows JAVA applets to make program calls to OLTP applications executing on an enterprise server (reference to co-pending applications).
Henri Gomez’s JDK build project now has JDK8 builds for Lambda and Jigsaw.
Gerrit 2.3 is out - new draft reviews - nice.
FEST-Reflect 1.3
FEST-Assert 2.0m2
Flyway DB Migrations 1.6.1
Damn Handy URI Templates
Sonar 3 is out, new developer cockpit - looks nice but EXPENSIVE - per-developer stats

Jetbrains
Kotlin M1 - Maven repository

Clojure
Chas Emerick releases Friend - an auth lib for Clojure
Clojure 1.4 released - not yet mentioned on the website it seems tho.
New clojars application/site release

Scala
Typesafe Stack 2.0.1 released
Scala IDE for Eclipse M1 released

Groovy
Grails is now 2.0.3 after Windows-related issues were found.

Apache
Compress Antlib 1.2 released
Commons Compress 1.4
OpenWebBeans 1.1.4
Tomcat 7.0.27
IvyDE 2.2.0 beta 1
Camel 2.9.2
CXF 2.6.0
HTTP Server 2.4.2
Rave 0.10.1 (mashup engine)
OFBiz 10.04.02
Commons IO 2.3
BVal 0.4 (implements the Bean Validation 1.0 spec)
Axiom 1.2.13 - XML model something or other that was part of Axis 2
Lucene and Solr 3.6 (does this mean a new version of Elastic Search soon?)
MyFaces Core 2.1.7 / 1.2.12 / 1.1.10
Accumulo 1.4.0 (key/value store big table based on hadoop, zookeeper and thrift)
Empire DB 3.0 - alternative to JPA - http://empire-db.apache.org/empiredb/hibernate.htm

Other
Meteor decides to change their license to MIT
Lean is officially cooler than Agile

Ruby NoName podcast
Ruby NoName Podcast S04E05

Ruby NoName podcast

Play Episode Listen Later Mar 22, 2012 32:11


News
Rails 3.0.12, 3.1.4, and 3.2.2 are out
Comrade Konstantin: about the times and about himself. By the way, the little book “Sinatra: Up and Running” mentioned in the interview is also quite good. It can be recommended as an academic guide for anyone who wants to understand how to properly cook up web (and other) middleware in Ruby and all that.
Deploy like on Heroku
On March 4 GitHub shipped an update related to the mass vulnerabilities on that site
On March 6 Vagrant 1.0 was released
On March 7 Bundler 1.1 was released
Lightrail - a lightweight Rails stack for JSON applications
Ruby 2.0 Enumerable::Lazy
Except.io - a service similar to airbrake.io

Discussion
Full-text search systems:
Sphinx - a full-text search system by Andrew Aksyonov
Full Text Search in Postgresql - the full-text search system built into Postgresql
Elasticsearch
Solr - a full-text search server from the Apache Foundation
Lucene - a full-text search engine from the Apache Foundation

Unmasking Ivan Samsonov
Ivan's profile on Moi Krug
Ivan's profile on LinkedIn
Wheely - the company where Ivan currently works
RGGU - and this is where Ivan currently studies
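
Since two of the systems in the discussion block (Elasticsearch and Solr) are driven over HTTP, a small hedged example may help: the snippet below sends a basic match query to a local Elasticsearch node using Java's built-in HTTP client. The default port 9200 and an index named `posts` are assumptions for illustration, not anything from the episode.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EsMatchQuery {
    public static void main(String[] args) throws Exception {
        // Hypothetical local Elasticsearch node and an index named "posts".
        String body = "{\"query\":{\"match\":{\"title\":\"full text search\"}}}";
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:9200/posts/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The hits.hits array in the JSON response carries matching documents and scores.
        System.out.println(response.body());
    }
}
```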

Unsupported Operation

Unsupported Operation 64

Java / Misc
JavaFX 2.1 gets MPEG4 playback
Scala artifacts now in central
Github's mashup of Jenkins called Janky
The state of IcedTea and IcedTeaWeb video from FOSDEM
Spring Data JPA 1.1.0 RC1 and 1.0.3 GA released: http://bit.ly/xkOR9C
PrimeFaces 3.0 - a year-long development, its tagline is Ajax, Mobile and IE9 components. IE9 components????
Scandal: ICEFaces is just a rip off of PrimeFaces
Spring Roo 1.2.1 available, a patch release which brings support for the new PrimeFaces and latest GAE
Query Time Joining makes it into Lucene 3.6 (but a different impl from 4.0, which is 3x faster)

Google
Google App Engine "Community Support" moved to Stack Overflow
Fails in its attempt to keep email out of court on Android

Hardware x-over
Sheeva Plug, the box from Globalscale that the FreedomBox is based on, also has a JVM+OSGi kit on an SD card.
Speaking of OSGi, Distributed OSGi RI 1.3 is out, based on Apache CXF

Apache
Richard moved to Maven 3.0.4 and is having no problems
Apache Jackrabbit 2.4.0, 2.2.11 released http://jackrabbit.apache.org - lots of new features, fixes and improvements
(not Java, but) Apache libcloud gone 0.8.0 http://libcloud.apache.org/
Apache MyFaces CVE-2011-4367: an Apache MyFaces information disclosure vulnerability affects MyFaces 2.0.1 - 2.0.11 and 2.1.0 - 2.1.5. MyFaces JavaServer Faces (JSF) allows relative paths in the javax.faces.resource 'ln' parameter, or writing the URL so the resource name includes '..' sequences. An attacker could use the security vulnerability to view files that they should not be able to, e.g. http://…/faces/javax.faces.resource/../WEB-INF/web.xml
MyFaces Core 2.0.12 and 2.1.6 released
Apache Directory Studio 2.0M2
Apache Directory DS 2.0.0-M5
Apache LDAP API 1.0.0-M10
HttpClient 4.1.3 GA
Apache Hive 0.8.1 - distributed data warehouse on top of Hadoop
Commons Configuration 1.8
Commons Validator 1.4
Lucy 0.3 (incubating) - Apache Lucy is a full-text search engine library written in C and targeted at dynamic languages

Devchat.tv Master Feed
078 TMTC Chris Mattmann – OODT/NASA

Devchat.tv Master Feed

Play Episode Listen Later Dec 30, 2011 57:38


Chris Mattmann is a Software Engineer at NASA's JPL. He's the VP of OODT in the Apache Software Foundation and an adjunct professor at USC. OODT is a framework for managing data from multiple sources and adding them to other data sources for different purposes (like a database and a search engine). It manages hundreds of thousands of jobs in a day and terabytes or petabytes of data from various sources. Mentioned in this episode: Apache OODT Nutch Hadoop Apache Software Foundation NASA NASA JPL ftp sftp Solr Lucene Hive File Catalog vs Search Engine Tika Goodle Project Management was the hard part Assume that failure happens and recover quickly Ganglia Torque PBS struts IDL CHLA (Children's Hospital of LA) VPICU OODT Contact page (info on mailing lists, etc.)

OpenUpon
OpenUpon Podcast #10 - Lucene Indexing (1 of 2)

OpenUpon

Play Episode Listen Later Nov 21, 2011


Phil and Andy are back! They start up the podcast after about 1 year of hibernation! These two discuss the Lucene indexer, Solr, and Lucene’s .NET port. They touch on how to start using Lucene,...

Devchat.tv Master Feed
073 TMTC Grant Ingersoll – Lucene & Solr

Devchat.tv Master Feed

Play Episode Listen Later Nov 3, 2011 59:28


Lucene is a terrific tool for powering searches. Solr adds a layer of functionality on top of it that makes things even easier to use. In this interview, Grant and I discuss the ins and outs of using Lucene to power searches on your websites.
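
As a taste of what "using Lucene to power searches" looks like in code, here is a minimal index-then-search sketch against the plain Lucene API. It assumes a Lucene 8.x-style API (class names have shifted across major versions, and the 3.x-era API current when this episode aired looked somewhat different), with roughly lucene-core and lucene-queryparser on the classpath; Solr wraps all of this behind an HTTP server, so with Solr you rarely write it by hand.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class LuceneSketch {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory index = new ByteBuffersDirectory(); // in-memory index for the demo

        // Index a couple of tiny documents.
        try (IndexWriter writer = new IndexWriter(index, new IndexWriterConfig(analyzer))) {
            for (String title : new String[] {"Lucene in Action", "Solr and Lucene power search"}) {
                Document doc = new Document();
                doc.add(new TextField("title", title, Field.Store.YES));
                writer.addDocument(doc);
            }
        }

        // Parse a query against the "title" field and print the matching titles.
        Query query = new QueryParser("title", analyzer).parse("lucene");
        try (DirectoryReader reader = DirectoryReader.open(index)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                System.out.println(searcher.doc(hit.doc).get("title"));
            }
        }
    }
}
```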

Devchat.tv Master Feed
Interview with Josh Berkus – Part 2

Devchat.tv Master Feed

Play Episode Listen Later Aug 1, 2011 58:39


In this episode we discussed: MongoDB Standardization of NoSQL databases Portability between non-relational databases CouchDB PostgreSQL AGPL license PostgreSQL license (like the BSD license) MySQL is GPLv2 Drizzle has rewritten their MySQL driver so it’s not GPL Oracle’s behavior toward products they own that compete InnoDB MySQL engine Microsoft SQL – The price hike and bug report that drove Josh to PostgreSQL Customer expectations vs Intended functionality GreenPlum Alexa Implementing the minimum feature set and getting feedback. Transactional DDL – All operations are transactional except create database. Database Migrations – PostgreSQL can do migrations with no downtime. Memcached Redis Solr ElasticSearch Foreign Data Wrappers – a driver for external data sources that can then be managed through PostgreSQL Lucene Hadoop HBase Cassandra Project Voldemort HyperTable Riak Amazon Cap Theorem Papers VoltDB

Devchat.tv Master Feed
TMTC 66 Josh Berkus (PostgreSQL Core Team)

Devchat.tv Master Feed

Play Episode Listen Later Jul 24, 2011 55:55


Here’s a list of several of the things we discussed: How PostgreSQL got started Ingres The Apache Foundation The PostgreSQL core team and its role. Data Warehousing It’s community property like Linux The SQL Query Language The C Programming Language gcc Standardization Google Summer of Code XML Indexing XPath Support ISN/ISBN Data Type Array Data Types HStores (Dictionary or Hash) Full Text Search Tri-grams Sphinx Lucene Why people switch from MySQL Performance Reliability Special Features Supports really complex queries Worry about the future of MySQL Skype – 200 Postgres servers Skytools clustering platform Heroku San Francisco PostgreSQL User Group Differences between MySQL and PostgreSQL MySQL was originally written to please web developers Postgres was written by DBAs Postgres will throw out a feature they can’t stabilize MySQL will accept a feature and then try to stabilize it Postgres really allows you to run code inside the database Postgres is more reliable and secure Lowers admin cost due to better uptime Rails was originally built around MySQL You can get some boosts by bypassing the ORM and going directly to the database Full JSON support is upcoming Django The PostgreSQL Ruby driver ByteA binary data type Simplified data types (Text data type) Why people switch from PostgreSQL to MySQL MySQL has been commercially successful longer than Postgres Vendor tools Cheap hosting for MySQL A lot of things are designed to work out of the box with PostgreSQL PGSQL Novice list Postgres Open Postgres has a new version coming out soon (changelog) Postgres 9.2 Multi-core support Postgres included documentation Beginning Databases with Postgres – Dated but gives the basics To hire Josh’s guys, go to http://pgexperts.com.

Faceoff Show
Episode 117: Full Text Search

Faceoff Show

Play Episode Listen Later Apr 19, 2011 34:26


Add enterprise-level search to your site. News and Follow/Ups – 01:00 Square now being sold in Apple’s store Check-Ins dying out? Dropbox: 25 million users Geek Tools – 14:13 Yikerz! – Super fun magnet game Webapps – 16:12 Surfboard – Flipboard as a web app InstaLyrics – Find lyrics quickly Full Text Search – […]

WebObjects Podcasts
Full text searching with Lucene

WebObjects Podcasts

Play Episode Listen Later Aug 28, 2010 41:27


Gordon And Mike's ICT Podcast
Flat World Strategies: Google and Search Wikia, Search Technology Explained [23:10]

Gordon And Mike's ICT Podcast

Play Episode Listen Later Jan 7, 2007 23:10


Intro: Right before the 2006 holidays Jimmy Wales, creator of the online encyclopedia Wikipedia, announced the Search Wikia project. This project will rely on search results based on the future site's community of users. In this podcast we take a look at popular search engine technologies and discuss the Search Wikia project concept.

Question: I know this project was really just announced. Before we get into the technology involved, can you tell us what phase the project is in?
According to the BBC, Jimmy Wales is currently recruiting people to work for the company and he's buying hardware to get the site up and running.

Question: What makes this concept fundamentally different than what Google or Yahoo! are doing?
When Wales announced the project he came right out and said it was needed because the existing search systems for the net were "broken". They were broken, he said, because they lacked freedom, community, accountability and transparency.

Question: This sounds a lot like digg - am I on the right track?
Yes you are - what you end up with is a digg-like application, or what Wales is calling a "people-powered" search site.

Question: Can you provide a bit more detail on how Google works?
Googlebot is Google's web crawling robot. Googlebot finds pages in two ways: through an add URL form, www.google.com/addurl.html, and through finding links by crawling the web. Source: www.google.com

Question: That's Googlebot, how does the indexer work?
Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google's index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms. (A toy sketch of this index-and-look-up idea appears after the transcript.) Source: www.google.com

Question: So now that everything is indexed, can you describe the search query?
The query processor has several parts, including the user interface (search box), the "engine" that evaluates queries and matches them to relevant documents, and the results formatter. PageRank is Google's system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank. Source: www.google.com

Question: Can you run us through, step by step, a Google search query?
Sure - this is also off of Google's site. Here are the steps in a typical query process: 1. The user accesses a Google server at google.com and makes a query. 2. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book - it tells which pages contain the words that match any particular query term. 3. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result. 4. The search results are returned to the user in a fraction of a second. Source: www.google.com

Question: OK, so now we know how Google and Yahoo! work. How will these new Search Wikia-type search engines work?
I can give some details based on what I've taken a look at. As we've said, the Search Wikia project will not rely on computer algorithms to determine how relevant webpages are to keywords. Instead the results generated by the search engine will be decided and edited by the users.
There are a couple of projects called Nutch and Lucene, along with some others, that can now provide the background infrastructure needed to build a new kind of search engine, one which relies on human intelligence to do what algorithms cannot. Let's take a quick look at these projects.

Lucene: Lucene is a free and open source information retrieval API, originally implemented in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License.

We mentioned Nutch earlier. Nutch is a project to develop an open source search engine. Nutch is supported by the Apache Software Foundation and has been a subproject of Lucene since 2005. With Search Wikia, Jimmy Wales hopes to build on Lucene and Nutch by adding the social component. What we'll end up with is more intelligent, socially based search tools. Now, don't think Google, Yahoo!, Microsoft and all the rest are not working on these kinds of technologies. It will be interesting to watch how these new technologies and methods are implemented.

Sources: http://search.wikia.com http://search.wikia.com/wiki/Nutch http://lucene.apache.org/java/docs/ http://wikipedia.org/

References:
Wikipedia creator turns to search: http://news.bbc.co.uk/2/hi/technology/6216619.stm
How Google Works: http://www.googleguide.com/google_works.html
Search Wikia website: http://search.wikia.com
Search Wikia Nutch website: http://search.wikia.com/wiki/Nutch
Lucene website: http://lucene.apache.org/java/docs/
Wikipedia website: http://wikipedia.org/
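
The index-and-look-up idea described in the transcript is easy to see in miniature. The toy sketch below builds an inverted index (term → the documents containing it) and answers a query by intersecting posting lists; it is an illustration only, leaving out everything a real engine such as Lucene or Google adds on top — ranking (PageRank, BM25), term positions, snippets, and compression.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class ToyInvertedIndex {
    // term -> sorted set of document ids containing that term (a "posting list")
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    void add(String text) {
        int docId = docs.size();
        docs.add(text);
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** AND query: return the ids of documents that contain every query term. */
    TreeSet<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : query.toLowerCase().split("\\W+")) {
            TreeSet<Integer> posting = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(posting);   // first term seeds the result
            } else {
                result.retainAll(posting);         // intersect posting lists
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Nutch is an open source search engine");
        index.add("Lucene is an information retrieval library");
        index.add("Search Wikia wants people powered search");
        for (int docId : index.search("search engine")) {
            System.out.println(index.docs.get(docId)); // prints the Nutch document
        }
    }
}
```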