My conversation with Andrea starts about 22 minutes into today's show, after headlines and clips. Subscribe and watch interviews LIVE on YOUTUBE.com/StandUpWithPete and on Substack: StandUpWithPete. Stand Up is a daily podcast. I book, host, edit, post and promote new episodes with brilliant guests every day. This show is ad-free and fully supported by listeners like you! Please subscribe now for as little as $5 and gain access to a community of over 750 awesome, curious, kind, funny, brilliant, generous souls. Andrea Jones-Rooy, Ph.D., is a data and social scientist, science educator, standup comedian, and circus performer. They are a professor and the Director of Undergraduate Studies at the NYU Center for Data Science, where they teach the flagship undergraduate course, Data Science for Everyone, as well as advanced courses on Natural Language Processing. Andrea is also a research consultant and keynote speaker for global Fortune 500 and tech companies of all sizes on how to thoughtfully integrate data science into achieving their goals, especially in the people analytics space. When they aren't doing those things, they perform standup, trapeze, and fire all over the world. Andrea hosts the podcast Majoring in Everything and is working on a book about why focusing on just one thing is overrated.
Get in touch after the interview… • @jonesrooy on Twitter, Instagram, and TikTok www.jonesrooy.com jonesrooy@gmail.com Listen rate and review on Apple Podcasts Listen rate and review on Spotify Pete On Instagram Pete on Blue Sky Pete on Threads Pete on Tik Tok Pete on Twitter Pete Personal FB page Stand Up with Pete FB page Gift a Subscription https://www.patreon.com/PeteDominick/gift Send Pete $ Directly on Venmo All things Jon Carroll Buy Ava's Art Subscribe to Piano Tuner Paul Paul Wesley on Substack Listen to Barry and Abigail Hummel Podcast Listen to Matty C Podcast and Substack Follow and Support Pete Coe Hire DJ Monzyk to build your website or help you with Marketing
Topics covered in this episode: CVE-2026-48710: A Maintainer's Perspective daily-stars-explorer Markdown to pdf with pandoc and typst postman2pytest Extras Joke Watch on YouTube About the show Brian #1: CVE-2026-48710: A Maintainer's Perspective Marcelo Trylesinski suggested by Lee Luocks Short version: users of Starlette: upgrade to Starlette 1.0.1 security professionals: we can't treat open source projects like corporations This top link is a Starlette security advisory with the title Missing Host header validation poisons request.url.path, bypassing path-based security checks The CVE apparently caused some negative press targeting starlette. However, “the vulnerability came from the application pattern and the deployment, never from something Starlette intended.” A quote from an OSTIF article: “This bug is a classic “responsibility gap” where if this maintainer didn't patch, thousands of exposed projects would have to individually secure their projects. In doing this work, they've voluntarily taken on the responsibility to protect the ecosystem from long-term systemic harm. As with all open source projects, they owed us nothing and could have left this to be everyone else's problem and took the extraordinary steps of helping the ecosystem.” Both X40 D-Sec and Ars Technica expected immediate fixes and responses from Starlette. That's not good. We can do better. Michael #2: daily-stars-explorer Explore the full history of any GitHub repository.
Talk Python To Me - Python conversations for passionate developers
You wake up, brew the coffee, open GitHub, and there it is. Another pull request on your open source project. Thirteen thousand lines added. No issue filed first. No discussion. Just "here, please review this for me." Over the past year, GitHub activity has spiked roughly twelve times in a few short months, and a huge chunk of that signal is landing on the same small group of maintainers who were already stretched thin. The curl bug bounty got buried under AI-generated noise. Jazzband, the home of Django classics like pip-tools and the Django debug toolbar, hit what its maintainer called an "apocalypse" and started sunsetting. Even CPython just shipped fresh guidelines on AI-assisted contributions this week. So what does all of this actually look like from the receiving end of the pull request? On this episode, Paolo Melchiorre joins us to tell that story from inside the maintainer's chair. Paolo is a director of the Django Software Foundation, an organizer of PyCon Italy, a Django Girls coach, and he has spent the past year carefully collecting examples of how AI is reshaping open source contributions. The good, the bad, and the extra fingers. We dig into his PyCon US talk on AI-assisted contributions and maintainer load, why AI is best understood as an amplifier rather than a new kind of contributor, the wildly different policies across 86 open source foundations, whether projects banning AI today are reacting to last year's models. 
Episode sponsors AgentField AI Talk Python Courses Links from the show Guest Paolo Melchiorre: github.com DSF: www.djangoproject.com djangonaut-space: djangonaut.space PyCon Italia: 2026.pycon.it uDjango: github.com My PyCon US 2026 post: www.paulox.net AI-Assisted Contributions and Maintainer Load: www.paulox.net Senior Engineer Tries Vibe Coding: www.youtube.com Code Rabbit AI PR Reviews: www.coderabbit.ai GitHub Usage Graphs: github.blog Update on CPython's AI Policies: fosstodon.org High-Quality Chaos from Curl: daniel.haxx.se The Generative AI Policy Landscape in Open Source: redmonk.com Watch this episode on YouTube: youtube.com Episode #550 deep-dive: talkpython.fm/550 Episode transcripts: talkpython.fm Theme Song: Developer Rap
Modern propaganda isn't random noise. It's a repeatable, engineered algorithm that starts with ideology, weaponizes identity, and manufactures conflict. Once you see the pattern, you can't unsee it. What happens with AI? Buy me a coffee https://ko-fi.com/datascience Discord Channel: https://discord.gg/4UNKGf3 ✨ Connect with us! Personal newsletter: https://defragzone.substack.com
Will Parrish is the Co-Founder and Chief Customer Officer of Lula, a Kansas City-based proptech platform built to streamline property maintenance for property managers and their residents. Will co-founded Lula alongside CEO Bo Lais with a mission to make property maintenance smarter, pivoting the business during the pandemic to focus on property managers in the single-family rental space, a move that fueled rapid growth. Lula recently closed a $28 million Series A round and is expanding from 42 markets to 60, with heavy investment in AI and automation. Before co-founding Lula, Will spent nearly two decades in enterprise sales and business development, including a long tenure at Thomson Reuters.
(00:53) - How Lula Started
(02:34) - Trading Corporate for Startup Life
(03:29) - Is Maintenance Archaic
(05:49) - Where Work Orders Fail
(07:30) - Scaling 100K Work Orders
(12:28) - Building Vendor Trust & Quality
(13:19) - Expanding Markets
(16:16) - Flat Rate Pricing Playbook
(19:15) - Ideal Rental Customers
(21:54) - Integrations
(25:47) - AI In Maintenance
(30:21) - Future of Lula
(32:14) - ROI for Property Owners & Operators
(35:49) - Hardware play ahead?
(39:12) - Collaboration Superpower: MacGyver
Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away! Five years ago I made the scariest decision of my life. Here's the full story.
Gareth McGlynn speaks with Nathan Schafer, Estimating Manager at Cornerstone General Contractors, at Advancing Preconstruction 2026 in Phoenix. A self-performing GC based in Alaska, Cornerstone gives Nathan a hands-on perspective on preconstruction that is grounded in real field conditions.
Key Topics Covered:
Nathan & Cornerstone: Working across healthcare, hospitality, and federal military projects, with a focus on vertical commercial construction.
New Estimators & Value Engineering: The shortage of talent entering the field, their understanding of value engineering, and professional development through ASPE and AACE.
Lean Construction: Its growing impact on the estimating process and the role of AI takeoff tools as a lean principle in action.
Data Science in Preconstruction: How Cornerstone is incorporating data science into its workflow, including labor productivity tracking and predicting quantity growth risk as a function of design maturity.
Value Management in Practice: Cornerstone's formal pilot on a $90M project: 13 propositions totaling $9M, with $7M accepted.
You can connect with Nathan via his LinkedIn: https://www.linkedin.com/in/nathan-schafer-cpe-b75991178/
Or reach him through his blog: https://www.preconomics.com/blog
In this interview we speak with Prof. Dr. Henrik Leopold, Professor of Data Science and Business Intelligence at Kühne Logistics University, about data-driven public administration, and we comment on the challenges facing women in local politics. We also look into the role that the Procurement Office of the Federal Ministry of the Interior plays in Germany's security.
Talk Python To Me - Python conversations for passionate developers
Your documentation has two audiences now: humans reading the rendered HTML, and AI agents trying to make sense of your library. Rich Iannone and Michael Chow from Posit are back on Talk Python with a brand new Python documentation tool called Great Docs that takes both seriously. Rich is the creator of Great Tables and, before that, the R package GT. The man has a serious eye for design, and he's pointed that energy at the Python docs ecosystem. We'll talk about how Great Docs spins up a polished site in three commands, why every page ships as Markdown for your favorite LLM, how it leans on Quarto for executable code blocks and tabbed install sections, and where it lands against Sphinx, MkDocs, and Zensical. Plus, you'll meet Tablin. Here we go. Episode sponsors Sentry Error Monitoring, Code talkpython26 Temporal Talk Python Courses Links from the show Guests Michael Chow: github.com Rich Iannone: github.com Python Web Security with OWASP Top 10 and Agentic AI Course: talkpython.fm Great Docs: posit-dev.github.io/great-docs Great Tables: posit-dev.github.io GT Episode: talkpython.fm Sphinx: www.sphinx-doc.org mkdocs: www.mkdocs.org Zensical: zensical.org Hugo: gohugo.io Ghost: ghost.org R's pkgdown: pkgdown.r-lib.org Quarto: quarto.org quickstart: posit-dev.github.io llms.txt file: llmstxt.org llms.txt: talkpython.fm mcp: talkpython.fm cli: talkpython.fm Watch this episode on YouTube: youtube.com Episode #549 deep-dive: talkpython.fm/549 Episode transcripts: talkpython.fm Theme Song: Developer Rap
Topics covered in this episode: Dumb Ways for an Open Source Project to Die How to create a pylock.toml lockfile https://github.com/facebook/Lifeguard Choosing a Python Logging Library in 2026 Extras Joke Watch on YouTube About the show Sponsored by us! Support our work through: Our courses at Talk Python Training The Complete pytest Course Patreon Supporters Connect with the hosts Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky) Brian: @brianokken@fosstodon.org / @brianokken.bsky.social Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky) Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list, we'll never share it. Michael #1: Dumb Ways for an Open Source Project to Die Core categories The maintainer left The maintainer is still there Sabotage and capture The release pipeline broke Force majeure The world moved on The project split - Examples Bulma PRs still from 2023, issues and PRs with no maintainer response for years, last release 1.5 years ago diskcache Similar, got hired by OpenAI, crickets after that Brian #2: How to create a pylock.toml lockfile Tim Hopper Tim walks through using uv, pip and pdm to create pylock.toml files. Recommendation: use uv export --format pylock.toml -o pylock.toml He also has How to install from a pylock.toml lockfile with pip but the short version is: use -r because tools treat it like a requirements file Michael #3: https://github.com/facebook/Lifeguard Lifeguard is a static analyzer to detect Lazy Imports incompatibilities and ease the adoption overhead for Lazy Imports in Python. I'm more excited about lazy imports after my Cutting Python Web App Memory Over 31% experience Some Python patterns depend on imports executing immediately. 
For example: Module-level side effects — a module that registers a handler or modifies global state at import time will behave differently if that import is deferred. The registry pattern — a module that registers itself (e.g., adding to a global dict) when imported will silently fail to register under Lazy Imports. sys.modules manipulation — code that reads or writes sys.modules assumes prior imports have already executed. Metaclasses and __init_subclass__ — class creation side effects may depend on imports being resolved. Project Stage: Beta Lifeguard is in active development. We are aiming to be ready for general use by the Python 3.15 final release. Brian #4: Choosing a Python Logging Library in 2026 Ayooluwa Isaiah " which libraries matter, how they compare, where they overlap with the standard module, and when each one makes sense.” The slant with this article is the need to log json output, which seems reasonable as things like API entry and exit point logging will include json. Covered libraries standard library logging with a hat tip to python-json-logger Same site has a guide to setting up python-json-logger structlog Loguru Logbook picologging Some benchmarks with structlog, stdlib+json, and Loguru, with structlog coming out faster I liked the Loguru example I'm going to have to try @logger.catch and logger.exception() for easily logging exceptions and serialize=True to enable JSON output. 
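To make the JSON-logging angle of the logging item above concrete, here is a minimal sketch that uses only the standard library. The `JsonFormatter` class, the `build_logger` helper, and the `api` logger name are assumptions invented for this illustration and do not come from the article. Loguru's `serialize=True` and python-json-logger offer fuller versions of the same idea.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each log record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Attach the formatted traceback when logger.exception() is used.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(stream: io.StringIO) -> logging.Logger:
    """Return a logger that writes one JSON line per record to the stream."""
    logger = logging.getLogger("api")  # assumed logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("request started")
try:
    1 / 0
except ZeroDivisionError:
    log.exception("request failed")

# Each line in the buffer parses back into a dictionary.
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Writing to an in-memory buffer keeps the sketch easy to inspect. In a real service the handler would more likely point at stdout or a file.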
Extras Brian: When Women Stopped Coding - Planet Money segment , spotted on BlueSky from Savannah Ostrowski Lean TDD is now leaner Still working on audio version, but some great changes in 0.7.1 version Ch 6, TDD Interpretations, move ATDD and some of BDD to chapter Ch 7, Change name to TDD with Teams: BDD and ATDD Ch 9, Lean TDD, streamline steps and chapter Ch 10, Change name to Lean TDD with Teams: Lean ATDD Ch 11, Lean TDD with AI, Add short discussion about guardrails and security Michael: New course: Python Web Security: OWASP Top 10 with Agentic AI All courses now with Spanish subtitles, see announcement Joke: Stop texting me
From years in the SEO trenches, today's guest knows what it takes to run successful strategies. Adrian Dahlin is the Founder & CEO of Search to Sale, an SEO analytics SaaS company providing automatic content intelligence for B2B SaaS and marketing agencies. He began his entrepreneurial journey in 2020 after leaving corporate marketing to launch a startup consultancy, later evolving it into Search to Sale in 2023. Previously, Adrian worked in data science and marketing analytics after earning a Master's in Applied Data Science from NYU, and earlier in his career founded and led sustainability-focused ventures.
CONTACT DETAILS:
Email: gerardo@searchtosale.io
Business: Search to Sale
Website: https://www.searchtosale.io/
Social Media:
LinkedIn: https://www.linkedin.com/in/adriandahlin/
LinkedIn Company: https://www.linkedin.com/company/search-to-sale-seo-revenue-generation-software/
Remember to SUBSCRIBE so you don't miss "Information That You Can Use." Share Just Minding My Business with your family, friends, and colleagues. Engage with us by leaving a review or comment. https://g.page/r/CVKSq-IsFaY9EBM/review Your support keeps this podcast going and growing. Visit Just Minding My Business Media™ LLC at https://jmmbmediallc.com/ to learn how we can help you get more visibility on your products and services.
A large-scale analysis of Grokipedia, the world's first AI-written encyclopedia, has found that while many Grokipedia articles closely resemble their Wikipedia counterparts, a substantial subset diverged markedly in style, sourcing, and political leaning. Conducted by researchers at Trinity College Dublin and Technological University Dublin, the study compared nearly 18,000 of the most-edited English-language Wikipedia pages with articles on the same topic on the new Grokipedia platform. The study is the largest academic analysis of Grokipedia since it was launched by Elon Musk last October with a promise that the AI-written encyclopedia systematically "fixes" left-leaning biases alleged to exist in the widely used online encyclopedia Wikipedia. Wikipedia's content is written and maintained by volunteer editors, while Grokipedia is an AI-generated encyclopedia using the xAI's Grok large language model. What did the study find? Using computational text analysis and machine learning methods, the team analysed articles on the same topic across Wikipedia and Grokipedia. Selection of topics was based on Wikipedia's most-edited English-language pages. The team compared differences in writing style, structure, and the political orientation of external sources referenced in the paired articles. The researchers found a profound split – while many Grokipedia articles closely mirror Wikipedia, a substantial proportion (66%) of the 18,000 analysed are more extensively rewritten – they are longer, more complex, and rely on fewer references. As a whole, articles on Grokipedia show similar political leaning to those on Wikipedia, drawing on left-leaning news sources. However, when it comes to the politically and culturally sensitive topics of religion, history, literature and art, Grokipedia shows a consistent shift toward referencing more right-leaning news sources compared to Wikipedia. 
The study analysed Wikipedia's most-edited English-language pages, a selection that likely overrepresents high-profile and contentious topics. That said, the study, according to the authors, provides useful evidence of emerging differences between AI-generated and human-edited encyclopedic knowledge systems. Details of the research, conducted at the joint Centre for Sociology of Humans and Machines (SOHAM) in Trinity and TU Dublin, have been published in the peer-reviewed journal Proceedings of the National Academy of Sciences (PNAS). What is the impact of this research? Lead author of the study, Saeedeh Mohammadi, PhD candidate at SOHAM and Research Ireland's Centre for Research Training in Foundations of Data Science, said: "Online encyclopedias are central to public knowledge. They are also being used to train future generations of large language models. Our findings raise important questions about how public knowledge is produced, reproduced, verified, and governed. "Unlike Wikipedia, where biases are visible and contested through human editing, AI-generated systems operate largely opaquely. This means shifts in perspective or sourcing may occur without clear accountability or editorial oversight. Simply put, AI generation does not remove bias – it changes how and where bias enters the system, often making it less visible." Professor Taha Yasseri, Director of SOHAM and Principal Investigator of the study, said: "Rather than systematically 'correcting' Wikipedia's alleged biases, as claimed when first launched, our findings suggest that AI-generated encyclopedias such as Grokipedia selectively reshape existing knowledge. This creates a patchwork system in which some content is copied, while other content is reinterpreted in ways that are less transparent and harder to scrutinise." "There is a dire need for transparency, oversight, and regulation in this space. Our information landscape is changing rapidly. 
We have already seen how the lack of editorial responsibility on social media platforms has enabled the generation and circulation of misinformation and ...
Manikanta Sirumalla, a graduate student in UMBC's Data Science program, shares the story behind Rep Track Pro, an AI-powered fitness app designed to bring workouts, nutrition, recovery, and progress tracking into one place. In this conversation, he discusses how his own fitness journey inspired the idea, the challenges of building and launching an app as a solo founder, and how coursework in UMBC's Data Science program helped shape the technology behind it. From machine learning models to real-world user feedback, this is a story about innovation, persistence, and building something meaningful from the ground up.
Learn more about UMBC's Data Science graduate program: https://professionalprograms.umbc.edu/data-science
Learn more about RepTrack Pro App: https://reptrackpro.org/
In this episode, host Josh interviews entrepreneur Rolando Rosas about his journey from office technology to Amazon selling and founding Circuit Com. Rolando shares his advanced PPC strategy, using a year's worth of sales data and heat maps to optimize Amazon ad scheduling for better ROAS. He offers practical tips for sellers: enhance product images, respond to customer questions with videos, and use data tools like Seller Labs Data Hub to identify peak buying times. Rolando encourages starting small with data-driven ad adjustments to boost efficiency and sales.Chapters:Introduction to Rolando Rosas and His Journey (00:00:00)Josh introduces Rolando, his entrepreneurial background, and the founding of Global Tech Worldwide and Circuit Com.Podcast Sound Effects and Stream Deck Tips (00:01:15)Rolando shares his experience setting up podcast sound effects and encourages using a stream deck.Introduction to Innovative Amazon PPC Strategy (00:01:38)Josh prompts Rolando to share his unique PPC strategy, setting the stage for the main discussion.Data-Driven Ad Scheduling and Heat Maps (00:02:13)Rolando explains using 12 months of order data and Seller Labs Data Hub to create heat maps for ad scheduling.Key Insights from Data: Golden Hours and Days (00:02:59)Discovery of optimal times and days for ads, including patterns like low Friday evening and weekend sales.Challenging Weekend Ad Spend Myths (00:04:12)Rolando debunks the idea that weekends are best for ads, showing most sales occur Monday–Friday.Impact on ROAS and Sales Performance (00:06:03)Discussion of improved ROAS and sales by focusing ad spend on high-performing days and times.Layering Day Parting and Low Bid Strategies (00:07:02)Exploring advanced ad scheduling, including low bid strategies during off-peak hours.Manual vs. 
Automated Campaign Management (00:08:31)Rolando discusses the manual nature of their current process and the use of portfolio grouping for easier management.Leveraging Seller Labs Data Hub for Insights (00:09:36)How to use Seller Labs Data Hub for actionable business insights, even for non-data experts.The Importance of Data Science and AI for Sellers (00:10:53)Emphasizing the future role of data analytics and AI in Amazon selling success.Three Actionable Takeaways for Amazon Sellers (00:11:56)Josh summarizes three key takeaways: main image optimization, customer Q&A engagement, and data-driven ad scheduling.Encouragement to Start Small and Test Strategies (00:15:20)Advice to implement changes gradually, testing on a few campaigns or SKUs before scaling.Closing Remarks and Appreciation (00:16:18)Josh and Rolando wrap up the episode, express mutual appreciation, and end the conversation.Links and Mentions:Tools and Websites"Global Teck Worldwide": "00:00:00""Seller Labs Data Hub": "00:02:59""Google Sheets": "00:10:08"Strategies and Concepts"Day Parting": "00:02:13""Heat Map": "00:02:59"Actionable Takeaways"Adjust Main Images": "00:11:56""Respond to Customer Questions": "00:12:07"Transcript:Josh 00:00:00 Today I'm super excited to introduce you all to Rolando Rosas. Rolando never could have predicted that a college computer, a printer, and an old school wall phone in his kitchen would lead him down the path of entrepreneurship. But that's exactly how it happened. In 2002, he founded Global Tech Worldwide with the goal of making it easy for businesses to use the right office technologies for better and frictionless customer interactions that help businesses elevate their customer interactions and turn them into rich, meaningful discussions. Fast forward to today, and after spending ten years selling on Amazon, he is on his third startup circuit. 
Com because he was frustrated with the lack of transparency and outdated methods of buying broadband, wireless and fiber internet for small and medium sized businesses. So with that introduction, welcome to the show, Rolando.
Rolando 00:00:53 Woo! Woo woo woo woo. Woo woo. Let me try. Let me try.
Josh 00:00:56 Hey, there you go. Hey.
Rolando 00:00:57 There we go.
Josh 00:00:58 You got the audio to work?
Rolando 00:00:59 I got it, I got it, I got it to work.
Josh 00:01:02 Rolando has his own podcast and we recorded an episode last week. I was on the reverse side. I was the guest there. And I told you, Rolando, I love the sound effects that you have going on in your podcast.
Rolando 00:01:15 You know what? I'm here. You know what? Go get a stream deck, go get it and call me, and I'll help you set it up. Because it took me a while. I left it in the box for quite some time before I actually started using it, because I was a little intimidated. I'm not an AV guy or anything like that, but, you know, I was like, all right, let me add one, two, three. And I was like, ooh. And now I've got a couple of those buttons set up for it.
Josh 00:01:38 I love it, I love it. All right, Rolando, there's another really wicked smart strategy that I want you to share with our audience that you shared with me prior to hitting the record button.
Josh 00:01:48 And this is your amazing PPC strategy that I have never heard anybody else talk about other than yourself. Everybody's always heard of day parting, right? And that's kind of the new hot PPC term, but this isn't day parting. This is something, I think, even more intelligent than what day parting is. So I've laid out the red carpet for you there, Rolando. Give us the gold nugget.
Rolando 00:02:13 Yeah, right. So day parting is just simply ad scheduling. You know, run an ad on a schedule. Nothing new there. But what if. Chad. Chad, I was just talking to Chad. What if Josh. 
We could map or have ads show up when we have our ideal customers on Amazon? How can we do that? Can we pull it off? And can we save money while we're doing that? That's really what we wanted to find out. Turns out there is a way to do it. Not easy, not clean. But there was. So we went and pulled data from our orders for 12 months, and we used, Seller Labs product that they have or service that's called Data Hub.Rolando 00:02:59 and it pulled in all that data, right? It's our own data. So we didn't have to do all these crazy reports from Amazon. Pulled it all in. Once they pulled that in I said, wait a minute, guys. I'm not a mathematician here. This is just a spreadsheet with a bunch of numbers. Can we do something better? So then we put together something that anybody could easily use in the organization. We put together a heat map so that you can visually see the data. And, you know, dark green means good, red is bad. And guess what? We found golden hours every day of the week. Also golden months also patterns within those months. For example summertime for our products which are mostly office related products. After 4 p.m. on a Friday, we've virtually had no orders on the summer months. So if I'm a betting man, Why would I run PPC after 4 p.m. if we're not getting any orders? Another one was when? on the weekends, you hear people say this all the time.Rolando 00:04:12 And now that I have the data for our stuff, I know it's totally wrong. You got to run ads on Saturday and Sunday because people browse Saturday and Sunday and buy on Monday. The evidence does not hold that up in our case, because in our case, most of our activity, nearly 85 to 90% of the purchases c...
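The heat-map idea Rolando describes, counting orders by day of week and hour to find the "golden hours", can be sketched roughly as follows. This is an illustration only, not the Seller Labs Data Hub workflow: the sample timestamps and the function names `build_heat_map` and `weekday_share` are all made up for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical order timestamps; real input would be about 12 months of orders.
ORDER_TIMESTAMPS = [
    "2024-06-03 10:15",  # Monday
    "2024-06-03 10:40",  # Monday
    "2024-06-04 14:05",  # Tuesday
    "2024-06-07 16:30",  # Friday
    "2024-06-08 11:00",  # Saturday
]

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def build_heat_map(timestamps):
    """Count orders in each (weekday, hour) cell of the heat map."""
    counts = Counter()
    for raw in timestamps:
        placed = datetime.strptime(raw, "%Y-%m-%d %H:%M")
        counts[(DAYS[placed.weekday()], placed.hour)] += 1
    return counts


def weekday_share(counts):
    """Return the fraction of orders placed Monday through Friday."""
    total = sum(counts.values())
    weekday = sum(n for (day, _), n in counts.items() if day not in ("Sat", "Sun"))
    return weekday / total if total else 0.0


heat_map = build_heat_map(ORDER_TIMESTAMPS)
# The cell with the most orders is a candidate "golden hour" for ad spend.
busiest_cell, busiest_count = heat_map.most_common(1)[0]
```

Cells with few or no orders, such as the Friday-after-4-p.m. example from the conversation, would be the candidates for pausing ads or lowering bids.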
Is artificial intelligence creating a helpful resource for your customers, or is it building a wall between them and your sales team? We caught up with Dr. John Coles, Vice President of Data Science and Analytics at ACV, for an exclusive sneak peek at the machine learning and vehicle valuation strategies he is bringing to the VADA '26 Convention at the Marriott Virginia Beach Oceanfront. Register for VADA '26: https://vada.com/convention/ In this bonus "Convention Sneak Peek" episode, Dr. Coles explains that modern consumers demand absolute transparency. He explores how to effectively utilize machine learning in the back office, the critical necessity of multi-source information fusion, and how to stop overwhelming your staff with too many software tools.
In this episode:
The "Zero Surprises" Consumer: Modern buyers are fiercely protective of their time. As Dr. Coles notes regarding his own car buying experience, "The thing that I look for as a consumer when I walk in is zero surprises on a cost side."
The New Normal: With lease returns growing and margin compression remaining a stark reality, dealers must utilize data to quickly position each vehicle for the right consumer. As Dr. Coles warns, "We're never going back to an old normal."
Speed to a Human: If you introduce AI as a friction point between your dealership and the customer, you put the relationship at risk. AI should be used in the back office because, as Dr. Coles puts it, "Right now, for me, it's all about speed to a human."
Stop Software Overload: Dr. Coles breaks down the change management strategies needed to actually implement data-driven tools without burning out staff. "If you lob nine software solutions in and see what works... I'll give you a hint. None of them will work."
In this episode, Charlie Samolczyk, Global Technology Sales Leader, is joined by guest speaker André Balleyguier, Applied AI Leader at Anthropic, and WTW's Pardeep Bassi, Global Proposition Leader for Data Science, to discuss how AI is transforming the insurance industry.
Some of the most asked questions on the channel. Here answered. Buy me a coffee https://ko-fi.com/datascience Discord Channel: https://discord.gg/4UNKGf3 ✨ Connect with us! Personal newsletter: https://defragzone.substack.com
Topics covered in this episode: Using Django Tasks in production; Co-authored with Claude?; PyPI packages are increasing rapidly; httpx2; Extras; Joke.
Watch on YouTube.
About the show: Sponsored by us! Support our work through our courses at Talk Python Training, The Complete pytest Course, and Patreon Supporters.
Connect with the hosts: Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky). Brian: @brianokken@fosstodon.org / @brianokken.bsky.social. Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky).
Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 11am PT. Older video versions available there too.
Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.
Brian #1: Using Django Tasks in production
Tim Schilling shares how the Djangonaut Space website has been using Django's new tasks framework, along with some of the info missing from the official Django docs.
Tasks require a third-party package, django-tasks-db, to actually run the tasks.
The article walks through all the changes necessary to get an email process running to notify admins of new testimonials. Cool, simple example.
With the db backend, you can monitor the progress of tasks in the admin to see which tasks are scheduled, completed, or have errors.
Some wishes for the community to implement: a new tutorial in the Django docs; a Django Debug Toolbar panel for tasks; a test/mock backend.
Great title for the wish list: "Things I'd like to see, but I'm too lazy to implement myself."
Michael #2: Co-authored with Claude?
Via Nik T.
We don't put "executed on macOS", "edited with PyCharm", etc. in our commits. Why Claude?
Seems like a growth hack to me, one that I don't really care to participate in.
Some projects have formalized their thoughts on this: The Generative AI Policy Landscape in Open Source.
To turn it off, adjust ~/.claude/settings.json (see the docs):
{ "attribution": { "commit": "", "pr": "" } }
Brian #3: PyPI packages are increasing rapidly
Artem Golubin
There's been an increase in published packages per week on PyPI, with a pretty big increase in the last handful of months.
30% increase since 2025, clearly due to AI.
Artem is building hexora, a malicious Python code detector. Cool package too. It can: audit project dependencies to catch potential supply-chain attacks; detect malicious scripts found on platforms like Pastebin, GitHub, or open directories; analyze IoC files from past security incidents; audit new packages uploaded to PyPI.
Artem is using hexora to analyze recently published PyPI packages, and many are obviously vibecoded and trigger false positives for abuses of eval, exec, and subprocess.
Side note: I don't think that's necessarily a false positive. Not malicious, but maybe a stupid-code-detector?
Lots are LLM related. Lots have bots contributing code.
The publishing rate is crazy: dozens to hundreds of published versions in a day is a bug, not a feature.
Brian's proposal: PyPI should limit releases per day for any package to something a sane human would do, even if they make a mistake on a release. Maybe 2-3, definitely under 10, in a day. And if the repo has obvious agent contributors listed, maybe lower the limit to 1-2 a day?
Honestly, "move fast and break things" doesn't apply to breaking the commons.
Michael #4: httpx2
More on the httpx, httpxyz, etc. changes: Pydantic people started their own fork, httpx2.
Michiel says "while we think httpxyz was definitely needed, we welcome httpx2 and think it should be the 'blessed' fork."
Kludex, who is among other things maintainer of Starlette, was considering a fork.
As it stands, httpx2 is lacking the performance improvements they added to httpxyz. But it will not be long before they will add those, too.
Also, they already made some smart decisions: they are switching from certifi to truststore; they are switching to compression.zstd on Python 3.14+, enabling zstd compression by default; they merged httpcore and vendored it in their repository.
Discussion on Hacker News.
Extras
Brian: The Four Horsemen of the LLM Apocalypse - Anarcat. Django/JetBrains 2026 developer survey is open. Pyrefly 1.0: "meaning we are confident that Pyrefly is ready for production use."
Michael: Just about ready to release the Python Web Security: OWASP Top 10 with Agentic AI course. Be sure to be on the courses newsletter to get notified.
Joke: Proud Parents
In this episode, Curtis and Joanie sit down with Mahmoud Harding from Data Science 4 Everyone (www.ds4e.com) to explore the growing role of data science in K-12 education. Mahmoud breaks down the key distinction between data science and data literacy, two terms that are often used interchangeably but carry very different meanings for educators and students alike. The conversation dives into why data science matters for all educators right now, regardless of subject area or grade level, and why the time to act is today. And taking action doesn't mean you need math expertise or to steer away from the standards and curriculum your students need to know! Mahmoud also shares practical, accessible ways teachers can get started with data-centered lessons in their classrooms, regardless of grade level or content area. Whether you're a curious educator or ready to dive in, this episode will leave you inspired to bring data to life for your students.
Resources:
● https://www.datascience4everyone.org/about (DS4E Homepage)
● https://www.datascience4everyone.org/resources (DS4E Resources)
● https://ds4e-org.github.io/CPN_rubric/ (DS4E Content Partner Network)
● https://ds4e-org.github.io/technologytoolkit/ (DS4E Technology Tools for working with data)
● https://datasciencelearning.org/ (K12 Data Science Learning Progressions)
● https://datasciencelearning.org/blog/five-basic-concepts-for-teachers-new-to-data-science (DS4E Blog: Five basic concepts for teachers new to data science)
● https://hkurzweil.github.io/ds4e-teacher-pd/frontmatter.html (DS4E Data Science Starter Kit)
Talk Python To Me - Python conversations for passionate developers
What if your database worked more like Git? Every change captured as an immutable event you can replay, instead of a single mutating row that quietly forgets its own history. That's event sourcing, and Chris May is back on Talk Python, fresh off our Datastar panel, to walk us through what it actually looks like in Python. We'll cover the core patterns, the libraries to reach for, when not to use it, and why event sourcing turns out to be a surprisingly good fit for AI-assisted coding. Episode sponsors Sentry Error Monitoring, Code talkpython26 Temporal Talk Python Courses Links from the show Guest Chris May: everydaysuperpowers.dev Intro to event sourcing e-book: everydaysuperpowers.gumroad.com Domain-Driven Design: The Power of CQRS and Event Sourcing: How CQRS/ES Redefine Building Scalable System: ricofritzsche.me DDD: www.amazon.com Understanding Eventsourcing (Martin Dilger): www.amazon.com Event Sourcing Explained using Football Video: www.youtube.com Why I finally embraced event sourcing and why you should too article: everydaysuperpowers.dev valkey: valkey.io diskcache: talkpython.fm eventsourcing package: github.com eventsourcing docs: eventsourcing.readthedocs.io John Bywater: github.com Datastar: data-star.dev Microconf: microconf.com Event Modeling & Event Sourcing Podcast: podcast.eventmodeling.org Python Package Guides for AI Agents: github.com Iodine tablets AI joke: x.com KurrentDb: www.kurrent.io Watch this episode on YouTube: youtube.com Episode #548 deep-dive: talkpython.fm/548 Episode transcripts: talkpython.fm Theme Song: Developer Rap
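As a rough illustration of the idea described in this episode summary (every change stored as an immutable event that can be replayed), here is a minimal sketch in plain Python. The bank-account domain and the names `Deposited`, `Withdrawn`, `apply_event`, and `replay` are hypothetical and are not taken from the episode or from the eventsourcing package.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical event types for an illustrative account; frozen makes them immutable.
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


Event = Union[Deposited, Withdrawn]


def apply_event(balance: int, event: Event) -> int:
    """Fold a single event into the current state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def replay(events: List[Event]) -> int:
    """Rebuild the state from scratch by replaying the history in order."""
    balance = 0
    for event in events:
        balance = apply_event(balance, event)
    return balance


# The log is append-only: new facts are added, old ones are never edited.
log: List[Event] = [Deposited(100), Withdrawn(30), Deposited(5)]
current = replay(log)       # state after the full history
earlier = replay(log[:2])   # state as of an earlier point in time
```

Because state is derived rather than stored, replaying a prefix of the log answers "what did this look like back then?", which a single mutating row cannot do.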
Topics covered in this episode: httpxyz one month in; Learn concurrency - a deep dive into multithreading with Python; pip 26.1 - lockfiles and dependency cooldowns; Python 3.15 sentinel values from PEP 661; Extras; Joke.
Watch on YouTube.
Michael #1: httpxyz one month in
The first version of httpxyz contained just the fixes to get zstd working, the fixes to get the test suite running on Python 3.14, and some 'housekeeping' changes related to the renaming.
End of March: a compatibility shim that allows you to use httpxyz even with third-party packages that import httpx themselves, as long as you import httpxyz first. Importing httpxyz automatically registers it under the httpx name in sys.modules; see https://httpxyz.org/httpx-compatibility/
Fixed a WHOLE bunch of performance-related issues by forking httpcore.
Brian #2: Learn concurrency - a deep dive into multithreading with Python
Nikos Vaggalis
"Whenever you are trying to speed up code using multiple cores, always ask yourself: 'Do these threads need to talk to each other right now?' If the answer is yes, it will be slow. The best parallel code splits a big job into completely isolated chunks, processes them separately, and merges the results at the finish line."
Good overview of thread concurrency with Python and how that's been improved dramatically with free-threaded Python.
Defines lots of terms you come across, including "embarrassingly parallel multithreading".
There's a counter example that's nice: start with a shared resource, a counter, and multiple threads updating it. Attempt to fix with threading.Lock(), which fixes it but slows things down. Good explanation of why. Proper fix with concurrent.futures, separating the work of different threads so that they can be independent and their results can be combined when they're all finished.
Michael #3: pip 26.1 - lockfiles and dependency cooldowns
Python 3.9 is no longer supported.
Experimental: installing from pylock files.
Dependency cooldowns (see my post about this).
Lifting several 2020 resolver limitations.
Brian #4: Python 3.15 sentinel values from PEP 661
MISSING = sentinel("MISSING")
def next_value(default: int | MISSING = MISSING):
    ...
    if default is MISSING:
        ...
Takes a name str as a constructor parameter.
Intended to be compared with the is operator, similar to None.
Sentinel objects can be used as a type, also similar to None, and can be combined with other types with |.
Unlike None, sentinel values are truthy. (Ellipsis, ..., is also truthy.) This seems like a strange choice, but I guess it must have made sense to someone. It does force you to use is instead of depending on False-ness, so I guess it'll make code using sentinels more readable.
Interesting that the PEP was started in 2021, and we're finally getting it this year.
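To make the sentinel behavior described above concrete on interpreters that predate the feature, here is a small sketch. The `Sentinel` class is a hypothetical stand-in written for this example, not the real built-in from the PEP, and the annotation is simplified to `object` because an instance of this stand-in cannot be used in a `|` type union the way the notes describe.

```python
class Sentinel:
    """Hypothetical stand-in for the sentinel type; NOT the real built-in."""

    def __init__(self, name: str) -> None:
        self._name = name

    def __repr__(self) -> str:
        return self._name


MISSING = Sentinel("MISSING")


def next_value(default: object = MISSING) -> object:
    # An identity check separates "no argument given" from
    # falsy-but-valid arguments such as 0, "" or None.
    if default is MISSING:
        return "no default given"
    return default


# Plain objects are truthy unless they define __bool__ or __len__,
# so a truthiness test cannot stand in for the identity check.
sentinel_is_truthy = bool(MISSING)
```

The point of the identity check is that `next_value(None)` and `next_value(0)` are treated as real arguments, which a `None` default or an `if not default` test would get wrong.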
Extras
Brian: Before GitHub - Armin Ronacher. tenacity - cross-platform multi-track audio editor/recorder; learned about it from Armin's article.
Joke: Joke option: Make it myself. Seems similar to what people think about software now.
Links: httpxyz one month in; httpxyz.org/httpx-compatibility; Learn concurrency - a deep dive into multithreading with Python; pip 26.1 - lockfiles and dependency cooldowns; my post about this; Python 3.15 sentinel values from PEP 661; Before GitHub; tenacity; Make it myself.
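The counter example from the concurrency item in these notes (a shared counter guarded by threading.Lock, then independent chunks combined via concurrent.futures) can be sketched roughly as follows. This is an illustration written for these notes, not the code from the linked article; `N_THREADS`, `PER_THREAD`, and the function names are assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

N_THREADS = 4        # hypothetical sizes chosen for the example
PER_THREAD = 10_000


def locked_count() -> int:
    """Shared counter guarded by a lock: correct, but every increment
    has to wait its turn for the same lock."""
    total = 0
    lock = threading.Lock()

    def work() -> None:
        nonlocal total
        for _ in range(PER_THREAD):
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_chunk(n: int) -> int:
    """Independent chunk: counts privately and shares nothing."""
    local = 0
    for _ in range(n):
        local += 1
    return local


def isolated_count() -> int:
    """Run isolated chunks and merge the results once, at the end."""
    with ThreadPoolExecutor(max_workers=N_THREADS) as pool:
        return sum(pool.map(count_chunk, [PER_THREAD] * N_THREADS))
```

Both functions return the same total; the second avoids any communication between threads until the final `sum`, which is the "merge at the finish line" shape the quoted advice recommends.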
In the 365th episode, Marc speaks with Dr. Florian Skupin, who heads the Center for Legal Technology and Data Science and the Bucerius Legal Innovation Hub at Bucerius Law School. Florian describes his path from a business administration degree through his doctorate to academic management. The conversation covers the growing complexity of legislation amid stagnating numbers of new entrants to the profession, as well as the transformation of traditional law firm structures into interdisciplinary teams. Marc and Florian also examine the opportunities and risks of AI in the justice system. How is generative AI changing day-to-day work in legal departments and law firms? What new roles, such as the Legal Connector, are emerging from technological change? How can the justice system use technology to maintain its capacity despite looming waves of retirements? What practical options do students have today to build additional digital qualifications early on? You'll find answers to these and many other questions in this episode of IMR. Enjoy!
For decades, organizations have talked about paying for skills instead of jobs. The idea is simple. Reward people based on what they can do, not just the role they hold. But in practice, it has always been difficult to execute. Skills are hard to define, harder to measure, and nearly impossible to track consistently across a workforce. At the same time, the market is shifting fast. AI-related skills are in high demand, showing up in job postings across industries. But new data shows those skills don't always translate into higher pay. So organizations are facing a disconnect. They know skills matter more than ever. But they don't yet have the systems or structures to consistently pay for them. In this episode of Comp and Coffee, Ruth Thomas is joined by Sara Hillenmeyer, VP of AI and Data Science at Payscale, to explore why skills-based pay has remained out of reach and why that may finally be changing. Together they unpack how AI is reshaping demand for skills, why the market isn't consistently rewarding them yet, and what needs to happen for skills-based pay to become a reality at scale. This conversation looks at the data, the technology gap, and the structural shifts required for organizations to move from jobs-based to skills-based compensation.
Welcome back to the show! This week, I sit down with three co-authors of the Atlas of Macroscopes (Katy Borner, Elizabeth Record, and Todd Theriault from the Cyberinfrastructure for Network Science Center at Indiana University) to explore what a macroscope actually is and how it differs from a standard interactive visualization. We trace the 20-year journey of the Places and Spaces: Mapping Science exhibit, from two-dimensional wall maps to the 40 richly interactive pieces featured in this stunning 11×14-inch MIT Press book. Along the way, we talk about design strategies for making complex systems legible to general audiences, the role of AI in data visualization, and what it takes to grab and hold attention on a museum floor. Each guest shares a personal favorite from the book, ranging from Smelly Maps to an Appalachian opioid overdose tool to a skills-landscape explorer, and we close with a look at the exhibit's exciting third decade, focused on visualizing intelligences.
Keywords: data visualization, macroscope, atlas of macroscopes, interactive visualization, Katy Borner, Indiana University, Places and Spaces, complex systems, information visualization, scrollytelling, AI and data visualization, opioid epidemic mapping, data communication, science exhibit, data science podcast
Subscribe to the PolicyViz Podcast wherever you get your podcasts.
Become a patron of the PolicyViz Podcast (https://patreon.com/policyviz) for as little as a buck a month.
Find the Atlas of Macroscopes and explore the Places and Spaces exhibit at scimaps.org. Follow Katy Borner, Elizabeth Record, and Todd Theriault through Indiana University's CNS Center.
Follow me on Instagram, LinkedIn, Substack, Twitter, Website, YouTube.
Email: jon@policyviz.com
Can NPCs in videogames leverage new LLM-based tech? What are the benefits? What are the costs?
Talk Python To Me - Python conversations for passionate developers
When OpenAI trained GPT-3, they didn't roll their own orchestration layer. They used Ray, an open source Python framework born out of the same Berkeley research lab lineage that gave us Apache Spark. And here's the twist: Ray was originally built for reinforcement learning research, then quietly faded as RL hit a wall. Until ChatGPT showed up. Suddenly reinforcement learning was back, as the post-training step that turns a raw language model into something genuinely useful. Edward Oakes and Richard Liaw, two founding engineers behind Ray and Anyscale, join me on Talk Python to tell that story. We'll trace Ray from its RISE Lab origins at UC Berkeley to powering some of the largest training runs in the world. We'll talk about what Ray actually is, a distributed execution engine for AI workloads, and how a few lines of Python become work running across hundreds of GPUs. We'll cover Ray Data for multimodal pipelines, the dashboard, the VS Code remote debugger, KubeRay for Kubernetes, and where Ray fits alongside Dask, multiprocessing, and asyncio. If you've ever stared at a single-machine Python script and thought, "there has to be a better way to scale this", this one's for you.
Episode sponsors: Sentry Error Monitoring, Code talkpython26; AgentField AI; Talk Python Courses.
Links from the show: Guests: Richard Liaw: github.com; Edward Oakes: github.com. Ray: www.ray.io. Example code (we used for walk-through): docs.ray.io. Getting Started with Ray: docs.ray.io. Ray Libraries: docs.ray.io. kuberay: github.com. Watch this episode on YouTube: youtube.com. Episode #547 deep-dive: talkpython.fm/547. Episode transcripts: talkpython.fm. Theme Song: Developer Rap.
Dan Jenkins, Ph.D., is Professor of Leadership & Organizational Studies at the University of Southern Maine. Co-author of The Role of Leadership Educators: Transforming Learning and author of over 75 peer-reviewed publications, his scholarship spans leadership pedagogy, artificial intelligence (AI), followership, critical thinking, and curriculum design. A pioneer in integrating AI into development, training, and education, he develops innovative courses preparing students for digital-age leadership challenges. Dan serves as Co-Founder of the International Leadership Association's Leadership Education Academy, Associate Editor of the Journal of Leadership Studies, and co-host of The Leadership Educator and Leaders in the Loop podcasts. An award-winning international speaker and facilitator, he engages thousands of leadership educators, scholars, students, and professionals worldwide on innovative teaching approaches and AI integration.
Gaurav Khanna, Senior Manager, Data Science and Digital Journeys, Cisco Systems, has 25 years of experience in technology and entrepreneurship. During the past five years, he has led efforts to automate business workflows using machine learning and deep learning techniques. His work focuses on using large language models and generative AI to transform how users interact with sales acceleration platforms. Khanna is passionate about demystifying complex subjects and is a frequent speaker on AI/ML topics. He received a BS in physics from Yale and an MS and a PhD in materials science and engineering from Stanford.
A Couple of Quotes From This Episode
About The International Leadership Association (ILA): The ILA was created in 1999 to bring together professionals interested in studying, practicing, and teaching leadership. Attend The Global Conference in Toronto, October 28-31.
About Scott J. Allen: Website. Weekly Newsletter: Practical Wisdom for Leaders.
My Approach to Hosting: The views of my guests do not constitute "truth." Nor do they reflect my personal views in some instances. However, they are views to consider, and I hope they help you clarify your perspective. Nothing can replace your reflection, research, and exploration of the topic.
♻️ Please share with others and follow/subscribe to the podcast!
⭐️ Please leave a review on Apple, Spotify, or your platform of choice.
➡️ Follow me on LinkedIn for more on leadership, communication, and tech.
What is the state of AI and videogames? Who is considering it? What are the big fails so far? This and much more is covered in this 1st episode of AI and videogames.
Matt Ober, Managing Partner at Social Leverage, joins Jake & Gino to discuss venture capital, fintech investing, data-driven investing strategies, AI, entrepreneurship, and the future of finance. Previously Chief Data Scientist at Third Point and Head of Data Strategy at WorldQuant, Matt shares valuable insights into startup investing, identifying market opportunities, and how technology is transforming the financial world.
In this episode: Venture capital & fintech trends; Data science in investing; Startup growth strategies; AI in finance; Entrepreneurship & scaling businesses; Long-term investing insights.
Looking to grow your real estate investing business with proven systems and education? Visit Wheelbarrowprofits.com and start building long-term wealth today.
Timestamps:
0:05 - Introduction by Jake Stenziano
0:13 - Gino responds to Jake
0:18 - Jake's comment on gratitude
0:21 - Gino talks about yesterday's conversation
0:49 - Gino acknowledges Jake's support
1:07 - Discussion about the weather
1:23 - Introduction of guest Matt Ober
1:52 - Matt Ober's introduction
2:01 - Matt shares his career journey
2:30 - Matt talks about his hedge fund experience
2:58 - Discussion on venture firm building
3:28 - Matt talks about his partners
3:37 - Matt discusses the hedge fund space
4:05 - Jake comments on the hedge fund space
4:31 - Matt talks about his current company
5:11 - Discussion on investment thesis
5:30 - Matt explains investment focus
6:29 - Matt talks about investing in people
7:06 - Discussion on adversity and entrepreneurship
7:39 - Jake asks about investing in trust funds
8:28 - Matt discusses work atmosphere
9:05 - Discussion on investment backgrounds
9:35 - Matt talks about global team experience
10:24 - Discussion on competition and relationships
11:00 - Discussion on wealth management
12:16 - Discussion on gambling and prediction markets
13:28 - Discussion on prediction markets as media
14:05 - Discussion on tax loss harvesting
15:17 - Discussion on investment strategies
16:02 - Discussion on borrowing against stock portfolios
17:10 - Discussion on interest rates and loans
18:04 - Discussion on democratizing financial tools
19:24 - Discussion on data and AI
20:55 - Discussion on company adaptation to AI
22:06 - Discussion on layoffs and efficiency
23:26 - Discussion on AI and job skills
24:09 - Discussion on investment lifecycle
25:13 - Discussion on venture scale
26:24 - Discussion on raising capital
27:45 - Discussion on investment success rates
29:10 - Discussion on investment distribution
30:18 - Discussion on timing and product success
31:14 - Discussion on founding teams
32:09 - Discussion on founder challenges
33:25 - Discussion on business similarities
34:25 - Discussion on AI and creativity
35:24 - Discussion on creativity and skills
36:27 - Discussion on AI usage
37:49 - Discussion on sales and networking
38:26 - Discussion on commercial real estate
39:16 - Discussion on loan processes
40:38 - Discussion on real estate debt space
41:06 - Discussion on mortgage processes
42:32 - Discussion on financial planning
43:00 - Discussion on 401k transfers
43:59 - Matt's bold prediction
44:42 - Closing remarks
We're here to help create real estate entrepreneurs...
About Jake & Gino: Jake & Gino are multifamily investors, operators, and owners who have created a vertically integrated real estate company. They control over $350M in assets under management. Connect with Jake & Gino here --> https://jakeandgino.com.
Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Topics covered in this episode: profiling-explorer; Reverting the incremental GC in Python 3.14 and 3.15; VSCode AI Co-author defaults to on, then off; django freeze; Extras; Joke.
Watch on YouTube.
Brian #1: profiling-explorer
Adam Johnson
And intro post: Python: introducing profiling-explorer.
"profiling-explorer is a tool for exploring profiling data from Python's built-in profilers, which are stored in pstats files."
Features: Dark mode. Click the calls, internal ms, or cumulative ms column headers to sort by that column. Use the search box to filter by filename or function name. Hover over a filename + line number pair to reveal the copy button, which copies the location to your clipboard for faster opening. Click the callers or callees links on the right of a row (not pictured above) to see the callers or callees of that function.
Michael #2: Reverting the incremental GC in Python 3.14 and 3.15
Python 3.14 shipped with a new incremental garbage collector, but production reports of severe memory pressure (Neil Schemenauer measured up to 5× peak RSS on pathological cyclic workloads) have pushed the core team and Steering Council to revert it in both 3.14 and 3.15 - returning to the 3.13-era generational GC.
This is the second time the inc GC has been pulled back: it was also reverted right before 3.13.0 final, and it shipped in 3.14 without going through the PEP process.
The tradeoff is real: Neil's benchmarks showed max GC pause times of 1.3ms with inc GC versus 26ms with the generational one - great for latency-sensitive apps, terrible for memory-constrained ones.
Release manager Hugo van Kemenade will ship 3.14.5 early with the revert, and Gregory Smith floated the idea of a 3.14.5rc1 - the first patch-release RC since 3.9.2 back in 2021.
Tim Peters spent the thread doing live forensics on Windows, running a toy deque program that should cap at 1GB and watching it balloon to 15.6GB on a 16GB machine - and discovered the gen0 collector effectively never fires under the new scheme.
Tim's bigger meta-point: CPython has a chronic shortage of real-world GC benchmarks, pyperformance has "basically no interesting" cyclic workloads, and users almost never share real data - so core devs keep flying blind on changes like this.
Django maintainer Adam Johnson published a blog post mid-thread documenting a real memory "leak" in Django's migration system caused by inc GC, with a manual gc.collect() workaround - the listener-facing receipt that this wasn't just theoretical.
If the inc GC comes back for 3.16, it'll go through a proper PEP, and the discussion is already shifting toward keeping both collectors available via a startup flag - which Neil and Sergey Miryanov have both prototyped.
Brian #3: VSCode AI Co-author defaults to on, then off
VSCode merges "Enabling ai co author by default" - 3 weeks ago.
Tons of "why would you do this" and related comments.
VSCode merges "Change default for git.addAICoAuthor to off" - yesterday.
Takeaway: don't rely on the default; set addAICoAuthor to off yourself.
Michael #4: django freeze
Convert your dynamic Django site to a static one with one line of code. Just run python manage.py generate_static_site :)
Features: Generate the static version of your Django site, optionally as a compressed .zip file. Generate/download the static site using urls (only superuser and staff). Follow sitemap.xml urls. Follow internal links found in each page. Follow redirects. Report invalid/broken urls. Selectively include/exclude media and static files. Custom base url (very useful if the static site will run in a specific folder different from the document-root). Convert urls to relative urls (very useful if the static site will run offline or in an unknown folder different from the document-root). Prevent local directory index.
Extras
Brian: Thinking Less, Trusting More: GenAI's Impacts on Students' Cognitive Habits.
Michael: Vercel breached, employee to blame. Introducing the new Talk Python web player. GitHub uptime (a couple of views 1, 2).
Joke: Friends in tech
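For the incremental GC item in these notes, the general shape of a manual gc.collect() call can be shown with the standard library alone. This is a generic illustration of forcing a collection of reference cycles, not the Django-specific workaround from the blog post mentioned above; the `Node` class and `make_cycle` helper are hypothetical.

```python
import gc


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self) -> None:
        self.partner = None


def make_cycle() -> None:
    a, b = Node(), Node()
    # a <-> b refer to each other, so reference counting alone cannot
    # free them once this function returns; only the cycle collector can.
    a.partner, b.partner = b, a


gc.disable()              # keep automatic collection out of the demo
try:
    gc.collect()          # start from a clean slate
    for _ in range(1000):
        make_cycle()
    # Force a full collection; the return value is the number of
    # unreachable objects the collector found.
    freed = gc.collect()
finally:
    gc.enable()
```

Calling `gc.collect()` at a known quiet point trades a pause at that moment for a lower peak memory footprint, which is the same tension between pause time and memory that the notes describe.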
In this special episode of Tangent Proptech, Edward Cohen is on the red carpet at one of the most exclusive commercial real estate events of the year: the Real Estate Gala in New York City. This episode features rapid-fire conversations with founders, investors, brokers, developers, and operators across the proptech and commercial real estate ecosystem. A big focus of the evening was on AI. Namely, this question: how is AI being used in real estate right now? And possibly more front-and-center: what's hype and what's here to stay? From leasing and marketing to underwriting and financial modeling, this episode explores where artificial intelligence is already driving value in real estate, where it's falling short, and how we can close the gap.
(00:00) - Welcome to the Real Estate Gala Red Carpet Interviews
(02:30) - Cyrus Claffey (ButterflyMX): AI Across Product, Marketing, & Operations
(06:00) - Zach Molzer (Molzer Development) & Madi Bremer (CBRE): Networking & AI in Leasing
(08:30) - Gabe Einhorn (VryfID): Content, Consistency, and AI Efficiency
(10:00) - Kaylan Knitowski (Franklin Street): AI Workflows and Competing with Experience
(13:30) - David Auerbach (Hoya Capital): Driving Tech Adoption in Real Estate
(14:45) - Adam Steiner (Rick, Steiner, Fell, and Benowitz): Document Automation & Bridging Tech and Business
(16:45) - Humberto Lopes (HL Dynasty, Gotham Housing Alliance): A Human-First Real Estate Perspective
(19:15) - Jovian Lopes (Gotham Housing Alliance): AI for Research vs Human Relationships
(21:00) - Lauren O'Breza (Foresite CRE): AI in Brokerage & Underwriting
(24:30) - Pablo Barreiro (Fortec): Simplifying Tech Adoption & the Future of Financing
(26:00) - Shanti Ryle (Crexi): AI Data Enrichment & Storytelling Advantage
(30:30) - Rameen Inayat (Ryan): AI for Admin & Property Tax Insights
(32:00) - Collaboration Superpower: Priya Parker
Join Brandon Seigel on the Private Practice Survival Guide Podcast as he speaks with cybersecurity expert Yves Martin about the critical importance of cybersecurity for private practices. Discover how a single click can lead to devastating data breaches and ransomware attacks, and learn the essential strategies to protect your business. Yves Martin, president and founder of MQual, shares real-world insights and actionable advice on preventing, detecting, and responding to cyber threats in the healthcare industry. This episode is a must-listen for any private practice owner looking to fortify their digital defenses and ensure compliance.
What You'll Learn:
The prevalent dangers of phishing and social engineering in healthcare cybersecurity.
The crucial difference between IT support and dedicated cyber protection.
Why staff training is your most potent defense against cyberattacks.
The benefits and ease of implementing multi-factor authentication across your systems.
Urgent steps to take if you suspect your practice has been compromised.
Don't let your practice become another statistic. Tune in to understand the cybersecurity landscape and empower your team.
#Cybersecurity #PrivatePractice #DataProtection #HealthcareIT #Ransomware
Yves Martin has been programming since age twelve, starting with BASIC on a TRS-80. He studied Industrial Engineering at Lehigh University and holds a Professional Certificate in Artificial Intelligence from MIT, along with a certification in Designing and Building AI Products and Services. He also holds certificates in Statistics, Data Analysis, Data Science, and Analyzing and Visualizing Data. With over twenty years of experience designing and building data systems, including business intelligence platforms, he combines technical depth with practical insight. As an author, he writes about artificial intelligence and the use of technology to automate business processes.
https://www.mqual.com
https://www.facebook.com/mqualtech/
Welcome to Private Practice Survival Guide Podcast hosted by Brandon Seigel! Brandon Seigel, President of Wellness Works Management Partners, is an internationally known private practice consultant with over fifteen years of executive leadership experience. Seigel's book "The Private Practice Survival Guide" takes private practice entrepreneurs on a journey to unlocking key strategies for surviving, and thriving, in today's business environment. Now Brandon Seigel goes beyond the book and brings the same great tips, tricks, and anecdotes to improve your private practice in this companion podcast.
Get In Touch With Me
Podcast Website: https://www.privatepracticesurvivalguide.com/
LinkedIn: https://www.linkedin.com/in/brandonseigel/
Instagram: https://www.instagram.com/brandonseigel/
https://wellnessworksmedicalbilling.com/
Private Practice Survival Guide Book
This show is proudly produced at PS Studios. Learn more: https://www.psstudios.co
How do you add agent skills to your data science workflow? How can a coding agent assist with data wrangling and research? This week on the show, Trevor Manz from marimo joins us to discuss marimo pair.
Join Paul Steven Conyngham, Co-founder of Core Intelligence Technologies and a veteran data scientist with 17 years of experience, for a conversation that redefined the boundaries of "citizen science." In 2026, Paul stunned the global medical and tech communities by doing the unthinkable: designing a personalized mRNA cancer vaccine for his rescue dog, Rosie, after a terminal diagnosis. In this episode, we discuss how Paul applied the rigors of machine learning and data strategy to the complex world of genomics, utilizing AI to turn a death sentence into a landmark recovery.
In 1973, a bizarre encounter allegedly unfolded on the Isle of Wight, involving two children who claimed to meet an odd, clown-like humanoid figure near Sandown, England. Speaking in odd phrases and appearing to inhabit a strange, makeshift dwelling, the being called itself "All Colors Sam," and despite the obscure origins of the tale, it would eventually gain a cult following within the annals of UFO and high-strangeness lore, remembered today as the story of Sam "The Sandown Clown." Joining us this week on The Micah Hanks Program to discuss this case from ufology's "Odd Files" is Ryan Whalen, a Brooklyn-based researcher, science reporter, and college instructor who holds an MA in History and a Master of Library and Information Science with a certificate in Data Science. Whalen, who also co-hosts the podcast "Cease to Exist", reveals what he and his colleagues recently uncovered about this bizarre 1973 urban legend. What new details have emerged about the case, and one of the alleged witnesses to these eerie events that have since become a mainstay in modern UFO folklore? Want to advertise/sponsor The Micah Hanks Program? We have partnered with the AdvertiseCast to handle our advertising/sponsorship requests. If you would like to advertise with The Micah Hanks Program, all you have to do is click the link below to get started: AdvertiseCast: Advertise with The Micah Hanks Program Show Notes Below are links to stories and other content featured in this episode: NEWS: Trump unharmed after shooting incident at White House correspondents' dinner WILCOCK UPDATE: UPDATE: (Police and Family Statement) Death Investigation Near Ridge Road Death Investigation Near Ridge Road - Boulder County SANDOWN CLOWN: The Mystery of the Sandown Clown: Britain's Answer to Bigfoot CEASE TO EXIST PODCAST: https://ceasetoexistpod.com/ RYAN WHALEN: Ryan Whalen (@mdntwvlf) / Posts / X BECOME AN X SUBSCRIBER AND GET EVEN MORE GREAT PODCASTS AND MONTHLY SPECIALS FROM MICAH HANKS. 
Sign up today and get access to the entire back catalog of The Micah Hanks Program, as well as "classic" episodes, weekly "additional editions" of the subscriber-only X Podcast, the monthly Enigmas specials, and much more. Like us on Facebook Follow @MicahHanks on X. Keep up with Micah and his work at micahhanks.com.
This is episode 324, recorded on April 17th, 2026, where John and Jason continue through the Microsoft Fabric March 2026 Feature Summary — the Data Science & AI rebrand with Fabric Data Agents reaching GA, AutoML going GA, multimodal support for AI functions, the Data Warehouse section covering Fabric Data Warehouse recovery, Activator support, T-SQL AI functions, ANY_VALUE aggregate, Custom SQL Pools, SQL audit logs GA, outbound access protection, and the big one — the Database Hub, Fabric's new unified control plane for databases across edge, on-prem, cloud and Fabric. For show notes please visit www.bifocal.show
AI for Better & Sustainable Products
In this clip, Dr. Satyajit Wattamwar, Data Science & Digital Expertise Leader at Unilever R&D, shares how AI is transforming product innovation. From identifying the right formulations for higher-quality products to discovering sustainable packaging alternatives, AI helps narrow down possibilities faster and smarter.
Joyjit Roy shares his experience leading large-scale initiatives across insurance and eCommerce, including multi-year modernization programs that integrate AI into core enterprise systems. We explore what it takes to move from legacy platforms to intelligent, scalable systems that can support real-time decisioning, MLOps, and next-generation automation. Key Highlights: Enterprise AI Modernization: How to integrate AI/ML into legacy systems while maintaining execution speed, governance, and business alignment. Cloud-Native Transformation: Lessons from building scalable, event-driven architectures that improve agility, observability, and resilience. MLOps in Practice: Insights into feature engineering, inference workflows, and operationalizing AI for real-world enterprise use cases. Agentic AI in the Enterprise: Joyjit's perspective on how agentic systems can enable adaptive decisioning, contextual workflows, and more intelligent automation at scale. Leading at Scale: Strategies for aligning large, cross-functional teams across product, architecture, engineering, and business stakeholders.
Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas
The connectome is the wiring diagram of a brain, a big matrix that tells us what neurons talk to what other neurons. Understanding it is an important step to understanding how brains work, but a long way from the final answer. A big next step is understanding how neuronal circuits connect to and guide bodily behavior. Very recent work on mapping the fruit-fly connectome has brought us closer to that goal. I talk with neuroscientist Bing Brunton about the connectome, how we can study it to understand bodily motion in flies and other creatures, and where it's all taking us. Chubbies is here to keep you comfy and looking good year-round. Get 20% off with code MINDSCAPE at chubbiesshorts.com/MINDSCAPE! #chubbiespod Upgrade your denim game with Rag & Bone! Get 20% off sitewide with code MINDSCAPE at www.rag-bone.com. #ragandbonepod Support Mindscape on Patreon. Blog post with transcript: https://www.preposterousuniverse.com/podcast/2026/04/27/352-bing-brunton-on-connecting-the-connectome-to-the-body/ Bing Wen Brunton received her Ph.D. in neuroscience from Princeton University. She is currently a Professor of Biology and the Richard & Joan Komen University Chair at the University of Washington, with affiliations at the eScience Institute for Data Science, the Paul G. Allen School of Computer Science & Engineering, and the Department of Applied Mathematics. Web site University of Washington web page Google Scholar publications YouTube channel Bluesky Artworks (Instagram)
Talk Python To Me - Python conversations for passionate developers
The cloud is convenient until it isn't. You upload your photos, sync your contacts, click through the cookie banners. Then prices go up again or you read about a family that lost their entire Google account over a medical photo sent to a doctor. At some point, the question shifts from "why would I run this myself?" to "why aren't I?" My guest this week is Alex Kretzschmar, head of DevRel at Tailscale, longtime host of the Self-Hosted podcast, and co-founder of Linuxserver.io. We cover what self-hosting really means in 2026, the apps worth running yourself like Immich and Home Assistant, why Docker Compose ties it all together, and how Tailscale lets you reach any of it from anywhere, without opening a single port. If you've been thinking about pulling your digital life back behind your own walls, this is your roadmap. Episode sponsors Temporal Talk Python Courses Links from the show Guest Alex Kretzschmar: alex.ktz.me Bitflip podcast: bitflip.show Self-Hosted podcast (Alex's previous show): selfhosted.show Perfect Media Server: perfectmediaserver.com KTZ Systems on YouTube: youtube.com/@ktzsystems Linuxserver.io (co-founded by Alex): linuxserver.io "How Tailscale Works" blog post: tailscale.com/blog/how-tailscale-works https://tailscale.com/: tailscale.com Self-hosted apps discussed Awesome Self-Hosted (GitHub list): github.com Immich (Google Photos alternative): immich.app Home Assistant: home-assistant.io Open Home Foundation: openhomefoundation.org Plausible Analytics: plausible.io Umami Analytics: umami.is Python integration for umami: pypi.org Pi-hole: pi-hole.net AdGuard Home: adguard.com NextDNS: nextdns.io Coolify: coolify.io Docker + ufw: docs.docker.com Storage, backup & filesystem OpenZFS: openzfs.org ZFS.rent (offsite ZFS replication): zfs.rent Backblaze: backblaze.com Hetzner Storage Box: hetzner.com DigitalOcean: digitalocean.com Secrets management mentioned OpenBao (open-source Vault fork): openbao.org HashiCorp Vault: hashicorp.com Bitwarden: 
bitwarden.com 1Password: 1password.com Hardware mentioned Proxmox VE: proxmox.com Minisforum MS01: minisforum.com Zima Board / Zima OS: zimaspace.com Other references Cory Doctorow on "enshittification" (Cory's blog where he coined the term): pluralistic.net Linus Tech Tips' WAN Show (Linus mentioned NAS-building going mainstream): linustechtips.com Watch this episode on YouTube: youtube.com Episode #546 deep-dive: talkpython.fm/546 Episode transcripts: talkpython.fm Theme Song: Developer Rap
In this Cloud Wars conversation, Bob Evans sits down with Shub Bhowmick, CEO and Founder of Tredence, alongside Yasmeen Ahmad from Google Cloud to explore how enterprises are moving from AI applications to AI agents. Their discussion focuses on what it takes to turn intelligence into action, covering data foundations, semantic layers, agentic architectures, and the operational shifts required for businesses to scale AI successfully.
Turning AI Into Action
The Big Themes:
AI Agents Redefine Applications: Traditional AI apps assist by querying data, generating recommendations, and supporting limited workflows. AI agents, however, represent a much deeper operational shift. As Ahmad explains, agents are multi-step reasoning engines that can access multiple systems, coordinate actions, and execute entire business processes autonomously. Instead of simply helping humans decide, they can perform work across ERP systems, supply chains, and customer interactions. This changes the role of the database itself, from a storage and query engine into a reasoning engine with vector search, graph RAG, and semantic understanding. Examples like Home Depot and Danfoss show how this creates massive efficiency gains.
Why Questions Require Agentic Intelligence: Shub Bhowmick draws a critical distinction between “what” questions and “why” questions. A conversational BI system can answer what happened, such as how much sales dropped, but a “why” question demands deeper reasoning. Why did sales decline? Was it pricing pressure, competitor behavior, inventory constraints, or macroeconomic events? These problems require hypothesis-driven exploration. Tredence addresses this through business semantic layers, knowledge graphs, and hypothesis banks that support open-ended reasoning.
Closed Systems Create Long-Term Risk: Bhowmick warns against enterprises rushing toward closed, inflexible platforms simply because they promise faster short-term value. While packaged solutions may accelerate deployment, they often restrict ownership, adaptability, and future innovation. In contrast, open architectures built with hyperscalers like Google Cloud allow customers to own the IP, customize solutions, and evolve as the market changes.
The Big Quote: “Gone are the days when these migrations used to take 12 to 18 months. Nowadays, you have to complete these migrations in less than three to four months.”
More from Tredence and Google Cloud: Learn about the partnership between Tredence and Google Cloud and AI agents on Gemini Enterprise. Visit Cloud Wars for more.
Ido Genosar is the CEO and Founder of Verobotics, the pioneering robotics solution for building façade maintenance, inspection, and upkeep. Prior to this, he led innovation at Aluminium Construction, Israel's biggest façade constructor. His diverse background in building exteriors, technology and global business development is at the core of his mission to solve deep-rooted inefficiencies with breakthrough innovation.
(00:50) - “Miracle” Robotics
(02:50) - What the Façade Robot Does
(03:11) - Humanoids vs. Task-Specific Robots
(04:28) - Why Robotics Demos Fail in the Real World
(06:05) - VC Lens on Robotics
(09:13) - Funding & Adoption Reality
(12:48) - Façades as a Data Blind Spot
(15:11) - Construction Signoff & Compliance
(16:00) - Unions, Scaffolding & Safety
(18:07) - New Data-driven Decisions
(20:52) - Where's the Long-Term Value
(29:33) - Humans in the Loop
(31:48) - Best-Fit Buildings Today
(33:03) - Robots in 5 to 10 Years
(36:43) - Collaboration Superpower: Sir James Dyson, Leonardo Torres
Matt is a Principal Computational Scientist at The Jackson Laboratory. Trained as a mathematician, Matt then moved into the area of systems biology, driven by a lifelong curiosity and the opportune timing of the 2009 financial crisis. He is currently working on two main projects: studying aging biology and understanding mechanisms of cardiotoxicity for drugs. This conversation was recorded in March 2026.
The Maine Science Podcast is a production of the Maine Discovery Museum. It is recorded at Discovery Studios, at the Maine Discovery Museum, in Bangor, ME. The Maine Science Podcast is hosted and executive produced by Kate Dickerson; edited and produced by Scott Loiselle. The Discover Maine theme was composed and performed by Nick Parker. To support our work: https://www.mainediscoverymuseum.org/donate.
Find us online:
Maine Discovery Museum
Maine Discovery Museum on social media: Facebook Instagram LinkedIn Bluesky YouTube
Maine Science Podcast on social media: Facebook Instagram YouTube
Maine Science Festival on social media: Facebook Instagram LinkedIn YouTube
© 2026 Maine Discovery Museum
Early bird discounts for the San Francisco World's Fair, the biggest AIE gathering of the year, end today - prices will go up by ~$500 tonight so do please lock in ASAP!
From near-universal AI tool adoption inside Shopify to internal systems for ML experimentation, auto-research, customer simulation, and ultra-low-latency search, Mikhail Parakhin joins us for a deep dive into what it actually looks like when a 20-year-old, $200B software company goes all-in on AI. We cover why Shopify has become much more vocal about its internal stack, what changed after the December model-quality inflection, and why the real bottleneck in AI coding is no longer generation, but review, CI/CD, and deployment stability.
We also go inside Tangle, Tangent, and SimGym, three major AI initiatives that Shopify is pursuing to make experimentation reproducible, optimization automatic, customer behavior simulatable, and search and catalog intelligence faster and cheaper at scale. Along the way, Mikhail explains UCP, Liquid AI, and why token budgets are directionally right but often measured badly, why AI-written code can still increase bugs in production, what makes Shopify's customer simulation defensible, and what he learned from the Sydney era at Bing.
We discuss:
* Mikhail's path from running a major Microsoft business unit spanning Windows, Edge, Bing, and ads to becoming CTO of Shopify
* Why Shopify is talking more publicly about AI now, and why staying at the frontier has become necessary for the company
* Shopify's internal AI adoption curve, the December inflection, and why CLI-style tools are rising faster than traditional IDE-based tools
* Why Jensen Huang is directionally right on token budgets, but raw token count is still the wrong way to evaluate engineering output
* Why the real unlock is not more agents in parallel, but better critique loops, stronger models, and spending more on review than generation
* Why AI coding can still lead to more bugs in production even if models write cleaner code on average than humans
* Why Shopify built its own PR review flow, and why Mikhail thinks most off-the-shelf review tools miss the point
* How PR volume, test failures, and deployment rollback are becoming the real bottlenecks in the agent era
* Why Git, pull requests, and CI/CD may need a new metaphor once code is written at machine speed
* What Tangle is, and how Shopify uses it to make ML and data workflows reproducible, collaborative, and production-ready from the start
* Why Tangle is different from Airflow, and why content-addressed caching creates network effects across teams
* What Tangent is, and how Shopify is using auto-research loops to optimize search, themes, prompt compression, storage, and more
* Why Tangent is becoming a democratizing tool for PMs and domain experts, not just ML engineers
* Why AutoML finally feels real in the LLM era, and where auto-research still falls short today
* Why Tangle, Tangent, and SimGym become much more powerful when combined into one system
* What SimGym is, why simulated customers only work if you have real historical behavior, and why Shopify's data gives it a moat
* How SimGym evolved from comparing A/B variants to telling merchants what to change on a single live storefront to raise conversions
* Why customer simulation is so expensive, from multimodal models to browser farms to serving and distillation costs
* How Shopify models merchant and buyer trajectories, runs counterfactuals, and thinks about interventions like discounts, campaigns, and notifications
* Why category-level behavior is so different across commerce, and why ideas like Chinese Restaurant Processes are showing up again in practice
* Shopify's new UCP and catalog work, including runtime product search, bulk lookups, and identity linking
* Why Shopify is using Liquid AI, and why Mikhail sees it as the first genuinely competitive non-transformer architecture he has used in practice
* Where Liquid already works inside Shopify today, from low-latency query understanding to large-scale catalog and Sidekick Pulse workloads
* Whether Liquid could become frontier-scale with enough compute, and why Shopify remains pragmatic and merit-based about model choice
* Who Shopify is hiring right now across ML, data science, and distributed databases
* The Sydney story at Bing, why its personality was not an accident, and what Mikhail learned from deliberately shaping AI character early on
Mikhail Parakhin
* LinkedIn: https://www.linkedin.com/in/mikhail-parakhin/
* X: https://x.com/MParakhin
Timestamps
00:00:00 Introduction: Mikhail Parakhin, Microsoft, and Shopify
00:01:16 Why Shopify Is Talking More About AI
00:02:29 Internal AI Adoption at Shopify and the December Inflection
00:06:54 Token Budgets, Jensen Huang, and Why Usage Metrics Can Mislead
00:10:55 Why Shopify Built Its Own AI PR Review System
00:12:38 AI Coding, More Bugs, and the Real Deployment Bottleneck
00:14:11 Why Git, PRs, and CI/CD May Need to Change for Agents
00:18:24 Tangle: Shopify's Reproducible ML and Data Workflow Engine
00:21:19 Why Tangle Is Different from Airflow
00:26:14 Tangent: Auto Research for Optimization and Experimentation
00:30:07 How Tangent Democratizes Experimentation Beyond ML Engineers
00:33:06 The Limits of Auto Research
00:36:36 Why Tangle, Tangent, and SimGym Compound Together
00:37:20 SimGym: Simulating Customers with Shopify's Historical Data
00:42:47 The Infra Behind SimGym
00:46:00 Why SimGym Gets Better with Real Customer History
00:47:30 Counterfactuals, HSTU, and Modeling Merchant Trajectories
00:51:55 CRPs, Clustering, and Category-Level Customer Behavior
00:53:30 UCP, Shopify Catalog, and Identity Linking
00:55:07 Liquid AI: Why Shopify Uses Non-Transformer Models
00:59:13 Real Shopify Use Cases for Liquid
01:03:00 Can Liquid Scale into a Frontier Model?
01:09:49 Hiring at Shopify: ML, Data Science, and Databases
01:10:43 Sydney at Bing: Personality Shaping and AI Character
01:13:32 Closing Thoughts
Transcript
[00:00:00] swyx: Okay.
We're here in the studio, a remote studio, with Mikhail Parakhin, CTO of Shopify. Welcome.
[00:00:08] Mikhail Parakhin: Thank you. Welcome.
[00:00:10] swyx: I don't even know if I should introduce you as CTO of Shopify. I feel like you have many identities. Uh, you led sort of the, the Bing ML team, I guess, uh, uh, or ads team. I, I don't know, I don't know, uh, you know, it's, uh, people va-variously refer you as like CEO or, or, uh, I don't know what that, that, that said previous role at Microsoft was.
[00:00:29] Mikhail Parakhin: Uh, that was... Yeah, my previous role w- at Microsoft was the-- I actually was the CEO of one of Microsoft's business units, which included, as I, you know, as we discussed, all the things that people like to laugh about, uh, including Windows and Edge and Bing and ads and everything.
[00:00:47] swyx: Yeah, yeah. What a, what a, what a wild time. You've obviously, uh, done a lot since you landed at Shopify. Uh, one of the reasons I reached out was because you started promoting more sort of internal tooling, uh, primarily Tangle, but also a lot of people have seen and adopted Tobi's QMD, uh, and obviously, I think, uh, Shopify has always been sort of leading in terms of, uh, engineering. I think more-- it's just more recent that you guys have been more vocal about your sort of AI adoption. Is that, is that true?
[00:01:16] Mikhail Parakhin: Well, I think AI tools in general are fairly recent development, uh, and we've-- Shopify, you know, at this stage of its development, we're developing AI in-in-house and other, uh, building tools that use AI and, you know, interfacing with the wider AI community, uh, you know, are on the sort of the, uh, runaway trajectory. So it just did by sort of natural byproduct. We, we talk about it more also.
We just, uh, just even yesterday, Andrej Karpathy was famously tweeting about, oh, are there some, uh, ways, uh, that, that you can organize your agents to store the data and then, uh, look up the data so that you don't have to research or, or lose context every time. And a little bit tongue in cheek, I tweeted that, “Hey, we've, we've done it much earlier, and we even have different approaches, Tobi and I.” Tobi, of course, is a big fan of QMD, and I'm more of a SQL, SQLite fan. But, uh, yeah, very similar things that we've already done here. The point is, yeah, we're very dynamic, you know, explosively growing company, and we have to be at the forefront of AI adoption, obviously.
[00:02:29] swyx: Yeah. Yeah. Um, you, your team kindly prepared some slides actually that we were gonna bring up on to, uh, the screen. I think I can, I can screen share, and then we can kind of go through some of the shocking stats that maybe, maybe put some numbers to what exactly is going on. So here we have, uh- An internal AI tool adoption chart. What are we looking at here? What?
[00:02:54] Mikhail Parakhin: Yeah, this is very interesting statistics. Uh, this is number of daily active workers, you know, think of, uh, DAU, basically the active users of-
[00:03:05] swyx: Yeah ...
[00:03:05] Mikhail Parakhin: AI tool as a percentage of all the people in the company, right? And then- Yeah ... different AI tools. And, uh, you could see two things here is that one is the green is total. Uh, green is just total. So you could see that it approaches really % by now. It's hard not to do your job now without interacting deeply, at least with one tool.
You could see another interesting thing is just as many people commented in December was the phase transition when suddenly models gotten good enough that, that everything took off and started growing. Uh, it, it was many people noticed that the thing is that small improvements accumulated into this big change in Sep- December roughly timeframe.
[00:03:52] swyx: Yeah.
[00:03:52] Mikhail Parakhin: The other thing I would claim you could see is that, uh, CLI-based tools and tools that don't require you to look at the code becoming more popular, and you could see, yeah, various versions of, uh, Claude Code and Codex and Pi and internal development tools taking off. Uh, exactly, yeah, uh, and blue is our River, just internal agent for coding, where tools, uh, that require IDEs such as, uh, GitHub Copilot or Cursor, they're not exactly shrinking, but they're not growing as fast. Like, uh, red, red line is, is the IDE kind of tools. So you could see that they're, they're not experiencing as, as fast of a growth.
[00:04:37] swyx: As I understand it, basically, every employee has their choice, right? Of choose whatever tool you use, and then you're just kind of doing a, a daily sur-survey or something.
[00:04:47] Mikhail Parakhin: Exactly. And, uh, we- Yeah ... the, the push is to get your job done, you can use any tool, and we effectively fund unlimited tokens for everybody. Uh, we, we do, we do try to control the models that, uh, people use, but from the bottom, not from top. Like we basically say, “Hey, please don't use anything less than Opus four point six.”
[00:05:09] swyx: Oh.
[00:05:10] Mikhail Parakhin: Some people, some people end up using GPT five point four extra high. Some people use Opus four point six. Um, uh, you know, uh, there are some, uh, there are plus and minuses in going for full one million context window versus not. But, uh, we try to discourage people from using anything less than that.
[00:05:28] swyx: Yeah, yeah. Got it, got it.
Uh, I mean, uh, that's, you know... The, the next chart here, it really kind of shows the expansion and the sort of December twenty twenty-five inflection, right? That, uh, people are using a lot of tokens. I think it's also really interesting that no one was kind of abusing it in twenty twenty-five. Like it was- Had comparatively, uh, to this year, there was almost no growth. I mean, it's still like, you know, probably, probably gave fifty percent.
[00:05:56] Mikhail Parakhin: Yeah. This is just a different scale. It's still exponential- Yeah, yeah ... growth at just a different- ... rate of expansion. Uh, there was inflection point, and Sean, I would claim the, the super interesting part here is that you could see that the distribution becoming more and more skewed. Yes. The top percentiles grow faster. So that means- Yeah ... the people in the top ten percentile, they, their consumption grows faster than seventy-five and so forth. So, uh, the distribution skews more and more towards the highest users, which is... I don't know what it tells me. It's like it feels not ideal, to be honest. Or maybe it's okay. We'll see.
[00:06:36] swyx: Why does it feel not ideal? Is, is it because of, um, quantity over quality, or what's the concern?
[00:06:42] Mikhail Parakhin: Because take it to the limit. That means, you know, if, if this rate of separation continued- Ah, yes ... a year, there will be one person consuming all the tokens. So it's just, it's kinda strange.
[00:06:54] swyx: Yeah, I mean, um, uh, I, I think internal like teaching and all that, uh, will, will help sort of distribute things more widely. But in, in the early days, of course, the people who are sort of more AI-pilled will obviously find more ways to use it than the people who are less AI-pilled. Maybe let's, let's call it that. I'll just, I'll just kinda quickly, uh, pause from the, the...
You know, we will go back to the rest of the slides, but I just wanna, um, review, you know, there are a lot of CTOs of, of large companies like yourself where they're all considering some kind of token budget, right? Like I think it's something, something that Jensen Huang has been talking about, where like if your 200K engineer is not using 100K of tokens every year, like they're, they're underutilizing coding agents. Of course, Jensen Huang would say that, but like it seems a very quantity over quality approach and like some, some people are basically saying like, well, is this comparable to judging engineer quality by lines of code, right? Which we also know is like kind of flawed, but better than nothing. So I, I don't know if you have like a sort of management take here on, on how to view this kind of, uh, metrics.
[00:08:02] Mikhail Parakhin: Well, I mean, you're, you're baiting me. I, I like... This is my favorite topic. Uh, if you let me, I'll probably talk for two hours on just this. I have a lot of things to say. Like I do think Jensen gotten a lot of bad press saying, “Oh, of course you're, you know, this, uh, the- ... the cake seller says you don't need enough cakes.” You know? Like, of course. Uh, but, uh, I actually, uh, think that's undeserved. I think he, he's actually right. Uh, I do think-
[00:08:33] swyx: He, he's directionally correct.
[00:08:35] Mikhail Parakhin: Yeah. Yeah. He's directionally correct for sure. Uh-
[00:08:37] swyx: Who knows what the right number is? Yeah.
[00:08:39] Mikhail Parakhin: The thing that I do, uh, want to say, and this is something that we learned through trial and error and very important is like two things. One is that it's not about just consuming tokens. Uh, you can consume tokens and, and in fact, the anti-pattern is running multiple agents, too many agents in parallel that don't communicate with each other. That's almost useless, uh, compared to just fewer agents and burns tokens very efficiently.
Uh, setting up the right critique loop, especially with the high quality models, where one agent does something, the other one, ideally with a different model, critiques it, uh, suggests ways to improve it, the agent redoes it with this critique and, and so it takes much longer. So people don't like it because latency goes up. You know, they, they have to wait until this debate is happening. But, uh, the quality of the code is much higher. And another thing, just since you mentioned like, look, uh, uh, yeah, the overall budget is just like, uh, lines of codes. Lines of codes are exploding for everybody right now, or partially because AI is really more verbose, but partially just because AI can write a lot more code, you know, doesn't get tired. And so you have to have to have a very strong narrow waist during PR review. Otherwise, just the number of bugs will go through the roof. It's, uh, it's this unexpected consequence of the just volume trumping everything. I would claim by now good model writes code on average with fewer bugs than, than the average human. But since they write so much more of it, like more of it will make it into production. So you have to-
[00:10:26] swyx: You still have more bugs.
[00:10:26] Mikhail Parakhin: Yeah. Have to have a very rigorous PR reviews, also automated of course. But, uh, yeah, that to spend a lot budget there. Like this, this for me, for me, actually, the important metric is the ratio of budget spent during code generation versus, uh, spent, uh, expensive tokens like GPT, uh, five point four Pro or, uh, uh, Deep Think from Gemini, you know, checking on PR reviews.
[00:10:55] swyx: Yeah, totally. Uh, I noticed in your chart you didn't have any review tools. Do you just use like, like let's say a Claude Code to review tools? Or do you have another set of review tools like the Greptiles, the CodeRabbits, uh, Devin Reviews has a review tool.
I don't know if you've had those specialist review tools.
[00:11:13] Mikhail Parakhin: You are a little bit jumping on my sore toe right now because the graphs I was only showing public tools. Uh, uh, the-- I haven't found a good PR review tool that, that does what I think should be done. And, uh, partially my, my thinking is because it's so... It just goes against both what people feel like emotionally they prefer and, uh, some of the, uh, you know, frankly even business models that, that the companies run. At PR review, uh, time, you want to run the largest models. That means, I don't know, Codex or, or, uh, Claude Code is not gonna cut it. You need to have pro-level models if you really want to, uh, stem the tide of bugs from going into production. And you need to spend a lot of time, the models taking turns, but you don't want, like, a big swarm of, uh, of, uh, agents. So in fact, you end up in a different dual-dualistic world where you generate not that many tokens. You, in fact, generate few tokens, but it takes f-a long time because these are expensive models taking turns rather than many, many agents trying to do many things in parallel. So that's, that's why I feel like I haven't found good tools, so we are using our own for PR review for now.
[00:12:33] swyx: Yeah. Yeah. I mean, uh, I think a lot of companies are building their own, uh, especially to their needs, right?
[00:12:38] Mikhail Parakhin: Mm-hmm.
[00:12:38] swyx: Um, I, uh, you also have a chart here going back to the slides on, uh, PR merge growth, where we're now at thirty percent, uh, month on month rather than ten percent. Uh, and also the, the estimated complexity is going up. You know, this is productivity, right? ‘Cause y- presumably there's more stuff going into the code base and more, more features getting worked on. I'm curious about the backlog, right?
I actually don't mind a pro-level model taking an hour or two hours to review my PR, because I've dealt with humans who take a week to review my PR, right? And I keep pinging them on Slack, “Hey, hey, review my PR.” So, you know, I think there's some trade-off here where it still makes sense.
[00:13:18] Mikhail Parakhin: Exactly. That's exactly my point. On one hand, you can tolerate longer latencies at PR time. On the other hand, right now the real problem is not the time spent waiting for PR review. The real problem is that, since there's so much more code, the probability of at least some tests failing is going up, and then the deployment keeps failing, and you have to find the offending PR, evict it, and retest without that PR, and so the deployment cycle becomes much longer. So actually, in terms of the overall time to deploy, it's a total time savings if you spend more time on a longer-running model, like thinking for an hour, because then you don't have to spend all that time during testing and rolling back the deployment.
[00:14:03] swyx: Yeah, totally. That's still worth it. You know, you don't look at the individual, you look at the aggregate, and at the change in the aggregate system.
[00:14:11] Mikhail Parakhin: Exactly.
[00:14:11] swyx: I'm kind of curious if this PR mentality and the CI/CD paradigm will be changed eventually. Obviously a lot of people want a new GitHub, but I even wonder if Git is the problem, right? Like, is that the bottleneck? Is the concept of a PR a bottleneck? Do you guys use stacked diffs? I don't know if that's, like, a merge queue, stacked diff type of thing.
[00:14:34] Mikhail Parakhin: We use stacks, we use Graphite. We worked with Graphite a lot. So we use stacked PRs.
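The deployment argument above can be made concrete with a small calculation. This is a sketch under a simplifying assumption that is not from the conversation: each merged PR independently breaks the test run with the same probability. The function name and the sample probabilities are made up for illustration.

```python
def p_batch_fails(p_bad: float, n_prs: int) -> float:
    """Probability that at least one of n independent PRs breaks the test run."""
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("p_bad must be a probability")
    if n_prs < 0:
        raise ValueError("n_prs must be non-negative")
    return 1.0 - (1.0 - p_bad) ** n_prs


# Hypothetical numbers: the same per-PR defect rate at two merge volumes,
# and a halved defect rate (stricter review) at the higher volume.
low_volume = p_batch_fails(0.01, 10)
high_volume = p_batch_fails(0.01, 100)
high_volume_strict_review = p_batch_fails(0.005, 100)
```

Under this model the chance of a failed deployment grows quickly with the number of PRs in a batch even when the per-PR defect rate is unchanged, which is why slower, stricter review can shorten the total time to deploy.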
I think clearly CI/CD in general, and the interaction with the code repository, is right now the main issue and the bottleneck for us, and the highest top of mind. I would say we probably need a different metaphor or a whole different design for how to process it in the new agentic world. I haven't seen anything dramatically better yet. I think everybody right now is just trying to keep their head above water, ‘cause there are so many PRs, and then everybody's CI/CD pipelines start creaking, the times are increasing, the number of bugs slipping by is increasing, and you have to clamp down. And so we are a little bit in this situation where we need to first stabilize that story and then start thinking, hey, what could a completely different and new world be? I know some people are working on it. I haven't seen anything super compelling yet, but clearly the old things that were designed for humans will need to be morphed into something new.
[00:15:53] swyx: One of the things that I think about is that the merge conflict is basically a global mutex on the whole system, right? And in human organizations, we do have something like that. It's the company standup. But other than that, it's actually fitting for us to be somewhat decentralized, somewhat plugged into one stream of information, but somewhat lossy. Like, it's okay, you know, that not every delivery has atomic consistency. We're not dealing with a database sometimes.
[00:16:27] Mikhail Parakhin: This is a very good point, because since humans don't write code too fast, you know, that global mutex is not too bad. Once you-
[00:16:36] swyx: Yes...
[00:16:37] Mikhail Parakhin: start writing code at the speed of the machine, it becomes, you know, the bottleneck. Then what do you do?
Maybe, and I can't believe I'm saying this because I'm a lifelong opponent of microservices, and I always thought that was a really bad idea. And now that you're saying it, maybe in a new guise microservices will make a comeback, you know, because then you can ship things independently in tiny pieces, and managing all that complexity automatically will be much easier. I don't know. We'll have to see.
[00:17:10] swyx: Yeah. I mean, I don't know what the Microsoft or Shopify thing is, but I read this paper from Google where they have a monorepo that deploys into microservices, right? And then the other concept that I think about a lot is the Chaos Monkey concept from Netflix. Being able to create this robust system where, you know, you have the service discovery, you have the independent microservices, and, you know, there's probably going to be a fair amount of duplication. That's how an organic system sort of scales, that you have that... I don't know what you call it. Slack? Robustness? Duplication. These are not exactly the terms I'm looking for, but I can't really think of the words. Okay. I was gonna go into Tangent and Tangle. So we sort of discussed the overall stats that Shopify has. But, you know, I think some pretty cool stuff that you guys are working on is your ML experimentation and your sort of auto-research training pipeline. Presumably you're much closer to this one because it's a sort of personal hobby of yours. How would you explain them together? I thought we had a slide that has the system diagram.
[00:18:24] Mikhail Parakhin: Yeah. Tangle first and then Tangent as a-
[00:18:27] swyx: Yeah...
[00:18:28] Mikhail Parakhin: as a thing on top of Tangle.
And Tangle is the third generation, I claim, of systems for running any data processing, with a bit of a skew toward ML experiments, but not necessarily. Any sort of data processing task where you need to iterate and share, and you have scale, so you want maximum efficiency. You know how you would normally work: imagine you're a data scientist or an ML practitioner, you would get Jupyter notebooks, or maybe your Python scripts, and you would manage the data, and you'd produce those TSV files, and you'd put them in some JFS or something. Then you would notice that, oh, it has these weird missing values. You go and write another script that goes and replaces them with-
[00:19:20] swyx: Ah...
[00:19:21] Mikhail Parakhin: dashes. And then, “Oh, I need to filter bots.” And so you run some LightGBM model that removes the bots. And then you kind of get it into shape, and then you start experimenting, and you run multiple experiments, and then you're like, “Oh my God, this experiment is worse.” You undo, and you cannot get back to the previous result. And you're like, “Ah, what did I do?” Then you finally get everything working. Then you start throwing it over the fence to production. You replicate it, those things don't work, and then sometimes you don't notice that you forgot some feature naming and the features don't match. But then imagine you did everything, and six months later you have to repeat it because now there's more data, or you wanted to do another pass, and you're like, “What did I do?” Or, “This script crashes now,” or, “The path has changed.” And then you spend another month just doing digital archeology on your own, you know, history, right? Now multiply that by many, many teams.
Now imagine you got an intern that you wanna ramp up. Now you have to show that intern, “Oh, you know, look, here's the folder, there are the scripts, you know, ask your Claude agent to figure it out.” And then the Claude agent does something, and then you're like, “Ah, yeah, right, it was the wrong folder. I forgot to tell you, I actually have this other thing I forgot myself.” And that's the daily life we all know if you're a data scientist, a machine learning practitioner, or even any data-managing person.
[00:21:00] swyx: Yeah. So I used to do this on the quant finance side, in my hedge fund. So we did this before Airflow, and then obviously Airflow came along, and then more recently Dagster, I would say, is in my mind what I would use for that shape of problem, where you have to materialize assets and create a pipeline.
[00:21:19] Mikhail Parakhin: And that's a very good segue, because... So Airflow is great, but Airflow is more about when you have something and you wanna repeatedly run it in production on a schedule. It's less about you as a team developing things and being able to share, and you grabbing the standard pipeline and saying, “Hey, I wanna change this tiny little component in the huge sea of data processing, and I wanna run ten experiments on this, and I wanna do hyperparameter optimization.” All that is very hard to do with Airflow. It's very easy to do with Tangle. Tangle is everything about a group of people running experiments, and it might be agents too nowadays. Running experiments cheaply, collaborating, sharing results. You don't need to understand it fully. You clone somebody else's experiment or somebody else's pipeline, change a small piece, run it, get it to production state, and then ship in one click. So then the...
You don't have to port it into any other system to run in production. You can just run the same experiment. It's fully production ready. And again, as I said, it's a third-generation system. The original one, I would claim, was Ether; at least in my career, Ether was the first that pioneered this type of approach. And then there was Nirvana, at Yandex, which did kind of a second take on this. And now this one aggregates the learnings from all of those, and Airflow as well, to get to the state where, when you try it, it feels kind of magical. ‘Cause now everything is based on content hashes. So even if the version changed, if the output didn't change, nothing is rerun. It's very efficient. If multiple people start experiments that need the same sort of data preprocessing, it's not repeated multiple times. It's automatically done only once. If you start ten experiments that all require, you know, some data preparation as the first step, you don't have to coordinate for that. Like, you don't have to know that other people are starting it. Now, it's very easy composability, any language you wanna use, and it's very visual. So you can see it immediately, you can edit it easily, you can assemble small things with just mouse clicks if you want to, and share, clone. And it's also fully kind of static in the sense that if we rerun it a second time, it will have exactly the same results. You will never have to do digital archeology. So full versioning and everything is also there.
[00:24:06] swyx: So people can... It's open source. Go to the GitHub repo and check it out. And there is also a really good blog post about it. I think all of this is really appealing.
The thing that I think sells me the most about it is that sort of development-to-production transition, right? Which I think a lot of people haven't really solved, strictly, right? Like, we develop really, really well in Python notebooks, but then, you know, that's obviously not a sort of production-ready process. I think any way in which that is solved is very appealing. Then the other thing that you mentioned, which also raised my eyebrows, was content-based caching, which you mentioned is very much a sort of efficiency measure about, you know, recalculating only based on sort of content addressing, which I think makes sense. It surprised me that the savings could be this much, but maybe I just haven't worked at your scale, where there's so much duplication that people just rerun because they change a single ID upstream.
[00:25:10] Mikhail Parakhin: It does, yeah. But it's not only that you rerun. The main savings come from the fact that you ran it, you got your job done, and you moved on. Then somebody else in some department you didn't know existed runs the same task, but on a newer version.
[00:25:27] swyx: Yeah.
[00:25:27] Mikhail Parakhin: Right now, in most organizations, you can't even find out about it, so you can't even measure that you're spending that time twice, right? Here, if everybody's on Tangle, that's detected automatically, and it's detected that the output is the same. And then for that person, all it looks like is the experiment just suddenly jumped forward, right? So that's because there's a network effect of multiple people helping each other.
[00:25:51] swyx: Yeah.
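The content-hash idea discussed above can be sketched as a toy cache. This is only an illustration of content addressing in general, not Tangle's actual design or API: the `ContentCache` class, the component labels, and the helper functions are all hypothetical.

```python
import hashlib
from typing import Callable, Dict, List


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ContentCache:
    """Toy content-addressed step cache (illustrative only)."""

    def __init__(self) -> None:
        self.outputs: Dict[str, bytes] = {}  # step key -> output bytes
        self.runs = 0                        # how many times a step really executed

    def run(self, component: str, fn: Callable[[List[bytes]], bytes],
            inputs: List[bytes]) -> bytes:
        # Key on the component identity plus the *content* of its inputs,
        # not on who launched the step or which upstream version produced them.
        key = digest("|".join([component] + [digest(i) for i in inputs]).encode())
        if key not in self.outputs:
            self.runs += 1
            self.outputs[key] = fn(inputs)
        return self.outputs[key]


def clean_v1(inputs: List[bytes]) -> bytes:
    return inputs[0].replace(b",\n", b",0\n")


def clean_v2(inputs: List[bytes]) -> bytes:
    # A rewritten cleaner that happens to produce byte-identical output.
    lines = inputs[0].split(b"\n")
    return b"\n".join(ln + b"0" if ln.endswith(b",") else ln for ln in lines)


def count_rows(inputs: List[bytes]) -> bytes:
    return str(inputs[0].count(b"\n")).encode()


cache = ContentCache()
raw = b"a,1\nb,\nc,3\n"
cleaned_a = cache.run("clean@v1", clean_v1, [raw])
cleaned_b = cache.run("clean@v2", clean_v2, [raw])             # new version, so it runs
rows_a = cache.run("count_rows@v1", count_rows, [cleaned_a])
rows_b = cache.run("count_rows@v1", count_rows, [cleaned_b])   # same input content: cache hit
```

Because the downstream step is keyed on the bytes it receives, the second experiment skips it even though its upstream component changed version, which matches the "experiment just suddenly jumped forward" experience described above.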
This is one of those things where it's designed to be a platform from the beginning rather than an individual developer's tool from the beginning, right? And everything sort of streams down from there. That is the sort of Tangle orchestrator, and it manages jobs. We've seen a few versions of this, and this is obviously the sort of unique approach that you guys have figured out. And then there's Tangent.
[00:26:14] Mikhail Parakhin: Yeah. And Tangent is basically an automatic auto-research loop that can help and kind of do your work for you. You know, effectively, Andrej Karpathy recently popularized it with auto research. Remember he said he was speed running this... Yeah, you know the story. Here we're basically bringing the same capability into Tangle. Tangent is just an agent that can run multiple experiments, figure out what can be changed, and keep on rerunning, keep on modifying, maximizing some goal, some loss function, whatever you need to achieve. And in general, I would say if you're not using an auto-research-like approach in whatever you do, like literally whatever you do, then you're missing out. We saw that at Shopify taking off like a wildfire; anything where you can put measurements can be done dramatically better. Our-
[00:27:19] swyx: Mm-hmm...
[00:27:20] Mikhail Parakhin: speed of HTML templatization, completely new UX templatization, reducing latency for Liquid themes. Our search: recently we moved from, it's hard to even quote, from eight hundred QPS to forty-two hundred QPS with the same quality, just by pure optimizations and an auto-research loop that kept running and changing code in our index serve, on the same number of machines, just increasing the throughput. We managed to improve the quality of gisting, a machine learning process.
You know, gisting is the prompt compression technique that allows for lower latency and actually slightly higher quality. So, literally, different walks of life, and it doesn't have to be AI related. We had a reduction in storage because the agents would go and find data sets that are clearly derivative, and then you don't need to store things twice. You know, we found, somewhat embarrassingly, that one of the largest tables was hashing random IDs into another random ID, and we could literally put only one. So it was translating, yeah, two random IDs hashed into each other.
[00:28:37] swyx: So it has access to the code as well, so it can check, like, what the hell is it doing?
[00:28:42] Mikhail Parakhin: So it could be run at two levels. You know, at the superficial level, it could just use existing components and reshuffle them. You know, you can grab XGBoost, and you can grab some PyTorch module, and then grab some other tools and combine them. At a deeper level, since Tangle is all sort of CLI based underneath, every component is really a wrapped CLI call and a YAML file, so it can analyze code and create new components and keep on iterating as well. So you can both have quick modifications of existing pipelines with components that are already there, pre-baked, or you can create new components and-
[00:29:29] swyx: Yeah...
[00:29:29] Mikhail Parakhin: keep iterating on those. So auto research is, again, probably the thing I was most excited about happening in the last two months, and we see it taking off totally like a wildfire. Just, everybody, every day, every...
well, every day, every minute, I would have somebody Slack me a message saying, “Oh, look how much better I made it.” And it's all through auto research.
[00:29:53] swyx: Is this democratized in some way, in the sense that, is it your ML engineers and researchers doing this, or do your regular PMs and software engineers also have the ability to use Tangent?
[00:30:07] Mikhail Parakhin: This is an awesome question. Tangle in general and Tangent in particular are extremely democratizing. They are the main tools for-
[00:30:15] swyx: ‘Cause I don't need the details.
[00:30:16] Mikhail Parakhin: Yeah. Exactly. Initially they were used by ML and AI engineers, but then literally, as you said, PMs... The highest user right now is one of the PMs in our org, Sartak, and he was number one by usage of this, ‘cause they're just energetic and knowledgeable, and now it unlocks a lot of capability where you don't have to change code manually.
[00:30:39] swyx: I mean, because it kind of cuts out the ML engineer from the process, because the PMs have the domain knowledge and the ability to think from first principles about, okay, what results do I want? And they even have access to the data that needs to go in. So in some ways, this is the magic black box that we've always wanted for training and for, I guess, hill climbing, whatever.
[00:31:04] Mikhail Parakhin: It's basically Claude Code for your AI development situation, right? Now you don't have to know exactly how the algorithms work. You can just bring your domain knowledge and expertise and product knowledge and iterate within Tangent until you've gotten the results that you need.
[00:31:21] swyx: In my previous roles, every time that someone has pitched AutoML, you know, I've always been like, “Uh, this is not gonna work.
It's, you know, it's always gonna be a flop.” Somehow it's working now. I mean, presumably the answer is that now we have LLMs and they're good enough, right? It's an emergent property that we can do auto research, but it doesn't feel that satisfying. How come we didn't do this before, right? Like, we just did parameter search and, I don't know, maybe that's it.
[00:31:48] Mikhail Parakhin: Yeah. Bayesian optimization and hyperparameter optimization was the one facet of AutoML that was used very actively, which incidentally is also built into Tangle. But, you know, I know Patrice Simard very well, and he was such a proponent of AutoML, and he literally spent a career trying to democratize it. Without LLMs, it just turned out to be very hard. You would have flexibility within a certain narrow domain, but it was hard to scale wider, and now with LLMs it's suddenly like a magic wand, and so suddenly everybody is an AutoML expert.
[00:32:28] swyx: Yeah, I think it's multiple things, right? I'm just gonna bring up the chart again, right? LLMs can do the monitoring very well. That is potentially very unbounded, super unstructured. They can do the analysis very well, they can do the... Basically, it is much more intelligence poured into every single step. There's maybe nothing structurally changed about AutoML, but this is just more intelligent and more unstructured.
[00:32:53] Mikhail Parakhin: Exactly.
[00:32:54] swyx: Any flaws that you've run into? Like, everyone is drinking the Kool-Aid: oh my God, time savings, you know, performance improvements. What issues have come up?
[00:33:06] Mikhail Parakhin: This is really cool. It's not a solution to all the world's problems for sure.
The limitations are usually the ones I... And this is where we get into a bit of subjective territory. I can only share what I've seen so far, and I'm sure the situation is changing, and, you know, maybe after I say it, many people will reach out and say, “Hey, what about this? You don't know that,” and then they'll probably be right. But what I've seen is that auto research is very good at doing kind of obvious things that you don't have the bandwidth to do, or didn't notice, or maybe you're not aware of some standard practices. It is not good at doing something completely out of distribution, something where, you know, you have to think for multiple days and do something like none of this. So I set up an experiment once, on my sort of hobby thing, and I let it run; it ended up being a several-week run, you know, it's full production kind of scale, so, you know, slow runs, and in the end it performed over four hundred experiments, and only one was successful. I'm like, “Okay, that's good.” But-
[00:34:18] swyx: But it saved time.
[00:34:19] Mikhail Parakhin: Yeah, it saved time. It was that thing. Yeah, if I were doing four hundred experiments myself, my batting average, as I said, would have been much higher, I'm sure. But also, first of all, it would take me like three years to do four hundred experiments. And I didn't have to do them. The machines just did them for the price of electricity. So I got one improvement. Honestly, when I was starting that experiment, my thinking was to go and show that, “Hey, Andrej, maybe you just don't know how to optimize.” And I thought I was super smart, because my problem had been optimized for many years, and it was fully improved. And I didn't expect, you know, auto research to find anything at all. Yet it did.
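The propose, run, and keep-if-better pattern behind the four-hundred-experiment story can be sketched as a simple greedy loop. This is a toy under loud assumptions: in the systems discussed, the proposer is an LLM agent editing code or pipelines, whereas here it is a random perturbation of two made-up parameters, and the metric is an invented function. None of the names come from Tangent or any auto-research project.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def auto_research(
    baseline: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    budget: int,
    seed: int = 0,
) -> Tuple[Config, float, int]:
    """Greedy propose/evaluate loop: keep a change only if the metric improves."""
    rng = random.Random(seed)
    best, best_score = dict(baseline), evaluate(baseline)
    wins = 0
    for _ in range(budget):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score > best_score:  # most experiments fail and are simply discarded
            best, best_score, wins = candidate, score, wins + 1
    return best, best_score, wins


def toy_propose(cfg: Config, rng: random.Random) -> Config:
    # Stand-in for an agent: nudge one parameter at random.
    key = rng.choice(sorted(cfg))
    changed = dict(cfg)
    changed[key] = cfg[key] + rng.uniform(-0.5, 0.5)
    return changed


def toy_evaluate(cfg: Config) -> float:
    # Invented metric with its optimum at lr=1.0, dropout=0.2 (higher is better).
    return -((cfg["lr"] - 1.0) ** 2) - (cfg["dropout"] - 0.2) ** 2


baseline = {"lr": 0.0, "dropout": 0.0}
best, best_score, wins = auto_research(baseline, toy_propose, toy_evaluate, budget=200)
```

The loop never accepts a regression, so a low hit rate costs only compute; the closer the starting point is to optimal, the fewer proposals win.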
So instead of making fun of Andrej, I ended up a big supporter. Yeah, that's exactly the tweet. Yes.
[00:35:10] swyx: You and Toby really go back and forth online a lot, which is really funny. Think of it as an eval for the optimality of the code it's running on. It almost reminds me of a Kolmogorov complexity thing, but I guess there's some optimal thing that you're trying to sort of reduce down to. And so, you know, you should congratulate yourself that you had, you know, ninety-nine percent optimality.
[00:35:36] Mikhail Parakhin: Exactly, yeah. I think Andrej really deserves a lot of credit for popularizing this approach. This is, I think, incredibly powerful and cool. And, you know, even him just mentioning it led to a lot of gains in a lot of places in the industry, so we should be thankful.
[00:35:56] swyx: Yeah. I think he also has just... I don't know what it is. You know, it is a simple, self-contained project that people can take and apply to other things, which is one thing, but also just the name. Somehow no one managed to call their thing auto research. Naming things is very important. I think that is mostly our coverage of Tangle and Tangent. I think obviously, you know, there's a lot of ML infra at Shopify that people can dive into. We're about to go into SimGym, but before I do that, any other sort of broader comments around this whole effort? Like, where is it leading to?
[00:36:36] Mikhail Parakhin: As a segue to SimGym, all those things start composing strongly. And you can see a huge unlock. You can look at each one of the tools and see, oh, they're extremely useful. Tangle is useful by itself. Auto research is useful by itself. SimGym is useful by itself.
If you combine all three, you create a synergistic effect. I think that's why we even wanted to cover them today, because this is something that, if you go back even, you know, five years, would've been unthinkable. Replicating that would be either incredibly costly or impossible, right? Probably thousands of people would be required.
[00:37:20] swyx: Well, we have serverless intelligence, right? So yes, you do have thousands of intelligences, just not humans. And that's close enough, right? Even if they're not AGI, they're close enough to do the task that you need them to do. And, you know, that's plenty for a lot of routine work, knowledge work. Okay, let's get into SimGym. This is one of those things where I was surprised to see it's apparently one of your most popular launches. And I think Sim AI, I think Joon Sung Park, who did the Smallville thing... there's a very small cottage industry of people trying to do the simulated-customer thing. I think a lot of people maybe don't super trust this yet because they're like, well, obviously they would just do what you prompt them to do, right? But maybe just tell us about the sort of inspiration or origin story.
[00:38:10] Mikhail Parakhin: That's actually exactly the thing I wanted to cover, because if you don't have the historical data, all you can do is prompt agents in a vacuum, and they will do exactly what you prompt them to do. In fact, when I first proposed it, and this is a bit of my brainchild initially, if I can boast, even Toby said, “But wouldn't they just repeat what you tell them?” And I'm like, “Yes, except Shopify has decades of history of how people made changes and what they resulted in in terms of sales.” So now what we can do is we can... we have this...
It's noisy data. These are usually small websites, you know, things are never in isolation. It's almost never an AB experiment. It's always an AA experiment, which has two meanings, but basically, you know, you run two different things at different times. But if you aggregate everything together and apply a denoising, collaborative-filtering-like approach, you can extract a very clear signal. And then you can optimize your agents. And that's why it took so long. It took almost a year of that optimization, of just us sitting and fiddling, and we had this internal goal: to hit zero point seven correlation with add-to-cart events, for example. So that if we run a real AB test experiment, it should go and replicate the same sort of success that humans had, or lack thereof. And it took forever, and I don't think that's easily replicable, because who else would have that data? You have to have this historic, you know, decades' worth of data. And now the other thing you need is infrastructure and scale, right? Because, again, what we found is that to get stat sig results, you need to run a lot of simulations, a lot of agents, and those are expensive things. You're taking actions in the browser because you want real friction. You want to be able to get the image of what humans will see, because you wanna detect effects like, “Hey, if I make my images larger, will I have more sales or fewer sales?” And usually people's intuition here, by the way, is that if I increase my images, I will have more, because they look nicer. You know, designers all love sparse layouts and big images. Usually your sales tank, right? But, you know, from the HTML, all the characters look the same; only the size tag looks different, right? So it's very hard.
So you have to take visual information, you have to run this in a simulated browser environment on a big farm, and of course you have to have a very, very expensive, good multimodal model. So all this is what's taken so long. And to share my personal fail a little bit there, Shawn: you know, we always had this large-company bias. Whenever we do something, we're like, “Hey, we'll run an experiment,” right? We make a change, and we run an experiment and then see which one's better, or like, “No, this is worse,” and most of them are worse, so you discard them and keep iterating, hill climbing. And we're like, “Oh, smaller merchants, they cannot get stat sig results. They cannot really run experiments simply because, you know, in a week there would not be enough data for them.” So we thought about it from this perspective. What we didn't realize is that most people don't have A and B; they just have one thing, and they need suggestions of what A and B should be. So we first built this: hey, we run the simulation on two separate themes and say, “Hey, which one is better?” We then morphed it, and very recently just released it, so that when you have just your site, your theme, we run over it and we say, “Hey, here's what the predicted values of conversions are, and here's how we think you should modify it to increase your conversions.” And then, circling back to what you started with, the proof is in the pudding. If we are not correlating with reality, people will not be using it. And thankfully, we see literally every day more users than the previous day. So right now... It's working. Yeah.
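The calibration goal mentioned earlier, a zero point seven correlation between simulated and real add-to-cart outcomes, amounts to a correlation check across past experiments. The sketch below computes a plain Pearson correlation; the data values and variable names are invented for illustration, and the conversation does not say which correlation measure was actually used.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-experiment changes in add-to-cart rate:
# what the simulated shoppers predicted vs. what real shoppers did.
simulated_lift = [0.02, -0.01, 0.05, 0.00, -0.03, 0.04]
observed_lift = [0.03, 0.00, 0.03, -0.01, -0.02, 0.06]

r = pearson(simulated_lift, observed_lift)
meets_target = r >= 0.7  # the internal goal quoted in the conversation
```

A check like this only becomes meaningful with many historical experiments to compare against, which is the data advantage described above.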
Right now my problem is how to pay for it all, because... so our major thing is how to optimize the LLMs, do distillation, how to run the headless browsers and headful browsers cheaper, so that we can accommodate the increase in traffic.
[00:42:47] swyx: Yeah. I understand that you published a lot of technical detail at GTC, so I was just gonna bring it up a little bit. Was this in conjunction with some kind of GTC presentation? Or something like that, right?
[00:42:59] Mikhail Parakhin: Well, yeah, we did it in several places, but yeah, we had the engineering blog as well. Yeah.
[00:43:05] swyx: Yeah. So you're running GPT OSS.
[00:43:08] Mikhail Parakhin: This is an older version. You know, now we run a multimodal model. But yeah, we still run GPT OSS as well for-
[00:43:15] swyx: And then you have the VMs, and you also have Browserbase. I really like this one where you said, “It violates almost every assumption that standard LLM serving is designed for.” And then you had, basically, orders-of-magnitude differences between everything.
[00:43:29] Mikhail Parakhin: Exactly. Which was, you know, a bit of a challenge to implement, even simple things. Since it violates all the assumptions, for example, multi-instance GPUs, MIGs, don't work as well. But we needed to get MIG to work, ‘cause otherwise it's way too expensive. And so we had to deal with lots of infrastructure and work with Fireworks and CentML, you know, to help with optimizations, and Browserbase, as you mentioned. Yeah, it takes a village.
[00:44:04] swyx: Okay. So there's a lot of, I guess, experimentation in the infrastructure so far, and you've published more or less what you have here. I guess I'm less familiar with CentML.
I don't do that much work in this part of the stack. But why was it the preferred inference platform?

[00:44:22] Mikhail Parakhin: There are really three probably top companies, or there used to be three top companies, at least that I was aware of, that did LLM optimization: Together, Fireworks, and CentML, not necessarily in that order. CentML recently got acquired by NVIDIA. What they did is, if you have a model and you want to optimize it for a specific profile of usage, they would go and do it.

And we work with those companies. This was work particularly with CentML and NVIDIA to get the best possible results out of it. And sometimes you have to retune, depending: sometimes you want the maximum throughput, sometimes you want minimal latency, sometimes you want the cheapest, right? Or some combination. So these are the people who would come and help you.

[00:45:14] swyx: I see. Yeah, I'm familiar with these people for the LLM, you know, autoregressive stack. But the other interesting category of these optimizers is the diffusion people, like fal, and Pruna has come up a lot recently as well, which I think is really underappreciated, at least by myself, because I thought all the workload would be LLMs, but actually there's a lot of diffusion as well.

[00:45:38] Mikhail Parakhin: Exactly.

[00:45:38] swyx: There's a lot here, so it's hard to cover. But I do think people underappreciate the importance of customer simulation, basically. I think this is something that I'm candidly still coming to terms with.
Your team also prepared this really nice diagram. I assume this is AI generated.

[00:46:00] Mikhail Parakhin: Yeah, it looks-

[00:46:01] swyx: Maybe it's not.

[00:46:01] Mikhail Parakhin: Yeah, it looks Gemini-ish. Honestly, I don't know where the hell they generated it. It looks like it's Google. But the interesting part, Shawn, that we haven't covered but I wanted to mention, is that if your store had previous customers, rather than being a new store where you're a new merchant just launching things, it helps tremendously with correlation and forecasting.

We take your previous customers' behavior, and we create agents that replicate the specific distribution of customers that you get. Then we apply those to your changes, and that raises the correlation with add-to-cart events, or with conversion, or whatever it may be, quite dramatically.

So replicating humans in general seems like an interesting, cool challenge.

[00:46:58] swyx: As a shareholder, I think this is the... If people are Shopify shareholders, they should really deeply understand this, because this is basically the moat. The more you use Shopify, the more it will just automatically improve, right? You're doing the job for them.

[00:47:13] Mikhail Parakhin: Yeah, that's what we started with. Otherwise, if you're just a startup, I wouldn't do it, if it was my startup, because without the data, as you said, it's exactly the case that whatever you say in the prompt, that's what the agents will be doing.

[00:47:30] swyx: The statistician in me wants to really satisfy the statistical intuition, I guess. To me, the word that comes to mind is ergodicity.
So let's say a customer takes this path, a customer takes this path, a customer takes this path, right? In my mind, the way I explain it is: okay, here's the ninety-fifth percentile, here's the fifth percentile, and here's the median, right?

But to me, what SimGym is potentially doing is that it can model the in-between journeys as well, which maybe are dependent on the previous states. This may be a very RL-type conclusion, where if you only did naive A/B testing, you only have the statistics at a certain point, and you only judge based on the overall summary statistics. But here you can actually model trajectories. Does that make sense? Or-

[00:48:31] Mikhail Parakhin: That makes total sense. It makes even more sense than maybe even you realize, because-

[00:48:38] swyx: Okay. Please, please.

[00:48:38] Mikhail Parakhin: Yes, we do... So internally we have this system; we talked about it briefly once at NeurIPS. We have a huge HSTU-based system that models whole companies and their possible paths. And what you are showing, actually, at any point in time you can either model the user's behavior, or you can also think about the whole merchant as a company, as an entity that acts in the world. You can model that as well.

And then you can do counterfactuals. In your graph, in your blue graph, imagine that somewhere in the middle you would have an intervention: I give that person a coupon, or, I don't know, I send a personal thank-you card, or give a discount somewhere. And then you can do forward rollouts from that counterfactual. So what would have happened with that intervention or without the intervention?
And you can even change where in time that intervention happens, right? Somewhere in this journey. So we do this at Shopify scale for our merchants, and then if we notice something that they can be fixing, where there's a strong counterfactual, like we have Shopify policy, they basically get a notification like, “Hey, we think something is wrong with your,” I don't know, “Canadian sales. It looks like it's misconfigured. Here's what you need to do.” Or, “We think you have to set up this campaign with these parameters.” And we do that at the buyer level, to literally offer discounts or cashback or other things to buyers.

So I'm getting very excited. This is my area of interest, I guess, and hobby. But being able to model something as complex as human beings or companies, and model counterfactuals on it, where you can have interventions in the future and optimize when to make an intervention and what kind of intervention to make, is such an unlock that previously was completely impossible. It was always dreamed of, but never... How would you even simulate it without LLMs or HSTUs? I think these are very, very exciting times.

[00:50:59] swyx: I just wanted to maybe illustrate this. I'm not the best illustrator, but I am a conceptual statistics guy. And you cannot just do this. This is a dimensionality that an A/B test doesn't do, right? Because it doesn't have the change over time, the stochastic nature, and it doesn't have the contextual... Here's all the context up to this point. Okay, cool. That's SimGym.

You're going to burn a lot of tokens on this thing. But you're one of the only scale platforms in the world that can do this across a huge variety of workloads, right?
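Editor's sketch: the counterfactual "forward rollout" idea described above can be illustrated with a toy simulation. This is not Shopify's HSTU-based system or SimGym. The buyer states, the transition probabilities, and the names `rollout` and `conversion_rate` are all hypothetical, chosen only to show how one might compare journeys with and without an intervention, and how the timing of the intervention can be varied.

```python
import random

# Hypothetical per-step transition probabilities for a toy buyer journey.
P_BROWSE_TO_CART = 0.30
P_BROWSE_LEAVE = 0.20
P_CART_TO_PURCHASE = 0.25
P_CART_LEAVE = 0.15


def rollout(steps, intervene_at=None, lift=0.15, seed=0):
    """Simulate one journey; return True if it ends in a purchase.

    From step `intervene_at` onward, the chance of moving forward is
    raised by `lift` (a stand-in for a coupon or discount).
    """
    rng = random.Random(seed)
    state = "browse"
    for t in range(steps):
        if state in ("purchased", "left"):
            break
        bonus = lift if intervene_at is not None and t >= intervene_at else 0.0
        u = rng.random()
        if state == "browse":
            if u < P_BROWSE_TO_CART + bonus:
                state = "cart"
            elif u >= 1.0 - P_BROWSE_LEAVE:
                state = "left"
        else:  # state == "cart"
            if u < P_CART_TO_PURCHASE + bonus:
                state = "purchased"
            elif u >= 1.0 - P_CART_LEAVE:
                state = "left"
    return state == "purchased"


def conversion_rate(n, steps=10, intervene_at=None):
    """Average purchase rate over n rollouts using seeds 0..n-1."""
    return sum(rollout(steps, intervene_at, seed=s) for s in range(n)) / n


if __name__ == "__main__":
    baseline = conversion_rate(5000)
    for t in (0, 3, 6):
        treated = conversion_rate(5000, intervene_at=t)
        print(f"intervene at step {t}: uplift = {treated - baseline:+.3f}")
```

Both arms reuse the same seeds, so each simulated buyer sees the same random draws with and without the intervention. That pairing is a simplification that keeps the comparison from being dominated by sampling noise.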
I'm even curious on a human research level: does retail behave differently from clothing sales? Does that behave differently from electronics sales? I don't know what else you guys... The Kardashian shoppers, do they differ from people who buy, I don't know, cars and whatever?

[00:51:55] Mikhail Parakhin: Well, very different, with different sensitivities and different modes of shopping and different levels of what's important. Totally, you can do aggregations at a store level. You can do aggregations at a category level. For the statisticians among us, I couldn't believe it, but recently we were looking at it, and we had to bring back CRPs, you know, the Chinese restaurant process. It's a way of aggregating that naturally grows clusters. Specifically to answer questions like the one you were just posing, on whether buyers behave differently across categories. And I'm like, “I haven't seen CRPs since two thousand and one.”

[00:52:37] swyx: What? What is... No, I haven't seen this. This is not in my training.

[00:52:44] Mikhail Parakhin: But yeah, it was a very popular kind of theory in NeurIPS and ICML circles in the early two thousands, kind of nice. And now it has practical applications that we are resurrecting.

[00:53:03] swyx: Yeah, amazing. I can see how this is a fun job for you, where you get to apply all these things. So super cool. So anyone who knows what CRPs are and has always wanted to use them at work should definitely join Shopify. Okay, so we have a lot, but I'm being mindful of the time.
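Editor's sketch: for readers who, like swyx, have not met the Chinese restaurant process mentioned above, here is a minimal sampler for the textbook version. Each new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to a concentration parameter `alpha`, so the number of clusters grows with the data. How Shopify applies CRPs to buyer categories is not described in the conversation; the function name and parameters here are illustrative assumptions.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=None):
    """Sample table assignments from a Chinese restaurant process.

    Returns (assignments, table_sizes), where assignments[i] is the
    table index of customer i and table_sizes[k] is the size of table k.
    """
    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for _ in range(n_customers):
        # Existing tables are weighted by size; a new table by alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if table == len(table_sizes):
            table_sizes.append(1)  # open a new table (new cluster)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes


if __name__ == "__main__":
    assignments, sizes = chinese_restaurant_process(200, alpha=2.0, seed=1)
    print(f"{len(sizes)} clusters, sizes: {sorted(sizes, reverse=True)}")
```

Larger values of `alpha` tend to produce more, smaller clusters. The "rich get richer" weighting is what lets clusters form without fixing their number in advance.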
I do want to cover some other things. I'll give you a choice: UCP or Liquid?

[00:53:30] Mikhail Parakhin: Liquid. I think UCP is very important for us, but on UCP we have structured discussions, and you can read about them, and we have blog posts, and we have a big release this week, in fact, with our catalog.

[00:53:46] swyx: Oh, okay.

[00:53:46] Mikhail Parakhin: Yeah, but-

[00:53:46] swyx: I mean, we can discuss the release briefly, because we'll release this after it's already announced, so whatever. There's a catalog that you guys are doing?

[00:53:55] Mikhail Parakhin: Yeah. So we are bringing in the capabilities of the whole Shopify catalog. Basically, you can now search for products, you can do lookups by specific ID, and you can do bulk lookups when you need to bring in multiple products. You don't need to know in advance what you're trying to show or to sell or check out. You can now have this decided at runtime. This is a big area of investment for us, for both non-personalized and personalized searches, trying to provide basically a window into the whole universe of products that are being sold everywhere in the world. And Shopify is really, not exactly, but almost a superset of anything being sold. Now we are bringing it into UCP. And identity linking is another big thing for us, so that you can use Google or whatever identity you have, thereby minimizing friction.

[00:54:56] swyx: Yeah.

[00:54:57] Mikhail Parakhin: So yeah, big release for us. But Liquid AI, of course, we never talked about, and it probably might be more aligned with what we discussed previously on this chat.

[00:55:07] swyx: Sure. The main thing that everyone understands about Liquid is that it is inspired by worms, and I still don't know why.
I'm curious about your explanation. I think you can make things very approachable. And also, what is the potential level of efficiency that you get out of Liquid?

[00:55:23] Mikhail Parakhin: We're all familiar with transformer architectures. And for the longest time there was a competing architecture called state space models, SSMs. You know, Chris Ré is one of the pioneers, and lots of startups are trying to make those a reality. They have significant benefits, the main ones being that they are much faster, have a lower footprint, and are not quadratic in length; they are linear in your context length. But state space models never quite made it. They have certain niches where they thrive, and their hybrid architectures are useful, but they never quite made it.

And liquid neural networks, you can think of them as a next step, sort of state space models squared. It's a non-transformer architecture that's more complicated than state space and really difficult to code, if I'm being honest. But it's very efficient. It's subquadratic in the length of your context. It's a very compact way to represent things. And that's the Liquid AI company: their goal is to productize it. Very often you have this need, where you need a long context and a small model, and you want low latency. In general, it's basically on par with transformers, and if you do hybrids with transformers, it's even better.

That's why we at Shopify, when we tried multiple models from multiple companies, and we constantly do, found that for small models, particularly in low-latency applications, and/or if you need longer context lengths, Liquid was the best.
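Editor's sketch: the "linear in your context length" property of state space models comes from carrying a fixed-size hidden state through the sequence, instead of comparing every token with every other token. Below is a minimal diagonal linear recurrence of that general shape. It is a generic illustration, not Liquid AI's architecture or any production SSM, and the function name and parameters are assumptions made for the example.

```python
def diagonal_ssm(xs, a, b, c):
    """Run a diagonal linear state space recurrence over a sequence.

    h_t[i] = a[i] * h_{t-1}[i] + b[i] * x_t
    y_t    = sum_i c[i] * h_t[i]

    Work per step depends only on the state size len(a), so total cost
    grows linearly with len(xs).
    """
    h = [0.0] * len(a)
    ys = []
    for x in xs:
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


if __name__ == "__main__":
    # A single state with decay 0.5 turns an impulse into a fading echo.
    print(diagonal_ssm([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
    # prints [1.0, 0.5, 0.25]
```

The state `h` stays the same size however long the input is, which is also why the memory footprint is small. Self-attention, by contrast, computes a score for every pair of positions.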
And so we still use the whole zoo and always like obviously test and use everything, uh, every open source model and, you know, it feels l
My conversation with Andrea starts at about 41 minutes into today's show, after headlines and clips. Subscribe and watch interviews LIVE on YOUTUBE.com/StandUpWithPete and on Substack (StandUpWithPete). Stand Up is a daily podcast. I book, host, edit, post and promote new episodes with brilliant guests every day. This show is ad free and fully supported by listeners like you! Please subscribe now for as little as $5 and gain access to a community of over 750 awesome, curious, kind, funny, brilliant, generous souls. Andrea Jones-Rooy, Ph.D., is a data and social scientist, science educator, standup comedian, and circus performer. They are a professor and the Director of Undergraduate Studies at the NYU Center for Data Science, where they teach the flagship undergraduate course, Data Science for Everyone, as well as advanced courses on Natural Language Processing. Andrea is also a research consultant and keynote speaker for global Fortune 500 and tech companies of all sizes on how to thoughtfully integrate data science into achieving their goals, especially in the people analytics space. When they aren't doing those things, they perform standup, trapeze, and fire all over the world. Andrea hosts the podcast Majoring in Everything and is working on a book about why focusing on just one thing is overrated.
On this episode of 1050 Bascom, Nama sits down with Anna Haensch, Research Associate Professor at the Data Science Institute at UW-Madison, to discuss her experience at the intersection between AI, Data Science, and Public Policy, her work at the Science Desk at NPR, the U.S. Senate, and the Digital Scholarship Hub at UW (where she is Associate Director), and more!
Topics covered in this episode: Django Modern Rest; Already playing with Python 3.15; Cutting Python Web App Memory Over 31%; tryke - A Rust-based Python test runner with a Jest-style API; Extras; Joke. Watch on YouTube.

About the show: Sponsored by us! Support our work through: our courses at Talk Python Training, The Complete pytest Course, Patreon Supporters.

Connect with the hosts: Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky). Brian: @brianokken@fosstodon.org / @brianokken.bsky.social. Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky).

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.

Michael #1: Django Modern Rest. Modern REST framework for Django with types and async support. Supports Pydantic, Attrs, and msgspec. Has AI coding support with llms.txt. See an example at the “showcase” section.

Brian #2: Already playing with Python 3.15. 3.15.0a8, 3.14.4 and 3.13.13 are out (Hugo van Kemenade). Beta comes in May, RCs in Sept, and final planned for October. But still, there's awesome stuff here already. Here's what I'm looking forward to: PEP 810: Explicit lazy imports. PEP 814: frozendict built-in type. PEP 798: Unpacking in comprehensions with * and **. PEP 686: Python now uses UTF-8 as the default encoding.

Michael #3: Cutting Python Web App Memory Over 31%. I cut 3.2 GB of memory usage from our Python web apps using five techniques: async workers, import isolation, the Raw+DC database pattern, local imports for heavy libraries, and disk-based caching. See the full article for details.
Brian #4: tryke - A Rust-based Python test runner with a Jest-style API (Justin Chapman). Watch mode, native async support, fast test discovery, in-source testing, support for doctests, client/server mode for fast editor integrations, pretty per-assertion diagnostics, filtering and marks, changed mode (like pytest-picked), concurrent tests, soft assertions, and JSON, JUnit, Dot, and LLM reporters. Honestly haven't tried it yet, but you know, I'm kinda a fan of thinking outside the box with testing strategies, so I welcome new ideas.

Extras

Brian: Why aren't we uv yet? Interesting take on the “agents prefer pip”. Problem with analysis: many projects are libraries and don't publish a uv.lock file. Even with uv, it is still often seen as a developer preference for non-libraries. You can still use uv with requirements.txt. PyCon US 2026 talks schedule is up. Interesting that there's an AI track now. I won't be attending, but I might have a bot watch the videos and summarize for me. :) What has technology done to us? (Justin Jackson). Lean TDD new cover. Also, 0.6.1 is so ready for me to start f-ing reading the audio book and get on with this shipping the actual f-ing book, and yes, I realize I seem like I'm old because I use “f-ing” while typing.

Michael: Python 3.14.4 is out. Beanie 2.1 release.

Joke: HumanDB - Blazingly slow. Emotionally consistent.
Talk Python To Me - Python conversations for passionate developers
The OWASP Top 10 just got a fresh update, and there are some big changes: supply chain attacks, exceptional condition handling, and more. Tanya Janca is back on Talk Python to walk us through every single one of them. And we're not just talking theory, we're going to turn Claude Code loose on a real open source project and see what it finds. Let's do it.

Episode sponsors: Temporal, Talk Python Courses

Links from the show
DevSec Station Podcast: www.devsecstation.com
SheHacksPurple Newsletter: newsletter.shehackspurple.ca
owasp.org: owasp.org
owasp.org/Top10/2025: owasp.org
from here: github.com
Kinto: github.com
A01:2025 - Broken Access Control: owasp.org
A02:2025 - Security Misconfiguration: owasp.org
ASP.NET: ASP.NET
A03:2025 - Software Supply Chain Failures: owasp.org
A04:2025 - Cryptographic Failures: owasp.org
A05:2025 - Injection: owasp.org
A06:2025 - Insecure Design: owasp.org
A07:2025 - Authentication Failures: owasp.org
A08:2025 - Software or Data Integrity Failures: owasp.org
A09:2025 - Security Logging and Alerting Failures: owasp.org
A10:2025 - Mishandling of Exceptional Conditions: owasp.org
https://github.com/KeygraphHQ/shannon: github.com
anthropic.com/news/mozilla-firefox-security: www.anthropic.com
generalpurpose.com/the-distillation/claude-mythos-what-it-means-for-your-business: www.generalpurpose.com
Python Example Concepts: blobs.talkpython.fm
Watch this episode on YouTube: youtube.com
Episode #545 deep-dive: talkpython.fm/545
Episode transcripts: talkpython.fm
Theme Song: Developer Rap
Talk Python To Me - Python conversations for passionate developers
When you pip install a package with compiled code, the wheel you get is built for CPU features from 2009. Want newer optimizations like AVX2? Your installer has no way to ask for them. GPU support? You're on your own configuring special index URLs. The result is fat binaries, nearly gigabyte-sized wheels, and install pages that read like puzzle books. A coalition from NVIDIA, Astral, and Quansight has been working on WheelNext: a set of PEPs that let packages declare what hardware they need and let installers like uv pick the right build automatically. Just uv pip install torch and it works. I sit down with Jonathan Dekhtiar from NVIDIA, Ralf Gommers from Quansight and the NumPy and SciPy teams, and Charlie Marsh, founder of Astral and creator of uv, to dig into all of it.

Episode sponsors: Sentry Error Monitoring (code talkpython26), Temporal, Talk Python Courses

Links from the show
Guests
Charlie Marsh: github.com
Ralf Gommers: github.com
Jonathan Dekhtiar: github.com
CPU dispatcher: numpy.org
build options: numpy.org
Red Hat RHEL: www.redhat.com
Red Hat RHEL AI: www.redhat.com
Red Hat's presentation: wheelnext.dev
CUDA release: developer.nvidia.com
requires a PEP: discuss.python.org
WheelNext: wheelnext.dev
Github repo: github.com
PEP 817: peps.python.org
PEP 825: discuss.python.org
uv: docs.astral.sh
A variant-enabled build of uv: astral.sh
pyx: astral.sh
pypackaging-native: pypackaging-native.github.io
PEP 784: peps.python.org
Watch this episode on YouTube: youtube.com
Episode #544 deep-dive: talkpython.fm/544
Episode transcripts: talkpython.fm
Theme Song: Developer Rap
"We're not going to get the liberation we all crave on a soul level without risk." Andrea reflects on why, now more than ever, we must follow our hearts and refuse to let fear, or Steve Bannon, that Jabba the Hutt of American politics, live rent-free in our heads. Being brave, taking risks: that's how we win. In this special excerpt from last Monday's Gaslit Nation Salon, Andrea honors her beloved uncle, Phil Bourne. "Uncle Phil" was the founding dean of the University of Virginia School of Data Science, earned more than 100,000 citations on Google Scholar, championed the collaboration between the liberal arts and STEM as essential to the future of education, and served as the founding Editor-in-Chief of PLOS Computational Biology, where he created the "Ten Simple Rules" series. Honor Uncle Phil's memory by reaching out to your loved ones and saying what's truly in your heart. We do not have as much time here as we think. Don't let the fascists steal that time from you. Bethany McKee, founder of the Outreach Committee, a group that meets to discuss how to deal with the MAGA cultists in our lives as they awaken to their own self-destruction, will host today's Gaslit Nation Salon at 4 p.m. ET. You can find the Zoom link at Patreon.com/Gaslit. Thank you to everyone who supports the show. We could not make Gaslit Nation without you. Join us for an evening honoring the power of art and defiance at the book launch of Mrs. Orwell, Andrea's inspiring new graphic novel, illustrated by Brahm Revel. When: April 13 Where: PowerHouse Books Arena, DUMBO, Brooklyn Details here: https://powerhousearena.com/events/book-launch-mrs-orwell-by-andrea-chalupa-in-conversation-with-nomiki-konst/ Patreon Supporters: You and your guests get in free and receive a complimentary book! Just message us through Patreon to claim yours. Not a member yet? Join our community at Patreon.com/Gaslit. We couldn't make this show without you–see you there!