Podcasts about cython

  • 12 podcasts
  • 29 episodes
  • 1h 11m average duration
  • Infrequent episodes
  • Latest episode: Sep 14, 2023

[Chart: popularity by year, 2017–2024]


Latest podcast episodes about cython

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We're collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)!

If AI is so important, why is its software so bad?

This was the motivating question for Chris Lattner as he reconnected with his product counterpart on TensorFlow, Tim Davis, and started working on a modular solution to the problem of sprawling, monolithic, fragmented platforms in AI development. They announced a $30m seed in 2022 and, following their successful double launch of Modular/Mojo

The Real Python Podcast
Differentiating the Versions of Python & Unlocking IPython's Magic

Jul 28, 2023 (46:11)


What are all the different versions of Python? You may have heard of Cython, Brython, PyPy, or others and wondered where they fit into the Python landscape. This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder's Weekly articles and projects.

Python Bytes
#345 Some Big Time Releases

Jul 26, 2023 (35:52)


About the show

Sponsored by us! Support our work through:
  • Our courses at Talk Python Training
  • The Python People Podcast
  • Patreon Supporters

Connect with the hosts
  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.

Michael #1: Cython 3.0
  • Long in development, the new major release of the Python-to-C compiler sheds legacy Python support and readies Cython developers for big changes in Python.
  • Cython 3 cleans up and modernizes Cython.
  • Pure Python mode allows Python developers to use their existing Python linting and code analysis tools on Cython.

Brian #2: Reading code: An important but seldom-discussed skill
  • By Eric Matthes
  • A cool walkthrough of several techniques to read code
  • Strategies:
      • Ignore function definitions. In the example, also ignore comments.
      • Simplify repetitive blocks. The example shows mentally lumping a bunch of print statements into “print message”.
      • Utilize IDE tools, like folding to hide functions you're not looking at.
  • Also includes a note about writing readable code.
  • Notes: People believe your function and variable names. They should be descriptive, and they should not be deceptive.

Michael #3: Major new version of MicroPython: v1.20.0
  • Via Matt Trentini
  • >10 months, >1000 mainline commits from >100 contributors
  • This release of MicroPython introduces a new lightweight package manager called mip.
  • In the MicroPython runtime, core/built-in types have been compressed by including in the C-level type struct only as many slots for C function pointers as are needed for a given type → Any third-party C extensions will need to be updated to work with this change.
  • Massive list of detailed changes.

Brian #4: Advanced Python Tips for Development
  • By Scofield Idehen
  • There are 15 in the article; here are a few:
      • 1 & 2. Use list comprehensions and generator expressions. It's cool to see them side by side.
      • enumerate() is fun.
      • Embrace zip(). It's weird, but very useful.
      • Utilize __slots__ to reduce memory usage.

Extras
  • Brian: Hear the story behind the quote “I came for the language, but I stayed for the community.” and learn about fountain pens, tea, and a Murderbot, on this week's Python People.
  • Michael: Search (LLM-like) Talk Python: explore-talk-python-to-me.streamlit.app by Aguss

Joke: You're full stack now
  • Seriously, take the HTMX course :)
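
The tips Brian lists (comprehensions next to generator expressions, enumerate(), zip(), and slots) can be sketched roughly as follows. This is a minimal illustration, not code from the article: the variable names and the two Point classes are made up for the example.

```python
numbers = [1, 2, 3, 4]

# List comprehension: builds the whole list in memory immediately.
squares_list = [n * n for n in numbers]

# Generator expression: same shape with parentheses, values produced lazily.
squares_gen = (n * n for n in numbers)

# enumerate() pairs each item with its index.
indexed = list(enumerate(["a", "b", "c"]))

# zip() walks several iterables in lockstep and stops at the shortest one.
pairs = list(zip(["x", "y", "z"], [10, 20]))


class PointWithDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointWithSlots:
    """Hypothetical example class: __slots__ replaces the per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PointWithDict(1, 2)
slotted = PointWithSlots(1, 2)

# Instances of the slotted class have no __dict__, which is where the
# memory saving comes from; the trade-off is that new attributes
# cannot be added at runtime.
plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")
```

The trade-off with `__slots__` is flexibility: assigning an attribute that is not listed (for example `slotted.z = 3`) raises `AttributeError`.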

Python Podcast
GUI Applications Using MiaPlan as an Example

May 4, 2023


Python Bytes
#322 Python Packages, Let Me Count The Ways

Feb 7, 2023 (46:40)


About the show

Sponsored by Microsoft for Startups Founders Hub.

Connect with the hosts
  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org
  • Special guest: @calvinhp@fosstodon.org

Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.

Brian #1: Packaging Python Projects
  • Tutorial from PyPA
  • This is a really good starting point to understand how to share Python code through packaging.
  • Includes discussion of:
      • directory layout
      • creating package files, LICENSE, pyproject.toml, README.md, tests and src dir
      • how to fill out the build-system section of pyproject.toml using either hatchling, setuptools, flit, or pdm as backends
      • metadata
      • using build to generate wheels and tarballs
      • uploading with twine
  • However, for small-ish pure Python projects, I still prefer flit:
      • flit init creates pyproject.toml and LICENSE
      • you will probably still need to hand tweak pyproject.toml
      • flit build replaces build
      • flit publish replaces twine
  • The process can be confusing, even for seasoned professionals.
  • Further discussion later in the show

Michael #2: untangle xml
  • Convert XML to Python objects
  • Children can be accessed with parent.child, attributes with element['attribute'].
  • Call the parse() method with a filename, a URL or an XML string.
  • Given this XML: [XML sample not preserved]
  • Access the document: obj.root.child['name'] # u'child1'
  • A little cleaner than ElementTree perhaps.

Calvin #3: Mypy 1.0 Released
  • Mypy is a static type checker for Python, basically a Python linter on steroids
  • Started in 2012 and developed by a team at Dropbox led by https://github.com/JukkaL
  • What's New?
      • New release numbering scheme, not using semver
      • Significant backward incompatible changes will be announced in the blog post for the previous feature release
      • Feature flags will allow users to upgrade and turn on the new behavior
      • Mypy 1.0 is 40% faster than 0.991 against the Dropbox internal codebase
      • 20 optimizations included in this release
      • Mypy now warns about variables used before definition or possibly undefined variables, for example if a variable is used outside of a block of code that may not execute
      • Mypy now supports the new Self type introduced in PEP 673 and Python 3.11
      • Support ParamSpec in type aliases
      • Also, ParamSpec and generic Self types are no longer experimental
      • Lots of miscellaneous new features
      • Fixes to crashes
      • Support for compiling Python match statements introduced in Python 3.10

Brian #4: Thoughts on the Python packaging ecosystem
  • By Pradyun Gedam
  • Some great background on the internal tension around packaging.
  • Brian's note:
      • in the meantime people are struggling to share Python code
      • the “best practice” answer seems to shift regularly
      • this might be healthy to arrive at better tooling in the long term, but in the short term, it's hurting us
  • From the article:
      • The Python packaging ecosystem unintentionally became the type of competitive space that it is today.
      • The community needs to make an explicit decision if it should continue operating under the model that led to status quo.
      • Pick from N different tools that do N different things is a good model.
      • Pick from N ~equivalent choices is a really bad user experience.
      • Picking a default doesn't make other approaches illegal.
      • Communication about the Python packaging ecosystem is fragmented, and we should improve that.
      • Pradyun: “Many of the users who write Python code are not primarily full-time software engineers or “developers”.”
      • From Thea: “The reason there are so many tools for managing Python dependencies is because Python is not a monoculture and different folks need different things.”
      • Opening up the build backend through pyproject.toml-based builds was good, but the fracturing of multiple “workflow” tools seems bad.
      • “I am certain that it is not possible to create a single “workflow” tool for Python software. What we have today, an ecosystem of tooling where each makes different design choices and technical trade-offs, is a part of why Python is as widespread as it is today. This flexibility and availability of choice is, however, both a blessing and a curse.”
      • On building a default workflow tool around pip: interesting idea
      • There's tension between “we need a default workflow tool” and “unix philosophy: many focused tools that can work together”.

Michael #5: Top PyPI Packages
  • A monthly dump of the 5,000 most-downloaded packages from PyPI.
  • Also, a full copy of PyPI info too: github.com/orf/pypi-data

Calvin #6: SQLAlchemy 2.0 Released
  • #57 on the Top PyPI Packages
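
For comparison with the untangle access pattern in Michael's item (parent.child for children, element['attribute'] for attributes), here is a rough standard-library sketch. It does not use untangle itself. The XML sample is an assumption, chosen only so that the child's name attribute is child1, and the Node wrapper is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample document (the original XML did not survive in these notes).
XML_TEXT = '<root><child name="child1"/></root>'

root = ET.fromstring(XML_TEXT)

# Plain ElementTree: find the child by tag, then read the attribute.
child = root.find("child")
name_via_elementtree = child.get("name")


class Node:
    """Hypothetical wrapper: parent.child for children, node['attr'] for attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, tag):
        # Only called when normal attribute lookup fails, so _element is safe.
        found = self._element.find(tag)
        if found is None:
            raise AttributeError(tag)
        return Node(found)

    def __getitem__(self, attribute):
        return self._element.attrib[attribute]


# Attribute-style access, similar in spirit to obj.root.child['name'].
name_via_wrapper = Node(root).child["name"]
```

Here the wrapper starts at the root element, whereas the untangle expression in the notes starts from a document object one level above it.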

Python Bytes
#303 This title is required or is it optional?

Sep 29, 2022 (37:56)


About the show

Sponsored by Microsoft for Startups Founders Hub.

Michael #1: Human regular expressions revisited
  • Via Mikael Honkala
  • We mentioned Al Sweigart's humre on Python Bytes. Mikael went on a little search and compiled his findings into this repo.
  • A lot of people feel that re needs some help.
  • At least 3 of the "serious" packages he found came out in the last few months.
  • Since a package like this is not overly complex to make, all the ways to approach the problem are clearly being explored.
  • Unfortunately these seem to be mostly single-person efforts, and many have fallen by the wayside before long.
  • Hopefully there's some consolidation on the horizon, to share some of the maintenance effort and establish some of the packages as here for the long haul.
  • The list could be useful to you if you are:
      • Looking for a tool: Check the list to get a quick idea of the "look and feel" of each package.
      • Thinking about building a tool: Check the list for alternative approaches, and maybe consider if contributing to an existing package might be a better way to get what you need.
      • Building a tool, or already have one: Use the list to clarify and communicate what the main differences and strengths of your solution are.

Brian #2: Implicit Optional Types Will Be Disabled by Default
  • … in a future mypy feature release (possibly the next one after 0.98x) …
  • Thanks Adam Johnson for spotting this and letting us know
  • Stop doing this:
      • s: str = None
  • Do one of these:
      • s: str | None = None
      • s: Union[str, None] = None
      • s: Optional[str] = None ← but this has problems
  • Optional != optional
  • From the Python docs:
      • “Optional[X] is equivalent to X | None (or Union[X, None]).”
      • “Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default does not require the Optional qualifier on its type annotation just because it is optional.”
  • Best described in FastAPI docs, Python Types Intro, starting at “Possibly None”
  • Recommendation is to use:
      • s: str | None = None for Python 3.10+
      • s: Union[str, None] = None for Python 3.9+
      • For 3.7, 3.8, you still have Optional as an option, I think.
  • Why haven't you upgraded to 3.9? We're almost to 3.11, what's the problem?!

Michael #3: cython-lint
  • By Marco Gorelli
  • A tool (and pre-commit hook) to lint Cython files, similar to how flake8 lints Python files. It works by parsing Cython's own AST (abstract syntax tree).
  • Found quite a few nice clean-ups which could be applied on:
      • pandas
      • numpy
      • scikit-learn
      • cupy

Brian #4: difftastic - structural diff
  • “Difftastic is a structural diff tool that understands syntax.”
  • “Difftastic detects the language, parses the code, and then compares the syntax trees.”
  • Interesting story about building difftastic
  • For one-off git diff replacement, use
      • GIT_EXTERNAL_DIFF=difft git diff
      • or GIT_EXTERNAL_DIFF="difft --syntax-highlight=off" git diff
  • To always use difft with git, see https://difftastic.wilfred.me.uk/git.html

Extras
  • Brian:
      • Oh My Git! - An open source game about learning Git!
      • Python 3.11.0 is up to rc2
  • Michael: NextDNS

Joke: I mean, who's wrong?
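
The difference between "Optional" as a type and "optional" as in "has a default" can be sketched as below. The function names are invented for this illustration. The `str | None` spelling is left in a comment so the sketch also runs on interpreters older than 3.10.

```python
from typing import Optional, Union


def greet_implicit(name: str = None) -> str:
    # Implicit Optional: the annotation says str, but the default is None.
    # This is the pattern the notes above say mypy will stop accepting by default.
    return f"Hello, {name or 'world'}"


def greet_explicit(name: Optional[str] = None) -> str:
    # Explicit: None is part of the declared type.
    # On Python 3.10+ this can also be written as: name: str | None = None
    return f"Hello, {name or 'world'}"


def repeat(text: str, times: int = 2) -> str:
    # An optional argument (it has a default) whose type is still plain int,
    # so no Optional qualifier is needed.
    return text * times


# Optional[X] and Union[X, None] are the same type object at runtime.
same_type = Optional[str] == Union[str, None]
```

Both greet functions behave identically at runtime; the difference only shows up when a static checker such as mypy reads the annotations.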

Python Podcast
Microservices

Apr 7, 2022 (115:55)


Janis, Dominik, and Jochen talk about microservices. We had already touched on the topic a bit last time, after which Janis got in touch and asked whether we would like to do a full episode on it with him. Of course we did :). And here is the answer to all questions in software development.

Show notes

Our email for questions, suggestions & comments: hallo@python-podcast.de

News from the scene
  • Okta breach
  • PYPL PopularitY of Programming Language
  • Meta donates $300,000 to the Python Software Foundation | Łukasz Langa - #Programming
  • GitHub Issues Migration: status update
  • Cython is 20!
  • New programming languages: vlang | zig
  • April: PyCon DE & PyData Berlin 2022
  • July: EuroPython
  • September: DjangoCon EU 2022

Advertisement
  • Ailio is hiring | Please send inquiries to this email address: business@ailio.de

Microservices
  • BoundedContext / Single source of truth
  • Book: Building Microservices, 2nd Edition
  • Sam Newman on Information Hiding, Ubiquitous Language, UI Decomposition and Building Microservices
  • Sam Newman: Monolith to Microservices (InfoQ Podcast)
  • Episode 99 - Sam Newman - Monolith to Microservices
  • ELK-Stack
  • Apache Kafka
  • Book: Software Architecture with Python
  • MonolithFirst
  • Benchmark Caddy / Nginx / Uvicorn
  • Benchmarking nginx vs caddy vs uvicorn for serving static files
  • Uvicorn / uvloop

Picks
  • bpytop / glances
  • Kafka Connect

Talk Python To Me - Python conversations for passionate developers
#355: EdgeDB - Building a database in Python

Mar 6, 2022 (78:06)


What database are you using in your apps these days? If you're like most Python people, it's probably PostgreSQL. If you roll with NoSQL like me, you're probably using MongoDB. Maybe you're even using a graph database focused more on relationships. But there's a new Python database in town, and as you'll learn during this episode, many critical Python libraries have come into existence because of it. This database is called EdgeDB. EdgeDB is built upon Postgres, implemented mostly in Python, and is something of a marriage of a traditional relational database and an ORM. Python's async and await keywords, uvloop (the high-performance asyncio event loop), and asyncpg all have ties back to the creation of EdgeDB. Yury Selivanov, the co-founder & CEO of EdgeDB, PSF fellow, and Python core developer, is here to tell us about EdgeDB along with the history of many of these impactful language features and packages.

Links from the show
  • Yury Selivanov: @1st1
  • MagicPython: github.com/MagicStack/MagicPython
  • uvloop: github.com/MagicStack/uvloop
  • asyncpg: github.com/MagicStack/asyncpg
  • TaskGroups and ExceptionGroups: twitter.com
  • EdgeDB: edgedb.com
  • Schema modeling: edgedb.com/showcase/data-modeling
  • Easy EdgeDB book: edgedb.com/easy-edgedb
  • Roadmap: edgedb.com/roadmap
  • pgMustard: pgmustard.com
  • PyBay: Building a Database with Python Talk: youtube.com
  • Michael's course on async and await + Cython + uvloop: talkpython.fm/async
  • Michael's PyBay talk: Flask + HTMX: youtube.com
  • Watch this episode on YouTube: youtube.com
  • Episode transcripts: talkpython.fm

Stay in touch with us
  • Subscribe on YouTube: youtube.com
  • Follow Talk Python on Twitter: @talkpython
  • Follow Michael on Twitter: @mkennedy

Sponsors
  • Sentry Error Monitoring, Code TALKPYTHON
  • SignalWire
  • Talk Python Training

Sustain
Episode 111: Amanda Casari on ACROSS and Measuring Contributions in OSS

Mar 4, 2022 (43:02)


Guest
Amanda Casari

Panelists
Richard Littauer | Ben Nickolls | Eric Berry

Show Notes
Hello and welcome to Sustain! The podcast where we talk about sustaining open source for the long haul. We are very excited for today's podcast. Our guest is Amanda Casari, who is a Developer Relations Engineer and Open Source Researcher at Google Open Source Programs Office (OSPO). Today, we learn about some open source work Amanda is doing with her research team at the University of Vermont Complex Systems Center. She tells us about a project called ACROSS and a paper written by her team that looked at how contributions are measured for code-centric repositories. Amanda goes in depth about what open source is to her, shares advice if you're looking to collaborate more effectively with people in open source, talks more about how we can support projects financially in other parts of the world, and mentions some great groups she worked with. Go ahead and download this episode to learn more!

[00:02:00] Amanda fills us in on the open source work that she started working on with the University of Vermont Complex Systems Center.
[00:06:43] Amanda explains the “assumptions we have that aren't verified,” as well as a paper that came from their research team and what they examined.
[00:09:52] We learn more about how people interface with closed decisions behind doors and open source.
[00:13:30] Ben asks Amanda to tell us what kind of behaviors and differences she sees between communities that emerge and continue to exist off of platforms like GitHub and GitLab.
[00:15:50] Amanda tells us about a project their team is working on called ACROSS, and a paper that won a FOSS award last year that was about actively looking at contributions that are measured for code-centric repositories.
[00:19:18] Eric wonders what type of responsibility Amanda sees that would come from GitHub and if that's going to affect us long term.
[00:23:01] Amanda explains working as a Control Systems Engineer, and she explains how she sees open source as block diagrams and feedback loops.
[00:27:53] We hear some great advice from Amanda if you are someone who wants to make the world of open source a more complex and beautiful place with what you have to offer.
[00:32:08] We hear some thoughts from Amanda for people working in open source who don't have a huge amount of privilege to have the ability to share their energy and find it harder to think laterally.
[00:35:27] Ben wonders what we can do to support projects financially and what we can do to support the next generation from the different parts of the world who haven't had the opportunity to benefit yet. Amanda shares her thoughts and mentions some really great groups she worked with such as Open Source Community Africa, PyCon Africa, and Python Ghana.
[00:39:24] Find out where you can follow Amanda online.

Quotes
[00:09:01] “A lot of open source decision making is really behind proprietary or closed doors.”
[00:19:59] “When it feels like there is only one option for any kind of tool, infrastructure, or access, that's when I always start getting concerned.”
[00:24:58] “Open source is a ___ system.”
[00:29:59] “Open source is not one thing, it's many interactive parts that fit together in different ways.”

Spotlight
[00:40:10] Eric's spotlight is an article Amanda submitted on “Open source ecosystems need equitable credit across contributions.”
[00:40:39] Ben's spotlight is a shout out to Jess Sachs and the maintainers of Faker.js.
[00:41:22] Richard's spotlight is Red Hen Baking in Vermont.
[00:41:47] Amanda's spotlights are two books: Data Feminism and The Data-Sitters Club that she found on The Executable Books Project.
Links SustainOSS (https://sustainoss.org/) SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) SustainOSS Discourse (https://discourse.sustainoss.org/) Amanda Casari Twitter (https://twitter.com/amcasari?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) Amanda Casari LinkedIn (https://www.linkedin.com/in/amcasari/) Open Source Stories (https://www.opensourcestories.org/) The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative (https://arxiv.org/abs/2106.15611) Getting the Giella source code for your language (https://giellalt.uit.no/infra/GettingStarted.html) Julia Ferraioli Blog (https://www.juliaferraioli.com/blog/) What contributions count? Analysis of attribution in open source (article) (https://scholar.google.com/citations?view_op=view_citation&hl=en&user=VRBk-q8AAAAJ&citation_for_view=VRBk-q8AAAAJ:qjMakFHDy7sC) ACROSS Taxonomy-GitHub (https://github.com/google/across) RubyConf 2021- Black Swan Events in Open Source-That time we broke the Internet (https://docs.google.com/presentation/d/1g9UDReu80wo14H8beoAJ6n69ZorBYhLjKxOU1ngegeY/edit#slid) All Contributors bot-GitHub App (https://github.com/all-contributors/app) All Contributors (https://allcontributors.org/) Open Source Community Africa (https://oscafrica.org/) PyCon Africa (https://pycon-africa-stage.us.aldryn.io/) Python Ghana (https://www.pythonghana.org/) Open source ecosystems need equitable credit across contributions (article) (https://bagrow.com/pdf/casari2021.pdf) Faker (https://github.com/faker-js/faker) Red Hen Baking Co. 
(https://www.redhenbaking.com/) Data Feminism (https://data-feminism.mitpress.mit.edu/) The Executable Books Project (https://executablebooks.org/en/latest/) The Data-Sitters Club (https://datasittersclub.github.io/site/index.html)

Credits
Produced by Richard Littauer (https://www.burntfen.com/)
Associate Producer Justin Dorfman (https://www.justindorfman.com/)
Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/)
Show notes by DeAnn Bahr Peachtree Sound (https://www.peachtreesound.com/)
Transcript by Layten Pryce (https://www.fiverr.com/misstranscript)

Transcript

Richard [00:11]: Hello, and welcome to Sustain, the podcast where we're talking about sustaining open-source for the long haul. Who are we? Where do we come from? Where are we going? What are we going to talk about today? Very excited for today's podcast. We have an amazing guest. One of the few guests from the state I am in, which is really fun for me. I just feel like saying that first before anything else, because I don't know why, but before we introduce her, I want to make sure we also talk about the other people you're going to be hearing on today's podcast. So I am Richard Littauer. Hello everyone. And then we also have Benjamin Nickolls, sometimes known as Ben, how are you?

Ben [00:48]: I'm good. I'm a bit enjoying the sun. Thank you.

Richard [00:51]: Cool. Okay, great, Eric, how are you doing?

Eric [00:54]: No sun, but I'm really happy to be here. I'm very well caffeinated.

Richard [00:58]: That is very good. I'm going with apple ciders today. I don't know why, I think it's because I already have caffeine. Great. So that's the little tiny stuff at the beginning to set the mood for the show. And now the actual content. Our guest today is the amazing Amanda Casari. Amanda Casari is a lot of things. She doesn't like titles very much, which is cool. 
So I'm just going to say what she wrote down in the prep doc, DevRel engineer, plus open source researcher at Google open-source programs office, which we're going to shorten to the Google OSPO for the rest of this conversation, because that's just too much of a word. She also lives in Vermont and has a long and storied career. Amanda, how are you doing? Amanda [01:39]: Hi, I'm doing great. It's so good to be here today. And I'm also absolutely thrilled Richard, that you also live in Vermont. Richard [01:47]: I know we have this small thing in Vermont where we really like talking about being in Vermont. I think it's because we're in a little man's complex because it's a very small state and so it's just nice to be like, oh, someone else, Amanda, actually that might be a good intro. So you've been active in open source communities for over a decade. You've organized local community groups. You've filed issues. You've cleaned the documentation, you've tested fixes or fixed tests. You've done all the things. You move chairs around, but like you're really a systems level person. [02:14] You're all about thinking about what open-source is and how can we make sure that the entirety of open-source regenerates builds better, is more sustainable, is more resilient, is more better for the people inside of it. Part of that work has been working directly with UVM, which is confusingly, the University of Vermont and it's based in Burlington. And it now has, I believe some sort of OSPO. Can you talk about what that is and how that happened? Amanda [02:40]: Yeah, so as brief as I can make it, because otherwise I will spend the next 45 minutes talking about this. I switched into the Google OSPO office because I started and worked on a partnership and a research group with the University of Vermont complex system center. 
So we started to look within Google and understand how can we really begin to picture, strategize, think about, learn from open-source, like you said, from a systems and ecosystems and networks perspective, which is in line with my background. [03:16] So in the way, way before, I'm actually a control systems engineer. So problems that are dull, dangerous or dirty fit right with that robotics line of thinking, and examining infrastructures and legacy infrastructures and how things interconnect and where they need support and where they don't is absolutely aligned with what I used to work on. And then I did go to the University of Vermont and I was a fellow at the Complex Systems Center when I was studying power systems, and I actually looked at electrical engineering and applied mathematics. [03:48] And so a lot of that is fundamental for the reason why, like my brain is really shaped to examine and look at things, as to what scales and what doesn't, but not from some of the software perspective of how do you scale things, but where do you actually, and can you find rules that may or may not apply at different scales and may not work? So we may try to apply things that work at a smaller group, at a larger scale and they break down and that's when they actually don't scale. So working with the University of Vermont, we started in early 2020, which was a really interesting time to get a new research line started, especially when one of your core researchers is an infectious disease modeler. But I would say the benefit from starting at that time is that we really got lucky in a few places. [04:37] So one of the places that we got lucky in early 2020, is we took everything that we were thinking about for the next two years of life. And we said, this is probably going to change. And we fundamentally moved some of the money and the grant money around to start instead examining who needs support now, what can we do now? 
So if we're not going to be able to travel, we're not going to be able to hold community workshops. We're not going to be able to invite open-source people together to talk to us, what should we be doing instead? [05:08] One of the things that we did is we hired another researcher. So we took some of the travel money and some of the budget for commuting. We moved that into a position at the time and that, one, was wonderful because that person is brilliant. But second, it really worked out well because I don't remember if everyone remember early 2020 academic institutions were shutting budget and roles and department shut down. And it was really a crisis mode, but we were sheltered from a lot of that because of the structure we set up. [05:33] But there's been a lot of great research coming out of that group and that team. One of the fundamental things we've been just trying to figure out is where's the information you would need to understand and what's happening at open-source at a large scale level? And we found there are a lot of assumptions that are made that we can't verify. So we find that we are looking for information always in a way that respects individuals and respects people in open-source as humans. And doesn't observe them in a way that is without their consent, but it's very hard to find the information you need that doesn't just result from conveniently available information on the internet. [06:12] But for the OSPO perspective at the University of Vermont, UVM is a recent recipient of a Sloan tech grant that is going to be establishing an open-source programs office and also has a research component to understand and look at open-source communities as they emerge, especially as they emerge in local communities who have a directive to really support local effects rather than maybe like a global effect or a corporate good Richard [06:36]: So much in there. Most interesting was there were assumptions that we have that aren't verified. 
What assumptions are you talking about regarding open-source and what have you looked at?

Amanda [06:47]: So I rant a lot amongst researchers and groups of people, Richard, as you know, and I don't have time to verify all of my ranting or all of my hypotheses. But one of the research lines that I am most excited about learning and exploring more. There's a paper that came out from our team, and I will add it to the show notes later, that is called the penumbra of open-source. And so the research team, and I was not on this paper, but the research team examined whether or not the sample that we used from GitHub is actually representative of the larger open-source ecosystem. [07:24] And so they went about looking for individual hosted, but public and open Git servers to be able to start to look at whether or not, if you choose not to be on a platform like GitHub or GitLab or any other hosted platform repository, does your open-source project organization, metadata, community, organization, decision making, does that look like what's hosted on GitHub? And they found that it wasn't. So GitHub itself, they called the convenient sample. It's something that's used because it's easy for researchers to get to, which I would also challenge the convenience and ease of getting specifically that data access, because most of that data is accessed by researchers, by aggregated collections like the GitHub archive, or there's a few other aggregation projects, but they're all open-source or research projects. [08:15] They are funded by groups like Google or groups like Microsoft. But if you actually wanted to do aggregated research of what is happening in open-source and trends in time, that's something that is a huge data engineering project. And the best that we can do right now is samples off of those aggregated platforms. But it's not clear in a way that it used to be. 
So if you look at a lot of the studies that are coming out, they may look at something like the Linux kernel, or they may look at something like projects from the Apache Software Foundation, because all of the tools that those developers use are in a much more aggregated and less distributed format and also less proprietary systems. [08:57] So that data is actually accessible and is more transparent. Otherwise, a lot of open-source decision making is really behind proprietary or closed doors. And that might be the decision of the community. They may also not realize the effects of those decisions.

Richard [09:12]: I don't know of a lot of projects that are outside of GitHub. I used to know of one, I just checked and the Giellatekno minority language documentation has now moved to GitHub, which seems to happen a lot, I assume. And so it's always shocking to me to hear that people have projects elsewhere and they think about it elsewhere. One of the things I want to focus on though, besides that, which always blows my mind, is you talked about open source decision making happening behind doors. And it seems to me to be at odds with what we think of as open-source naively. When we begin learning about open-source, we think, oh, open-source, everything's out in the open. [09:50] It's great. Freedom of speech, freedom of everywhere. I want to know more about how people interface with closed decisions behind doors and open-source, and whether everyone knows that, and we're just not talking about it openly, or whether that's something that actually causes fractures in communities when they realize that the power is elsewhere. I'm just curious about your opinion on this.

Amanda [10:13]: So to be perfectly frank and clear, decisions about open-source have always been behind closed doors. So there is an illusion of access, but not everybody has always been invited to those meetings. 
So talking with folks who have been involved in open-source even much longer than I have, we've talked about these different kinds of cyclic patterns in community and transparency and in governance, different kinds of governance models. So it used to be that folks would show up a few days before a conference, ahead of time, or stay afterwards for a few conferences. [10:49] And if you were invited to those meetings, you were part of that decision making group. But I would like to point out that the first woman who became a core dev programmer contributor for CPython is actually Mariatta Wijaya. And she just joined that a few years ago. So she was the first person who identified as a female who was even invited for this programming language that's been around for 20 years. And I will say, I feel like that community's done a wonderful job in understanding their limitations and where they have and have not been transparent and open. [11:21] And Guido van Rossum, as the creator of the language, has also been one of the staunch supporters, allies, and movers of change for that. But it took a long time for that to happen. So the idea that there are these closed-off areas where decisions are made is nothing new. However, there was always this idea that at least conversations and decisions and communication happen on something as open as a mailing list, and everybody had access to something like the mailing list. Maybe it was self-hosted or maybe it was hosted on a centralized platform, but at least you could see it. That's not the same case anymore. [11:54] We have a ton of developer platforms now that people choose to have conversations on. Sometimes those communications get centralized with things like repositories. And that is for trying to make communication and understanding more atomic, which is totally understandable. And every community gets to make these decisions for themselves. 
And if you are trying to piece together all of this information, it's a huge data archeology problem. This is something that Julia Ferraioli and I talk about a lot: if you just want to understand what's happening in a community, who is making decisions, who has access, who is even doing any of the work, like if we just want to understand what work is even visible or valued in a community, that's very challenging to see right now. And that's another one of our core research areas that we're working on, just making labor visible across open-source. Ben [12:47]: So I just wanted to kind of pick up and extend Richard's question to a degree. And just, if you can talk a little bit about the difference that you see in communities that are based on more kind of, some might say, modern traditional platforms, like GitLab, maybe [13:06 inaudible] to a certain degree, versus those projects that exist kind of, I would say, off-platform and behind kind of mailing lists and so on, because I think a lot of people would say that some communication methods like mailing lists, Mailman and so on could be argued to be less accessible than, say, GitHub, which has now got a lot more kind of discussion based features and so on. So I was just wondering what kinds of behaviors you see and what kind of difference you see between communities that emerge and continue to kind of exist off of platforms like GitHub and GitLab? Amanda [13:43]: So I will say, I feel like the differences between centralized platform-centric communities and non-platform-centric communities, that actually is still an open research question because of the fact that again, the data collection for it is pretty hard to do, so you have to start adding layers at a time. So you can look at things at just like maybe how the repositories are structured, but that may or may not be indicative of how decisions are made, which may or may not be indicative of communication layers. 
[14:12] But when we start thinking about this in terms of how do you model that? These are all actually separate modeling techniques that you use for each of these different kinds of layers. And I think that is something our team is actively interested in and working on. I have a lot of theories that are not founded on that right now. I would love to start looking at what kinds and if any, are there heard cultural norms, values, but I would really love to start understanding and seeing when a decision is made to choose one technology over the other for dev tool stacks for a community, because there's a lot of porting that's happened in the last few years. [14:51] How has that worked out? So the initial choice to choose that dev tool or that infrastructure stack may have been made five years ago for different reasons than it would be made now. Has that worked out to meet the community's goals? Has it changed who has access and who has voice? Has it changed whose work is visible or is that something that's still an unsolved problem for the community? And are there ways that we need to think about focusing on that so that they get more visibility and transparency regardless of their decision? Ben [15:21]: I kind of feel like those latter points about whose contributions are recognized and valued and so on is a little bit of a hidden nugget of another point, because I would say that my opinion, which is also not based on fact, but my experience to date has been that communities that are based around platforms like GitHub are maybe a little bit more code centric and communities that aren't are possibly a little bit more interpersonal. And I think that there's a whole load of issues that we could potentially unpack there. Do you see any of that already? Is that something that you are already kind of thinking about or working on? Amanda [15:56]: Yes. 
So our team has been working on, we call it the ACROSS project and I always forget what the acronym stands for, but it basically comes to better attribution and credit in open source. So we have done research on that. The paper actually won the FOSS award at the Mining Software Repositories conference last year. And it was actively looking at contributions that are measured for code centric repositories, as you said, because this is what we're really trying to show, is that when you're only looking at code, and acknowledging that a lot of people are trying to shove a lot of things into repos these days that maybe they weren't intentionally designed for, but again, going along with that idea of atomic information about a project or about a community or about an ecosystem. [16:38] So looking at a repository centric view, we evaluated the difference between how the GitHub contributors view shows actions and gives attribution and how the events API does it. There's a tool that one of my colleagues, Katie McLaughlin, wrote called octohatrack, which looks at a code repo on GitHub and produces a list of contributors for anybody who's ever interacted with that repository, which is different than what the GitHub API shows. And then we also compared that against repositories that were using the all contributors bot. So the all contributors bot, for those listening who are not familiar with this, is a way that you can manually add in or add in through different actions. So it's auto plus manual. [17:19] Ways that you can start to give people credit and attribution for things that may not be reflected by a change in the repo. So we started to look at the difference, for communities and projects, between what kind of things were getting added manually versus what automatic contributions would show. And we were able to see that folks that were using manual additions were giving credit for more of the kind of work that would never show up in an API. 
And so part of this is really starting to think about what kind of mixed methods tooling, changes to tooling we should be thinking about as a community to really start to give that visibility into all of the work that happens. Like this podcast itself, unless it's in a repo, is not going to be showing up as a part of the open-source community if you're doing archeology around open-source contributions. [18:12] But I would argue that discourse and thought and community should be something that would be recognized. And so we held some workshops. I mean, we're going to have some more results coming out from that. But one of the things that we did find, which we can talk about, is that getting everybody in open-source to agree on what a project is, an organization is, or an event is, is a very hard problem. So standardized definitions are not something that carries across at a global ecosystem level. And so when we talked earlier about examining different projects, I think drawing boundaries in open-source is a very challenging problem. So you have to be very distinct when you talk about where the boundaries around people are or around technology are, as opposed to being able to say open source is like this big, broad thing. Ben [19:01]: I was wondering about the role of GitHub. And I'm curious about your thoughts on how much control we actually have as an open-source community to make really effective changes when the tool that basically we all kind of go to for open source is a private company with their own interests. I was wondering what type of responsibility you see that would come from GitHub and is that going to affect us long term and how so? Amanda [19:26]: I mean, obviously I work for a for-profit company. I don't work for a nonprofit, I'm not an independent consultant or contractor. So for me, I do look at the question of what is the goal of a community in moving to a centralized platform at any time. 
And I think that when done intentionally and if always done with a feeling of independence and autonomy, that's the right decision for that team to be able to move and choose which dev tools and platforms work best for them. When it feels like there is only one option for any kind of tool or infrastructure or access, that's when I always will start getting concerned. [20:10] So for me, when we think about centralized platforms, I think the trade-off for that is considering whether or not this is serving the community, or is this serving the platform and the product? And always taking the perspective and understanding that whenever you choose to be on a product, even if it's a free tier, it's not that you are giving nothing in return for getting everything. So in the before, like before I had this job, I think one of the jokes I used to have with my friends was, if you would like me to tear down your terms and conditions from a data perspective, I'm happy to do that for you to talk about what kind of things the data teams may be working with based on what you sign off as a user. [20:51] It's something I've been highly aware of my entire career, but I don't know if everybody else views it that way. So I also know that when I talk with folks about doing productivity studies of open-source, it makes people feel a little bit nervous. Nobody wants to be observed in a way that they are not opting into. So I try to think about the work that we're doing and where we encourage and think about transparency, not just as a cultural communal trait, but as a source of representation and census. [21:21] So when we hear or think or talk about the larger effects that open-source has in the world, who gets to be represented in that, how is their work represented in that? Your decisions around transparency and proprietary information, how is that influencing or changing the way that larger view has? How does it change the conversation? 
How does that change the global business and how investments are made? And I think that we can want to pretend that all of those analogies and realities don't exist, but the fact is that they do, and individual efforts can add up to collective and cumulative effects. [22:04] But that's when we really have to start talking as to who does it serve and why. And so I think for me, when I think about centralized platforms and whether or not that gives access, or it removes access, as long as communities are understanding that and understanding who it leaves out and who it includes, that's really the decision that I look for when I'm trying to see why and how people are choosing to be on different kinds of managed services. Richard [22:33]: I'm really enjoying this conversation and I'm really enjoying listening to you, but it's been difficult for me to formulate a question effectively, partially because a lot of the words you are using are not things that I have here on autopilot. A lot of our guests, no offense to them, they're wonderful guests, but I can just be like, cool, where is your business model coming from? How's that going? How are you making things better? And with you, the concepts that you're throwing out during the conversation are ones that I don't regularly wrestle with, using this verbiage which I find very effective. One of the things that I know we've talked about before is open-source as different types of systems, open-source X kind of a system. You mentioned earlier that you worked as a control, I don't even remember the term because I don't really know what it is, like a control engineer or something. I'm guessing that's more like low level. Amanda [23:22]: Okay. I will give you a little bit of a break, Richard, in that control systems engineer comes up on exactly zero drop-down menus, anytime I've ever had to input it. 
So I don't even know how many programs have that, but it is what's on my bachelor's degree and, to be quite fair, it's weapons and control systems engineering, because I went to the United States Naval Academy. So that's definitely not on there, but my focus while I was there was robotic systems and environmental engineering, which at the time was why are microgrids not yet feasible and how much does solar cost? So it's totally fine if that didn't register originally. Richard [24:05]: That's excellent. Thank you for explaining, what did that mean again? Amanda [24:10]: Well, okay. So the TLDR on control systems is how do you take what could be inoperable systems and actually make them work together, in a way where you can abstract away enough of the physics that you can understand where they interconnect. And for me basically it's how do I now see the world as block diagrams and feedback loops? Richard [24:29]: So how do you see open-source as block diagrams and feedback loops? What is open-source then to you? Amanda [24:34]: Okay. So I have a full list of these kinds of things and I will say I have open documents in writing that I have not yet pushed out. And Julia and I did touch on this in our RubyConf talk. So we gave a talk last year called black swans of open-source. And that's a research line we're still working on because we're so fascinated by this issue. But the way that we talk about it is, like you said, open-source is a blank system. And then it's all these different layers and lenses and views that we are looking at this system as. [25:07] And so, I think we talked about before that open-source is a complex system, which is why Vermont Complex Systems works so well, and I can go through complexity theory or drop some links into the show notes for folks who need to be able to work on that. 
But we also view the lens that open-source is a sociotechnical system, that you cannot divorce the human and social elements and constructs from the technical decisions and effects that it has. Open-source is distributed. It's cooperative. It's an economic system, and we don't talk enough about what that means and the effects that it has again on people in it and how it evolves over time. [25:40] And most recently I've also been trying to parse out in my brain the concept of open-source as a legacy system. What does that mean for an aging global system construct, while still keeping it running and then evolving it moving forward? Where are the magnetic tape mainframes of open-source that we just stick these clients and these things on top of? And then build fatter clients on top of, and then we look at it and we're like, well, everything's fine, right? [26:20] But then we start to have things like critical vulnerabilities that are deep down in these older infrastructures and it strikes us by surprise. So I think this is where the black swans area moves into, because Julia and I really try to parse apart and understand what are the analogies and assumptions that we use to describe open-source and are those valid, do they exist? Are they just constructs in our minds that we've used as either recruiting tales or onboarding tales or based on life experience, but don't really exist outside of our own time-frame. [26:56] So this is, I think, for me trying to really take a step back and understand what is not just based off of my experience, people I know, what I can see online, and this was the genesis for our open-source stories project too. So for those who don't know, Julia and I run a StoryCorps project where we are gathering stories from folks in open-source and making them visible in public. 
And the purpose of that isn't even to talk about people's journeys in open source, it's just to talk about them as humans so that we really start bringing that cultural perspective together, especially before some folks just decide they no longer want to be involved. [27:31] So these are all the different ways that, let's say, background, current work, everything kind of blends together. How are we actually thinking about this and how does the world that we all love and are a part of work and how can we describe it better so that we could better support it? Richard [27:46]: I couldn't agree more with everything that you're saying around different ways of viewing open-source. One of the main questions I have personally, and I'm going to try to phrase it in a way that's not just about Richard, is what advice would you give to someone who has these thoughts about open-source? You seem to be very good at looking at a complex system and finagling other people to pay you to work on that complex system and then being able to actually effectively get your ideas about that system out there into the world. [28:14] I'm curious for those who are doing other open-source projects, for those who want to try a different economic system in their project, who want to talk about open-source as an ethics system, who want to collaborate more effectively with other people about whether open-source is even the term they want to use anymore, et cetera, et cetera. How would you suggest that they make the world of open-source a more complex and beautiful place with what they offer? What should they do? Amanda [28:41]: First of all, call me maybe, because I love co-conspirators and people to talk to and work with. And I would say we talked earlier about how I'm not a fan of titles. Part of that is because so much of my career has been really non-linear, job titles, experiences, roles. 
And this even goes into, when I talk about thinking of representing labor and open source, I really try to avoid nouns and focus on verbs because it's less about what a person is called and more about the work that they do based on what's needed at the time or required. And so I think one of my verbs I would turn into a noun Richard is professional nerd sniper, and that's hard. [29:16] I don't want sniper in there. So it needs to be like snippet, maybe professional nerd snippet, because going back to the XKCD comic, I am very good in conversations at picking up on what brings people energy and then trying to examine in my like mind map of files, where is there a gap that I see in the world or in my projects or interests or someone else's interests and how can I help this energetic person fit with the thing that gives them energy? [29:48] So for other people, I would say that first of all, if you do have the idea that open source is a complex system, keeping in mind that then open source is not one thing. It's many interacting components and parts that interact together in multiple ways, which also tells us that there are local rules you can look at so that there's no one way to go about being in open-source, doing open-source, contributing to open-source, leading in open-source. So giving yourself, first of all, the permission to examine what is it that brings you energy and where can you put that, versus trying to follow someone else's path or pattern to what it is that they think being a leader in open-source looks like. I mean, I started being a data scientist in 2009. Nobody knew what being a data scientist would look like in 2021, 12 years ago. [30:46] So for people who are trying to examine what to do with their time, energy, talent, is really looking at, I try to view things as we're working in an emergent system. There's no map for what's happening next, especially now. 
There's so much chaos in what's happening in so many different things that we're working on that if you're trying to move things forward in a linear, like exponential scale, you will probably fail right now. But if instead you're viewing and looking at your work, your contributions, what you want to have as really kind of interacting and nudging things in a way where greater things can emerge from it, I feel like you'll get more satisfaction. [31:28] So I feel like a lot of that disconnect that folks have who view things either as a system or from a complexity point, is that they feel like they keep being shoved into these other expectations and these other expectations of time or scale or the way things work. And I would say if you draw back to the things that you really think to be true and examine that and find other people who value that you'll be much more satisfied. Richard [31:53]: I know you're a huge fan of DEI work in open source. A lot of what you said strikes me as very easy to accomplish if you're privileged, not saying that was intentional about what you said, I'm just saying that's how it struck me. And one of the things I'm curious about is, how would you ask people who are less privileged in open-source to be able to have the ability to do that and to share that energy and to open those doors. What would you suggest for people working open-source who don't have a huge amount of privilege and may find it harder to laterally? Amanda [32:23]: So, first of all, I do want to say, I think working in open-source isn't always going to be recognized as a centralized platform contribution profile. So when we're trying to say who and how do we actually recognize that work, please do not use that as the measurement for your own contributions, which is why I talk a lot about how some of my main contributions in open-source have been making pies for people because it makes me happy and it makes them happy. And that just makes general community good. 
[32:48] One of the questions I have is when we are looking at understanding what is best and what's next and needed in open-source, I am concerned that we have an increasingly WEIRD bias. And so WEIRD in that case would be categorized as Western, educated, industrialized, rich and democratic. I mean, it's something I'm aware of, I talk to people about, and am cognizant of: when we are trying to understand the future, are we increasing that or are we decreasing that? [33:15] And for me that means a lot more connection, outreach and learning from people who don't grow up or contribute or form communities that look like that. And I'll say, I have a ton of work to do there. And I'm very excited to meet more folks who create community, contribute to technology, who don't fit that profile, and learning more about what engages them, what keeps them there and what challenges they face, because we know what challenges some folks face. We know that some folks work at technology companies and are extremely talented and rich, but none of their work ever shows up in a public place. And then when they get home, they have other things that they have to do and they will never have anything that's in a public place, but it doesn't make them any less of a contributor in the world. [34:02] Or maybe even a contributor towards asking questions and clarifications and making documentation improved in a way that their name will never show up. But I do think the centralized idea of finding and connecting with community is universal, and ensuring that everyone has access to information and communication networks is a human right. And so making sure that people all have access to global communication regardless of where they live, and the devices that allow them that communication, is something we should all be concerned with, and we should make sure that we are doing it in a way that increases equity and not in a way that actually separates us even more. Ben [34:39]: I love this conversation. 
There have been so many touch points for me that I'm just massively interested in, and to be honest, a little bit obsessed by. And I think there is a moment, an intersection here between kind of a philosophical kind of view of open-source. You kind of get to decide whether it is about the people or it's about the code, which for me is kind of like the discussions that you sometimes hear about market economics: is demand and supply actually decided by the demand side or by the supply side, because the supply side creates the demand side? [35:14] I was wondering with that in mind, and talking about the privilege that people have at the moment to be able to use their free time to contribute to open-source software versus those that necessarily don't, what are your thoughts on kind of emerging ways of being able to support projects financially and things that we can do to support that, to bring the next generation from the developing world, from the global [35:38 inaudible], from however you want to kind of refer to the parts of the world where people just haven't really had the opportunity to benefit yet. Amanda [35:45]: So I think one of the best things we can think about doing is technology companies can start building more offices in places that are not the United States and Europe and certain countries in Asia. So encouraging not just offshore or remote jobs. And I know that the idea of offices right now still feels like perhaps a scary thing. But the reason I bring that up is because very concretely that also changes tax structures and incentives and benefits for companies. [36:11] So there's a big difference between being able to hire someone as a contractor, which is fine. That's sometimes the job structure that some people want, but that's a very different benefit structure for other people than sometimes being a full-time employee. So when I think about equity, one of the first things I started thinking about is where are you investing in offices? 
Where are you investing in incorporating your company? Where are you investing in hiring people from? And the very clear economics of link communities in those countries and countries that are not places that other companies do business is that sometimes it can be very challenging, as you well know, to get money transferred across borders. [36:47] And in a way where it respects regulatory requirements and actually understands all of those tax incentives. So sometimes one of the hard problems in open-source is getting resources to the groups. If you have resources and someone else needs them, moving the thing you have to the thing in need can be very challenging because we only have so many systems that are set up to be able to do that. And being able to do that at scale is an entirely different problem. So when I start thinking about growing places, first of all, I do think about also asking the people who are already there and who are already creating those groups and those challenges. [37:25] So I really have learned a lot and I absolutely love working with the folks from Open Source Community Africa, and also Python Africa and Python Ghana are some really interesting groups. Python Ghana is interesting for me because it is a countrywide Python community. It's both distributed and centralized in a way that seems to be working well for folks that they work with. And it incorporates a lot of other kinds of groups. Open Source Community Africa, I had a chance to go to their open-source festival right before the shutdown in 2020. [37:56] And I think they were expecting like a few hundred people. And by the final day it was over a thousand. I mean, it was tons of students and people brought together and it was absolutely wonderful. 
When I think also too, about another thing I'm working on now, I would love to improve documentation, transparency and reporting around sponsorships for open-source, just making it more clear what organizations need in a way that is discoverable, accessible and able to be found by groups. [38:30] I would love the people who have resources to give to cast wider nets and have better places to be able to connect with those they depend on, and in return, I would love transparency reporting for those sponsorships and the impacts of those sponsorships to be accessible in ways that, when we see organizations or foundations or very small projects be recipients of sponsors, give them the support and the tools they need to be able to show what impact that had, also for holding each other more accountable. There's a lot of money moving around in these ecosystems. And the question that I constantly have is, are those the right places it should be moving? Richard [39:15]: I think that's probably a really good place to wrap up because it was just so succinct and perfect. So thank you so much, Amanda. For people who want to get in touch with you on the internet to learn more about how they can collaborate and get these things done with your help, if you're available, where can they find you online? Amanda [39:30]: Twitter is the best place to contact me, which I know is a closed platform, but it's the easiest way for me to go through all of the direct contact. If you're curious about the open-source stories project, we are on GitHub, but we also have a website with links to be able to contact there as well. Richard [39:49]: Thank you so much. And Twitter will also be in the show notes for those of you who want to reach her on Twitter. Amanda, this has been excellent, but don't go yet. This is the part of the show where we talk about people, projects or things which we think we should shed light on and/or that need more love. That's right. 
It's spotlight, Eric Barry, what is your spotlight today? Eric [40:11]: First I got to say, I'm just overwhelmed on how amazing the show has been. So thank you, Amanda. Absolutely incredible podcast episode. I'm a big fan boy. So what I'd like to spotlight is actually an article you had submitted on open-source ecosystems, which need equitable credit across all of the contributions and stuff. I read through that, it was just really fascinating. I recommend anybody to read it. The link will be in the show notes. Richard [40:35]: Thank you so much. Excellent. Ben Nichols. Ben [40:38]: This is incredibly timely. So excuse me if it doesn't age too well, but I just wanted to give a big shout out to Jess Sax and the maintainers of [inaudible] JS that have picked up the project and are kind of providing a huge value to the community that depend on that project. We've been working with them over the course of the last week and the way that they have acted to try to kind of set things up in the best interests of all of the users, all of the kind of contributors, the previous maintainers and everything. Like it's just, it's been great to work with them. So I just wanted to kind of call out Jess specifically, but all of the new maintainers of [inaudible] JS. Richard [41:18]: Awesome. Thank you. In a left turn, I'm going to just give a shout out to Red Hen baking. If you're in Vermont and you want to go to a really nice bakery, there's a place in Middlesex, which is really nice. It's called Red Hen. If you don't have a local baker, I'd suggest looking around because if you're in the United States, there's probably a bakery near you somewhere that makes really good bread. This is mine. So Red Hen baking is excellent. Really like their mad river loaf, highly suggest. Amanda, what is your spotlight today? Amanda [41:47]: Yeah. So for those who don't know, I'm also a complete library and book nerd. And so I get really excited about the open-access projects and books. 
And so my recommendation, I couldn't narrow it down. So I'm going to say my recommendations today. I love the Data Feminism book that came out in 2020. It is available via open access. I recently found a project called the Data-Sitters Club, which attracted me because I found it through the Executable Book Project, which is a whole community around Jupyter Book, open access, and computational publishing. [42:16] The Data-Sitters Club is this group of people who are helping to explain computational text analysis and open data using open access, open data, and actually exploring fair use. And it is completely fair use of The Baby-Sitters Club that I grew up with. And I absolutely adore the way that they've adopted that. They have a lovely set of public health posters for the pandemic that they created in 2020 that still bring me joy to read. Richard [42:46]: Love it. Awesome, Amanda, thank you. Once again, it was great having you on, look forward to talking to you further in the future and best of luck with everything. Thanks. Amanda [42:55]: Thank you. This is great. Special Guest: Amanda Casari.

Python Podcast
FastAPI

Feb 14, 2022, 87:43


Dominik and Jochen talk about FastAPI. FastAPI is still a very young but nonetheless fairly widespread web framework for Python, designed to make better use of Python's more modern language features, such as type annotations and async support, than more traditional web frameworks like Django or Flask. Show notes: Our email for questions, suggestions & comments: hallo@python-podcast.de News from the scene: PEP 665 -- A file format to list Python dependencies for reproducibility of an application | Brett Cannon CPython on WASM At long last, Black is no longer a beta product! | Stability Policy Django is now also formatted with black, as announced in DEP 8 PyTest 7.0 release HATEOAS - An Alternative Explanation The future of editing in Wagtail Prototype Fund EdgeDB 1.0 Release | asyncpg -- A fast PostgreSQL Database Client Library for Python/asyncio | uvloop is a fast, drop-in replacement of the built-in asyncio event loop. uvloop is implemented in Cython and uses libuv under the hood. Twitter: My dental hygienist: "Are you flossing regularly?" Me: "Do you backup your laptop and photos regularly?" Laravel Livewire with Christoph Rumpel | Alpine.Js | Caleb Porzio

Talk Python To Me - Python conversations for passionate developers

SQLAlchemy is the most widely used ORM (Object Relational Mapper) for Python developers. It's been around since February 2006. But we might be in for the most significant release since the first one: SQLAlchemy 2.0. This version adds async and await support, new context-manager friendly features everywhere, and even a unified query syntax. Mike Bayer is back to give us a glimpse of what's coming and why Python's database story is getting stronger. Links from the show SQLAlchemy: sqlalchemy.org Mike on Twitter: @zzzeek Migrating to SQLAlchemy 2.0: sqlalchemy.org awesome-sqlalchemy: github.com sqlalchemy-continuum versioning: readthedocs.io enum support: github.com alembic: sqlalchemy.org GeoAlchemy: geoalchemy.org sqltap profiling: github.com nplusone: github.com Unit of work: duckduckgo.com ORM + Dataclasses: sqlalchemy.org SQLModel: sqlmodel.tiangolo.com Cython example: cython.org Async SQLAlchemy example: sqlalchemy.org ORM Usages Stats (see ORM section): jetbrains.com Watch this episode on YouTube: youtube.com Episode transcripts: talkpython.fm --- Stay in touch with us --- Subscribe on YouTube: youtube.com Follow Talk Python on Twitter: @talkpython Follow Michael on Twitter: @mkennedy Sponsors TopTal Talk Python Training AssemblyAI

Python Bytes
#255 Closember eve, the cure for Hacktoberfest?

Oct 20, 2021, 46:49


Watch the live stream: Watch on YouTube
About the show: Sponsored by us. Check out the courses over at Talk Python, and Brian's book too! Special guest: Will McGugan
Michael #1: Wrapping C++ with Cython, by Anton Zhdan-Pushkin. A small series showcasing the implementation of a Cython wrapper over a C++ library. C++ library: yaacrl (Yet Another Audio Recognition Library) is a small Shazam-like library that can recognize songs from a small recorded fragment. For Cython to consume yaacrl correctly, we need to "teach" it about the API using `cdef extern`. It is convenient to put such declarations in *.pxd files. One of the first features of Cython that I find extremely useful is aliasing. With aliasing, we can use names like Storage or Fingerprint for Python classes without shadowing the original C++ classes. Implementing a wrapper: pyaacrl. The most common way to wrap a C++ class is to use extension types. As an extension type is just a C struct, it can have an underlying C++ class as a field and act as a proxy to it. The Cython documentation has a whole page dedicated to the pitfalls of "Using C++ in Cython." Distribution is hard, but there is a tool designed specifically for such needs: scikit-build. PyBind11 too.
Brian #2: tbump: bump software releases, suggested by Sephi Berry. Limits the manual process of updating a project version. tbump init 1.2.2 initializes a tbump.toml file with customizable settings; --pyproject will append to pyproject.toml instead. tbump 1.2.3 will: patch files wherever the version is listed; (optional) run configured commands before commit, where failing commands stop the bump; commit the changes with a configurable message; add a version tag; push code; push tag; (optional) run post-publish commands. It tells you what it's going to do before it does it (you can opt out of this check). Pretty much everything is customizable and configurable. I tried this on a flit-based project. It only required one change:
# For each file to patch, add a [[file]] config
# section containing the path of the file, relative to the
# tbump.toml location.
[[file]]
src = "pytest_srcpaths.py"
search = '__version__ = "{current_version}"'
Cool example of a pre-commit check:
# [[before_commit]]
# name = "check changelog"
# cmd = "grep -q {new_version} Changelog.rst"
Will #3: Closember by Matthias Bussonnier
Michael #4: scikit learn goes 1.0, via Brian Skinn. The library has been stable for quite some time, releasing version 1.0 is recognizing that and signalling it to our users. Features: Keyword and positional arguments - To improve the readability of code written based on scikit-learn, users now have to provide most parameters by name, as keyword arguments, instead of positional arguments. Spline Transformers - One way to add nonlinear terms to a dataset's feature set is to generate spline basis functions for continuous/numerical features with the new SplineTransformer. Quantile Regressor - Quantile regression estimates the median or other quantiles of Y conditional on X. Feature Names Support - When an estimator is passed a pandas dataframe during fit, the estimator will set a feature_names_in_ attribute containing the feature names. A more flexible plotting API. Online One-Class SVM. Histogram-based Gradient Boosting Models are now stable. Better docs.
Brian #5: Using devpi as an offline PyPI cache, by Jason R. Coombs. This is the devpi tutorial I've been waiting for. A single-machine local server mirror of PyPI (the mirror needs priming), usable in offline mode.
$ pipx install devpi-server
$ devpi-init
$ devpi-server
Now, in another window, prime the cache by grabbing whatever you need, with the index redirected:
(venv) $ export PIP_INDEX_URL=http://localhost:3141/root/pypi/
(venv) $ pip install pytest, ...
Then you can restart the server anytime, or even offline:
$ devpi-server --offline
The tutorial includes examples, proving how simple this is.
Will #6: PyPi command line
Extras. Brian: I've started using pyenv on my Mac just for downloading Python versions. Verdict still out on whether I like it better than just downloading from python.org. Also started using Starship with no customizations so far. I'd like to hear from people if they have nice Starship customizations I should try. vscode.dev is a thing, announcement just today. Michael: PyCascades Call for Proposals is currently open. Got your M1 Max? Prediction: Tools like Crossover for Windows apps will become more of a thing. Will: GIL removal https://docs.google.com/document/u/0/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/mobilebasic?urp=gmail_link https://lwn.net/SubscriberLink/872869/0e62bba2db51ec7a/ vscode.dev
Joke: The torture never stops. IE ("Safari"). Eating Glue.
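The file-patching step Brian describes for tbump can be pictured with a minimal Python sketch. This is not tbump's code: it only mimics the idea behind the search template in the tbump.toml snippet, and the helper name patch_version and the sample file content are assumptions made for illustration.

```python
def patch_version(text: str, search: str, current: str, new: str) -> str:
    """Swap the version inside the fragment described by a search template.

    `search` is a template such as '__version__ = "{current_version}"',
    in the spirit of the tbump.toml example. Hypothetical helper, not tbump's API.
    """
    old_fragment = search.format(current_version=current)
    new_fragment = search.format(current_version=new)
    if old_fragment not in text:
        # Refuse to guess: the file does not contain the expected version string.
        raise ValueError(f"pattern not found: {old_fragment!r}")
    return text.replace(old_fragment, new_fragment)


# Made-up file content standing in for pytest_srcpaths.py.
source = '"""pytest plugin."""\n__version__ = "1.2.2"\n'
patched = patch_version(source, '__version__ = "{current_version}"', "1.2.2", "1.2.3")
print(patched)
```

Failing loudly when the pattern is missing matches the spirit of the notes, where a failing step stops the bump instead of leaving a half-updated release.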

Python en español
Python en español #21: Tertulia 2021-02-23

May 27, 2021, 122:37


Exception groups (PEP 654), PyPI, and even bitcoins and blockchains (without the hype!) https://podcast.jcea.es/python/21 This audio has a lot of noise caused by Jesús Cea's microphone rubbing against his clothes. Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Javier, connecting from Madrid. Miguel Sánchez, email: msanchez@uninet.edu, connecting from the Canary Islands. Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Audio edited by Pablo Gómez, twitter: @julebek. The intro and outro music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License. [00:53] The usual legal notice. We have a volunteer to edit! [02:23] We now meet in the "py2021" room instead of the "py2020" room. [03:13] Are strings immutable? Strings are immutable, but id() values are reused when objects are freed. An explanation of what id() is. It is not a persistent identity: it depends on the object's memory address, and memory is reused when objects are freed. [07:23] Is there a way to mutate a string? Not from the language itself, but from C and... Don't go there! Python 3.9.2 fixes a buffer overflow. [10:03] Exception groups: PEP 654 https://www.python.org/dev/peps/pep-0654/. Python Ideas mailing list: https://mail.python.org/mailman3/lists/python-ideas.python.org/. If you use the feature, the syntax and semantics of exceptions change. As happened with async and await, if any of the new packages uses this feature, it will contaminate your own code. [14:28] Testing a library on different versions of Python. Having several Python versions installed.
make altinstall is your friend for installing several different versions of Python side by side on the operating system. The difference between calling python3 and calling python3.6. Test matrix. Docker: https://es.wikipedia.org/wiki/Docker_(software). Flake8: https://pypi.org/project/flake8/. [22:43] Libraries and syntax changes in Python as the interpreter evolves. If Python 3 code runs on Python 2... was the code really Python 3? Projects with poorly specified compatibility. [25:53] Testing on several versions: Tox: https://pypi.org/project/tox/. pyenv: https://pypi.org/project/pyenv/. Pylint: https://pypi.org/project/pylint/. [27:53] Black: https://pypi.org/project/black/. A "nazi" code reformatter. No more style arguments. Can a "nazi" formatter be configurable? Isn't that an oxymoron? [32:28] Running tests and checks when code is committed to version control: gitlint: https://jorisroovers.com/gitlint/. vim-autopep8: https://vim-autopep8.readthedocs.io/en/latest/. [34:53] Recent PEPs with syntax changes: exception groups and pattern matching: Exception groups: PEP 654 -- Exception Groups and except* https://www.python.org/dev/peps/pep-0654/. The semantics change considerably and "contaminate" the code the way async/await does. PEP 622 -- Structural Pattern Matching https://www.python.org/dev/peps/pep-0622/. PEP 634 -- Structural Pattern Matching: Specification https://www.python.org/dev/peps/pep-0634/. PEP 635 -- Structural Pattern Matching: Motivation and Rationale https://www.python.org/dev/peps/pep-0635/. PEP 636 -- Structural Pattern Matching: Tutorial https://www.python.org/dev/peps/pep-0636/. [40:28] Trio https://pypi.org/project/trio/: asynchronous programming done better than with asyncio https://docs.python.org/3/library/asyncio.html. The "nursery" concept in Trio: https://trio.readthedocs.io/en/stable/reference-core.html#trio.Nursery. [44:23] Python has turned 30.
The first public version was 0.9.1, in 1991. Happy birthday, Python, you're 30 years old this week: Easy to learn, and the right tool at the right time https://www.theregister.com/2021/02/20/python_at_30/. Building it on modern operating systems: https://github.com/smontanaro/python-0.9.1. [45:13] Comparing different types. In Python 2 they could be mixed, but not in Python 3. Writing custom comparison functions. Defining custom types that know how to compare themselves with each other. Problems migrating a persistence system from Python 2 to Python 3. BTree: https://es.wikipedia.org/wiki/%C3%81rbol-B. [52:33] Why is Pillow https://pypi.org/project/Pillow/ still imported as import PIL, the library it replaced aeons ago? Confusing. Similar cases (there are many more): python-dateutil https://pypi.org/project/python-dateutil/. Beautiful Soup: https://pypi.org/project/beautifulsoup4/. dnspython https://pypi.org/project/dnspython/. [59:18] Security on PyPI https://pypi.org/. [01:00:48] Is the PyPI https://pypi.org/ search engine good for anything? The relevance ordering is a joke. [01:02:18] Download statistics on PyPI https://pypi.org/: There used to be download counters. Vanity: https://pypi.org/project/vanity/. Now we have (this depends on Google): PyPI Download Stats https://pypistats.org/. pypinfo https://pypi.org/project/pypinfo/. [01:09:48] Services Google has killed: https://killedbygoogle.com/. 229 services so far. [01:10:23] Jesús and his stance on free services that grow at the expense of their users' work. On top of that, you depend on them and they burn the market for commercial services. The resignation and passivity of users. [01:13:28] Jesús's idea: we are living in the dark age of computing. In 50 years it will not be possible to access the information being generated right now. For example: networked video games that need servers. https://archive.org/. GeoCities https://es.wikipedia.org/wiki/GeoCities.
Tumblr https://es.wikipedia.org/wiki/Tumblr. [01:16:43] PyPI https://pypi.org/ mirrors? Right now there is no verification of digital signatures. Package signing & detection/verification: https://github.com/pypa/warehouse/milestone/16. A distributed network over IPFS https://es.wikipedia.org/wiki/Sistema_de_archivos_interplanetario or BitTorrent https://es.wikipedia.org/wiki/BitTorrent. Jesús's ideas for Python España: a distributed network of photos from the PyConES conferences http://www.es.pycon.org/. [01:21:13] Building services on top of PyPI https://pypi.org/. PyPI provides RSS https://es.wikipedia.org/wiki/Rss. PyPI recent updates https://pypi.org/rss/updates.xml. PyPI newest packages https://pypi.org/rss/packages.xml. [01:24:43] GitHub: Security vulnerability alerts for Python https://github.blog/2018-07-12-security-vulnerability-alerts-for-python/. [01:25:13] Building binary packages for Windows. [01:26:48] Cython https://pypi.org/project/Cython/ and mypyc https://github.com/mypyc/mypyc. [01:28:33] Sometimes you care about the code more than the library's owner does. Collaboration dynamics in open-source projects. Python core developer: Mariatta Wijaya - What is a Python Core Developer? https://www.youtube.com/watch?v=hhj7eb6TrtI. The importance of feedback. [01:35:43] Kodi https://es.wikipedia.org/wiki/Kodi and the project's dynamics: data compression in WebDAV. Having the database management be in Python. Lowering the barrier to entry to the project. [01:39:21] What microphones do we have for recording? How do we talk? [01:45:08] Digression on Bitcoin https://es.wikipedia.org/wiki/Bitcoin and the importance of backing up your wallet. Blockchain https://es.wikipedia.org/wiki/Cadena_de_bloques. Bitcoin develops very interesting ideas. Proof of work: https://es.wikipedia.org/wiki/Sistema_de_prueba_de_trabajo. Cypherpunk: https://en.wikipedia.org/wiki/Cypherpunk. Smart contract: https://es.wikipedia.org/wiki/Contrato_inteligente.
Open data: https://es.wikipedia.org/wiki/Datos_abiertos. [01:55:23] Careful, everything is recorded for posterity. We are having a bar conversation. Keep in mind that these opinions are bar opinions, with the weight of a bar opinion. [01:58:03] Assign homework? Nobody has time... [01:58:58] The motivation for doing all this. Getting interests to overlap is hard. [02:00:23] Farewell. [02:01:45] End.
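The [03:13] topic in this episode, string immutability and id(), can be checked with a short Python sketch. It is illustrative only: the variable names are assumptions, and id() reuse is a CPython implementation detail that may or may not show up on a given run, so the code reports it without relying on it.

```python
text = "python"

# Strings are immutable: item assignment raises TypeError.
try:
    text[0] = "P"
    mutated = True
except TypeError:
    mutated = False

# "Changing" a string really builds a new object and rebinds the name.
original = text
text = text.capitalize()
rebound = text is not original

# id() is only unique among objects that are alive at the same time. Once an
# object is freed, a later object may receive the same id (in CPython, the
# same memory address).
first_id = id(object())   # the temporary object is freed right after this line
second_id = id(object())

print("mutated in place:", mutated)
print("name rebound to a new object:", rebound)
print("id reused (may vary between runs):", first_id == second_id)
```

The last line is printed, not asserted, because nothing in the language guarantees the reuse. That is the point made in the notes: id() is not a persistent identity.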

Python en español
Python en español #19: Tertulia 2021-02-09

May 20, 2021, 126:40


Can you use different versions of the same library in one project? (summary: don't go there!). Multiversion Concurrency Control https://podcast.jcea.es/python/19 Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Javier, connecting from Madrid. Miguel Sánchez, email: msanchez@uninet.edu, connecting from the Canary Islands. Audio edited by Pablo Gómez, twitter: @julebek. The intro and outro music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License. [00:52] Prologue: FOSDEM https://fosdem.org/. Listening to previous recordings to spot mistakes and discuss "errata". [03:07] We have a volunteer to edit the podcast! Details of how we record the roundtables. Everything is recorded on a single track :-(. RNNoise: https://people.xiph.org/~jm/demo/rnnoise/. Ideas for automating the process. [09:17] Legal notice that the audio is being recorded and will be published online. The published audio will have a text index, so that it can be searched and listeners can move easily between topics. [11:22] Erratum from the previous roundtable: no, pipenv https://pypi.org/project/pipenv/ cannot install two different versions of the same library. [13:07] Brainstorming https://es.wikipedia.org/wiki/Lluvia_de_ideas on how to use different versions of the same library in the same project. Conclusion: don't go there! Libraries are loaded only once per program, even if the code contains many import statements. sys.modules https://docs.python.org/3/library/sys.html#sys.modules. Transitive dependencies. Python subinterpreters. PEP 554: https://www.python.org/dev/peps/pep-0554/. C modules: PEP 489 -- Multi-phase extension module initialization https://www.python.org/dev/peps/pep-0489/. [22:17] Python 3.10a5.
PEP 636 -- Structural Pattern Matching: Tutorial https://www.python.org/dev/peps/pep-0636/. More new syntax! PEP 617 -- New PEG parser for CPython https://www.python.org/dev/peps/pep-0617/. [23:57] Nuitka https://nuitka.net/. It can generate a binary that does not depend on having anything installed. [26:02] Back to "Structural Pattern Matching" https://www.python.org/dev/peps/pep-0636/. "Switch" on steroids. [27:32] How important modernizing the tutorials and examples was in helping the migration from Python 2 to Python 3. PEP 414 -- Explicit Unicode Literal for Python 3.3 https://www.python.org/dev/peps/pep-0414/. Jesús Cea believes the migration from Python 2 to Python 3 was handled badly and has been very traumatic. [30:22] PEP 8 https://www.python.org/dev/peps/pep-0008/. Stick strictly to 80 columns? Flake8: https://pypi.org/project/flake8/. [33:22] Be very careful with "python-ideas" https://mail.python.org/mailman3/lists/python-ideas.python.org/. Code indentation. Type annotations may or may not be to your taste, but for now they are optional. Recurring topic: what does it mean to be Pythonic? [35:12] Advantages of annotating types. Origin of mypy: http://mypy-lang.org/. Giving information to the IDE https://en.wikipedia.org/wiki/Integrated_development_environment. Value when documenting the types in an API https://en.wikipedia.org/wiki/API. [39:52] Cryptography https://cryptography.io/en/latest/ and the controversy over integrating modules written in Rust https://en.wikipedia.org/wiki/Rust_(programming_language). Toxic community. [41:27] Digression on systemd https://en.wikipedia.org/wiki/Systemd and other systems matters. Change for change's sake? [45:07] The weight of the web is moving back to the backend. What options does Python have in this area? The web client only sends events to the server and receives DOM https://es.wikipedia.org/wiki/Document_Object_Model changes sent by the server. This opens up the possibility of forgetting about JavaScript: https://es.wikipedia.org/wiki/JavaScript.
ItsNat: https://en.wikipedia.org/wiki/ItsNat. [51:02] splash https://pypi.org/project/splash/. A JavaScript rendering service in Python. AJAX: https://es.wikipedia.org/wiki/AJAX. [56:07] Embedding Python in other programs and daemons. Lua: https://es.wikipedia.org/wiki/Lua. [57:07] PyOxidizer https://pyoxidizer.readthedocs.io/en/stable/ and PyO3 https://pyo3.rs/. Interacting with other languages. Python on Java, interoperating painlessly: Jython https://www.jython.org/. [59:52] How did we get started with Python? The value of Python as an easy-to-understand language and as pseudocode. SpamBayes: http://spambayes.sourceforge.net/. Python tutorial: https://docs.python.org/es/3/tutorial/index.html. bc -l https://linux.die.net/man/1/bc. [01:05:07] Atomic file modification. On Unix the usual approach is: write + flush + rename. rename: https://www.man7.org/linux/man-pages/man2/rename.2.html. On MS Windows that does not work. Python 3.3 added os.replace() https://docs.python.org/3.8/library/os.html#os.replace. On MS Windows it is atomic... almost always: Issue8828: Atomic function to rename a file https://bugs.python.org/issue8828. [01:10:02] Combining fork and threads in Python is a recipe for disaster. fork: https://www.man7.org/linux/man-pages/man2/fork.2.html. multiprocessing: https://docs.python.org/3/library/multiprocessing.html. [01:11:37] The @overload decorator https://docs.python.org/3/library/typing.html#typing.overload. @functools.singledispatch https://docs.python.org/3/library/functools.html. What do you see when an exception is raised? Specializations. Cython https://cython.org/. [01:17:00] AnyIO https://anyio.readthedocs.io/en/stable/basics.html. Unifying asynchronous reactors. [01:18:12] "lxml supports XPath". Thread on the mailing list: "[Python-es] Biblioteca XPATH" https://mail.python.org/pipermail/python-es/2021-February/037931.html. lxml: https://lxml.de/. beautifulsoup4: https://pypi.org/project/beautifulsoup4/. XPath: https://es.wikipedia.org/wiki/XPath.
Scrapy: https://scrapy.org/. The PyPI https://pypi.org/ search engine does a terrible job of ordering by relevance. [01:20:02] The value of studying other people's source code, not only to learn from it but also to discover which useful libraries they use and add them to your toolbox. It is the ultimate documentation. Tests are very useful for learning how the product is used. [01:22:02] How do you handle pagination when the backend data changes? How do you avoid repeating results or skipping data? Brainstorming various strategies. Berkeley DB: https://pypi.org/project/berkeleydb/. lmdb: https://pypi.org/project/lmdb/. Multiversion concurrency control: https://es.wikipedia.org/wiki/Multiversion_concurrency_control. Copy on Write: https://es.wikipedia.org/wiki/Copy_on_write. Snapshot: https://es.wikipedia.org/wiki/Copia_instant%C3%A1nea_de_volumen. BTree: https://es.wikipedia.org/wiki/%C3%81rbol-B. PostgreSQL: https://www.postgresql.org/. ZFS: https://es.wikipedia.org/wiki/ZFS_(sistema_de_archivos). Normalization and normal forms: https://es.wikipedia.org/wiki/Forma_normal_(base_de_datos). [01:48:42] FOSDEM https://fosdem.org/: Virako recommends the following talks: Some SQL Tricks of an Application DBA - Non-trivial tips for database development https://fosdem.org/2021/schedule/event/postgresql_some_sql_tricks_of_an_application_dba/. Database Disasters and How to Find Them https://fosdem.org/2021/schedule/event/postgresql_database_disasters_and_how_to_find_them/. Practical advice for using Mypy - Hidden gems in the typing system! https://fosdem.org/2021/schedule/event/python_mypy/. Escaping the Cargo Cult - How to structure your project without losing your mind. https://fosdem.org/2021/schedule/event/python_escaping_cargo_cult/. [01:52:02] Python Madrid talk https://www.python-madrid.es/. TDD - ¿panacea del desarrollo o pérdida de tiempo? (TDD: development panacea or waste of time?) https://www.python-madrid.es/meetings/reunion-febrero-2021-python-madrid/.
[01:54:27] Discussed in last week's roundtable: bugs involving "pickle" https://docs.python.org/3/library/pickle.html in the __main__ module. This is a known problem. Code example: https://pastebin.com/vGM1sh8r. Issue24676: Error in pickle using cProfile https://bugs.python.org/issue24676. Issue9914: trace/profile conflict with the use of sys.modules[__name__] https://bugs.python.org/issue9914. Issue9325: Add an option to pdb/trace/profile to run library module as a script https://bugs.python.org/issue9325. [02:00:42] Being told what does not work is much more interesting. Postmortem. [02:02:52] Whoosh: https://whoosh.readthedocs.io/en/latest/intro.html. How to normalize words for Spanish? The "real" word whoosh: https://www.wordreference.com/es/translation.asp?tranword=whoosh. Difficulties finding the Python project Whoosh https://whoosh.readthedocs.io/en/latest/intro.html on the internet. [02:05:48] End.
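The write + flush + rename recipe from [01:05:07] in this episode can be sketched in Python with os.replace(). This is a minimal illustration, not a hardened implementation: the helper name atomic_write_text is an assumption, the os.fsync() call is an extra step beyond what the notes list, and the temporary file is created next to the destination on the assumption that both must live on the same filesystem for the rename to be atomic.

```python
import os
import tempfile


def atomic_write_text(path: str, data: str) -> None:
    """Write `data` to `path` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target (same filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
            handle.flush()              # push Python's buffer to the OS
            os.fsync(handle.fileno())   # ask the OS to write it to disk
        os.replace(tmp_path, path)      # rename over the destination
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "config.txt")
    atomic_write_text(target, "version 1\n")
    atomic_write_text(target, "version 2\n")
    with open(target, encoding="utf-8") as handle:
        result = handle.read()
    leftovers = [name for name in os.listdir(workdir) if name != "config.txt"]

print(repr(result), leftovers)
```

os.replace() is used instead of os.rename() because, as the notes say, a plain rename over an existing file does not work on MS Windows.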

Python en español
Python en español #16: Tertulia 2021-01-19

May 13, 2021, 143:23


Controversy: frameworks, just-in-time compilation, Python compilers and performance, web scraping, and persistence strikes back https://podcast.jcea.es/python/16 Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Javier, connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Dani, connecting from Málaga, invited by Virako. Javier, connecting from Sevilla, also invited by Virako. Antonio, connecting from Albacete. Jorge Rúa, connecting from Vigo. Audio edited by Pablo Gómez, twitter: @julebek. The intro and outro music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License. [01:17] Event sourcing and snow. Storm Filomena: https://es.wikipedia.org/wiki/Borrasca_Filomena. [03:52] The usual legal remarks so the roundtable can be recorded. [04:47] Various introductions, and the format and motivation of the roundtables. [11:22] Jesús Cea's logistical problems with his talks. [12:52] Debate: frameworks and how they shape your knowledge of the language and the way you develop code. A lot to unpack. [30:22] Connection with the asyncio world. [34:12] Digression: how does CSRF protection work? https://es.wikipedia.org/wiki/Cross-site_request_forgery. Semantic difference between HTTP verbs: GET and POST https://en.wikipedia.org/wiki/POST_(HTTP). Some web security resources (not exhaustive; the list is endless): CSRF: https://es.wikipedia.org/wiki/Cross-site_request_forgery. Cross-Origin Resource Sharing (CORS) https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS. Content Security Policy Reference https://content-security-policy.com/.
The FastAPI documentation https://fastapi.tiangolo.com/ has a lot on security: CORS (Cross-Origin Resource Sharing): https://fastapi.tiangolo.com/tutorial/cors/. OAuth2 with Password (and hashing), Bearer with JWT tokens https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/. About HTTPS https://fastapi.tiangolo.com/deployment/https/. [39:52] The ItsNat project https://en.wikipedia.org/wiki/ItsNat. State lives on the server, and the client only handles events and the DOM updates the server sends it. Intelligence is moving once again from the browser to the server. [44:42] Is it really essential to use JavaScript if your interface is the browser? Brython: https://brython.info/. Pyjs (formerly Pyjamas): https://en.wikipedia.org/wiki/Pyjs. Emscripten: https://emscripten.org/. [48:57] Just-in-time compilation! Dictionary versioning. PEP 509 Add a private version to dict: https://www.python.org/dev/peps/pep-0509/. Just-in-time compilation: Pyjion: https://pyjion.readthedocs.io/en/latest/index.html. Conflict with the interpreter's portability. numba: https://numba.pydata.org/. There are few core developers, and inheriting advanced code that then has to be maintained is a problem. LLVM: https://en.wikipedia.org/wiki/LLVM. [01:04:27] Programming languages must be conservative because you have no idea what programmers are using. [01:05:32] If the documentation has been updated, you had better have updated your code to "how things are done now". [01:06:47] Recurring topic: is it better to be inside or outside the standard library? Boost: https://www.boost.org/. [01:09:12] Python compilers: Cython: https://cython.org/. Performance and obfuscation. nuitka: https://nuitka.net/. numba: https://numba.pydata.org/. PyPy: https://www.pypy.org/. [01:10:42] Recent improvements in the Python implementation: Issue 26647: ceval: use Wordcode, 16-bit bytecode: https://bugs.python.org/issue26647.
Issue 9203: Use computed gotos by default: https://bugs.python.org/issue9203. [01:14:52] Psyco https://en.wikipedia.org/wiki/Psyco. [01:16:22] Type annotations to help JITs. Cython: https://cython.org/. mypy: http://mypy-lang.org/. mypyc: https://mypyc.readthedocs.io/en/latest/index.html. Specialization. [01:22:37] GHC (The Glasgow Haskell Compiler): https://www.haskell.org/ghc/. [01:25:07] Transactional memory https://en.wikipedia.org/wiki/Transactional_memory. Implementations in Python: persistence systems such as Durus https://www.mems-exchange.org/software/DurusWorks/ or ZODB http://www.zodb.org/. Conflict resolution mechanisms. [01:34:32] More on optimizations and guards. A lot of discussion about the GIL: https://en.wikipedia.org/wiki/Global_interpreter_lock. The atomicity of operations is not documented anywhere. [01:42:02] Bytecode example:
>>> import dis
>>> def rutina(n):
...     n += 1
...     n = n + 1
...
>>> dis.dis(rutina)
  2           0 LOAD_FAST                0 (n)
              2 LOAD_CONST               1 (1)
              4 INPLACE_ADD
              6 STORE_FAST               0 (n)

  3           8 LOAD_FAST                0 (n)
             10 LOAD_CONST               1 (1)
             12 BINARY_ADD
             14 STORE_FAST               0 (n)
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE
[01:45:02] When you do very advanced things that rely on behavior that is not formally defined, you had better verify your assumptions. [01:46:47] The advantage of trying things out in personal projects: why did Jesús Cea build his own web scraper? "Mischief". scrapy: https://scrapy.org/. [01:49:22] Version migration in persistence systems. [02:05:07] Event sourcing. Event sourcing: https://dev.to/barryosull/event-sourcing-what-it-is-and-why-its-awesome. Modification logs. [02:08:07] Advantages of having used scrapy: https://scrapy.org/. Concurrency. tarpit. Common problems: URL normalization. Malformed websites. [02:13:47] Scraping modules: newspaper3k: https://pypi.org/project/newspaper3k/. [02:15:02] Recap. Pyjion: https://pyjion.readthedocs.io/en/latest/index.html.
mypyc: https://mypyc.readthedocs.io/en/latest/index.html. [02:16:02] Building Python modules for MS Windows. Generating a wheel. Taking advantage of continuous integration systems that spin up virtual machines. [02:22:21] End.
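The [01:42:02] bytecode listing in this episode can be reproduced on whatever interpreter is at hand. Opcode names differ between CPython versions (newer releases, for instance, emit a generic BINARY_OP where the listing shows INPLACE_ADD and BINARY_ADD), so this sketch collects the names and assumes no particular output. It illustrates the episode's point about atomicity: even n += 1 is several instructions.

```python
import dis


def rutina(n):
    n += 1
    n = n + 1


# Collect opcode names for the whole function; exact names vary by version.
opnames = [instruction.opname for instruction in dis.get_instructions(rutina)]
print(opnames)

# An augmented assignment is load, add, store: more than one instruction,
# so it cannot be assumed to be a single indivisible step.
has_load = any(name.startswith("LOAD_FAST") for name in opnames)
has_store = any(name.startswith("STORE_FAST") for name in opnames)
print("separate load and store steps:", has_load and has_store)
```

Matching on name prefixes keeps the check tolerant of the fused or specialized opcode variants that some releases introduce.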

Python en español
Python en español #11: Tertulia 2020-12-15

Python en español

Play Episode Listen Later Apr 27, 2021 102:06


More than you ever wanted to learn about JITs, guards and specialization https://podcast.jcea.es/python/11 In what follows, CPython refers to the reference Python interpreter, which is written in C: https://www.python.org/downloads/. Participants: Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Javier, connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Miguel Sánchez, email: msanchez@uninet.edu, connecting from the Canary Islands. Audio edited by Pablo Gómez, twitter: @julebek. The intro and outro music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License. [00:52] Notice that the session is being recorded. Legal matters. [01:52] The value of publishing these recordings and the difficulties of doing so. [02:42] Magic methods: __set_name__(). PEP 487: https://www.python.org/dev/peps/pep-0487/. [04:12] Problems with pip 20.3.2: https://github.com/pypa/pip/issues/9284. [05:52] Upgrade to the latest version or wait? Being able to roll back easily. Accumulating pending changes is technical debt. [10:42] Google outage https://www.theguardian.com/technology/2020/dec/14/google-suffers-worldwide-outage-with-gmail-youtube-and-other-services-down. [11:02] Building wheels on several systems: https://pythonwheels.com/. auditwheel: https://pypi.org/project/auditwheel/. Building wheels on Microsoft Windows? [13:12] pip's local cache https://pip.pypa.io/en/stable/. [14:17] Event Sourcing https://dev.to/barryosull/event-sourcing-what-it-is-and-why-its-awesome. The eventsourcing module: https://pypi.org/project/eventsourcing/. [14:42] For now the old pip dependency "resolver" can still be used, via the option --use-deprecated=legacy-resolver. 
That option can also be put in the configuration file, so it does not have to be typed on every invocation. Jesús Cea commits the sin of installing Python packages into the operating system. [17:02] Jesús Cea's war stories. Jesús has been mulling this over for two years: bpo35930: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930. Detailed explanation of the issue. Brainstorming. [21:22] High-level view of the Python (CPython) garbage collector. Reference counting: immediate, but it does not collect cycles. If instances are created and not destroyed, a "heavyweight" collector that also collects cycles is invoked. This can be a problem at program startup, before object creation/destruction "stabilizes". gc.disable(): https://docs.python.org/3/library/gc.html#gc.disable. Jesús Cea "abuses" destructors and relies on them running when he wants. Practicality versus purity. Jesús offers beers. gc.collect(): https://docs.python.org/3/library/gc.html#gc.collect. This serves both to collect cycles and to check whether your program has memory cycles. Futures: https://docs.python.org/3/library/concurrent.futures.html. [35:29] The Manhole module https://pypi.org/project/manhole/. Exploring a program in production. Tracemalloc: https://docs.python.org/3/library/tracemalloc.html. DTrace: http://dtrace.org/blogs/about/. py-spy: https://pypi.org/project/py-spy/. Memory leaks: recall what was already discussed in previous sessions. jemalloc: http://jemalloc.net/. MALLOC_PERTURB_: https://debarshiray.wordpress.com/2016/04/09/malloc_perturb_/. zswap: https://en.wikipedia.org/wiki/Zswap. [42:52] Micropython: https://micropython.org/. ESP8266: https://en.wikipedia.org/wiki/ESP8266. ESP32: https://en.wikipedia.org/wiki/ESP32. Bluetooth Low Energy: https://en.wikipedia.org/wiki/Bluetooth_Low_Energy. What advantages does Micropython bring? Speed of development and debugging. 
[52:42] Will the future be better? Or not. Hardware resources are wasted because there really are plenty to spare. Python is much slower than C, let alone assembly. [57:17] Replacing Python with a faster language. Go: https://en.wikipedia.org/wiki/Go_(programming_language). Rust: https://en.wikipedia.org/wiki/Rust_(programming_language). C++: https://en.wikipedia.org/wiki/C%2B%2B. [01:00:20] Python has no real presence on mobile. Kivy: https://kivy.org/. [01:02:07] Speeding up Python. Subinterpreters: PEP 554: https://www.python.org/dev/peps/pep-0554/. If subinterpreters shared NOTHING, they could run simultaneously on several CPU cores without competing for a single GIL https://en.wikipedia.org/wiki/Global_interpreter_lock. JIT: https://es.wikipedia.org/wiki/Compilaci%C3%B3n_en_tiempo_de_ejecuci%C3%B3n. PYPY: https://www.pypy.org/. RPython: https://rpython.readthedocs.io/en/latest/. Numba: https://numba.pydata.org/. Cython: https://cython.org/. Python is "potentially" very dynamic, but in practice programs are not. Jesús gives several examples. Dense conversation between Jesús and Javier. Guards to check that a specialization is still valid. For example, for dictionaries: PEP 509 Add a private version to dict: https://www.python.org/dev/peps/pep-0509/. Stricter "typing". MYPY: http://mypy-lang.org/. Pydantic: https://pydantic-docs.helpmanual.io/. Runtime type checking. Runtime type discovery, providing "specialization". psyco: https://en.wikipedia.org/wiki/Psyco. Eduardo Castro joins in and simplifies the discussion. Jesús explains what "a+b" does internally. [01:29:22] PyParallel http://pyparallel.org/ Transactional memory: https://es.wikipedia.org/wiki/Memoria_transaccional. (note from Jesús Cea): The Python persistence systems described in previous sessions can be considered cases of transactional memory... if we are flexible. 
"Coloring" objects so that two threads cannot access objects of the same color simultaneously or in concurrent transactions. [01:30:42] PyPy https://www.pypy.org/ is so sophisticated that nobody on earth understands it. Jesús Cea tried and gave up. psyco: https://en.wikipedia.org/wiki/Psyco. CFFI: https://cffi.readthedocs.io/en/latest/. [01:35:22] CPython compiled to WebAssembly https://en.wikipedia.org/wiki/WebAssembly reportedly runs faster than the native C build. [01:36:02] Simply compiling Python code with Cython https://cython.org/, with no type declarations, reportedly doubles execution speed. CPython can do better! [01:36:57] Subinterpreters: PEP 554: https://www.python.org/dev/peps/pep-0554/. Being able to use all the CPU cores. [01:38:07] We keep talking about the topic. [01:39:07] One problem is that Python aims to run everywhere, so there is resistance to implementing improvements for only certain platforms. [01:40:17] Wrap-up. Give bug bpo35930 some thought: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930. [01:41:13] End.
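The garbage collector segment above notes that reference counting does not reclaim cycles and that gc.collect() can be used both to collect them and to check whether a program creates them. The following is a minimal sketch of that check, assuming CPython; the `Node` class and `make_cycle` helper are hypothetical names introduced only for illustration and are not from the episode.

```python
import gc


class Node:
    """A trivial object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects referencing each other: their reference counts never
    # drop to zero, so only the cyclic collector can reclaim them.
    a, b = Node(), Node()
    a.other, b.other = b, a


gc.disable()          # stop automatic cyclic collection
gc.collect()          # start from a clean state
make_cycle()
found = gc.collect()  # number of unreachable objects found by this pass
gc.enable()

print(f"unreachable objects found: {found}")
```

A nonzero count after a piece of code runs suggests that the code leaves reference cycles behind; the exact number varies between CPython versions.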

Python Podcast
Python in der Visual Effects Branche

Python Podcast

Play Episode Listen Later Apr 26, 2021 88:08


Fabian works as a pipeline TD in the visual effects industry and asked us whether we would be interested in looking into this topic. We loved the idea, because we (Dominik and Jochen) had no idea that a lot of Python is now used there as well. So we simply recorded an episode on it with Fabian :). If you also have a topic you would like to talk to us about, just send an email to the address in the show notes. There are probably plenty of applications of Python that we have never heard of. Shownotes Our email for questions, suggestions & comments: hallo@python-podcast.de News from the scene Django 3.2 Release Notes Maya | 2020.3 Release Python in the visual effects industry Rigger / Animator Outside the Wire Houdini PyQt / PySide Render farm Git Large File Storage (git-lfs) NVIDIA demos (generating images with machine learning) DALL·E: Creating Images from Text (OpenAI model) Pygame CUDA / plaidML Cython / Numba Python f-strings PYTHONPATH pyenv / Conda PyInstaller / PyOxidizer / Nuitka / PyRun Picks IceCream / rich Blind Watermark / devdocs VirtualFish Public tag on konektom

Python en español
Python en español #8: Tertulia 2020-11-24

Python en español

Play Episode Listen Later Apr 20, 2021 100:01


Taming the snake https://podcast.jcea.es/python/8 I (Jesús Cea) sound very bad, and it is tiring to listen to because I talk a lot and my audio quality is poor. Sorry. Pauses were removed in editing, so it is quite tiring to hear Jesús Cea talking at full speed without taking a breath. We will do better next time. There is a lot of audible typing. Participants: Eduardo Castro, email: info@ecdesign.es. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Sara Sáez, twitter: @saruskysaez. Luis. Audio edited by Pablo Gómez, twitter: @julebek. The intro and outro music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License. [01:42] Limited API: Python's limited API ensures binary compatibility of C extensions across different versions of the Python interpreter. PEP 384: https://www.python.org/dev/peps/pep-0384/. [03:42] Why I started using Python. [06:52] Eduardo: restaurant menus with QR codes: https://www.qrico.eu/. [10:42] Is it better for a library to be in the Python standard library or to be an external library? A recurring topic. Pros and cons. [18:34] Python support on MS Windows. Distributing precompiled libraries. How do you compile a C extension on MS Windows? [20:52] The problem with binary distributions when a new Python version comes out. This is one of the motivations for using the limited API defined in PEP 384: https://www.python.org/dev/peps/pep-0384/. [23:22] Notification systems for library updates. For example: https://libraries.io/. PyPI RSS feed: https://pypi.org/rss/updates.xml. Do you upgrade to the latest version? Pros and cons. [28:22] It is better to join the session with video. [29:12] Debugging memory usage and memory leaks. Flamegraphs: http://www.brendangregg.com/flamegraphs.html. Tracemalloc: https://docs.python.org/3/library/tracemalloc.html. 
[33:52] Virtualenv: what does each of us use? And on MS Windows? [35:52] Python support on MS Windows. Most Python usage is on MS Windows, but the core developers do not use MS Windows. That causes support problems. [40:52] Guido van Rossum and Microsoft. Guido van Rossum has started working for Microsoft: https://www.msn.com/en-us/news/technology/python-creator-guido-van-rossum-joins-microsoft/ar-BB1aXmPu. [44:22] Are you already using Python 3.9? The limited API grows with each Python version. PEP 384: https://www.python.org/dev/peps/pep-0384/. [45:22] Options for speeding up Python code. Numba https://numba.pydata.org/. Cython https://cython.org/. But once you start annotating types, the resulting code is no longer Python. The future is type hinting: PEP 484 https://www.python.org/dev/peps/pep-0484/. Writing an extension in native C. PyPy https://www.pypy.org/. Watch out for compatibility. [54:32] Ways to slow Python down :-) [55:12] Protecting Python code. Cython https://cython.org/. [58:47] Mixing C code into Python. Writing a C module. CFFI: https://cffi.readthedocs.io/en/latest/. [01:01:52] Guido van Rossum and Microsoft (part two). Back to the topic of Guido van Rossum working for Microsoft: https://www.msn.com/en-us/news/technology/python-creator-guido-van-rossum-joins-microsoft/ar-BB1aXmPu. The "walrus operator" controversy. [01:05:22] "Walrus operator". PEP 572 https://www.python.org/dev/peps/pep-0572/. A recurring topic: Python keeps getting more complicated. A problem for newcomers. [01:14:32] Options for speeding up Python code (2). Another way to speed up Python: MYPY http://mypy-lang.org/ and MYPYC https://github.com/mypyc/mypyc. Type hinting. PEP 484 https://www.python.org/dev/peps/pep-0484/. [01:17:42] Python with types? Motivation. [01:20:52] Who pays for the tests? [01:22:37] Tests as documentation. 
[01:23:32] What do you use for testing? [01:26:22] What does each of us do with Python? Hobby, Zope https://zope.readthedocs.io/en/latest/, images, numpy https://numpy.org/, Jupyter https://jupyter.org/. Data persistence and ORMs. Embedding Python in other projects, such as Kodi https://www.kodi.tv/. Django https://www.djangoproject.com/, micropython http://www.micropython.org/. [01:33:12] Closing remarks and my motivation for these sessions.
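The memory debugging segment at [29:12] above points to tracemalloc. As a rough illustration of how it can locate where memory is allocated, the sketch below compares two snapshots taken before and after a deliberately large allocation. It is an assumption-laden toy, not something shown in the episode; the `data` list simply stands in for a suspected leak.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for a suspected leak: roughly 10 MB kept alive in a list.
data = [bytes(10_000) for _ in range(1_000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Differences are grouped by source line and sorted by size, largest first.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)
```

The first entries printed should point at the line that built `data`, which is the kind of hint one would look for when chasing a real leak.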

捕蛇者说
Ep 28. gRPC and Python

捕蛇者说

Play Episode Listen Later Apr 12, 2021 76:11


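If you enjoy the show, you are welcome to support us with a tip via Afdian (爱发电): https://afdian.net/@pythonhunter Guest Lidi Zheng Hosts Laike9m 小白 Timeline 00:00:28 Start 00:00:40 Guest introduction 00:01:29 The guest's background 00:05:26 Looking back on the guest's time as a graduate student at CMU (Carnegie Mellon University) 00:07:08 The guest's early days at Google 00:09:04 What RPC is 00:09:55 How gRPC relates to RPC 00:10:19 What the "g" in gRPC stands for 00:11:23 Languages supported by gRPC 00:12:26 Why gRPC uses HTTP/2 00:13:43 Which HTTP/2 features gRPC uses 00:14:10 What flow control is 00:14:49 Whether flow-control options can be changed in gRPC 00:16:02 How gRPC streaming is implemented 00:16:31 Whether the arrival of HTTP/3 will have an impact 00:18:55 Discussion of TCP and UDP delivery guarantees 00:20:08 gRPC Protocol Buffers 00:23:36 About gRPC Python 00:24:08 Experience using language XX 00:26:34 How to make gRPC support asyncio 00:32:34 Discussion of Python asyncio features 00:33:00 gRPC and service discovery 00:40:40 gRPC and commercial open source 00:51:17 How to keep malicious code from flowing from GitHub into a company 00:57:52 Python performance from the gRPC perspective 01:06:44 Has rewriting gRPC in Cython been considered? Related links 00:10:14 Thrift | a slip of the tongue here: Thrift was developed by Facebook 00:17:12 HTTP Trailer header 00:19:02 ISP | Core Provider 00:20:13 gRPC Protocol Buffers 00:22:17 SOAP | EBS 00:23:58 Cython 00:30:28 Youtube-Lidi Zheng, Pau Freixes - gRPC Python, C Extensions, and AsyncIO 00:34:55 Envoy Proxy 00:38:16 Google Cloud Traffic Director | may require a VPN to open 00:46:04 Monolithic architecture 00:58:14 Cyberbrain 01:00:07 Message Pack 01:00:57 Why Is GIL Worse Than We Thought? 01:09:22 yep 01:11:09 13 Sentinels: Aegis Rim | Baidu Baike 01:11:52 Click to catch up on the anime -> Baccano!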

Coder Radio
403: Forbidden

Coder Radio

Play Episode Listen Later Mar 4, 2021 49:38


After we pine about the way things used to be, Mike shares why he is developing a fondness for C++.

GNU World Order Linux Cast

Thoughts about the new **Gemini** Internet protocol, and a demonstration of some basic **Cython** from the **d** software series of Slackware Linux. shasum -a256=fb71964c824a2a827f50f27640a684c3612ac690c292fc02a14ee00791055dda

Python Bytes
#209 JITing Python with .NET, no irons in sight

Python Bytes

Play Episode Listen Later Nov 27, 2020 33:13


Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Michael #1: Running Python on .NET 5 by Anthony Shaw Talked about Pyjion way back when on episode 49 with Brett Cannon. .NET 5 was released on November 10, 2020. It is the cross-platform and open-source replacement of the .NET Core project and the .NET project that ran exclusively on Windows since the late 90’s. See the conference about it if you want to go deeper. Performance: I just saw a SO post about someone complaining their Python was 31x slower than C#. The most common way around this performance barrier is to compile Python extensions from C or using something like Cython. The .NET 5 CLR comes bundled with a performant JIT compiler (codenamed RyuJIT) that compiles .NET's IL into native machine instructions on Intel x86, x86-64, and ARM CPU architectures. Pyjion is a project to replace the core execution loop of CPython by transpiling CPython bytecode to ECMA CIL and then using the .NET 5 CLR to compile that into machine code. It then executes the machine-code compiled JIT frames at runtime instead of using the native execution loop of CPython. A few releases of Python ago (CPython specifically, the most commonly used version of Python), in 3.6, a new API was added to be able to swap out “frame execution” with a replacement implementation. This is otherwise known as PEP 523. This extension uses the same standard library as Python 3.9. Will this be compatible with my existing Python code? What about C Extensions? The short answer: if your existing Python code runs on CPython 3.9, yes, it will be compatible. Tested against the full CPython “test suite” on all platforms. In fact, it was the first JIT ever to pass the test suite. Is this faster? The short answer: a little, but not by much (yet). 
see also: https://twitter.com/anthonypjshaw/status/1328457723608928256?s=20 Brian #2: PEP 621 -- Storing project metadata in pyproject.toml Progress on standardizing what goes into pyproject.toml Authors Brett Cannon, Paul Ganssle, Pradyun Gedam, Sébastien Eustace (of poetry), Thomas Kluyver (of flit), Tzu-Ping Chung Motivators of this PEP are: Encourage users to specify core metadata statically for speed, ease of specification, unambiguity, and deterministic consumption by build back-ends Provide a tool-agnostic way of specifying metadata for ease of learning and transitioning between build back-ends Allow for more code sharing between build back-ends for the "boring parts" of a project's metadata Doesn’t change any existing core metadata Doesn’t attempt to standardize all possible metadata Included in table named [project]: name version description readme requires-python license authors/maintainers keywords classifiers urls entry points dependencies/optional-dependencies dynamic There’s an example in the PEP that helps clear things up Many items have synonyms specified for flit/poetry/setuptools (presumably for backward compatibility) Michael #3: GitHub revamps copyright takedown policy after restoring YouTube-dl In October following a DMCA complaint from the Recording Industry Association of America (RIAA) it was taken down at GitHub. Citing a letter from the Electronic Frontier Foundation (the EFF), GitHub says it ultimately found that the RIAA’s complaint didn’t have any merit. The RIAA argued the tool ran afoul of section 1201 of the US copyright law by giving people the means to circumvent YouTube’s DRM. the EFF dissects the RIAA’s claims, highlighting where the organization had either misinterpreted the law or how the code of YouTube-dl works. 
“Importantly, YouTube-dl does not decrypt video streams that are encrypted with commercial DRM technologies, such as Widevine, that are used by subscription video sites, such as Netflix,” the organization points out when it comes to the RIAA’s primary claim. GitHub is implementing new policies to avoid a repeat of this situation moving forward. First, it says a team of both technical and legal experts will manually evaluate every single section 1201 claim. If the company’s technical and legal teams ultimately find any issues with a project, GitHub will give its owners the chance to address those problems before it takes down their work. GitHub is establishing a $1 million legal defense fund for developers. Sidebar: EFF has just launched How to Fix the Internet, a new podcast mini-series that examines potential solutions to six ills facing the modern digital landscape. Brian #4: Install & Configure MongoDB on the Raspberry Pi Mark Smith Definitely a “wow, I didn’t know you could do that” article. Tutorial walks through Installing 64 bit Ubuntu Server on a Raspberry Pi Configure wifi Install MongoDB on Pi Set up a user account, to safely expose MongoDB on a home network. Now you’ve got a MongoDB server in your house. So cool Michael #5: Extra! extra! extra!, hear all about it! Follow up on my critique of things like SQL & CSS put next to Python and Java. Maybe best to grab the conversation from here. Guido joins Microsoft, why? People seem to see this as a positive for sure. But they checked him out! New code editor roaming the streets: Nova from Panic. Two thumbs up on Big Sur and now waiting on the Mac Mini M1. 
Brian #6: A Python driven AI Stylist Inspired by Social Media Dale Markowitz A bunch of Google tools (cloud storage, firebase, cloud vision api, product search api) Some React for front end Python to batch script General oversimplified process: photos from social media for inspiration photos of everything in your closet, multiple of each item use AI suggest outfits from your closet that match inspiration photos Ok. The process is really more of a promo for Google AI products, and not so much about Python, but it’s a cool “look what you can do with software” kinda thing. Also, many of the tools used by online retail, like “similar products” and such, are available to lots of people now, and that’s cool. Joke: Back to the [dev] future!

Python Bytes
#201 Understand git by rebuilding it in Python

Python Bytes

Play Episode Listen Later Oct 2, 2020 40:26


Sponsored by us! Support our work through: Our courses at Talk Python Training Python Testing with pytest Michael #1: Under the hood of calling C/C++ from Python Basics first: what does C compile to? Each operating system has its own executable format. Among the most popular ones are: ELF (Executable and Linkable Format), used by most Linux distros PE (Portable Executable), used by Windows Mach-O (Mach object), used by Apple products We also need to make our library visible to our programs. The easiest way to do so is to copy it to /usr/lib/, the default system-wide directory for libraries. Maybe put it in system / system32 on Windows? ctypes: the simplest way With the shared object compiled, we are ready to call it. Consider ctypes to be the easiest way to execute some C code, because: it’s included in the standard library, writing a wrapper uses plain Python. lib = ctypes.CDLL(f'/usr/lib/libdullmath.so') lib.get_pi For C++: You need to be clear about the linkage and calling convention (extern “C” for example) Now we can load libraries at runtime, but we are still missing a way to generate the correct caller ABI for external C libraries. To deal with it, libffi was created. Libffi is a portable C library, designed for implementing FFI tools, hence the name. Given struct and function definitions, it calculates the ABI of function calls at runtime. A mature approach to improve in this area is to allow libraries to introduce themselves. We can oblige every library to define a function named entry_point, which will return metadata about the functions it contains. Final destination: C/C++ extensions and Python/C API CPython provides a similar API for implementing C-based extensions: “Extending and Embedding the Python Interpreter”. // NOTE: entry point function has dynamic name PyInit_<module_name> PyMODINIT_FUNC PyInit_mymath(void) { return PyModule_Create(&mymathmodule); } The main difference is that we have to wrap initial C functions with Python-specific ones. 
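The ctypes notes above load a custom library (libdullmath) that readers will not have. As a hedged stand-in, the sketch below calls `cos` from the system C math library instead. It assumes a POSIX system (Linux or macOS) where `ctypes.util.find_library("m")` can locate the library or where its symbols are already loaded into the Python process; it is an illustration of the same technique, not the article's code.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library. The file name differs per platform, so ask
# ctypes to find it; if that fails, fall back to the symbols already loaded
# into the current process (works on POSIX only).
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Declare the C signature `double cos(double)`. Without this, ctypes would
# assume an int return value and misinterpret the result.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

c_result = libm.cos(0.0)
print(c_result, math.cos(0.0))
```

Declaring `argtypes` and `restype` is the part that plays the role of the "correct caller ABI" mentioned in the notes: ctypes relies on libffi to build the call from those declarations.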
CPython interpreter uses its own PyObject type internally rather than raw int, char*, and so on, and we need the wrappers to perform the conversion. Cython, Boost.Python, pybind11 and all all all The main challenge of writing pure C extensions is a massive amount of boilerplate that needs to be written. Mainly this boilerplate is related to wrapping and unwrapping PyObject. It becomes especially hard if a module introduces its own classes (object types). To solve this issue, a plethora of different tools was created. All of them introduce a certain way to generate wrapping boilerplate automatically. They also provide easy access to C++ code and advanced tools for the compilation of extensions. Examples aiohttp - asyncio web framework that uses Cython for HTTP parsing, uvloop - event loop that is wrapping libuv, fully written in Cython, httptools - bindings to nodejs HTTP parser, also fully written in Cython (a lot of other big projects like sanic or uvicorn use httptools). Cecil #2: ugit: DIY Git in Python Michael #3: Things I Learned to Become a Senior Software Engineer by Niel Kakkar Growing using different ladders of abstraction Entering my second year, I had all the basics in place. I did figure out something insightful. I’m working inside the software development lifecycle, but this lifecycle is part of a bigger lifecycle: the product and infrastructure development lifecycle. Learning what people around me are doing Since we’re not in a closed system, it makes sense to better understand the job of the product managers, the sales people, and the analysts. Product managers are the best source for this. They know how the business makes money, who are the clients, and what do clients need. Learning good habits of mind Thinking well: Diving into cog sci, one output was a framework for critical thinking. It’s compounding, and compounding is powerful. Strategies for making day-to-day more effective: The other side of the coin is habits that allow you to think well. 
It starts with noticing little irritations during the day, inefficiencies in meetings, and then figuring out strategies to avoid them. Some good habits I’ve noticed: Never leave a meeting without making the decision / having a next action Decide who is going to get it done. Things without an owner rarely get done. Document design decisions made during a project Acquiring new tools for thought & mental models New tools for thought are related to thinking well, but more specific to software engineering. For example, I was recently struggling with a domain with lots of complex business logic. Edge cases were the norm, and we wanted to design a system that handles this cleanly. That’s when I read about Domain Driven Design Protect your slack When I say slack, I don’t mean the company, but the adjective. One thing that gives me high output and productivity gains is to “slow down”. Want to get more done? Slow down. When there is slack, you get a chance to experiment, learn, and think things through. This means you get enough time to get things done. When there is no slack, deadlines are tight, and all your focus goes into getting shit done. Ask Questions Q: What is a package? A: It’s code wrapped together that can be installed on a system. Q: Why do I need packages? A: They give a consistent way of getting all the files you need in the right place. Without them, things are easy to mess up. You need to ensure every file is where it’s supposed to be, the system paths are set up, and dependent packages are available. Q: How do packages differ from applications I can install on my system? A: It’s a very similar idea! Windows installer is like a package manager that helps install applications. Similarly, DPKG and rpm packages are like .exes that you can install on Linux systems, with the help of apt and yum package managers, which are like the windows installers. Force multipliers One sprint I didn’t get much done myself. I wrote very limited code. 
Instead, I co-ordinated which changes should go out when (it was a complicated sprint), tested they worked well, did lots of code reviews, made alternate design suggestions, and pair-programmed wherever I could to get things un-stuck. We got everything done, and in this case, zooming out helped make decisions for PRs easier. It was one of our highest velocity sprints. Embrace fear: I’ve learned to embrace this feeling. It excites me. It’s information about what I’m going to learn. I’ve taken it so far that I’ve started tracking it in my human log - “Did I feel fear this week?” If the answer is no too many weeks in a row, I’ve gotten too comfortable. Super powers Getting into the source code when documentation isn’t enough Quest: Reading open source code. Quickly build a mental model for the code you’re looking at Quest: Reading open source code. Embracing fear Quest: Build a side project. Confidence to express ignorance Quest: Overcome the first gotcha with growing. Cecil #4: Build tech skills for space exploration Michael #5: Profiling Django Views by Farhan Azmi We know we need to profile our code Many Python profiling tools exist, but this article will limit only to the most used tools: cProfile and django-silk . The two tools mainly profile in regards to function calls and execution time. To incorporate cProfile to Django views, we can write our own middleware that captures the profiling on every request sent to our Django views. Thankfully, there exists a simpler solution: django-cprofile-middleware. It is a simple profiling middleware created by a Github user omarish. To profile this view with the installed middleware, we can just append prof parameter to the end of the URL, i.e. http://localhost:8000/api/auth/users/availability/?username=[HTML_REMOVED]&email=[HTML_REMOVED]&prof We can visualize the profile result further with Python profiler visualizing library, such as SnakeViz. Just add &download to the request. 
However, the cProfile result cannot show which database query causes the performance hit. This matters especially when our application is centered around database (SQL) queries, and that’s where django-silk comes in. Add it as middleware: Silk will automatically intercept requests we make to our views, and the UI can be accessed from the path /silk/ . Dive into a request to see all the headers/form/etc. plus DB queries and performance. Cecil #6: Send an SMS message with Azure Communication Services Extras: Michael: Was on Real Python podcast Cecil: https://studentambassadors.microsoft.com/ Joke: Dependencies
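The Django profiling item above builds on cProfile. Outside of Django, the same profiler can be driven directly from the standard library; the following is a minimal sketch under that assumption. The `slow_sum` function is a hypothetical workload introduced only for illustration, and this is not the middleware described in the article.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares computed in pure Python."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time,
# into a string instead of printing straight to stdout.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

A per-request middleware presumably does something similar around each view call, which would explain why it reports function calls and timings but not the individual SQL queries.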

Talk Python To Me - Python conversations for passionate developers

Python is growing incredibly quickly and has found its place in many facets of the developer and computational space. But one area that is still shaky and uncertain is packaging and shipping software to users. I'm not talking about building reusable libraries and hosting them on PyPI. I'm talking about shipping executable software to non-developers. Take a moment to stop and think about the ways you could send an end-user a program built with Python that they can simply run. It's a bit of a mixed bag, isn't it? On this episode, we welcome back Cristian Medina to run through the state of Python packaging. Links from the show Cris on Twitter: @tryexceptpass tryexceptpass: tryexceptpass.org Russell Keith-Magee keynote & black swans: youtube.com 4 Attempts at Packaging Python as an Executable article: tryexceptpass.org Official Python Docker image: hub.docker.com Docker: docker.com Vagrant: vagrantup.com PyInstaller: pyinstaller.org Briefcase: beeware.org Pex: github.com Shiv: github.com pipx: pypi.org/project/pipx PyOxidizer: gregoryszorc.com Nuitka: nuitka.net Cython: cython.org Flatpak: flatpak.org Snapcraft: snapcraft.io Sponsors TideLift Linode Talk Python Training

Python Podcast
Python in der Wissenschaft

Python Podcast

Play Episode Listen Later Jun 30, 2019 113:03


In our eleventh episode we talk with Gerrit about Python in science. This time the topics were publishing code, typesetting code in publications, and code golf. It was a bit warm in the conservatory, but if Auphonic manages to filter out the fan noise, the audio quality should at least be fine again this time. Speaking of audio quality, one of the speakers had a worse headset than the others. Can you tell who? I would be curious whether it is audible at all... Shownotes Our email for questions, suggestions & comments: hallo@python-podcast.de News from the scene PyOxidizer Russell Keith-Magee - Keynote - PyCon 2019 PyRun - also works with 3.7 Jessica Garson - Making Music with Python, SuperCollider and FoxDot - PyCon 2019 Jordan Adler, Joe Gordon - Migrating Pinterest from Python2 to Python3 - PyCon 2019 Codegolf Code Golf Stack Exchange LSD Radix Python in science Differential equations SIMD Efficiently and easily integrating differential equations with JiTCODE, JiTCDDE, and JiTCSDE - JiTCODE, JiTCDDE, JiTCSDE SymPy SageMath MATLAB GNU Octave Cython arXiv gnuplot Altair Picks NumPy Data Classes Per object permissions for Django Bandit is a tool designed to find common security issues in Python code Public tag on konektom

Python Bytes
#81 Making your C library callable from Python by wrapping it with Cython

Python Bytes

Play Episode Listen Later Jun 5, 2018 17:00


Tech Frontier (科技最前沿): Astronomy, Physics, Artificial Intelligence, Digital Tech, Programming, Big Data, and More

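Hello, this is Tech Frontier (科技最前沿). Welcome, science lovers! I'm your old friend, the WeChat public account 丘孔语论. Tech Frontier discusses science and technology through the fields 丘孔语论 finds most interesting, which may include astronomy, physics, the Internet/IT, artificial intelligence/AI, digital devices and phones, programming, big data, business leaders, innovation, startups and makers, chemistry, medicine, wellness, psychology, and spirituality. Understand the world, broaden your thinking, reshape yourself.

Originally posted by Linux中国 (Linux China), 2017-04-25 17:39

Optimize your most expensive resource. That's you, not the computer. Choose a language/framework/architecture that helps you develop quickly (such as Python). Do not choose technologies simply because they are fast. When you do have performance issues, find your bottleneck. Your bottleneck is most likely not CPU or Python itself. If Python is your bottleneck (and you've already optimized your algorithms), then move the hot spot to Cython or C. Go enjoy the thrill of getting things done quickly. -- Nick Humrich

Translated from: https://medium.com/hacker-daily/yes-python-is-slow-and-i-dont-care-13763980b5a1
Author: Nick Humrich
Translator: zhousiyu325

A rant on sacrificing performance for productivity

I'm taking a break from my discussion of asyncio in Python's standard library to talk about something I have been thinking about recently: the speed of Python. For those who don't know me, I am a Python fan, and I actively use Python everywhere I can think of. One of the biggest complaints people have about Python is that it is slow. Some people even refuse to try Python because it is slower than other languages. Here is why I think you should try Python anyway, even though it is somewhat slow.

Speed no longer matters

It used to be the case that programs took a long time to run. CPUs were expensive, and memory was expensive too. The running time of a program was an important metric. Computers were very expensive, and so was the electricity needed to run them. Optimizing these resources followed a timeless business rule: optimize your most expensive resource.

In the past, the most expensive resource was computer run time. That is what drove computer science to study the efficiency of different algorithms. However, this is no longer true, because silicon is now cheap. Really cheap. Run time is no longer your most expensive resource. A company's most expensive resource is now its employees' time. In other words, you. Getting things done matters more than making them fast. In fact, this is so important that I am going to put it here again as if it were a quote (for those who are just skimming):

Getting things done is more important than getting them done fast.

You might say, "My company cares about speed. I build a web application, and all response times have to be under x milliseconds." Or, "We have lost customers because they thought our app was too slow." I am not saying that speed doesn't matter at all. I am saying that it is no longer the most important thing; it is no longer your most expensive resource.

Speed is the only thing that matters

When you say speed in the context of programming, you usually mean performance, that is, CPU cycles. When your CEO says speed in the context of programming, they mean business speed. The most important metric is time to market. Ultimately, it doesn't matter how fast your product or web app is. It doesn't matter what language it is written in. It doesn't even matter how much money it costs to run. At the end of the day, the only thing that makes your company survive or die is time to market. I don't just mean the startup notion of how long it takes until you start making money, but more the time frame "from idea to customers' hands." The only way a business survives is by innovating faster than its competitors. It doesn't matter how many good ideas you come up with if your competitors ship before you do. You have to be first to market, or at least keep up. Once you slow down, you lose.

The only way a business survives is by innovating faster than its competitors.

A case for microservices

Companies like Amazon, Google, and Netflix understand the importance of moving fast. They have built business systems that let them move fast and innovate quickly. Microservices are the solution to their problem. This article is not about whether you should use microservices, but at least understand why Amazon and Google think they should.

Microservices are inherently slow. The core idea of microservices is to break up a boundary with a network call. This means you are taking what used to be a function call (a few CPU cycles) and turning it into a network call. Nothing hurts performance more. Compared with the CPU, network calls are really slow. Yet these big companies still choose microservices. I know of no architecture slower than microservices. The biggest drawback of microservices is performance, but their biggest strength is time to market. By building teams around smaller projects and code bases, a company can iterate and innovate much faster. This shows that very large companies care about time to market too, not just startups.

CPU is not your bottleneck

If you are writing a network application, such as a web server, chances are that CPU time is not your program's bottleneck. When your web server handles a request, it probably makes several network calls, for example to a database or to a cache server such as Redis. While these services themselves may be fast, the network calls to them are slow. There is a very good blog post[1] on the speed differences between specific operations. In it, the author scales CPU cycle time up to more understandable human time. If a single CPU cycle were equivalent to 1 second, then a network call from California to New York would be equivalent to 4 years. That is how slow the network is. As a rough estimate, assume that an ordinary network call inside the same data center takes about 3 milliseconds. On our "human scale" that is 3 months. Now imagine your program is very CPU intensive and takes 100,000 CPU cycles to respond to a single call. That is just over 1 day. Now suppose you are using a language that is 5 times slower; that would take about 5 days. Compared with our 3-month network call, a difference of 4 days hardly matters. If someone has to wait at least 3 months for a package, I don't think an extra 4 days will matter much to them.

What this ultimately means is that even though Python is slow, it doesn't matter. The speed of the language (or CPU time) is almost never the issue. Google actually did a study on this very concept and published a paper[2] about it. The paper discusses designing a high-throughput system. In the conclusion, they say:

It may seem paradoxical to use an interpreted language in a high-throughput environment, but we have found that the CPU time is rarely the limiting factor; the expressibility of the language means that most programs are small and spend most of their time in I/O and native run-time code. Moreover, the flexibility of an interpreted implementation has been helpful, both in ease of experimentation at the linguistic level and in allowing us to explore ways to distribute the calculation across many machines.

To emphasize again: CPU time is rarely the limiting factor.

What if CPU time is an issue?

You might say, "That's all great, but we really have had problems where CPU was our bottleneck and made our web app very slow," or "Language X needs less hardware to run on the server than language Y." All of this may be true. The wonderful thing about web servers is that you can load balance them almost infinitely. In other words, you can throw more hardware at them. Sure, Python may require better hardware than other languages, such as C. Just throw hardware at the CPU problem. Hardware is very cheap compared with your time. If you save two weeks of productivity in a year, that more than pays for the added hardware cost.

So, is Python faster?

Throughout this article I have been arguing that what matters most is development time. So the question remains: is Python faster than other languages when it comes to development time? Anecdotally, I, Google[3], and[4] several[5] others[6] can tell you how productive Python is[7]. It abstracts so much away for you, helping you focus on the code you really should be writing, without getting stuck in the weeds of trivial things such as whether you should use a vector or an array. But you may not like taking other people's word for it, so let's look at some more empirical data.

For the most part, the debate about whether Python is more productive comes down to scripting (or dynamic) languages versus statically typed languages. I think it is commonly accepted that statically typed languages are less productive, and here is a good paper[8] that explains why. As for Python specifically, there is a study[9] that looked at how long it took to write string-processing code in different languages. In that study, Python was twice as productive as Java. Several other studies show similar results. Rosetta Code did a fairly in-depth study[10] of the differences between programming languages. In the paper, they compare Python with other scripting and interpreted languages and conclude:

Python tends to be the most concise, even against functional languages (1.2 to 1.6 times shorter on average).

The general trend seems to be that Python code always has fewer lines. Lines of code may sound like a terrible metric, but multiple studies[11], including the two already mentioned, show that the time spent per line of code is about the same in every language. So limiting the number of lines of code increases productivity. Even codinghorror (a C# programmer) himself wrote an article[12] on how Python is more productive.

I think it is fair to say that Python is more productive than many other languages. This is mainly because Python comes with "batteries included" and has a large number of third-party libraries. Here is a simple article[13] discussing the differences between Python and other languages. If you don't know why Python is so small and productive, I invite you to take this opportunity to learn a little Python and practice it yourself. Here is your first program: import __hello__

But what if speed really does matter?

The tone of the argument above might suggest that optimization and speed don't matter at all. But the truth is that there are many times when run-time performance really does matter. One example: you have a web application with a specific endpoint that takes a very long time to respond. You know how fast it needs to be, and how much it needs to improve. In our example, two things happened. First, we noticed a single endpoint that was performing slowly. Second, we recognized it as slow because we had a measure of what counts as fast enough, and it was failing that measure. We don't have to micro-optimize everything in the application; everything only needs to be "fast enough." Your users might notice if an endpoint takes a few seconds to respond, but they won't notice if you bring the response time down from 35 milliseconds to 25 milliseconds. "Good enough" is all you need to achieve. Disclaimer: I should say that there are some applications, such as real-time bidding programs, that do need micro-optimizations and where every millisecond counts. But that is the exception, not the rule.

To figure out how to optimize the endpoint, your first step is to profile the code and try to find where the bottleneck is. After all:

Any improvements made anywhere besides the bottleneck are an illusion. -- Gene Kim

If your optimizations don't touch the bottleneck, you are just wasting your time and not solving the real problem. You won't get any significant improvement until you optimize the bottleneck. If you try to optimize before you know what the bottleneck is, you will just end up playing around with parts of your code. Optimizing code before you measure and identify the bottleneck is known as "premature optimization." Donald Knuth is often cited as saying the following, though he claims he actually got the line from someone else:

Premature optimization is the root of all evil.

Speaking of maintaining code bases, the fuller quote from Donald Knuth is:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. -- Donald Knuth

In other words, most of the time you should forget about optimizing your code. It is almost always good enough. In the cases where it isn't, we usually only need to touch 3% of the code paths. You won't win any awards because your endpoint is a few nanoseconds faster since you used an if statement instead of a function. Premature optimization includes calling certain functions because they are faster, or even using a particular data structure because it is generally faster. Computer science holds that if a method or algorithm has the same asymptotic growth (or Big-O) as another, then they are equivalent, even if one is twice as slow in practice. Computers are so fast that the growth in computation as data or usage grows matters far more than the actual speed itself. In other words, if you have two O(log n) functions but one is twice as slow, it doesn't really matter. As the data size increases, they both "slow down" at the same rate. This is why premature optimization is the root of all evil: it wastes our time and almost never actually helps our performance.

In terms of Big-O, you could argue that all languages are O(n) for your program, where n is the number of lines of code or instructions. For the same instructions, they all grow at the same rate. For asymptotic growth, how fast or slow a language is doesn't matter; all languages are the same. Under this logic, you could say that choosing a language for your application simply because it is "fast" is the ultimate form of premature optimization. You are choosing something supposedly fast without measuring and without understanding where the bottleneck will be.

Choosing a language for your application simply because it is "fast" is the ultimate form of premature optimization.

Optimizing Python

One of my favorite things about Python is that it lets you optimize code one small piece at a time. Say you have a Python method that you have found to be your bottleneck. You have optimized it several times, perhaps following some of the guidance here[14] and there[15], and now you are fairly certain that Python itself is your bottleneck. Python can call into C code, which means you can rewrite this method in C to reduce the performance problem. You can rewrite one such method at a time. This process lets you write well-optimized bottleneck methods in any language that compiles to C-compatible assembly. It allows you to write in Python most of the time and drop into a lower-level language only when you truly need to.

There is a programming language called Cython, which is a superset of Python. It is almost a merger of Python and C, and it is a gradually typed language. Any Python code is valid Cython code, and Cython code compiles to C. With Cython you can write a module or a method and progress gradually toward more and more C types and performance. You can mix C types and Python's duck typing together. Using Cython you get the best of both: optimize only at the bottleneck, and keep the beauty of Python everywhere else.

A screenshot of EVE Online, a space MMO game written in Python.

When you do eventually hit a wall of Python performance problems, you don't need to rewrite your whole code base in a different language. You only need to rewrite a few functions in Cython to get almost all the performance you need. This is the strategy EVE Online[16] takes. It is a massively multiplayer computer game that uses Python and Cython across its entire stack. They achieve game-level performance by optimizing the bottlenecks in C/Cython. If this strategy works for them, it should work for almost anyone. Alternatively, there are other ways to optimize your Python. For example, PyPy[17] is a JIT implementation of Python; replacing CPython (the default implementation of Python) with PyPy can provide significant run-time improvements for long-running applications (such as a web server).

Let's review the main points: Optimize your most expensive resource. That's you, not the computer. Choose a language/framework/architecture that helps you develop quickly (such as Python). Do not choose technologies simply because they are fast. When you do have performance issues, find your bottleneck. Your bottleneck is most likely not CPU or Python itself. If Python is your bottleneck (and you've already optimized your algorithms), then move the hot spot to Cython or C. Go enjoy the thrill of getting things done quickly.

I hope you enjoyed reading this article as much as I enjoyed writing it. If you'd like to say thank you, please give it a like. Also, if you ever want to talk Python with me, you can mention me on Twitter (@nhumrich), or find me on the Python slack channel[18].

About the author: Nick Humrich -- a firm believer in continuous delivery who has written many tools for it. He is also a Python hacker and technology enthusiast, and currently works as a DevOps engineer.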
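
The article's advice to measure before optimizing can be tried with nothing but the standard library. The following is a minimal sketch, not anything from the article itself: the functions `slow_lookup`, `cheap_math`, `handle_request`, and `profile` are hypothetical names invented for illustration, and the `time.sleep` call merely stands in for a slow network or database call.

```python
import cProfile
import io
import pstats
import time


def slow_lookup(n):
    # Hypothetical bottleneck: the sleep simulates a slow network or database call.
    time.sleep(0.05)
    return n * 2


def cheap_math(n):
    # Hypothetical CPU work that is cheap by comparison.
    return sum(i * i for i in range(n))


def handle_request():
    # Hypothetical endpoint that mixes CPU work with one slow call.
    return cheap_math(1000) + slow_lookup(21)


def profile(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile(handle_request)
    print(report)
```

In a report like this, the simulated slow call should dominate the cumulative time, which is the signal to look there first and leave `cheap_math` alone. Only if the measured hot spot turns out to be pure Python computation would rewriting it in Cython or C be worth considering.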

The Python Podcast.__init__
Cython with Craig Citro and Robert Bradshaw

The Python Podcast.__init__

Play Episode Listen Later Feb 19, 2016 52:02


Do you find yourself reaching for a different language when you need some extra speed? With Cython you can get the best of both worlds by writing your code in Python and executing it as compiled code. In this episode we were joined by Craig Citro and Robert Bradshaw from the Cython project to discuss how and when you might want to incorporate it into your applications.

Python en español
Python en español #2: We Didn't Have Time to Make It Shorter

Python en español

Play Episode Listen Later Mar 24, 2015 73:15


A packed second episode http://podcast.jcea.es/python/2 Notes:
00:00: Podcast introduction.
00:18: We talk about what's new in Python 3.4.3.
00:18: Samuel covers some of the notable new features.
03:49: Jesús tells us about certificate validation in this version.
06:29: Jesús tells us about Python 2.7.9, the new security improvements, and the SNI protocol.
10:35: We talk about EuroPython in Bilbao.
15:45: Juan Ignacio runs through the Python events of these weeks.
23:18: We talk about the latest news on PyConES 2015.
24:19: Jesús tells us about OpenBadges.
33:49: We go over listeners' questions and comments on the official Twitter account.
34:43: We talk about decorators (thanks to @imasdemase).
40:38: We talk about Cython and Ansible (thanks to @ryllada).
49:50: We talk about the state of package installation in Python (thanks to @Pybonacci).
53:25: We talk about picking "la primitiva" lottery numbers in a Pythonic way! :S (thanks to @ipedrazas).
56:17: We talk about cryptography.io (thanks to @WuShell).
01:02:16: We briefly go over the remaining mentions and comments from listeners.
01:04:30: Jesús talks about how we record (WebRTC) and the virtues of the OPUS audio format.
01:08:00: Jesús argues for the importance of documenting history. Who organized the first PyConES in 2013?
01:10:18: We sign off until next time with a reminder of how to contact us.
01:11:20: Jesús explains why the podcast's Twitter handle is @Python_podCAST.