Podcasts about PyPy

Alternative implementation of the Python programming language

  • 31 podcasts
  • 52 episodes
  • 1h 3m avg duration
  • 1 new episode monthly
  • Latest: Mar 24, 2025
PyPy popularity by year, 2017–2024 (chart)


Best podcasts about PyPy

Latest podcast episodes about PyPy

core.py
Episode 20: Remote Code Execution By Design

Mar 24, 2025 · 104:20


In this episode, Pablo's avoiding the topic of garbage collection by talking about his latest PEP, which allows unprecedented interaction with a running Python process. We also resolve the bet about reference counting semantics, mention some notable changes in Python since the last episode, and discuss syntax highlighting in PyREPL and why it's bad, actually.

Timestamps:
(00:00:00) INTRO
(00:02:16) PART 1: PABLO'S LATEST PEP
(00:04:34) gdb is IMPOSSIBLE
(00:12:49) Make the process run code for you
(00:14:14) This already works on PyPy
(00:15:13) How does it work?
(00:25:38) Why a file?
(00:31:15) What if you don't trust Pablo?
(00:32:57) sys.remote_exec()
(00:36:09) Less obvious use cases
(00:46:56) PART 2: BETS
(00:55:44) PART 3: PR OF THE WEEK
(00:55:50) Łukasz: syntax highlighting in PyREPL
(01:10:14) Pablo's PR: allow the parser to activate future imports on the fly
(01:20:11) PART 4: WHAT'S GOING ON IN CPYTHON
(01:20:22) Free threading
(01:23:30) Performance
(01:34:41) PEP 765 implemented
(01:36:08) concurrent.futures.Executor.map(buffersize=)
(01:36:57) io.Reader and io.Writer
(01:38:40) Pabluco's linecache fetching interactive source code
(01:41:25) ast.unparse() roundtrip with semicolons
(01:41:59) OUTRO
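The PEP discussed here appears to be PEP 768, the safe external debugger interface whose user-facing surface is the sys.remote_exec() in the timestamps above: you hand it a pid and a path to a script file, and the target CPython runs that file at the next safe point in its eval loop (which is also what the "Why a file?" segment is about). A rough sketch of the intended usage, assuming a Python build that ships sys.remote_exec; the pid and payload below are illustrative, not from the episode:

    import sys
    import tempfile

    # The payload must live in a file: the debugging process writes only a
    # *path* into the target's memory, and the target reads and runs the
    # file itself at a safe boundary in its eval loop.
    payload = "import faulthandler; faulthandler.dump_traceback()\n"

    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(payload)
        script_path = f.name

    target_pid = 12345  # illustrative: pid of a running CPython process

    # Ask the target to execute the script at its next safe checkpoint.
    sys.remote_exec(target_pid, script_path)

Per the PEP, the mechanism can be disabled by the target process, which is the question the "What if you don't trust Pablo?" segment addresses.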

airhacks.fm podcast with adam bien
Pure Java Inception

Feb 16, 2025 · 63:08


An airhacks.fm conversation with Christian Humer (@grashalm_) about: early programming experiences with DOS text adventures and Captain Comic, transition from graphics design to computer science, work on Java Server Pages (JSPs) and point-of-sale systems, development of Swing GUIs for touchscreens, introduction to GraalVM and the Truffle framework, ActionScript, Adobe Flash and Adobe Flex, explanation of Futamura projections and partial evaluation in Truffle, discussion of the challenges of implementing dynamic language runtimes, de-optimization in JIT compilers, Nashorn JavaScript engine vs. GraalJS, language interoperability in GraalVM, reuse of libraries across different programming languages, embedding of JavaScript and React in Java applications, comparison with PyPy in the Python ecosystem, current work on a bytecode DSL for generating bytecode interpreters, the importance of math in computer science and its relation to programming concepts. Christian Humer on Twitter: @grashalm_
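The Futamura projections mentioned in these notes are the theory behind Truffle (and, via meta-tracing, PyPy): partially evaluating an interpreter with respect to a fixed program yields, in effect, a compiled version of that program. A toy Python illustration of the first projection; everything here is invented for illustration and is nothing like Truffle's actual machinery:

    # A trivial interpreter for a one-operand "language".
    def interpret(program, x):
        ops = {"inc": lambda v: v + 1, "double": lambda v: v * 2}
        for op in program:
            x = ops[op](x)
        return x

    # First Futamura projection in miniature: specialize the interpreter
    # to a fixed program, leaving a residual function of the dynamic input.
    # A real partial evaluator would also fold the dispatch loop away.
    def specialize(program):
        def compiled(x):
            return interpret(program, x)
        return compiled

    double_plus_one = specialize(["double", "inc"])
    assert double_plus_one(3) == 7  # same behavior as interpreting the program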

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Applications for the 2025 AI Engineer Summit are up, and you can save the date for AIE Singapore in April and AIE World's Fair 2025 in June.

Happy new year, and thanks for 100 great episodes! Please let us know what you want to see/hear for the next 100!

Full YouTube Episode with Slides/Charts. Like and subscribe and hit that bell to get notifs!

Timestamps:
* 00:00 Welcome to the 100th Episode!
* 00:19 Reflecting on the Journey
* 00:47 AI Engineering: The Rise and Impact
* 03:15 Latent Space Live and AI Conferences
* 09:44 The Competitive AI Landscape
* 21:45 Synthetic Data and Future Trends
* 35:53 Creative Writing with AI
* 36:12 Legal and Ethical Issues in AI
* 38:18 The Data War: GPU Poor vs. GPU Rich
* 39:12 The Rise of GPU Ultra Rich
* 40:47 Emerging Trends in AI Models
* 45:31 The Multi-Modality War
* 01:05:31 The Future of AI Benchmarks
* 01:13:17 Pionote and Frontier Models
* 01:13:47 Niche Models and Base Models
* 01:14:30 State Space Models and RWKV
* 01:15:48 Inference Race and Price Wars
* 01:22:16 Major AI Themes of the Year
* 01:22:48 AI Rewind: January to March
* 01:26:42 AI Rewind: April to June
* 01:33:12 AI Rewind: July to September
* 01:34:59 AI Rewind: October to December
* 01:39:53 Year-End Reflections and Predictions

Transcript

[00:00:00] Welcome to the 100th Episode!

[00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx for the 100th time today.

[00:00:12] swyx: Yay, um, and we're so glad that, yeah, you know, everyone has, uh, followed us in this journey. How do you feel about it? 100 episodes.

[00:00:19] Alessio: Yeah, I know.

[00:00:19] Reflecting on the Journey

[00:00:19] Alessio: Almost two years that we've been doing this. We've had four different studios. Uh, we've had a lot of changes. You know, we used to do this lightning round when we first started that we didn't like, and we tried to change the question. The answer

[00:00:32] swyx: was Cursor and Perplexity.

[00:00:34] Alessio: Yeah, I love Midjourney. It's like, do you really not like anything else? Like what's, what's the unique thing? And I think, yeah, we, we've also had a lot more research-driven content. You know, we had like Tri Dao, we had, you know, Jeremy Howard, we had more folks like that.

[00:00:47] AI Engineering: The Rise and Impact

[00:00:47] Alessio: I think we want to do more of that too in the new year, like having, uh, some of the Gemini folks, both on the research and the applied side. Yeah, but it's been a ton of fun. I think we both started, I wouldn't say as a joke, we were kind of like, oh, we should do a podcast. And I think we kind of caught the right wave, obviously. And I think your Rise of the AI Engineer post just kind of gave people somewhere to congregate, and then the AI Engineer Summit. And that's why when I look at our growth chart, it's kind of like a proxy for the AI engineering industry as a whole, which is almost like, even if we don't do that much, we keep growing just because there's so many more AI engineers. So did you expect that growth, or did you expect it would take longer for the AI engineer thing to kind of become, you know, something everybody talks about today?

[00:01:32] swyx: So, the sign that we have won is that Gartner puts it at the top of the hype curve right now. So Gartner has called the peak in AI engineering. I did not expect, um, to what level.
I knew that I was correct when I called it, because I did like two months of work going into that. But I didn't know, you know, how quickly it could happen, and obviously there's a chance that I could be wrong.

[00:01:52] swyx: But I think, like, most people have come around to that concept. Hacker News hates it, which is a good sign. But there's enough people that have defined it, you know. GitHub, when they launched GitHub Models, which is the Hugging Face clone, they put AI engineers in the banner, like, above the fold, like, in big letters. So I think it's kind of arrived as a meaningful and useful definition.

[00:02:12] swyx: I think people are trying to figure out where the boundaries are. I think that was a lot of the quote-unquote drama that happened behind the scenes at the World's Fair in June, because I think there's a lot of doubt or questions about where ML engineering stops and AI engineering starts. That's a useful debate to be had.

[00:02:29] swyx: In some sense, I actually anticipated that as well. So I intentionally did not put a firm definition there, because most of the successful definitions are necessarily underspecified, and it's actually useful to have different perspectives, and you don't have to specify everything from the outset.

[00:02:45] Alessio: Yeah, I was at, um, AWS re:Invent, and the line to get into like the AI engineering talk, so to speak, which is, you know, applied AI and whatnot, was like, there were hundreds of people just in line to go in.

[00:02:56] Alessio: I think that's kind of what enabled people, right? Which is what you kind of talked about. It's like, hey, look, you don't actually need a PhD, just, yeah, just use the model. And then maybe we'll talk about some of the blind spots that you get as an engineer, with the earlier posts that we also had on the Substack. But yeah, it's been a heck of a, heck of a two years.

[00:03:14] swyx: Yeah.

[00:03:15] Latent Space Live and AI Conferences

[00:03:15] swyx: You know, I was trying to view the conference as like, so NeurIPS is, I think, like 16, 17,000 people. And the Latent Space Live event that we held there was 950 signups, I think. The AI world, the ML world, is still very much research-heavy. And that's as it should be, because ML is very much in a research phase. But as we move this entire field into production, I think that ratio inverts into becoming more engineering-heavy. So at least I think engineering should be on the same level, even if it's never as prestigious. Like, it'll always be low status, because at the end of the day, you're manipulating APIs or whatever.

[00:03:51] swyx: But yeah, wrapping GPTs, but there's going to be an increasing stack and an art to doing these, these things well. And I, you know, I think that's what we're focusing on for the podcast, the conference, and basically everything I do seems to make sense. And I think we'll, we'll talk about the trends here that apply. It's, it's just very strange. So, like, there's a mix of, like, keeping on top of research while not being a researcher, and then putting that research into production. So, like, people always ask me, like, why are you covering NeurIPS?
Like, this is an ML research conference, and I'm like, well, yeah, I mean, we're not going to, like, understand everything or reproduce every single paper, but the stuff that is being found here is going to make it through into production at some point, you hope.

[00:04:32] swyx: And then actually, like, when I talk to the researchers, they actually get very excited, because they're like, oh, you guys are actually caring about how this goes into production, and that's what they really, really want. The measure of success was previously just peer review, right? Getting 7s and 8s on their, um, academic review conferences and stuff. Like, citations is one metric, but money is a better metric.

[00:04:51] Alessio: Money is a better metric. Yeah, and there were about 2,200 people on the livestream, or something like that. So I tried my best to moderate, but it was a lot spicier in person with Jonathan and, and Dylan than it was in the chat on YouTube.

[00:05:06] swyx: I would say that I actually also created Latent Space Live in order to address flaws that are perceived in academic conferences. This is not NeurIPS-specific, it's ICML, NeurIPS. Basically, it's very sort of oriented towards the PhD student, uh, market, job market, right? Like, literally, basically everyone's there to advertise their research and skills and get jobs. And then obviously all the, the companies go there to hire them. And I think that's great for the individual researchers, but for people going there to get info it's not great, because you have to read between the lines, bring a ton of context, in order to understand every single paper. So what is missing is effectively what I ended up doing, which is, domain by domain, go through and recap the best of the year. Survey the field.

[00:05:48] swyx: And there are, like, NeurIPS had a, uh, I think ICML had a, like, a position paper track, NeurIPS added a benchmarks, uh, datasets track. These are ways in which to address that issue. Uh, there's always workshops as well. Every, every conference has, you know, a last day of workshops and stuff that provide more of an overview. But they're not specifically prompted to do so. And I think really, uh, organizing a conference is just about getting good speakers and giving them the correct prompts. And then they will just go and do that thing, and they do a very good job of it. So I think Sarah did a fantastic job with the startups prompt.

[00:06:21] swyx: I can't list everybody, but we did best of 2024 in startups, vision, open models, post-transformers, synthetic data, small models, and agents. And then we also did a quick one on reasoning with Nathan Lambert. And then the last one, obviously, was the debate that people were very hyped about. It was very awkward. And I'm really, really thankful for Jonathan Frankle, basically, who stepped up to challenge Dylan. Because Dylan was like, yeah, I'll do it. But he was pro-scaling. And I think everyone who is like in AI is pro-scaling, right? So you need somebody who's ready to publicly say, no, we've hit a wall. So that means you're saying Sam Altman's wrong. You're saying, um, you know, everyone else is wrong. It helps that this was the day before Ilya went up on stage and then said pre-training has hit a wall and data has hit a wall.
So actually Jonathan ended up winning, and then Ilya supported that statement, and then Noam Brown on the last day further supported that statement as well.

[00:07:17] swyx: So it's kind of interesting that I think the consensus kind of going in was that we're not done scaling, like, you should believe in the Bitter Lesson. And then, four straight days in a row, you had Sepp Hochreiter, who is the creator of the LSTM, along with everyone's favorite OG in AI, which is Juergen Schmidhuber. He said that, um, we're pre-training into a wall, or like, we've run into a different kind of wall. And then we have, you know, Jonathan Frankle, Ilya, and then Noam Brown all saying variations of the same thing: that we have hit some kind of wall in the status quo of what pre-trained, scaling large pre-trained models has looked like, and we need a new thing.

[00:07:54] swyx: And obviously the new thing for people is, either people are calling it inference-time compute or test-time compute. I think the collective terminology has been inference time, and I think that makes sense, because test time, calling it test, has a very pre-trained bias, meaning that the only reason for running inference at all is to test your model. That is not true. Right. Yeah. So, so, I quite agree that OpenAI seems to have adopted, or the community seems to have adopted, this terminology of ITC instead of TTC. And that, that makes a lot of sense, because like now we care about inference, even right down to compute optimality. Like, I actually interviewed this author who re-covered, or reviewed, the Chinchilla paper. The Chinchilla paper is compute-optimal training, but what is not stated in there is that it's pre-training compute-optimal training. And once you start caring about inference compute-optimal training, you have a different scaling law. And in a way that we did not know last year.

[00:08:45] Alessio: I wonder, because Jonathan is, he's also on the side of attention is all you need. Like, he had the bet with Sasha. So I'm curious, like, he doesn't believe in scaling, but he thinks the transformer, I wonder if he's still...

[00:08:56] swyx: So, so he, obviously everything is nuanced, and you know, I told him to play a character for this debate, right? So he actually does. Yeah. He still, he still believes that we can scale more. Uh, he just assumed the character to be very game for, for playing this debate. So even more kudos to him, that he assumed a position that he didn't believe in and still won the debate.

[00:09:16] Alessio: Get rekt, Dylan. Um, do you just want to quickly run through some of these things? Like, uh, Sarah's presentation, just the highlights.

[00:09:24] swyx: Yeah, we can't go through everyone's slides, but I pulled out some things as a factor of, like, stuff that we were going to talk about. And we'll

[00:09:30] Alessio: publish

[00:09:31] swyx: the rest. Yeah, we'll publish on this feed the best of 2024 in those domains. And hopefully people can benefit from the work that our speakers have done. But I think it's, uh, these are just good slides. And I've been, I've been looking for sort of end-of-year recaps from, from people.

[00:09:44] The Competitive AI Landscape

[00:09:44] swyx: The field has progressed a lot. You know, I think the max Elo in 2023 on LMSys used to be 1200, for LMSys Elos.
And now everyone is at least at, uh, 1275 in their Elos, and this is across Gemini, ChatGPT, Grok, 01.ai, which, with their Yi-Large model, and Anthropic, of course. It's a very, very competitive race. There are multiple frontier labs all racing, but there is a clear tier-zero frontier, and then there's like a tier one. It's like, I wish I had everything else. Tier zero is extremely competitive. It's effectively now a three-horse race between Gemini, uh, Anthropic, and OpenAI.

[00:10:21] swyx: I would say that people are still holding out a candle for xAI. xAI, I think, for some reason, because their API was very slow to roll out, is not included in these metrics. So it's actually quite hard to put on there. As someone who also does charts, xAI is continually snubbed, because they don't work well with the benchmarking people. Yeah, yeah, yeah. It's a little trivia for why xAI always gets ignored. The other thing is market share. So these are slides from Sarah. We have it up on the screen. It has gone from very heavily OpenAI. So we have some numbers and estimates. These are from Ramp, estimates of OpenAI market share in December 2023.

[00:11:01] swyx: And this is basically, what is it, GPT being 95 percent of production traffic. And I think if you correlate that with stuff that we asked Harrison Chase on the LangChain episode, it was true. And then Claude 3 launched in the middle of this year. I think Claude 3 launched in March, Claude 3.5 Sonnet was in June-ish. And you can start seeing the market share shift towards Anthropic very, very aggressively. The more recent one is Gemini. So if I scroll down a little bit, this is an even more recent dataset. So Ramp's dataset ends in September 2024. Gemini has basically launched a price war at the low end, uh, with Gemini Flash, uh, being basically free for personal use. Like, I think people don't understand the free tier. It's something like a billion tokens per day. Unless you're trying to abuse it, you cannot really exhaust your free tier on Gemini. They're really trying to get you to use it. They know they're in like third place, um, fourth place, depending how you, how you count.

[00:11:58] swyx: And so they're going after the lower tier first, and then, you know, maybe the upper tier later. But yeah, Gemini Flash, according to OpenRouter, is now 50 percent of their OpenRouter requests. Obviously, these are the small requests. These are small, cheap requests that are mathematically going to be more. The smart ones obviously are still going to OpenAI. But, you know, it's a very, very big shift in the market. Like, basically 2023, 2022, going into 2024, OpenAI has gone from 95 market share to somewhere between 50 to 75 market share.

[00:12:29] Alessio: Yeah. I'm really curious how Ramp does the attribution to the model, if it's API, because I think it's all credit card spend. Well, but it's all, the credit card doesn't say. Maybe the, maybe when they do expenses, they upload the PDF. But yeah, the, the Gemini thing I think makes sense. I think that was one of my main 2024 takeaways, that like the best small model companies are the large labs, which is not something I would have thought, that the open source kind of like long tail would be like the small model...

[00:12:53] swyx: Yeah, different sizes of small models we're talking about here, right?
Like, so small model here for Gemini is 8B, right? Uh, mini, we don't know what the small model size is, but yeah, it's probably in the double digits, or maybe single digits, but probably double digits. The open source community has kind of focused on the one to three B size.

[00:13:11] swyx: Mm-hmm. Yeah. Maybe

[00:13:12] swyx: zero, maybe 0.5B, uh, that's Moondream, and if that is small for you, then, then that's great. It makes sense that we, we have a range for small now, which is like, may, maybe one to five B. Yeah, I'll even put that at, at, at the high end. And so this includes Gemma from Gemini as well, but also includes the Apple Foundation models, which I think Apple Foundation is 3B.

[00:13:32] Alessio: Yeah. No, that's great. I mean, I think at the start small just meant cheap. I think today small is actually a more nuanced discussion, you know, that people weren't really having before.

[00:13:43] swyx: Yeah, we can keep going. This is a slide that I slightly disagree with Sarah on. She's pointing to the Scale SEAL leaderboard. I think the researchers that I talked with at NeurIPS were kind of positive on this, because basically you need private test sets to prevent contamination. And Scale is one of maybe three or four people this year that has really made an effort in doing a credible private-test-set leaderboard. Llama 405B does well compared to Gemini and GPT-4o. And I think that's good. I would say that, you know, it's good to have an open model that is that big, that does well on those metrics.

[00:14:23] swyx: But anyone putting 405B in production will tell you, if you scroll down a little bit to the Artificial Analysis numbers, that it is very slow and very expensive to infer. Um, it doesn't even fit on one node of, uh, of H100s. Cerebras will be happy to tell you they can serve 405B on their super-large chips. But, um, you know, if you need to do anything custom to it, you're still kind of constrained. So, is 405B really that relevant? Like, I think most people are basically saying that they only use 405B as a teacher model to distill down to something. Even Meta is doing it. So with Llama 3.3 launched, they only launched the 70B, because they used 405B to distill the 70B.

[00:15:03] swyx: So I don't know if open source is keeping up. I think the, the open source industrial complex is very invested in telling you that the gap is narrowing; I kind of disagree. I think that the gap is widening with o1. I think there are very, very smart people trying to narrow that gap, and they should. I really wish them success, but you cannot use a chart that is nearing 100 in your saturation chart and say, look, the distance between open source and closed source is narrowing. Of course it's going to narrow, because you're near 100. This is stupid. But in metrics that matter, is open source narrowing? Probably not for o1 for a while. And it's really up to the open source guys to figure out if they can match o1 or not.

[00:15:46] Alessio: I think inference-time compute is bad for open source, just because, you know, Zuck can donate the flops at training time, but he cannot donate the flops at inference time. So it's really hard to actually keep up on that axis. Big, big business model shift. So I don't know what that means for the GPU clouds. I don't know what that means for the hyperscalers, but obviously the big labs have a lot of advantage.
Because, like, it's not a static artifact that you're putting the compute in. You're kind of doing that still, but then you're putting a lot of compute at inference too.

[00:16:17] swyx: Yeah, yeah, yeah. Um, I mean, Llama 4 will be reasoning-oriented. We talked with Thomas Scialom. Um, kudos for getting that episode together. That was really nice. Good, well timed. Actually, I connected with the Meta AI guy, uh, at NeurIPS, and, um, yeah, we're going to coordinate something for Llama 4. Yeah, yeah,

[00:16:32] Alessio: and our friend, yeah. Clara Shih just joined to lead the business agent side. So I'm sure we'll have her on in the new year.

[00:16:39] swyx: Yeah. So, um, my comment on, on the business model shift, this is super interesting. Apparently it is wide knowledge that OpenAI wanted more than $6.6 billion for their fundraise. They wanted to raise, you know, higher, and they did not. And what that means is basically like, it's very convenient that we're not getting GPT-5, which would have been a larger pretrain. We should have a lot of upfront money. And instead we're, we're converting fixed costs into variable costs, right, and passing it on effectively to the customer. And it's so much easier to take margin there, because you can directly attribute it to like, oh, you're using this more, therefore you, you pay more of the cost, and I'll just slap a margin in there. So that lets you control your gross margin, and tie your, your spend, or your sort of inference spend, accordingly. And it's just really interesting that this change in the sort of inference paradigm has arrived exactly at the same time that the funding environment for pre-training is effectively drying up, kind of. I feel like maybe the VCs are very in tune with research anyway, so like, they would have noticed this, but, um, it's just interesting.

[00:17:43] Alessio: Yeah, and I was looking back at our yearly recap of last year, and the big thing was like the Mixtral price fights, you know. And I think now it's almost like there's nowhere to go. Like, you know, Gemini Flash is basically giving it away for free. So I think this is a good way for the labs to generate more revenue and pass down some of the compute to the customer. I think they're going to

[00:18:02] swyx: keep going. I think that $2,000 tier will come.

[00:18:05] Alessio: Yeah, I know. Totally. I mean, next year, the first thing I'm doing is signing up for Devin, signing up for the Pro ChatGPT, just to try. I just want to see, what does it look like to spend a thousand dollars a month on AI?

[00:18:17] swyx: Yes. Yes. I think if your, if your job is, at least, AI content creator or VC or, you know, someone whose job it is to stay on, stay on top of things, you should already be spending like a thousand dollars a month on, on stuff. And then obviously easy to spend, hard to use. You have to actually use it. The good thing is that actually Google lets you do a lot of stuff for free now. So like Deep Research, that they just launched, uses a ton of inference, and it's, it's free while it's in preview.

[00:18:45] Alessio: Yeah. They need to put that in Lindy. I've been using Lindy lately. I've built a bunch of things once they had flows, because I liked the new thing. It's pretty good. I even did a phone call assistant. Um, yeah, they just launched Lindy Voice.
Yeah, I think once they get advanced-voice-mode-like capability. Today it's still like speech-to-text, you can kind of tell. Um, but it's good for like reservations and things like that. So I have a meeting-prepper thing.

[00:19:13] swyx: And so it's good. Okay. I feel like we've, we've covered a lot of stuff. Uh, I, yeah, I, you know, I think we will go over the individual, uh, talks in a separate episode. Uh, I don't want to take too much time with, uh, this stuff, but suffice to say that there is a lot of progress in each field. Uh, we covered vision. Basically this is all the audience voting for what they wanted, and then I just invited the best people I could find in each area, especially agents. Um, Graham, who I talked to at ICML in Vienna, he is currently still number one. It's very hard to stay on top of SWE-Bench. OpenHands is currently still number one on SWE-Bench Full, which is the hardest one. He had very good thoughts on agents, which I, which I'll highlight for people. Everyone is saying 2025 is the year of agents, just like they said last year. And, uh, but he had thoughts on like eight parts of what are the frontier problems to solve in agents. And so I'll highlight that talk as well.

[00:20:05] Alessio: Yeah. The number six, which is how can agents learn more about the environment, has been super interesting to us as well, just to think through. Because, yeah, how do you put an agent in an enterprise, where most things in an enterprise have never been public, you know, a lot of the tooling, like the code bases and things like that. So, yeah, there's no indexing and RAG. Well, yeah, but it's more like, you can't really RAG things that are not documented, but people know them based on how they've been doing it, you know. So I think there's almost this, you know, institutional knowledge. Yeah, the boring word is kind of like business process extraction. Yeah, yeah, I see. It's like, how do you actually understand how these things are done? Um, and I think today the, the problem is that, yeah, the agents most people are building are good at following instructions, but are not as good at extracting them from you. Um, so I think that will be a big unlock.

[00:20:55] Alessio: Just to touch quickly on the Jeff Dean thing: I thought it was pretty, I mean, we'll link it in the, in the things, but I think the main focus was like, how do you use ML to optimize the systems, instead of just focusing on ML to do something else? Yeah, I think speculative decoding, we had, you know, Eugene from RWKV on the podcast before. Like, he's doing a lot of that with Featherless AI.

[00:21:12] swyx: Everyone is. I would say it's the norm. I'm a little bit uncomfortable with how much it costs, because it does use more of the GPU per call. But because everyone is so keen on fast inference, then yeah, makes sense.

[00:21:24] Alessio: Exactly. Um, yeah, but we'll link that. Obviously Jeff is great.

[00:21:30] swyx: Jeff is, Jeff's talk was more, it wasn't focused on Gemini. I think people got the wrong impression from my tweet. It's more about how Google approaches ML and uses ML to design systems, and then systems feed back into ML.
And I think this ties in with Loubna's talk.

[00:21:45] Synthetic Data and Future Trends

[00:21:45] swyx: on synthetic data, where it's basically the story of bootstrapping of humans and AI in AI research, or AI in production. So her talk was on synthetic data, where like how much synthetic data has grown in 2024, on the pre-training side, the post-training side, and the eval side. And I think Jeff then also extended it basically to chips, uh, to chip design. So he'd spend a lot of time talking about AlphaChip. And most of us in the audience are like, we're not working on hardware, man. Like, you guys are great. TPU is great. Okay. We'll buy TPUs.

[00:22:14] Alessio: And then there was the earlier talk. Yeah. But, and then we have, uh, I don't know if we're calling them essays. What are we calling these? But

[00:22:23] swyx: for me, it's just like a bonus for Latent Space supporters, because I feel like they haven't been getting anything. And then I wanted a more high-frequency way to write stuff. Like, that one I wrote in an afternoon. I think basically we now have an answer to What Ilya Saw. It's one year since the blip. And we know what he saw in 2014. We think we know what he saw in 2024. He gave some hints, and then we have vague indications of what he saw in 2023.

[00:22:54] swyx: Oh, and then 2016 as well, because of this lawsuit with Elon. OpenAI is publishing emails from Sam's, like, his personal text messages to Shivon Zilis, or whatever. So, like, we have emails from Ilya saying, this is what we're seeing in OpenAI, and this is why we need to scale up GPUs. And I think it's very prescient in 2016 to write that. And so, like, it is exactly, like, basically his insights. It's him and Greg, basically, just kind of driving the scaling up of OpenAI, while they're still playing Dota. They're like, no, like, we see the path here.

[00:23:30] Alessio: Yeah, and it's funny, yeah, they even mention, you know, we can only train on 1v1 Dota. We need to train on 5v5, and that takes too many GPUs. Yeah,

[00:23:37] swyx: and at least for me, I can speak for myself, like, I didn't see the path from Dota to where we are today. I think even, maybe if you ask them, like, they wouldn't necessarily draw a straight line. Yeah,

[00:23:47] Alessio: no, definitely. But I think like that was like the whole idea of almost like the RL, and we talked about this with Nathan on his podcast. It's like, with RL, you can get very good at specific things, but then you can't really generalize as much. And I think the language models are like the opposite, which is like, you're going to throw all this data at them and scale them up, but then you really need to drive them home on a specific task later on. And we'll talk about the OpenAI reinforcement fine-tuning, um, announcement too, and all of that. But yeah, I think like scale is all you need. That's kind of what Ilya will be remembered for. And I think just maybe to clarify on the pre-training-is-over thing that people love to tweet: I think the point of the talk was like, everybody, we're scaling these chips, we're scaling the compute, but the second ingredient, which is data, is not scaling at the same rate. So it's not necessarily that pre-training is over. It's kind of like, what got us here won't get us there.
In his email, he predicted like 10x growth every two years, or something like that. And I think maybe now it's like, you know, you can 10x the chips again, but

[00:24:49] swyx: I think it's 10x per year. Was it? I don't know.

[00:24:52] Alessio: Exactly. And Moore's law is like 2x. So it's like, you know, much faster than that. And yeah, I like the fossil-fuel-of-AI analogy. It's kind of like, you know, the little background tokens thing. So the OpenAI reinforcement fine-tuning is basically like, instead of fine-tuning on data, you fine-tune on a reward model. So it's basically like, instead of being data-driven, it's task-driven. And I think people have tasks to do; they don't really have a lot of data. So I'm curious to see how that changes how many people fine-tune, because I think this is what people run into. It's like, oh, you can fine-tune Llama. And it's like, okay, where do I get the data to fine-tune it on, you know? So it's great that we're moving the thing. And then I really liked that he had this chart where, you know, the brain mass and the body mass thing is basically like, mammals scaled linearly by brain and body size, and then humans kind of broke off the slope. So it's almost like maybe the mammal slope is like the pre-training slope, and then the post-training slope is like the, the human one.

[00:25:49] swyx: Yeah. I wonder what the, I mean, we'll know in 10 years, but I wonder what the y-axis is for, for Ilya's SSI. We'll try to get them on.

[00:25:57] Alessio: Ilya, if you're listening, you're welcome here. Yeah, and then he had, you know, what comes next: like agents, synthetic data, inference compute. I thought all of that was like that.

[00:26:05] swyx: I don't think he was dropping any alpha there. Yeah, yeah, yeah.

[00:26:07] Alessio: Yeah. Any other NeurIPS highlights?

[00:26:10] swyx: I think that there was comparatively a lot more work. Oh, by the way, I need to plug that, uh, my friend Yi made this, like, nice little thing. Yeah, that was really

[00:26:20] swyx: nice.

[00:26:20] swyx: Uh, of, uh, of, like, all the... she called it must-read papers of 2024. So I laid out some of these at NeurIPS, and it was just gone. Like, everyone just picked it up, because people are dying for, like, little guidance and visualizations. And so, uh, I thought it was really super nice that we got there.

[00:26:38] Alessio: Should we do a Latent Space book for each year? Uh, I thought about it. For each year we should. Coffee table book. Yeah. Yeah. Okay. Put it in the will. Hi, Will. By the way, we haven't introduced you. He's our new, you know, generalist, our Jamie. You need to

[00:26:52] swyx: pull up more things. One thing I saw that, uh... okay, one fun one, and then one more general one. So the fun one is this paper on agent collusion. This is a paper on steganography. This is secret collusion among AI agents, multi-agent deception via steganography. I tried to go to NeurIPS in order to find these kinds of papers, because the real reason, like, NeurIPS this year has a lottery system. A lot of people actually even go and don't buy tickets, because they just go and attend the side events. And then also the people who go end up crowding around the most popular papers, which you already know and already read before you showed up to NeurIPS.
So the only reason you go there is to talk to the paper authors, but there's something like 10,000 other papers out there that, you know, are just people's work that they, that they did on their own, and they failed to get attention for one reason or another.

[00:27:42] swyx: And this was one of them. Uh, it was all the way at the back. And this is a DeepMind paper that actually focuses on collusion between AI agents, uh, by hiding messages in the text that they generate. Uh, so that's what steganography is. So a very simple example would be the first letter of every word. If you pick that out, you know, then the code sends a different message than that. But something I've always emphasized is, to LLMs, we read left to right. LLMs can read up, down, sideways, you know, in random character order. And it's the same to them as it is to us. So if we were ever to get, you know, self-motivated, unaligned LLMs that were trying to collaborate to take over the planet, this would be how they do it. They spread messages among us in the messages that we generate.

[00:28:19] swyx: And he developed a scaling law for that. So he marked, I'm showing it on screen right now, the emergence of this phenomenon. Basically, for example, for cipher encoding: GPT-2, Llama 2, Mixtral, GPT-3.5, zero capabilities, and suddenly GPT-4. And this is the kind of Jason Wei-type emergence properties that people look for. I think what made this paper stand out as well: so he developed the benchmark for steganography collusion, and he also focused on Schelling point collusion, which is very low coordination. For agreeing on a decoding/encoding format, you kind of need to have some agreement on that. But, but Schelling point means very, very low, or almost no, coordination. So for example, if the only message I give you is "meet me in New York", and you're not told where or when, you would probably meet me at Grand Central Station. Grand Central Station is a Schelling point, and it's probably sometime during the day. The Schelling point of New York is Grand Central. To that extent, Schelling points for steganography are things like the, the, the common decoding methods that we talked about. It will be interesting at some point in the future when we are worried about alignment. It is not interesting today, but it's interesting that DeepMind is already thinking about this.

[00:29:36] Alessio: I think that's one of the hardest things about NeurIPS. It's like the long tail.

[00:29:41] swyx: I found a pricing guy. I'm going to feature him on the podcast. Basically, this guy from NVIDIA worked out the optimal pricing for language models. It's basically an econometrics paper at NeurIPS, where everyone else is talking about GPUs. And the guy with the GPUs is

[00:29:57] Alessio: talking

[00:29:57] swyx: about economics instead. That was the sort of fun one. So the focus I saw is that model papers at NeurIPS are kind of dead. No one really presents models anymore. It's just datasets.

[00:30:12] swyx: This is all the grad students are working on. So there was a datasets track, and then I was looking around like, I was like, you don't need a datasets track, because every paper is a datasets paper. And so datasets and benchmarks, they're kind of flip sides of the same thing. So yeah. Cool.
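To make the first-letter scheme swyx describes concrete, here is a toy acrostic encoder/decoder in Python. This is only an illustration of the channel, not the paper's benchmark or models; the vocabulary is invented, and a colluding LLM would generate fluent text under the same first-letter constraint rather than look words up in a table:

    # Toy steganography: hide a message in the first letter of each word.
    VOCAB = {
        "e": "emit", "h": "hidden", "l": "language", "p": "plainly",
    }

    def encode(secret: str) -> str:
        """Produce cover text whose words' first letters spell `secret`."""
        return " ".join(VOCAB[c] for c in secret.lower())

    def decode(cover: str) -> str:
        """Recover the hidden message from the first letter of each word."""
        return "".join(word[0] for word in cover.split())

    cover = encode("help")  # -> "hidden emit language plainly"
    assert decode(cover) == "help"

The Schelling-point angle is that both agents must land on the same decoding rule (here, "first letter of each word") without ever negotiating it, the way two strangers told only "meet in New York" both pick Grand Central.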
Yeah, if you're a grad student, you're GPU poor, you kind of work on that.

[00:30:30] swyx: And then the sort of big model people walk around and pick the ones that they like, and then they use them in their models. And that's, that's kind of how it develops. I, I feel like, um, like last year you had people like Haotian Liu, who worked on LLaVA, which is, take Llama and add vision. And then xAI hired him, and he added vision to Grok. Now he's the vision Grok guy. This year, I don't think there was any of those.

[00:30:55] Alessio: What were the most popular, like, orals? Last year it was like the Monarch Mixer, I think, was the most attended. Yeah, uh, I need to look it up. Yeah, I mean, if nothing comes to mind, that's also kind of like an answer in a way. But I think last year there was a lot of interest in, like, furthering models and, like, different architectures and all of that.

[00:31:16] swyx: I will say that I felt the orals, the oral picks this year, were not very good. Either that, or maybe it's just me. So that's the highlight of how I have changed in terms of how I view papers. So like, in my estimation, two of the best papers this year for datasets are DataComp and RefinedWeb or FineWeb. These are two actually industrially used papers, not highlighted for a while. I think DCLM got the spotlight; FineWeb didn't even get the spotlight. So like, it's just that the picks were different.

[00:31:48] swyx: But one thing that does get a lot of play, that a lot of people are debating, is the role of schedules. This is the schedule-free optimizer paper from Meta, from Aaron Defazio. And this year in the ML community, there's been a lot of chat about Shampoo, SOAP, all the bathroom amenities for optimizing your learning rates. And, uh, most people at the big labs who I asked about this, um, say that it's cute, but it's not something that matters. I don't know, but it's something that was discussed and very, very popular.

[00:32:19] Alessio: Four Wars of AI recap, maybe, just quickly. Um, where do you want to start? Data?

[00:32:26] swyx: So to remind people, this is the Four Wars piece that we did as one of our earlier recaps of this year. And the belligerents are on the left: journalists, writers, artists, anyone who owns IP, basically. New York Times, Stack Overflow, Reddit, Getty, Sarah Silverman, George R.R. Martin. Yeah, and I think this year we can add Scarlett Johansson to that side of the fence. So anyone suing OpenAI, basically. I actually wanted to get a snapshot of all the lawsuits. I'm sure some lawyer can do it. That's the data quality war. On the right hand side, we have the synthetic data people, and I think we talked about Loubna's talk, you know, really showing how much synthetic data has come along this year. I think there was a bit of a fight between Scale AI and the synthetic data community, because Scale AI published a paper saying that synthetic data doesn't work. Surprise, surprise, Scale AI is the leading vendor of non-synthetic data.
Only

[00:33:17] Alessio: cage-free annotated data is useful.

[00:33:21] swyx: So I think there's some debate going on there, but I don't think it's much debate anymore that at least synthetic data, for the reasons that are addressed in Loubna's talk, makes sense. I don't know if you have any perspectives there.

[00:33:34] Alessio: I think, again, going back to the reinforcement fine-tuning, I think that will change a little bit how people think about it. I think today people mostly use synthetic data, yeah, for distillation, and kind of fine-tuning a smaller model from a larger model. I'm not super aware of how the frontier labs use it, outside of like the Rephrasing the Web thing that Apple also did. But yeah, I think it'll be useful. I think whether or not that gets us the big next step, I think that's maybe TBD, you know. I think people love talking about data because it's like a GPU-poor thing, you know. I think, uh, synthetic data is something that people can do, you know, so they feel more opinionated about it compared to, yeah, the optimizer stuff, which is like,

[00:34:17] swyx: they don't

[00:34:17] Alessio: really work

[00:34:18] swyx: on.

[00:34:18] swyx: I think that there is an angle to the reasoning synthetic data. So this year, we covered in the paper club the STaR series of papers. So that's STaR, Q-STaR, V-STaR. It basically helps you to synthesize reasoning steps, or at least distill reasoning steps from a verifier. And if you look at the OpenAI RFT API that they released, or that they announced, basically they're asking you to submit graders, or they choose from a preset list of graders. Basically it feels like a way to create valid synthetic data for them to fine-tune their reasoning paths on. Um, so I think that is another angle where it starts to make sense.

[00:34:49] swyx: And so it's very funny that basically all the data quality wars, between, let's say, the music industry or the newspaper publishing industry or the textbooks industry and the big labs, are all of the pre-training era. And then in the new era, the reasoning era, nobody has any problem with all the reasoning, especially because it's all sort of math- and science-oriented with, with very reasonable graders. I think the more interesting next step is, how does it generalize beyond STEM? We've been using o1 for... and I would say, for summarization and creative writing and instruction following, I think it's underrated. I started using o1 in our intro songs before we killed the intro songs, but it's very good at writing lyrics. You know, I can actually say, like, I think in one of the o1 pro demos, all of these things that Noam was showing, was that, you know, you can write an entire paragraph or three paragraphs without using the letter A, right?

[00:35:53] Creative Writing with AI

[00:35:53] swyx: So like, literally just anything, instead of token... like, not even token-level, character-level manipulation and counting and instruction following. It's, uh, it's very, very strong. And so no surprises, when I ask it to rhyme, uh, and to, to create song lyrics, it's going to do that very much better than previous models.
So I think it's underrated for creative writing.

[00:36:11] Alessio: Yeah.

[00:36:12] Legal and Ethical Issues in AI

[00:36:12] Alessio: What do you think is the rationale that they're going to have in court, when they don't show you the thinking traces of o1, but then they want us to... like, they're getting sued for using other publishers' data, you know, but then on their end, they're like, well, you shouldn't be using my data to then train your model. So I'm curious to see how that kind of comes...

[00:36:32] swyx: Yeah, I mean, OpenAI has many ways to punish people without taking them to court. They already banned ByteDance for distilling their, their info. And so anyone caught distilling the chain of thought will be just disallowed to continue on, on, on the API. And it's fine. It's no big deal. Like, I don't even think that's an issue at all, just because the chains of thought are pretty well hidden. Like, you have to work very, very hard to, to get it to leak. And then even when it leaks the chain of thought, you don't know if it's, if it's... The bigger concern is actually that there's not that much IP hiding behind it. That Cosine, which we talked about, we talked to him on Dev Day, can just fine-tune 4o to beat o1. Claude Sonnet so far is beating o1 on coding tasks, at least o1-preview, without being a reasoning model. Same for Gemini Pro, or Gemini 2.0. So like, how much is reasoning important? How much of a moat is there in this, like, all of these are proprietary sort of training data that they've presumably accomplished? Because even DeepSeek was able to do it, and they had, you know, two months' notice to do this, to do R1. So, it's actually unclear how much moat there is. Obviously, you know, if you talk to the Strawberry team, they'll be like, yeah, I mean, we spent the last two years doing this. So, we don't know. And it's going to be interesting, because there'll be a lot of noise from people who say they have inference-time compute and actually don't, because they just have fancy chain of thought. And then there's other people who actually do have very good chain of thought. And you will not see them on the same level as OpenAI, because OpenAI has invested a lot in building up the mythology of their team. Um, which makes sense. Like, the real answer is somewhere in between.

[00:38:13] Alessio: Yeah, I think that's kind of like the main data war story developing.

[00:38:18] The Data War: GPU Poor vs. GPU Rich

[00:38:18] Alessio: GPU poor versus GPU rich. Yeah. Where do you think we are? I think there was, again, going back to the small model thing, there was a time in which the GPU poor were kind of like the rebel faction, working on these models that were open and small and cheap. And I think today people don't really care as much about GPUs anymore. You also see it in the price of the GPUs. Like, you know, that market has kind of plummeted, because people don't want to be... they want to be GPU-free. They don't even want to be poor. They just want to be, you know, completely without them. Yeah. How do you think about this war? You

[00:38:52] swyx: can tell me about this, but like, I feel like the, the appetite for GPU-rich startups, like the, you know, the, the funding plan is, we will raise 60 million, and we'll give 50 of that to NVIDIA.

[00:39:01] swyx: That is gone, right?
Like, no one's, no one's pitching that. This was literally the plan, the exact plan, of like four or five startups I can name, you know, this time last year. So yeah, GPU-rich startups gone.

[00:39:12] The Rise of GPU Ultra Rich

[00:39:12] swyx: But I think the GPU ultra-rich, the GPU ultra-high-net-worth, is still going. So, um, now we're, you know, we had Leopold's essay on the trillion-dollar cluster. We're not quite there yet. We have multiple labs, um, you know, xAI very famously, you know, Jensen Huang praising them for being best boy number one in spinning up a 100,000-GPU cluster in like 12 days or something. So likewise at Meta, likewise at OpenAI, likewise at the other labs as well. So the GPU ultra-rich are going to keep doing that, because I think partially it's an article of faith now that you just need it. Like, you don't even know what you're going to use it for. You just, you just need it. And it makes sense, especially if we're going into more researchy territory than we are. So let's say 2020 to 2023 was let's-scale-big-models territory, because we had GPT-3 in 2020 and we were like, okay, we'll go from 175B to 1.8T. And that was GPT-3 to GPT-4. Okay, that's done. As far as everyone is concerned, Opus 3.5 is not coming out, GPT-4.5 is not coming out, and Gemini 2, we don't have Pro, whatever. We've hit that wall. Maybe I'll call it the 2-trillion-parameter wall. We're not going to 10 trillion. No one thinks it's a good idea, at least from training costs, from the amount of data, or at least the inference. Would you pay 10x the price of GPT-4? Probably not. Like, you want something else that, that is at least more useful. So it makes sense that people are pivoting in terms of their inference paradigm.

[00:40:47] Emerging Trends in AI Models

[00:40:47] swyx: And so when it's more researchy, then you actually need more just general-purpose compute to mess around with, uh, at the exact same time that production deployments of the previous paradigm are still ramping up, um, uh, pretty aggressively. So it makes sense that the GPU rich are growing. We have now interviewed both Together and Fireworks and Replicate. Uh, we haven't done Anyscale yet. But I think Amazon, maybe kind of a sleeper one, Amazon, in the sense of, at re:Invent I wasn't expecting them to do so well, but they are now a foundation model lab. It's kind of interesting. Um, I think, uh, you know, David went over there and started just creating models.

[00:41:25] Alessio: Yeah, I mean, that's the power of prepaid contracts. I think a lot of AWS customers, you know, they do these big reserved instance contracts, and now they've got to use their money. That's why so many startups get bought through the AWS Marketplace, so they can kind of bundle them together and get preferred pricing.

[00:41:42] swyx: Okay, so maybe GPU super-rich doing very well, GPU middle class dead, and then GPU

[00:41:48] Alessio: poor. I mean, my thing is, everybody should just be GPU rich. There shouldn't really be... even the GPU poorest, it's like, does it really make sense to be GPU poor? Like, if you're GPU poor, you should just use the cloud.
Yes, you know, and I think there might be a future, once we kind of figure out what the size and shape of these models is, where like the tinybox and these things come to fruition, where you can be GPU poor at home. But I think today it's like, why are you working so hard to get these models to run on very small clusters, where it's like, it's so cheap to run them?

[00:42:21] Alessio: Yeah, yeah,

[00:42:22] swyx: yeah. I think mostly people think it's cool. People think it's a stepping stone to scaling up. So they aspire to be GPU rich one day, and they're working on new methods. Like Nous Research, like probably the most deep-tech thing they've done this year is DisTrO, or whatever the new name is. There's a lot of interest in heterogeneous computing, distributed computing. I tend generally to de-emphasize that historically, but it may be coming to a time where it is starting to be relevant. I don't know. You know, SF Compute launched their compute marketplace this year, and like, who's really using that? Like, it's a bunch of small clusters, disparate types of compute, and if you can make that useful, then that will be very beneficial to the broader community. But maybe still not the source of frontier models. It's just going to be a second tier of compute that is unlocked for people, and that's fine. But yeah, I mean, I think this year, I would say, a lot more on-device. We are... I now have Apple Intelligence on my phone. Doesn't do anything apart from summarize my notifications. But still, not bad. Like, it's multimodal.

[00:43:25] Alessio: Yeah, the notification summaries are so-and-so in my experience.

[00:43:29] swyx: Yeah, but they add, they add juice to life. And then, um, Chrome Nano, uh, Gemini Nano is coming out in Chrome. Uh, they're still feature-flagged, but you can, you can try it now if you, if you use the, uh, the alpha. And so, like, I, I think, like, you know, we're getting the sort of GPU-poor version of a lot of these things coming out, and I think it's quite useful. Like Windows as well, rolling out RWKV in sort of every Windows deployment is super cool. And I think the last thing that I never put in this GPU-poor war, that I think I should now, is the number of startups that are GPU poor but still scaling very well, as sort of wrappers on top of either a foundation model lab or a GPU cloud.

[00:44:10] swyx: GPU cloud, it would be Suno. Suno, Ramp has rated as one of the top-ranked, fastest-growing startups of the year. Um, I think the last public number is like zero to 20 million this year in ARR, and Suno runs on Modal. So Suno itself is not GPU rich, but they're just doing the training on, on Modal, uh, who we've also talked to on, on the podcast. The other one would be Bolt, straight Claude wrapper. And, and, um, again, another... now they've announced 20 million ARR, which is another step up from the 8 million that we put in the title. So yeah, I mean, it's crazy that all these GPU poors are finding a way, while the GPU riches are also finding a way. And then the only failures... I kind of call this the GPU smiling curve, where the edges do well, because you're either close to the machines, and you're number one on the machines, or you're close to the customers, and you're number one on the customer side. And the people who are in the middle, Inflection, um, Character, didn't do that great.
I think Character did the best of all of them. Like, you have a note in here that we apparently said that Character's price tag was[00:45:15] Alessio: 1B.[00:45:15] swyx: Did I say that?[00:45:16] Alessio: Yeah. You said Google should just buy them for 1B. I thought it was a crazy number. Then they paid 2.7 billion. I mean, for like,[00:45:22] swyx: yeah.[00:45:22] Alessio: What do you pay for Noam? Like, I don't know what the game world was like. Maybe the starting price was 1B. I mean, whatever it was, it worked out for everybody involved.[00:45:31] The Multi-Modality War[00:45:31] Alessio: Multimodality war. And this one, we never had text to video in the first version, which now is the hottest.[00:45:37] swyx: Yeah, I would say it's a subset of image, but yes.[00:45:40] Alessio: Yeah, well, but I think at the time it wasn't really something people were doing, and now we had Veo 2 just came out yesterday. Uh, Sora was released last month, last week. I've not tried Sora, because the day that I tried, it wasn't, yeah. I[00:45:54] swyx: think it's generally available now, you can go to Sora.com and try it. Yeah, they had[00:45:58] Alessio: the outage. Which I [00:46:00] think also played a part into it. Small things. Yeah. What's the other model that you posted today that was on Replicate? Video 01 Live?[00:46:08] swyx: Yeah. Very, very nondescript name, but it is from Minimax, which I think is a Chinese lab. The Chinese labs do surprisingly well at the video models.[00:46:20] swyx: I'm not sure it's actually Chinese. I don't know. Hold me up to that. Yep. China. It's good. Yeah, the Chinese love video. What can I say? They have a lot of training data for video. Or a more relaxed regulatory environment.[00:46:37] Alessio: Uh, well, sure, in some way. Yeah, I don't think there's much else there. I think like, you know, on the image side, I think it's still open.[00:46:45] Alessio: Yeah, I mean,[00:46:46] swyx: ElevenLabs is now a unicorn. So basically, what is multi modality war? Multi modality war is, do you specialize in a single modality, right? Or do you have a God model that does all the modalities? So this is [00:47:00] definitely still going, in the sense of ElevenLabs, you know, now a unicorn, Pika Labs doing well, they launched Pika[00:47:06] swyx: 2.0 recently, HeyGen, I think has reached 100 million ARR, Assembly, I don't know, but they have billboards all over the place, so I assume they're doing very, very well. So these are all specialist models, specialist models and specialist startups. And then there's the big labs who are doing the sort of all in one play.[00:47:24] swyx: And then here I would highlight Gemini 2 for having native image output. Have you seen the demos? Um, yeah, it's, it's hard to keep up. Literally they launched this last week and a shout out to Paige Bailey, who came to the Latent Space event to demo on the day of launch. And she wasn't prepared. She was just like, I'm just going to show you.[00:47:43] swyx: So they have voice. They have, you know, obviously image input, and then they obviously can code gen and all that. But the new one that OpenAI and Meta both have but they haven't launched yet is image output. So you can literally, um, I think their demo video was that you put in an image of a [00:48:00] car, and you ask for minor modifications to that car.[00:48:02] swyx: They can generate you that modification exactly as you asked.
So there's no need for the Stable Diffusion or ComfyUI workflow of like mask here and then like infill there, inpaint there and all that, all that stuff. This is small model nonsense. Big model people are like, huh, we got you in as everything in the transformer.[00:48:21] swyx: This is the multimodality war, which is, do you, do you bet on the God model or do you string together a whole bunch of, uh, small models like a, like a chump. Yeah,[00:48:29] Alessio: I don't know, man. Yeah, that would be interesting. I mean, obviously I use Midjourney for all of our thumbnails. Um, they've been doing a ton on the product, I would say.[00:48:38] Alessio: They launched a new Midjourney editor thing. They've been doing a ton. Because I think, yeah, the motto is kind of like, maybe, you know, people say Black Forest, the Black Forest models are better than Midjourney on a pixel by pixel basis. But I think when you put it, put it together, have you tried[00:48:53] swyx: the same problems on Black Forest?[00:48:55] Alessio: Yes. But the problem is just like, you know, on Black Forest, it generates one image. And then it's like, you got to [00:49:00] regenerate. You don't have all these like UI things. Like what I do, no, but it's like time issue, you know, it's like a Mid[00:49:06] swyx: journey. Call the API four times.[00:49:08] Alessio: No, but then there's no like variations.[00:49:10] Alessio: Like the good thing about Midjourney is like, you just go in there and you're cooking. There's a lot of stuff that just makes it really easy. And I think people underestimate that. Like, it's not really a skill issue, because I'm paying Midjourney, so it's a Black Forest skill issue, because I'm not paying them, you know?[00:49:24] Alessio: Yeah,[00:49:25] swyx: so, okay, so, uh, this is a UX thing, right? Like, you, you, you understand that, at least, we think that Black Forest should be able to do all that stuff. I will also shout out, Recraft has come out, uh, on top of the image arena that, uh, Artificial Analysis has done, has apparently, uh, taken Flux's place. Is this still true?[00:49:41] swyx: So, Artificial Analysis is now a company. I highlighted them I think in one of the early AI Newses of the year. And they have launched a whole bunch of arenas. So, they're trying to take on LM Arena, Anastasios and crew. And they have an image arena. Oh yeah, Recraft v3 is now beating Flux 1.1. Which is very surprising [00:50:00] because Flux and Black Forest Labs are the old Stable Diffusion crew who left Stability after, um, the management issues.[00:50:06] swyx: So Recraft has come from nowhere to be the top image model. Uh, very, very strange. I would also highlight that Grok has now launched Aurora, which is, it's very interesting dynamics between Grok and Black Forest Labs because Grok's images were originally launched, uh, in partnership with Black Forest Labs as a, as a thin wrapper.[00:50:24] swyx: And then Grok was like, no, we'll make our own. And so they've made their own. I don't know, there are no APIs or benchmarks about it. They just announced it. So yeah, that's the multi modality war. I would say that so far, the small model, the dedicated model people are winning, because they are just focused on their tasks.[00:50:42] swyx: But the big model, people are always catching up. And the moment I saw the Gemini 2 demo of image editing, where I can put in an image and just request it and it does, that's how AI should work. Not like a whole bunch of complicated steps. So it really is something.
And I think one frontier that we haven't [00:51:00] seen this year, like obviously video has done very well, and it will continue to grow.[00:51:03] swyx: You know, we only have Sora Turbo today, but at some point we'll get full Sora. Oh, at least the Hollywood labs will get full Sora. We haven't seen video to audio, or video synced to audio. And so the researchers that I talked to are already starting to talk about that as the next frontier. But there's still maybe like five more years of video left to actually be solved.[00:51:23] swyx: I would say that Gemini's approach compared to OpenAI, Gemini seems, or DeepMind's approach to video seems a lot more fully fledged than OpenAI. Because if you look at the ICML recap that I published, that so far nobody has listened to, um, that people have listened to it. It's just a different, definitely different audience.[00:51:43] swyx: It's only seven hours long. Why are people not listening? It's like everything in it. Uh, so, so DeepMind has, is working on Genie. They also launched Genie 2 and VideoPoet. So, like, they have maybe four years advantage on world modeling that OpenAI does not have. Because OpenAI basically only started [00:52:00] Diffusion Transformers last year, you know, when they hired, uh, Bill Peebles.[00:52:03] swyx: So, DeepMind has, has a bit of advantage here, I would say, in, in, in showing, like, the reason that Veo 2, while, one, they cherry pick their videos, so obviously it looks better than Sora, but the reason I would believe that Veo 2, uh, when it's fully launched will do very well is because they have all this background work in video that they've done for years.[00:52:22] swyx: Like, like last year's NeurIPS, I already was interviewing some of their video people. I forget their model name, but for, for people who are dedicated fans, they can go to NeurIPS 2023 and see, see that paper.[00:52:32] Alessio: And then last but not least, the LLMOS. We renamed it to RAGOps, formerly known as[00:52:39] swyx: RAGOps War. I put the latest chart on the Braintrust episode.[00:52:43] swyx: I think I'm going to separate these essays from the episode notes. So the reason I used to do that, by the way, is because I wanted to show up on Hacker News. I wanted the podcast to show up on Hacker News. So I always put an essay inside of there because Hacker News people like to read and not listen.[00:52:58] Alessio: So episode essays,[00:52:59] swyx: I might be [00:53:00] publishing them separately. You say LangChain, LlamaIndex is still growing.[00:53:03] Alessio: Yeah, so I looked at the PyPI stats, you know. I don't care about stars. On PyPI you see Do you want to share your screen? Yes. I prefer to look at actual downloads, not at stars on GitHub. So if you look at, you know, LangChain still growing.[00:53:20] Alessio: These are the last six months. LlamaIndex still growing. What I've basically seen is like things that, one, obviously these things have a commercial product. So there's like people buying this and sticking with it versus kind of hopping in between things versus, you know, for example, CrewAI, not really growing as much.[00:53:38] Alessio: The stars are growing. If you look on GitHub, like the stars are growing, but kind of like the usage is kind of like flat. In the last six months, have they done some[00:53:4
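For anyone who wants to run the same check, here is a minimal sketch of pulling those download numbers with the third-party pypistats package (pip install pypistats); the PyPI package names are the obvious guesses and may need adjusting:

```python
# Compare framework adoption by PyPI downloads instead of GitHub stars.
# Assumes the third-party `pypistats` package and these PyPI package names.
import pypistats

for pkg in ["langchain", "llama-index", "crewai"]:
    # "month" = downloads over roughly the last 30 days, returned as JSON text
    print(pkg, pypistats.recent(pkg, "month", format="json"))
```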

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all, all our LS supporters who helped fund the gorgeous venue and A/V production! For NeurIPS last year we did our standard conference podcast coverage interviewing selected papers (that we have now also done for ICLR and ICML), however we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in person miniconference, at NeurIPS 2024 in Vancouver. Our next keynote covers The State of LLM Agents, with the triumphant return of Professor Graham Neubig to the pod (his ICLR episode here!). OpenDevin is now a startup known as AllHands! The renamed OpenHands has done extremely well this year, as they end the year sitting comfortably at number 1 on the hardest SWE-Bench Full leaderboard at 29%, though on the smaller SWE-Bench Verified, they are at 53%, behind Amazon Q, devlo, and OpenAI's self-reported o3 results at 71.7%. Many are saying that 2025 is going to be the year of agents, with OpenAI, DeepMind and Anthropic setting their sights on consumer and coding agents, vision based computer-using agents and multi agent systems. There has been so much progress on the practical reliability and applications of agents in all domains, from the huge launch of Cognition AI's Devin this year, to the sleeper hit of Cursor Composer and Codeium's Windsurf Cascade in the IDE arena, to the explosive revenue growth of Stackblitz's Bolt, Lovable, and Vercel's v0, and the unicorn rounds and high profile movements of customer support agents like Sierra (now worth $4 billion) and search agents like Perplexity (now worth $9 billion). We wanted to take a little step back to understand the most notable papers of the year in Agents, and Graham indulged with his list of 8 perennial problems in building agents in 2024. Must-Read Papers for the 8 Problems of Agents* The agent-computer interface: CodeAct: Executable Code Actions Elicit Better LLM Agents.
Minimal viable tools: Execution Sandbox, File Editor, Web Browsing* The human-agent interface: Chat UI, GitHub Plugin, Remote runtime, …?* Choosing an LLM: See Evaluation of LLMs as Coding Agents on SWE-Bench at 30x - must understand instructions, tools, code, environment, error recovery* Planning: Single Agent Systems vs Multi Agent (CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration) - Explicit vs Implicit, Curated vs Generated* Reusable common workflows: SteP: Stacked LLM Policies for Web Actions and Agent Workflow Memory - Manual prompting vs Learning from Experience* Exploration: Agentless: Demystifying LLM-based Software Engineering Agents and BAGEL: Bootstrapping Agents by Guiding Exploration with Language* Search: Tree Search for Language Model Agents - explore paths and rewind* Evaluation: Fast Sanity Checks (miniWoB and Aider) and Highly Realistic (WebArena, SWE-Bench) and SWE-Gym: An Open Environment for Training Software Engineering Agents & Verifiers. Full Talk on YouTube. Please like and subscribe! Timestamps* 00:00 Welcome to Latent Space Live at NeurIPS 2024* 00:29 State of LLM Agents in 2024* 02:20 Professor Graham Neubig's Insights on Agents* 03:57 Live Demo: Coding Agents in Action* 08:20 Designing Effective Agents* 14:13 Choosing the Right Language Model for Agents* 16:24 Planning and Workflow for Agents* 22:21 Evaluation and Future Predictions for Agents* 25:31 Future of Agent Development* 25:56 Human-Agent Interaction Challenges* 26:48 Expanding Agent Use Beyond Programming* 27:25 Redesigning Systems for Agent Efficiency* 28:03 Accelerating Progress with Agent Technology* 28:28 Call to Action for Open Source Contributions* 30:36 Q&A: Agent Performance and Benchmarks* 33:23 Q&A: Web Agents and Interaction Methods* 37:16 Q&A: Agent Architectures and Improvements* 43:09 Q&A: Self-Improving Agents and Authentication* 47:31 Live Demonstration and Closing Remarks. Transcript[00:00:29] State of LLM Agents in 2024[00:00:29] Speaker 9: Our next keynote covers the state of LLM agents. With the triumphant return of Professor Graham Neubig of CMU and OpenDevin, now a startup known as AllHands. The renamed OpenHands has done extremely well this year, as they end the year sitting comfortably at number one on the hardest SWE-Bench Full leaderboard at 29%.[00:00:53] Speaker 9: Though, on the smaller SWE-Bench Verified, they are at 53 percent behind Amazon Q, [00:01:00] Devlo, and OpenAI's self-reported o3 results at 71.7%. Many are saying that 2025 is going to be the year of agents, with OpenAI, DeepMind, and Anthropic setting their sights on consumer and coding agents, vision based computer using agents and multi agent systems.[00:01:22] Speaker 9: There has been so much progress on the practical reliability and applications of agents in all domains, from the huge launch of Cognition AI's Devin this year, to the sleeper hit of Cursor Composer and recent guest Codeium's Windsurf Cascade in the IDE arena. To the explosive revenue growth of recent guests StackBlitz's Bolt, Lovable, and Vercel's v0.[00:01:44] Speaker 9: And the unicorn rounds and high profile movements of customer support agents like Sierra, now worth 4 billion, and search agents like Perplexity, now worth 9 billion.
We wanted to take a little step back to understand the most notable papers of the year in [00:02:00] agents, and Graham indulged with his list of eight perennial problems in building agents.[00:02:06] Speaker 9: As always, don't forget to check our show notes for all the selected best papers of 2024, and for the YouTube link to their talk. Graham's slides were especially popular online, and we are honoured to have him. Watch out and take care![00:02:20] Professor Graham Neubig's Insights on Agents[00:02:20] Speaker: Okay hi everyone. So I was given the task of talking about agents in 2024, and this is an impossible task because there are so many agents, so many agents in 2024. So this is going to be strongly colored by like my personal experience and what I think is interesting and important, but I think it's an important topic.[00:02:41] Speaker: So let's go ahead. So the first thing I'd like to think about is let's say I gave you, you know, a highly competent human, some tools. Let's say I gave you a web browser and a terminal or a file system. And the ability to [00:03:00] edit text or code. What could you do with that? Everything. Yeah.[00:03:07] Speaker: Probably a lot of things. This is like 99 percent of my, you know, daily, daily life, I guess. When I'm, when I'm working. So, I think this is a pretty powerful tool set, and I am trying to do, and what I think some other people are trying to do, is come up with agents that are able to, you know, manipulate these things.[00:03:26] Speaker: Web browsing, coding, running code in successful ways. So there was a little bit about my profile. I'm a professor at CMU, chief scientist at All Hands AI, building open source coding agents. I'm maintainer of OpenHands, which is an open source coding agent framework. And I'm also a software developer and I, I like doing lots of coding and, and, you know, shipping new features and stuff like this.[00:03:51] Speaker: So building agents that help me to do this, you know, is kind of an interesting thing, very close to me.[00:03:57] Live Demo: Coding Agents in Action[00:03:57] Speaker: So the first thing I'd like to do is I'd like to try [00:04:00] some things that I haven't actually tried before. If anybody has, you know, tried to give a live demo, you know, this is, you know, very, very scary whenever you do it and it might not work.[00:04:09] Speaker: So it might not work this time either. But I want to show you like three things that I typically do with coding agents in my everyday work. I use coding agents maybe five to 10 times a day to help me solve my own problems. And so this is the first one. This is a data science task, which says I want to create scatter plots that show the increase of the SWE-Bench score over time.[00:04:34] Speaker: And so I, I wrote a kind of concrete prompt about this. Agents work better with like somewhat concrete prompts. And I'm gonna throw this into OpenHands and let it work. And I'll, I'll go back to that in a second. Another thing that I do is I create new software. And I, I've been using a [00:05:00] service, a particular service.[00:05:01] Speaker: I won't name it, for sending emails and I'm not very happy with it. So I want to switch over to this new service called resend.com, which makes it easier to send emails. And so I'm going to ask it to read the docs for the resend.com API and come up with a script that allows me to send emails.
The input to the script should be a CSV file and the subject and body should be provided in Jinja2 templates.[00:05:24] Speaker: So I'll start another agent and try to get it to do that for me.[00:05:35] Speaker: And let's go with the last one. The last one I do is, this is improving existing software and, you know, once you write software, you usually don't throw it away. You go in and, like, actually improve it iteratively. This software that I have is something I created without writing any code.[00:05:52] Speaker: It's basically software to monitor how much our, our agents are contributing to the OpenHands repository. [00:06:00] And on the, let me make that a little bit bigger, on the left side, I have the number of issues where it like sent a pull request, whether it was merged in purple, closed in red, or is still open in green. And so these are like, you know, it's helping us monitor, but one thing it doesn't tell me is the total number. And I kind of want that feature added to this software.[00:06:33] Speaker: So I'm going to try to add that too. So. I'll take this, I'll take this prompt,[00:06:46] Speaker: and here I want to open up specifically that GitHub repo. So I'll open up that repo and paste in the prompt asking it. I asked it to make a pie chart for each of these and give me the total over the entire time period that I'm [00:07:00] monitoring. So we'll do that. And so now I have, let's see, I have some agents.[00:07:05] Speaker: Oh, this one already finished. Let's see. So this one already finished. You can see it finished analyzing the SWE-Bench repository. It wrote a demonstration of, yeah, I'm trying to do that now, actually.[00:07:30] Speaker: It wrote a demonstration of how much each of the systems have improved over time. And I asked it to label the top three for each of the data sets. And so it labeled OpenHands as being the best one for SWE-Bench Normal. For SWE-Bench Verified, it has like the Amazon Q agent and OpenHands. For the SWE-Bench Lite, it has three here over three over here.[00:07:53] Speaker: So you can see, like, that's pretty useful, right? If you're a researcher, you do data analysis all the time. I did it while I was talking to all [00:08:00] of you and making a presentation. So that's, that's pretty nice. I, I doubt the other two are finished yet. That would be impressive if the, yeah. So I think they're still working.[00:08:09] Speaker: So maybe we'll get back to them at the end of the presentation. But so these are the kinds of, these are the kinds of things that I do every day with coding agents now. And it's, or software development agents. It's pretty impressive.[00:08:20] Designing Effective Agents[00:08:20] Speaker: The next thing I'd like to talk about a little bit is things I worry about when designing agents.[00:08:24] Speaker: So we're designing agents to, you know, do a very difficult task of like navigating websites, writing code, other things like this. And within 2024, there's been like a huge improvement in the methodology that we use to do this. But there's a bunch of things we think about. There's a bunch of interesting papers, and I'd like to introduce a few of them.[00:08:46] Speaker: So the first thing I worry about is the agent computer interface. Like, how do we get an agent to interact with computers? And, how do we provide agents with the tools to do the job?
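As a rough illustration, the email script requested in the second demo might come out something like this minimal sketch, assuming Resend's Python SDK (pip install resend jinja2) and made-up file names and CSV columns:

```python
# Minimal sketch: read recipients from a CSV, render subject/body with
# Jinja2 templates, and send each email through Resend's Python SDK.
# The file names, CSV columns, and sender address are assumptions.
import csv

import resend
from jinja2 import Template

resend.api_key = "re_..."  # your Resend API key

subject_tmpl = Template(open("subject.j2").read())
body_tmpl = Template(open("body.j2").read())

with open("recipients.csv", newline="") as f:
    for row in csv.DictReader(f):  # e.g. columns: email, name
        resend.Emails.send({
            "from": "Me <me@example.com>",
            "to": [row["email"]],
            "subject": subject_tmpl.render(**row),
            "html": body_tmpl.render(**row),
        })
```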
And [00:09:00] within OpenHands we are doing the thing on the right, but there's also a lot of agents that do the thing on the left.[00:09:05] Speaker: So the thing on the left is you give like agents kind of granular tools. You give them tools like, or let's say your instruction is I want to determine the most cost effective country to purchase the smartphone model Kodak One; the countries to consider are the USA, Japan, Germany, and India. And you have a bunch of available APIs.[00:09:26] Speaker: And so what you do for some agents is you provide them all of these tools, APIs as tools that they can call. And so in this particular case, in order to solve this problem, you'd have to make about like 30 tool calls, right? You'd have to call lookup rates for Germany, you'd have to look it up for the US, Japan, and India.[00:09:44] Speaker: That's four tool calls. And then you go through and do all of these things separately. And the method that we adopt in OpenHands instead is we provide these tools, but we provide them by just giving a coding agent the ability to call [00:10:00] arbitrary Python code. And in the arbitrary Python code, it can call these tools.[00:10:05] Speaker: We expose these tools as APIs that the model can call. And what that allows us to do is instead of writing 20 tool calls, making 20 LLM calls, you write a program that runs all of these all at once, and it gets the result. And of course it can execute that program. It can, you know, make a mistake. It can get errors back and fix things.[00:10:23] Speaker: But that makes our job a lot easier. And this has been really like instrumental to our success, I think. Another part of this is what tools does the agent need? And I, I think this depends on your use case, we're kind of extreme and we're only giving the agent five tools or maybe six tools.[00:10:40] Speaker: And what, what are they? The first one is program execution. So it can execute bash programs, and it can execute Jupyter notebooks. It can execute cells in Jupyter notebooks. So that, those are two tools. Another one is a file editing tool. And the file editing tool allows you to browse parts of files.[00:11:00][00:11:00] Speaker: And kind of read them, overwrite them, other stuff like this. And then we have another global search and replace tool. So it's actually two tools for file editing. And then a final one is web browsing, web browsing. I'm kind of cheating when I call it only one tool. You actually have like scroll and text input and click and other stuff like that.[00:11:18] Speaker: But these are basically the only things we allow the agent to do. What, then the question is, like, what if we wanted to allow it to do something else? And the answer is, well, you know, human programmers already have a bunch of things that they use. They have the requests PyPI library, they have the PDF-to-text PyPI library, they have, like, all these other libraries in the Python ecosystem that they could use.[00:11:41] Speaker: And so if we provide a coding agent with all these libraries, it can do things like data visualization and other stuff that I just showed you. So it can also git clone repositories and, and other things like this. The agents are super good at using the GitHub API also. So they can do, you know, things on GitHub, like finding all of the, you know, [00:12:00] comments on your issues or checking GitHub actions and stuff.[00:12:02] Speaker: The second thing I think about is the human agent interface.
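To make the tool-calling contrast above concrete, here is a toy sketch of the CodeAct-style approach: the model emits one Python program that calls the exposed tool APIs, instead of making dozens of individual tool calls. The tool functions are hypothetical and stubbed with fake data so the example runs:

```python
# Hypothetical tool APIs that the framework would expose inside the agent's
# Python sandbox; stubbed with fake data here so the sketch is runnable.
def lookup_rates(country: str) -> tuple[float, float]:
    # (local currency units per USD, sales tax rate)
    return {"USA": (1.0, 0.08), "Japan": (150.0, 0.10),
            "Germany": (0.92, 0.19), "India": (83.0, 0.18)}[country]

def lookup_phone_price(model: str, country: str) -> float:
    # price in local currency
    return {"USA": 999.0, "Japan": 145_000.0,
            "Germany": 950.0, "India": 85_000.0}[country]

# The agent writes ONE program like this instead of ~30 separate tool calls:
best = None
for country in ["USA", "Japan", "Germany", "India"]:
    fx, tax = lookup_rates(country)
    usd_total = lookup_phone_price("Kodak One", country) * (1 + tax) / fx
    if best is None or usd_total < best[1]:
        best = (country, usd_total)
print(f"Most cost-effective: {best[0]} (~${best[1]:,.0f})")
```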
So this is like, how do we get humans to interact with agents? I already showed you one variety of our human agent interface. It's basically a chat window where you can browse through the agent's results and things like this. This is very, very difficult.[00:12:18] Speaker: I, I don't think anybody has a good answer to this, and I don't think we have a good answer to this, but the, the guiding principles that I'm trying to follow are we want to present enough info to the user. So we want to present them with, you know, what the agent is doing in the form of a kind of[00:12:36] Speaker: English descriptions. So you can see here, every time it takes an action, it says like, I will help you create a script for sending emails. When it runs a bash command. Sorry, that's a little small. When it runs a bash command, it will say ran a bash command. It won't actually show you the whole bash command or the whole Jupyter notebook because it can be really large, but you can open it up and see if you [00:13:00] want to, by clicking on this.[00:13:01] Speaker: So like if you want to explore more, you can click over to the Jupyter notebook and see what's displayed in the Jupyter notebook. And you get like lots and lots of information. So that's one thing.[00:13:16] Speaker: Another thing is go where the user is. So like if the user's already interacting in a particular setting then I'd like to, you know, integrate into that setting, but only to a point. So at OpenHands, we have a chat UI for interaction. We have a GitHub plugin for tagging and resolving issues. So basically what you do is you do @OpenHands agent, and the OpenHands agent will like see that comment and be able to go in and fix things.[00:13:42] Speaker: So if you say, @OpenHands agent, tests are failing on this PR, please fix the tests, it will go in and fix the tests for you and stuff like this. Another thing we have is a remote runtime for launching headless jobs. So if you want to launch like a fleet of agents to solve, you know, five different problems at once, you can also do [00:14:00] that through an API.[00:14:00] Speaker: So we have, we have these interfaces and this probably depends on the use case. So like, depending, if you're a coding agent, you want to do things one way. If you're a like insurance auditing agent, you'll want to do things other ways, obviously.[00:14:13] Choosing the Right Language Model for Agents[00:14:13] Speaker: Another thing I think about a lot is choosing a language model.[00:14:16] Speaker: And for agentic LLMs we have to have a bunch of things work really well. The first thing is really, really good instruction following ability. And if you have really good instruction following ability, it opens up like a ton of possible applications for you. Tool use and coding ability. So if you provide tools, it needs to be able to use them well.[00:14:38] Speaker: Environment understanding. So it needs, like, if you're building a web agent, it needs to be able to understand web pages either through vision or through text. And error awareness and recovery ability. So, if it makes a mistake, it needs to be able to, you know, figure out why it made a mistake, come up with alternative strategies, and other things like this.[00:14:58] Speaker: [00:15:00] Under the hood, in all of the demos that I did now, we're using Claude. Claude has all of these abilities, very good, not perfect, but very good. Most others don't have these abilities quite as much. So like GPT-4o
doesn't have very good error recovery ability. And so because of this, it will go into loops and do the same thing over and over and over again.[00:15:22] Speaker: Whereas Claude does not do this. Claude, if you, if you use the agents enough, you get used to their kind of like personality. And Claude says, Hmm, let me try a different approach a lot. So, you know, obviously it's been trained in some way to, you know, elicit this ability. We did an evaluation. This is old.[00:15:40] Speaker: And we need to update this basically, but we evaluated Claude, Gemini, Llama 405B, DeepSeek 2.5 on being a good code agent within our framework. And Claude was kind of head and shoulders above the rest. GPT-4o was kind of okay. The best open source model was Llama [00:16:00] 3.1 405B. This needs to be updated because this is like a few months old by now and, you know, things are moving really, really fast.[00:16:05] Speaker: But I still am under the impression that Claude is the best. The other closed models are, you know, not quite as good. And then the open models are a little bit behind that. Grok, I, we haven't tried Grok at all, actually. So, it's a good question. If you want to try it I'd be happy to help.[00:16:24] Speaker: Cool.[00:16:24] Planning and Workflow for Agents[00:16:24] Speaker: Another thing is planning. And so there's a few considerations for planning. The first one is whether you have a curated plan or you have it generated on the fly. And so for solving GitHub issues, you can kind of have an overall plan. Like the plan is first reproduce. If there's an issue, first write tests to reproduce the issue or to demonstrate the issue.[00:16:50] Speaker: After that, run the tests and make sure they fail. Then go in and fix the issue. Run the tests again to make sure they pass and then you're done. So that's like a pretty good workflow [00:17:00] for like solving coding issues. And you could curate that ahead of time. Another option is to let the language model basically generate its own plan.[00:17:10] Speaker: And both of these are perfectly valid. Another one is explicit structure versus implicit structure. So let's say you generate a plan. If you have explicit structure, you could like write a multi agent system, and the multi agent system would have your reproducer agent, and then it would have your, your bug, your test writer agent, and your bug fixer agent, and lots of different agents, and you would explicitly write this all out in code, and then, then use it that way.[00:17:38] Speaker: On the other hand, you could just provide a prompt that says, please do all of these things in order. So in OpenHands, we do very light planning. We have a single prompt. We don't have any multi agent systems. But we do provide, like, instructions about, like, what to do first, what to do next, and other things like this.[00:17:56] Speaker: I'm not against doing it the other way. But I laid [00:18:00] out some kind of justification for this in this blog called Don't Sleep on Single Agent Systems. And the basic idea behind this is if you have a really, really good instruction following agent, it will follow the instructions as long as things are working according to your plan.[00:18:14] Speaker: But let's say you need to deviate from your plan, you still have the flexibility to do this. And if you do explicit structure through a multi agent system, it becomes a lot harder to do that. Like, you get stuck when things deviate from your plan. There's also some other examples, and I wanted to introduce a few papers.
There's also some other examples, and I wanted to introduce a few papers.[00:18:30] Speaker: So one paper I liked recently is this paper called CoAct where you generate plans and then go in and fix them. And so the basic idea is like, if you need to deviate from your plan, you can You know, figure out that your plan was not working and go back and deviate from it.[00:18:49] Speaker: Another thing I think about a lot is specifying common workflows. So we're trying to tackle a software development and I already showed like three use cases where we do [00:19:00] software development and when we. We do software development, we do a ton of different things, but we do them over and over and over again.[00:19:08] Speaker: So just to give an example we fix GitHub actions when GitHub actions are failing. And we do that over and over and over again. That's not the number one thing that software engineers do, but it's a, you know, high up on the list. So how can we get a list of all of, like, the workflows that people are working on?[00:19:26] Speaker: And there's a few research works that people have done in this direction. One example is manual prompting. So there's this nice paper called STEP that got state of the art on the WebArena Web Navigation Benchmark where they came up with a bunch of manual workflows for solving different web navigation tasks.[00:19:43] Speaker: And we also have a paper recently called Agent Workflow Memory where the basic idea behind this is we want to create self improving agents that learn from their past successes. And the way it works is is we have a memory that has an example of lots of the previous [00:20:00] workflows that people have used. And every time the agent finishes a task and it self judges that it did a good job at that task, you take that task, you break it down into individual workflows included in that, and then you put it back in the prompt for the agent to work next time.[00:20:16] Speaker: And this we demonstrated that this leads to a 22. 5 percent increase on WebArena after 40 examples. So that's a pretty, you know, huge increase by kind of self learning and self improvement.[00:20:31] Speaker: Another thing is exploration. Oops. And one thing I think about is like, how can agents learn more about their environment before acting? And I work on coding and web agents, and there's, you know, a few good examples of this in, in both areas. Within coding, I view this as like repository understanding, understanding the code base that you're dealing with.[00:20:55] Speaker: And there's an example of this, or a couple examples of this, one example being AgentList. [00:21:00] Where they basically create a map of the repo and based on the map of the repo, they feed that into the agent so the agent can then navigate the repo and and better know where things are. And for web agents there's an example of a paper called Bagel, and basically what they do is they have the agent just do random tasks on a website, explore the website, better understand the structure of the website, and then after that they they feed that in as part of the product.[00:21:27] Speaker: Part seven is search. Right now in open hands, we just let the agent go on a linear search path. So it's just solving the problem once. 
We're using a good agent that can kind of like recover from errors and try alternative things when things are not working properly, but still we only have a linear search path.[00:21:45] Speaker: But there's also some nice work in 2024 that is about exploring multiple paths. So one example of this is there's a paper called Tree Search for Language Agents. And they basically expand multiple paths, check whether the paths are going well, [00:22:00] and if they aren't going well, you rewind back. And on the web, this is kind of tricky, because, like, how do you rewind when you accidentally ordered something you don't want on Amazon?[00:22:09] Speaker: It's kind of, you know, not, not the easiest thing to do. For code, it's a little bit easier, because you can just revert any changes that you made. But I, I think that's an interesting topic, too.[00:22:21] Evaluation and Future Predictions for Agents[00:22:21] Speaker: And then finally evaluation. So within our development for evaluation, we want to do a number of things. The first one is fast sanity checks.[00:22:30] Speaker: And in order to do this, we want things we can run really fast, really, really cheaply. So for web, we have something called Mini World of Bits, which is basically these trivial kind of web navigation things. We have something called the Aider code editing benchmark, where it's just about editing individual files that we use.[00:22:48] Speaker: But we also want highly realistic evaluation. So for the web, we have something called WebArena that we created at CMU. This is web navigation on real, real open source websites. So it's open source [00:23:00] websites that are actually used to serve shops or like bulletin boards or other things like this.[00:23:07] Speaker: And for code, we use SWE-Bench, which I think a lot of people may have heard of. It's basically a coding benchmark that comes from real world pull requests on GitHub. So if you can solve those, you can also probably solve other real world pull requests. I would say we still don't have benchmarks for the full versatility of agents.[00:23:25] Speaker: So, for example, we don't have benchmarks that test whether agents can code and do web navigation. But we're working on that and hoping to release something in the next week or two. So if that sounds interesting to you, come talk to me and I, I will tell you more about it.[00:23:42] Speaker: Cool. So I don't like making predictions, but I was told that I should be somewhat controversial, I guess, so I will, I will try to do it, try to do it anyway, although maybe none of these will be very controversial. Um, the first thing is agent oriented LLMs, like large language models for [00:24:00] agents.[00:24:00] Speaker: My, my prediction is every large LLM trainer will be focusing on training models as agents. So every large language model will be a better agent model by mid 2025. Competition will increase, prices will go down, smaller models will become competitive as agents. So right now, actually agents are somewhat expensive to run in some cases, but I expect that that won't last six months.[00:24:23] Speaker: I, I bet we'll have much better agent models in six months. Another thing is instruction following ability, specifically in agentic contexts, will increase. And what that means is we'll have to do less manual engineering of agentic workflows and be able to do more by just prompting agents in more complex ways.[00:24:44] Speaker: Claude is already really good at this.
It's not perfect, but it's already really, really good. And I expect the other models will catch up to Claude pretty soon. Error correction ability will increase, less getting stuck in loops. Again, this is something that Claude's already pretty good at and I expect the others will, will follow.[00:25:00][00:25:01] Speaker: Agent benchmarks. Agent benchmarks will start saturating.[00:25:05] Speaker: And SWE-Bench, I think WebArena is already too easy. It, it is, it's not super easy, but it's already a bit too easy because the tasks we do in there are ones that take like two minutes for a human. So not, not too hard. And kind of historically in 2023 our benchmarks were too easy. So we built harder benchmarks. Like, WebArena and SWE-Bench were both built in 2023.[00:25:31] Future of Agent Development[00:25:31] Speaker: In 2024, our agents were too bad, so we built agents and now we're building better agents. In 2025, our benchmarks will be too easy, so we'll build better benchmarks, I'm, I'm guessing. So, I would expect to see much more challenging agent benchmarks come out, and we're already seeing some of them.[00:25:49] Speaker: In 2026, I don't know. I didn't write AGI, but we'll, we'll, we'll see.[00:25:56] Human-Agent Interaction Challenges[00:25:56] Speaker: Then the human agent computer interface. I think one thing that [00:26:00] we'll want to think about is what do we do at 75 percent success rate at things that we like actually care about? Right now we have 53 percent or 55 percent on SWE-Bench Verified, which is real world GitHub PRs.[00:26:16] Speaker: My impression is that the actual, actual ability of models is maybe closer to 30 to 40%. So 30 to 40 percent of the things that I want an agent to solve on my own repos, it just solves without any human intervention. 80 to 90 percent it can solve without me opening an IDE. But I need to give it feedback.[00:26:36] Speaker: So how do we, how do we make that interaction smooth so that humans can audit the work of agents that are really, really good, but not perfect? That is going to be a big challenge.[00:26:48] Expanding Agent Use Beyond Programming[00:26:48] Speaker: How can we expose the power of programming agents to other industries? So like as programmers, I think not all of us are using agents every day in our programming, although we probably will be [00:27:00] in, in months or maybe a year.[00:27:02] Speaker: But I, I think it will come very naturally to us as programmers because we know code. We know, you know, like how to architect software and stuff like that. So I think the question is how do we put this in the hands of like a lawyer or a chemist or somebody else and have them also be able to, you know, interact with it as naturally as we can.[00:27:25] Redesigning Systems for Agent Efficiency[00:27:25] Speaker: Another interesting thing is how can we redesign our existing systems for agents? So we had a paper on API based web agents, and basically what we showed is if you take a web agent and the agent interacts not with a website, but with APIs, the accuracy goes way up just because APIs are way easier to interact with.[00:27:42] Speaker: And in fact, like when I ask the, well, our agent, our agent is able to browse websites, but whenever I want it to interact with GitHub, I tell it do not browse the GitHub website. Use the GitHub API because it's way more successful at doing that.
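As a tiny sketch of that API-over-browsing point: reading issue comments through GitHub's documented REST endpoint takes one call, no browser in sight (the target repo and issue number here are just examples):

```python
# One documented REST call replaces a whole browsing session: list the
# comments on a GitHub issue via the public API.
import requests

owner, repo, issue = "All-Hands-AI", "OpenHands", 1  # example target
resp = requests.get(
    f"https://api.github.com/repos/{owner}/{repo}/issues/{issue}/comments",
    headers={"Accept": "application/vnd.github+json"},
    timeout=10,
)
resp.raise_for_status()
for comment in resp.json():
    print(comment["user"]["login"], "->", comment["body"][:80])
```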
So maybe, you know, every website is going to need to have [00:28:00] an API because we're going to be having agents interact with them.[00:28:03] Accelerating Progress with Agent Technology[00:28:03] Speaker: About progress, I think progress will get faster. It's already fast. A lot of people are already overwhelmed, but I think it will continue. The reason why is agents are building agents. And better agents will build better agents faster. So I expect that, you know, if you haven't interacted with a coding agent yet, it's pretty magical, like the stuff that it can do.[00:28:24] Speaker: So yeah.[00:28:28] Call to Action for Open Source Contributions[00:28:28] Speaker: And I have a call to action. I'm honestly, like I've been working on, you know, natural language processing and, and language models for what, 15 years now. And even for me, it's pretty impressive what like AI agents powered by strong language models can do. On the other hand, I believe that we should really make these powerful tools accessible.[00:28:49] Speaker: And what I mean by this is I don't think like, you know, we, we should have these be opaque or limited to only a set, a certain set of people. I feel like they should be [00:29:00] affordable. They shouldn't be increasing the, you know, difference in the amount of power that people have. If anything, I'd really like them to kind of make it possible for people who weren't able to do things before to be able to do them well.[00:29:13] Speaker: Open source is one way to do that. That's why I'm working on open source. There are other ways to do that. You know, make things cheap, make things, you know, so you can serve them to people who aren't able to afford them easily. Like Duolingo is one example where they get all the people in the US to pay them $20 a month so that they can give all the people in South America free, you know, language education, so they can learn English and become, you know, more attractive on the job market, for instance.[00:29:41] Speaker: And so I think we can all think of ways that we can do that sort of thing. And if that resonates with you, please contribute. Of course, I'd be happy if you contribute to OpenHands and use it. But another way you can do that is just use open source solutions, contribute to them, research with them, and train strong open source [00:30:00] models.[00:30:00] Speaker: So I see, you know, some people in the room who are already training models. It'd be great if you could train models for coding agents and make them cheap. And yeah, yeah, please. I, I was thinking about you among others. So yeah, that's all I have. Thanks.[00:30:20] Speaker 2: Slight, slightly controversial take is probably the nicest way to say hot takes. Any hot takes questions, actual hot takes?[00:30:31] Speaker: Oh, I can also show the other agents that were working, if anybody's interested, but yeah, sorry, go ahead.[00:30:36] Q&A: Agent Performance and Benchmarks[00:30:36] Speaker 3: Yeah, I have a couple of questions. So they're kind of paired, maybe. The first thing is that you said that you're estimating that your, your agent is successfully resolving like something like 30 to 40 percent of your issues, but that's like below what you saw in SWE-Bench.[00:30:52] Speaker 3: So I guess I'm wondering where that discrepancy is coming from.
And then I guess my other second question, which is maybe broader in scope, is that [00:31:00] like, if, if you think of an agent as like a junior developer, and I say, go do something, then I expect maybe tomorrow to get a Slack message being like, Hey, I ran into this issue.[00:31:10] Speaker 3: How can I resolve it? And, and, like you said, your agent is, like, successfully solving, like, 90 percent of issues where you give it direct feedback. So, are you thinking about how to get the agent to reach out to, like, for, for planning when it's, when it's stuck or something like that? Or, like, identify when it runs into a hole like that?[00:31:30] Speaker: Yeah, so great. These are great questions. Oh,[00:31:32] Speaker 3: sorry. The third question, which is a good, so this is the first two. And if so, are you going to add a benchmark for that second question?[00:31:40] Speaker: Okay. Great. Yeah. Great questions. Okay. So the first question was why do I think it's resolving less than 50 percent of the issues on SWE-Bench?[00:31:48] Speaker: So first, SWE-Bench is on popular open source repos, and all of these popular open source repos were included in the training data for all of the language models. And so the language [00:32:00] models already know these repos. In some cases, the language models already know the individual issues in SWE-Bench.[00:32:06] Speaker: So basically, like, some of the training data has leaked. And so it, it definitely will overestimate with respect to that. I don't think it's like, you know, horribly, horribly off, but I think, you know, it's boosting the accuracy by a little bit. So, maybe that's the biggest reason why. In terms of asking for help, and whether we're benchmarking asking for help, yes we are.[00:32:29] Speaker: So one, one thing we're working on now, which we're hoping to put out soon, is we, we basically made super vague SWE-Bench issues. Like I'm having a, I'm having a problem with the matrix multiply. Please help. Because these are like, if anybody's run a popular open source, like framework, these are what half your issues are.[00:32:49] Speaker: You're like, users show up and say like, my screen doesn't work. What, what's wrong or something. And so then you need to ask them questions and how to reproduce. So yeah, we're, we're, we're working on [00:33:00] that. I think. It, my impression is that agents are not very good at asking for help, even Claude. So like when, when they ask for help, they'll ask for help when they don't need it.[00:33:11] Speaker: And then won't ask for help when they do need it. So this is definitely like an issue, I think.[00:33:20] Speaker 4: Thanks for the great talk. I also have two questions.[00:33:23] Q&A: Web Agents and Interaction Methods[00:33:23] Speaker 4: The first one, can you talk a bit more about how the web agent interacts with the web? So is there a VLM that looks at the web page layout and then you parse the HTML and select which buttons to click on? And if so do you think there's a future where there's like, so I work at Bing Microsoft AI.[00:33:41] Speaker 4: Do you think there's a future where the same web index, but there's an agent friendly web index where all the processing is done offline so that you don't need to spend time cleaning up, like, cleaning up this HTML and figuring out what to click online. And any thoughts on, thoughts on that?[00:33:57] Speaker: Yeah, so great question. There's a lot of work on web [00:34:00] agents.
I didn't go into, like, all of the details, but I think there's, there's three main ways that agents interact with websites. The first way is the simplest way and the newest way, but it doesn't work very well, which is you take a screenshot of the website and then you click on a particular pixel value on the website.[00:34:23] Speaker: And like models are not very good at that at the moment. Like they'll misclick. There was this thing about how like Claude computer use started like looking at pictures of Yellowstone National Park or something like this. I don't know if you heard about this anecdote, but like people were like, oh, it's so human, it's looking for vacation.[00:34:40] Speaker: And it was like, no, it probably just misclicked on the wrong pixels and accidentally clicked on an ad. So like this is the simplest way. The second simplest way. You take the HTML and you basically identify elements in the HTML. You don't use any vision whatsoever. And then you say, okay, I want to click on this element.[00:34:59] Speaker: I want to enter text [00:35:00] in this element or something like that. But HTML is too huge. So it actually, it usually gets condensed down into something called an accessibility tree, which was made for screen readers for visually impaired people. And so that's another way. And then the third way is kind of a hybrid where you present the screenshot, but you also present like a textual summary of the output.[00:35:18] Speaker: And that's the one that I think will probably work best. What we're using is we're just using text at the moment. And that's just an implementation issue, that we haven't implemented the visual stuff yet, but that's kind of like we're working on it now. Another thing that I should point out is we actually have two modalities for web browsing.[00:35:35] Speaker: Very recently we implemented this. And the reason why is because if you want to interact with full websites you will need to click on all of the elements or have the ability to click on all of the elements. But most of our work that we need websites for is just web browsing and like gathering information.[00:35:50] Speaker: So we have another modality where we convert all of it to markdown because that's like way more concise and easier for the agent to deal with. And then [00:36:00] can we create an index specifically for agents, maybe a markdown index or something like that would be, you know, would make sense. Oh, how would I make a successor to SWE-Bench?[00:36:10] Speaker: So I mean, the first thing is there's like LiveCodeBench, which, LiveCodeBench is basically continuously updating to make sure it doesn't leak into language model training data. That's easy to do for SWE-Bench because it comes from real websites and those real websites are getting new issues all the time.[00:36:27] Speaker: So you could just do it on the same benchmarks that they have there. There's also like a pretty large number of things covering various coding tasks. So like, for example, SWE-Bench is mainly fixing issues, but there's also like documentation, there's generating tests that actually test the functionality that you want.[00:36:47] Speaker: And there, there was a paper by a student at CMU on generating tests and stuff like that. So I feel like
Swebench is one piece of the puzzle, but you could also have like 10 different other tasks and then you could have like a composite [00:37:00] benchmark where you test all of these abilities, not just that particular one.[00:37:04] Speaker: Well, lots, lots of other things too, but[00:37:11] Speaker 2: Question from across. Use your mic, it will help. Um,[00:37:15] Speaker 5: Great talk. Thank you.[00:37:16] Q&A: Agent Architectures and Improvements[00:37:16] Speaker 5: My question is about your experience designing agent architectures. Specifically how much do you have to separate concerns in terms of tasks specific agents versus having one agent to do three or five things with a gigantic prompt with conditional paths and so on.[00:37:35] Speaker: Yeah, so that's a great question. So we have a basic coding and browsing agent. And I won't say basic, like it's a good, you know, it's a good agent, but it does coding and browsing. And it has instructions about how to do coding and browsing. That is enough for most things. Especially given a strong language model that has a lot of background knowledge about how to solve different types of tasks and how to use different APIs and stuff like that.[00:37:58] Speaker: We do have [00:38:00] a mechanism for something called micro agents. And micro agents are basically something that gets added to the prompt when a trigger is triggered. Right now it's very, very rudimentary. It's like if you detect the word GitHub anywhere, you get instructions about how to interact with GitHub, like use the API and don't browse.[00:38:17] Speaker: Also another one that I just added is for NPM, the like JavaScript package manager. And NPM, when it runs and it hits a failure, it Like hits in interactive terminals where it says, would you like to quit? Yep. Enter yes. And if that does it, it like stalls our agent for the time out until like two minutes.[00:38:36] Speaker: So like I added a new microagent whenever it started using NPM, it would Like get instructions about how to not use interactive terminal and stuff like that. So that's our current solution. Honestly, I like it a lot. It's simple. It's easy to maintain. It works really well and stuff like that. But I think there is a world where you would want something more complex than that.[00:38:55] Speaker 5: Got it. Thank you.[00:38:59] Speaker 6: I got a [00:39:00] question about MCP. I feel like this is the Anthropic Model Context Protocol. It seems like the most successful type of this, like, standardization of interactions between computers and agents. Are you guys adopting it? Is there any other competing standard?[00:39:16] Speaker 6: Anything, anything thought about it?[00:39:17] Speaker: Yeah, I think the Anth, so the Anthropic MCP is like, a way to It, it's essentially a collection of APIs that you can use to interact with different things on the internet. I, I think it's not a bad idea, but it, it's like, there's a few things that bug me a little bit about it.[00:39:40] Speaker: It's like we already have an API for GitHub, so why do we need an MCP for GitHub? Right. You know, like GitHub has an API, the GitHub API is evolving. We can look up the GitHub API documentation. So it seems like kind of duplicated a little bit. And also they have a setting where [00:40:00] it's like you have to spin up a server to serve your GitHub stuff.[00:40:04] Speaker: And you have to spin up a server to serve your like, you know, other stuff. 
And so I think it makes, it makes sense if you really care about like separation of concerns and security and like other things like this, but right now we haven't seen, we haven't seen that. To have a lot more value than interacting directly with the tools that are already provided.[00:40:26] Speaker: And that kind of goes into my general philosophy, which is we're already developing things for programmers. You know,[00:40:36] Speaker: how is an agent different than from a programmer? And it is different, obviously, you know, like agents are different from programmers, but they're not that different at this point. So we can kind of interact with the interfaces we create for, for programmers. Yeah. I might change my mind later though.[00:40:51] Speaker: So we'll see.[00:40:54] Speaker 7: Yeah. Hi. Thanks. Very interesting talk. You were saying that the agents you have right now [00:41:00] solve like maybe 30 percent of your, your issues out of the gate. I'm curious of the things that it doesn't do. Is there like a pattern that you observe? Like, Oh, like these are the sorts of things that it just seems to really struggle with, or is it just seemingly random?[00:41:15] Speaker: It's definitely not random. It's like, if you think it's more complex than it's. Like, just intuitively, it's more likely to fail. I've gotten a bit better at prompting also, so like, just to give an example it, it will sometimes fail to fix a GitHub workflow because it will not look at the GitHub workflow and understand what the GitHub workflow is doing before it solves the problem.[00:41:43] Speaker: So I, I think actually probably the biggest thing that it fails at is, um, er, that our, our agent plus Claude fails at is insufficient information gathering before trying to solve the task. And so if you provide all, if you provide instructions that it should do information [00:42:00] gathering beforehand, it tends to do well.[00:42:01] Speaker: If you don't provide sufficient instructions, it will try to solve the task without, like, fully understanding the task first, and then fail, and then you need to go back and give feedback. You know, additional feedback. Another example, like, I, I love this example. While I was developing the the monitor website that I, I showed here, we hit a really tricky bug where it was writing out a cache file to a different directory than it was reading the cache file from.[00:42:26] Speaker: And I had no idea what to do. I had no idea what was going on. I, I thought the bug was in a different part of the code, but what I asked it to do was come up with five possible reasons why this could be failing and decreasing order of likelihood and examine all of them. And that worked and it could just go in and like do that.[00:42:44] Speaker: So like I think a certain level of like scaffolding about like how it should sufficiently Gather all the information that's necessary in order to solve a task is like, if that's missing, then that's probably the biggest failure point at the moment. [00:43:00][00:43:01] Speaker 7: Thanks.[00:43:01] Speaker 6: Yeah.[00:43:06] Speaker 6: I'm just, I'm just using this as a chance to ask you all my questions.[00:43:09] Q&A: Self-Improving Agents and Authentication[00:43:09] Speaker 6: You had a, you had a slide on here about like self improving agents or something like that with memory. It's like a really throwaway slide for like a super powerful idea. It got me thinking about how I would do it. 
I have no idea how.[00:43:21] Speaker 6: So I just wanted you to chain a thought more on this.[00:43:25] Speaker: Yeah, self, self improving. So I think the biggest reason, like the simplest possible way to create a self improving agent. The problem with that is to have a really, really strong language model that with infinite context, and it can just go back and look at like all of its past experiences and, you know, learn from them.[00:43:46] Speaker: You might also want to remove the bad stuff just so it doesn't over index on it's like failed past experiences. But the problem is a really powerful language model is large. Infinite context is expensive. We don't have a good way to [00:44:00] index into it because like rag, Okay. At least in my experience, RAG from language to code doesn't work super well.[00:44:08] Speaker: So I think in the end, it's like, that's the way I would like to solve this problem. I'd like to have an infinite context and somehow be able to index into it appropriately. And I think that would mostly solve it. Another thing you can do is fine tuning. So I think like RAG is one way to get information into your model.[00:44:23] Speaker: Fine tuning is another way to get information into your model. So. That might be another way of continuously improving. Like you identify when you did a good job and then just add all of the good examples into your model.[00:44:34] Speaker 6: Yeah. So, you know, how like Voyager tries to write code into a skill library and then you reuse as a skill library, right?[00:44:40] Speaker 6: So that it improves in the sense that it just builds up the skill library over time.[00:44:44] Speaker: Yep.[00:44:44] Speaker 6: One thing I was like thinking about and there's this idea of, from, from Devin, your, your arch nemesis of playbooks. I don't know if you've seen them.[00:44:52] Speaker: Yeah, I mean, we're calling them workflows, but they're simpler.[00:44:55] Speaker 6: Yeah, so like, basically, like, you should, like, once a workflow works, you can kind of, [00:45:00] like, persist them as a skill library. Yeah. Right? Like I, I feel like that there's a, that's like some in between, like you said, you know, it's hard to do rag between language and code, but I feel like that is ragged for, like, I've done this before, last time I did it, this, this worked.[00:45:14] Speaker 6: So I'm just going to shortcut. All the stuff that failed before.[00:45:18] Speaker: Yeah, I totally, I think it's possible. It's just, you know, not, not trivial at the same time. I'll explain the two curves. So basically, the base, the baseline is just an agent that does it from scratch every time. And this curve up here is agent workflow memory where it's like adding the successful experiences back into the prompt.[00:45:39] Speaker: Why is this improving? The reason why is because just it failed on the first few examples and for the average to catch up it, it took a little bit of time. So it's not like this is actually improving it. You could just basically view the this one is constant and then this one is like improving.[00:45:56] Speaker: Like this, basically you can see it's continuing to go [00:46:00] up.[00:46:01] Speaker 8: How do you think we're going to solve the authentication problem for agents right now?[00:46:05] Speaker: When you say authentication, you mean like credentials, like, yeah.[00:46:09] Speaker 8: Yeah. 
Cause I've seen a few like startup solutions today, but it seems like it's limited to the amount of like websites or actual like authentication methods that it's capable of performing today.[00:46:19] Speaker: Yeah. Great questions. So. My preferred solution to this at the moment is GitHub like fine grained authentication tokens and GitHub fine grained authentication tokens allow you to specify like very free. On a very granular basis on this repo, you have permission to do this, on this repo, you have permission to do this.[00:46:41] Speaker: You also can prevent people from pushing to the main branch unless they get approved. You can do all of these other things. And I think these were all developed for human developers. Or like, the branch protection rules were developed for human developers. The fine grained authentication tokens were developed for GitHub apps.[00:46:56] Speaker: I think for GitHub, maybe [00:47:00] just pushing this like a little bit more is the way to do this. For other things, they're totally not prepared to give that sort of fine grained control. Like most APIs don't have something like a fine grained authentication token. And that goes into my like comment that we're going to need to prepare the world for agents, I think.[00:47:17] Speaker: But I think like the GitHub authentication tokens are like a good template for how you could start doing that maybe, but yeah, I don't, I don't, I don't have an answer.[00:47:25] Speaker 8: I'll let you know if I find one.[00:47:26] Speaker: Okay. Yeah.[00:47:31] Live Demonstration and Closing Remarks[00:47:31] Speaker: I'm going to finish up. Let, let me just see.[00:47:37] Speaker: Okay. So this one this one did write a script. I'm not going to actually read it for you. And then the other one, let's see.[00:47:51] Speaker: Yeah. So it sent a PR, sorry. What is, what is the PR URL?[00:48:00][00:48:02] Speaker: So I don't, I don't know if this sorry, that's taking way longer than it should. Okay, cool. Yeah. So this one sent a PR. I'll, I'll tell you later if this actually like successfully Oh, no, it's deployed on Vercel, so I can actually show you, but let's, let me try this real quick. Sorry. I know I don't have time.[00:48:24] Speaker: Yeah, there you go. I have pie charts now. So it's so fun. It's so fun to play with these things. Cause you could just do that while I'm giving a, you know, talk and things like that. So, yeah, thanks. Get full access to Latent Space at www.latent.space/subscribe

Palmarès CHOQ
Palmarès Yap Sèche with Charly, Marion and Mathis, episode 4

Palmarès CHOQ

Play Episode Listen Later Nov 8, 2024


Marion, Charly and Mathis present Yap Sèche, with emerging, eccentric musical discoveries, just like them! List of songs played: She's back, Pypy; Ga, Princesses; Starburster, Fontaines DC; Sans-toi, Passion Poire; Baddy on the floor, Jamie xx; Chocolat Chaud, Brown Family; Messie, Zouz

For the Record With GG and Adam
For the Record #221: PYPY's "Sacred Times"

For the Record With GG and Adam

Play Episode Listen Later Nov 2, 2024 30:01


Montreal-based PYPY's second album, 10 years after their first, is an exhilarating blend of psych rock and new wave, with virtuosic instrumentals and otherworldly vocals. We discuss “Sacred Times” on Episode #221 of For the Record.

core.py
Episode 10: The Interactive REPL

core.py

Play Episode Listen Later May 3, 2024 82:51


Oof, no episode in April, huh? Yeah, we're getting close to Python 3.13 beta 1. PyCon US is also coming up real soon. Let's use this opportunity then to talk about a feature we're teaming up on: a better interactive interpreter! ## Outline (00:00:00)  INTRO (00:01:53)  PART 1: History of Terminals (00:03:20)  /dev/tty (00:04:51)  The first cool word (00:05:45)  Chrząszcz (00:06:20)  Control code characters in ASCII (00:11:54)  PART 2: Python REPL Today (00:12:34)  There is no REPL (00:15:28)  So what is there instead? (00:19:13)  readline (00:25:38)  Source in the REPL (00:31:13)  Implementing a REPL from scratch? Prepare to support arg: 5 (00:36:09)  PART 3: PR OF THE WEEK (00:37:09)  Introducing: Complaining Pablo (00:38:23)  Tests are always green if you skip them (00:39:57)  Getting dirty with escape sequences (00:41:28)  Typing finds bugs (00:42:29)  Shiny new features of the new REPL (00:45:55)  Contributing back to PyPy (00:48:10)  We still have two weeks, right? (00:49:59)  Is Python synthwave enough? (00:51:57)  Do we have a bug? (00:55:31)  What's lurking in pydoc? (00:59:38)  PART 4: WHAT'S HAPPENING IN CPYTHON? (01:02:39)  PEP 744: The JIT (01:06:05)  Incremental GC is now actually in (01:08:21)  Tier 2 interpreter updates (01:10:29)  Python supported on iOS with PEP 730 (01:13:11)  Better error messages for name shadowing (01:15:17)  Queue.shutdown() (01:17:14)  ctypes adopts heap types (01:18:26)  Free-threading updates (01:20:14)  Dataclass creation is faster (01:20:44)  OUTRO

Compilado do Código Fonte TV
JavaScript no MySQL; OpenAI libera guia sobre Engenharia de Prompts; caos no NPM; Intel cria plugin para IA; Copilot no iOS e Android; PyPy no GitHub [Compilado #131]

Compilado do Código Fonte TV

Play Episode Listen Later Jan 6, 2024 55:57


Rust in Production
Rust in Production Ep 2 - PubNub's Stephen Blum

Rust in Production

Play Episode Listen Later Dec 28, 2023 57:28


In this episode, we are joined by Stephen, the CTO of PubNub, a company that has developed an edge net messaging network with over a billion connected devices. Stephen explains that while message buses like Kafka or RabbitMQ are suitable for smaller scales, PubNub focuses on the challenges of connecting mobile devices and laptops at web scale. They aim to provide instant signal delivery at a massive scale, prioritizing low latency for a seamless user experience. To achieve this, PubNub has architected their system to be globally distributed, running on AWS with Kubernetes clusters spread across all of Amazon's zones. They utilize GeoDNS to ensure users connect to the closest region for the lowest latency possible.

Stephen goes on to discuss the challenges they faced in building their system, particularly in terms of memory management and cleanup. They had to deal with issues such as segmentation faults and memory leaks, which caused runtime problems, outages, and potential data loss. PubNub had to invest in additional memory to compensate for these leaks and spend time finding and fixing the problems. While C was efficient, it came with significant engineering costs. As a solution, PubNub started adopting Rust, which helped alleviate some of these challenges. When they replaced a service with Rust, they observed a 5x improvement in memory and performance.

Stephen also talks about choosing programming languages for their platform and the difficulties in finding and retaining C experts. They didn't consider Java due to its perceived academic nature, and Go didn't make the list of options at the time. However, they now have services in production written in Go, though rewriting part of their PubSub bus in Go performed poorly compared to their existing C system. Despite this, they are favoring Rust as their language of choice for new services, citing its popularity and impressive results.

The conversation delves into performance considerations with Python and the use of PyPy as a just-in-time compiler for optimization. While PyPy improved performance, it also required a lot of memory, which could be expensive. On the other hand, Rust provided a significant boost in both memory and performance, making it a favorable choice for PubNub. They also discuss provisioning, taking into account budget and aiming to be as close to what they need as possible. Kubernetes and autoscaling with HPAs (Horizontal Pod Autoscaling) are used to dynamically adjust resources based on usage.

Integrating new services into PubNub's infrastructure involves both API-based communication and event-driven approaches. They use frameworks like Axiom for API-based communication and leverage Kafka with Protobuf for event sourcing. JSON is also utilized in some cases. Stephen explains that they chose Protobuf for high-traffic topics and where stability is crucial. While the primary API for customers is JSON-based, PubNub recognizes the superior performance of Protobuf and utilizes it for certain cases, especially for shrinking down large character strings like booleans. They also discuss the advantages of compression enabled with Protobuf.

The team reflects on the philosophy behind exploring Rust's potential for profit and its use in infrastructure and devices like IoT. Rust's optimization for smaller binaries is highlighted, and PubNub sees it as their top choice for reliability and performance. They mention developing a Rust SDK for customers using IoT devices.
The open-source nature of Rust and its ability to integrate into projects and develop open standards are also praised. While acknowledging downsides like potential instabilities and longer compilation times, they remain impressed with Rust's capabilities.

The conversation covers stability and safety in Rust, with Stephen expressing confidence in the compiler's ability to handle alpha software and packages. Relying on native primitives for concurrency in Rust adds to his confidence in the compiler's safety. The Rust ecosystem is seen as providing adequate coverage, although packages like libRDKafka, which are pre-1.0, can be challenging to set up or deploy.

Stephen emphasizes simplicity in code and avoiding excessive abstractions, although he acknowledges the benefits of features like generics and traits in Rust. He suggests resources like a book by David McCloyd that focuses on learning Rust without overwhelming complexity.

Expanding on knowledge sharing within the team, Stephen discusses how Rust advocates within the team have encouraged its use and the possibilities it holds for AI infrastructure platforms. They believe Rust could improve performance and reduce latency, particularly for CPU tasks in AI. They mention the adoption of Rust in the data science field, such as its use in the Parquet data format.

The importance of tooling improvements, setting strict standards, and eliminating unsafe code is highlighted. Stephen expresses the desire for a linter that enforces a simplified version of Rust to enhance code readability, maintainability, and testability. He discusses the balance between functional and object-oriented programming in Rust, suggesting object-oriented programming for larger-scale code structure and functional paradigms within functions.

Onboarding Rust engineers is also addressed, considering whether to prioritize candidates with prior Rust experience or train individuals skilled in another language on the job. Recognizing the shortage of Rust engineers, Stephen encourages those interested in Rust to pursue a career at PubNub, pointing to resources like their website and LinkedIn page for tutorials and videos. They emphasize the importance of latency in their edge messaging technology and invite users to try it out.

Python Bytes
#356 Ripping from PyPY

Python Bytes

Play Episode Listen Later Oct 10, 2023 24:13


Topics covered in this episode: Psycopg 3; dacite; RIP: Fast, barebones pip implementation in Rust; Flaky Tests follow up; Extras; Joke. Watch on YouTube. About the show: Sponsored by us! Support our work through our courses at Talk Python Training, The Complete pytest Course, and Patreon Supporters. Connect with the hosts - Michael: @mkennedy@fosstodon.org; Brian: @brianokken@fosstodon.org; Show: @pythonbytes@fosstodon.org. Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Brian #1: Psycopg 3. The Psycopg folks recommend starting with 3 for new projects; 2 is still actively maintained, but no new features are planned, so they recommend staying with 2 for legacy projects. See the Psycopg 3 project and the 2 vs 3 feature comparison. A few Psycopg 3 highlights: native asyncio support; native support for more Python types (such as Enums) and PostgreSQL types (such as multirange); default server-side parameter binding; binary parameters and query results (and text, of course); pipeline/batch mode support; static typing support. Michael #2: dacite, via Raymond Peck. Simple creation of data classes from dictionaries. dacite supports the following features: nested structures, (basic) type checking, optional fields (i.e. typing.Optional), unions, forward references, collections, and custom type hooks. It's important to mention that dacite is not a data validation library. Type hooks are interesting too. Brian #3: RIP: Fast, barebones pip implementation in Rust. The list of current and planned features of RIP; the biggest are listed below: downloading and aggressive caching of PyPI metadata (done); resolving of PyPI packages using Resolvo (done); installation of wheel files (planned); support for sdist files (planned). A new project, just a couple weeks old. … "We would love to have you contribute!" Michael #4: Flaky Tests follow up, by Marwan Sarieddine: "I was inspired by the Talk Python podcast on 'Taming flaky tests' with Gregory Kapfhammer and Owain Parry, so I wrote up an article on my blog titled 'How not to footgun yourself when writing tests - a showcase of flaky tests'." Extras - Brian: Just wrapping up some personal projects, which means… Python People episodes soon, Python Test episodes soon (but later), and more course chapters coming. Michael: PyBay 2023 was fun. Switched to Spark Mail, recommended. Dust (what science fiction storytelling should be); try FTL and Oceanus. Joke: There are more hydrogen atoms in a single molecule of water than there are stars in the entire Solar System. - mas.to/@SmudgeTheInsultCat/111174610011921264 The Big Rewrite

CISM 89.3 : On prend toujours un micro pour la vie
On prend toujours un micro pour la vie : pypy pop mtl cool

CISM 89.3 : On prend toujours un micro pour la vie

Play Episode Listen Later Sep 27, 2023


Like a prism, Josélito Michaud has many facets, which the show sets out to reveal to the public through the richness of each person's life journey. And so, week after week, around a microphone, Riff Tabaracci and his team take turns diving back into the courageous story of their inner healing process, opening up with great emotion, authenticity and courage.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We're collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)!If AI is so important, why is its software so bad?This was the motivating question for Chris Lattner as he reconnected with his product counterpart on Tensorflow, Tim Davis, and started working on a modular solution to the problem of sprawling, monolithic, fragmented platforms in AI development. They announced a $30m seed in 2022 and, following their successful double launch of Modular/Mojo

The Real Python Podcast
Differentiating the Versions of Python & Unlocking IPython's Magic

The Real Python Podcast

Play Episode Listen Later Jul 28, 2023 46:11


What are all the different versions of Python? You may have heard of Cython, Brython, PyPy, or others and wondered where they fit into the Python landscape. This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder's Weekly articles and projects.

Gitbar - Italian developer podcast
Ep.145 - PyScript and PyPy with Antonio Cuni (Anaconda)

Gitbar - Italian developer podcast

Play Episode Listen Later Feb 2, 2023 76:41


In recent months we have talked about Python and its landing in the browser world. This week we did so in a more structured way, with one of its contributors. We have with us Antonio Cuni, directly from Anaconda. ## Support us at https://www.gitbar.it/support ## Paese dei balocchi - https://amzn.to/3HRClOi - https://www.youtube.com/watch?... ## Amazon affiliate link: https://amzn.to/3XDznm1 ## Contacts: @brainrepo on twitter or via mail at https://gitbar.it. ## Credits: The jingles were produced by MondoComputazionale. Music by Blan Kytt - RSPN; Sweet Lullaby by Agnese Valmaggia; Monkeys Spinning Monkeys by Kevin MacLeod

Python Podcast
PyPy - Just in Time

Python Podcast

Play Episode Listen Later Jan 27, 2023 152:40


PyPy - Just in Time. January 27, 2023, Jochen. Why isn't the Python interpreter itself actually written in Python? Almost exactly twenty years ago, a project was started to change that. A good opportunity for Dominik and Jochen to talk with Carl Friedrich, one of the core developers of PyPy. If this has given you the urge to tinker with PyPy yourself, you can contact the developers here, or simply get in touch with Carl Friedrich directly

Python Bytes
#317 Most loved and most dreaded dev tools of 2022

Python Bytes

Play Episode Listen Later Jan 3, 2023 48:31


Watch on YouTube. About the show: Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts - Michael: @mkennedy@fosstodon.org; Brian: @brianokken@fosstodon.org; Show: @pythonbytes@fosstodon.org. Michael #1: StackOverflow 2022 Developer Survey. Last year we saw Git as a fundamental tool to being a developer. This year it appears that Docker is becoming a similar fundamental tool for professional developers, increasing from 55% to 69%. Language: Rust is […] the most loved language, with 87% of developers saying they want to continue using it. JS frameworks: Angular.js is in its third year as the most dreaded. Let me Google that for you: 62% of all respondents spend more than 30 minutes a day searching for answers or solutions to problems; 25% spend more than an hour each day. The demise of the full-stack developer is overrated. I do wish there were more women in the field. Databases: Postgres is #1 and MongoDB is still going strong. The "which web framework do you use?" question is a full-on train wreck. Why is this so hard for people to write the question? Node.js or Express (built on Node) vs. FastAPI or Flask (but no Python?). Rust is the most loved language, and Python and Rust tied for most wanted. "Worked with vs. want to work with" has some interesting graphics. Brian #2: PePy.tech - PyPI download stats with package version breakdown, by Petru Rares Sincraian. We've discussed pypistats.org before, which highlights daily downloads, downloads per major/minor Python version, and downloads per OS. PePy is a bit more useful for me: the default shows the last few versions and the total for this major version; the "select versions" box is editable - clicking in it shows a dropdown with downloads per version already there, and you can add * for a graph of the total, or other major versions if you want to compare; daily/weekly/monthly is nice, to round out some noise and see larger trends. Oddity I noticed: the daily graph isn't the same dates as the table, off by a day on both sides. Not a big deal, but I notice these kinds of things. Michael #3: Codon Python Compiler, via Jeff Hutchins and Abdulaziz Alqasem. A high-performance, zero-overhead, extensible Python compiler using LLVM. You can scale performance and produce executables, even when using third-party libraries such as matplotlib. It also supports writing and executing GPU kernels, which is an interesting feature. See how it works at exaloop.io. BTW, really terrible licensing: free for non-commercial (great), "contact us" for commercial use (it's fine to charge, but give us a price). Brian #4: 8 Levels of Using Type Hints in Python, by Yang Zhou (yahng cho). A progression of using type hints that seems to track how I've picked them up: 1. Type hints for basic data types: x: int. 2. Define a constant using Final: DB: Final = 'PostgreSQL' (ok, I haven't used this one at all yet). 3. Adding multiple type hints to one variable: int | None. 4. Using general type hints: def func(nums: Iterable), also using Optional. 5. Type hints for functions: def func(name: str) -> str: (I probably would put this at #2). 6. Aliases of type hints (not used this yet, but looks cool): PostsType = dict[int, str]; new_posts: PostsType = {1: 'Python Type Hints', 2: 'Python Tricks'}. 7. Type hints for a class itself, i.e. the Self type: from typing import Self; class ListNode: def __init__(self, prev_node: Self) -> None: pass. 8. Provide literals for a variable (not used this yet, but looks cool): from typing import Literal; weekend_day: Literal['Saturday', 'Sunday']; weekend_day = 'Saturday'; weekend_day = 'Monday' # will be a type error. Extras - Brian: I hear a heartbeat for Test & Code, so it must not be dead yet. Michael: New article: Welcome Back RSS. From this I learned about Readwise, Kustosz, and Python's reader. Year progress == 100%. PyTorch discloses malicious dependency chain compromise over holidays (of course found over RSS and Reeder - see article above). Joke: vim switch
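Gathered into one runnable sketch, the eight levels from those show notes look roughly like this (Self requires Python 3.11 or newer; ListNode and PostsType are the article's own illustrative names):

```python
from typing import Final, Iterable, Literal, Self

x: int = 1                                  # 1. basic data types
DB: Final = "PostgreSQL"                    # 2. constants via Final
retries: int | None = None                  # 3. multiple types for one variable

def total(nums: Iterable) -> int:           # 4. general type hints
    return sum(nums)

def greet(name: str) -> str:                # 5. function annotations
    return f"Hello, {name}"

PostsType = dict[int, str]                  # 6. type aliases
new_posts: PostsType = {1: "Python Type Hints", 2: "Python Tricks"}

class ListNode:                             # 7. the Self type
    def __init__(self, prev_node: Self | None = None) -> None:
        self.prev_node = prev_node

weekend_day: Literal["Saturday", "Sunday"]  # 8. literal values
weekend_day = "Saturday"
# weekend_day = "Monday"                    # a type checker would flag this
```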

Voice of the DBA
Code Supply Chain Security

Voice of the DBA

Play Episode Listen Later Oct 11, 2022 2:28


There have been a number of attacks on source code in the last few years. In fact, I saw a new one this week involving an e-commerce WordPress plugin. This time hackers got access to the distribution server of the company, Fishpig, and altered the plug-ins that their customers download. A few years ago this was big news, with the SolarWinds exploit. There was also an attack on PyPy, a popular Python package that many people include in their code. There has been no shortage of problems in npm packages as well. I'm sure this has happened in other software packages, which is scary. In the days of DevOps, where we publish code from a repository, an exploit against your developers might go unnoticed. Then again, maybe not. Read the rest of Code Supply Chain Security

airhacks.fm podcast with adam bien
GraalVM: Meta Circularity on Different Levels

airhacks.fm podcast with adam bien

Play Episode Listen Later Sep 18, 2022 63:14


An airhacks.fm conversation with Fabio Niephaus (@fniephaus) about: enjoying Lego Mindstorms, learning Python, then Java, pencils and mice, using BlueJ, leJOS - Java for Lego, building an extension for PHP-Fusion, enjoying Smalltalk, PyPy and GraalVM, the RPython (restricted Python) toolchain, AOT compilation, Java BeanShell, bringing Smalltalk to other languages with PyPy, Java on Truffle - Espresso, combining multiple interpreters in one JVM, the Hasso-Plattner-Institut in Potsdam, a self-sustaining programming system, the Truffle Native Function Interface, TruffleSqueak, RSqueak/VM, the GraalVM Dashboard, a paper on a polyglot VM built with RPython, the RPython toolchain, the GraalVM Reachability Metadata Repository, using GraalVM with GitHub Actions, the GitHub Action for GraalVM, the GraalVM 22.2 release blog post, the new GraalVM reachability metadata repository, source-level debugging with native images, continuous native image build tracking, Embedding Truffle Languages by Kevin Menard. Fabio Niephaus on twitter: @fniephaus

Test & Code - Python Testing & Development
190: Testing PyPy - Carl Friedrich Bolz-Tereick

Test & Code - Python Testing & Development

Play Episode Listen Later Jun 21, 2022 51:17


PyPy is a fast, compliant alternative implementation of Python. CPython is implemented in C. PyPy is implemented in Python. What does that mean? And how do you test something as huge as an alternative implementation of Python? Special Guest: Carl Friedrich Bolz-Tereick.

Secure Liaison
On PyPy package contamination, a Redash vulnerability, and more

Secure Liaison

Play Episode Listen Later Nov 28, 2021 73:41


(Recorded: 2021/11/27) # We'd be happy to get your feedback on Twitter under the hashtags #secure旅団 #secureLiaison, or via the Google Form. # Topics: keeping economic activity going toward the end of the year; BIMI; a little on Emotet's comeback and the use cases of email as a protocol; PyPy package contamination (11/18); attaching different security levels per session (CAEP / Shared Signals and Events WG); OSSF's Allstar and Scorecards; runtime auditing: Secure Namespaced Kernel Audit for Containers; a Redash vulnerability and GitHub Security Advisory; a hospital hit by a cyberattack in Tokushima spending 200 million yen on a new system instead of paying the ransom; Black Friday, and wanting a PS5. # Participants: Nanashi-san, ykyan-san, @ken5scal # BGM: "A Fool in Love" by Imprismed. Jingle: @hajipion

All My Favorite Songs
All My Favorite Songs 003 by Drive Like Jehu - All Tomorrow's Parties 2.0

All My Favorite Songs

Play Episode Listen Later Oct 21, 2021


Drive Like Jehu was an American post-hardcore band from San Diego active from 1990 to 1995. It was formed by rhythm guitarist and vocalist Rick Froberg and lead guitarist John Reis, ex-members of Pitchfork, along with bassist Mike Kennedy and drummer Mark Trombino, both from Night Soil Man, after their two bands disbanded in 1990. Drive Like Jehu's music was characterized by passionate singing, unusual song structure, indirect melodic themes, intricate guitar playing, and calculated use of tension, resulting in a distinctive sound amongst other post-hardcore acts and helped to catalyze the evolution of hardcore punk into emo. In this episode all songs by bands selected by Drive Like Jehu to play All Tomorrow's Parties 2.0 April 22-24, 2016. Lineup: Hot Snakes, The Blind Shake, Mrs. Magician, Flamin' Groovies, The King Khan & BBQ Show, The Schizophonics, Metz, Mission Of Burma, The Gories, King Khan and the Shrines, The Spits, Rocket From The Crypt, Betunizer, The Monkeywrench, Holly Golightly, Dan Sartain, Martin Rev, Tortoise, The Ex, PyPy, Gary Wilson, Claw Hammer, Wau y Los Arrrghs

Software at Scale
Software at Scale 34 - Faster Python with Guido van Rossum

Software at Scale

Play Episode Listen Later Oct 5, 2021 31:11


Guido van Rossum is the creator of the Python programming language and a Distinguished Engineer at Microsoft. Apple Podcasts | Spotify | Google PodcastsWe discuss Guido’s new work on making CPython faster (PEP 659), Tiers of Python Interpreter Execution, and high impact, low hanging fruit performance improvements.Highlights(an edited summary)[00:21] What got you interested in working on Python performance?Guido: In some sense, it was probably a topic that was fairly comfortable to me because it means working with a core of Python, where I still feel I know my way around. When I started at Microsoft, I briefly looked at Azure but realized I never enjoyed that kind of work at Google or Dropbox. Then I looked at Machine Learning, but it would take a lot of time to do something interested with the non-Python, and even Python-related bits.[02:31] What was different about the set of Mark Shannon’s ideas on Python performance that convinced you to go after them?Guido: I liked how he was thinking about the problem. Most of the other approaches around Python performance like PyPy and Cinder are not suitable for all use cases since they aren’t backward compatible with extension modules. Mark has the perspective and experience of a CPython developer, as well as a viable approach that would maintain backward compatibility, which is the hardest problem to solve. The Python Bytecode interpreter is modified often across minor releases (for eg: 3.8 → 3.9) for various reasons like new opcodes, so modifying that is a relatively safe approach. Utsav: [09:45] Could you walk us through the idea of the tiers of execution of the Python Interpreter?Guido: When you execute a program, you don't know if it's going to crash after running a fraction of a millisecond, or whether it's going to be a three-week-long computation. Because it could be the same code, just in the first case, it has a bug. And so, if it takes three weeks to run the program, maybe it would make sense to spend half an hour ahead of time optimizing all the code that's going to be run. But obviously, especially in dynamic languages like Python, where we do as much as we can without asking the user to tell us exactly how they need it done, you just want to start executing code as quickly as you can. So that if it's a small script, or a large program that happens to fail early, or just exits early for a good reason, you don't spend any time being distracted by optimizing all that code.So, what we try to do there is keep the bytecode compiler simple so that we get to execute the beginning of the code as soon as possible. If we see that certain functions are being executed many times over, then we call that a hot function, and some definition of “hot”. For some purposes, maybe it's a hot function if it gets called more than once, or more than twice, or more than 10 times. For other purposes, you want to be more conservative, and you can say, “Well, it's only hot if it's been called 1000 times.”The specializing adaptive compiler (PEP 659) then tries to replace certain bytecodes with bytecodes that are faster, but only work if the types of the arguments are specific types. A simple hypothetical example is the plus operator in Python. It can add lots of things like integers, strings, lists, or even tuples. On the other hand, you can't add an integer to a string. 
So, the optimization step - often called quickening, but usually in our context, we call it specializing - is to have a separate “binary add” integer bytecode, a second-tier bytecode hidden from the user. This opcode assumes that both of its arguments are actual Python integer objects, reaches directly into those objects to find the values, adds those values together in machine registers, and pushes the result back on the stack. The binary adds integer operation still has to make a type check on the arguments. So, it's not completely free but a type check can be implemented much faster than a sort of completely generic object-oriented dispatch, like what normally happens for most generic add operations. Finally, it's always possible that a function is called millions of times with integer arguments, and then suddenly a piece of data calls it with a floating-point argument, or something worse. At that point, the interpreter will simply execute the original bytecode. That's an important part so that you still have the full Python semantics.Utsav [18:20] Generally you hear of these techniques in the context of JIT, a Just-In-Time compiler, but that’s not being implemented right now.Just-In-Time compilation has a whole bunch of emotional baggage with it at this point that we're trying to avoid. In our case, it’s unclear what and when we’re exactly compiling. At some point ahead of program execution, we compile your source code into bytecode. Then we translate the bytecode into specialized bytecode. I mean, everything happens at some point during runtime, so which part would you call Just-In-Time? Also, it’s often assumed that Just-In-Time compilation automatically makes all your code better. Unfortunately, you often can't actually predict what the performance of your code is going to be. And we have enough of that with modern CPUs and their fantastic branch prediction. For example, we write code in a way that we think will clearly reduce the number of memory accesses. When we benchmark it, we find that it runs just as fast as the old unoptimized code because the CPU figured out access patterns without any of our help. I wish I knew what went on in modern CPUs when it comes to branch prediction and inline caching because that is absolute magic. Full TranscriptUtsav: [00:14] Thank you, Guido, for joining me on another episode of the Software at Scale podcast. It's great to have you here. Guido: [00:20] Great to be here on the show. Utsav: [00:21] Yeah. And it's just fun to talk to you again. So, the last time we spoke was at Dropbox many, many years ago. And you got retired, and then you decided that you wanted to do something new. And you work on performance now at Microsoft, and that's amazing. So, to start off with, I just want to ask you, you could pick any project that you wanted to, based on some slides that I've seen. So, what got you interested in working on Python performance?Guido: [00:47] In some sense, it was probably a topic that was fairly comfortable to me because it means working with a core of Python, where I still feel I know my way around. Some other things I considered briefly in my first month at Microsoft, I looked into, “Well, what can I do with Azure?”, and I almost immediately remembered that I was not cut out to be a cloud engineer. That was never the fun part of my job at Dropbox. It wasn't the fun part of my job before that at Google either. And it wouldn't be any fun to do that at Microsoft. So, I gave up on that quickly. 
I looked in machine learning, which I knew absolutely nothing about when I joined Microsoft. I still know nothing, but I've at least sat through a brief course and talked to a bunch of people who know a lot about it. And my conclusion was actually that it's a huge field. It is mostly mathematics and statistics and there is very little Python content in the field. And it would take me years to do anything interesting with the non-Python part and probably even with the Python part, given that people just write very simple functions and classes, at best in their machine learning code. But at least I know a bit more about the terminology that people use. And when people say kernel, I now know what they mean. Or at least I'm not confused anymore as I was before.Utsav: [02:31] That makes sense. And that is very similar to my experience with machine learning. Okay, so then you decided that you want to work on Python performance, right? And then you are probably familiar with Mark Shannon's ideas?Guido: [02:43] Very much so. Yeah.Utsav: [02:44] Yeah. So, was there anything different about the set of ideas that you decided that this makes sense and I should work on a project to implement these ideas?Guido: [02:55] Mark Shannon's ideas are not unique, perhaps, but I know he's been working on for a long time. I remember many years ago, I went to one of the earlier Python UK conferences, where he gave a talk about his PhD work, which was also about making Python faster. And over the years, he's never stopped thinking about it. And he sort of has a holistic attitude about it. Obviously, the results remain to be seen, but I liked what he was saying about how he was thinking about it. And if you take PyPy, it has always sounded like PyPy is sort of a magical solution that only a few people in the world understand how it works. And those people built that and then decided to do other things. And then they left it to a team of engineers to solve the real problems with PyPy, which are all in the realm of compatibility with extension modules. And they never really solved that. [04:09] So you may remember that there was some usage of PyPy at Dropbox because there was one tiny process where someone had discovered that PyPy was actually so much faster that it was worth it. But it had to run in its own little process and there was no maintenance. And it was a pain, of course, to make sure that there was a version of PyPy available on every machine. Because for the main Dropbox application, we could never switch to PyPy because that depended on 100 different extension modules. And just testing all that code would take forever. [04:49] I think since we're talking about Dropbox, Pyston was also an interesting example. They've come back actually; you've probably heard that. The Pyston people were much more pragmatic, and they've learned from PyPy’s failures. [05:04] But they have always taken this attitude of, again, “we're going to start with CPython,” which is good because that way they are sort of guaranteed compatibility with extension modules. But still, they make these huge sets of changes, at least Pyston one, and they had to roll back a whole bunch of things because, again, of compatibility issues, where I think one of the things, they had a bunch of very interesting improvements to the garbage collection. I think they got rid of the reference counting, though. And because of that, the behavior of many real-world Python programs was completely changed. 
[05:53] So why do I think that Mark's work will be different or Mark's ideas? Well, for one, because Mark has been in Python core developer for a long time. And so, he knows what we're up against. He knows how careful we have with backwards compatibility. And he knows that we cannot just say get rid of reference counting or change the object layout. Like there was a project that was recently released by Facebook basically, was born dead, or at least it was revealed to the world in its dead form, CI Python (Cinder), which was a significantly faster Python implementation, but using sort of many of the optimizations came from changes in object layout that just aren't compatible with extension modules. And Mark has sort of carved out these ideas that work on the bytecode interpreter itself. [06:58] Now, the bytecode is something where we know that it's not going to sort of affect third-party extension modules too much if we change it, because the bytecode changes in every Python release. And internals of the interpreter of the bytecode interpreter, change in every Python release. And yes, we still run into the occasional issue. Every release, there is some esoteric hack that someone is using that breaks. And they file an issue in the bug tracker because they don't want to research or they haven't yet researched what exactly is the root cause of the problem, because all they know is their users say, “My program worked in Python 3.7, and it broke in Python 3.8. So clearly, Python 3.8 broke something.” And since it only breaks when we're using Library X, it must be maybe Library X's fault. But Library X, the maintainers don't know exactly what's going on because the user just says it doesn't work or give them a thousand-line traceback. And they bounce it back to core Python, and they say, “Python 3.8 broke our library for all our users, or 10% of our users,” or whatever. [08:16] And it takes a long time to find out, “Oh, yeah, they're just poking inside one of the standard objects, using maybe information they gleaned from internal headers, or they're calling a C API that starts with an underscore.” And you're not supposed to do that. Well, you can do that but then you pay the price, which is you have to fix your code at every next Python release. And in between, sort of for bug fix releases like if you go from 3.8.0 to 3.8.1, all the way up to 3.8.9, we guarantee a lot more - the bytecodes stay stable. But 3.9 may break all your hacks and it changes the bytecode. One thing we did I think in 3.10, was all the jumps in the bytecode are now counted in instructions rather than bytes, and instructions are two bytes. Otherwise, the instruction format is the same, but all the jumps jump a different distance if you don't update your bytecode. And of course, the Python bytecode compiler knows about this. But people who generate their own bytecode as a sort of the ultimate Python hack would suffer.Utsav: [09:30] So the biggest challenge by far is backwards compatibility.Guido: [09:34] It always is. Yeah, everybody wants their Python to be faster until they find out that making it faster also breaks some corner case in their code.Utsav: [09:45] So maybe you can walk us through the idea of the tiers of execution or tiers of the Python interpreter that have been described in some of those slides.Guido: [09:54] Yeah, so that is a fairly arbitrary set of goals that you can use for most interpreted languages. Guido: [10:02] And it's actually a useful way to think about it. 
And it's something that we sort of plan to implement, it's not that there are actually currently tiers like that. At best, we have two tiers, and they don't map perfectly to what you saw in that document. But the basic idea is-- I think this also is implemented in .NET Core. But again, I don't know if it's sort of something documented, or if it's just this is how their optimizer works. So, when you just start executing a program, you don't know if it's going to crash after running a fraction of a millisecond, or whether it's going to be a three-week-long computation. Because it could be the same code, just in the first case, it has a bug. And so, if it takes three weeks to run the program, maybe it would make sense to spend half an hour ahead of time optimizing all the code that's going to be run. But obviously, especially in dynamic language, and something like Python, where we do as much as we can without asking the user to tell us exactly how they need it done, you just want to start executing the code as quickly as you can. So that if it's a small script, or a large program that happens to fail early, or just exits early for a good reason, you don't spend any time being distracted by optimizing all that code. [11:38] And so if this was a statically compiled language, the user would have to specify that basically, when they run the compiler, they say, “Well, run a sort of optimize for speed or optimize for time, or O2, O3 or maybe optimized for debugging O0.” In Python, we try not to bother the user with those decisions. So, you have to generate bytecode before you can execute even the first line of code. So, what we try to do there is keep the bytecode compiler simple, keep the bytecode interpreter simple, so that we get to execute the beginning of the code as soon as possible. If we see that certain functions are being executed many times over, then we call that a hot function, and you can sort of define what's hot. For some purposes, maybe it's a hot function if it gets called more than once, or more than twice, or more than 10 times. For other purposes, you want to be more conservative, and you can say, “Well, it's only hot if it's been called 1000 times.” [12:48] But anyway, for a hot function, you want to do more work. And so, the specializing adaptive compiler, at that point, tries to replace certain bytecodes with bytecodes that are faster, but that work only if the types of the arguments are specific types. A simple example but pretty hypothetical is the plus operator in Python at least, can add lots of things. It can add integers, it can add floats, it can add strings, it can list or tuples. On the other hand, you can't add an integer to a string, for example. So, what we do there, the optimization step - and it's also called quickening, but usually in our context, we call it specializing - is we have a separate binary add integer bytecode. And it's sort of a second-tier bytecode that is hidden from the user. If the user asked for the disassembly of their function, they will never see binary add integer, they will also always see just binary add. But what the interpreter sees once the function has been quickened, the interpreter may see binary add integers. And the binary add integer just assumes that both of its arguments, that's both the numbers on the stack, are actual Python integer objects. It just reaches directly into those objects to find the values, adds those values together in machine registers, and push the result back on the stack. 
[14:35] Now, there are all sorts of things that make that difficult to do. For example, if the value doesn't fit in a register for the result, or either of the input values, or maybe even though you expected it was going to be adding two integers, this particular time it's going to add to an integer and a floating-point or maybe even two strings. [15:00] So the first stage of specialization is actually… I'm blanking out on the term, but there is an intermediate step where we record the types of arguments. And during that intermediate step, the bytecode actually executes slightly slower than the default bytecode. But that only happens for a few executions of a function because then it knows this place is always called with integers on the stack, this place is always called with strings on the stack, and maybe this place, we still don't know or it's a mixed bag. And so then, the one where every time it was called during this recording phase, it was two integers, we replace it with that binary add integer operation. The binary adds integer operation, then, before it reaches into the object, still has to make a type check on the arguments. So, it's not completely free but a type check can be implemented much faster than a sort of completely generic object-oriented dispatch, like what normally happens for the most generic binary add operations. [16:14] So once we've recorded the types, we specialize it based on the types, and the interpreter then puts in guards. So, the interpreter code for the specialized instruction has guards that check whether all the conditions that will make the specialized instruction work, are actually met. If one of the conditions is not met, it's not going to fail, it's just going to execute the original bytecode. So, it's going to fall back to the slow path rather than failing. That's an important part so that you still have the full Python semantics. And it's always possible that a function is called hundreds or millions of times with integer arguments, and then suddenly a piece of data calls it with a floating-point argument, or something worse. And the semantics still say, “Well, then it has to do with the floating-point way.Utsav: [17:12] It has to deoptimize, in a sense.Guido: [17:14] Yeah. And there are various counters in all the mechanisms where, if you encounter something that fails the guard once, that doesn't deoptimize the whole instruction. But if you sort of keep encountering mismatches of the guards, then eventually, the specialized instruction is just deoptimized and we go back to, “Oh, yeah, we'll just do it the slow way because the slow way is apparently the fastest, we can do.” Utsav: [17:45] It's kind of like branch prediction.Guido: [17:47] I wish I knew what went on in modern CPUs when it comes to branch prediction and inline caching because that is absolute magic. And it's actually one of the things we're up against with this project, because we write code in a way that we think will clearly reduce the number of memory accesses, for example. And when we benchmark it, we find that it runs just as fast as the old unoptimized code because the CPU figured it out without any of our help. Utsav: [18:20] Yeah. I mean, these techniques, generally you hear them in a context of JIT, a Just-In-Time compiler, but y’all are not implementing that right now.Guido: [18:30] JIT is like, yeah, in our case, it would be a misnomer. What we do expect to eventually be doing is, in addition to specialization, we may be generating machine code. 
That's probably going to be well past 3.11, maybe past 3.12. So, the release that we still have until October next year is going to be 3.11, and that's where the specializing interpreters going to make its first entry. I don't think that we're going to do anything with machine code unless we get extremely lucky with our results halfway through the year. But eventually, that will be another tier. But I don't know, Just-In-Time compilation has a whole bunch of emotional baggage with it at this point that we're trying to avoid.Utsav: [19:25] Is it baggage from other projects trying it?Guido: [19:29] People assume that Just-In-Time compilation automatically makes all your code better. It turns out that it's not that simple. In our case, compilation is like, “What exactly is it that we compile?” At some point ahead of time, we compile your source code into bytecode. Then we translate the bytecode into specialized bytecode. I mean, everything happens at some point during runtime, so which thing would you call Just-In-Time? Guido: [20:04] So I'm not a big fan of using that term. And it usually makes people think of feats of magical optimization that have been touted by the Java community for a long time. And unfortunately, the magic is often such that you can't actually predict what the performance of your code is going to be. And we have enough of that, for example, with the modern CPUs and their fantastic branch prediction.Utsav: [20:35] Speaking of that, I saw that there's also a bunch of small wins y'all spoke about, that y’all can use to just improve performance, things like fixing the place of __dict__ in objects and changing the way integers are represented. What is just maybe one interesting story that came out of that?Guido: [20:53] Well, I would say calling Python functions is something that we actually are currently working on. And I have to say that this is not just the Microsoft team, but also other people in the core dev team, who are very excited about this and helping us in many ways. So, the idea is that in the Python interpreter, up to and including version 3.10, which is going to be released next week, actually, whenever you call a Python function, the first thing you do is create a frame object. And a frame object contains a bunch of state that is specific to that call that you're making. So, it points to the code object that represents the function that's being called, it points to the globals, it has a space for the local variables of the call, it has space for the arguments, it has space for the anonymous values on the evaluation stack. But the key thing is that it’s still a Python object. And there are some use cases where people actually inspect the Python frame objects, for example, if they want to do weird stuff with local variables. [22:18] Now, if you're a debugger, it makes total sense that you want to actually look at what are all the local variables in this frame? What are their names? What are their values and types? A debugger may even want to modify a local variable while the code is stopped in a breakpoint. That's all great. But for the execution of most code, most of the time, certainly, when you're not using a debugger, there's no reason that that frame needs to be a Python object. Because a Python object has a header, it has a reference count, it has a type, it is allocated as its own small segment of memory on the heap. It's all fairly inefficient. 
Also, if you call a function, then you create a few objects, then from that function you call another function, all those frame objects end up scattered throughout the entire heap of the program.

[23:17] What we have implemented in our version of 3.11, which is currently just the main branch of the CPython repo, is an allocation scheme where, when we call a function, we still create something that holds the frame, but we allocate that in an array of frame structures. I can't call them frame objects because they don't have an object header, they don't have a reference count or type; it's just an array of structures. This means that unless that array runs out of space, calls can be slightly faster because you don't jump around on the heap. And allocation is cheap: to allocate the next frame, you compare two pointers, then you bump one counter, and now you have a new frame structure. So creation, and also deallocation, of frames is faster. Frames are smaller because you don't have the object header. You also don't have the malloc overhead or the garbage collection overhead. And of course, it's backwards incompatible. So, what do we do now? Fortunately, there aren't that many ways that people access frames. And what we do is, when people call an API that returns a frame object, we say, “Okay, well sure. Here's the frame in our array. Now we're going to allocate an object and we're going to copy some values to the frame object,” and we give that to the Python code. So, you can still introspect it and you can look at the locals as if nothing has changed.

[25:04] But most of the time, people don't look at frames. And this is actually an old optimization. I remember that the same idea existed in IronPython. And they did it differently. I think for them, it was a compile-time choice: when the bytecode equivalent in IronPython was generated for a function, it would make a choice whether to allocate a frame object or just a frame structure for that call. And their big bugaboo was, well, there is a function you can call, sys._getframe(), and it just gives you the frame object. So, in the compiler, they were looking: were you using the exact name sys._getframe? And then they would say, “Oh, that's getframe, now we're going to compile you slightly slower so you use a frame object.” We have the advantage that we can just always allocate the frame object on the fly. But we get similar benefits. And oh, yeah, I mentioned that the frame objects are allocated in an array; what happens if that array runs out? Well, it's actually a linked list of arrays. So, we can still create a new array of frames. We have space for 100 or so, which in many programs is plenty. And if your call stack is more than 100 deep, we'll just have one discontinuity, but the semantics are still the same and we still have most of the benefits.

Utsav: [26:39] Yeah, and maybe as a wrap-up question: there are a bunch of other improvements happening in the Python community for performance as well, right? There's Mypyc, which we're familiar with, which uses Mypy types to compile code and speed it up. Are there any other improvements like that, that you're excited about or interested in following?

Guido: [27:01] Well, Mypyc is very interesting. It gives a much better performance boost, but only when you fully annotate your code and only when you actually follow the annotations precisely at runtime.
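[Editor's note: a toy, pure-Python sketch of the frame-array scheme Guido describes above. All names are invented; the real implementation is C structs inside the interpreter.]

    class FrameChunk:
        # A contiguous block of plain frame records: no object header,
        # no refcount, no per-frame malloc.
        def __init__(self, size=100):
            self.slots = [None] * size
            self.top = 0        # bump pointer
            self.prev = None    # previous chunk if we overflowed

    class FrameStack:
        def __init__(self):
            self.chunk = FrameChunk()

        def push(self, code_name):
            if self.chunk.top == len(self.chunk.slots):
                new = FrameChunk()       # overflow: chain another array,
                new.prev = self.chunk    # one discontinuity, same semantics
                self.chunk = new
            record = {"code": code_name, "locals": {}}
            self.chunk.slots[self.chunk.top] = record
            self.chunk.top += 1          # allocation is a compare and a bump
            return record

        def pop(self):
            self.chunk.top -= 1          # deallocation is one decrement
            self.chunk.slots[self.chunk.top] = None
            if self.chunk.top == 0 and self.chunk.prev is not None:
                self.chunk = self.chunk.prev

    def materialize(record):
        # Only when an introspection API asks for a real frame object do we
        # allocate one and copy values over, as described in the interview.
        class FrameObject:
            pass
        f = FrameObject()
        f.f_code = record["code"]
        f.f_locals = record["locals"]
        return f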
In Mypyc, if you say, “This function takes two integers,” and it returns an integer, then if you call it with something else, it's going to immediately blow up. It'll give you a traceback. But the standard Python semantics are that type annotations are optional, and sometimes they're white lies. And so, the types that you see at runtime may not actually be compatible with the types that were specified in the annotations. And it doesn't affect how your program executes. Unless you start introspecting the annotations, your program runs exactly the same with or without annotations.

[28:05] I mean, there are a couple of big holes in the type system, like Any. And the type checker will say, “Oh, if you put Any, everything is going to be fine.” And so, using that, it's very easy to have something passed an object of an invalid type, and the type checker will never complain about it. And our promise is that the runtime will not complain about it either, unless it really is a runtime error. Obviously, if you're somehow adding an integer to a string at runtime, it's still going to be a problem. But if you have a function that, say, computes the greatest common divisor of two numbers, which is this really cute little loop, if you define the percent operator in just the right way, you can pass in anything. I think there are examples where you can actually pass it two strings, and it will return a string without ever failing.

[29:07] And so basically, Mypyc does things like: the instance attributes are always represented in a compact way where there is no __dict__. The best that we can do, and we are working on designing how we're actually going to do it, is make it so that if you don't look at the __dict__ attribute, we don't necessarily have to store the instance attributes in a dictionary, as long as we preserve the exact semantics. But if you use __dict__ at some later point, again, just like the frame objects, we have to materialize a dictionary. And Mypyc doesn't do that. It's super fast if you don't use __dict__. If you do use __dict__, it just says, “__dict__ not supported in this case.”

[29:59] Mypyc really only compiles a small subset of the Python language. And that's great if that's the subset you're interested in. But I'm sure you can imagine how complex that is in practice for a large program.

Utsav: [30:17] It reminds me of JavaScript performance: everything is working fast, and then you use this one function, which you're not supposed to use, to introspect an object or something, and then performance just breaks down.

Guido: [30:29] Yeah, that will happen.

Utsav: [30:31] But it's still super exciting. And I'm also super thankful that Python fails loudly when you try to add a number to a string, not like JavaScript.

Guido: [30:41] Or PHP, or Perl.

Utsav: [30:44] But yeah, thank you so much for being a guest. I think this was a lot of fun, and it walked through the performance improvements y’all are trying to make in an accessible way. So, I think it’s going to be useful for a lot of people. Yeah, thank you for being a guest.

Guido: [30:58] My pleasure. It’s been a fun chat.

Get on the email list at www.softwareatscale.dev
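[Editor's note: Guido's gcd anecdote is easy to reproduce. Below is an illustrative sketch, with an invented Mod class, of how annotation "white lies" pass unnoticed in plain CPython; Mypyc-compiled code, by contrast, would reject the second call.]

    def gcd(a: int, b: int) -> int:   # the hints are not enforced by CPython
        while b:
            a, b = b, a % b
        return a

    class Mod:
        # Any object defining % and truthiness slips through at runtime.
        def __init__(self, n):
            self.n = n
        def __mod__(self, other):
            return Mod(self.n % other.n)
        def __bool__(self):
            return bool(self.n)

    print(gcd(12, 18))               # 6, as annotated
    print(gcd(Mod(12), Mod(18)).n)   # also 6; the int hints were a white lie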

Sustain
Episode 93: Dan Lorenc and OSS Supply Chain Security at Google

Sustain

Play Episode Listen Later Oct 1, 2021 36:23


Guest Dan Lorenc Panelists Eric Berry | Justin Dorfman | Richard Littauer

Show Notes Hello and welcome to Sustain! The podcast where we talk about sustaining open source for the long haul. Today, we have a very special guest, Dan Lorenc, who is a Staff Software Engineer and the lead for Google's Open Source Security Team. Dan founded projects like Minikube, Skaffold, TektonCD, and Sigstore. He blogs regularly about supply chain security and serves on the TAC for the OpenSSF. Dan fills us in on how Docker fits into what he's doing at Google, tells us who's running the open standards that Docker depends on, and shares what he's most excited about for Docker, both in standardization and in the future. We also learn a little more about a recent blog post of his and what he means by “package managers should become boring,” and he tells us how package managers can help pay maintainers to support their libraries. We learn more about his project Sigstore, and his perspective on the software industry's long-term shift toward security and how that will change over the next five to ten years. Go ahead and download this episode now to find out much more!

[00:01:09] Dan tells us his background and how he got to where he is today.
[00:03:08] Eric wonders how Docker fits into what Dan is doing at Google and whether he can compare Minikube and his work with what the Docker team is trying to drive. He also compares Kubernetes to Docker and how they relate.
[00:06:13] Dan talks about whether he sees a shift in adoption in this sphere, and Eric asks if he feels that local development with Docker is devalued a little bit if you don't use the same Docker configuration for your production deploy.
[00:08:49] Richard wonders whether, in the long term, Dan thinks we're going to keep making better Dockers and better Kubernetes, or whether at some point we'll decide the tooling is enough.
[00:10:35] We learn who's currently running the open standards that Docker depends on, and Dan talks about the different standards.
[00:12:13] Dan shares how he thinks the shift towards open standards, in particular with Docker, influences open source developers in smaller companies, in SMEs, in medium-sized companies, or solo developers who may not have the time to get involved in open standards.
[00:13:45] Find out what Dan is really excited about in terms of Docker, with standardization or in the future, that will lead to a more sustainable ecosystem.
[00:15:17] Justin brings up Dan's blog and a recent post he did called “In Defense of Package Managers,” in which he says package managers should become boring, so he explains what he means by that.
[00:18:01] Dan discusses how package managers can help pay maintainers to support their libraries.
[00:22:03] Richard asks Dan if he has any thoughts on giving maintainers down the stack other forms of recognition than just paying them. He mentions things he loves that GitHub has been doing recently, like showing people their contribution history.
[00:23:46] Find out about Dan's project Sigstore and what its adoption looks like so far.
[00:26:35] Richard wonders if Dan thinks it's a good idea to have the ecosystem depend on a few brilliant people like him doing this work, or if there's a larger community of people working on supply chain security issues. Also, who are the colleagues he bounces these ideas off of, and how do we eliminate the bus factor here?
Dan tells us they have a Slack for Sigstore.
[00:30:03] We learn Dan's perspective on the long-term growth of the software industry towards security in general, how that will change over the next five to ten years, and how his role and the role of people like him will change.
[00:31:35] Find out all the places you can follow Dan on the internet.

Quotes
[00:10:14] “You kind of move past that single point of failure and single toolchain that's actually used to manage everything.”
[00:12:44] “So, they kind of helped contribute to the standardization process by proving stuff out, by getting to try all the new exciting stuff.”
[00:16:33] “The ‘bullseye’ release actually just went out a couple of days ago, which was awesome.”
[00:17:04] “It's a problem because there's nobody maintaining it, which is a really good topic for sustainability.”
[00:24:46] “But nobody's doing it for open source, nobody's signing their code on PyPI or RubyGems even though you can.”
[00:29:50] “These are not the Kim Kardashians of the coding community.”
[00:30:25] “Something that we've been constantly reminding, you know, the policy makers wherever we can, is that 80 to 90% of software in use today is open source.”
[00:30:51] “And even if companies can do this work for the software that they produce, if we don't think of, and don't take care of, and don't remember that these same requirements are going to hit open source at the very bottom of the stack, we're kind of placing unfunded mandates and burdens on these repositories and maintainers that they didn't sign up for.”
[00:31:11] “So we're really trying to remind everyone, as we increase these security standards, which we should do and we need to do, because software is serious, and people's lives depend on it.”

Spotlight
[00:32:32] Eric's spotlight is a game called Incremancer by James Gittins.
[00:33:35] Justin's spotlight is Visual Studio Live Share.
[00:34:04] Richard's spotlight is the BibTeX Community.
[00:35:03] Dan's spotlight is the Debian maintainers.

Links SustainOSS (https://sustainoss.org/) SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) SustainOSS Discourse (https://discourse.sustainoss.org/) Dan Lorenc Twitter (https://twitter.com/lorenc_dan?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) Dan Lorenc Linkedin (https://www.linkedin.com/in/danlorenc) Dan Lorenc Blog (https://dlorenc.medium.com/) Tekton (https://tekton.dev/) Minikube (https://minikube.sigs.k8s.io/docs/) Skaffold (https://skaffold.dev/) Open SSF (https://openssf.org/) Open Container Initiative (https://opencontainers.org/) Committing to Cloud Native podcast-Episode 20-Taking Open Source Supply Chain Security Seriously with Dan Lorenc (https://podcast.curiefense.io/20) “In Defense of Package Managers” by Dan Lorenc (https://dlorenc.medium.com/in-defense-of-package-managers-31792111d7b1?)
Open Source Insights (https://deps.dev/) GitHub repositories Nebraska users (https://github.com/search?q=location%3Anebraska&type=users) CHAOSScast podcast (https://podcast.chaoss.community/) Sigstore (https://www.sigstore.dev/) RyotaK Twitter (https://twitter.com/ryotkak) Dustin Ingram Twitter (https://twitter.com/di_codes?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) Incremancer (https://incremancer.gti.nz/) Visual Studio Live Share (https://visualstudio.microsoft.com/services/live-share/) Enhanced support for citations on GitHub-Arfon Smith (https://github.blog/2021-08-19-enhanced-support-citations-github/) Debian (https://www.debian.org/) Debian “bullseye” Release (https://www.debian.org/releases/bullseye/) Credits Produced by Richard Littauer (https://www.burntfen.com/) Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/) Show notes by DeAnn Bahr at Peachtree Sound (https://www.peachtreesound.com/) Special Guest: Dan Lorenc.

Python en español
Python en español #28: Tertulia 2021-04-13

Python en español

Play Episode Listen Later Jun 29, 2021 76:58


Having several Python versions on the same computer, the state of Durus, its license, and how data persistence works https://podcast.jcea.es/python/28

Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Jesús, connecting from Ferrol. Felipem, connecting from Cantabria. Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Sergio, connecting from Vigo. Juan José, Nekmo, https://nekmo.com/, https://github.com/Nekmo/. A Madrid native connecting from Málaga. Miguel Sánchez, email: msanchez@uninet.edu, connecting from Las Palmas. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[00:52] Introductions.

[03:47] Using different Python versions on the same computer. Each installed package is tied to a specific Python installation on the system. Never run a bare pip install; always state the version: pip3.9 install. When installing Python packages into the operating system's native version, you can use pip or the operating system's package manager. Mixing both is a recipe for disaster.

[16:37] One problem with precompiled packages ("wheels" https://www.python.org/dev/peps/pep-0427/) is that they are usually not built retroactively for the Python version that has just come out. They tend not to be available until a new release of the package itself, which can take months.

[19:52] Image-processing libraries compatible with PyPy https://www.pypy.org/? Numpy https://numpy.org/ does not work on PyPy https://www.pypy.org/ yet.

[21:17] What exactly is PyPy https://www.pypy.org/? JIT: just-in-time compilation https://es.wikipedia.org/wiki/Compilaci%C3%B3n_en_tiempo_de_ejecuci%C3%B3n. A very high barrier to entry into the project. Learning curve. Problems with C modules. They did not appreciate the importance of the ecosystem. HPy https://hpyproject.org/.

[27:27] A couple of weeks' experience with Flit https://pypi.org/project/flit/. Jesús Cea is using it to publish his toc2audio library https://docs.jcea.es/toc2audio/. A tool proposed in Juan Luis Cano's talk "Python Packaging: Lo estás haciendo mal" https://www.youtube.com/watch?v=OeOtIEDFr4Y. https://github.com/astrojuanlu/charla-python-packaging. https://nbviewer.jupyter.org/format/slides/github/astrojuanlu/charla-python-packaging/blob/main/Charla%20Python%20packaging.ipynb#/ PEP 621 -- Storing project metadata in pyproject.toml https://www.python.org/dev/peps/pep-0621/. How important it is to have direct links to the changelog or the documentation on PyPI https://pypi.org/.

[31:32] Documentation modules. Shortcomings. Docstrings. doctest https://docs.python.org/3/library/doctest.html. Sphinx https://pypi.org/project/Sphinx/. make html. The eternal topic: whether to include a library in the standard library or keep it outside. ReST: reStructuredText https://docutils.sourceforge.io/rst.html. PEP 287 -- reStructuredText Docstring Format https://www.python.org/dev/peps/pep-0287/. docutils: https://pypi.org/project/docutils/.

[40:02] Round-table format, or questions and answers?
[41:22] Current state of Durus https://www.mems-exchange.org/software/DurusWorks/ and assorted comments on the persistence system. Jesús Cea has been trying to contact the authors, with little success. Jesús Cea has problems with the license. Abandon the project and move to ZODB https://zodb.org/en/latest/? People are making "forks" https://en.wikipedia.org/wiki/Fork_(software_development) while blithely ignoring the licenses. Jesús Cea is working on license changes for several projects he cares about, with very little success. ZOPE https://zopefoundation.github.io/Zope/. COPYRIGHT ASSIGNMENT https://www.copylaw.com/forms/copyassn.html.

[50:32] How does a persistence system work? A model completely different from an ORM https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping. SQL: https://en.wikipedia.org/wiki/SQL. Working set: https://en.wikipedia.org/wiki/Working_set.

[58:17] Back to the licensing topic.

[59:52] Explain this to me: https://lists.es.python.org/pipermail/general/2021-April/003476.html. Create a file "a.py" with this content:

    def x():
        print('X')

Create another file "b.py" with this content:

    import a

    class clase:
        x = a.x

        def p(self):
            print(self.x)
            self.x()

    if __name__ == '__main__':
        a.x()
        b = clase()
        b.p()

Run "b.py" and explain why it prints what it prints :-).

[01:03:42] People love being teased.

[01:03:52] Current Python versions have already merged the "memory leak" patch discussed over Christmas. bpo-35930: Raising an exception raised in a "future" instance will create reference cycles #24995 https://github.com/python/cpython/pull/24995.

[01:04:22] Call for proposals for PyConES https://2021.es.pycon.org/.

[01:05:22] Back to the challenge at https://lists.es.python.org/pipermail/general/2021-April/003476.html. Hint: methods are descriptors: https://docs.python.org/3/howto/descriptor.html. Bound method: https://www.geeksforgeeks.org/bound-methods-python/. Static methods: https://pythonbasics.org/static-method/. Little of it could be followed because of numerous audio dropouts. The topic is explained much better in, for example, From Function to Method https://wiki.python.org/moin/FromFunctionToMethod.

[01:10:02] Function attributes. PEP 232 -- Function Attributes https://www.python.org/dev/peps/pep-0232/. You can attach attributes to a method, but it happens at the class level, not the instance level, because methods belong to the class, not to the instance:

    class clase:
        def p(self):
            clase.p.hola = 78

    >>> x = clase()
    >>> x.p()
    >>> x.p.hola
    78
    >>> y = clase()
    >>> y.p.hola
    78
    >>> clase.p.hola
    78

[01:14:42] Notes on the recordings, future topics, and sending in topics some time before the meetup if they require thought.

[01:16:06] End.
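For readers who want the puzzle's answer in code form: the behavior comes from the descriptor protocol, as the hints above say. A self-contained sketch (names shortened from the a.py/b.py example):

    def f():
        print('X')

    class Clase:
        x = f                    # a plain function stored as a class attribute

    c = Clase()
    print(Clase.__dict__['x'])   # the raw function, unbound
    print(c.x)                   # a bound method: the descriptor protocol bound c
    try:
        c.x()                    # f() takes 0 arguments but receives c as self
    except TypeError as e:
        print(e)

    class Clase2:
        x = staticmethod(f)      # staticmethod opts out of the binding

    Clase2().x()                 # prints 'X', exactly like a.x() did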

Python en español
Python en español #26: Tertulia 2021-03-30

Python en español

Play Episode Listen Later Jun 17, 2021 109:14


We dissect Juan Luis Cano's talk "Python Packaging: Lo estás haciendo mal" and plenty of DevOps https://podcast.jcea.es/python/26

This audio has a lot of noise produced by Jesús Cea's microphone rubbing against his clothes.

Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Felipem, connecting from Cantabria. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Javier, connecting from Madrid. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[00:50] Prelude. Automate everything, and document whatever cannot be automated. Details of recording quality. What is still missing before the audios get published. toc2audio https://docs.jcea.es/toc2audio/. Publication of the audios is imminent. Various podcast platforms https://es.wikipedia.org/wiki/Podcasting. Spotify https://es.wikipedia.org/wiki/Spotify. What about publishing on YouTube? Download statistics.

[08:20] Digital autonomy. Death to MP3! https://es.wikipedia.org/wiki/MP3

[10:20] Jesús Cea complains that the Python developers survey is not about Python. Python Developers Survey 2020 Results https://www.jetbrains.com/lp/python-developers-survey-2020/

[11:55] Python Packaging: Lo estás haciendo mal https://www.youtube.com/watch?v=OeOtIEDFr4Y. https://github.com/astrojuanlu/charla-python-packaging. https://nbviewer.jupyter.org/format/slides/github/astrojuanlu/charla-python-packaging/blob/main/Charla%20Python%20packaging.ipynb#/ The talk was quite well liked in general. Flit https://pypi.org/project/flit/. Much of the documentation online is outdated. An up-to-date list of "best practices" would be welcome. The weight of the outdated "legacy". The ecosystem is moving very fast. Best practices: https://packaging.python.org/. Let's hope someone keeps that up to date. PEP 621 -- Storing project metadata in pyproject.toml https://www.python.org/dev/peps/pep-0621/. A sin Jesús Cea commits constantly: installing packages at the operating-system level! It gives him no trouble because he does so many outrageous things that they cancel each other out. You'd better follow the recommendations of Juan Luis Cano https://twitter.com/juanluisback! pipenv is evil https://pypi.org/project/pipenv/. pip-tools https://pypi.org/project/pip-tools/. pip-compile. pipdeptree https://pypi.org/project/pipdeptree/.

[35:28] When pinning dependencies, libraries and applications are not the same thing.

[40:58] Stay on the bleeding edge, or upgrade only when there is no other choice? Integration tests! https://es.wikipedia.org/wiki/Prueba_de_integraci%C3%B3n

[45:15] 100% code coverage does not guarantee that all states of the code are exercised.

[49:10] Mutation testing https://es.wikipedia.org/wiki/Prueba_de_mutaci%C3%B3n. hypothesis https://pypi.org/project/hypothesis/. mutant https://pypi.org/project/mutant/.

[50:50] Flit https://pypi.org/project/flit/. PEP 420 -- Implicit Namespace Packages https://www.python.org/dev/peps/pep-0420/. PEP 621 -- Storing project metadata in pyproject.toml https://www.python.org/dev/peps/pep-0621/.

[55:35] PEP 427 -- The Wheel Binary Package Format 1.0 https://www.python.org/dev/peps/pep-0427/. Conda: https://docs.conda.io/en/latest/. Problems getting wheels to support new Python versions.
When a new Python version comes out, you usually have to wait a while for wheel support in the packages you care about. ELF (Executable and Linkable Format): https://en.wikipedia.org/wiki/Executable_and_Linkable_Format.

[01:03:10] Is somebody running an old operating system going to install a modern Python version? If you can install Python from source code, you can surely compile my library from source code too. Careful with advanced binary packages on old CPUs. SSE: https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions. cmov: https://en.wikipedia.org/wiki/Predication_(computer_architecture)#History.

[01:10:48] Docker https://es.wikipedia.org/wiki/Docker_(software).

[01:11:20] Local mirrors of PyPI https://pypi.org/ and private PyPI servers.

[01:14:45] ccache https://ccache.dev/. Ansible: https://es.wikipedia.org/wiki/Ansible_(software).

[01:18:58] HPy https://hpyproject.org/.

[01:20:10] Propose esoteric topics? Assign homework?

[01:21:05] More on HPy https://hpyproject.org/. An alternative API for writing Python modules in C. https://es.wikipedia.org/wiki/Interfaz_de_programaci%C3%B3n_de_aplicaciones. It allows building one wheel https://www.python.org/dev/peps/pep-0427/ that works across several Python versions. Good performance on both CPython and PyPy https://www.pypy.org/. A possible future API https://es.wikipedia.org/wiki/Interfaz_de_programaci%C3%B3n_de_aplicaciones for CPython.

[01:29:02] Help sprucing up the podcast web page: https://podcast.jcea.es/python/. Publication of the audios is imminent. We will reuse the "Python en español" podcast https://podcast.jcea.es/python/. I have asked my former colleagues for permission. CSS: https://es.wikipedia.org/wiki/Hoja_de_estilos_en_cascada. There is such a backlog in publication that any feedback will take a while to show up and to have an effect.

[01:35:10] Coordination Telegram channel: https://t.me/joinchat/y__YXXQM6bg1MTQ0.

[01:36:10] Machete Mode https://nedbatchelder.com/blog/202103/machete_mode_tagging_frames.html. Using it to debug a bug. Death penalty in production. Crazy ideas: James Powell https://twitter.com/dontusethiscode. Intimate knowledge of the language and its implementation. Javier enjoys giving talks about deep and esoteric topics.

[01:42:30] The memory leak patch has already been merged into Python. bpo-35930: Raising an exception raised in a "future" instance will create reference cycles #24995 https://github.com/python/cpython/pull/24995.

[01:43:30] Farewell and future homework. Security funding & NYU https://discuss.python.org/t/new-packaging-security-funding-nyu/7792. TUF (The Update Framework) https://theupdateframework.io/. PEP 458 -- Secure PyPI downloads with signed repository metadata https://www.python.org/dev/peps/pep-0458/. PEP 480 -- Surviving a Compromise of PyPI: End-to-end signing of packages https://www.python.org/dev/peps/pep-0480/. In honor of Eduardo, who did not connect today, we add keyboard noise so that our editor Pablo does not miss it.

[01:48:20] End.
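A quick illustration (not from the episode) of the property-based testing that hypothesis, mentioned above alongside mutation testing, provides; two minimal properties:

    from hypothesis import given, strategies as st

    @given(st.lists(st.integers()))
    def test_sort_is_idempotent(xs):
        once = sorted(xs)
        assert sorted(once) == once   # sorting twice changes nothing

    @given(st.lists(st.integers(), min_size=1))
    def test_max_is_an_element(xs):
        assert max(xs) in xs          # hypothesis searches for counterexamples

Instead of hand-picked cases, the library generates hundreds of inputs and shrinks any failure to a minimal reproducer, which pairs naturally with the mutation-testing idea of checking whether your tests actually catch injected bugs.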

Python en español
Python en español #20: Tertulia 2021-02-16

Python en español

Play Episode Listen Later May 22, 2021 114:47


Internet Archive, we never quite finish talking about the new "pattern matching", the growing complexity of Python's syntax https://podcast.jcea.es/python/20

Participants: Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Javier, connecting from Madrid. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[01:33] How to document in Python. Google Docs: https://docs.google.com. Wikis on GitHub: https://docs.github.com/en/communities/documenting-your-project-with-wikis/about-wikis. Advantages of keeping the documentation in the project's version control. Advantages of writing the documentation while you write the code itself: feedback. Sphinx: https://www.sphinx-doc.org/en/master/. sphinx.ext.autodoc: https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html. plantuml: https://github.com/sphinx-contrib/plantuml. Markdown: https://www.markdownguide.org/.

[03:48] The old guard is skeptical of each week's novelties. There are no silver bullets. Documentation guides development. A parallel with tests.

[08:38] Open source and shame: tests and documentation.

[09:28] CPython Internals Book https://realpython.com/products/cpython-internals-book/.

[11:13] HPy https://hpyproject.org/. A new API https://es.wikipedia.org/wiki/Api for writing C extensions for Python, decoupling you from the interpreter version and compatible with things like PyPy: https://www.pypy.org/.

[13:18] Internet Archive as a library of modern books: https://archive.org/details/inlibrary. It works like a traditional library. Book lending. They are scanning at full speed: 2.5 million books at the time of writing these notes (May 2021). Internet Archive: https://archive.org/. Wayback Machine: https://web.archive.org/. Preservation of video games, Flash pages, music records...

[17:03] Python's website in the Internet Archive. 1997: https://web.archive.org/web/19970606181701/http://www.python.org/. 1998: https://web.archive.org/web/19981212032130/http://www.python.org/. An example of "batteries included": https://commons.wikimedia.org/wiki/File:Python_batteries_included.jpg.

[17:53] Jesús Cea misses the distributed internet.

[18:23] Pattern matching in Python 3.10. PEP 622 -- Structural Pattern Matching https://www.python.org/dev/peps/pep-0622/. Will "match" and "case" be reserved words? PEP 617 -- New PEG parser for CPython https://www.python.org/dev/peps/pep-0617/. We go over the functionality briefly.

[27:48] Easy-to-configure, color-decorated logs: Daiquiri: https://daiquiri.readthedocs.io/en/latest/. Colorama: https://pypi.org/project/colorama/. Compatible with Windows.

[29:28] Trick: python -i: runs a script and then drops into interactive mode. Discussed a few weeks ago. You can also do it from the code itself with code.InteractiveConsole(locals=globals()).interact(). Jesús Cea complains that with the in-code invocation, line editing does not work. Javier gives the right hint: for it to work, just do import readline before launching the interactive mode.

[30:48] Manhole: https://pypi.org/project/manhole/.
[31:53] Conditional breakpoints https://docs.python.org/3/library/pdb.html#pdbcommand-condition. breakpoint() as a built-in function: PEP 553 -- Built-in breakpoint() https://www.python.org/dev/peps/pep-0553/. import pdb; pdb.set_trace().

[33:28] Scraping by hand: scrapy shell: https://docs.scrapy.org/en/latest/topics/shell.html. Jesús Cea does not miss Scrapy https://docs.scrapy.org/en/latest/.

[36:03] Document indexing and searching: Whoosh https://whoosh.readthedocs.io/en/latest/intro.html. Jesús needed to ignore accents, which affects lexeme extraction. The backend is documented, so you can build your own if you need to.

[38:23] How do you back up a 600-gigabyte file with few internal changes?

[40:58] Eduardo Castro has won a hackathon in Pontevedra. Software built on Django: https://www.djangoproject.com/.

[46:38] Bittersweet experiences with hackathons https://en.wikipedia.org/wiki/Hackathon. Netflix Prize https://en.wikipedia.org/wiki/Netflix_Prize.

[50:38] A URL may no longer be available by the time you listen to the podcast: Podcast: Programar es una mierda: https://www.programaresunamierda.com/.

[52:28] Jamii https://jamii.es/. API https://es.wikipedia.org/wiki/Api

[55:38] GraphQL https://es.wikipedia.org/wiki/GraphQL. REST: https://es.wikipedia.org/wiki/Transferencia_de_Estado_Representacional. User permissions. No caching. Vulcain: https://github.com/dunglas/vulcain.

[01:02:53] HTTP/2 https://en.wikipedia.org/wiki/HTTP/2. HTTP/2 Server Push: https://en.wikipedia.org/wiki/HTTP/2_Server_Push. Responses do not have to arrive in order. Multiplexing.

[01:08:53] The explosion of unnecessary complexity hidden behind libraries: OAuth2 https://en.wikipedia.org/wiki/OAuth#OAuth_2.0. OpenID: https://en.wikipedia.org/wiki/OpenID.

[01:10:33] The growing complexity of Python's syntax. Back to Structural Pattern Matching https://www.python.org/dev/peps/pep-0622/. Syntax complexity. A small, capable language replaces dinosaur languages. Python replaced other dinosaur languages. Now Python is a dinosaur. When will a language come along that replaces Python?

[01:12:13] Metaclasses https://realpython.com/python-metaclasses/. Closures: https://es.wikipedia.org/wiki/Clausura_(inform%C3%A1tica).

[01:15:08] Sublanguages, tribes, subcultures of Python are starting to appear. Certain syntax changes can unify subcultures: "the official way to do it". Python's ternary operator v = VALUE1 if CONDITION else VALUE2: PEP 308 -- Conditional Expressions https://www.python.org/dev/peps/pep-0308/. List comprehensions: [f(i) for i in ITER if CONDITION(i)]: PEP 202 -- List Comprehensions https://www.python.org/dev/peps/pep-0202/.

[01:20:18] In the old days you could do outrageous things like True = 0. This works in Python 2.7. It is something that was changed in Python 3.0: https://docs.python.org/3.0/whatsnew/3.0.html#changed-syntax.

[01:21:53] Jesús Cea misses having things removed. He is obsessed with the size of the language. What would we remove?

[01:25:23] The C language includes only the bare minimum.

[01:26:48] Curiosities: What the f*ck Python! https://github.com/satwikkansal/wtfpython:

    >>> all([])
    True
    >>> all([[]])
    False
    >>> all([[[]]])
    True

[01:28:03] Some progress in the investigation of the bug Virako described in recent weeks: Code example: https://pastebin.com/vGM1sh8r. Issue24676: Error in pickle using cProfile https://bugs.python.org/issue24676.
Issue9914: trace/profile conflict with the use of sys.modules[__name__] https://bugs.python.org/issue9914. Issue9325: Add an option to pdb/trace/profile to run library module as a script https://bugs.python.org/issue9325. It requires improving the runpy module https://docs.python.org/3/library/runpy.html. The bug has not hurt anyone badly enough for them to fix it. Not that it is really hard. Or maybe it is.

[01:35:53] Nuitka https://nuitka.net/. Standalone Python executables, independent of whatever you have installed on the system. For example, to be able to use a "modern" Python version. It also works on MS Windows.

[01:39:43] Previous meetup: fonts with ligatures. Combining Unicode characters. Country flags, for example, are a "flag" code followed by the country code: https://en.wikipedia.org/wiki/Regional_indicator_symbol. The Taiwan flag renders differently in China than in the rest of the world: https://emojipedia.org/flag-taiwan/. "Collation" https://en.wikipedia.org/wiki/Unicode_collation_algorithm, to sort and compare Unicode characters correctly: PyICU: https://pypi.org/project/PyICU/.

[01:50:23] When the Steering Council https://www.python.org/dev/peps/pep-0013/ votes on a controversial issue, the decision is final. Consensus at all costs is no longer pursued.

[01:52:53] Farewell.

[01:53:55] End.
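A quick illustration (not from the episode) of the regional-indicator mechanism mentioned above; whether the flag actually renders depends on your terminal font:

    # A flag emoji is just two code points:
    # REGIONAL INDICATOR SYMBOL LETTER E + LETTER S for Spain.
    flag_es = chr(0x1F1EA) + chr(0x1F1F8)
    print(flag_es)                          # the Spanish flag, if the font has it
    print([hex(ord(c)) for c in flag_es])   # ['0x1f1ea', '0x1f1f8']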

Python en español
Python en español #16: Tertulia 2021-01-19

Python en español

Play Episode Listen Later May 13, 2021 143:23


Framework controversy, just-in-time compilation, compilers and Python performance, web scraping, and persistence strikes again https://podcast.jcea.es/python/16

Participants: Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Javier, connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Dani, connecting from Málaga, invited by Virako. Javier, connecting from Seville, also invited by Virako. Antonio, connecting from Albacete. Jorge Rúa, connecting from Vigo. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[01:17] Event sourcing and snow. Storm Filomena: https://es.wikipedia.org/wiki/Borrasca_Filomena.

[03:52] The usual legal notices required to record the meetup.

[04:47] Introductions, dynamics, and motivation of these meetups.

[11:22] Jesús Cea's logistical problems with his talks.

[12:52] Debate: frameworks and how they condition knowledge of the language and the way code gets developed. Plenty to chew on.

[30:22] Connection with the asyncio world.

[34:12] Digression: how does CSRF protection work? https://es.wikipedia.org/wiki/Cross-site_request_forgery. The semantic difference between HTTP verbs: GET and POST https://en.wikipedia.org/wiki/POST_(HTTP). Some web security resources (not exhaustive; the list is endless): CSRF: https://es.wikipedia.org/wiki/Cross-site_request_forgery. Cross-Origin Resource Sharing (CORS) https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS. Content Security Policy Reference https://content-security-policy.com/. The FastAPI documentation https://fastapi.tiangolo.com/ covers a lot of security: CORS (Cross-Origin Resource Sharing): https://fastapi.tiangolo.com/tutorial/cors/. OAuth2 with Password (and hashing), Bearer with JWT tokens https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/. About HTTPS https://fastapi.tiangolo.com/deployment/https/.

[39:52] The ItsNat project https://en.wikipedia.org/wiki/ItsNat. State lives on the server; the client only handles events and the DOM updates the server sends. Intelligence is moving from the browser back to the server again.

[44:42] Is JavaScript really indispensable when your interface is the browser? Brython: https://brython.info/. Pyjs (formerly Pyjamas): https://en.wikipedia.org/wiki/Pyjs. Emscripten: https://emscripten.org/.

[48:57] Just-in-time compilation! Dictionary versioning. PEP 509 Add a private version to dict: https://www.python.org/dev/peps/pep-0509/. Just-in-time compilation: Pyjion: https://pyjion.readthedocs.io/en/latest/index.html. Conflict with interpreter portability. numba: https://numba.pydata.org/. There are few "core developers", and inheriting advanced code that then has to be maintained is a problem. LLVM: https://en.wikipedia.org/wiki/LLVM.

[01:04:27] Programming languages must be conservative, because you have no idea what programmers out there rely on.

[01:05:32] If the documentation has been updated, you had better update your code to "how things are done now".

[01:06:47] Recurring topic: is it better to be inside or outside the standard library? Boost: https://www.boost.org/.
[01:09:12] Python compilers: Cython: https://cython.org/. Performance and obfuscation. nuitka: https://nuitka.net/. numba: https://numba.pydata.org/. PyPy: https://www.pypy.org/.

[01:10:42] Recent improvements in the Python implementation: Issue 26647: ceval: use Wordcode, 16-bit bytecode: https://bugs.python.org/issue26647. Issue 9203: Use computed gotos by default: https://bugs.python.org/issue9203.

[01:14:52] Psyco https://en.wikipedia.org/wiki/Psyco.

[01:16:22] Type annotations to help JITs. Cython: https://cython.org/. MYPY: http://mypy-lang.org/. MYPYC: https://mypyc.readthedocs.io/en/latest/index.html. Specialization.

[01:22:37] GHC (The Glasgow Haskell Compiler): https://www.haskell.org/ghc/.

[01:25:07] Transactional memory https://en.wikipedia.org/wiki/Transactional_memory. Python implementations: persistence systems such as Durus https://www.mems-exchange.org/software/DurusWorks/ or ZODB http://www.zodb.org/. Conflict-resolution mechanisms.

[01:34:32] More on optimizations and guards. Much discussion about the GIL: https://en.wikipedia.org/wiki/Global_interpreter_lock. The atomicity of operations is not documented anywhere.

[01:42:02] Bytecode example:

    >>> def rutina(n):
    ...     n += 1
    ...     n = n + 1
    ...
    >>> dis.dis(rutina)
      2           0 LOAD_FAST                0 (n)
                  2 LOAD_CONST               1 (1)
                  4 INPLACE_ADD
                  6 STORE_FAST               0 (n)
      3           8 LOAD_FAST                0 (n)
                 10 LOAD_CONST               1 (1)
                 12 BINARY_ADD
                 14 STORE_FAST               0 (n)
                 16 LOAD_CONST               0 (None)
                 18 RETURN_VALUE

[01:45:02] When you do very advanced things that rely on formally undefined behavior, better verify your assumptions.

[01:46:47] The advantage of trying things out in personal projects: why did Jesús Cea write his own web scraper? "Mischief". scrapy: https://scrapy.org/.

[01:49:22] Version migration in persistence systems.

[02:05:07] Event sourcing. Event sourcing: https://dev.to/barryosull/event-sourcing-what-it-is-and-why-its-awesome. Modification logs.

[02:08:07] Advantages of having used scrapy: https://scrapy.org/. Concurrency. Tarpits. Common problems: URL normalization. Malformed websites.

[02:13:47] Scraping modules: newspaper3k: https://pypi.org/project/newspaper3k/.

[02:15:02] Recap. Pyjion: https://pyjion.readthedocs.io/en/latest/index.html. MYPYC: https://mypyc.readthedocs.io/en/latest/index.html.

[02:16:02] Compiling Python modules for MS Windows. Building a wheel. Taking advantage of continuous-integration systems that spin up virtual machines.

[02:22:21] End.
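As a companion to the bytecode example above, a sketch of why n += 1 is not atomic even under the GIL: the LOAD/INPLACE_ADD/STORE sequence can interleave between threads. How often the lost updates actually show up in practice depends heavily on the CPython version:

    import threading

    counter = 0

    def bump(times):
        global counter
        for _ in range(times):
            counter += 1   # compiles to LOAD, INPLACE_ADD, STORE: not atomic

    threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)         # may be less than 400000 when updates are lost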

Python en español
Python en español #12: Tertulia 2020-12-22

Python en español

Play Episode Listen Later Apr 28, 2021 117:29


Memory cycles, "core developers", and dataclasses https://podcast.jcea.es/python/12

In what follows, when CPython is mentioned, it refers to the reference Python interpreter, which is written in C: https://www.python.org/downloads/.

Participants: Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Javier, connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Juan Carlos. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[00:52] We keep talking about the bug discussed last week. Bug bpo35930: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930.

[02:17] The "bytecode" https://es.wikipedia.org/wiki/Bytecode that Python generates leaves a lot of room for improvement.

    >>> import dis
    >>> def suma(valores):
    ...     s = 0
    ...     for i in valores:
    ...         s += i
    ...     return s
    ...
    >>> dis.dis(suma)
      2           0 LOAD_CONST               1 (0)
                  2 STORE_FAST               1 (s)
      3           4 LOAD_FAST                0 (valores)
                  6 GET_ITER
            >>    8 FOR_ITER                12 (to 22)
                 10 STORE_FAST               2 (i)
      4          12 LOAD_FAST                1 (s)
                 14 LOAD_FAST                2 (i)
                 16 INPLACE_ADD
                 18 STORE_FAST               1 (s)
                 20 JUMP_ABSOLUTE            8
      5     >>   22 LOAD_FAST                1 (s)
                 24 RETURN_VALUE

Type inference: https://es.wikipedia.org/wiki/Inferencia_de_tipos.

[08:32] Garbage collection. gc.set_threshold(): https://docs.python.org/3/library/gc.html#gc.set_threshold. gc.disable(): https://docs.python.org/3/library/gc.html#gc.disable.

[11:27] Monitoring tools: DTrace: http://dtrace.org/blogs/. Monitors the entire operating system, applications included, fully integrated, safely, and without modifying the software.

[13:32] Python's auditing functionality: PEP 551 -- Security transparency in the Python runtime https://www.python.org/dev/peps/pep-0551/. PEP 578 -- Python Runtime Audit Hooks https://www.python.org/dev/peps/pep-0578/.

[16:47] More monitoring tools: SystemTap: https://es.wikipedia.org/wiki/SystemTap. eBPF: https://ebpf.io/. py-spy: https://github.com/benfred/py-spy.

[17:52] More on DTrace https://es.wikipedia.org/wiki/DTrace_(Sun_Microsystems) and Python: adding DTrace probes to the Python interpreter: https://www.jcea.es/artic/python_dtrace.htm.

[22:12] Tracemalloc. tracemalloc: https://docs.python.org/3/library/tracemalloc.html.

[23:02] We keep talking about the bug discussed last week. Bug bpo35930: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930. A case of beer is on offer! Brainstorming. Detailed diagnosis. weakref — Weak references: https://docs.python.org/3/library/weakref.html. The stake rises to a case and a half of beer :-). The exception is raised in one thread and is "transported" and stored so that it can be accessed from another thread. A reproducible test.

[36:42] Legal notice. Machine learning to identify the different speakers.

[38:27] Future recording notes will be EXHAUSTIVE (as you can tell by reading this :).

[39:17] Ideas to "prime" the meetups. Many recurring topics; clearly there are topics "floating" in the air.

[40:37] How to organize the meetups, different interests and depths. Meetup dynamics.

[42:32] How do the "core developers" organize themselves?
Development has moved to GitHub. The bug tracker is halfway migrated; it will be integrated further into GitHub. https://pyfound.blogspot.com/2020/05/pythons-migration-to-github-request-for.html PEP 581 -- Using GitHub Issues for CPython https://www.python.org/dev/peps/pep-0581/. Developer's guide: https://devguide.python.org/. Backporting CPython bug fixes from the development version to the stable versions. How is "core developer" status gained and lost? Steering Council. PEP 8016: https://www.python.org/dev/peps/pep-8016/. The role it plays and how it is elected. Since Guido stopped being BDFL, he has been very active on mailing lists and writing code.

[52:22] Víctor wants more bugs to learn from! Bugs tagged "easy", as a way in for new developers.

[53:42] Which parts of CPython are written in C and which in Python? C is used only where there is no alternative, for performance or because it interacts with the operating system. Later in the conversation Jesús Cea explains how to check whether a given module is implemented in C or in Python without going to the source code.

[57:32] PyPy https://www.pypy.org/. A Python interpreter written in Python. RPython: https://rpython.readthedocs.io/en/latest/.

[58:27] Include other languages in the CPython implementation? Rust: https://es.wikipedia.org/wiki/Rust_(lenguaje_de_programaci%C3%B3n). PyOxidizer: https://github.com/indygreg/PyOxidizer. Fragmentation. Jesús Cea is more focused on the C side because most "core developers" don't know C. Adding more languages shrinks the group of people who can maintain those parts. C's portability. Bootstrapping a language with the language itself. Forth: https://en.wikipedia.org/wiki/Forth_(programming_language).

[01:05:02] Python 3.9. Improvements. Difficulties using the latest Python version, depending on what the client has.

[01:08:07] Dataclasses: https://docs.python.org/3/library/dataclasses.html. The difficulty of having optional attributes. Some ideas. attrs: https://www.attrs.org/en/stable/. Using "sentinel" values. DRY: https://es.wikipedia.org/wiki/No_te_repitas.

[01:20:52] Pydantic: https://pydantic-docs.helpmanual.io/.

[01:23:07] Meetup schedule. Much discussion and some ideas. No changes for now. More feedback is needed. It would be appreciated if people who leave the meetups explained why they left.

[01:30:27] Jesús Cea explains how to check whether a given module is implemented in C or in Python without going to the source code.

[01:31:18] More on meetup dynamics. Debate about introducing yourself or not in open meetups, or keeping the camera off. A repository is becoming necessary so that meetup members can share things. A mailing list specifically for the meetups?

[01:36:42] Meetup minutes, and publishing the recordings once and for all. Any sound engineer in the room? An affordable one?

[01:39:08] The "level" of the mailing lists. Where are the interesting conversations? (Apart from the weekly meetup :-). The curse of the basics and of "introduction to". Maybe for interesting conversations to happen, you have to ask interesting questions :-). Python-Madrid before Meetup arrived. Jesús Cea suggests lists like "python-ideas": https://mail.python.org/mailman3/lists/python-ideas.python.org/. Also the Spanish-language Python programming list: python-es@python.org. Javier has very strange interests :-).

[01:54:52] Wrap-up.

[01:56:42] End.
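A quick illustration (not from the episode) of the "sentinel values" idea mentioned in the dataclasses discussion above; all names are invented:

    from dataclasses import dataclass
    from typing import Any

    _MISSING = object()   # unique sentinel: distinct from any caller value, even None

    @dataclass
    class Request:
        url: str
        timeout: Any = _MISSING   # "not provided" is different from an explicit None

        def effective_timeout(self, default=30):
            return default if self.timeout is _MISSING else self.timeout

    print(Request("https://example.com").effective_timeout())                # 30
    print(Request("https://example.com", timeout=None).effective_timeout())  # None

The identity check against the sentinel lets the class tell "the caller said nothing" apart from "the caller explicitly passed None", which a plain None default cannot do.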

Python en español
Python en español #11: Tertulia 2020-12-15

Python en español

Play Episode Listen Later Apr 27, 2021 102:06


More than you ever wanted to learn about JIT, guards, and specialization https://podcast.jcea.es/python/11

In what follows, when CPython is mentioned, it refers to the reference Python interpreter, which is written in C: https://www.python.org/downloads/.

Participants: Eduardo Castro, email: info@ecdesign.es. Connecting from A Guarda. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Connecting from Madrid. Javier, connecting from Madrid. Víctor Ramírez, twitter: @virako, Python programmer and vim lover, connecting from Huelva. Miguel Sánchez, email: msanchez@uninet.edu, connecting from the Canary Islands. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[00:52] Notice that the session is being recorded. Legal matters.

[01:52] The value of publishing these audios and the difficulties of doing so.

[02:42] Magic methods: __set_name__(). PEP 487: https://www.python.org/dev/peps/pep-0487/.

[04:12] Problems with PIP 20.3.2: https://github.com/pypa/pip/issues/9284.

[05:52] Upgrade to the latest version or wait? Being able to roll back easily. Accumulating pending upgrades is technical debt.

[10:42] Google down https://www.theguardian.com/technology/2020/dec/14/google-suffers-worldwide-outage-with-gmail-youtube-and-other-services-down.

[11:02] Building wheels for several systems: https://pythonwheels.com/. auditwheel: https://pypi.org/project/auditwheel/. Building wheels on Microsoft Windows?

[13:12] PIP's local cache https://pip.pypa.io/en/stable/.

[14:17] Event sourcing https://dev.to/barryosull/event-sourcing-what-it-is-and-why-its-awesome. The eventsourcing module: https://pypi.org/project/eventsourcing/.

[14:42] For now you can still use PIP's old dependency resolver, with the --use-deprecated=legacy-resolver option. The option can also go into the configuration file, so you don't have to type it on every invocation. Jesús Cea commits the sin of putting Python packages into the operating system.

[17:02] Jesús Cea's war stories. Jesús has been mulling this over for two years: bpo35930: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930. Detailed explanation of the issue. Brainstorming.

[21:22] A high-level view of the garbage collector in Python (CPython). Reference counting: immediate, but it does not collect cycles. If instances are created and not destroyed, a "heavy" collector is invoked that also collects cycles. This can be problematic at program startup, before object creation/destruction "stabilizes". gc.disable(): https://docs.python.org/3/library/gc.html#gc.disable. Jesús Cea "abuses" destructors and their running exactly when he wants. The practical versus the pure. Jesús offers beers. gc.collect(): https://docs.python.org/3/library/gc.html#gc.collect. This serves both to collect cycles and to check whether your program has memory cycles at all. Futures: https://docs.python.org/3/library/concurrent.futures.html.

[35:29] The Manhole module https://pypi.org/project/manhole/. Exploring a program in production. Tracemalloc: https://docs.python.org/3/library/tracemalloc.html. DTrace: http://dtrace.org/blogs/about/. py-spy: https://pypi.org/project/py-spy/. Memory leaks: remember what was already covered in previous meetups.
jemalloc: http://jemalloc.net/. MALLOC_PERTURB_: https://debarshiray.wordpress.com/2016/04/09/malloc_perturb_/. zswap: https://en.wikipedia.org/wiki/Zswap.

[42:52] Micropython: https://micropython.org/. ESP8266: https://en.wikipedia.org/wiki/ESP8266. ESP32: https://en.wikipedia.org/wiki/ESP32. Bluetooth Low Energy: https://en.wikipedia.org/wiki/Bluetooth_Low_Energy. What advantages does Micropython bring? Development and debugging speed.

[52:42] Will the future be better? Or not. Hardware resources get wasted because, really, there are plenty of them. Python is much slower than C, never mind assembler.

[57:17] Replacing Python with a faster language. Go: https://en.wikipedia.org/wiki/Go_(programming_language). Rust: https://en.wikipedia.org/wiki/Rust_(programming_language). C++: https://en.wikipedia.org/wiki/C%2B%2B.

[01:00:20] Python is irrelevant on mobile. Kivy: https://kivy.org/.

[01:02:07] Speeding Python up. Subinterpreters: PEP 554: https://www.python.org/dev/peps/pep-0554/. If subinterpreters shared NOTHING, several could run simultaneously on multiple CPU cores without competing for a single GIL https://en.wikipedia.org/wiki/Global_interpreter_lock. JIT: https://es.wikipedia.org/wiki/Compilaci%C3%B3n_en_tiempo_de_ejecuci%C3%B3n. PYPY: https://www.pypy.org/. RPython: https://rpython.readthedocs.io/en/latest/. Numba: https://numba.pydata.org/. Cython: https://cython.org/. Python is "potentially" very dynamic, but in practice programs are not. Jesús gives several examples. Dense conversation between Jesús and Javier. Guards to check that a specialization is still valid. For example, for dictionaries: PEP 509 Add a private version to dict: https://www.python.org/dev/peps/pep-0509/. Stricter "typing". MYPY: http://mypy-lang.org/. Pydantic: https://pydantic-docs.helpmanual.io/. Runtime type checking. Runtime type discovery, providing "specialization". psyco: https://en.wikipedia.org/wiki/Psyco. Eduardo Castro joins and simplifies the discussion. Jesús explains what "a+b" does internally.

[01:29:22] PyParallel http://pyparallel.org/. Transactional memory: https://es.wikipedia.org/wiki/Memoria_transaccional. (Note from Jesús Cea): the Python persistence systems described in previous meetups can be considered cases of transactional memory... if we are flexible. "Coloring" objects so that two threads cannot access objects of the same color simultaneously or in concurrent transactions.

[01:30:42] PYPY https://www.pypy.org/ is so sophisticated that nobody on earth understands it. Jesús Cea has tried and has given up. psyco: https://en.wikipedia.org/wiki/Psyco. CFFI: https://cffi.readthedocs.io/en/latest/.

[01:35:22] CPython compiled to WebAssembly https://en.wikipedia.org/wiki/WebAssembly runs faster than native C.

[01:36:02] Simply compiling Python code with Cython https://cython.org/, without type declarations, doubles execution speed. CPython can do better!

[01:36:57] Subinterpreters: PEP 554: https://www.python.org/dev/peps/pep-0554/. Being able to use all the CPU cores.

[01:38:07] We keep discussing the subject.

[01:39:07] One problem is that Python aspires to run everywhere, so there is resistance to implementing improvements only on certain platforms.

[01:40:17] Wrap-up. Give some thought to bug bpo35930: "Raising an exception raised in a "future" instance will create reference cycles": https://bugs.python.org/issue35930.

[01:41:13] End.
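The gc.collect() trick mentioned above, using the collector to check whether your code creates reference cycles, fits in a few lines:

    import gc

    class Node:
        def __init__(self):
            self.other = None

    a, b = Node(), Node()
    a.other, b.other = b, a   # a reference cycle: refcounts never reach zero
    del a, b

    # gc.collect() returns the number of unreachable objects it found;
    # a nonzero result after exercising your code is a cheap cycle detector.
    print(gc.collect())        # > 0 here because of the cycle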

Python en español
Python en español #8: Tertulia 2020-11-24

Python en español

Play Episode Listen Later Apr 20, 2021 100:01


Bending the snake to our will https://podcast.jcea.es/python/8

I (Jesús Cea) sound very bad, and it is tiring because I talk a lot and my sound quality is poor. Sorry. The pauses were removed in editing, so it is quite tiring to hear Jesús Cea talk at full speed without breathing. We will do better next time. A lot of keyboard noise can be heard.

Participants: Eduardo Castro, email: info@ecdesign.es. Jesús Cea, email: jcea@jcea.es, twitter: @jcea, https://blog.jcea.es/, https://www.jcea.es/. Sara Sáez, twitter: @saruskysaez. Luis. Audio edited by Pablo Gómez, twitter: @julebek. The opening and closing music is "Lightning Bugs", by Jason Shaw. Published at https://audionautix.com/ under a Creative Commons Attribution 4.0 International License.

[01:42] Limited API. Python's limited API, to guarantee binary compatibility of C extensions across different versions of the Python interpreter. PEP 384: https://www.python.org/dev/peps/pep-0384/.

[03:42] Why I started using Python.

[06:52] Eduardo: restaurant menus with QR codes: https://www.qrico.eu/.

[10:42] Is it better for a library to be in Python's standard library or to be an external library? A recurring topic. Pros and cons.

[18:34] Python support on MS Windows. Distributing precompiled libraries. How do you compile a C extension on MS Windows?

[20:52] The problem of binary distributions when a new Python version comes out. It is one of the motivations for using the limited API defined in PEP 384: https://www.python.org/dev/peps/pep-0384/.

[23:22] Notification systems for library updates. For example: https://libraries.io/. PyPI RSS feed: https://pypi.org/rss/updates.xml. Do you upgrade to the latest version? Pros and cons.

[28:22] Better to join the meetup with video.

[29:12] Debugging memory usage and memory leaks. Flamegraphs: http://www.brendangregg.com/flamegraphs.html. Tracemalloc: https://docs.python.org/3/library/tracemalloc.html.

[33:52] Virtualenv: what does each of us use? And on MS Windows?

[35:52] Python support on MS Windows. Most Python usage happens on MS Windows, but the "core developers" don't use MS Windows. That causes support problems.

[40:52] Guido van Rossum and Microsoft. Guido van Rossum has started working for Microsoft: https://www.msn.com/en-us/news/technology/python-creator-guido-van-rossum-joins-microsoft/ar-BB1aXmPu.

[44:22] Are you using Python 3.9 yet? The limited API expands with each Python version. PEP 384: https://www.python.org/dev/peps/pep-0384/.

[45:22] Options for speeding up Python code. Numba https://numba.pydata.org/. Cython https://cython.org/. But once you start annotating types, the resulting code is no longer Python. The future is type hinting: PEP 484 https://www.python.org/dev/peps/pep-0484/. Writing a native C extension. PyPy https://www.pypy.org/. Mind the compatibility.

[54:32] Methods for making Python slower :-)

[55:12] Code protection in Python. Cython https://cython.org/.

[58:47] Mixing C code into Python. Writing a C module. CFFI: https://cffi.readthedocs.io/en/latest/.

[01:01:52] Guido van Rossum and Microsoft (part two). Back to Guido van Rossum working for Microsoft: https://www.msn.com/en-us/news/technology/python-creator-guido-van-rossum-joins-microsoft/ar-BB1aXmPu. The controversy over the "walrus operator".

[01:05:22] "Walrus operator". PEP 572 https://www.python.org/dev/peps/pep-0572/.
Tema recurrrente: Python se está complicando cada vez más. Problema para los novatos. [01:14:32] Opciones para acelerar la ejecución de código Python (2). Otra forma de acelerar Python: MYPY http://mypy-lang.org/ y MYPYC https://github.com/mypyc/mypyc. Type hinting. PEP 484 https://www.python.org/dev/peps/pep-0484/. [01:17:42] ¿Python con tipos? Motivación. [01:20:52] ¿Quien paga los tests? [01:22:37] Los tests como documentación. [01:23:32] ¿Qué usais para tests? [01:26:22] ¿Qué hace cada uno con Python? Hobby, Zope https://zope.readthedocs.io/en/latest/, imágenes, numpy https://numpy.org/, Jupyter https://jupyter.org/. Persistencia de datos y ORMs. Integrar Python dentro de otros proyectos, como en Kodi https://www.kodi.tv/. Django https://www.djangoproject.com/, micropython http://www.micropython.org/. [01:33:12] Colofón y mi motivación para las tertulias.
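Since the walrus operator comes up twice in this episode, here is a minimal sketch of what PEP 572 actually adds; the log lines are invented for illustration:

    # PEP 572: ':=' assigns and yields a value inside an expression,
    # so you can test and reuse a result without a separate statement.
    import re

    lines = ["GET /index 200", "POST /login 500", "GET /img.png 404"]
    for line in lines:
        # Bind the match object and test it in the same 'if'.
        if (m := re.search(r" (\d{3})$", line)) and int(m.group(1)) >= 400:
            print("problem:", line, "->", m.group(1))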

Python Bytes
#206 Python dropping old operating systems is normal!

Python Bytes

Play Episode Listen Later Nov 8, 2020 42:56


Sponsored by Techmeme Ride Home podcast: pythonbytes.fm/ride
Special guest: Steve Dower - @zooba

Brian #1: Making Enums (as always, arguably) more Pythonic
"I hate enums", by Harry Percival. A hilarious look at why enums are frustrating in Python, and a semi-reasonable workaround to make them usable.
Problems with enums of strings: you can't directly compare enum elements with their values; having to use .value is dumb; you can't do a random choice of enum values; you can't convert directly to a list of values.
If you use IntEnum instead of Enum, with integer values instead of strings, it kind of works better. Making your own StringEnum is also better, but still doesn't allow comparison.
Solution:
    class BRAIN(str, Enum):
        SMALL = 'small'
        MEDIUM = 'medium'
        GALAXY = 'galaxy'

        def __str__(self) -> str:
            return str.__str__(self)
Derive from both str and Enum, and add a __str__ method. Fixes everything except random.choice() (see the short sketch after these show notes).

Michael #2: Python 3.10 will be up to 10% faster
4.5 years in the making: work started by Yury Selivanov, picked up by Pablo Galindo, Python core developer and Python 3.10/3.11 release manager.
LOAD_METHOD, CALL_METHOD, and LOAD_GLOBAL improved.
"Lots of conversations with Victor about his PEP 509, and he sent me a link to his amazing compilation of notes about CPython performance. One optimization that he pointed out to me was the LOAD/CALL_METHOD opcodes, an idea that first originated in PyPy." There is a patch that implements this optimization.
Based on: LOAD_ATTR stores in its cache a pointer to the type of the object it works with, its tp_version_tag, and a hint for PyDict_GetItemHint. When there is a cache hit, LOAD_ATTR becomes super fast, since it only needs to look up the key/value in the type's dict by a known offset (the real code is a bit more complex, to handle all the edge cases of the descriptor protocol, etc.).

Steve #3: Python 3.9 and no more Windows 7
PEP 11 -- Removing support for little used platforms | Python.org
Windows 7 - Microsoft Lifecycle | Microsoft Docs
Default x64 download

Brian #4: Writing Robust Bash Shell Scripts
By David Pashley. Some great tips that I learned, even though I've been writing bash scripts for decades:
set -u : exits your script if you use an uninitialized variable.
set -e : exits the script if any statement returns a non-true return value; prevents errors from snowballing.
Expect the unexpected, like missing files, missing directories, etc.
Be prepared for spaces in filenames: if [ "$filename" = "foo" ];
Use trap to handle interrupts, exits, and terminal kills, to leave the system in a good state.
Be careful of race conditions. Be atomic.

Michael #5: Ideas for 5x faster CPython
Twitter post by Anthony Shaw calling attention to a roadmap by Mark Shannon.
Implementation plan for speeding up CPython: the overall aim is to speed up CPython by a factor of (approximately) five, in four distinct stages, each increasing the speed of CPython by (approximately) 50%: 1.5**4 ≈ 5. Each stage will be targeted at a separate release of CPython.
Stage 1 -- Python 3.10: the key improvement will be an adaptive, specializing interpreter. The interpreter will adapt to types and values during execution, exploiting type stability in the program, without needing runtime code generation.
Stage 2 -- Python 3.11: improved performance for integers of less than one machine word; faster calls and returns, through better handling of frames; better object memory layout and reduced memory management overhead.
Stage 3 -- Python 3.12 (requires runtime code generation): simple "JIT" compiler for small regions.
Stage 4 -- Python 3.13 (requires runtime code generation): extend regions for compilation; enhance the compiler to generate superior machine code.
Wild conversation over there. One excerpt, from Larry Hastings: "Speaking as the Gilectomy guy: borrowed references are evil. The definition of the valid lifetime of a borrowed reference doesn't exist, because they are a hack (baked into the API!) that we mostly 'get away with' just because of the GIL. If I still had wishes left on my monkey's paw I'd wish them away (1). (1) Unfortunately, I used my last wish back in February, wishing I could spend more time at home."

Steve #6: CPython core developer sprints
Hosted by pythondiscord.com
https://youtu.be/gXMdfBTcOfQ - Core dev Q&A

Extras
Brian: tools I found recently that are kinda awesome in their own way:
mcbroken.com - is the ice cream machine near you working? Just a funny single-purpose website.
vim-adventures.com - with a dash. Practice vim key bindings while playing an adventure game. Super cool.

Joke: Hacktoberfest 2020 t-shirt https://twitter.com/McCroden/status/1319646107790704640
5 Most Difficult Programming Languages in the World (not really long enough for a full topic, but funny; I think I'll cut short the last code example after we record), suggested by Troy Caudill. Author: Lokajit Tikayatray.
Malbolge, INTERCAL, Brainf*, COW, and Whitespace. Whitespace is my favorite: "Entire language depends on space, tab, and linefeed for writing any program. The Whitespace interpreter ignores non-Whitespace characters and considers them as code comments."
One thing I love about this article is that the author actually writes a "Hello World!" for each language. Example of "Hello World!" in Malbolge: (=
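As promised above, a quick sketch of the str-plus-Enum trick in action, including the one thing it doesn't fix. The behaviors are as described in the episode; the surrounding test values are ours:

    import random
    from enum import Enum

    class BRAIN(str, Enum):
        SMALL = 'small'
        MEDIUM = 'medium'
        GALAXY = 'galaxy'

        def __str__(self) -> str:
            return str.__str__(self)

    assert BRAIN.SMALL == 'small'          # direct comparison now works
    assert list(BRAIN) == [BRAIN.SMALL, BRAIN.MEDIUM, BRAIN.GALAXY]
    assert str(BRAIN.GALAXY) == 'galaxy'   # no .value needed
    # random.choice(BRAIN) still fails; you have to wrap it in list() first:
    print(random.choice(list(BRAIN)))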

All Jupiter Broadcasting Shows
2020-08-14 | Linux Headlines 188

All Jupiter Broadcasting Shows

Play Episode Listen Later Aug 14, 2020


Google could be extending its Firefox search royalty deal, PyPy leaves the Software Freedom Conservancy, Ubuntu puts out a call for testing, Linspire removes snapd support, Microsoft showcases its open source contributions, and Facebook joins The Linux Foundation.

Python Bytes
#191 Live from the Manning Python Conference

Python Bytes

Play Episode Listen Later Jul 22, 2020 52:33


Special guest: Ines Montani

Michael #1: VS Code Device Simulator
Want to experiment with MicroPython? Teaching a course with little IoT devices?
Circuit Playground Express; BBC micro:bit; Adafruit CLUE with a screen.
Get a free VS Code extension that adds a high fidelity simulator. Easily create the starter code (main.py), interact with all the sensors (buttons, motion sensors, acceleration detection, device shake detection, etc.), and deploy and debug on a real device when ready. Had the team over on Talk Python.

Brian #2: pytest 6.0.0rc1
New features: you can put configuration in pyproject.toml; inline type annotations (most user-facing API and internal code); new flags: --no-header, --no-summary, --strict-config (error on unknown config key), --code-highlight (turn on/off code highlighting in terminal); recursive comparison for dataclass and attrs. Tons of fixes. Improved documentation.
There's a list of breaking changes and deprecations, but really, nothing in the list seems like a big deal to me. Plugin authors, including myself, should go test this. Already found one problem: in pytest-check, stop-on-fail works fine, but failing tests marked with xfail show up as xpass. Gonna have to look into that, and might have to recruit Anthony to help out again.
To try it: pip install pytest==6.0.0rc1
I'm currently running through the pytest book to make sure it all still works with pytest 6. So far, so good. The one hiccup I've found so far: TinyDB had a breaking change with 4.0, so you need to pip install tinydb==3.15.2 to get the tasks project to run right. I should have pinned that in the original setup.py. However, all of the pytest stuff is still valid.
Guido just tweeted: "Yay type annotations in pytest!"

Ines #3: TextAttack
A Python framework for adversarial attacks and data augmentation for natural language processing.
What are adversarial attacks? You might have seen examples like these: an image classifier predicting a cat even if the image is complete noise; people at protests wearing shirts and masks with certain patterns to trick facial recognition; Google Translate hallucinating bible texts if you feed it nonsense or repetitive syllables.
What does it mean to "understand" a model? How does it behave in different situations, with unexpected data? We can't just inspect the weights - that's not how neural networks work. To understand a model, we need to run it and find behaviours we don't like.
TextAttack lets you run various different "attacks" from the current academic literature. It also lets you create more robust training data using data augmentation - for example, replacing words with synonyms, swapping characters, etc.

Michael #4: What is the core of the Python programming language?
By Brett Cannon, core developer. Brett and I discussed a Python implementation for WebAssembly before: it would get Python into the browser, and, given that both iOS and Android support running JavaScript as part of an app, it would also get Python onto mobile.
We have lived with CPython for so long that I suspect most of us simply think that "Python == CPython". PyPy tries to be so compatible that they will implement implementation details of CPython. Basically, most implementations of Python strive to pass CPython's test suite and to be as compatible with CPython as possible. Python's dynamic nature makes it hard to do outside of an interpreter. That has led Brett to contemplate the question of what exactly is Python?
How much would one have to implement to compile Python directly to WebAssembly and still be considered a Python implementation? Does Python need a REPL? Could you live without locals()? How much compatibility is necessary to be useful? The answer dictates how hard it is to implement Python and how compatible it would be with preexisting software. [Brett] has no answers.
It might make sense to develop a compiler that translates Python code directly to WebAssembly and sacrifices some compatibility for performance. It might make sense to develop an interpreter that targets WebAssembly's design but maintains a lot of compatibility with preexisting code. It might make sense to simply support RustPython in their WebAssembly endeavours. Maybe Pyodide will get us there.
Michael's thoughts: how about a Python standard language spec? A standard-library "standard" spec? It's possible - .NET did it. What would we build if we could build it with WebAssembly? Interesting options open up, say with NodeJS-like capabilities and front-end frameworks. This could be MUCH bigger if we got browser makers to support alternative runtimes through WebAssembly.

Brian #5: Getting started with Pathlib
Chris May. Blog post: Stop working so hard on paths. Get started with pathlib! PDF "field guide": Getting started with Pathlib.
A really great introduction to pathlib. Some of the info (a runnable version appears after these notes):
This file as a path object: Path(__file__)
Parent directory: Path(__file__).parent
Absolute path: Path(__file__).parent.resolve()
Two levels up: Path(__file__).resolve(strict=True).parents[1] (see the PDF for an explanation)
Current working dir: Path.cwd()
Path building with /, working with files and folders, using glob, finding parts of paths and file names.
Any time spent learning pathlib is worth it. If I can do it in pathlib, I do. It makes my code more readable.

Ines #6: Data Version Control (DVC)
We're currently working on v3.0 of spaCy, and one of the big features is going to be a completely new way to train your custom models, manage end-to-end training workflows, and make your experiments reproducible. It will also integrate with a tool called DVC (short for Data Version Control), which we've started using internally.
DVC is an open-source tool for version control, specifically for machine learning and data. Machine learning = code + data. You can check your code into a Git repo, but you can't really check in your datasets and model weights, so it's very difficult to keep track of changes. You can think of DVC as "Git for data", and the command line usage is actually pretty similar - for example, you run dvc init to initialize a repo and dvc add to start tracking assets.
DVC lets you track any assets by adding meta files to your repository. So everything, including your data, is versioned, and you can always go back to the commit with the best accuracy. It also builds a dependency graph based on the inputs and outputs of each step, so you only have to re-run a step if things changed. For example, you might have a preprocessing step that converts your data and then a step that trains your model; if the data hasn't changed, you don't have to re-run the preprocessing step.
They recently released a new tool called CML (short for Continuous Machine Learning), which we haven't tried yet: CI for machine learning. Previews look pretty cool: you can submit a PR with some changes, and a GitHub action will run your experiment and auto-comment on the PR with the results, changes in accuracy, and some graphs (similar to tools like code coverage, etc.).
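As referenced above, a runnable version of the pathlib one-liners from Brian's topic; the config path and glob pattern here are invented for illustration:

    from pathlib import Path

    here = Path(__file__).resolve()        # this file, as an absolute Path
    project = here.parents[1]              # two levels up
    print(here.parent, Path.cwd())         # parent directory, current working dir

    config = project / "config" / "settings.toml"   # path building with '/'
    print(config.name, config.stem, config.suffix)  # settings.toml, settings, .toml

    for py_file in project.glob("**/*.py"):         # recursive glob
        print(py_file.relative_to(project))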
Extras
Michael: Podcast Python Search API package, by Anton Zhiyanov. Mid-string f-string upgrades coming to PyCharm. And Flynt! (via Colin Martin)
Ines: built-in generic types in 3.9 (PEP 585): you can now write list[str]!
Brian: https://testandcode.com/120: FastAPI & Typer - Sebastián Ramírez

Jokes
FastAPI Job Experience, Sebastián Ramírez - @tiangolo: "I saw a job post the other day. It required 4+ years of experience in FastAPI. I couldn't apply as I only have 1.5+ years of experience, since I created that thing. Maybe it's time to re-evaluate that 'years of experience = skill level'."
Defragged Zebra

newline
Serverless on AWS Lambda with Stephanie Prime

newline

Play Episode Listen Later Jun 17, 2020 60:46


Nate: [00:00:00] Steph, just tell us a little bit about your work and your background with AWS, and what you're doing now.

Steph: [00:00:06] Yes, so I work as an engineer for a managed services provider called Second Watch. We basically partner with other big companies that use AWS, or sometimes other clouds like Azure, to manage their cloud infrastructure. We help big companies whose focus may not be technology or cloud stuff in general, and we're able to optimize the cost of everything, make sure that things are running reliably and smoothly, and work with AWS directly to keep people ahead of the curve when new stuff is coming out. It changes so much, you know; it's important to be able to adapt.

Personally, my role is that I develop automation for our internal operations teams. We have a bunch of really smart people who are always working on customer-specific AWS issues, and we see some of the same issues pop up over and over: security, auditing, cost optimization. So my team builds automations that we can distribute to all of these clients, each of whom maintains their own AWS account. Your AWS account is yours, as are your resources; for security reasons you don't want to put all of your stuff on somebody else's account. But managing them all the same way can be really difficult, even with scripts, because permissions in different places have to be granted through AWS Identity and Access Management. Serverless gave us the tool we needed to do this at scale: say we come up with a little script that runs on an hourly basis to check how much usage some servers are getting, and if they're not production servers, spins them down when they're not in use to save money. Little things like that. When it comes to operations, AWS Lambda is just so good for it, because it's all about doing things reliably, in ways that can be audited and logged, and for a decent price.

Background-wise, I used to work at AWS, in AWS support actually, and I supported some of their DevOps products: OpsWorks, which is based on Chef, for configuration management; Elastic Beanstalk; and AWS CloudFormation specifically. After working there for a bit, I really got to see how it's made and what the underlying system is like, and it was just crazy to see how much work goes into all this, just so you can have a supposedly easier-to-use front end. But serverless kind of changed all that for the better, luckily.

Amelia: [00:02:57] So it sounds like AWS has a ton of different services. What are the main ones, and how many are there?

Steph: [00:03:04] I don't think I can even count anymore, because they release new ones all the time. Maybe a little over a hundred would be a good estimate. But the main ones: EC2, which is Elastic Compute, is the bread and butter.

Historically, AWS is just virtualized servers, basically. With EC2, the thing that made AWS really special from the beginning, and that made cloud start to take over the world, was the concept of auto scaling groups. Those are basically definitions you attach to EC2 that allow you to say: hey, if I start getting a lot of traffic on this one type of server, create a second server that looks exactly the same and load balance the traffic between them. So when people say scaling, that's basically how you scale EC2: you use auto scaling groups and elastic load balancers to distribute the traffic.

The other big thing, besides the scalability you get with auto scaling groups, is redundancy. There's this idea of regions within AWS, and within each region there are availability zones. A region is the general place where the data centers are located, within a small degree. Virginia is one, that's us-east-1, the oldest one. Another is in California, but they're all over the world now. The idea is you pick a region to minimize latency, so you pick the region that's closest to you. Then within the region there are availability zones, which are discrete physical locations of the servers. You administer them the same way, but they're protected from each other: if a tornado runs through and hits one of your data centers, and you happen to have your servers distributed between two different availability zones, you'll still be able to serve traffic. One zone will go down, but the elastic load balancer will notice that it's not responding and send the traffic to the other availability zone. Those are the main concepts that make EC2 what it is; they're what you need to use it effectively.

Nate: [00:05:12] So an EC2 instance would be like a virtual server. I mean, it's not quite a Docker container, I guess we're getting into nuance there, but it's basically a server that you have command line access to. You can log in and do more or less whatever you want on an EC2 instance.

Steph: [00:05:29] Right, exactly. AWS used to use what was called Xen virtualization to do it, and that's just like running Xen on your own computer: you can get a computer and set up a virtual machine, almost just like they used to do it. They're constantly putting out new ways of virtualizing more efficiently, so they have new technology now, but it's not something that was unknown; it was well known, but they took it to a new kind of scale, which made it really impressive.

Nate: [00:05:56] Okay, so EC2 lets you have full access to the box that's running, and you might load balance requests against that. How does that contrast with what you do with AWS Lambda and serverless?

Steph: [00:06:09] With EC2, you still have to either secure shell in or, if you're on Windows, use RDP or something to actually get in there. You care about what ports are open; you have security groups for that. You care about all the stuff you would care about with a normal server: is it patched and up to date, what's the current memory and CPU usage, all those things. They don't go away on EC2 just because it's cloud. When we start bringing serverless into the mix, suddenly they do go away. I mean, there are still a few limitations.
Like, for instance, a Lambda has a limit on how much memory it can process with, just because they have to have a way to keep costs down, define the units, and decide where to put them. But at its core, a Lambda actually runs on a Docker container. You can think of it like a pre-configured Docker container with some pre-installed dependencies. For Python, it has the version of Python it says it has, it has boto, it has the stuff it needs to execute, and it's structured like basic Linux: there's a /tmp, and you can write files there. But really it's like a Docker container that runs underneath on a fleet of EC2 instances. As far as availability zone distribution goes, that's already built into Lambda; you don't have to think about it. With EC2 you do have to think about it, because if you only run one EC2 server in one availability zone, it's not really different from just having a physical server somewhere with a traditional provider.

Nate: [00:07:38] So there are these two terms: there's serverless, and there's Lambda. Can you talk a little bit about the difference between those two terms and when to use each appropriately?

Steph: [00:07:48] Yeah, so they are, in a way, sort of interchangeable. Serverless technology just means the general idea of: I have an application, I have it defined in an artifact, say a zip from our git repo. That application is my main artifact, and I pass it to a service somewhere. Google App Engine is a type of serverless technology, and AWS Lambda is just the specific AWS serverless technology. The reason AWS Lambda is, in my opinion, so powerful is that it integrates really well with the other features of AWS. Permissions management works with AWS Lambda, API gateway integrates with it; there are a lot of really tight integrations you can make with Lambda, so it's not like you have to keep half of your stuff in one place and half somewhere else. I remember when Heroku was really big: a lot of people were maintaining an AWS account and also maintaining a bunch of stuff in Heroku and trying to make them work together. Even though Heroku used AWS on the back end, or at least it did back then, it can just make things more complicated. The whole serverless idea of the artifact is that you make your code general; it's like a little microservice, in a way. So I can take my serverless application and, ideally, if it's just Python and I write it the right way, getting it to work on a different serverless back end isn't too much of a change; Azure has one, and there's Google App Engine. There are some changes to permissions and the way that you invoke it, but at the core of it, the real resource is the application itself. It's not how many units of compute it has, how much memory, what the IP address rules are, and all that.

Nate: [00:09:35] So what are some good apps to build on serverless?

Steph: [00:09:39] You can build almost anything today on serverless. There's so much support, especially with AWS Lambda, for integrations with all these other kinds of services, that the stuff you can't do is getting more limited. But there is a trade-off with cost. To me, the situation where it shines, where I would for no reason ever choose anything but serverless, is if you have something that's kind of bursty. Say you're making a report generation tool that needs to run, but you really only run it three times a week. Things that need to live somewhere, that need to be consistent, stable, and available, but you don't know how often they're going to be called. And even if they're called only a small number of times, the cool thing about serverless is that you're charged per every 100 milliseconds of time that it's being processed. With EC2, you're charged in units that used to be by the hour; I think they finally fixed it, and it's down to smaller increments. But if you can write it efficiently, you can save a ton of money by doing it this way, depending on what the use case is. Some stuff, like API gateway used with Lambda, can actually be a lot more expensive than the Lambda will be. But you don't have to worry about redundancy, because otherwise you have to run a minimum of two EC2 servers just to keep both up for an AZ-outage situation. You don't have to worry about that with Lambda. So: anything with lower usage, 100%; if it's bursty, 100% use Lambda; and if it's one of those things where you just don't have many dependencies on it, Lambda is a really good choice as well.

There's especially infrastructure management. Werner Vogels wrote something recently about how serverless-driven infrastructure automation is going to be the key point in making places that are using cloud use it more effectively. And that's one group of people, a big group: if you're a big company and you already use AWS and you're not getting everything out of it that you thought you would, sometimes there are serverless use cases that already exist out there. There's a serverless application repo that AWS provides, and AWS Config integrations, so that you can trigger a serverless action based off of some other resource's actions. Say your auto scaling group scaled up and you wanted to notify somebody; there are so many things you could do with it. It's super useful for that. But even if you're coming at it from a blank slate and you want to create something, there are a lot of really good use cases for serverless. Like I said: if you're not really sure how it's going to scale, you don't want to deal with redundancy, and it fits into something fairly well defined, where it's pretty much all Python and works with minimal dependencies, then it's a really good starting place.

Nate: [00:12:29] You mentioned earlier that serverless is very good when you have bursty services, in that if you did it with EC2 and also wanted redundancy, you'd have to run at least two EC2 instances 24 hours a day and pay for those. Plus you're also going to pay for API gateway.
Do you pay hourly for API gateway?

Steph: [00:12:53] It would work the same either way, but in that case you would pay for something like a load balancer.

Nate: [00:12:59] What is API gateway? Do you use that for serverless?

Steph: [00:13:02] All the time.

Nate: [00:13:04] Yeah. Tell us the elements of a typical serverless stack. So I understand there's Lambda, for example; maybe you use CloudFront in front of your Lambda functions, which maybe store images in S3. Is that a typical stack? And can you explain what each of those services is and how you would do that?

Steph: [00:13:22] Yeah, you're on the right track here. So, okay, a good way to think about it is this: AWS has published something, with a lot of documentation on it, called the serverless application management standard, SAM. If you look at that, it defines the core units of a serverless application, which are the function, an API (if you want one), and basically any other permission-type resources. In your case, let's say it was something really basic. A tutorial that AWS provides is: someone wants to upload an image for their profile, and you want to scale it down to a smaller image before you store it in your S3, just so they're all the same size and it saves you a ton of storage. If you're creating something like that, the AWS resources you would need are basically an API gateway, which acts as the definition of your API schema. If you've ever used Swagger or OpenAPI, those standards where you basically just define, in JSON: this is a REST API, you do GET here, POST here, this resource name. That's a standard you might see outside of AWS a lot, and API Gateway is basically a way to implement that same standard, and it works with AWS. So that's how you can think of API gateway. It also manages stuff like authentication integration; if you want to enable OAuth or something, you can set that up at the API gateway level.

Nate: [00:14:55] So if you had API gateway set up, is that basically a web server hosted by Amazon?

Steph: [00:15:03] Yeah, that's basically it.

Nate: [00:15:05] And so your API gateway is just assigned an essentially random DNS name by Amazon. If you wanted a custom DNS name for your API gateway, how do you do that?

Steph: [00:15:21] Oh, it's just a setting. If you already have a domain name, Route 53, which is AWS's domain name management service, can be used to point that domain to the API gateway.

Nate: [00:15:35] So you'd use Route 53: you configure your DNS to have Route 53 point a specific DNS name to your API gateway, and your API gateway would be like a web server that also handles things like authentication and AWS integration. Okay, got it.

Steph: [00:15:51] Yeah, that's a good breakdown of how that works. So that's the first half of how people commonly trigger Lambdas. It's not the only way to trigger them, but it's a very common way. What happens is, when the API gateway is configured, part of it is that you set what happens when a method is invoked. There's the REST API type of API gateway that people use a lot; there are a few others, like a WebSocket one, which is pretty cool for streaming uses, and they're always adding new options to it. It's a really neat service. So you'd have that kind of input go into your API gateway, and it decides where to route it. In a lot of cases you might say the Lambda function is where it gets routed to; that's considered the integration. And basically API gateway passes the Lambda all of the stuff from the request that you want to pass it: I want to give it the content that was uploaded, I want to give it the IP address it originally came from, whatever you want to give it.

Nate: [00:16:47] What backend would people use for API gateway other than Lambda? Like, would you use an API gateway in front of an EC2 instance?

Steph: [00:16:56] You could, but there I would probably use a load balancer or application load balancer instead. There are a lot of things you can integrate it with, though. Another cool one is AWS API calls: it can proxy, so it can take input from an API and send it directly to a specific AWS API call if you want. That's some advanced usage, but Lambdas are what I see as the go-to right now.

Nate: [00:17:20] So the basic stack that we're looking at is: you use API gateway in front of your Lambda function, and then your Lambda function basically does the work, either writing to some sort of storage or calculating some sort of response. You mentioned earlier that the Lambda function can be fronted by an API if you want one, and that there are other ways to trigger these Lambda functions. Can you talk a little bit about what some of those other ways are?

Steph: [00:17:48] Yeah, those are really cool. You can trigger it based off of basically any type of CloudWatch event; that's a big one. CloudWatch is a monitoring and auditing kind of service that AWS provides, so you can set triggers that go off when alarms fire. It could be usage; it could be "hey, somebody logged in from an IP address we don't recognize". You can do some really cool stuff with CloudWatch events specifically, and for management purposes those are really powerful to leverage. But you can also do it off of S3 events, which are really cool. Say it was a CI build: you're doing CI builds and you're putting your artifacts into an S3 bucket, so you know this is release version 1.1 or whatever. You can hook it up so that whenever something gets put into that S3 bucket, another action takes place. So you can make it so that whenever we upload a release, it notifies these people, sends an email, or, as complicated as you want, triggers a different part of your build stage. If you have things that are outside of AWS, you can have it trigger from there. There are a lot of really cool direct integrations that you don't need an API for. S3 is a good one; the notification service SNS is used within AWS a lot; the queuing service AWS provides, SQS, works with it; and also just scheduled events, which I really like because they replace the need for a cron server.
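A minimal sketch of the S3-triggered pattern Steph describes here. The event shape is the standard S3 notification structure; the bucket contents and the reporting are invented for illustration:

    # Hypothetical S3-triggered Lambda: runs whenever an object lands in
    # the configured bucket (e.g. a CI artifact upload).
    def handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            size = record["s3"]["object"].get("size", 0)
            # Real code might notify a team channel, kick off a build
            # step, etc. Here we just report what arrived.
            print(f"New artifact: s3://{bucket}/{key} ({size} bytes)")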
So if you have things that run, say, every Tuesday, you can just trigger your Lambda from one configuration point; you don't have to deal with anything more complicated than that.

Nate: [00:19:38] I feel like that gives me a pretty good grounding in the ecosystem. Maybe we could talk a little bit more about tools and tooling. I know that in the JavaScript world, the Node world, they have the Serverless Framework, which is basically this abstraction over Lambda and, I think, Azure Functions and probably Google Cloud too. Do they have a serverless framework for Python, or is there a framework that you end up using? Or do you generally just write straight on top of Lambda?

Steph: [00:20:06] That's a great question, and I definitely do recommend one. Even though there is a front end where you could just start typing code in and make the Lambda work, it's definitely better to have some sort of framework that integrates with wherever you store your code and test it and that kind of thing. Serverless is a really big one there, and it's kind of annoying, because "serverless" also refers to the greater ecosystem of code that runs without managing underlying servers, but in this particular case Serverless is also a third-party company and tooling, and it does work for Python; it works for a whole bunch of languages. It's kind of the serverless equivalent, in my head, of Terraform: something that's meant to be generic, and it offers a lot of value to people just getting started. If you just want to put something in your readme that says "here's how to deploy this from GitHub", Serverless is a cool tool for that. I don't personally like building with it myself, because I find that SAM, the Serverless Application Model (I think I said "management" earlier, but it's actually "model"; I just looked that up), has everything I really want for AWS, and I get more fine-grained control. I don't like having too much abstraction, and I also don't like it when you have to install something, something changes between versions, and that changes the way your infrastructure gets deployed. That's a pet peeve of mine, and it's why I don't really use Terraform much, for the same reason. When you're operating really in one world, which in my case is AWS, I just don't get the value out of that. But with the Serverless Application Model there's a whole SAM CLI, they have a bunch of tools coming out, and so many examples on their GitHub repos as well. I find it's got everything I want to use, plus CloudFormation plugs right into it, so if I need to do anything outside the normal serverless kind of world, I can do that. It's better to use Serverless than to not use anything at all; I think it's a good tool and a really good way to get used to this and get started. But in my case, where it really matters to have super consistent deployments shared between people and accounts, I find that SAM gives me the best of both worlds.

Amelia: [00:22:17] So, as far as I understand it, serverless is a fairly new concept.

Steph: [00:22:22] You know, it's one of those things that's catching on. Google App Engine came out a long time ago, and it was kind of a niche thing for a while, but recently we're starting to see bigger enterprises, people who might not necessarily want bleeding-edge stuff, start to accept that serverless is going to be the future. That's why we're seeing all this stuff come up, and it's actually really exciting. The good thing is it's been around long enough that a lot of the actual tooling and the architecture patterns you'll use are mature; they've been used for years. There are sites you've been using for a long time that you don't know are serverless on the back end, but they are, because it's one of those things that doesn't really affect you unless you're working on it. So it's new to a lot of people, but I think it's at a good spot where it's more approachable than it used to be.

Nate: [00:23:10] When you say there are a lot of standard patterns, maybe we could talk about some of those. When you write a Lambda function, you code it in Python or JavaScript or whatever; let's say Python, because you use Python primarily, right? Maybe we could talk a little bit about that. Why do you prefer Python?

Steph: [00:23:26] Yeah, it just comes from my background, which is support and straight DevOps, a more sysadmin-y kind of background from before the world became a more interesting place. Python is one of those tools that's installed on, like, every Linux server in the world and works predictably. Enough people know it that it's not too hard to share between people who may not be super advanced developers; a lot of people I work with have varying levels of skill, and Python is one you can start to pick up pretty quickly. And it's not too foreign to people coming from other languages, either. So a lot of it is practicality. But also, a lot of the tooling around DevOps-type stuff is in Python. Ansible, for configuration management, a super useful tool: it's all Python. So there are a lot of good reasons to use Python. In my world you don't have to use one specific language, but Python has what I need, I can work with it pretty quickly, the ecosystem is developed, a lot of people use it, and it's a good tool for what I have to do.

Nate: [00:24:35] Yeah, there's tons. I was looking at the metrics; I think Python was one of the fastest growing programming languages last year. There's a lot of new people coming into Python.

Steph: [00:24:44] And a lot of it is data science people too, right? People who may not necessarily have a strong programming background, but the tooling they need is in Python already. There's community, and it helps that it's not as scary-looking as some other languages, frankly.

Nate: [00:24:58] And what are some of the other cloud libraries that Python has? I've seen one that's called, like, Boto?

Steph: [00:25:03] Boto is the one that Amazon provides as their SDK, basically. Every Lambda comes bundled with boto3 by default. There was an older version, boto, too, but boto3 is the main one everyone uses now. Boto is great; I use it extensively. It's pretty easy to use: a lot of documentation, a lot of examples, a lot of people answering questions about it on StackOverflow. Really, every language has an SDK for AWS these days, and they all pretty much work the same way, because they're all just based off of the AWS APIs, and all the APIs are well defined and pretty stable, so it's not too much of a stretch to use any other language. But boto3 is the big one. The requests library in Python is super useful too, just because it makes it easier to deal with HTTP requests to APIs. Some of the newer Python 3 libraries are really good as well, just because they keep improving; with Python 2 there was urllib for processing requests, and it was just not as easy to use, so people would always bundle a third-party tool like requests, but it's getting better. Also, for testing there are some different options, pyunit and unittest, and really a bunch of libraries that are well maintained by the community. There are a kazillion on PyPI, but I try to keep outside dependencies in Python to a total minimum, because again, I just don't like when things change how they function underneath me. I can do a lot without installing third-party libraries, so wherever I can avoid it, I do.

Nate: [00:26:47] So let's talk a little bit about those patterns. Lambda functions generally have a pretty well defined structure, and it's basically through that convention that it's somewhat straightforward to write each function. Can you talk a little bit about, I don't know, the anatomy of a Lambda function?

Steph: [00:27:05] Yeah. At its basic core, the most important thing, which every Lambda function in the world has, is something called a handler. The handler is basically the function that's used to begin execution. Any time a Lambda is invoked (anytime you call it, it's called invoking a Lambda function), it's sent two parameters. One is an event, which is basically just the data that defines "hey, this is the stuff you need to act on". The other is something called context. A lot of the time you never touch the context object, but it's useful, because AWS provides it with every Lambda, and it tells you things like: this is the ID of the currently running Lambda function, this is where you're running, this is the Lambda's name. For logging, context can be really useful, and for stuff where your function code needs to know something about where it is, you can save yourself an environment variable lookup by checking the context object. So at the core, you have at least one file. You can name it whatever you want; a lot of people call it index. Within that file you define a function called handler. Again, it doesn't have to be called handler, but that makes it easy to know which one it is, and it takes that event and context. And really, if that's all you have, you can literally have your Lambda be one Python file that says def handler, takes those two objects, and returns something. That can be it.
As long as you define index.handler as your handler resource... that's a lot of words, but it's basically how we find your Lambda within AWS. The required parameters are basically the handler URI, which is the name of the file, then a dot, then the name of the handler function. That's it at its most basic; every Lambda has that. But then you start scoping it out so you can actually organize your code decently. It's just like any other Python application, really: is there a readme there? Do you want a requirements.txt file to define the exact third-party libraries you're going to use? That's really useful. And if you're defining it with SAM, which I really recommend, there's a file called template.yaml, which contains the actual AWS resource definition, as well as any CloudFormation-defined resources that you're using. So the template.yaml is the infrastructure as code, and everything else is just the code as code.

Nate: [00:29:36] Okay. So unpacking that a little bit: you'll invoke this function, and there will basically be two arguments. One is the event that's coming in. The event in particular, and then it'll also have the context, which is sort of metadata about the context in which this request is running. You mentioned some of the things that come in the context, like what region you're in or the name of the function you're running. What are some of the parameters in the event object?

Steph: [00:30:02] The interesting thing about the event object is that it can be anything. It just has to be basically a Python dictionary; you can think of it like JSON. It's not predefined, and Lambda itself doesn't care what the event is. It's all up to your code to decide what a valid event is and how to act on it. If you're using API gateway, there are a lot of example events API gateway will send; if you ever look at the test events for Lambda, you'll see a lot of templates, which are just JSON files with the expected shapes. But really, it can be anything.

Nate: [00:30:41] So the way that Lambda's structured is that API gateway will typically pass in an event; maybe the request was a POST request, and it has these query parameters or headers attached to it, and all of that would be within the request object. But the event could also be, like you mentioned, a CloudWatch event. You basically just have to configure your handler to handle any of the events you expect that handler to receive.

Steph: [00:31:07] Yeah, exactly.

Nate: [00:31:09] So let's talk a little bit more about the development tooling. How in the world do you test these sorts of things? Do you have to deploy every single time, or tell us about the development tooling that you use to test along the way.

Steph: [00:31:22] One of the great things about SAM, and there are some other tools for this as well, is that it lets you test your Lambdas locally before you deploy, if you want. The way it does that: I mentioned earlier that Lambda is really, at its core, a container, like a Docker container running on a server somewhere. SAM just creates a Docker container that behaves exactly like a Lambda would, and it sends it your events. You would just define a JSON with the expected data from API gateway or whatever, make a test event, and it will send it to the container. It builds it on demand for you, and you test it all locally with Docker. When you like it, you use the same tool, and it packages it up and deploys it for you. So yeah, it's actually not too bad to test locally at all.

Nate: [00:32:05] So you create JSON files of the events that you want it to handle, and then you just invoke it with those particular events.

Steph: [00:32:12] Yeah. Basically, if I created a test event, I would save it to my repo as something like tests/api-gateway-event.json, put in the data I expect, and then do a SAM invocation. The command is sam local invoke, and I give it the file path to the JSON, and it processes it. I see the exact same output that I would expect to see from Lambda: it'll say, hey, this took this many milliseconds to invoke, the response code was this, this is what was printed. It's almost one-to-one with what you would get from the actual Lambda output.

Amelia: [00:32:50] And then to update your Lambda functions, do you have to go inside the AWS GUI, or can you do that from the command line?

Steph: [00:32:57] No, you can do that from the command line with SAM as well. There are sam package and sam deploy commands. That's useful if you need to use any type of CI testing service to manage your deployments: you can package it and send the package to whatever you're using, GitLab or something, for further validation, and then have GitLab deploy it. If you don't want people to have deploy credentials on their local machines, that's the reason it's broken up into two steps. But basically you just run a command, sam deploy, and what it does is go out to Amazon and say: hey, update the Lambda to point to this as the new resource artifact to be invoked. And if you're using the versioning feature, which I think is enabled by default, it actually just adds another version of the Lambda, so if you need to roll back, you can just go to the previous one, which is really useful sometimes.

Nate: [00:33:54] So let's talk a little bit about deployment. One of the things that I find stressful when deploying Lambda functions is that I have no idea how much it's going to cost to launch something and how much I'm going to pay. I guess you can kind of calculate it if you estimate the number of requests you think you're going to get, but how do you approach that when you're writing a new function?

Steph: [00:34:18] Yeah. The first thing I look at is the timeout and the minimum memory usage. The number of invocations is a big factor: on the free tier, I think you get a million invocations, assuming they're under 100 milliseconds each. When you just deploy it, there's no cost for deploying; you don't get charged until it's invoked. If you're storing an artifact in S3, there's a small cost for keeping it in S3, but it's usually really minimal. So the big thing is: how many times are you invoking it? Is it over a million times, and are you on the free tier or not? The costs, like I said, get batched together, and it's actually pretty cheap just in terms of number of invocations. The bigger place where you can usually save costs is when it's over-provisioned for how much memory you give it. I think the smallest unit you can give it is 128 MB, and it can go up to like two gigabytes, maybe more now. So if you have it set to use a certain amount of memory and it really never uses that much, that's kind of wasteful; and if it actually uses that much, something's probably wrong.

Nate: [00:35:25] Because you configure it beforehand, like "we're going to use a max of 128 megabytes of memory", and then it's allocated on invocation or something like that. And if you set it too high, you're going to pay more than you need to. Is that right?

Steph: [00:35:40] Yeah. Well, it's more like... I'd have to double check, because it actually shows you how much memory you used each time a Lambda is invoked, so you can measure whether it's getting near the limit; and if you think you need more, it might give an error if it isn't able to complete. But in general, I haven't had many cases where memory was the limiting factor. I will say the timeout can sometimes get you, because if a Lambda is processing forever... say you're behind API gateway, and API gateway has its own sort of timeout, which I think is about 30 seconds to respond. If your Lambda is set to five minutes of processing time, and you programmed something wrong and there's a loop going on forever, it'll waste five minutes of compute. API gateway will give up after 30 seconds, but you'll still be charged for the five minutes that the Lambda was doing its thing.

Nate: [00:36:29] I know that AWS's services and Lambda are created by world-class engineers; it's probably the highest performing infrastructure in the world. But as a user, sometimes it feels like a giant Rube Goldberg machine, and I have no idea about all the different aspects that are involved. How do you manage that complexity? When you're trying to learn AWS, let's say someone listening to this wants to understand it, how do you go about making sense of all of that difficulty?

Steph: [00:37:02] You really have to go through a lot of the docs. Videos of people showing you how they did something aren't always the best, just because they kind of skirt around all the things that went wrong in the process. So it's really important to look at the documentation for what all these features are before you use them. The marketing people make it sound like it's super easy and go, and to a degree it really is easier than the alternative. It's about where you put your complexity.

Nate: [00:37:29] Yeah, and I think part of the problem I have with their docs is that they're trying to give you every possible path, because they're an infrastructure provider, and so they support these very complex use cases. It's like the world's most detailed choose-your-own-adventure: have you decided you need to take this path? Go to path A, or path B, or path C. There are so many different paths you can go down.
It's just a lot when you're first learning.

Steph: [00:37:58] It is. And sometimes the blog posts have better actual tutorials for real use cases, so if you have a use case that's general enough, a lot of times you can just Google for it, and there'll be something one of their solution architects wrote up about how to do it, from a user-friendly perspective. The thing with all the options is that you need to be aware of them, because the way they interact can be really important if you ever do something that hasn't been done before. The reason it's so powerful, and why it takes all these super smart people to set up, is that there are just so many variables that go into it that you can do so much with, and it's so easy to shoot yourself in the foot. It always has been, in a way. It's about learning how to not shoot yourself in the foot and use it with the right agility, and once you get that down, it's really great.

Amelia: [00:38:46] So there are over a hundred AWS services. How do you personally find new services that you want to try out? How does anyone discover any of these services?

Steph: [00:38:57] What I do is get the emails from AWS whenever they release new ones, and I try to keep up to date with that. Sometimes I'll read blog posts that I see people writing about how they're using some of them. But honestly, a lot of it is just keeping an eye out while I'm doing something, for things I wish it did. I use AWS Systems Manager a lot, which is basically a config management and orchestration tool: a little agent you can install on EC2 servers that lets you automate patching and all the other little stuff that you would do with Chef or Puppet or other config management tools. And it seems like they keep announcing services that are really just tie-ins to existing ones, like the secrets management in Parameter Store; a lot of them are really just integrations with other AWS services. The really core ones that everyone needs to know are EC2, of course; Lambda; API gateway; CloudFormation, because it's basically the infrastructure-as-code format that's super useful for structuring everything; and I guess S3 is the other one.

Nate: [00:40:15] Let's talk about CloudFormation for a second. Earlier you said your Lambda function is typically going to have a template.yaml. Is that template.yaml CloudFormation code?

Steph: [00:40:26] At its core, yes, but the way you write it is different. How it works is that the SAM templating language is defined to simplify what you would do with CloudFormation. With CloudFormation you have to put in a gazillion variables, and there are some ways to make that easier. I really like using a Python library called troposphere, where you can actually use Python to generate your CloudFormation templates for you. It's really nice, because when I know I'll need a loop for something, or I need to fetch a value from somewhere else, it's great to have that kind of flexibility.
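A tiny sketch of the troposphere pattern Steph mentions, assuming only the library's basic Template/add_resource API; the per-environment bucket idea is ours, not the episode's:

    # pip install troposphere
    from troposphere import Template
    from troposphere.s3 import Bucket

    template = Template()
    # A Python loop stamping out repetitive CloudFormation resources --
    # exactly the kind of thing that is painful to write by hand.
    for env in ("dev", "staging", "prod"):
        template.add_resource(
            Bucket(f"ArtifactBucket{env.capitalize()}",
                   BucketName=f"example-artifacts-{env}")  # hypothetical names
        )
    print(template.to_json())  # feed the result to CloudFormation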
Steph: The SAM template is specifically a transform, is what they call it, of CloudFormation, which means that it executes against the CloudFormation service. So the CloudFormation service receives that, kind of turns it into the core that it understands, and executes on it. So at the core of it, it is executing on top of CloudFormation. You could usually create a mostly equivalent kind of CloudFormation template, but there's more to it. There's a lot of reasons why you would want to use SAM for serverless specifically, just because they add so many niceties and stuff around, you know, permissions management that you don't have to think about as much, and shortcuts, and it's just a lot easier to deal with, which is a nice change. But the power of CloudFormation is that if you wanted to do something that maybe SAM didn't support, that is outside the normal scope, you could just stick a CloudFormation resource definition in it, and it would work the same way, cause it's running against it. It's one of those services where people... sometimes it gets a bad rap because it's so complicated, but it's also super stable. It behaves in a super predictable way, and it's just... I think learning how to use that when I worked at AWS was really valuable.
Nate: [00:42:08] What tools do you use to manage the security when you're configuring these things? So earlier you mentioned IAM, which is a... I don't know what it stands for.
Steph: [00:42:19] Identity and access management.
Nate: [00:42:20] Right. Which is, like, a configuration language, or configuration where we can configure which accounts have access to certain resources. Let me give you an example. One question I have is: how do you make sure each system has the minimum level of permissions, and what tools do you use? So for example, I wrote this Lambda function a couple of weeks ago. I was just following some tutorial, and they said, like, yeah, make sure that you create this IAM role as, like, one of the resources for launching this Lambda function, which, I think that's great. But then, like, how do I pin down the permissions when I'm granting that function itself permissions to grant new IAM roles? So it was like, I basically just had to give it root permission, according to my skill level, because otherwise I wasn't able to create IAM roles. Without the authority to create new roles... which just seems like root permissions.
Steph: [00:43:13] Yes. So there are some ways... that's super risky, honestly, like, super risky.
Nate: [00:43:17] Yeah. I'm going to need your help.
Steph: [00:43:19] But it is a thing that you can limit down with the right kind of definition. So IAM is really powerful, right? So the original case behind IAM roles was for servers, so that if you had an application server and a database server separately, you could give them separate IAM roles, so that they could have different things they could do. Like, you never really want your database server to, maybe, interface directly with, you know, an S3 resource, but maybe you want your application server to do that or something. So it was nice because it really let you limit down the scope for servers. And you'd have to leave keys around if you did it otherwise; you don't have to keep keys anywhere on the server if you're using IAM roles to access that stuff. So anytime you're storing, like, an AWS secret key on a server, or, like, in a Lambda, you kinda did something wrong. The thing there is, AWS doesn't really care about those keys; it just looks: is it a key? Okay, do it here. But when you actually use IAM policies, you could say: it has to be from this role, it has to be executed from, you know, this service. So it can make sure that it's Lambda that's the one doing this, and not somebody trying to assume Lambda credentials, right? There's so much you can do to kind of limit it with IAM, so it was really good to, like, learn about that. And, like, all of the AWS certifications do focus on IAM specifically. So if anyone's thinking about taking, like, an AWS certification course, a lot of them will introduce you to that, and help a lot with understanding, like, how to use those correctly. But for what you talked about, with a function that creates a role for another function, right: what you would do in that kind of case is, there's an idea of IAM paths. So basically you can use them as namespacing for IAM permissions, right? So you can make an IAM role that can create roles only underneath its own namespace, within its own path.
Nate: [00:45:20] When you say namespaces... does it inherit the permissions that the parent has?
Steph: [00:45:28] Depends. So it doesn't inherit itself. But, like, let's say that I was making a build server, and for my build server we had to use a couple of different roles for different pieces of it, for different steps, cause they used different services or something. So we would give it, like, the top-level one of build. And then in my S3 bucket, I might say: allow upload for anyone whose path had build in it. So that's the idea, that you can limit on the other side what is allowed. And so, of course, it's one of the things where you want to, by default, blacklist as much as possible, and then whitelist what you can. But in reality it can be very hard to go through some of that stuff, so you just have to try to, wherever you can, just minimize the risk potential, and understand what's the worst case that could happen if someone came in and was able to use these credentials for something.
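Her path-based namespacing idea might look something like this with boto3. This is a hedged sketch: the account ID, role names, and policy shape are assumptions for illustration, not anything from the episode:

    import json
    import boto3

    iam = boto3.client("iam")

    # A role that lives under the /build/ path...
    trust = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    iam.create_role(
        RoleName="build-orchestrator",
        Path="/build/",
        AssumeRolePolicyDocument=json.dumps(trust),
    )

    # ...that is only allowed to mint new roles underneath that same path,
    # instead of a root-equivalent iam:CreateRole on Resource "*".
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["iam:CreateRole", "iam:PutRolePolicy"],
            "Resource": "arn:aws:iam::123456789012:role/build/*",
        }],
    }
    iam.put_role_policy(
        RoleName="build-orchestrator",
        PolicyName="create-roles-under-build-only",
        PolicyDocument=json.dumps(policy),
    )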
Amelia: [00:46:16] What are some of the other common things that people do wrong when they're new to AWS or DevOps?
Steph: [00:46:22] One thing I see a lot is people treating the environment variables for Lambdas as if they were private, like secrets. So they think that if you put, like, an API key in through the environment variable, that that's kind of like secure. But really... like, I worked in AWS support; anyone would be able to see that if they were helping you out in your account. So it's not really a secure way to do that. You would need to use a service like Secrets Manager, or you'd have some kind of way where you would encrypt it before you put it in, and then the Lambda would decrypt it, right? So there's ways to get around that. But, like, using environment variables as if they were secure, or storing secure things within your git repositories that get pushed to AWS, is, like, a really big thing that should be avoided. And let's see, what else...
Nate: [00:47:08] I'm pretty sure that I put an API key in mine before.
Steph: [00:47:11] So yeah, no, it's one of the things people do, and it's one of those things that, for a lot of people, you know, maybe nothing will go wrong and it's fine. But if you can just reduce the scope, then you don't have to worry about it, and it just makes things easier in the future.
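The safer pattern Steph points at, Secrets Manager instead of environment variables, is only a few lines; the secret name below is hypothetical:

    import boto3

    def handler(event, context):
        sm = boto3.client("secretsmanager")
        secret = sm.get_secret_value(SecretId="prod/third-party-api-key")
        api_key = secret["SecretString"]
        # ...use api_key; it never sits in plain sight in the function's
        # environment-variable configuration.
        return {"statusCode": 200}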
Amelia: [00:47:27] What are, like, the new hot things that are up and coming?
Steph: [00:47:30] So I'd say that there's more and more kind of uses for Lambda at the edge, for, like, IoT integration, which is pretty cool. So basically, Lambda at the edge is basically where you can process your Lambdas on, like, computers out in the field; just think of it as, like, a Raspberry Pi, that kind of type thing, right? So you could take a small computer and you could put it, like, you know, maybe where it doesn't have a completely consistent internet connection. So maybe if you're doing, like, a smart vending machine or something, think of it like that. Then you could actually execute the Lambda logic directly there, and deploy it to there, and manage it from AWS whenever it does have, like, a network connection. And then, basically, it just reduces latency a lot, and lets you test your code both, like, locally, and then deploy it out. So it's really cool for, like, IoT stuff. There's been a ton of stuff happening in the machine learning space on AWS, too much for me to even keep on top of. But a lot of the stuff around Alexa voices is super cool, like Polly, where you can just... if you've played with your Alexa-type thing before, it's cool, but you could just write a little Lambda program to actually generate, you know, whatever you want it to say, in different accents, different voices, on demand, and integrate it with your own thing, which is pretty cool. Like, I mean, I haven't had a super great use case for that yet, but it's fun to play with.
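Playing with Polly really is just a few lines of Python; a sketch, with the text, voice, and file name all arbitrary:

    import boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text="Hold on, your build just finished.",
        OutputFormat="mp3",
        VoiceId="Amy",  # one of many available voices and accents
    )
    with open("announcement.mp3", "wb") as f:
        f.write(response["AudioStream"].read())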
Amelia: [00:48:48] I feel like a lot of the internet of things are like that.
Steph: [00:48:52] Oh, they totally are. They really are. But yeah, it's just one of the things you have to keep an eye out for. Sometimes the things that excite me, because I'm dealing so much with, like, enterprisey kind of stuff, are not really exciting to other people, cause it's like: yay, patching has a way to, like, lock it down to a specific version of this at this time. You know, it's, like, not really exciting, but... yeah.
Nate: [00:49:14] And I think that's one of the things that's interesting talking to you. Like, I write web apps, so I think of serverless from, like, a web app perspective. It's like: oh, I'm going to write an API that will, I don't know, fix my images on the way up or something. But a lot of the uses that you alluded to are, like, using serverless for managing other parts of your infrastructure. Like, you've got a monitor on some EC2 instance that sends out a CloudWatch alert, that, like, then responds in some other way within your infrastructure. So that's really interesting.
Steph: [00:49:47] Yeah, no, it's just been really valuable for us. And like I said, I mentioned the IAM stuff. That's what makes it all possible, really.
Amelia: [00:49:52] So this is totally unrelated, but I'm always curious how people got into DevOps. Because I do a lot of front end development, and I feel like it's pretty easy to get into front end web development, because a lot of people need websites. It's fairly easy to create a small website, so that's a really good gateway. But I've never, like, on the weekend, wanted to spin up a server or any of this.
Steph: [00:50:19] Honestly, for me, a lot of it was, like, my first job in college. Like, I was basically part-time tech support slash sysadmin. And I always loved Linux, because... the reason I got into Linux in the first place is, I realized in high school that I could get around a lot of the school's, like, you know, spy software that won't let you do fun stuff on the internet or with the software, if you just use a live-boot Linux USB. So part of it was just, I was using it to, you know, get around stuff, just curiosity about that kind of stuff. But when I got my first job, that kind of, like, sysadmin-type thing, it kind of became a necessity. Because, you know, when you have limited resources... it was, like, me, and, like, another part-time person, and one full-time person, and hundreds of people whose email and everything we had to keep working for them. It kind of becomes a necessity thing, cause you realize that, with all the stuff that you have to do by hand back then, you can't keep track of it all. You can't keep it all secured with a few people. It's extremely hard. And so one way people dealt with that was, you know, offshoring, or hiring other people to maintain it. But it was kind of cool at the time to realize that the same stuff I was learning in my CS program about programming, there's no reason I couldn't use that for my job, which was support and admin stuff. So I think I got introduced to, like, Chef. That was the first tool that I really... I was like, wow, this changes everything. You know, because you would write little Ruby files to do configuration management, and then your servers would... you know, you run the Chef agent, and then, you know, they'd all be configured exactly the same way. And it was testable. And there's all this really cool stuff you could do with Chef that I, you know, had been trying to do with, like, you know, bash scripts or just normal Python scripts. But then Chef kind of gave me that framework for it. And I got a job at AWS, where one of the main components was supporting their AWS OpsWorks tool, which was basically managed Chef deployments. And so that was cool, because then I learned about how does that work at super high scale, and what are other things that people use. And right before I, you know, got my first job as a full-time DevOps person was when they were releasing the beta for Lambda. So I was in the little private beta for AWS employees, and we were all kind of just like: wow, this changes a lot. It'll make our jobs a lot easier... you know, in a way it will reduce the need for some of it. But we were so overloaded all the time. And I feel like a lot of people from that perspective know what it feels like to be like: there's so much going on, and you can't keep track of it all, and you're overloaded all the time, and you just want it to be clean, and not have to touch it, and to do less work. And DevOps was kind of like the way forward. So that's really how I got into it.
Amelia: [00:52:54] That's awesome. Another thing I keep hearing is that a lot of DevOps tasks are slowly being automated. So how does the future of DevOps look, if a lot of the things that we're doing by hand now will be automated in the future?
Steph: [00:53:09] Well, see, the thing about DevOps is really... it's more of, like, a goal. It's an ideal.
A lot of people, if they're DevOps purists, they'll tell you that it means having a culture where there are not silos between developers and operations, and everyone knows how to deploy, and everyone knows how to do everything. But really, in reality, not everyone's a generalist. And being a generalist in some ways is kind of its own specialty, which is kind of how I feel about the DevOps role that you see. So I think we'll see that with the DevOps role, people might go by different names for the same idea, which is basically reliability engineering. Like, Google has a whole book about site reliability engineering; it's the same kind of philosophy, right? You want to keep things running, you want to know where things are, you want to make things efficient from an infrastructure level. But the way that you do it is, you use a lot of the same tools that developers use. So I think that we'll see titles shift, to, like, serverless architect; that's a big one that's coming up, because that reliability engineering is big. And we may not see people say DevOps is their role as much, but I don't think the need for people who kind of specialize in, like, infrastructure and deployment and that kind of thing is going to go away. You might have to do more with less, right? Or there might be certain companies that just hire a bunch of them, like Google and Amazon, right? There are probably still going to be a lot of people, but maybe they're not going to be working at your local place, because they're going to be working for the big people who actually develop the tools that are used for that resource. So I still think it's a great field, and it might be getting a little harder to figure out where to enter into it, just because there's so much competition and attention around the tools and resources that people use. But it's still a really great field overall. And if you just learn, you know, serverless or Kubernetes or something that's big right now, you can start to branch out, and it's still a really good place to kind of make a career.
Nate: [00:54:59] Yeah. Kubernetes... oh man, that's a whole nother podcast. We'll have to come back for that.
Steph: [00:55:02] Oh, it is. It is.
Nate: [00:55:04] So, Steph, tell us more about where we can learn more about you.
Steph: [00:55:07] Yeah. So I have a book coming out.
Nate: [00:55:10] Yes. Let's talk about the book.
Steph: [00:55:12] Yeah. So I'm releasing a book called Fullstack Serverless... see, I'm terrible. I should know exactly what the title is. I don't...
Nate: [00:55:18] know exactly the title. Yeah. Full stack Python with serverless, or full-stack serverless with Python?
Steph: [00:55:27] Full stack Python on Lambda.
Nate: [00:55:29] Oh yeah. Lambda. Not serverless.
Steph: [00:55:31] Yeah, that's correct. Python on Lambda. Right. And that book really... it could take you from start to finish, to really understand. I think if you read this kind of book... if I had read this before, like, learning it, it wouldn't feel so, maybe, for some people, confusing, or kind of like it's a black box where you don't know what's happening. Cause really, at its core, with Lambda you can understand exactly everything that happens. It has a reason. You know, it's running on infrastructure that's not too different from people who run infrastructure on Docker or something, right? And the code that you write can be the same code that you might run on a server or on some other cloud provider.
So the real things that I think the book has, that are maybe kind of hard to find elsewhere: there's a lot of information about how you do proper testing and deployment; how you manage your secrets, so you aren't storing those in those environment variables, correct; it has stuff about logging and monitoring; all the different ways that you can trigger Lambda, so API Gateway, you know, that's a big one, but then, I mentioned S3 and all those other ones, and there's going to be examples of pretty much every way you can do that in that book; stuff about optimizing cost and performance; and stuff about using SAM and the Serverless Application Repository, so you can actually publish Lambdas and share them, and even sell them if you want to. So it's really a start-to-finish, everything you need if you want to take something that you create from scratch into production. I don't think there's anything left out that you would need to know. I feel pretty confident about that.
Nate: [00:57:04] It's great. I think one of the things I love about it is, it's almost like the anti-version of the docs. Like, we talked about earlier that the docs cover every possible use case; this talks about, like, very specific, but, like, production use cases, in a very approachable, like, linear way. You know, even though you can find some tutorials online, maybe, like you mentioned, they're not always accurate in terms of how you actually do, or should do, it. And so, yeah, I think your book so far has been really great in covering these production concerns in a linear way. All right. Well, Steph, it's great to have you.
Steph: [00:57:37] Thank you for having me. It was, it was great talking to you both.

airhacks.fm podcast with adam bien
From Maxwell over Maxine to Graal VM, SubstrateVM and Truffle

airhacks.fm podcast with adam bien

Play Episode Listen Later Mar 8, 2020 73:06


An airhacks.fm conversation with Thomas Wuerthinger (@thomaswue) about: Working on HotSpot, Sun started collaboration with Johannes Kepler University (JKU) in Linz, Java HotSpot is written in C++, "Array Bounds Check Elimination" for Java HotSpot Compiler, increased the performance by approx. 10%, the possibly most impactful student work ever, IdealGraphVisualizer (IGV): the graphical visualisation tool for HotSpot uses the NetBeans visual library, IGV is also used for GraalVM, the Maxine Research VM at Sun Microsystems, Project Maxwell was renamed to Maxine, working at Sun's Menlo Park on Maxine, the circular optimization of Java leads to higher performance, the relation between Maxine and GraalVM, replacing the Maxine Compiler with the Client HotSpot Compiler "transpiled" from C++ to Java, the C1X compiler, Maxine was too ambitious, GraalVM just focusses on the compiler and makes it available for HotSpot, the Java compiler (javac) is written in Java, the quality of the JIT output is the first factor for good performance, HotSpot asks the JIT to optimize "hot" methods, the Maxine project is still active, JVMCI, working on the crankshaft compiler at Google with a team of 8 people, using Graal as polyglot environment, converting JavaScript to GraalIR was too complex, JavaScript is dynamic and GraalIR is typed, partial evaluation was inspired by PyPy, the JavaScript interpreter was written in Java and is optimized by GraalVM, the frozen interpreters, the meta-circularity comes with the native image, a small JavaScript interpreter team implements recent JavaScript features, improving serverside ReactJS rendering performance with GraalVM, R, Ruby and Python are integrated exactly the same way as JavaScript, Java is going to be interpreted in the same way as well, method inlining across language boundaries, Truffle is the interpreter API and comes with language-independent tooling, GraalVM is able to output bitcode instead of native code with LLVM, native image was used to compile the Graal compiler itself, the native image contains a garbage collector, native image is considered "early adopters" technology, HotSpot mode is still 20% to 50% faster, G1 is going to be available on the native image as well, in future the performance of AOT could vary +/-10% compared to JIT, polymorphic invocations could become faster on the native image / AOT, profile guided optimizations can also be performed ahead of time, new native images could learn from the past, the stability of AOT and JIT are similar, Twitter has already used AOT for years, with Java you have the choice between AOT and JIT, unikernels could be supported by GraalVM in future, the GraalVM team is hiring, Thomas Wuerthinger on twitter: @thomaswue

Python Bytes
#168 Race your donkey car with Python

Python Bytes

Play Episode Listen Later Feb 11, 2020 33:34


Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Special guest: Kojo Idrissa!

Michael #1: donkeycar
Have you ever seen a proper RC car race? Donkeycar is a minimalist and modular self-driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions. Use Donkey if you want to: make an RC car drive itself; compete in self-driving races like DIY Robocars; experiment with autopilots, mapping, computer vision and neural networks; log sensor data (images, user inputs, sensor readings); drive your car via a web or game controller; leverage community-contributed driving data; use existing CAD models for design upgrades.

Brian #2: RIP Pipenv: Tried Too Hard. Do what you need with pip-tools. (by Nick Timkovich)
No releases of pipenv in 2019. It "has been held back by several subdependencies and a complicated release process". The main benefits of pipenv: pin everything and use hashes for verifying packages. The two-file concept (Pipfile, Pipfile.lock) is pretty cool and useful, but we can do that with the pip-tools command line tool pip-compile, which is also used by pipenv:

    pip-compile --generate-hashes --output-file requirements.txt requirements.in

What about virtual environment support? python -m venv venv --prompt $(basename $PWD), or the equivalent for your shell, works fine, and it's built in.

Kojo #3: str.casefold()
Used for caseless matching. "Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string." Especially helpful for Unicode characters:

    firstString = "der Fluß"
    secondString = "der Fluss"

    # ß is equivalent to ss
    if firstString.casefold() == secondString.casefold():
        print('The strings are equal.')
    else:
        print('The strings are not equal.')

    # prints "The strings are equal."

Michael #4: Virtualenv (via Brian Skinn)
Virtualenv 20.0.0 beta1 is available. From the announcement by Bernat Gabor, on why the major release: "I identified three main pain points: creating a virtual environment is slow (takes around 3 seconds, even in offline mode; while 3 seconds does not seem that long, if you need to create tens of virtual environments it quickly adds up). The API used within PEP-405 is excellent if you want to create virtual environments; however, only that. It does not allow us to describe the target environment flexibly, or to do that without actually creating the environment. The duality of virtualenv versus venv: right, python3.4 has the venv module as defined by PEP-405. In theory, we could switch to that and forget virtualenv. However, it is not that simple; virtualenv offers a few benefits that venv does not." Benefits over venv: ability to discover alternate versions (-p 2 creates a Python 2 virtual environment, -p 3.8 a Python 3.8, -p pypy3 a PyPy 3, and so on); virtualenv packages the wheel package out of the box as part of the seed packages, which significantly improves package installation speed, as pip can now use its wheel cache when installing packages; you are guaranteed to work even when distributions decide not to ship venv (Debian derivatives notably make venv an extra package, and not part of the core binary); can be upgraded out of band from the host Python (often via just pip/curl, so it can pull in bug fixes and improvements without needing to wait until the platform upgrades venv); easier to extend, e.g., we added Xonsh activation script generation without much pushback, and support for PowerShell activation on POSIX platforms.
Brian #5: Property-based tests for the Python standard library (and builtins)
Zac Hatfield-Dodds and Paul Ganssle, so far. Goal: find and fix bugs in Python before they ship to users. "CPython's existing test suite is good, but bugs still slip through occasionally. We think that using property-based testing tools - i.e. Hypothesis - can help with this. They're no magic bullet, but computer-assisted testing techniques routinely try inputs that humans wouldn't think of (or bother trying), and turn up bugs that humans missed." "Writing tests that describe every valid input often leads to tighter validation and cleaner designs too, even when no counterexamples are found!" "We aim to have a compelling proof-of-concept by PyCon US, and be running as part of the CPython CI suite by the end of the sprints." Hypothesis and property-based testing are superb to throw at algorithmic pure functions, and the test criteria are relatively straightforward for function pairs that have round-trip logic, like tokenize/untokenize, encode/decode, compress/decompress, etc. (a toy sketch of such a round-trip test follows these notes). And there are probably tons of those types of methods in Python. At the very least, I'm interested in this to watch how other people are using Hypothesis.

Kojo #6: PyCon US Tutorial Schedule & Registration
Find the schedule at https://us.pycon.org/2020/schedule/tutorials/ They tend to sell out FAST. Videos are up fast afterwards. What's interesting to me? Migration from Python 2 to 3; Welcome to Circuit Python (Kattni Rembor); Intro to Property-Based Testing; Minimum Viable Documentation (Heidi Waterhouse).

Extras
Michael: Foreword for Mastering Python Networking. Pyramid (Waitress) and Django both issued security CVEs. You should upgrade! StackOverflow Survey 2020 is open. Go fill it out!

Joke
See the cartoon: https://trello-attachments.s3.amazonaws.com/58e3f7c543422d7f3ad84f33/5df14f77efb5642d017a593f/31cba5cdf0e9805d47837916555dd7ab/b5cb6570af72883f06c3dcbf47679e9d.jpg
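The round-trip idea from Brian's Hypothesis item above, sketched as a toy property test (an illustration only, not code from the show notes):

    from hypothesis import given
    from hypothesis import strategies as st

    @given(st.text())
    def test_utf8_round_trip(s):
        # decode(encode(s)) must give back exactly what went in
        assert s.encode("utf-8").decode("utf-8") == s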

Linux Headlines
2019-10-15

Linux Headlines

Play Episode Listen Later Oct 15, 2019 2:56


A double dose of Python, AWS credits for open source projects, a new kernel development course from the Linux Foundation, and an exciting release for KDE Plasma.

Google Cloud Platform Podcast
Python with Dustin Ingram

Google Cloud Platform Podcast

Play Episode Listen Later Mar 5, 2019 28:07


Mark and Brian Dorsey spend today talking Python with Dustin Ingram. Python is an interpreted, dynamically typed language, which encourages very readable code. Python is popular for web applications, data science, and much more! Python works great on Google Cloud, especially with App Engine, Compute Engine, and Cloud Functions. To learn more about best (and worst) use cases, listen in!

Dustin Ingram
Dustin Ingram is a Developer Advocate at Google, focused on supporting the Python community on Google Cloud. He's also a member of the Python Packaging Authority, maintainer of PyPI, and organizer for the PyTexas conference.

Cool things of the week: Machine learning can boost the value of wind energy (blog); Compute Engine Guest Attributes (site); Colopl open sourced a Cloud Spanner driver for the Laravel framework (site); Running Redis on GCP: four deployment scenarios (blog).

Interview: GCP Podcast Episode 3: Kubernetes and Google Container Engine (podcast); Python (site); Extending Python with C or C++ (docs); PyPy (site); PyPI (site); App Engine (site); Compute Engine (site); Cloud Functions (site); Ubuntu (site); Flask (site); Flask documentation (docs); Docker (site); Python documentation (docs); PyCon (site); PyCaribbean (site).

Question of the week: How can I manipulate images with Cloud Functions?

Where can you find us next? Mark will be at GDC, Cloud NEXT, and ECGC in April. Dustin will be at Cloud Next and PyCon. Brian will be lecturing at Cloud Next: ‘Where should I run my code?'

Chingona Podcast
Ep. 2.0 Back in the Habit

Chingona Podcast

Play Episode Listen Later Feb 23, 2018 15:36


Hold on to your tits! Chingona is back for a second season. Nadia, Karen and Lea give you a sneak peek of Season 2. Our insta: www.instagram.com/chingonapodcast/ Our twitter: twitter.com/ChingonaPodcast Music: Pagan Day by Pypy http://freemusicarchive.org/music/PyPy/Live_at_WFMU_for-Spin_Age_Blasters_with_Creamo_Coyl_1232017_1032/Pypy_3_Pagan_Day Chingona Theme song by Raul Garza

Python Bytes
#47 PyPy now works with way more C-extensions and parking your package safely

Python Bytes

Play Episode Listen Later Oct 12, 2017 16:44


The Python Podcast.__init__
RPython with Maciej Fijalkowski

The Python Podcast.__init__

Play Episode Listen Later Jan 22, 2016 35:34


RPython is a subset of Python that is used for writing high performance interpreters for dynamic languages. The most well-known product of this tooling is the PyPy interpreter. In this episode we had the pleasure of speaking with Maciej Fijalkowski about what RPython is, what it isn't, what kinds of projects it has been used for, and what makes it so interesting.

Ruby NoName podcast
Ruby NoName Podcast S04E24

Ruby NoName podcast

Play Episode Listen Later Dec 7, 2012 50:31


News: Engine Yard Local; JRuby 1.7.1; DRY in RSpec; Refinements may be postponed until 2.1, the full functionality is definitely postponed, and Charles Nutter's article about it; an article in Russian about ActiveSupport::Notifications; xpool, a lightweight process pool; StatBoard, statistics on object creation; Ruby 2.0.0 preview2; Draper 1.0.0.beta1 released; the krypt library; a tour of the MRI source code. Discussion: the new site; our new site; Artem Mashchenko and his Twitter; Ivan Khilkov; Maxim Yudin; middleman, a tool for static sites; Railsclub'Ulyanovsk, December 15 and 16 in Ulyanovsk; Alexey Palazhchenko; Alexey's Twitter; Qik; the PyPy project, and more about it; the Go language, and more about it; the history and philosophy of the Go language; the design and philosophy of Celluloid; the actor model, and the Wikipedia article on sequential processes (Alexey called them "synchronous", although they are sequential); dl.google.com now served by Go; a package manager for Go; the GC problem on 32-bit systems; the Go 1.1 changelog; five floors per day. Fighting for the quality of the podcast: Alexey called procedural languages declarative, although they are imperative. Facepalm. (Added at Alexey's request.)

Free as in Freedom
Episode 0x1B: Two Executive Directors

Free as in Freedom

Play Episode Listen Later Oct 25, 2011 31:47


Bradley and Karen discuss their jobs, particularly fundraising, and plans for future shows. Show Notes: Segment 0 (00:36) The Google Summer of Code Program is a large philanthropic program by Google for students to write Free Software in the summer. Bradley gave a talk about non-profit organizations at the Google SoC Mentor Summit 2011. Karen mentioned the GNOME Women's Outreach Program, which coordinates with the SoC, and the Season of KDE. (09:36) Conservancy's Amarok, Mercurial and PyPy projects are all currently doing fundraising programs (14:38) Bradley will give two talks at LinuxCon Europe this week. (15:15) Karen will attend the Ubuntu Developer Summit. (20:20) Karen will speak in Latvia later this year. (24:20) Richard Fontana discussed RMS' quote about Jobs on identi.ca (26:27) Segment 1 (29:28) We'll try to record some talks/interviews at upcoming events. Send feedback and comments on the cast to . You can keep in touch with Free as in Freedom on our IRC channel, #faif on irc.freenode.net, and by following Conservancy on identi.ca and Twitter. Free as in Freedom is produced by Dan Lynch of danlynch.org. Theme music written and performed by Mike Tarantino with Charlie Paxson on drums. The content of this audcast, and the accompanying show notes and music are licensed under the Creative Commons Attribution-Share-Alike 4.0 license (CC BY-SA 4.0).

Free as in Freedom
Episode 0x19: GNOME 3.2 and Other Topics

Free as in Freedom

Play Episode Listen Later Sep 28, 2011 48:46


Karen and Bradley discuss the GNOME 3.2 release, Karen interviews Jos Poortvliet, Bradley complains about the identi.ca web interface, and together they discuss UEFI “secure” boot and the PyPy Python 3 campaign. Show Notes: Segment 0 (00:40) Bradley wrote a blog post about how GNOME 3 is not for him. Segment 1 (07:14) Karen interviewed Jos Poortvliet. Segment 2 (21:04) Bradley mentioned Shaun McCance's post to desktop-devel about response bias, which he posted on the user survey thread. (25:04) Karen mentioned that GNOME 3.2 has been released with new features, such as better window resizing. (28:57) Bradley pointed out that gnats was one of the earliest Free Software bug tracking systems. (30:37) Segment 3 (31:53) Bradley mentioned that he feels like the unfrozen caveman lawyer when trying to use identi.ca now. (32:54) Bradley mentioned Matthew Garrett's blog post about UEFI so-called “secure” booting. (37:36) PyPy is trying to raise funds to support Python 3 on PyPy. (41:20) Send feedback and comments on the cast to . You can keep in touch with Free as in Freedom on our IRC channel, #faif on irc.freenode.net, and by following Conservancy on identi.ca and Twitter. Free as in Freedom is produced by Dan Lynch of danlynch.org. Theme music written and performed by Mike Tarantino with Charlie Paxson on drums. The content of this audcast, and the accompanying show notes and music are licensed under the Creative Commons Attribution-Share-Alike 4.0 license (CC BY-SA 4.0).

Computer Systems Colloquium (Winter 2011)
7. Python in Python: The PyPy System (March 2, 2011)

Computer Systems Colloquium (Winter 2011)

Play Episode Listen Later May 20, 2011 84:33


Armin Rigo discusses the research he has done to implement Python in Python. The new project, titled PyPy, can increase the speed at which programs run, as well as reduce the total memory that they use. (March 2, 2011)

CRE: Technik, Kultur, Gesellschaft
CRE088 Python und PyPy

CRE: Technik, Kultur, Gesellschaft

Play Episode Listen Later Jun 12, 2008 96:42


This episode of Chaosradio Express dives into the technical details of a particular project in the world of programming languages and compilers: PyPy. In conversation with Tim Pritlove, Holger Krekel discusses the background of programming in Python and the motivation for starting the PyPy project. The episode first covers the Python programming language itself and explains various programming concepts and conventions, including: direct evaluation; Python's distinctive syntax, which uses indentation instead of delimiters; the use of regression tests in programming; cooperative programming in a project with sprints and modules; namespaces; the available Python interpreters and runtime environments; use cases and strengths of Python; and popular libraries, projects, and organizations that use Python. The second part of the conversation focuses on PyPy, explaining: how the project came about and what goals it pursues; how a program can be analyzed by watching it at runtime as it executes; how a translator to arbitrary target platforms can be generated from that; the automatic generation of just-in-time compilers; the Low Level Virtual Machine (LLVM); EU funding for the PyPy project; and possible applications for PyPy in the future.