These speedups are awesome, but of course one wonders why they haven't been a low-hanging fruit over the past 25 years.
Having read about some of the changes [1], it seems the Python core committers long preferred clean implementations over fast ones, and have deviated from that mantra with 3.11.
Now let's get a sane concurrency story (no multiprocessing / queue / pickle hacks) and suddenly it's a completely different language!
[edit] A bit of explanation of what I meant by low-hanging fruit: one of the changes is "Subscripting container types such as list, tuple and dict directly index the underlying data structures." Surely that seems like a straightforward idea in retrospect. In fact, many Python (/C) libraries, such as numpy, already try to do zero-copy work with data structures.
A few people have tried to excuse the slow Python, but as far as I know the story, excuses are not necessary. The truth is that Python was not meant to be fast: its source code was not meant to be fast, and its design was not optimized for speed. Python was meant to be a scripting language that was easy to learn and work with on all levels. The issue of its slowness became important only when it outgrew that role and became an application language powering large parts of the internet and the bigger part of the very expensive ML industry. I know that speed is a virtue, but it becomes a fundamental virtue when you have to scale, and Python was not meant to scale. So yes, it is easy for people to be righteous and furious over the issue, but being righteous in hindsight is much easier than being useful.
Meta has an actively developed fork of CPython for optimizing Instagram, and they just open-sourced it so its optimizations can be merged back into CPython.
When you have so many large companies with a vested interest in optimization, I believe Python can become faster through realistic and targeted optimizations. The other optimization strategies didn't work at all, or just served internal problems at large companies.
Oh, I agree. As I said, when Python and its uses scaled, it became quite necessary to make it fast. I like that it will be fast as well, and I am not happy that it is slow at the moment. My point is that there are reasons why it was not optimized in the beginning and why this process of optimization has started now.
Back when Python was started, the choice was really between C or C++ for optimized programs and scripting languages like Python and Perl. But since Python supported C extensions, it could bypass those problems. And since Python was easy to learn, both web developers and scientists picked it up. Then financial organizations started to get interested, and that's really how Python cemented itself.
What exactly do you do with Python that slows you down?
I worked in Python almost exclusively for maybe five years. Then I tried Go. Each time I write a Go program, I'm giddy with excitement at how fast my first attempt is, scaling so smoothly with the number of cores. I also wrote a lot of fairly performant C in the 90s, so I know what computers can do in a second.
I still use Python for cases when dev time is more important than execution time (which is rarer now that I'm working adjacent to "big data"), or when I'm writing Python to close gaps in the various arrays of web apps provided for navigating the corporate workflow. In that case, if I went as fast as a 64-core box would let me, we'd have some outages in corp GitHub or Artifactory or the like, so I just do it one slow thing at a time on one core and wait for the results. Maybe multiprocessing with a 10-process worker pool once I'm somewhat confident in the backend system I'm talking to.
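For the curious, that last pattern looks roughly like this. A hedged sketch: `fetch_status` is a made-up stand-in for whatever slow backend call you'd actually make, not a real API.

```python
# Deliberately modest parallelism: a 10-process worker pool so we don't
# hammer the backend with a full 64 cores' worth of requests.
from multiprocessing import Pool

def fetch_status(item):
    # placeholder for a request to an internal service (GitHub, Artifactory, ...)
    return item * 2

if __name__ == "__main__":
    items = list(range(20))
    with Pool(processes=10) as pool:
        results = pool.map(fetch_status, items)
    print(results[:5])  # [0, 2, 4, 6, 8]
```

`Pool.map` pickles each item to a worker process and pickles the result back, which is exactly the "multiprocessing / queue / pickle" overhead people complain about upthread.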
You should try Nim: it's Python-like but compiled, so it's about as fast as C. These days if I want to script something (and don't need Python-specific libraries like pandas), I use Nim.
Go recently got pretty decent generics. I'm with you on the error handling though. At least it's explicit. Go also has a plethora of really useful libraries, and that along with good editor and build tooling is probably the real draw of the language for me.
Okay, you are using Python to glue two web services together, which is what you deem an acceptable use of Python. But can you comment on the things you no longer use Python for due to it being slow?
Don't take this the wrong way, but I think you could be more specific. Are you saying that, similar to Go, it should just be faster in general?
So many successful projects/technologies started out that way. The Web, JavaScript, e-mail. DNS started out as a HOSTS.TXT file that people copied around. Linus Torvalds announced Linux as "just a hobby, won't be big and professional like gnu". Minecraft rendered huge worlds with unoptimized Java and fixed-function OpenGL.
Indeed, and that's why I love it. When I need extra performance, which is very rarely, I don't mind spending extra effort outsourcing it to a binary module. More often, the problem is an inefficient algorithm or data arrangement.
I took a graduate level data structures class and the professor used Python among other things "because it's about 80 times slower than C, so you have to think hard about your algorithms". At scale, that matters.
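The professor's point shows up even in everyday Python: the choice of data structure dwarfs the interpreter's constant factor. A small illustrative benchmark (numbers are machine-dependent):

```python
# Membership tests: O(n) scan on a list vs. average O(1) hash lookup on a set.
# The 80x language constant pales next to this algorithmic difference.
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)
needle = 99_999  # worst case for the list: it's at the very end

slow = timeit.timeit(lambda: needle in haystack_list, number=100)
fast = timeit.timeit(lambda: needle in haystack_set, number=100)
print(f"list scan: {slow:.4f}s   set lookup: {fast:.6f}s")
```

On any machine the set lookup wins by orders of magnitude, which is the "think hard about your algorithms" lesson in miniature.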
>> It prevents you from taking advantage of multiple cores. Doesn't really impact straight-line execution speed.
Most computers running Python today will have multiple cores. If your program can only use a fraction of its available computing power, it affects the program's execution speed. Python is too widely used in compute-intensive applications (e.g. machine learning) for GIL-related issues to be ignored in the long term.
Ideally, Python should have automatic load scaling and thread-safe data structures in the standard library to take advantage of all available CPU cores.
Java has had concurrent data structures in its standard library for years and is adding support for "easy" multi-threaded execution with Project Loom.
Python needs to add its own Pythonic versions of thread-safe data structures and compute load scaling to take advantage of the multi-core CPUs that it runs on.
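Until the standard library grows such primitives, the usual workaround is process-based parallelism, which sidesteps the GIL at the cost of serialization overhead. A minimal sketch (`cpu_bound` is a toy stand-in for real compute work):

```python
# Each task runs in its own process with its own GIL, so CPU-bound work
# can actually use multiple cores; arguments and results are pickled.
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    # toy compute kernel: sum of squares below n
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        totals = list(pool.map(cpu_bound, [10_000] * 4))
    print(totals)
```

The pickling boundary is also why this pattern feels bolted on compared to Java's shared-memory concurrent collections.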
>> A data structures course is primarily not going to be concerned with multithreading.
A graduate-level data structures course might include concurrent and parallel data structures.
My point is that you can write fast code just as easily as you can write slow code. So engineers should write fast code when possible. Obviously you can spend a lot of time making things faster, but that doesn't mean you can't be fast by default.
> you can write fast code just as easily as you can write slow code
I think some people can do this and some can't. For some, writing slow code is much easier, and their contributions are still valuable. Once the bottlenecks are problems, someone with more performance-oriented skills can help speed up the critical path, and slow code outside of the critical path is just fine to leave as-is.
If you somehow limited contributions only to those who write fast code, I think you'd be leaving way too much on the table.
You usually need more tricks for fast code. Bubble sort is easy to program (it's my default when I have to sort manually and the data has only like 10 items).
There are a few much better options like mergesort or quicksort, but they have their tricks.
But to sort real data really fast, you should use something like timsort, that detects if the data is just the union of two (or a few) sorted parts, so it's faster in many cases where the usual sorting methods don't detect the sorted initial parts. https://en.wikipedia.org/wiki/Timsort
Are you sorting integers? Strings? ASCII-only strings? Perhaps the code should detect some of these and run a specialized version.
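To make the contrast concrete, here's the easy-but-slow default next to the built-in `sorted()`, which uses Timsort and gets the run detection described above for free:

```python
# Bubble sort: trivial to write, O(n^2), fine for ~10 items.
def bubble_sort(items):
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data

# The union of two sorted parts: Timsort (behind sorted()) spots the two
# runs and merges them; bubble sort grinds through its quadratic passes.
runs = [1, 3, 5, 7] + [2, 4, 6, 8]
print(bubble_sort(runs))                   # [1, 2, 3, 4, 5, 6, 7, 8]
print(bubble_sort(runs) == sorted(runs))   # True
```

Same output, very different amounts of cleverness under the hood.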
Being fast requires effort. It's not always about the raw performance of the language you use; it's about using the right structures, algorithms, tradeoffs, solving the right problems, etc. It's not trivial, and I've seen so many bad implementations in "fast" compiled languages.
Not true. Premature optimization is the root of all evil. You first write clean code, and then you profile and optimize. I refer you to the internals of dicts through the years (https://www.youtube.com/watch?v=npw4s1QTmPg) as an example of that optimization taking years of incremental changes. Once you see the current version, it's easy to claim that you would have arrived at the current, best version in the first place, as obvious as it looks in hindsight.
CPython and the Python design in general clearly show that writing clean code and optimizing later is significantly harder and takes much more effort than keeping optimizations in mind from the start. That doesn't mean you need to write optimal code from day one, just that you need to be careful not to program yourself into a corner.
> Being fast isn't contradictory with this goal. If anything, this is a lesson that so many developers forget. Things should be fast by default.
It absolutely is contradictory. If you look at the development of programming languages interpreters/VMs, after a certain point, improvements in speed become a matter of more complex algorithms and data structures.
Check out garbage collectors - it's true that Golang keeps a simple one, but other languages progressively increase its sophistication - think about Java or Ruby.
Or JITs, for example, which are the latest and greatest in terms of programming languages optimization; they are complicated beasts.
Yes, you can spend a large amount of time making things faster. But note that Go's GC is fast, even though it is simple. It's not the fastest, but it is acceptably fast.
Funny you should pick that example in a subthread that you started with an assertion that code should be fast by default.
Go’s GC was intentionally slow at first. They wanted to get it right THEN make it fast.
No offense but you’re not making a strong case. You’re sounding like an inexperienced coder that hasn’t yet learned that premature optimization is bad.
Far from it; Go was designed to be optimizable from the start. The GC was obviously not optimal, but the language semantics were such that the GC could be replaced with a better one with relatively minimal disruption.
Of course one can't release optimal code from version one, that would be absurd.
Also your last sentence is extremely condescending.
Maybe better to elaborate on what it should be, if not fast? Surely you aren’t advocating things should be intentionally slow by default, or carelessly inefficient?
There’s a valid tradeoff between perf and developer time, and it’s fair to want to prioritize developer time. There’s a valid reason to not care about fast if the process is fast enough that a human doesn’t notice.
That said, depends on what your work is, but defaulting to writing faster, more efficient code might benefit a lot of people indirectly. Lower power is valuable for server code and for electricity bills and at some level for air quality in places where power isn’t renewable. Faster benefits parallel processing, it leaves more room for other processes than yours. Faster means companies and users can buy cheaper hardware.
> Maybe better to elaborate on what it should be, if not fast?
It should satisfy the needs of the customer and it should be reasonably secure.
Everything else is a luxury.
My point was that in most of my time in the industry being faster would not have benefited my customers in any manner worth measuring.
I'm not anti-performance. I fantasize about how to make my software faster to maintain my geek credentials. But neither my bosses nor my customers pay for it.
If a customer says they want it faster we'll oblige.
> Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.
That quote has been thrown around every time in order to justify writing inefficient code and never optimizing it. Python is 10-1000x slower than C, but sure, let's keep using it because premature optimization is the root of all evil, as Knuth said. People really love to ignore the "premature" word in that quote.
Instead, what he meant is that you should profile what part of the code is slow and focus on it first. Knuth didn't say you should be fine with 10-1000x slower code overall.
The thing is, when I see people using this quote, I don't see them generally using it to mean you should never optimize. I think people don't ignore the premature bit in general. Now, throwing this quote out there generally doesn't contribute to the conversation. But then, I think, neither does telling people to read the context when the context doesn't change the meaning of the quote.
Right but if they post just that part, they're probably heavily implying that now is not the time to optimize. I've seen way more people using it to argue that you shouldn't be focussing on performance at this time, than saying "sometimes you gotta focus on that 3% mentioned in the part of the quote that I deliberately omitted"
No, they don't deliberately omit the part of the quote. They are either unaware of that part of the quote or don't think it matters to the point they are making.
Yes, if you quote Knuth here (whether the short quote or a longer version) you are probably responding to someone whom you believe is engaged in premature optimization.
It remains that the person quoting Knuth isn't claiming that there isn't such a thing as justified optimization. As such, pointing to the context doesn't really add to the conversation. (Nor does a thoughtless quote of Knuth either)
I dunno, guess it's hard to say given we're talking about our own subjective experiences. I completely believe there are people who just know the "premature optimization is the root of all evil" part and love to use it because quoting Knuth makes them sound smart. And I'm sure there are also people who know it all and quote that part in isolation (and in good faith) because they want to emphasise that they believe you're jumping the gun on optimization.
But either way I think the original statement is so uncontroversial and common-sense I actually think it doesn't help any argument unless you're talking to an absolutely clueless dunce or unless you're dealing with someone who somehow believes every optimization is premature.
You certainly can accept that slowdown if the total program runtime remains within acceptable limits and the use of a rapid-prototyping language reduces development time. There are times, when doing computationally heavy, long-running processes, where speed is important. But if the 1000x speedup is not noticeable to the user, then is it really a good use of development time to convert the code to a more optimized language?
As was said, profile, find user-impacting bottlenecks, and then optimize.
I would note that the choice of programming language is a bit different. Projects are pretty much locked into that choice. You've got to decide upfront whether the trade off in a rapid prototyping language is good or not, not wait until you've written the project and then profile it.
> Projects are pretty much locked into that choice.
But, they aren't.
I mean, profile, identify critical components, and rewrite in C (or some other low-level language) for performance is a normal thing for most scripting languages.
> You've got to decide upfront whether the trade off in a rapid prototyping language is good or not, not wait until you've written the project and then profile it.
Yes, it's true: if you use Python, you can rewrite portions in C to get improved performance. But my point was rather that you can't later decide you should have written the entire project in another language like Rust or C++ or Java or Go. You've got to make the decision about your primary language up front.
Or to look at it another way: Python with C extensions is effectively another language. You have to consider it as an option along with Pure Python, Rust, Go, C++, Java, FORTRAN, or what have you. Each language has different trade-offs in development time vs performance.
Certainly, but Python is flexible enough that it readily works with other binaries. If a specific function is slowing down the whole project, an alternate implementation of that function in another language can smooth over that performance hurdle. The nice thing about Python is that it is quite happy interacting with C or Go or Fortran libraries to do some of the heavy lifting.
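A minimal sketch of that escape hatch using only the standard library's ctypes; library lookup varies by platform, so treat this as illustrative rather than portable:

```python
# Calling cos() from the system C math library via ctypes.
import ctypes
import ctypes.util
import math

# find_library("m") works on most Unixes; CDLL(None) falls back to
# searching symbols already loaded into the current process.
path = ctypes.util.find_library("m")
libm = ctypes.CDLL(path) if path else ctypes.CDLL(None)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))                          # 1.0
print(abs(libm.cos(math.pi) + 1.0) < 1e-12)   # True
```

Real projects usually reach for CFFI, Cython, or a compiled extension module instead, but the principle is the same: the hot function runs as native code while the glue stays in Python.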
The quote at least contextualises "premature". As it is, premature optimisation is by definition inappropriate -- that's what "premature" means. The context:
a) gives a rule-of-thumb estimate of how much optimisation to do (maybe 3% of all opportunities);
b) explains that non-premature opimisation is not just not the root of all evil but actually a good thing to do; and
c) gives some information about how to do non-premature optimisation, by carefully identifying performance bottlenecks after the unoptimised code has been written.
I agree with GP that unless we know what Knuth meant by "premature" it is tempting to use this quote to justify too little optimisation.
I agree with you, the context changes nothing (and I upvoted you for this reason). However programming languages and infrastructure pieces like this are a bit special, in that optimizations here are almost never premature.
* Some of the many applications relying on these pieces, could almost certainly use the speedup and for those it wouldn't be premature
* The return of investment is massive due to the scale
* There are tremendous productivity gains by increasing the performance baseline because that reduces the time people have to spend optimizing applications
This is very different from applications where you can probably define performance objectives and define much more clearly what is and isn't premature.
I don't know about that. Even with your programming language/infrastructure, you still want to identify the slow bits and optimize those. At the end of the day, you only have a certain amount of bandwidth for optimization, and you want to use it where you'll get the biggest bang for your buck.
Python was meeting needs well enough to be one of, if not the single, most popular language for a considerable time and continuing to expand and become dominant in new application domains while languages that focussed more heavily on performance rose and fell.
And it's got commercial interests willing to throw money at performance now because of that.
Seems like the Python community, whether as top-down strategy or emergent aggregate of grassroots decisions made the right choices here.
Python had strengths that drove its adoption, namely that it introduced new ideas about a language's accessibility and readability. I'm not sure it was ever really meeting the needs of application developers. People have been upset about Python's performance and how painful it is to write concurrent code for a long time. The innovations in accessibility and readability have been recognized as valuable and adopted by other languages (Go comes to mind). More recently, it seems like Python is playing catch-up, bringing in innovations from other languages that have become the norm, such as asyncio, typing, even match statements.
Languages don't succeed on their technical merit. They succeed by being good enough to gain traction, after which it is more about market forces. People choose Python for its great ecosystem and the availability of developers, and they accept the price they pay in performance. But that doesn't imply that performance wasn't an issue in the past, or that Python couldn't have been even more successful if it had been more performant.
And to be clear, I use Python every day, and I deeply appreciate the work that's been put into 3.10 and 3.11, as well as the decades prior. I'm not interested in prosecuting the decisions about priorities that were made in the past. But I do think there are lessons to be learned there.
In my experience it's far more common for "optimizations" to be technical debt than the absence of them.
> only ever think about performance in retrospect
From the extra context it pretty much does mean that. "but only after that code has been identified" - 99.999% of programmers who think they can identify performance bottlenecks other than in retrospect are wrong, IME.
Well it's entirely possible that Knuth and I disagree here, but if you architect an application without thinking about performance, you're likely going to make regrettable decisions that you won't be able to reverse.
It is not possible to predict bottlenecks in computation, no. But the implications of putting global state behind a mutex in a concurrent application should be clear to the programmer, and they should think seriously before making a choice like that. If you think of a different way to do that while the code is still being written, you'll avoid trapping yourself in an irreversible decision.
Because Guido van Rossum just isn't very good with performance, and when others tried to contribute improvements, he started heckling their talk because he thought they were “condescending”: https://lwn.net/Articles/754163/ And by this time, we've come to the point where the Python extension API is as good as set in stone.
Note that all of the given benchmarks are microbenchmarks; the gains in 3.11 are _much_ less pronounced on larger systems like web frameworks.
> one wonders why they haven't been a low-hanging fruit over the past 25 years
Because the core team just hasn't prioritized performance, and has actively resisted performance work, at least until now. The big reason has been the maintainership cost of such work, but oftentimes plenty of VM engineers have shown up to assist the core team, and they have always been pushed away.
> Now let's get a sane concurrency story
You really can't easily add a threading model like that and make everything go faster. The hype around "GIL-removal" branches is that you can take your existing threading.Thread Python code, run it on a GIL-less Python, and instantly get a 5x speedup. In practice, that's not going to happen; you're going to have to modify your code substantially to support that level of parallelism.
The difficulty with Python's concurrency is that the language doesn't have a cohesive threading model, and many programs are simply held alive and working by the GIL.
Yup. Sad but true. I just wanted C++-style multithreading with unsafe shared memory, but it's too late. I was impressed by the recent fork that addresses this in a very sophisticated way, but realistically GIL CPython will keep finding speedups to stave off the transition.
> one wonders why they haven't been a low-hanging fruit over the past 25 years.
From the very page you've linked to:
"Faster CPython explores optimizations for CPython. The main team is funded by Microsoft to work on this full-time. Pablo Galindo Salgado is also funded by Bloomberg LP to work on the project part-time."
Yep. That's the thing about the open source dream; especially for work like this that requires enough time commitment to understand the whole system, and a lot of uninteresting grinding details such that very few people would do it for fun, you really need people being funded to work on it full-time (and a 1-year academic grant probably doesn't cut it), and businesses are really the only source for that.
It's a little confusing, but I don't think they meant inlining in the traditional sense. It's more like they inlined the C function wrapper around Python functions.
> During a Python function call, Python will call an evaluating C function to interpret that function’s code. This effectively limits pure Python recursion to what’s safe for the C stack.
> In 3.11, when CPython detects Python code calling another Python function, it sets up a new frame, and “jumps” to the new code inside the new frame. This avoids calling the C interpreting function altogether.
> Most Python function calls now consume no C stack space. This speeds up most of such calls. In simple recursive functions like fibonacci or factorial, a 1.7x speedup was observed. This also means recursive functions can recurse significantly deeper (if the user increases the recursion limit). We measured a 1-3% improvement in pyperformance.
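A quick way to see the quoted change in action: deep pure-Python recursion is now bounded by the interpreter's recursion limit rather than the C stack. The depth here is kept modest so the sketch also runs on older versions:

```python
# Naive recursive factorial: one pure-Python frame per call.
import sys

def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)

sys.setrecursionlimit(10_000)
result = fact(5_000)       # 5000 nested pure-Python calls
print(len(str(result)))    # number of digits in 5000!
```

On 3.11+ these frames consume no C stack, so raising the limit much further is far safer than it used to be (the limit itself still applies).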
> Of course one wonders why they haven't been a low-hanging fruit over the past 25 years.
Because it's developed and maintained by volunteers, and there aren't enough folks who want to spend their volunteer time messing around with assembly language. Nor are there enough volunteers that it's practical to require very advanced knowledge of programming language design theory and compiler design theory as a prerequisite for contributing. People will do that stuff if they're being paid 500k a year, like the folks who work on V8 for Google, but there aren't enough people interested in doing it for free to guarantee that CPython will be maintained in the future if it goes too far down that path.
Don't get me wrong, the fact that Python is a community-led language that's stewarded by a non-profit foundation is imho its single greatest asset. But that also comes with some tradeoffs.
> Because it's developed and maintained by volunteers
Who is working on python voluntarily? I would assume that, like the Linux kernel, the main contributors are highly paid. Certainly, having worked at Dropbox, I can attest to at least some of them being highly paid.
They have no obligation to go out of their way to cater to yours.
The entitlement of the new generation of open-source contributors to require political correctness or friendliness is so destructive it's ridiculous.
I wouldn't want to be involved with any project that prioritizes such zealotry on top of practicality.
It's not about entitlement so much as about ensuring the project can reach its full potential and continue to stay relevant and useful in perpetuity as people come and go.
The world constantly changes, projects adapt or become less useful.
Not really. There were a couple of engineers working at Google on a project called Unladen Swallow, which was extremely promising, but it eventually got canceled.
The developer who worked at Microsoft to make IronPython: I think that was his full-time project as well, and it was definitely faster than CPython at the time.
"The developer who worked at Microsoft to make iron python" - Jim Hugunin [1] (say his name!) to whom much is owed by the python community and humanity, generally.
I'm sorry, I had forgotten at the time. I almost got to work with him at Google. I agree he's a net positive for humanity (I used Numeric heavily back in the day).
Those people were not being paid to speed up CPython, though, but mostly-source-compatible (but not at all native extension compatible) alternative interpreters.
That's absolutely not the case with IronPython, where being tied to the .NET ecosystem was very much the point.
But if it was just core-team resistance and not a more fundamentally different objective than just improving the core interpreter performance, maintaining support for native extensions while speeding up the implementation would have been a goal even if it had to be in a fork.
They were solving a fundamentally different (and less valuable to the community) problem than the new faster CPython project.
>That's absolutely not the case with IronPython, where being tied to the .NET ecosystem was very much the point.
That might very well be, but I was talking about Unladen Swallow and similar "native" attempts.
Not attempts to port to a different ecosystem like Jython or IronPython or the js port.
>maintaining support for native extensions while speeding up the implementation would have been a goal even if it had to be in a fork.
That's not a necessary goal. One could very well want to improve CPython, the core, even if it meant breaking native extensions. Especially if doing so meant even more capabilities for optimization.
I, for one, would be fine with that, and am pretty sure all important native extensions would have adapted quite soon - like they adapted to Python 3 or the ARM M1 Macs.
In fact, one part of the current proposed improvements includes some (albeit trivial) fixes to native extensions.
If I remember correctly, IronPython ran on Mono; it just didn't have all of the .NET bits. I remember at the Python conference where it was introduced, he actually showed how you could bring up Clippy or a wizard programmatically from Python, talking directly to the operating system through its native APIs.
Really, the value came from the thread-safe container/collection data types, which are a foundation of high-performance programming.
I've used a few .NET applications on Linux and they work pretty well. But I've never actually tried to develop using Mono. I really wish Microsoft had gone all in on Linux .NET support years ago.
> JavaScript has proven that any language can be made fast given enough money and brains,
Yeah but commercial Smalltalk proved that a long time before JS did. (Heck, back when it was maintained, the fastest Ruby implementation was built on top of a commercial Smalltalk system, which makes sense given they have a reasonably similar model.)
The hard part is that "enough money" is...not a given, especially for noncommercial projects. JS got it because Google decided JavaScript speed was integral to its business of getting the web to replace local apps. Microsoft recently developed sufficient interest in Python to throw some money at it.
Google wasn't alone in optimizing JS; it actually came late. Safari and Firefox were already competing and improving their runtime speeds, though V8 did double down on the bet of a fast JS machine.
The question is why there isn’t enough money, given that there obviously is a lot of interest from big players.
> The question is why there isn’t enough money, given that there obviously is a lot of interest from big players.
I'd argue that there wasn't actually much interest until recently, and that's because it is only recently that interest in the CPython ecosystem has intersected with interest in speed that has money behind it, because of the sudden broad relevance to commercial business of the Python scientific stack for data science.
Both Unladen Swallow and IronPython were driven by interest in Python as a scripting language in contexts detached from, or at least not necessarily attached to, the existing CPython ecosystem.
> Maybe Python's C escape hatch is so good that it’s not worth the trouble.
Even if it wasn't good, the presence of it reduces the necessity of optimizing the runtime. You couldn't call C code in the browser at all until WASM; the only way to make JS faster was to improve the runtime.
> JavaScript has proven that any language can be made fast given enough money and brains, no matter how dynamic.
JavaScript also lacks parallelism. Python has to contend with how dynamic the language is, as well as that dynamism happening in another thread.
There are some Python variants that have a JIT, though they aren't 100% compatible. PyPy has a JIT, and IIRC, IronPython and Jython use the JIT from their runtimes (.NET and Java, respectively).
That’s what I meant, if JavaScript hadn’t been trapped in the browser and allowed to call C, maybe there wouldn’t have been so much investment in making it fast.
But still, it’s kind of surprising that PHP has a JIT and Python doesn’t (official implementation I mean, not PyPy).
Python has a lot more third-party packages that are written in native code. PHP has few partly because it just isn't used much outside of web dev, and partly because the native bits that might be needed for web devs, like database libraries, are included in the core distro.
I suspect that the amount of people and especially companies willing to spend time and money optimizing Python are fairly low.
Think about it: if you have some Python application that's having performance issues you can either dig into a foreign codebase to see if you can find something to optimize (with no guarantee of result) and if you do get something done you'll have to get the patch upstream. And all that "only" for a 25% speedup.
Or you could rewrite your application in part or in full in Go, Rust, C++ or some other faster language to get a (probably) vastly bigger speedup without having to deal with third parties.
Everyone likes to shirk their jobs; engineers and programmers have ways of making their fun (I'm teaching myself how to write parsers by giving each project a DSL) look like work. Lingerie designers or eyebrow barbers have nothing of the sort; they just blow off work on TikTok or something.
> Or you could rewrite your application in part or in full in Go, Rust, C++
Or you can just throw more hardware at it, or use existing native libraries like NumPy. I don't think there are a ton of real-world use cases where Python's mediocre performance is a genuine deal-breaker.
If Python is even on the table, it's probably good enough.
Instead, there are big companies running, let's say, the majority of their workloads in Python. It works well and doesn't need to be very performant, but together all of those workloads represent a considerable portion of their compute spend.
At a certain scale it makes sense to employ experts who can for example optimize Python itself, or the Linux kernel, or your DBMS. Not because you need the performance improvement for any specific workload, but to shave off 2% of your total compute spend.
This isn't applicable to small or medium companies usually, but it can work out for bigger ones.
There was some guarantee of result. It has been a long process but there was mostly one person who had identified a number of ways to make it faster but wanted financing to actually do the job. Seems Microsoft is doing the financing, but this has been going on for quite a while.
Python has actually had concurrency since about 2019: https://docs.python.org/3/library/asyncio.html. Having used it a few times, it seems fairly sane, but tbf my experience with concurrency in other languages is fairly limited.
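For readers who haven't used it, a minimal sketch of the asyncio style being discussed (the coroutine names and delays here are made up for illustration): two awaits overlap on one event loop, so total wall time is roughly one second, not two.

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for real I/O (network, disk, ...)
    return f"{name} done"

async def main() -> list:
    # gather schedules both coroutines concurrently on the same event loop
    return await asyncio.gather(fetch("a", 1.0), fetch("b", 1.0))

start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(results)  # → ['a done', 'b done']
```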
I find asyncio to be horrendous, both because of the silliness of its demands on how you build your code and also because of its arbitrarily limited scope. Thread/ProcessPoolExecutor is personally much nicer to use and universally applicable...unless you need to accommodate Ctrl-C and then it's ugly again. But fixing _that_ stupid problem would have been a better expenditure of effort than asyncio.
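The pool-executor interface praised above works with ordinary serial functions, no rewrite into coroutines required. A small sketch (the worker function is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def blocking_work(n: int) -> int:
    # any ordinary (non-async) function works unchanged
    return n * n

# the executor runs the existing function across a thread pool
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(blocking_work, range(5)))

print(squares)  # → [0, 1, 4, 9, 16]
```

Swapping in `ProcessPoolExecutor` gives true parallelism for CPU-bound work, with the same interface (subject to the pickling constraints mentioned elsewhere in the thread).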
>I find asyncio to be horrendous, both because of the silliness of its demands on how you build your code and also because of its arbitrarily limited scope.
Do you compare it to threads and pools, or judge it on its merits as an async framework (with you having experience of those that you think are done better elsewhere, e.g. in Javascript, C#, etc)?
Because both things you mention, "demands on how you build your code" and "limited scope", are par for the course with async in most languages that aren't async-first.
> Because both things you mention "demands on how you build your code" and "limited scope" are par for the course with async in most languages
I don't see how "asyncio is annoying and can only be used for a fraction of scenarios everywhere else too, not just here" is anything other than reinforcement of what I said. OS threads and processes already exist, can already be applied universally for everything, and the pool executors can work with existing serial code without needing the underlying code to contort itself in very fundamental ways.
Python's version of asyncio being no worse than someone else's version of asyncio does not sound like a strong case for using Python's asyncio vs fixing the better-in-basically-every-way concurrent futures interface that already existed.
>I don't see how "asyncio is annoying and can only be used for a fraction of scenarios everywhere else too, not just here" is anything other than reinforcement of what I said.
Well, I didn't try to refute what you wrote (for one, it's clearly a personal, subjective opinion).
I asked what I've asked merely to clarify whether your issue is with Python's asyncio (e.g. Python got it wrong) or with the tradeoffs inherent in async io APIs in general (regardless of Python).
And it seems that it's the latter. I, for one, am fine with async APIs in JS, which have the same "problems" as the one you've mentioned for Python's, so don't share the sentiment.
> I've asked merely to clarify whether your issue is with Python's asyncio (e.g. Python got it wrong) or with the tradeoffs inherent in async io APIs in general (regardless of Python)
Both, but the latter part is contextual.
> I, for one, am fine with async APIs in JS
Correct me if you think I'm wrong, but JS in its native environment (the browser) never had access to the OS thread and process scheduler, so the concept of what could be done was limited from the start. If all you're allowed to have is a hammer, it's possible to make a fine hammer.
But
1. Python has never had that constraint
2. Python's asyncio in particular is a shitty hammer that only works on special asyncio-branded nails
and 3. Python already had a better futures interface for what asyncio provides and more before asyncio was added.
The combination of all three of those is just kinda galling in a way that it isn't for JS because the contextual landscape is different.
Which is neither here nor there. Python had another big constraint, the GIL, so threads there couldn't go as far as async could. But even environments with real threads (C#, Rust) also got big into async in the same style.
>2. Python's asyncio in particular is a shitty hammer that only works on special asyncio-branded nails
Well, that's also the case with C#, JS, and others with similar async style (aka "colored functions"). And that's not exactly a problem, as much as a design constraint.
What has GIL to do with the thread model vs asyncio? asyncio is also single threaded, so cooperative (and even preemptive) green threads would have been a fully backward compatible option.
JS never had an option: as far as I understand, callback-based async was already the norm, so async functions were an improvement over what came before. C# wants to be a high-performance language, so using async to avoid allocating a full call stack per task is understandable. In Python the bottleneck would be elsewhere, so scaling would in no way be limited by the amount of stack space you can allocate, which makes adding async really hard to justify.
>What has GIL to do with the thread model vs asyncio?
Obviously the fact that the GIL prevents efficient use of threads, so asyncio becomes the way to get more load out of a single CPU by taking advantage of the otherwise-blocking time.
How would the GIL prevent the use of "green" threads? Don't confuse the programming model with the implementation. For example, as far as I understand, gevent threads are not affected by the GIL when running on the same OS thread.
Try C# as a basis for comparison, then. It also has access to native threads and processes, but it adopted async - indeed, it's where both Python and JS got their async/await syntax from.
Asyncio violates every aspect of compositional orthogonality, just like decorators: you can't combine it with anything else without completely rewriting your code around its constrictions. It has also caused a huge number of pip installation problems around the AWS CLI and boto.
Having both Task and Future was a pretty strange move; and the lack of static typing certainly doesn't help: the moment you get a Task wrapping another Task wrapping the actual result, you really want some static analysis tool to tell you that you forgot one "await".
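The "forgot one await" pitfall described above can be reproduced in a few lines (the coroutine names here are invented for illustration): without static typing, nothing warns you that you're holding a Task rather than the value inside it.

```python
import asyncio

async def inner() -> int:
    return 42

async def outer():
    # Bug on purpose: wrapping the coroutine in a Task and returning it
    # without awaiting pushes the mistake up to the caller.
    task = asyncio.create_task(inner())
    return task  # caller now has to know to await this

async def main():
    result = await outer()
    # No static analysis flags that `result` is a Task, not 42.
    assert isinstance(result, asyncio.Task)
    return await result  # the missing await, applied one level up

print(asyncio.run(main()))  # → 42
```

A type checker like mypy would catch this if the return types were annotated, which is part of the complaint: the runtime alone stays silent.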
Concurrency in Python is a weird topic, since multiprocessing is the only "real" concurrency. Threading is "implicit" context switching all in the same process/thread, asyncio is "explicit" context switching.
On top of that, you also have the complication of the GIL. If threads don't release the GIL, then you can't effectively switch contexts.
> Concurrency in Python is a weird topic, since multiprocessing is the only "real" concurrency.
You are confusing concurrency and parallelism.
> Threading is "implicit" context switching all in the same process/thread
No, threading is separate native threads, but with a lock that prevents execution of Python code in separate threads simultaneously (native code in separate threads, with at most one running Python code, can still work).
Not in CPython it isn't. Threading in CPython doesn't allow 2 threads to run concurrently (because of GIL). As GP correctly stated, you need multiprocessing (in CPython) for concurrency.
They're emphasizing a precise distinction between "concurrent" (the way it's structured) and "parallel" (the way it runs).
Concurrent programs have multiple right answers for "Which line of computation can make progress?" Sequential execution picks one step from one of them and runs it, then another, and so on, until everything is done. Whichever step is chosen from whichever computation, it's one step per moment in time; concurrency is only the ability to choose. Parallel execution of concurrent code picks steps from two or more computations and runs them at once.
Because of the GIL, Python on CPython has concurrency but limited parallelism.
> Threading in CPython doesn't allow 2 threads to run concurrently (because of GIL)
It does allow threads to execute concurrently. It doesn't allow them to execute in parallel if they are all running Python code (if at least one is running native code and has released the GIL, then those plus the one that has not can run in parallel.)
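The distinction can be seen with a toy timing experiment (durations chosen arbitrarily): blocking calls like `time.sleep` release the GIL, so four sleeping threads overlap and finish in roughly the time of one, whereas four pure-Python CPU-bound loops would not.

```python
import threading
import time

def io_bound():
    time.sleep(0.5)  # sleep releases the GIL, so threads overlap

start = time.monotonic()
threads = [threading.Thread(target=io_bound) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Four 0.5s sleeps complete in ~0.5s total, not ~2s: the threads ran
# concurrently. Replace the sleep with a Python-level loop and the GIL
# serializes them again.
print(round(elapsed, 1))
```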
I have used asyncio in anger quite a bit, and have to say that it seems elegant at first and works very well for some use cases.
But when you try to do things that aren't a map-reduce or Pool.map() pattern, it suddenly becomes pretty warty. E.g. scheduling work out to a processpool executor is ugly under the hood and IMO ugly syntactically as well.
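The pattern being called ugly looks roughly like this sketch (a thread pool is used here instead of a process pool so the snippet runs anywhere without pickling constraints; the worker function is made up): note the indirection through the loop object, the executor, and positional args.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def cpu_ish(n: int) -> int:
    # an ordinary synchronous function we want to run off the event loop
    return sum(i * i for i in range(n))

async def main() -> int:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # loop + executor + plain callable + args, just to await one call
        return await loop.run_in_executor(pool, cpu_ish, 1000)

result = asyncio.run(main())
print(result)
```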
I love asyncio! It's a very well put together library. It provides great interfaces to manage event loops, io, and some basic networking. It gives you a lot of freedom to design asynchronous systems as you see fit.
However, batteries are not included. For example, it provides no HTTP client/server. It doesn't interop with any synchronous IO tools in the standard library either, making asyncio a very insular environment.
For the majority of problems, Go or Node.js may be better options. They have much more mature environments for managing asynchrony.
Until you need to do async FFI. Callbacks and the async/await syntactic sugar on top of them compose nicely across language boundaries. But green threads are VM-specific.
I believe the reason is that python does not need any low-hanging fruits to have people use it, which is why they're a priority for so many other projects out there. Low-hanging fruits attract people who can't reach higher than that.
When talking about low-hanging fruit, it's important to consider who it's for: the intended target audience. It's important to ask oneself who grabs for low-hanging fruit and why it needs to be prioritized.
And with that in mind, I think the answer is actually obvious: Python never required the speed, because it's just so good.
The language is so popular, people search for and find ways around its limitations, which most likely actually even increases its popularity, because it gives people a lot of space to tinker in.
> Low-hanging fruits attract people who can't reach higher than that.
Do we have completely different definitions of low-hanging fruit?
Python not "requiring" speed is a fair enough point if you want to argue against large complex performance-focused initiatives that consume too much of the team's time, but the whole point of calling something "low-hanging fruit" is precisely that they're easy wins — get the performance without a large effort commitment. Unless those easy wins hinder the language's core goals, there's no reason to portray it as good to actively avoid chasing those wins.
> is precisely that they're easy wins — get the performance without a large effort commitment.
Oh, that's not how I interpret low-hanging fruits. From my perspective a "low-hanging fruit" is like cheap pops in wrestling. Things you say of which you know that it will cause a positive reaction, like saying the name of the town you're in.
As far as I know, the low-hanging fruit isn't named like that because of the fruit, but because of those who reach for it.
My reason for this is the fact that the low-hanging fruit is "being used" specifically because there's lots of people who can reach it. The video gaming industry as a whole, but specifically the mobile space, pretty much serves as perfect evidence of that.
Edit:
It's done for a certain target audience, because it increases exposure and interest. In a way, one might even argue that the target audience itself is a low-hanging fruit, because the creators of the product didn't care much about quality and instead went for that which simply impresses.
I don't think python would have gotten anywhere if they had aimed for that kind of low-hanging fruit.
Ah, ok. We're looking at the same thing from different perspectives then.
What I'm describing, which is the sense I've always seen that expression used in engineering, and what GP was describing, is: an easy, low-risk project that has a good chance of producing results.
E.g. if you tell me that your CRUD application suffers from slow reads, the low-hanging fruit is stuff like making sure your queries are hitting appropriate indices instead of doing full table scans, or checking that you're pooling connections instead of creating/dropping a connection for every individual query. Those are easy problems to check for and act on that don't require you to grab the hard-to-reach fruit at the top of the tree, like completely redesigning your DB schema or moving to a new DB engine altogether.
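The "check your indices" kind of easy win can be demonstrated with sqlite3 from the standard library (the table and column names here are invented for the example): the query plan flips from a table scan to an index search once the index exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
# the human-readable plan is the last column.
plan_before = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchone()[3]

con.execute("CREATE INDEX idx_users_email ON users (email)")

plan_after = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchone()[3]

print(plan_before)  # full table scan of users
print(plan_after)   # search using idx_users_email
```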
the obvious or easy things that can be most readily done or dealt with in achieving success or making progress toward an objective
"Maria and Victor have about three months' living expenses set aside. That's actually pretty good …. But I urged them to do better …. Looking at their monthly expenses, we found a few pieces of low-hanging fruit: Two hundred dollars a month on clothes? I don't think so. Another $155 for hair and manicures? Denied."
"As the writers and producers sat down in spring 2007 to draw the outlines of Season 7, they knew, Mr. Gordon said, that most of the low-hanging fruit in the action genre had already been picked."
"When business types talk about picking low-hanging fruit, they don't mean, heaven forbid, doing actual physical labor. They mean finding easy solutions."
It can also be a "low-hanging fruit" in a sense that it's possible to do without massive breakage of the ecosystem (incompatible native modules etc). That is, it's still about effort - but effort of the people using the end result.
I see your point, but it directly conflicts with the effort many people put into producing extremely fast libraries for specific purposes, such as web frameworks (benchmarked extensively), ORMs and things like json and date parsing, as seen in the excellent ciso8601 [1] for example.
I disagree that it conflicts. There's an (implied) ceiling on Python performance, even after optimizations. The fear has always been that removing the design choices that cause that ceiling, would result in a different, incompatible language or runtime.
If everyone knows it's never going to reach the performance needed for high performance work, and there's already an excellent escape hatch in the form of C extensions, then why would people be spending time on the middle ground of performance? It'll still be too slow to do the things required, so people will still be going out to C for them.
Personally though, I'm glad for any performance increases. Python runs in so much critical infrastructure, that even a few percent would likely be a considerable energy savings when spread out over all users. Of course that assumes people upgrade their versions...but the community tends to be slow to do so in my experience.
> I believe the reason is that python does not need any low-hanging fruits to have people use it, which is why they're a priority for so many other projects out there. Low-hanging fruits attract people who can't reach higher than that.
Ah! So they are so tall that picking the low-hanging fruit would be too inconvenient for them.
> Now let's get a sane concurrency story (no multiprocessing / queue / pickle hacks) and suddenly it's a completely different language!
Yes, and I would argue it already exists and is called Rust :)
Semi-jokes aside, this is difficult, and it's not just about removing the GIL and enabling multithreading; we would also need better memory and garbage-collection controls. Part of what makes Python slow and dangerous in concurrent settings is the ballooning memory allocation on large, fragmented workloads. A -Xmx equivalent would help.
There is a race right now to be more performant. .NET, Java and Go already participate, and Rust/C++ are there anyway. So to stay relevant, Python has to start participating too. .NET went through the same thing some years ago.
As for why these things were not addressed earlier: because past a certain point the optimizations are e.g. processor-specific or unintuitive, making them hard to maintain compared to a simple, straightforward solution.
[1] Here are the python docs on what precisely gave the speedups: https://docs.python.org/3.11/whatsnew/3.11.html#faster-cpyth...