What this really explains is why you should use Python. I use ThreadPoolExecutor + as_completed on every I/O-bound or async workflow reflexively and have literally never encountered a deadlock. (With this method, I mean. I have encountered, and caused, many deadlocks in less pristine environments.)
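For anyone who hasn't seen the pattern, here is a minimal sketch of what I mean. `fetch` and `run_all` are made-up names, and the `time.sleep` is only a stand-in for a real I/O call, so treat this as an illustration rather than a recipe:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def fetch(item: int) -> int:
    """Hypothetical stand-in for an I/O-bound call (HTTP request, DB query, ...)."""
    time.sleep(0.01 * (5 - item))  # simulated latency; later items finish sooner
    return item * item


def run_all(items, max_workers: int = 4) -> dict:
    """Submit every item to a thread pool and collect results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the input that produced it.
        futures = {pool.submit(fetch, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            # result() re-raises any exception raised inside the worker.
            results[item] = future.result()
    return results


if __name__ == "__main__":
    print(run_all(range(5)))
```

My guess at why this stays out of trouble: each task is independent, nothing inside a worker waits on another future from the same pool, and the `with` block joins the threads on exit. Once tasks start submitting to, or blocking on, their own pool, all bets are off.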