
Here's my experience developing a JVM lightweight thread library[1]:

1. For the scheduler, we use the JDK's superb and battle-tested ForkJoinPool (developed by Doug Lea), which is an excellent work-stealing scheduler, and continues to improve with every release.

2. For synchronization, we've adapted java.util.concurrent's constructs to respect fibers (we use the same interfaces, so user code doesn't change), but users are expected to mostly use Go-like channels or Erlang-like actors, both of which are included.

3. As for disk IO, Java does provide an asynchronous interface on all platforms, so integrating that wasn't a problem.

4. Integrating with existing libraries is easy if they provide an asynchronous (callback-based) API, which is easily turned into fiber-blocking calls. If not, ForkJoinPool does handle infrequent blocking of OS threads gracefully.
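To illustrate points 3 and 4, here is a minimal sketch of turning a callback-based call into a blocking one, using only the JDK's AsynchronousFileChannel and CompletableFuture. This is not Quasar's API: the class and method names (CallbackToBlocking, readAsync, readBlocking) are made up for the example, and join() here blocks an OS thread, whereas a fiber library would park the fiber at that point instead.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

// Hypothetical example class; plain JDK only, no fiber library involved.
public class CallbackToBlocking {

    // Adapts the callback-style read into a future the caller can wait on.
    static CompletableFuture<Integer> readAsync(AsynchronousFileChannel ch, ByteBuffer buf, long pos) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        ch.read(buf, pos, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                result.complete(bytesRead);
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                result.completeExceptionally(error);
            }
        });
        return result;
    }

    // Reads the whole file with straight-line code: no callbacks visible to the caller.
    static String readBlocking(Path path) throws IOException {
        try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            long pos = 0;
            while (buf.hasRemaining()) {
                // Blocks the calling thread; a fiber runtime would suspend the fiber here.
                int n = readAsync(ch, buf, pos).join();
                if (n < 0) {
                    break; // end of file reached earlier than expected
                }
                pos += n;
            }
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello fibers".getBytes(StandardCharsets.UTF_8));
        System.out.println(readBlocking(tmp));
        Files.delete(tmp);
    }
}
```

The caller of readBlocking sees ordinary sequential code; only the wait inside it would need to change to make it fiber-aware.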

All in all, the experience has been very pleasant: callbacks are gone and performance/scalability is great. Things will get even better if Linux adopts Google's proposal for user-scheduled OS threads, so that all code can be completely oblivious to whether its threads are scheduled by the kernel or in user space.

Regarding performance, Linux does have a very good scheduler (unlike, say, OS X), but while there's little latency involved if the kernel directly wakes up a blocked thread (say, after a sleep or as a response to an IO interrupt), it still adds very significant latency when one thread wakes up another. This is very common in code that uses message passing (CSP/actors), and we've been able to reduce scheduling overhead by at least an order of magnitude over OS threads.
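A rough way to see the thread-to-thread wake-up cost on your own machine is a ping-pong between two OS threads. The sketch below is not the benchmark behind the numbers above; the class and method names (PingPong, roundTripNanos) are made up, and it is a crude measurement: there is no JIT warm-up, and SynchronousQueue may spin briefly before parking, so treat the result as a ballpark only.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical micro-measurement of message-passing round trips between two OS threads.
public class PingPong {

    // Returns the average time in nanoseconds for one ping plus one pong.
    static long roundTripNanos(int rounds) throws InterruptedException {
        SynchronousQueue<Integer> ping = new SynchronousQueue<>();
        SynchronousQueue<Integer> pong = new SynchronousQueue<>();

        // The echo thread waits for each message and sends it straight back.
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    pong.put(ping.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.setDaemon(true);
        echo.start();

        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            ping.put(i);            // hands the message to the echo thread
            int reply = pong.take(); // waits until the echo thread hands it back
            if (reply != i) {
                throw new IllegalStateException("out-of-order reply");
            }
        }
        long elapsed = System.nanoTime() - start;
        echo.join();
        return elapsed / rounds;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("avg round trip (ns): " + roundTripNanos(100_000));
    }
}
```

Each round trip involves two hand-offs in which one thread wakes the other, which is the pattern that dominates CSP/actor-style code.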

I would summarize this as follows: if your code only blocks on IO, or blocks infrequently on synchronization, then OS threads are quite good; but if you structure your program with CSP/actors, then user-space threads are the only sensible way to go for the time being.

[1]: https://github.com/puniverse/quasar



pron,

I've seen your Quasar library before, and while I haven't had the chance to use it, I'm excited to try it the next time I need to write event code on the JVM.

As far as disk IO is concerned, the Java APIs are only as good as the underlying OS interfaces, and those are not that great.

I think you're spot on with the assertion that mostly network-bound workloads can benefit from N:M scheduling. Many of the apps that we build nowadays are exactly that.


That is really interesting. If it's not too much trouble to write out, could you explain what causes the latency difference between kernel wake-up and other thread wake-up?


Paul Turner explained this really well at this year's Linux Plumbers Conference. The whole talk is fantastic, but the explanation of what pron is describing in particular (and how it could be improved) starts around 8:39: https://www.youtube.com/watch?v=KXuZi9aeGTw#t=519


Thank you very much, I love this stuff.


I honestly don't know :) I was simply reporting my results experimenting with this (I'll try to write a blog post about it some time in the near future), so I'll defer to those with a deeper knowledge of the Linux kernel.

I have read that the Linux scheduler exploits some heuristics if it can guess how soon a blocked thread will need to be woken up, so this might have something to do with that.


I was worried it might be an empirical result :) thanks for responding, though, and I'd love to see the post when it's finished


Have you got a reference for user-scheduled OS threads? Hadn't come across that...



Thanks for providing the link; it does seem like an interesting proposal.

To summarize:

Switching into ring 0 (the kernel) is not that expensive.

We're going to stay with the 1:1 thread model.

We're going to provide a new syscall to hint to the scheduler which thread to switch to.

Provided there is still time left in the time slice, the scheduler can do that almost instantly, since picking the next task to run is the expensive part (according to their data).

This isn't scheduler activations but a yieldTo().

This method avoids all sorts of problems with third-party libraries, notably TLS (thread-local storage) problems.

This doesn't do anything about block IO.


> This doesn't do anything about block IO.

I'm not too familiar with the details but I think they mention that a thread can specify a callback that will be called if it blocks on IO, and the callback can specify another thread to switch to.



