-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Internal
Testing JEP 425/428 using a JDK 19 early access build (build 26), I am able to create a scenario where all virtual threads are starved of carrier threads due to the fact that all available carrier threads are pinned by synchronized blocks.
Multiple virtual threads are observed (via jcmd) to be blocked waiting to enter the synchronized block in BaseCluster#withLock. Meanwhile, one thread is observed to be in that same block and, within that block, waiting to acquire the ReentrantLock in ConcurrentPool#lockUnfair. Another thread is also trying to acquire that same lock. No threads appear to actually hold that lock.
So what I think is happening is that neither of those two threads waiting to acquire the ReentrantLock are able to do so because all available carrier threads are pinned to other virtual threads waiting to enter the synchronized block in BaseCluster#withLock, and in particular the one that holds the BaseCluster lock is unable to unwind the stack in order to release it.
We can avoid this scenario by replacing the synchronized block in BaseCluster#withLock with a ReentrantLock.
Note: this scenario seems to contradict this part of JEP 425:
Pinning does not make an application incorrect, but it might hinder its scalability. If a virtual thread performs a blocking operation such as I/O or BlockingQueue.take() while it is pinned, then its carrier and the underlying OS thread are blocked for the duration of the operation. Frequent pinning for long durations can harm the scalability of an application by capturing carriers.
The scheduler does not compensate for pinning by expanding its parallelism. Instead, avoid frequent and long-lived pinning by revising synchronized blocks or methods that run frequently and guard potentially long I/O operations to use java.util.concurrent.locks.ReentrantLock instead. There is no need to replace synchronized blocks and methods that are used infrequently (e.g., only performed at startup) or that guard in-memory operations. As always, strive to keep locking policies simple and clear.
Specifically this part:
There is no need to replace synchronized blocks and methods that ... guard in-memory operations.
as this lock is guarding in-memory operations.
JEP 425 does state that:
In a future release, we may be able to remove the first limitation above (pinning inside synchronized).
But it's not clear if that improvement will come before or after JEP 425 becomes a non-preview feature. And even so, we would like the driver to work well with JEP 425/428 even while they are still in preview.