Judging by what happens inside the ThreadPoolExecutor code, it does not really suit you: a new thread gets created for a new task inside it. This can be avoided by starting that thread in advance with an empty task, but even then a lot of logic remains. I think something of your own, a thread spinning in a loop waiting for something to appear in an AtomicReference or a ConcurrentQueue, would be faster.
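A minimal sketch of that idea, using a ConcurrentLinkedQueue and a pre-started worker thread (the class and method names here are illustrative, not from the original question):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: one long-lived thread busy-waits on a lock-free queue,
// so no thread is created when a task is submitted.
public class SpinWorker {
    private final ConcurrentLinkedQueue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    private final Thread worker = new Thread(() -> {
        while (running) {
            Runnable task = tasks.poll();   // non-blocking: null if the queue is empty
            if (task != null) {
                task.run();
            } else {
                Thread.onSpinWait();        // busy-wait hint (Java 9+); burns CPU,
                                            // but avoids park/unpark and creation latency
            }
        }
    });

    public SpinWorker() {
        worker.start();                     // the thread exists before any task arrives
    }

    public void submit(Runnable task) {
        tasks.add(task);
    }

    public void stop() {
        running = false;
    }
}
```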
That is on the one hand. On the other hand, do not forget that in any case there is such a thing as the operating system and the cost of a context switch when changing threads, so you will not be able to avoid some delays no matter what. (If you have ever written timing tests, say against a database, the results always differ by 10-20 ms, sometimes even by 100 ms: that is your system at work.)
That is, you already end up with constraints on how long the test must run, just so that the system's contribution can be kept "within the margin of error."
Third, the Java virtual machine. It constantly optimizes the code, so it is very hard to rely on numbers obtained from hand-written tests with System.currentTimeMillis. This is really a topic for a separate question: there are dedicated benchmarking tools and lists of rules to follow, and it is hard to suggest anything concrete here, since I have not needed to get exact figures myself.
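For reference, JMH is one such dedicated tool (it is not named in the answer above, so take this as an illustrative sketch with a made-up placeholder workload):

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

public class TaskBenchmark {

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public long measureTask() {
        // Placeholder work; JMH runs warm-up iterations first, so the
        // JIT-compiled code is what actually gets measured.
        long sum = 0;
        for (int i = 0; i < 1_000; i++) {
            sum += i;
        }
        return sum; // returning the result prevents dead-code elimination
    }
}
```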
Fourth, what is this benchmark with separate threads? You may well have "everything under control" there and understand what you are doing, but it sounds strange, at least in combination with the worries about the speed of the thread pool.
In general, as a conclusion: the thread pool is not the biggest problem you have. Use a regular ThreadPoolExecutor, and if you suddenly find that it really does affect something, then write your own implementation. That is honestly a matter of 10 minutes.
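For completeness, a plain ThreadPoolExecutor set up so that the core threads are started up front (the pool size and tasks are illustrative):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolExample {
    public static void main(String[] args) throws InterruptedException {
        // 4 fixed threads; prestartAllCoreThreads() avoids creating a thread
        // when the first task arrives, which is the concern raised above.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.prestartAllCoreThreads();

        for (int i = 0; i < 10; i++) {
            int taskId = i;
            pool.execute(() -> System.out.println(
                    "task " + taskId + " on " + Thread.currentThread().getName()));
        }

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```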
ExecutorService is an interface, by the way. So you will have to write an implementation in any case.. – Drakonoved