The rule in question is to create n+1 threads, where n is the number of physical cores. I have come across this rule more than once, but I can't understand why it should be done this way, or whether it is always more efficient than creating n threads.
3 answers
but I just can't understand why
As a rule, "general-purpose" programs do more than compute: they also perform input/output (disk, network, other devices). While a thread waits on a slow operation, it would be good to keep the core busy with useful work. Hence the +1. Why +1 and not more? I can't find the source right now, but I clearly remember seeing comparisons for make -jN, and the +1 setting won almost everywhere. So this is an established practice backed by purely experimental evidence.
and is it always more efficient than creating n threads
Not always. If the threads are busy exclusively with computation and the share of I/O is negligible, the "extra" thread will not add any overall speed. Moreover, the speed may drop slightly.
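To make the argument above concrete, here is a toy tick-based model. It is my own sketch, not a real scheduler: the function name `makespan` and all its parameters are hypothetical, each task is assumed to compute for `cpu_ticks` and then wait on I/O for `io_ticks`, and context-switch cost is ignored, so the model cannot show the slight slowdown mentioned for the pure-computation case.

```python
def makespan(cores, threads, tasks, cpu_ticks, io_ticks):
    """Ticks needed for `threads` workers on `cores` cores to finish `tasks` tasks.

    Toy model: a task needs `cpu_ticks` ticks on a core, then `io_ticks`
    ticks of waiting during which it occupies no core.
    """
    if cpu_ticks < 1 or io_ticks < 0:
        raise ValueError("cpu_ticks must be >= 1 and io_ticks >= 0")
    remaining = tasks              # tasks not yet picked up by a worker
    workers = [None] * threads     # per worker: [cpu left, io left] or None (idle)
    done = 0
    tick = 0
    while done < tasks:
        # Idle workers pick up the next task, if any is left.
        for i in range(threads):
            if workers[i] is None and remaining > 0:
                workers[i] = [cpu_ticks, io_ticks]
                remaining -= 1
        tick += 1
        free_cores = cores
        for w in workers:
            if w is None:
                continue
            if w[0] > 0:
                # Compute phase: progresses only if a core is free this tick.
                if free_cores > 0:
                    free_cores -= 1
                    w[0] -= 1
            else:
                # I/O phase: the worker waits and leaves its core to others.
                w[1] -= 1
        # Retire the tasks that finished both phases during this tick.
        for i, w in enumerate(workers):
            if w is not None and w[0] == 0 and w[1] <= 0:
                workers[i] = None
                done += 1
    return tick


n = 2  # pretend we have 2 cores
with_io_n = makespan(n, n, tasks=12, cpu_ticks=3, io_ticks=1)
with_io_n_plus_1 = makespan(n, n + 1, tasks=12, cpu_ticks=3, io_ticks=1)
pure_cpu_n = makespan(n, n, tasks=12, cpu_ticks=3, io_ticks=0)
pure_cpu_n_plus_1 = makespan(n, n + 1, tasks=12, cpu_ticks=3, io_ticks=0)
```

In this model the n+1 configuration finishes in fewer ticks when tasks include I/O waits, and gains nothing when they are pure computation. Real workloads are messier, so treat it only as an illustration of the reasoning.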
As far as I know, this recommendation comes from wanting to maximize utilization of the available computing power. The assumption is that a thread may perform I/O operations (disk, network, memory) that involve long waits, during which it does not keep its processor core 100% busy. So an extra, (n+1)-th thread is created to use that otherwise idle capacity in the meantime.
P.S. The recommendation is not mine, and I won't argue about it. ;-)
- Upvoted) Look for benchmarks! They tell you a lot. But I have a modest suspicion that CPU+1 is not quite adequate! 2+1 is the bare minimum, 4+1 is fine, 8+1 is great... but 16+2??? IMHO, no! Never! 16+[3...4] (not necessarily because 4 is better, simply by analogy with 4+1). Most important: don't forget to watch memory! If swapping starts, you can donate all your benchmarks to heating kindergartens and nursing homes. - Majestio
The N+1 recommendation applies only to systems that process a stream of non-uniform tasks with frequent but short pauses for I/O. A typical example is make -jN, which has already been mentioned here. There, the "extra" thread fills the CPU idle time caused by I/O.
For systems that process long, known-to-be-identical, CPU-bound tasks without I/O, it is better to set the number of tasks to N and, in addition, use affinity to bind each thread to a specific core.
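A minimal sketch of the affinity part, assuming Linux: `os.sched_setaffinity` and `os.sched_getaffinity` exist in Python only on platforms that support them, so the code checks for them first. The helper names `allowed_cores` and `pin_to_core` are mine, purely for illustration.

```python
import os


def allowed_cores():
    """Cores the current process may run on (plain CPU count as a fallback)."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return list(range(os.cpu_count() or 1))


def pin_to_core(core_id):
    """Restrict the caller (pid 0) to a single core.

    Returns False where the platform does not offer sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core_id})
    return True


cores = allowed_cores()
# Worker number i out of N would call pin_to_core(cores[i % len(cores)])
# once at startup, before beginning its CPU-bound loop.
pinned = pin_to_core(cores[0])
```

Note that `os.cpu_count()` reports logical CPUs, which may be more than the number of physical cores the question asks about.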
The surest way to find out how many threads work best for a specific task is to measure.
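A rough sketch of such a measurement, assuming the work can be expressed as a function applied to a list of items. The names `time_run`, `compare_worker_counts` and `fake_task` are hypothetical, and the sleep is only a stand-in for a real task. In CPython, CPU-bound pure-Python code generally does not run in parallel across threads, so for that kind of work `ProcessPoolExecutor` would be the thing to swap in.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def time_run(task, items, workers):
    """Wall-clock seconds to apply `task` to every item using `workers` threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(task, items))  # consume the iterator so all tasks finish
    return time.perf_counter() - start


def compare_worker_counts(task, items, counts, repeats=3):
    """Best-of-`repeats` timing for each candidate worker count."""
    items = list(items)
    return {
        k: min(time_run(task, items, k) for _ in range(repeats))
        for k in counts
    }


def fake_task(_):
    """Placeholder workload: a short wait imitating an I/O pause."""
    time.sleep(0.001)


n = os.cpu_count() or 1  # logical CPUs, not necessarily physical cores
timings = compare_worker_counts(fake_task, range(16), [n, n + 1], repeats=2)
```

Taking the minimum over several repeats reduces noise from other activity on the machine; the comparison is only meaningful when run on the real task and the real hardware.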
n threads. Another matter is that usually one more thread is waiting on those same n; it requests no CPU time while it waits and therefore does not force the system to switch constantly. - Monk