Suppose there is a lazy collection of type IEnumerable&lt;T&gt; (it does not matter how it is obtained) that will be processed in parallel by PLINQ's Select operator. For this, a so-called chunk partitioning is created and consumed by N worker threads (note that the collection is not indexable, which is why chunk partitioning is used). Each such thread fills its chunks through an iterator of type ContiguousChunkLazyEnumerator. Here is a snippet of the MoveNext method of the ContiguousChunkLazyEnumerator class (from the .NET source code):
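For concreteness, here is a minimal repro of the scenario I mean. LazySource and the squaring inside Select are placeholders I made up; the only point is that the source is lazy and non-indexable:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        // A lazy, non-indexable source: PLINQ cannot ask it for a length
        // or an indexer, so it falls back to chunk partitioning.
        static IEnumerable<int> LazySource()
        {
            for (int i = 0; i < 1_000_000; i++)
                yield return i;
        }

        static void Main()
        {
            var results = LazySource()
                .AsParallel()
                .Select(x => x * x) // placeholder work; imagine something CPU-bound
                .ToList();

            Console.WriteLine(results.Count);
        }
    }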
    lock (m_sourceSyncLock)
    {
        // Some .NET bookkeeping elided.
        try
        {
            for (; i < mutables.m_nextChunkMaxSize && m_source.MoveNext(); i++)
            {
                // Read the current entry into our buffer.
                chunkBuffer[i] = m_source.Current;
            }
        }
        // catch/finally elided.
    }

As can be seen from the code, every such iterator works with the shared m_source object of type IEnumerator&lt;T&gt;. My question is the following: how can this approach give a performance boost? After all, using a lock should essentially kill performance (the concurrency provides no real benefit), and in theory the same code executed sequentially by a single thread, without any locking, should perform just as well.
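To make sure I am reading the code correctly, here is my simplified mental model of the pattern, written as a standalone sketch. None of these names (ParallelSelect, syncLock, chunkSize) come from PLINQ, and ordering is ignored. If my model is right, the lock is held only while copying items into a thread-local buffer, and the selector runs outside it:

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    static class ChunkingSketch
    {
        // Simplified model of what each PLINQ worker appears to do:
        // take the shared lock only long enough to copy a chunk of items
        // into a thread-local buffer, then run the user delegate on that
        // buffer with no lock held.
        public static List<TResult> ParallelSelect<TSource, TResult>(
            IEnumerable<TSource> source,
            Func<TSource, TResult> selector,
            int workers = 4,
            int chunkSize = 512)
        {
            var enumerator = source.GetEnumerator(); // shared, guarded by syncLock
            var syncLock = new object();
            var results = new List<TResult>();

            var tasks = new Task[workers];
            for (int w = 0; w < workers; w++)
            {
                tasks[w] = Task.Run(() =>
                {
                    var buffer = new TSource[chunkSize]; // thread-local chunk buffer
                    while (true)
                    {
                        int count = 0;
                        // Contended part: pulling items from the shared source.
                        lock (syncLock)
                        {
                            while (count < chunkSize && enumerator.MoveNext())
                                buffer[count++] = enumerator.Current;
                        }
                        if (count == 0) return; // source exhausted

                        // Uncontended part: the (potentially expensive) selector
                        // runs on the local buffer with no lock held.
                        var local = new TResult[count];
                        for (int i = 0; i < count; i++)
                            local[i] = selector(buffer[i]);

                        lock (results)
                            results.AddRange(local);
                    }
                });
            }
            Task.WaitAll(tasks);
            return results;
        }
    }

Even with this model, my doubt stands: when the selector is trivial, nearly all of the time is spent inside the lock, so I do not see where the parallel speedup would come from.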
This implementation is not quite clear to me; maybe I am missing something? (I am still inexperienced in multithreading, so I would be grateful for any answers.)