Optimizing Parallel.ForEach

Question

Good afternoon, colleagues.

I have a working algorithm for receiving and processing data in a typical producer/consumer scenario, and I want to take it further.

One of the producer-procedures receives a list of files and starts Parallel.ForEach for each element of the list. Each iteration consists of three blocks:

1. Downloading the file
2. Reading the file through Excel's COM interface and getting a two-dimensional array of strings
3. Creating an object for each row of the array and sending it to the BlockingCollection

There are several hundred files, and launching an Excel instance for every file is clearly pointless and expensive, so point 2 is enclosed in a critical section. You could, of course, use a semaphore and process files with several Excel instances, but that is another story and I don't want to go into it here.

As it stands, the loop keeps 4 tasks active (one per processor), so the parallelism is ineffective: 4 files are downloaded quickly, then the tasks take turns waiting for the lock, and the algorithm runs almost synchronously.

Question: how can I put the task of the first Parallel.ForEach iteration on hold so that the second one can start working, and then come back and finish the first? When I try to use Await, execution leaves the loop and I end up with a mess.

The ideal behavior would be something like this: 4 files are downloaded, the lock on Excel is taken, the other three tasks go into the background, three more files are downloaded, the array from the 1st task is processed, the lock is taken by the 2nd task, 2 more files are downloaded ...

I would also like to try dropping Parallel.ForEach, splitting the algorithm into three synchronous For Each loops and linking them through 2 consumer collections to get roughly the behavior described above. Or even writing three functions and linking them directly through Yield without any extra collections, which would be even faster. But that is also another story that I will not go into in this question.

I am not able to sort out asynchrony inside Parallel.ForEach iterations on my own, so I am really hoping for expert advice that will help me improve.

Thank you.

Let me say up front why I chose the critical section and a single instance of Excel instead of running Excel on each file.
Measurements on a 16-core server showed that running Excel on each file increased the running time by half compared with the critical section.

Accepted Answer · 2017-08-19T13:57:43

Try building a longer pipeline.

1. The queue of addresses to download. The producer takes the list and puts it into this queue; a consumer takes an address from the queue, downloads the file, and puts the path to the file (or its content) into the second queue. (For that second queue, these consumers act as producers.)
2. The queue of downloaded files. Each consumer has its own copy of Excel; it takes a downloaded file from the queue, feeds it to Excel and gets the result as a two-dimensional array. It then adds each row of the array to the queue of rows and deletes the file. The number of consumers here equals the required number of Excel copies.
3. The queue of rows. This is what you have now: your single BlockingCollection instance.

Parallel.ForEach and await should, in theory, no longer be needed.

To implement the queues, use BlockingCollection, as described here.
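The three queues described above could be wired together roughly as follows. This is only a minimal C# sketch under stated assumptions, not the asker's actual code: `Download` and `ReadWithExcel` are hypothetical stand-ins for the real download and Excel COM calls, and the class name, method name, and the bounded capacity of 8 are arbitrary choices made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class PipelineSketch
{
    // Hypothetical placeholders: the real code would download a file
    // and read it through Excel's COM interface.
    static string Download(string url) => "file:" + url;
    static string[][] ReadWithExcel(string path) =>
        new[] { new[] { path, "row1" }, new[] { path, "row2" } };

    public static List<string[]> Run(IEnumerable<string> urls, int downloaders, int excelInstances)
    {
        var toDownload = new BlockingCollection<string>();
        // Bounded, so downloads cannot run arbitrarily far ahead of Excel.
        var downloaded = new BlockingCollection<string>(boundedCapacity: 8);
        var rows = new BlockingCollection<string[]>();

        // Queue 1 consumers: download files and feed the second queue.
        var stage1 = Enumerable.Range(0, downloaders).Select(_ => Task.Run(() =>
        {
            foreach (var url in toDownload.GetConsumingEnumerable())
                downloaded.Add(Download(url));
        })).ToArray();

        // Queue 2 consumers: one per Excel copy; each pushes rows to the third queue.
        var stage2 = Enumerable.Range(0, excelInstances).Select(_ => Task.Run(() =>
        {
            foreach (var path in downloaded.GetConsumingEnumerable())
                foreach (var row in ReadWithExcel(path))
                    rows.Add(row);
        })).ToArray();

        // Producer: fill the first queue, then mark it complete.
        foreach (var url in urls) toDownload.Add(url);
        toDownload.CompleteAdding();

        // Close each queue only after all of its producers have finished.
        Task.WhenAll(stage1).ContinueWith(_ => downloaded.CompleteAdding());
        Task.WhenAll(stage2).ContinueWith(_ => rows.CompleteAdding());

        // Queue 3 consumer: here it just collects the rows.
        return rows.GetConsumingEnumerable().ToList();
    }
}
```

Calling `PipelineSketch.Run(urls, 4, 1)` would correspond to four downloaders and a single Excel copy. The sketch ignores error handling: an exception inside a stage still closes the next queue, but it is not reported to the caller.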

In the English question stackoverflow.com/questions/11564506/… I read that TPL and async/await are effectively incompatible, and that acrobatics with await inside Parallel loops will never work.
It looks cumbersome and, at first glance, difficult, but it will be interesting to try.
@IvanLazarev: Well, that is not quite true, since async/await is itself built on TPL.
But yes, TPL Dataflow is probably a good option for your case.
In general, TPL Dataflow makes sense in large projects with a really complex architecture, where the producer/consumer scenario is not static.
For my project it is "shooting sparrows with a cannon", but on the whole I reached my goal and understood the idea.
I reworked the project by adding two collections that connect the 3 tasks through an iterator. Despite my doubts, everything runs quickly: the collections never materialize in full, and the iterators hold only the objects they are working on at the moment.
As a result, the application's 45 seconds turned into 38 :D
One more thing: I did not use a queue as such; everything goes through BlockingCollection, since the order of the data does not matter.
I did not give up Parallel.ForEach either; I use it for bulk downloading of files from the server.
I got rid of the lock around Excel simply by iterating over the downloaded files in a synchronous For Each.
@IvanLazarev: Well, by "queue" I meant the BlockingCollection.
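For readers following the thread, the arrangement the asker describes (Parallel.ForEach for the downloads, a synchronous loop for Excel) might look roughly like the C# sketch below. It is a guess at the shape, not the asker's code: `Download` and `CountRows` are hypothetical placeholders, and the rows are only counted here instead of being sent on to a further collection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class FinalDesignSketch
{
    // Hypothetical placeholders for the real download and the Excel COM read.
    static string Download(string url) => "file:" + url;
    static int CountRows(string path) => 2;

    public static int Run(IEnumerable<string> urls)
    {
        var downloaded = new BlockingCollection<string>();

        // Downloads run in parallel on a background task.
        var producer = Task.Run(() =>
        {
            try
            {
                Parallel.ForEach(urls, url => downloaded.Add(Download(url)));
            }
            finally
            {
                // Always close the collection, or the consumer below would wait forever.
                downloaded.CompleteAdding();
            }
        });

        // A single synchronous loop owns Excel, so no lock is needed.
        // Items arrive in completion order, not in list order.
        int total = 0;
        foreach (var path in downloaded.GetConsumingEnumerable())
            total += CountRows(path);

        producer.Wait(); // rethrows download errors, if any
        return total;
    }
}
```

Because only one loop ever touches Excel, the critical section from the question disappears, while downloads continue in the background as files are being read.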
