Brute force vector and CPU offload

Question

The main function of the program is to copy files from folder to folder on the server by looping through each file in the source folder in a loop. And there is a task for the function to accept the exception files, that is, not to copy them. I stored the exceptions in a vector. And the question arises how to quickly find the correspondence between the vector element and the file being processed. This method works, but it heavily loads the CPU, since it is inside the main function loop.

if (!except.empty()) { except_it = except.begin(); while (except_it != except.end()) { if (ps.filename() == *except_it) { // ps.filename() - имя файла ++it_source; // *except_it - имя файла исключения ++except_it; break; } }

What you are doing is essentially applying the standard find -linear search algorithm.
As you have already been advised, you can use other types of containers ... but something is all strange.
The fact is that yes, other containers will provide asymptotics faster, but they usually have a larger coefficient in O() .
In short, see if the problem is exactly in this - linear search.
This is critical, since the program will be a service running in continuous mode.
And in the main time it compares for the absence of files or obsolete, and only in this case it copies.
And without this cycle, a maximum of 25% comes in a single case, so the average CPU load is much lower.
In such backup programs, they very often make a mistake - they constantly scan the entire file system - whether the files have changed.
But simply inserting sleep (1) in the middle of such a cycle greatly relieves the percent and does not affect the speed of the program.
But the correct approach is to subscribe to file system updates - in this case, it will notify itself that the file has changed.

αλεχολυτ αλεχολυτ 21.1k 9 39 92 · Accepted Answer · 2016-07-18T13:32:37

For a quick (binary) search, it makes sense to use either a sorted vector, or even better to use another container, for example std::set or std::unordered_set . In the case of an unordered set, the search will be performed as quickly as possible (algorithmic complexity is O(1) ).

αλεχολυτ

21.1k 9 39 92

It is likely that the sorted vector will be much better than any other solution. Of course, you need to measure, but I would put on a vector. - ixSci
@ixSci sorted vector is better than hash? Although it is possible to win a small set. Well and still, unless the sorted vector will be better, than set ? - αλεχολυτ
may be better hash - need to be measured. As for the simple set , it will certainly be better. - ixSci
For what reason is sort_vector better than set ? What values are more locally located in memory? From the algorithmic point of view both there and there - binary search after all. - αλεχολυτ 2:58 pm
Exactly. Algorithmic complexity is a very rough indicator of efficiency, especially when we talk about a computer, where, often, locality overcomes any complexity. - ixSci

|

Brute force vector and CPU offload

1 answer 1

More articles: