I have a task: I need to quickly find the number of whole square roots of several trillions of integers not exceeding 100,000 ^ 2 Simply put:
100 = 10 ^ 2 - suitable
3 = not suitable
28 = not suitable
49 = 7 ^ 2 - suitable
...
etc.
As part of the task, I use the openMP directive to evenly distribute the threads between the processor cores, but even in this case, the speed of the program is not very satisfactory. At the moment, the construction I use looks like this:
int y = sqrt(x); if (y==int(y)) i++; I tried to use the search in the SET set, knowing that my number will never exceed 100,000 ^ 2
set<int> mySet; for (int j = 1; j <= 100001; j++) mySet.insert(j*j); if (mySet.find(x) != mySet.end()) i++; But this method turned out to be many times slower than the usual calculation from under the square root. Tell me, is it possible with the help of the language tools to somehow speed up the execution of one iteration or to replace the calculation from under the root with something faster? I will be glad to any tips!
10**10/8.0/1024.0/1024.0/1024.0 = 1.1641532182693481that is, instead of a tree you can get a bitmap of 1.16 GB in which the i-th bit will be 1 if i has an integer root and 0 if it does not. Then you will not need to look for logN in the tree, you only need to check the bit (conditionally - for O (1), although I don’t know how quickly the call will go to such a bit in the RAM) - vegorov