In short, I wrote a solver for a partial differential equation, in both a parallel version (using OpenMP) and a serial one. The odd thing is that with a Debug build the sequential algorithm runs in about 75-85 seconds and the parallel one in 50-55 seconds, which seems fine. But as soon as I compile in Release mode, the parallel algorithm takes 25-30 seconds while the sequential one takes only 18-20 seconds !!! Has anyone run into the same problem? I won't post all the code, it would be too much to wade through. Here is the relevant piece:

void DifferenceScheme::solveOMP() {
    n = round((xMax - xMin) / xStep);
    m = round((tMax - tMin) / tStep);
    u = dMatrix(n + 1, dVector(m + 1, initValue));
    TridiagonalMatrix tm(n + 1);
    dVector F(n + 1);
    Progonka progonka(tm, F);
    double error;
    workTime = clock();
    #pragma omp parallel for
    for (int i = 0; i <= n; i++) {
        u[i][0] = w(xMin + i * xStep, tMin);
    }
    #pragma omp parallel for
    for (int i = 0; i <= m; i++) {
        u[0][i] = w(xMin, tMin + i * tStep);
        u[n][i] = w(xMax, tMin + i * tStep);
    }
    for (int j = 1; j <= m; j++) {
        do {
            #pragma omp parallel for
            for (int i = 1; i < n; i++) {
                tm.lower[i - 1] = dLeft(i, j);
                tm.main[i] = dCenter(i, j);
                tm.upper[i] = dRight(i, j);
                F[i] = -1 * f(i, j);
            }
            tm.main[0] = tm.main[n] = 1;
            F[0] = -1 * (u[0][j] - w(xMin, tMin + j * tStep));
            F[n] = -1 * (u[n][j] - w(xMax, tMin + j * tStep));
            progonka.solveOMP();
            #pragma omp parallel for
            for (int i = 1; i < n; i++) {
                u[i][j] += progonka.x[i];
            }
            error = 0;
            for (int i = 1; i < n; i++)
                if (fabs(progonka.x[i]) > error)
                    error = fabs(progonka.x[i]);
        } while (error > accuracy);
    }
    workTime = clock() - workTime;
}
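One thing worth checking about the timing itself: clock() measures processor time, and on many platforms (e.g. glibc on Linux) that accumulates across all threads, so a parallel run can report more clock() ticks than a serial one even when it finishes sooner on the wall clock. A minimal sketch of wall-clock timing with std::chrono (the helper name is mine, not from the code above):

```cpp
#include <chrono>

// Measures the wall-clock duration of `work` in seconds.
// Unlike clock(), this does not sum CPU time across threads,
// so it is the right metric for serial-vs-parallel comparisons.
template <typename F>
double wallSeconds(F&& work) {
    auto t0 = std::chrono::steady_clock::now();
    work();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

OpenMP also provides omp_get_wtime(), which gives wall time directly without extra includes.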
  • Are you bothered by the drop in run time in Release, or by the parallel implementation being slower in Release? - Vladimir Martyanov
  • While it is computing, do you see load on all the cores? By the way, how many cores do you have? - gbg
  • @VladimirMartyanov What confuses me is that in Release the parallel implementation is slower. - van9petryk
  • @gbg No, I didn't watch the load, but you can hear the fan spin up. 2 cores. Can it be checked through the Task Manager, or do you need special software? - van9petryk
  • @van9petryk The Task Manager is quite enough - it has per-core graphs. And how did you measure the time? - gbg

2 answers

I believe your N (the matrix size) is too small, so the cost of starting the threads is comparable to the execution time itself.

To verify this, run your program with a larger N.
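To make the point concrete: launching a thread team is a roughly fixed cost, so for small N it can swamp the useful work. A rough sketch of measuring that effect (using std::thread rather than OpenMP, but the principle is the same; the function name and thread count are illustrative, not from the asker's code):

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums n ones on 4 threads and returns the wall time in seconds.
// Thread creation is a fixed cost: for small n it dominates the
// measurement, while for large n it is amortized over the work.
double timedParallelSum(std::size_t n) {
    const int T = 4;
    std::vector<double> data(n, 1.0);
    std::vector<double> partial(T, 0.0);

    auto t0 = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < T; t++) {
        pool.emplace_back([&, t] {
            // Each thread sums its own contiguous slice.
            std::size_t lo = n * t / T, hi = n * (t + 1) / T;
            partial[t] = std::accumulate(data.begin() + lo,
                                         data.begin() + hi, 0.0);
        });
    }
    for (auto& th : pool) th.join();
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now() - t0).count();
}
```

Comparing the time for a tiny n against a large n shows how much of the small-n time is pure start-up overhead rather than computation.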

    You are right, but my poorly chosen parallelization is also part of the reason - van9petryk

In the end, my mistake was that I wrote #pragma omp parallel for everywhere, so the thread team was constantly being created and destroyed. By successively changing the parallelization I managed to bring the parallel program's time down below the sequential one. Now, right after do { I open a single #pragma omp parallel region, and put #pragma omp for next to each loop; this way, threads are created and destroyed only at the beginning and at the end of the do {} block.
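The restructuring described above can be sketched like this (a simplified stand-in for the real loops: correction plays the role of progonka.x, the function name is illustrative, and both vectors are assumed to have the same length):

```cpp
#include <cmath>
#include <vector>

// One correction step with a single parallel region wrapping both
// work-sharing loops. The thread team is entered once here, instead
// of once per "#pragma omp parallel for".
double applyCorrection(const std::vector<double>& correction,
                       std::vector<double>& u) {
    double error = 0.0;
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < (int)u.size(); i++)
            u[i] += correction[i];

        // A max-reduction replaces the serial error-search loop.
        #pragma omp for reduction(max : error)
        for (int i = 0; i < (int)correction.size(); i++)
            if (std::fabs(correction[i]) > error)
                error = std::fabs(correction[i]);
    }
    return error;
}
```

Note that modern OpenMP runtimes often keep the worker threads alive between parallel regions anyway, so much of the win here is avoiding the per-loop fork/join and scheduling overhead; measuring with omp_get_wtime() on your own data is the only reliable check.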