This was inspired by the question of whether to use streams or the C file functions, and by the claim made there that streams are usually slower.

I sketched a test program (code below) and tried it with Open Watcom and Visual C++ 2015. It compares output to the console window and to a file, using the stream classes and the C file functions. To stay compatible with the old Watcom I used clock() for the measurement. It also turned out that ios::sync_with_stdio(), recommended in the question mentioned above, is obsolete in OW, so those calls were removed for that compiler.

Tested under Windows 7 x64. There are really two questions: what happens with other compilers and operating systems, and how can output actually be sped up, both console output in general and stream output up to the level of the C functions? Are there any optimization options? What causes the observed spread?

Here is the code:

    #include <vector>
    #include <iostream>
    #include <fstream>
    #include <iomanip>
    #include <cstdio>
    #include <cstdlib>   // rand, RAND_MAX
    #include <ctime>

    using namespace std;

    // Run func once and return the elapsed clock() ticks.
    clock_t bench(void(*func)())
    {
        clock_t start = clock();
        func();
        clock_t stop = clock();
        return stop - start;
    }

    const int Count = 100000;

    vector<double> dv;
    vector<int>    iv;

    FILE     * outfile   = 0;
    ofstream * outstream = 0;

    void printf_console()
    {
        for(int i = 0; i < Count; ++i)
            printf("%d %lf ",iv[i],dv[i]);
    }

    void printf_file()
    {
        for(int i = 0; i < Count; ++i)
            fprintf(outfile, "%d %lf ",iv[i],dv[i]);
    }

    void stream_console()
    {
        for(int i = 0; i < Count; ++i)
            cout << iv[i] << " " << dv[i] << " ";
    }

    void stream_file()
    {
        for(int i = 0; i < Count; ++i)
            *outstream << iv[i] << " " << dv[i] << " ";
    }

    int main(int argc, const char * argv[])
    {
        // Fill the test data.
        for(int i = 0; i < Count; ++i)
        {
            dv.push_back(rand()/double(RAND_MAX));
            iv.push_back(rand());
        }

        outfile   = fopen("test.dat","wt");
        outstream = new ofstream("test.stream");

        clock_t out_printf_console = bench(printf_console);
        clock_t out_printf_file    = bench(printf_file);
        clock_t out_cout_sync      = bench(stream_console);
        clock_t out_stream_file    = bench(stream_file);

        ios::sync_with_stdio(false);
        clock_t out_cout_async     = bench(stream_console);

        ios::sync_with_stdio(true);
        clock_t out_cout_rsync     = bench(stream_console);

        // Report to stderr so the results are not mixed into the measured output.
        cerr << "\n\n";
        cerr << "printf console:       " << setw(10) << out_printf_console << endl;
        cerr << "printf file   :       " << setw(10) << out_printf_file    << endl;
        cerr << "stream sync console:  " << setw(10) << out_cout_sync      << endl;
        cerr << "stream async console: " << setw(10) << out_cout_async     << endl;
        cerr << "stream rsync console: " << setw(10) << out_cout_rsync     << endl;
        cerr << "stream file   :       " << setw(10) << out_stream_file    << endl;

        delete outstream;
        fclose(outfile);
    }

Here are the results:

                           Open Watcom:   VC++ 2015:
    printf console:               16739        21806
    printf file   :                  62           69
    stream sync console:          16723        87678
    stream async console:         16754        86899
    stream rsync console:         16692        87254
    stream file   :                 141          150

P.S. It is surprising: in general VC++ beats Open Watcom, but here it loses to it badly, especially on stream output to the console.
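On the "how to speed it up" part, one thing worth checking first: the C++ standard says that if any I/O has already been done on the standard streams, the effect of sync_with_stdio() is implementation-defined, and the test program calls it only after printing, so the "async" rows may not measure what they are meant to. Below is a minimal sketch, not a measured result, of two options one might try: switching synchronization off before any output, and formatting everything in memory so that the console receives one large write instead of many small ones. The names unsync_cout, format_pairs and print_batched are made up for illustration, and whether batching helps on a particular console is an assumption to verify by measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: call once at the very start of main(), before any
// output, because the effect after I/O has occurred is implementation-defined.
void unsync_cout() {
    std::ios::sync_with_stdio(false);
}

// Format all pairs into one in-memory buffer, using the same
// "int double " layout as the stream functions of the test program.
std::string format_pairs(const std::vector<int>& iv,
                         const std::vector<double>& dv) {
    assert(iv.size() == dv.size());
    std::ostringstream buf;
    for (std::size_t i = 0; i < iv.size(); ++i)
        buf << iv[i] << ' ' << dv[i] << ' ';
    return buf.str();
}

// Hypothetical helper: hand the whole buffer to the C library in one call.
void print_batched(const std::vector<int>& iv,
                   const std::vector<double>& dv) {
    const std::string text = format_pairs(iv, dv);
    std::fwrite(text.data(), 1, text.size(), stdout);
}
```

To compare, print_batched(iv, dv) could be timed with bench() next to printf_console and stream_console (wrapped in a void() function, since bench takes a plain function pointer).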


3 answers

@Harry, in case other results are of interest.

Windows 7, MinGW g++ 3.4.5

 printf console: 4.43
 printf file: 0.081
 stream sync console: 18.262
 stream async console: 1.093
 stream rsync console: 1.176
 stream file: 0.318

and in a VirtualBox virtual machine on the same computer:

 Linux avp-ubu1 3.13.0-85-generic #129-Ubuntu SMP Thu Mar 17 20:50:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
 g++.real (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4

 printf console: 0.11216
 printf file: 0.072135
 stream sync console: 0.152673
 stream async console: 0.141306
 stream rsync console: 0.141552
 stream file: 0.070493

By the way, about the machine:

 avp@avp-ubu1:hashcode$ grep CPU /proc/cpuinfo
 model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
 model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
 model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
 model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
 avp@avp-ubu1:hashcode$

The times are in seconds: looking at the raw clock differences, I suspected that the value of CLOCKS_PER_SEC differs between these systems, so the output is done like this:

 cerr << "printf console: " << setw(10) << (out_printf_console / (double)CLOCKS_PER_SEC) << endl; 

(the rest is similar).

P.S. Some scatter in the times between runs is observed (out of curiosity I ran it five times), but it seems to be no more than 10%.

  • I.e., Linux in a virtual machine beats the Windows host? - sercxjo
  • @sercxjo, IMHO the difference is that clock() differs between them (the systems use different sources for counting the time given to the process). Probably for this task one should take real time from gettimeofday(). - avp
  • Linux offers a choice of 8 clock types (man clock_gettime). It may be simpler to look at all of them at once. - sercxjo
  • @sercxjo, well, if you don't like old things, and given that POSIX.1-2008 marks gettimeofday() as obsolete and recommends clock_gettime(2) instead, measure with CLOCK_REALTIME (just in case, don't forget that time measurement is one of the murkiest topics in an OS). - avp
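To illustrate the point about clock() versus real time, here is a rough sketch that takes both readings around the same call. It swaps in std::chrono::steady_clock (C++11, so not for old Open Watcom) as a portable stand-in for the gettimeofday()/clock_gettime() calls mentioned in the comments; Timing, measure and wait_like_io are made-up names, and the sleep only imitates a call that waits on a device. As far as I know, on Linux clock() counts CPU time, while Microsoft documents its clock() as returning wall-clock time since process start, so the two numbers should diverge on the former and roughly agree on the latter.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Timing {
    double cpu_seconds;   // from clock(): processor time on conforming systems
    double wall_seconds;  // elapsed real time from a monotonic clock
};

// Run func once and report both kinds of time.
template <typename F>
Timing measure(F func) {
    const std::clock_t c0 = std::clock();
    const auto w0 = std::chrono::steady_clock::now();
    func();
    const auto w1 = std::chrono::steady_clock::now();
    const std::clock_t c1 = std::clock();

    Timing t;
    t.cpu_seconds  = static_cast<double>(c1 - c0) / CLOCKS_PER_SEC;
    t.wall_seconds = std::chrono::duration<double>(w1 - w0).count();
    return t;
}

// Stand-in for an I/O call that blocks: it waits without using the CPU.
void wait_like_io() {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
}
```

Calling measure(wait_like_io) should report a wall time of about 0.2 s; where clock() counts only CPU time, cpu_seconds stays close to zero, which is the kind of gap that can make console output look faster than it really was.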

First. With this approach to measuring time, you may just as well end up measuring such interesting things as the weather on Mars or the humidity of the heels of the singer Coin.

A modern multitasking OS may suddenly decide to pull a couple of pages from disk into RAM in the background: there is your timing scatter.

A modern CPU may decide that running at a single frequency is boring and quietly lower it. Or raise it.

Think how many arguments there have been over time measurement alone.

Try repeating the tests from the well-known publication; then the results will be worth more.

  • I agree, measuring code that does I/O with clock() is wrong (at least on Linux, where clock() is implemented on top of clock_gettime(2) using the CLOCK_PROCESS_CPUTIME_ID clock). Probably the most correct approach would be to use real time from gettimeofday(). - avp

The scatter is probably due to the library implementation. Of course (IMHO), a more objective result comes from a series of runs rather than a single call. As an experiment, you could try a third-party STL such as STLport or EASTL.
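A series of runs could look roughly like the sketch below: time the same function several times and look at the minimum and the median instead of one number. This is only an illustration under my own assumptions (C++11, wall-clock time from std::chrono::steady_clock); time_series and median_sorted are hypothetical names, not part of the test program in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Time func `runs` times; return the per-run wall times in seconds,
// sorted in ascending order (so front() is the fastest run).
template <typename F>
std::vector<double> time_series(F func, int runs) {
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        func();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples;
}

// Median of a non-empty sample set that is already sorted ascending.
double median_sorted(const std::vector<double>& s) {
    assert(!s.empty());
    const std::size_t n = s.size();
    return (n % 2 != 0) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
}
```

For example, time_series(stream_console, 5) followed by median_sorted() on the result gives one figure per test that is less sensitive to a single unlucky run; the minimum is useful as an estimate of the cost with the least background interference.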