This is a C program. I cannot understand how to interpret the timing results for a local versus an external function, i.e. when the function's code is in the same file as its caller versus in a separate file.

The machine has an i5-2500 CPU at 3.30 GHz and runs Windows 7. On it runs VirtualBox 4.1.6 with Ubuntu 10.04.

I measure the time taken by a function that reverses, many times in a loop, an array passed to it; the caller allocates the array on the heap. First I measure a local call, then a call to the same function (under a different name) defined in its own file.

Windows

    c:/Users/avp/src/sort $ gcc --version
    gcc.exe (GCC) 3.4.5 (mingw-vista special r3)
    c:/Users/avp/src/sort $ gcc -O3 tc ec
    c:/Users/avp/src/sort $ ./a 1000000
    test 1000000 elements 1000 loops
    internal 8057 msec
    external 8456 msec
    c:/Users/avp/src/sort $ gcc tc ec
    c:/Users/avp/src/sort $ ./a 1000000
    test 1000000 elements 1000 loops
    internal 10832 msec
    external 10518 msec
    c:/Users/avp/src/sort $

Linux

    avp@avp-ubu1:~/src/tst/sort$ gcc --version
    gcc.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3
    avp@avp-ubu1:~/src/tst/sort$ gcc -O3 tc ec
    avp@avp-ubu1:~/src/tst/sort$ ./a.out 1000000
    test 1000000 elements 1000 loops
    internal 637 msec
    external 7035 msec
    avp@avp-ubu1:~/src/tst/sort$ gcc tc ec
    avp@avp-ubu1:~/src/tst/sort$ ./a.out 1000000
    test 1000000 elements 1000 loops
    internal 8153 msec
    external 8144 msec
    avp@avp-ubu1:~/src/tst/sort$

I apologize in advance for the length, but here is the code.

tc

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/time.h>

    /* Defined in ec. */
    void erev (void *a, void *t, int n, int l, int size);

    /* Wall-clock time in milliseconds. */
    static long long mtime()
    {
      struct timeval t;

      gettimeofday(&t, NULL);
      long long mt = (long long)t.tv_sec * 1000 + t.tv_usec / 1000;
      return mt;
    }

    /* Swap two elements of `size` bytes, using t as scratch space. */
    static void swap (void *t, void *a, void *b, int size)
    {
      memcpy(t,a,size);
      memcpy(a,b,size);
      memcpy(b,t,size);
    }

    /* Reverse the n-element array a, l times.
       Arithmetic on void * is a GNU extension. */
    void rev (void *a, void *t, int n, int l, int size)
    {
      while (l--) {
        int i, j;
        for (i = 0, j = n-1; i < j; i++,j--)
          swap(t, a+i*size,a+j*size,size);
      }
    }

    int main (int ac, char *av[])
    {
      int n = 100000, l = 1000;

      if (ac > 1)
        if ((n = atoi(av[1])) < 3)
          n = 100000;
      if (ac > 2)
        if ((l = atoi(av[2])) < 1)
          l = 100;

      printf ("test %d elements %d loops\n",n,l);

      long long start;
      int tt;
      int *a = malloc(n*sizeof(*a));
      void *tmp = malloc(sizeof(*a));
      int i;

      for (i = 0; i < n; i++)
        a[i] = rand();

      /* Local call: rev() is defined in this file. */
      start = mtime();
      rev(a,tmp,n,l,sizeof(*a));
      tt = mtime()-start;
      printf ("internal %d msec\n",tt);

      /* External call: erev() is defined in ec. */
      start = mtime();
      erev(a,tmp,n,l,sizeof(*a));
      tt = mtime()-start;
      printf ("external %d msec\n",tt);

      exit (0);
    }

ec

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void swap (void *t, void *a, void *b, int size)
    {
      memcpy(t,a,size);
      memcpy(a,b,size);
      memcpy(b,t,size);
    }

    void erev (void *a, void *t, int n, int l, int size)
    {
      while (l--) {
        int i, j;
        for (i = 0, j = n-1; i < j; i++,j--)
          swap(t, a+i*size,a+j*size,size);
      }
    }

What actually confuses me is the difference in run time on the virtual machine. How can it be explained? Note that it shows up only with -O3, and that the compiler version on Windows is different. But what optimization could be this dramatic, or why is the virtual machine affected so strongly?

Maybe there is an error somewhere in the program that I do not see? I have noticed this behavior while developing other programs; this is simply an example that illustrates the anomaly (?) vividly.

    2 answers

    Well, these are different versions of gcc. Modern versions generally optimize very well. In particular, at -O2 and -O3 small functions can be inlined automatically. Other optimizations are possible too. You have to look at the generated assembly code.

    Try experimenting with other levels of optimization.
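
    One experiment along these lines, short of reading the assembly, is to ask gcc not to inline the local function and then time it again. The sketch below assumes GCC's `noinline` function attribute; the names `NOINLINE` and `rev_noinline` are hypothetical and do not appear in the question, and the `char *` cast is an addition that avoids arithmetic on `void *`.

```c
#include <string.h>

/* Hypothetical helper macro: GCC's noinline attribute, empty elsewhere. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

static void swap(void *t, void *a, void *b, int size)
{
    memcpy(t, a, size);
    memcpy(a, b, size);
    memcpy(b, t, size);
}

/* Same loop as rev() in the question, but the compiler is asked
   not to inline this function into its caller. */
NOINLINE void rev_noinline(void *a, void *t, int n, int l, int size)
{
    char *base = a;   /* byte pointer for the element arithmetic */

    while (l--) {
        int i, j;
        for (i = 0, j = n - 1; i < j; i++, j--)
            swap(t, base + i * size, base + j * size, size);
    }
}
```

    If timing `rev_noinline` in place of `rev` moves the "internal" figure toward the "external" one, that would suggest the speedup depends on what the compiler learns at the call site. The compiler may still specialize the function in other ways, so the result is a hint rather than a proof.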

    • I would understand inlining if the loop were in the calling code. Here it is something else: the difference in execution time is an order of magnitude. Let me put the question differently: why is the function optimized when it is in the same file as main(), but not when it is in another file? - avp
    • Why argue? You have to look at the assembly. - skegg
    • Yes, it's about optimization. - avp

    I found the answer myself without even looking at the code generated by the compiler. Indeed, as @mikillskegg said, it is about optimization; I just could not believe in this level of optimization.

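    The point is that the size argument of rev() (and then of swap()) is effectively a constant, sizeof(int), known to the compiler when it compiles tc, whereas for the functions in ec it is unknown. With a known constant size the compiler can presumably replace the memcpy() calls with plain loads and stores.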
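
    To illustrate the difference, here is a minimal sketch of the two situations. `swap_generic` has the same shape as swap() in the question; `swap_int` and `check_swaps` are hypothetical names, and `swap_int` only shows roughly what the call can reduce to once the size is known to be sizeof(int). Whether a particular gcc version performs this reduction depends on the version and flags.

```c
#include <assert.h>
#include <string.h>

/* Same shape as swap() in the question: the size is only known at
   run time, so three real memcpy() calls of unknown length remain. */
static void swap_generic(void *t, void *a, void *b, size_t size)
{
    memcpy(t, a, size);
    memcpy(a, b, size);
    memcpy(b, t, size);
}

/* Hypothetical hand-written specialization for size == sizeof(int):
   a load and a store per element, no library call. */
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Small self-check: both versions exchange the two values. */
static void check_swaps(void)
{
    int x = 1, y = 2, tmp;

    swap_generic(&tmp, &x, &y, sizeof x);
    assert(x == 2 && y == 1);

    swap_int(&x, &y);
    assert(x == 1 && y == 2);
}
```

    Both functions do the same job; the second simply has nothing left for the optimizer to discover.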

    After adding

     size = sizeof(int); 

    at the beginning of erev(), both versions ran in the same time.
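
    A minimal sketch of that workaround, assuming erev() is only ever called on int arrays. The assert and the `char *` cast are additions for illustration and are not part of the original code; pinning the size this way would be wrong for any other element type.

```c
#include <assert.h>
#include <string.h>

static void swap(void *t, void *a, void *b, size_t size)
{
    memcpy(t, a, size);
    memcpy(a, b, size);
    memcpy(b, t, size);
}

/* erev() from ec with the element size pinned to a compile-time constant. */
void erev(void *a, void *t, int n, int l, int size)
{
    char *base = a;                     /* avoids void * arithmetic */

    assert(size == (int)sizeof(int));   /* assumption: int elements only */
    size = sizeof(int);                 /* the one-line change from the answer */

    while (l--) {
        int i, j;
        for (i = 0, j = n - 1; i < j; i++, j--)
            swap(t, base + (size_t)i * size, base + (size_t)j * size, size);
    }
}
```

    The caller does not change: it still passes sizeof(*a), and the assignment merely tells the compiler what that value is within this file.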