Which of the following versions runs faster when called repeatedly, and will VS2010 optimize any of them?


    int Function(){
        int Var;
        Var=2*2;
        return (Var);
    }


    int Function(){
        static int Var;
        Var=2*2;
        return (Var);
    }


    int Function(){
        return (2*2);
    }

The second question: which is more efficient, dereferencing a pointer or accessing an element by index?

The third question: in the code below, is there anything left to optimize, or is it already perfect? :)

    unsigned long Time2Sec(unsigned char *pbTimeString){
        unsigned long dwResult;
        dwResult=((*pbTimeString-'0')*10+(*(pbTimeString+1)-'0'))*3600+
                 ((*(pbTimeString+3)-'0')*10+(*(pbTimeString+4)-'0'))*60+
                 ((*(pbTimeString+6)-'0')*10+(*(pbTimeString+7)-'0'));
        return (dwResult);
    }

The fourth question: which of these is faster, given that the two conditions are mutually exclusive?


    if (condition_1) ...
    if (condition_2) ...


    if (condition_1) ...
    else if (condition_2) ...
  • Ideally, a[x] is equivalent to *(a + x), i.e. a read from the byte address a + x * sizeof(element_a). Think it through from there. - KoVadim
  • What do you mean by "memory shooting"? In this code I only read from memory, so what can happen even if I read from the wrong place?! - rejie
  • Anything from a simply wrong value to a blue screen. It depends on which pointer you pass in. "Memory shooting" is when code reads or writes addresses it has no business touching. - KoVadim
  • @rejie, pointer or index? The optimizer usually generates the same code for both. I don't know whether the code in the third question is perfect, but looking at the assembly produced by gcc -O3 -S, I don't see what could be improved. As for "memory shooting": IMHO, that is plain paranoia. My approach is simple: whoever passed bad data to a low-level function is to blame. In real life, input data should be checked where it enters the program from outside, not in the arguments of every function. - avp
  • If data is checked only at the outer boundary, those checks have to be duplicated at every place that makes the call. A bit of paranoia never hurts in production code, as opposed to an academic exercise. I believe every function must either produce a result or report that it cannot. Of course, nobody achieves 100% coverage. In this task it would still be better to use indexes: the code becomes more transparent. Also, do not confuse "perfect" code with "optimized" code. @rejie, please don't edit the post to pile several questions into one: it makes the existing answers look ridiculous. - KoVadim
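The idea that a function should "either produce a result or say that it cannot" could look roughly like the sketch below. The name Time2SecChecked, its return convention, the strict "HH:MM:SS" layout and the range limits on minutes and seconds are all assumptions made for illustration; none of them come from the original code.

```c
#include <stddef.h>

/* Hypothetical checked variant of Time2Sec (name and contract are assumptions).
   Accepts exactly "HH:MM:SS". Returns 1 and stores the result on success;
   returns 0 and leaves *pdwResult untouched on malformed input. */
int Time2SecChecked(const char *pszTime, unsigned long *pdwResult)
{
    unsigned long dwHours, dwMinutes, dwSeconds;
    int i;

    if (pszTime == NULL || pdwResult == NULL)
        return 0;

    /* Scan left to right: a '\0' fails the check at its own position,
       so a short string is never read past its terminator. */
    for (i = 0; i < 8; ++i) {
        if (i == 2 || i == 5) {
            if (pszTime[i] != ':')
                return 0;
        } else if (pszTime[i] < '0' || pszTime[i] > '9') {
            return 0;
        }
    }
    if (pszTime[8] != '\0')
        return 0;

    /* Index form, as suggested in the comments above. */
    dwHours   = (unsigned long)(pszTime[0] - '0') * 10 + (unsigned long)(pszTime[1] - '0');
    dwMinutes = (unsigned long)(pszTime[3] - '0') * 10 + (unsigned long)(pszTime[4] - '0');
    dwSeconds = (unsigned long)(pszTime[6] - '0') * 10 + (unsigned long)(pszTime[7] - '0');

    /* Assumed range limits; the original function did not enforce any. */
    if (dwMinutes > 59 || dwSeconds > 59)
        return 0;

    *pdwResult = dwHours * 3600 + dwMinutes * 60 + dwSeconds;
    return 1;
}
```

Whether these checks belong inside the function or at the point where the string enters the program is exactly the disagreement between avp and KoVadim; the sketch only shows what the second position costs in code.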

6 answers

The last option is the most likely to be fastest: the compiler can plainly see that the expression can be replaced with 4, or even inline the function.

Options 1 and 2 should ideally be optimized down to the third as well, but there is no guarantee the compiler will manage it. You need to look at the generated asm.

But I doubt this is exactly what you have in your code. There is probably more code, and that is what you need to look at. Keep in mind that manual optimizations of this kind can sometimes mislead the compiler and prevent it from generating more efficient code.

If you do want a ranking, I would sort them by speed as 3 >= 1 >= 2.

  • Statements like "the compiler sees" always bother me. Nobody knows what a particular compiler sees (unless you are one of its developers). - eigenein
  • Personally, I rely on experience and a seventh sense. - KoVadim
  • Can you guarantee that the specific compiler version the author uses behaves the same way as the ones your experience is based on? - eigenein
  • I gave no guarantees. Everything has to be checked on the spot by looking inside. But there are general trends. - KoVadim

Questions like this worry me. OP, does it really matter to you? What are you trying to gain?

The gray-haired elders taught: first find the bottleneck, then optimize. Your intentions betray the sin of premature optimization.

The correct answer to your question: whichever code is more readable and understandable.

  • I don't remember the exact words, but one of the greats argued that the real efficiency of a program is determined by its algorithms and the data structures that fit them. Still, it never hurts to have an idea of what high-level code gets translated into. - avp

I don't know about VS2010, but MinGW gcc on Windows XP produced the following code:

    c:/Documents and Settings/avp/src/hashcode $ gcc -O3 -S opt.c
    c:/Documents and Settings/avp/src/hashcode $ gcc --version
    gcc.exe (GCC) 3.4.5 (mingw-vista special r3)


        .file "opt.c"
        .text
        .p2align 4,,15
    .globl _Function1
        .def _Function1; .scl 2; .type 32; .endef
    _Function1:
        pushl %ebp
        movl $4, %eax
        movl %esp, %ebp
        popl %ebp
        ret
    .lcomm Var.0,16
        .p2align 4,,15
    .globl _Function2
        .def _Function2; .scl 2; .type 32; .endef
    _Function2:
        pushl %ebp
        movl $4, %eax
        movl %esp, %ebp
        popl %ebp
        movl %eax, Var.0
        movl $4, %eax
        ret
        .p2align 4,,15
    .globl _Function3
        .def _Function3; .scl 2; .type 32; .endef
    _Function3:
        pushl %ebp
        movl $4, %eax
        movl %esp, %ebp
        popl %ebp
        ret

It follows that options 1 and 3 are identical, and option 2 is slower because of the store to Var.0, even if you discard the obviously redundant movl $4, %eax before ret in Function2.

I wonder why the compiler (with optimization on!) still sets up an empty stack frame. Perhaps to allow post-mortem analysis?

  • Note once more (optimization!). The compiler discarded the unused variable in option 1 (the stack pointer does not move), so options 1 and 3 are identical. If you write volatile int Var; in option 1, you get

        _Function1:
            pushl %ebp
            movl %esp, %ebp
            subl $4, %esp
            movl $4, -4(%ebp)
            movl -4(%ebp), %eax
            leave
            ret

    See the difference? As for real speed: code like this cannot be measured reliably on a real system (as opposed to an emulator). - avp
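For reference, the two C variants being compared in that last comment can be sketched as follows. The function names are made up here so that both fit in one file. Both return 4, but volatile obliges the compiler to actually perform the store and the reload, which is why the stack slot shows up again in the listing.

```c
/* Plain local: the optimizer is free to fold 2*2 and drop Var entirely. */
int FunctionPlain(void)
{
    int Var;
    Var = 2 * 2;
    return Var;
}

/* volatile local: every access to Var counts as an observable side effect,
   so the write to the stack slot and the read back must both be emitted. */
int FunctionVolatile(void)
{
    volatile int Var;
    Var = 2 * 2;
    return Var;
}
```

Compiling this with gcc -O3 -S, as in the answer above, is one way to compare the two listings; the exact instructions will depend on the compiler and its version.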

If you enable optimization in the compiler flags, the code will be optimized. The result differs from case to case, up to the point where the function itself disappears and each call to it is replaced with a single instruction:

    mov somewhere, 4
  • And how do I view the asm code of a specific function in VS2010? Please share, if it's not too much trouble! - rejie
  • The easiest way is to load a binary compiled with optimization flags into IDA ( ru.wikipedia.org/wiki/IDA ). You will be surprised how unrecognizable the code becomes. Most likely the function will not be there at all, so follow the logic that the disassembled code describes (: - vv2cc
  • @vv2 The easiest way is the Disassembly window [1], especially since you can step through the asm code without losing breakpoints and with the source code in front of you. Buying IDA just to look at an assembly listing is, forgive me, overkill. [1]: msdn.microsoft.com/en-us/library/a3cwf295 - Costantino Rupert
  • @Kotyk_khohet_kusat, judging by your words, you have apparently rarely worked with well-optimized code. Some sections of the code simply will not be there, because they have been simplified and replaced with optimized variants that no longer match the C source. IDA is a personal preference, and you don't need to buy it: there are free versions. Freeware versions of IDA [1] [1]: hex-rays.com/products/ida/support/download_freeware.shtml - vv2cc

The first question has already been more or less chewed over. My opinion: 3 >= 1 > 2.

The second question: no difference. Behind the scenes, indexing performs the same address calculation and dereference.
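A small sketch of the two spellings, with function names invented for the example. In C, a[i] is defined as *(a + i), so both loops describe the same memory accesses; whether a given compiler emits identical instructions for them still has to be checked in the listing.

```c
#include <stddef.h>

/* Index form: a[i] is defined by the language as *(a + i). */
unsigned long SumByIndex(const unsigned char *a, size_t n)
{
    unsigned long sum = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* Pointer form: the same address arithmetic, written out by hand. */
unsigned long SumByPointer(const unsigned char *a, size_t n)
{
    unsigned long sum = 0;
    const unsigned char *p;
    for (p = a; p != a + n; ++p)
        sum += *p;
    return sum;
}
```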

Third question. The code is very hard to read. It would be better to split the sum into separate pieces and then add them up. Speed will not be affected, but readability will improve.
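One possible way to split it up, as a sketch rather than a definitive rewrite. The helper TwoDigits and the local variable names are invented for the example, and const was added to the parameter. Like the original, it assumes the caller passes a well-formed "HH:MM:SS" string and does no validation.

```c
/* Hypothetical helper: converts two adjacent ASCII digits to a number. */
static unsigned long TwoDigits(const unsigned char *p)
{
    return (unsigned long)(p[0] - '0') * 10 + (unsigned long)(p[1] - '0');
}

/* Same contract as the original: pbTimeString must point at "HH:MM:SS";
   no validation is performed. */
unsigned long Time2Sec(const unsigned char *pbTimeString)
{
    unsigned long dwHours   = TwoDigits(pbTimeString);      /* offsets 0..1 */
    unsigned long dwMinutes = TwoDigits(pbTimeString + 3);  /* offsets 3..4 */
    unsigned long dwSeconds = TwoDigits(pbTimeString + 6);  /* offsets 6..7 */

    return dwHours * 3600 + dwMinutes * 60 + dwSeconds;
}
```

An optimizing compiler will typically inline a small static helper like this, which fits the claim that speed is unaffected, but that too is something to confirm in the generated code.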

Fourth question. The version with else will be statistically faster. See the comments.

  • I think 4.2 (with else) is faster than 4.1. For speed, it is desirable that condition-1 holds more often on average than condition-2. - avp
  • Yes, you are right. In 4.1 the second condition is always checked, while in 4.2 it is checked only if the first one is false. - skegg
  • In fact, modern compilers can optimize mutually exclusive conditions, so version 4.1 will turn into 4.2 by itself whenever the mutual exclusion can be established at compile time. But option 4.2 is still better, because whoever reads the code sees immediately that only one of the blocks will be executed. - Shamov
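The difference skegg describes can be made visible with a counter. Everything in this sketch (the sign example, the names, the global counter) is invented for the demonstration. The conditions are wrapped in functions with a side effect so that the compiler cannot merge them the way Shamov describes for simple cases.

```c
/* Counts how many conditions get evaluated; exists only for this demo. */
static int g_nChecks;

static int IsNegative(int x) { ++g_nChecks; return x < 0; }
static int IsPositive(int x) { ++g_nChecks; return x > 0; }

/* Variant 4.1: two independent ifs, both conditions are always evaluated. */
int SignTwoIfs(int x)
{
    int r = 0;
    if (IsNegative(x)) r = -1;
    if (IsPositive(x)) r = 1;
    return r;
}

/* Variant 4.2: else if, the second condition is skipped when the first holds. */
int SignElseIf(int x)
{
    int r = 0;
    if (IsNegative(x)) r = -1;
    else if (IsPositive(x)) r = 1;
    return r;
}

/* Demo helper: returns how many condition checks a single call performed. */
int ChecksUsed(int (*fn)(int), int x)
{
    g_nChecks = 0;
    (void)fn(x);
    return g_nChecks;
}
```

For a negative argument, ChecksUsed reports two checks for SignTwoIfs and one for SignElseIf; for a positive argument both perform two, which is why avp suggests putting the more frequent condition first.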

A reliable answer can only be obtained by measuring the execution time of each option. However carefully you reason about it, the results of performance tests are often hard to predict.

  • So it turns out the programmer has to optimize code based not on his own knowledge but on whatever the compiler ends up producing? - rejie
  • The programmer must write normal code and not contort it to the point that even the compiler cannot make sense of it. - KoVadim
  • Yes. In the most general case, the programmer should write unoptimized, readable, architecturally sound code, then profile it to find the slowest places, and only then optimize, checking with the profiler each time whether anything was actually improved. Even different versions of the same compiler can compile the same code differently, and without measurements you cannot be sure the programmer's assumptions are correct. - eigenein
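A very rough way to take such a measurement with nothing but the standard library is sketched below. TimeCalls and FunctionConst are names invented for the example, and clock() reports CPU time at a coarse resolution, so this is only a starting point and not a substitute for a real profiler.

```c
#include <time.h>

/* Hypothetical micro-benchmark helper: calls fn nCalls times and returns
   the elapsed CPU time in seconds as reported by clock(). */
double TimeCalls(int (*fn)(void), unsigned long nCalls)
{
    volatile unsigned sink = 0;   /* keeps the results from being discarded */
    unsigned long i;
    clock_t start = clock();

    for (i = 0; i < nCalls; ++i)
        sink += (unsigned)fn();

    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* One of the variants from the question, used as the function under test. */
int FunctionConst(void)
{
    return 2 * 2;
}
```

A call such as TimeCalls(FunctionConst, 100000000UL) gives one number per variant to compare. Calling through a function pointer may prevent inlining, so the loop and call overhead can easily dominate a function this small, which matches the earlier remark that such code is hard to measure reliably.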