I've started learning assembly; I already know C/C++. So I decided to build on the knowledge I have and began disassembling code I had written to see how it works.

I immediately ran into something that seemed illogical.
If the instruction set has a dedicated instruction for incrementing a value, inc eax, then it ought to differ somehow from adding 1 to the register with add eax,1 .
I naturally assumed that if such dedicated instructions exist, they must be more efficient ... but look at what the compiler generates for i++ :

 mov eax, dword ptr [i]
 add eax, 1
 mov dword ptr [i], eax
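(A minimal sketch for context, not the exact source from the question: in an unoptimized MSVC build, the increment below typically compiles to the mov/add/mov sequence shown above.)

    int main(void)
    {
        int i = 0;
        i++;        /* debug build: mov eax, [i] / add eax, 1 / mov [i], eax */
        return i;
    }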

Which is better to use in this case: inc or add ?

P.S. I'm using Visual Studio 2008.

    4 answers

    I think the compiler was just being dumb. The point is probably that, in theory, the result of i++ could still be used later, and because of that it tries to keep the result in a register (eax). In fact, the compiler could have generated simply

     inc dword ptr [i] 

    Besides, that is clearly three instructions instead of one, and that is certainly slower (unless the value of i++ really is needed later).
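    Indeed, with optimizations enabled a compiler will usually turn a plain increment through a pointer into a single read-modify-write instruction. A minimal sketch (the function name is mine):

     /* With gcc/clang at -O2 on x86-64 this typically compiles to a single
        instruction such as `addl $1, (%rdi)` (or, equivalently, an incl),
        rather than a separate load, add and store. */
     void increment(int *p)
     {
         ++*p;
     }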

    As for a direct comparison of add eax,1 versus inc eax , I suspect there is no difference in speed, but the immediate 1 takes up space in the encoding, which means it takes up room in the code prefetch queue and may have a negative effect. Or may not.
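    For reference, the two do encode differently; these are the standard 32-bit encodings from the Intel manual (not from the answer above):

     inc eax        ; 40         -- one byte in 32-bit mode
     add eax, 1     ; 83 C0 01   -- three bytes: opcode, ModRM, imm8

    In 64-bit mode the bytes 40-4F are reused as REX prefixes, so there inc eax has to be encoded as FF C0 (two bytes); the 83 C0 01 form of the add is the same one visible in the dump below.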

    On the other hand, the compiler really does stubbornly use add eax,1 :

     #include <stdio.h>
     #include <stdlib.h>   /* atoi is declared here */

     int main(void)
     {
         int i = atoi("7");
         return ++i;
     }

    and it produces output with no inc at all:

     00000000004004b0 <main>:
       4004b0:  48 83 ec 08     sub    $0x8,%rsp
       4004b4:  bf bc 05 40 00  mov    $0x4005bc,%edi
       4004b9:  31 c0           xor    %eax,%eax
       4004bb:  e8 e8 fe ff ff  callq  4003a8 <atoi@plt>
       4004c0:  48 83 c4 08     add    $0x8,%rsp
       4004c4:  83 c0 01        add    $0x1,%eax

    On the other hand, the Java JIT compiler does sometimes use inc ; I have seen it myself.

    UPD for kirelagin: see the official Intel documentation, page 341.

    • So for any operation with ++ or -- the compiler acts the same way: it loads the value into eax, performs the operation, and writes it back to the variable, regardless of whether that variable is used in the next statement. Did I understand you correctly? - BogolyubskiyAlexey
    • I beg your pardon, but how else is the processor supposed to increment a number in memory? - kirelagin
    • Out of inexperience I missed that too. You mean the processor only works with registers? - BogolyubskiyAlexey
    • Well... at least that is my picture of processor architecture :). I can hardly imagine it performing an operation directly in memory. - kirelagin
    • I have not written anything in assembly in a million years, but I remember that this feature worked. Although it seems I was somewhat mistaken after all, because the address for inc has to be in a register. But it is definitely not the value itself (see the sketch after this list). - cy6erGn0m
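    For reference: x86 is not a load/store architecture, so read-modify-write instructions with a memory operand really do exist, and the address may come from a register or be a static location. A minimal sketch in MASM-style syntax (the variable name counter is mine):

     inc dword ptr [eax]      ; increment the dword at the address held in eax
     inc dword ptr [counter]  ; increment a dword at a fixed (static) address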

    The thing is that you are compiling in debug mode. Find the switch and set it to a release build)) And yes, you are right, the increment is faster. In a debug build everything is done in the most straightforward way possible, so that nobody gets confused; in a release build you will be surprised by how strange the instructions look (I experimented with this myself and was pleasantly surprised). The increment is faster because it is shorter, so less has to be read from memory to execute it. That is basically its main advantage. Debug mode gives itself away by the fact that the data is read from memory before the change and then written back, which takes noticeably longer than the operation itself. Plus experience, for I once ran into this and tracked down the cause ;)

    • You are right, thanks! - BogolyubskiyAlexey
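    To see the difference yourself, build the same fragment with optimizations off and on; /Od and /O2 together with /Fa (assembly listing) are the standard MSVC switches. A sketch; the comments describe the typical output, not a verbatim listing:

     /* compare:  cl /Od /Fa sketch.c   vs.   cl /O2 /Fa sketch.c */
     int increment(int i)
     {
         i++;        /* /Od: typically mov eax, [i] / add eax, 1 / mov [i], eax */
         return i;   /* /O2: typically folded into a single register add/inc/lea */
     }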

    Well, the questions you ask! These things are all extremely hardware-dependent!


    The general rule is simple: when it comes to generating assembly code, the compiler is always smarter (better informed) than you are. If it does it that way, then that way is better.


    Second thought: you are starting from a false premise. New machine instructions are not always introduced for computational efficiency. There is also such a thing as programmer efficiency: you must admit, inc eax is much easier to type than add eax,1 ;). And the operation is extremely common.

    Now, about why add is faster than inc . (Nothing I am about to write is necessarily true; it just seems that way to me.) The point is that inc , unlike add , does not change the state of the carry flag, while it does change some of the other flags. Since I do not know of any architecture where flags can be set individually rather than all at once, I venture the following: inc first has to read the current state of the carry flag (which creates a false dependency, followed by a stall), while add does not need to, because it overwrites the flag. Hence the performance gain.
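    To make the false dependency concrete, here is a small sketch (my own illustration, not from the answer):

     add eax, ebx   ; writes all the arithmetic flags, including CF
     inc ecx        ; writes OF/SF/ZF/AF/PF but leaves CF untouched
     adc edx, 0     ; reads CF: the CPU must combine the CF produced by the
                    ; add with the other flags produced by the inc, which on
                    ; some microarchitectures costs a partial-flags merge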

    P.S. Disclaimer again: I do not really understand this myself yet (check back in half a year; by then I will apparently be an expert in the area =)). I just heard/read something like this somewhere.

    • Thank you, a very interesting and, in my opinion, reasonable explanation, but I will wait for more answers; and the fact that i++ (such a frequently used thing) is not replaced with inc eax is interesting... - BogolyubskiyAlexey

    gcc (GCC) 3.4.5 (mingw-vista special r3)

    uses incl for i++ and j++ .

    When compiling with -O3 it keeps the variables entirely in registers. With gcc -S -O3 , this program:

     int a(int);     /* declarations added so the fragment compiles standalone */
     void b(int);

     int main(void)
     {
         int i = 0, j = 999;
         while (a(i)) {
             i++;
             j++;
             b(j);
         }
     }

    compiles (the while loop) to:

       xorl  %esi, %esi
       movl  $999, %ebx
       jmp   L2
       .p2align 4,,7
     L4:
       incl  %ebx            // j++
       incl  %esi            // i++
       movl  %ebx, (%esp)    // pass j to b()
       call  _b
     L2:
       movl  %esi, (%esp)    // pass i to a()
       call  _a
       testl %eax, %eax      // test the result returned in the eax register
       jne   L4

    About execution speed. Personally, it seems to me that in loop code of reasonable size, on a modern implementation of the x86 architecture, the two will, oddly enough at first glance, run at the same speed. This is due to instruction prefetch and the translation of instructions into the operations of a RISC-like processor core.