I don't know why, but the LEA assembly instruction keeps bothering me.

C++

    int f(int t)  { return t+1; }
    int f(int*t)  { return *t+1; }
    int f(int& t) { return t+1; }

Assembly

    f(int):                         # @f(int)
        lea eax, [rdi + 1]
        ret
    f(int*):                        # @f(int*)
        mov eax, dword ptr [rdi]
        inc eax
        ret
    f(int&):                        # @f(int&)
        mov eax, dword ptr [rdi]
        inc eax
        ret

The MOV instruction is clear as day, but the LEA instruction is not!

I know that the LEA instruction calculates the address of the second operand and writes it to the first operand (that's all I know)

In the example in question, lea eax, [rdi + 1] clearly does not look like an address calculation followed by a write to the first operand; most likely it does something else. Or am I misunderstanding something? Please explain with reference to the C++ code.

PS I searched but did not find an exhaustive answer to my question. I even looked in Kalashnikov's book and did not find this instruction there either ... ehh.


3 answers

 lea eax, [rdi+1] 

This instruction loads into eax the address of the value located at address rdi + 1. In other words, it simply loads rdi + 1 into eax.

It looks strange. To understand why lea is needed, and why it is better than simply using mov or computing the address by hand, you need to understand how instructions are encoded in memory and executed by the processor.

For example, take an instruction that reads a value:

    mov eax, [rdi+1] ; load the value at address "rdi + 1"

It is assembled into something like

    [mov opcode][flag: store into eax][flag: read from the address in rdi][+1]


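That is, into 66 67 8B 47 01. (The leading 66 and 67 bytes are operand-size and address-size override prefixes emitted by the assembler used for this listing; in 64-bit mode, mov eax, [rdi+1] encodes without them as 8B 47 01.)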

Suppose you need to get the address rdi+1 itself into eax.

You can do one of two things:

Compute it by hand:

    mov eax, rdi + 1 ; does not work, mov cannot add!

and you have to write:

    mov eax, edi
    inc eax        ; 66 40

That is, execute two instructions. This may be acceptable for a simple +1, but what about addresses like [bp+si+4]?

    mov ax, bp
    add ax, si
    add ax, 4      ; yes, ugly!

Or use lea:

 lea eax, [rdi+1] 


Its machine code is 66 67 8D 47 01. Compare this with the mov encoding, 66 67 8B 47 01: only the opcode differs, 8B -> 8D.
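To make the "only the opcode differs" point concrete, here is a small C++ sketch that compares the two byte sequences quoted in this answer and splits the shared ModRM byte 0x47 into its fields. The names `Bytes`, `ModRM`, `count_differences`, `decode_modrm`, and `check_encodings` are made up for illustration; the field layout (mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0) is the standard x86 one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 5>;

// Byte sequences exactly as quoted in the answer.
constexpr Bytes mov_bytes{0x66, 0x67, 0x8B, 0x47, 0x01};
constexpr Bytes lea_bytes{0x66, 0x67, 0x8D, 0x47, 0x01};

// Counts the positions at which the two encodings differ.
constexpr std::size_t count_differences(const Bytes& a, const Bytes& b) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++n;
    return n;
}

// Hypothetical helper type: the three fields packed into a ModRM byte.
struct ModRM {
    std::uint8_t mod;  // bits 7-6: 01 means "register + 8-bit displacement"
    std::uint8_t reg;  // bits 5-3: 000 selects eax as the destination
    std::uint8_t rm;   // bits 2-0: 111 selects (e/r)di as the base register
};

constexpr ModRM decode_modrm(std::uint8_t b) {
    return ModRM{static_cast<std::uint8_t>(b >> 6),
                 static_cast<std::uint8_t>((b >> 3) & 7),
                 static_cast<std::uint8_t>(b & 7)};
}

// Sanity checks; call this from main() to run them.
void check_encodings() {
    assert(count_differences(mov_bytes, lea_bytes) == 1);   // exactly one byte differs
    assert(mov_bytes[2] == 0x8B && lea_bytes[2] == 0x8D);   // and it is the opcode
    const ModRM m = decode_modrm(mov_bytes[3]);
    assert(m.mod == 1 && m.reg == 0 && m.rm == 7);          // eax, [di-family base + disp8]
}
```

Both instructions share the same operand description (ModRM byte plus the displacement 01), which is why the processor can reuse the same address computation for both.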

The processor has a ready-made, very efficient mechanism for basic address arithmetic. It already exists for the mov instruction, because mov can fetch a value from a computed address.

With lea, the processor does everything it does for mov but skips the last step, fetching the value from the address. Instead, it places the address itself in eax. This is much more convenient than computing things like rdi + 1 with separate instructions.
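The relationship between the two instructions can be sketched in C++ as a toy model. Everything here is an assumption made for illustration: memory is a small byte vector, registers are plain integers, mov is simplified to read a single byte, and the names `Memory`, `effective_address`, `mov_load`, `lea`, and `demo_mov_vs_lea` are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model, not real hardware: memory is a byte vector.
using Memory = std::vector<std::uint8_t>;

// Step shared by both instructions: base + index*scale + displacement.
std::size_t effective_address(std::size_t base, std::size_t index,
                              std::size_t scale, std::size_t disp) {
    return base + index * scale + disp;
}

// Model of mov with a memory operand: compute the address, then read memory.
std::uint8_t mov_load(const Memory& mem, std::size_t base, std::size_t index,
                      std::size_t scale, std::size_t disp) {
    return mem.at(effective_address(base, index, scale, disp));
}

// Model of lea: compute the address and stop there; memory is never touched.
std::size_t lea(std::size_t base, std::size_t index,
                std::size_t scale, std::size_t disp) {
    return effective_address(base, index, scale, disp);
}

// Sanity checks; call this from main() to run them.
void demo_mov_vs_lea() {
    const Memory mem{10, 20, 30, 40, 50, 60, 70, 80};
    const std::size_t rdi = 2;
    assert(mov_load(mem, rdi, 0, 1, 1) == 40);  // the value stored at address 3
    assert(lea(rdi, 0, 1, 1) == 3);             // the address itself
    assert(lea(1, 1, 4, 2) == 7);               // 1 + 1*4 + 2 in one step
    assert(lea(1000, 0, 1, 1) == 1001);         // fine even though address 1001 is not in mem
}
```

The last check is the important one: because `lea` never reads memory, the "address" does not have to be a valid address at all.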


What does this have to do with your example?

In your example, the parameter is in rdi, and the result must be returned in eax. Strictly speaking, the compiler would have to write

    mov eax, edi   ; 66 89 F8
    add eax, 1     ; 66 05 01 00 00 00

Well, for +1 you can use inc:

    mov eax, edi   ; 66 89 F8
    inc eax        ; 66 40

But these are still two instructions. The processor will execute them one after the other.

The compiler is smart. It knows that the processor can add small constants to register values while processing a lea instruction, so it emits a single instruction that produces the same result.

 lea eax, [rdi + 1] 

It doesn't matter that no address is actually loaded anywhere. What matters is that it produces the same result with one instruction instead of two :) Whether it is also faster depends on the processor; address calculation is not faster than ordinary addition on every microarchitecture.
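A hedged C++ model of why the single lea gives the same answer as the two-instruction version: in 64-bit mode the sum is formed from the full rdi, and only the low 32 bits end up in eax. The names `lea32`, `mov_then_add`, and `demo_lea_arithmetic` are illustrative, not real intrinsics, and the model allows any scale although the real instruction only accepts 1, 2, 4, or 8.

```cpp
#include <cassert>
#include <cstdint>

// Model of `lea eax, [base + index*scale + disp]` in 64-bit mode:
// the address is computed in 64 bits and truncated to 32 bits for eax.
constexpr std::uint32_t lea32(std::uint64_t base, std::uint64_t index,
                              std::uint64_t scale, std::uint64_t disp) {
    return static_cast<std::uint32_t>(base + index * scale + disp);
}

// Model of the two-instruction alternative.
constexpr std::uint32_t mov_then_add(std::uint64_t rdi) {
    std::uint32_t eax = static_cast<std::uint32_t>(rdi);  // mov eax, edi
    eax += 1;                                             // add eax, 1
    return eax;
}

// Sanity checks; call this from main() to run them.
void demo_lea_arithmetic() {
    assert(lea32(41, 0, 1, 1) == 42);                     // t + 1
    assert(lea32(41, 0, 1, 1) == mov_then_add(41));       // same result as mov + add
    // Whatever sits in the upper half of rdi is discarded by the truncation.
    assert(lea32(0xDEADBEEF00000029ull, 0, 1, 1) == 42);
    assert(lea32(0xFFFFFFFFull, 0, 1, 1) == 0);           // wraps like 32-bit addition
    assert(lea32(7, 7, 4, 0) == 35);                      // x*5 expressed as x + x*4
}
```

The last check hints at why compilers often reach for lea beyond simple increments: small multiplications and multi-term sums can fit into one address expression.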

  • I did not understand this phrase: "This instruction loads into eax the address of the value located at address rdi + 1. That is, it loads the value rdi+1 into eax." First you wrote that the address goes into eax, then that the value goes into eax. That is misleading. - MaximPro
  • @MaximPro fixed - PashaPash
  • As I understand it, LEA is an instruction intended for working with addresses. In effect, it can do nothing but transfer a value into the first operand, like MOV, except that the brackets on the second operand are mandatory. If so, then I understand the meaning of LEA. PS And the fact that LEA is used for calculations is a trick to reduce the number of instructions. - MaximPro
  • @MaximPro yes, right. In fact, the brackets are there to make the instruction look like mov, which performs the same calculation but also fetches the value at the resulting address. That is, it is a hack to make the two instructions look alike in code. - PashaPash
  • yes, assembly is hacks for the sake of hacks =) - MaximPro

This instruction loads into eax the address given on the right, i.e. rdi+1. It is a neat way to combine

    mov eax, edi
    inc eax

That is, in a sense, translated into C++, this is

 &*(rdi+1) 

:) That is, taking the address of the object located at address rdi+1.

See, for example, here.
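The `&*(rdi+1)` analogy can be tried out directly in C++. In this sketch rdi is modelled as a byte pointer so that "+ 1" means one byte, as in the assembly; the names `address_of_next` and `demo_address_of` are illustrative. Note that in `f(int)` from the question rdi holds a plain integer rather than a pointer, so this is only an analogy for what lea computes.

```cpp
#include <cassert>
#include <cstdint>

// Takes the address of the object at rdi + 1; the object itself is never read.
const std::uint8_t* address_of_next(const std::uint8_t* rdi) {
    return &*(rdi + 1);
}

// Sanity checks; call this from main() to run them.
void demo_address_of() {
    const std::uint8_t bytes[4] = {10, 20, 30, 40};
    const std::uint8_t* rdi = bytes;
    assert(address_of_next(rdi) == rdi + 1);  // identical to plain pointer arithmetic
    assert(*address_of_next(rdi) == 20);      // the explicit dereference is the part mov would add
}
```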

  • Unclear! Is this how we get the result of the calculations? return x+1 ? - MaximPro
  • In all the assembly listings, the passed value is in rdi. It is just that in the second and third functions rdi holds a pointer, and in the first it holds the value x itself. The return value of all the functions is in the eax register. - Harry
  • Yes, the instruction neither knows nor cares what it is computing ... it was told to put the effective address of the byte at [rdi + 1] into the eax register, and that is what it does. That is, it dumbly puts the value rdi + 1 into eax. Try doing that any other way in a single instruction, without affecting other registers, memory, or flags ... - Akina
  • @Akina Yes, and - "The difference between lea and mov is that the processor addressing block mechanism is used, not the arithmetic logic unit" ... - Harry

This instruction

    lea eax, [rdi + 1]

writes the value rdi + 1 into the eax register, where rdi holds the value of the argument.

In the other two cases, when the argument is passed by reference, or a pointer to the source argument is passed, the rdi register contains the address of the argument.

    f(int*):                        # @f(int*)
        mov eax, dword ptr [rdi]
        inc eax
        ret
    f(int&):                        # @f(int&)
        mov eax, dword ptr [rdi]
        inc eax
        ret

Therefore, the value of the argument is first loaded into the eax register from the address held in rdi

  mov eax, dword ptr [rdi] 

and then the value of the eax register is increased by 1.

  inc eax 

That is, the difference between the first function definition and the two that follow is that in the first case the rdi register contains a copy of the argument value, whereas in the last two cases it contains the address of the original argument.
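For anyone who wants to try the three overloads from the question, here is a minimal C++ sketch of how each one can be called. The function bodies are taken from the question; `demo_overloads` and `by_ref` are illustrative names, and the comments about registers describe the listing above rather than a guarantee for every compiler.

```cpp
#include <cassert>

int f(int t)  { return t + 1; }   // in the listing, the value itself arrives in rdi
int f(int* t) { return *t + 1; }  // in the listing, rdi holds the argument's address
int f(int& t) { return t + 1; }   // compiled to the same code as the pointer version

// Sanity checks; call this from main() to run them.
void demo_overloads() {
    int x = 41;
    assert(f(41) == 42);  // a literal cannot bind to int&, so f(int) is chosen
    assert(f(&x) == 42);  // f(int*)
    // f(x) would be ambiguous between f(int) and f(int&),
    // so the reference overload is selected through a function pointer.
    int (*by_ref)(int&) = f;
    assert(by_ref(x) == 42);
    assert(x == 41);      // none of the overloads modifies the original
}
```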

  • I did a little poking around with the assembler emulator and got tangled up at the end. I tried to create a named memory region and write the value there. Then I used the LEA command and it gave me the address of my memory area. But when I tried to use a register instead of a named memory area, I received a register value, not an address. Why is that? What is going on? Help me figure it out. PS I can show screenshots, but it seems to me that it is better to do this in the chat - MaximPro