How does the function stack actually work in C++?

Question

Dear colleagues! I want to understand how the stack works, using a function with three local variables as an example.

    void f2()
    {
        int B = 5;
        int *pB = &B;  // store the address of B in a pointer
        int &sB = B;   // bind a reference to the object B
        cout << "addres of B = " << &B << "\n";
        cout << "value of B = " << B << "\n";
        cout << "addres of pB = " << &pB << "\n";
        cout << "value of pB = " << pB << "\n";
        cout << "addres of sB = " << &sB << "\n";
        cout << "value of sB = " << sB << "\n";
    }

    int main()
    {
        f2();
        system("PAUSE");
        return 0;
    }

Output:

    addres of B = 0038F644
    value of B = 5
    addres of pB = 0038F638
    value of pB = 0038F644
    addres of sB = 0038F644
    value of sB = 5

1. The first and third lines of the output confuse me. I thought that, in the address space, a newer variable should get a larger address than an older one. In practice:

  `0038F638 - 0038F644 = -C;`

That is, a shift of 12 bytes backward, although a very experienced programmer told me that the shift would be 4 bytes forward, that is, the size of int in bytes.

2. The address of sB turned out to be exactly the address of B, and the value of sB is exactly the value of B! It seemed obvious to me that the value stored in sB should be the address of B, and that the address of sB should be shifted four bytes ( sizeof(int) ) forward relative to the address of the previous variable.

Questions:

What explains the observed behavior? How is the address of the next variable actually chosen relative to the previous one? Is it possible to print the stack more cleanly than I did? (I want to see the stack as entries with three fields: type, value, address.)

For a better understanding, you can generate the assembly code.
The resulting machine code may turn out to be quite different from what you expect.
I can also recommend reading "Modern Operating Systems" on how the stack works.
You cannot rely on this; the behavior is entirely up to the compiler.
What the compiler places on the stack, and how, is the compiler's own business.
There are also different architectures, different compiler flags, different optimizations... Also, the stack usually grows downward (which, by the way, is what your example shows).
The compiler is even allowed to move a variable from the stack into a register.
In general all of this is true; however, by observing a specific compiler you can always draw some conclusions about how it allocates variables on the stack.
In particular, g++ first (at higher addresses) allocates arrays, then variables with large alignment requirements (pointers, long long, double ...), and finally short and char.
As for the output, you can use a macro like this (call it once for each variable; typeid requires <typeinfo>):

    #define Info(v) std::cout << #v << "\thas type: " << typeid(v).name() << " value: [" << (v) << "] addr: " << (void *)&(v) << '\n'

Accepted Answer · 2017-02-17T16:59:18

In the address space, as I thought, the address of the newer variable should be larger than the old one.

On most "traditional" platforms the stack grows downward: from higher addresses to lower ones. Therefore, even if the compiler allocated your variables "in order", it would not be surprising for the address of a "newer" variable to be lower than the address of an "older" one.

But in fact, there is no order in the placement of individual variables, and your address comparisons are meaningless.

In the traditional implementation, the memory for all local variables of a function is allocated at once, as a single "stack frame", at the beginning of the function. At compile time the compiler works out a fixed map of where each local variable lives inside this stack frame. It can (and will) place local variables in this map in a completely arbitrary way, guided by optimization, alignment, memory savings, and so on. Therefore the order in which you declare variables means nothing at all, and the difference between the addresses of two "neighboring" variables can be anything, in both magnitude and sign.

Address sB - generally coincided with address B!

What is "the address of sB"? sB is a reference. A reference is not an object; conceptually it occupies no memory and has no address. There is no "address of sB" in C++, and there cannot be. After the declaration int &sB = B; the expression &sB yields exactly the address of B, which is what you observe. It is not surprising that the address of B coincides with the address of B.

How does (in fact) assign the address of the next variable relative to the previous one?

In no particular way. Whatever the compiler decides in a given case is what you get. These decisions are made by the compiler using logic that is not visible at the language level.

In C++, a defined order exists (can exist) only between the elements of one array or the fields of one class.

Is it possible to print the stack in a more literate way than I did? (I want to see a stack in the form of three entries - type - value - address).

At the language level, of course not.

Beyond that, you can study the format of the debugging information generated by your compiler and the APIs that your implementation provides (if it provides any) for accessing that information. Everything you want is there.

Even when I get an exhaustive answer to a difficult question, I always wait at least 6 hours in case someone adds something.
Some order in the placement of variables can be found for a particular compiler, but it is hard to think of a practical use for it.

Answer 2 · 2017-02-17T18:07:41

Let's take a quick look at how the compiler has allocated the variables. The original post does not say which compiler the author uses, so, as mathematicians say, without loss of generality we can assume that the compiler is gcc.

Save the code from the original post to the file prnlocal.cpp, adding

    #include <iostream>
    using namespace std;

and commenting out

    /* system("PAUSE"); */

Many people find Intel assembly syntax easier to read than AT&T syntax, so we will generate the assembly code with the command

 g++ -masm=intel -S prnlocal.cpp

We get the file prnlocal.s, in which we find the label _Z2f2v. This is how C++ mangles the name of the function f2, which takes no parameters (void).

    ....
    _Z2f2v:
    .LFB971:
        .cfi_startproc
        ......
        mov     DWORD PTR [rbp-28], 5      ; this is our initialization B = 5
        lea     rax, [rbp-28]
        mov     QWORD PTR [rbp-40], rax    ; this is the initialization pB = &B
        ....

Let's go through what is written here in more detail. On our architecture the stack is managed through two dedicated registers, rbp and rsp: rbp points to the base of the current stack frame, and rsp points to the top of the stack.

Our stack grows from higher addresses to lower ones, so under normal conditions rbp > rsp .

The first interesting instruction:

  mov DWORD PTR [rbp-28], 5

writes the value 5 into the 4 bytes at addresses [rbp-28 ... rbp-25]. Because x86-64 is little-endian, the number 5 ends up in the byte located at rbp-28 , and the remaining three bytes are set to zero.

The next instruction

  lea rax, [rbp-28]

will load the address [rbp-28] into the rax register.

And the third instruction stores the value of the rax register into the eight bytes [rbp-40 ... rbp-33]:

  mov QWORD PTR [rbp-40], rax

Interestingly, our program does not use the bytes at addresses [rbp-32 ... rbp-29]. This is a consequence of placing 64-bit variables at addresses that are multiples of eight, the so-called 8-byte alignment.

As a result, the stack looks like this:

             +0 +1 +2 +3 +4 +5 +6 +7
    rbp-40   44 F6 38 00 00 00 00 00
    rbp-32   ZZ ZZ ZZ ZZ 05 00 00 00
    rbp-24
    rbp-16
    rbp-08

Here the value of the variable pB is at rbp-40 , and the value of the variable B is at rbp-28 .

What lies between rbp-24 and rbp can also be worked out from the assembly code, but that is a separate story.
