Hello!

C++ has pointers: a pointer is a region of memory that holds an address, and the data itself lives at that address. I am interested in the following questions:

  1. When we allocate memory with operator new, does the compiler create some variable and hand back a pointer to it? It is not entirely clear what this looks like in memory.

  2. If we have a structure and create a pointer to an object of that structure type, what does all of this look like in memory? Is a single pointer created (and what size does it have?), or is there some tricky manipulation with pointers to pointers, hidden from the programmer, because the structure can have its own fields and methods that are accessed through it?

    2 answers

    1) The compiler inserts special code that calls a memory allocation function (malloc, for example, although nothing prevents the allocator from grabbing a large chunk of memory up front and then handing out pieces of it). That function returns a pointer, and its value is assigned to the variable. If necessary, initialization is then performed (that is, ordinary mov instructions).

    An example (highly synthetic! The memory is never released! Nothing is initialized! For illustration only!)

     int f(int x) {
         int * c = new int[10];
         return c[0];
     }

    we get roughly this code (for gcc)

     f(int):
             sub     rsp, 8
             mov     edi, 40                          # the size: 10 * sizeof(int)
             call    operator new[](unsigned long)    # the actual memory allocation
             mov     eax, DWORD PTR [rax]
             add     rsp, 8
             ret

    That is, the program itself allocates the 40 bytes of memory at run time. Yes, the program, not the compiler. The compiler only inserts a call to special functions (sometimes called built-in or "magic" functions).

    Of course, nothing stops the compiler from reserving the memory inside the binary itself and, instead of allocating, simply using a pointer to that region. In my synthetic example the compiler could, in principle, even get by with just 4 bytes.
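    For comparison, here is a minimal sketch of the same allocation without the shortcuts of the synthetic example: the array is zero-initialized and released. The function names first_of_new_array and array_bytes are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes requested for the array itself
// (40 when int is 4 bytes, as in the listing above).
constexpr std::size_t array_bytes() { return 10 * sizeof(int); }

// Allocates 10 ints, reads the first one, and frees the block.
int first_of_new_array() {
    int* c = new int[10]();  // "()" zero-initializes all ten elements
    int value = c[0];        // c holds the address of the first element
    delete[] c;              // unlike the synthetic example, release the memory
    return value;
}
```

    Calling first_of_new_array() returns 0 because of the zero-initialization; the pointer variable c itself lives on the stack (or in a register), and only the ten ints live in the dynamically allocated block.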

    2) In the case of a structure, the compiler calculates how much space the structure needs. This is usually at least sizeof(struct_name). That memory is allocated as a single block, and every field is assigned an offset within it.

     struct data {
         int a;
         int b;
         int * d;
     };

     void f(int x) {
         data * c = new data;
         c->a = 1;
         c->b = 2;
         c->d = 0;
     }

    we get the code

     f(int):
             mov     edi, 16                        # the size of the structure
             push    rax
             call    operator new(unsigned long)    # allocate the memory
             mov     DWORD PTR [rax], 1             # field a, located at offset 0
             mov     DWORD PTR [rax+4], 2           # field b
             mov     QWORD PTR [rax+8], 0           # the pointer takes 8 bytes, so the structure is 16 bytes
             pop     rdx
             ret

    Up to this point, nothing differs from classic C. But in C++, structures can have a constructor. In that case the compiler also inserts a call to it.
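    The field offsets from the listing above can also be checked from C++ itself. This is a rough sketch using offsetof; the helper names are invented for the illustration, and the concrete numbers (4, 8, 16) hold for a typical 64-bit x86 target, not for every platform.

```cpp
#include <cassert>
#include <cstddef>

struct data {
    int a;
    int b;
    int* d;
};

// The first field of a standard-layout struct sits at offset 0.
static_assert(offsetof(data, a) == 0, "a is at offset 0");

std::size_t offset_of_b() { return offsetof(data, b); }  // 4 on x86-64
std::size_t offset_of_d() { return offsetof(data, d); }  // 8 on x86-64

// A pointer to the struct is a single plain address: on mainstream
// platforms its size does not depend on what the struct contains.
bool struct_pointer_is_plain_address() {
    data* c = new data{1, 2, nullptr};
    bool same = sizeof(c) == sizeof(void*);
    delete c;
    return same;
}
```

    So only one pointer is created, and there are no hidden pointers to pointers: every field access is the base address plus a constant offset.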

    If a structure contains other structures, they are merged into a single layout. Another example:

     struct temp {
         int c;
         int d;
     };

     struct data {
         int a;
         int b;
         temp f;
     };

     void f(int x) {
         data * c = new data;
         c->a = 7;
         c->b = 2;
         c->f.d = 5;
     }

    and its code

     f(int):
             push    rax
             mov     edi, 16
             call    operator new(unsigned long)
             mov     DWORD PTR [rax], 7
             mov     DWORD PTR [rax+4], 2
             mov     DWORD PTR [rax+12], 5
             pop     rdx
             ret

    I deliberately chose distinct values so they are easy to match up. As you can see, the compiler simply embedded the nested structure inline, and at the machine-code level there is now effectively just one "structure":

     struct fake { int a; int b; int c; int d; }; 
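    The equivalence between the nested and the flattened layout can be sketched with offsetof as well. The helper names below are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>

struct temp { int c; int d; };
struct data { int a; int b; temp f; };
struct fake { int a; int b; int c; int d; };

// Offset of f.d inside data: offset of the nested struct
// plus the offset of d inside that nested struct.
std::size_t nested_d_offset() { return offsetof(data, f) + offsetof(temp, d); }

// Offset of d in the hand-flattened equivalent.
std::size_t flat_d_offset() { return offsetof(fake, d); }
```

    With 4-byte ints both functions return 12, which matches the [rax+12] store in the listing.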

    That leaves only the question of functions (methods) that structures can have. Everything is simple here. Typically, the compiler adds an implicit parameter to such functions, through which it passes a pointer to the structure. There is no need to modify the structure itself: as long as the functions are not virtual there is no polymorphism, so the type of the structure is always known at compile time and the correct call can be inserted directly.
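    The implicit parameter can be pictured with a small sketch. The struct counter and the free function counter_add are hypothetical names for this illustration; the free function is a hand-written approximation of what the compiler generates for the method, not its literal output.

```cpp
#include <cassert>

struct counter {
    int value;
    // Non-virtual method: the object's address arrives as the hidden 'this'.
    void add(int n) { value += n; }
};

// Hand-written equivalent with the hidden parameter made explicit.
void counter_add(counter* self, int n) { self->value += n; }

// Uses both forms on the same object and returns the final value.
int demo_total() {
    counter c{1};
    c.add(2);              // roughly: counter::add(&c, 2)
    counter_add(&c, 3);
    return c.value;
}
```

    Note that sizeof(counter) equals sizeof(int) here: a non-virtual method adds nothing to the object's memory.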

    • Oh, thanks! Where did you read about this? Could you recommend some literature? It's very interesting) - Alerr
    • I just look at assembly output a lot. It is purely practice. Books nowadays don't like to go into this in detail (at least I haven't seen such books). - KoVadim
    • I'll add my two cents. There is also the concept of memory alignment: the compiler tries to place each structure field at an offset that is a multiple of that field's alignment (on most platforms, its size). For example, in the structure struct { int a; /* 4 bytes */ short b; /* 2 bytes */ int c; /* 4 bytes */ } the fields a, b and c are placed at offsets 0, 4 and 8 bytes, respectively (and not 0, 4 and 6, as one might expect). Usually this behavior is harmless, but it can cause a lot of trouble when structures are mapped directly onto a file. - fori1ton
    • The author of the question seems to know plain C, so such details should be familiar to him (in plain C you can't get by without them). - KoVadim
    • @KoVadim "Books nowadays don't like to go into this in detail". It was a different matter in the old days, when compilers were bad and knowing assembly could significantly cut the time spent hunting for bugs, which the prefaces of assembly language books mentioned without hesitation. And who looks at a PMD (post-mortem dump) nowadays... - alexlz
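    The alignment remark from the comments can be checked with a short sketch. The struct name padded and the helper names are made up for the illustration; the offsets 0, 4, 8 are what a typical x86 compiler produces, not a guarantee for every platform.

```cpp
#include <cassert>
#include <cstddef>

struct padded {
    int a;    // 4 bytes
    short b;  // 2 bytes, usually followed by 2 bytes of padding
    int c;    // 4 bytes, must start at a multiple of alignof(int)
};

std::size_t offset_of_b() { return offsetof(padded, b); }
std::size_t offset_of_c() { return offsetof(padded, c); }

// Bytes inserted by the compiler between b and c.
std::size_t padding_after_b() {
    return offset_of_c() - (offset_of_b() + sizeof(short));
}
```

    When such a struct is written to a file byte for byte, the padding bytes go into the file too, which is the source of the mapping trouble mentioned above.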

    @Alerr, you are right: a pointer is a region of memory in which an address is stored. The size of that region depends on the computer's architecture and the OS.

    To be a bit more specific: in C/C++ on Linux, the pointer size equals sizeof(long). On 32-bit x86 it is 4 bytes, and on 64-bit x86 it is 8 bytes.
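    A quick way to see these sizes on a given machine is sketched below. The function names are invented for the illustration, and the equality with sizeof(long) is a property of Linux targets, not of C++ in general (64-bit Windows, for example, has a 4-byte long).

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Width of a data pointer in bits on the current target.
std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// True on Linux x86 and x86-64; not guaranteed elsewhere.
bool pointer_matches_long() { return sizeof(void*) == sizeof(long); }
```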

    Generally speaking, this memory always contains some bits (maybe all zeros, maybe all ones...), so in effect a pointer always points at something, although its current value may be invalid (it may not even belong to the process's address space). Such a pointer is usually called uninitialized.
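    Because a garbage address cannot be told apart from a valid one by looking at the bits, a common habit is to initialize pointers to nullptr and test before dereferencing. A minimal sketch, with hypothetical helper names:

```cpp
#include <cassert>

// Returns *p when p is non-null, otherwise the fallback value.
int read_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}

// Reads through a pointer to a local variable.
int read_local() {
    int x = 5;
    const int* p = &x;   // initialized with a valid address
    return read_or(p, 0);
}
```

    This only guards against null pointers; a pointer holding a stale or random address still cannot be detected this way.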

    The rest (points 1 and 2) has already been explained very well by @KoVadim.