File analysis for functionality

Question

There is a start function:

Here it is in the normal (text) view:

It calls the sub_410156 function (highlighted in yellow). Inside that function we see the following:

The same function:

The line

cmp word ptr [esi], 'ZM'

(the third line from the beginning of the function) compares the word at the address in esi with 'MZ', which means a PE file is being analyzed (a comparison with 'PE' can be seen further down).

Question: I cannot understand where the file data comes from.

My advice: do not use graph mode as the main view in IDA; switch to the normal (text) view with the space bar, and in the settings make sure that IDA does not switch to graph mode automatically.
Secondly, data is passed to the function through its parameters: in this case either on the stack or in the ecx register (see mov ecx, [ebp + ....] in the calling function).
A more precise answer will be possible if you add a screenshot of the text disassembly (rather than the graph).
Transitions are shown more clearly in graph mode, which is why I use it.
I am not interested in the mechanism of passing parameters to the function but in the data source: where does the information about the file being analyzed come from?

Accepted Answer · 2017-06-06T17:52:45

First part:

    push ebp
    call $+5
loc_410006:
    pop ebp

This is the typical way to obtain the actual address at which the code is running. The call targets the very next instruction, so the return address pushed onto the stack is the address of that instruction (loc_410006). The pop ebp instruction then removes this address from the stack and places it in the ebp register.

    mov ebx, ebp
    sub ebp, offset loc_410006

Here the program calculates the difference (delta) between the address where the code is actually located and the address where it was expected to be. When an executable is loaded into memory, the loader may place it not at the base address 0x400000 but at some other address (provided there is a relocation table, which in principle may be empty). So after pop ebp the ebp register may hold not 0x410006 but, for example, 0xC50006, in which case the difference is 0x840000. The delta is usually a round number in hexadecimal, with three zeros at the end, since section start addresses must be multiples of 4096 (0x1000).
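The arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function name load_delta is made up here, and the runtime address 0xC50006 is the example value from the paragraph above, not something read from the actual file.

```python
# Link-time address of the label loc_410006 (taken from the listing).
LINK_TIME_ADDR = 0x410006

def load_delta(runtime_addr, link_time_addr=LINK_TIME_ADDR):
    """Mimic `sub ebp, offset loc_410006` with 32-bit wraparound."""
    return (runtime_addr - link_time_addr) & 0xFFFFFFFF

# Image loaded where the linker expected it: the delta is zero.
print(hex(load_delta(0x410006)))
# The relocated example from the text.
print(hex(load_delta(0xC50006)))
```

The mask keeps the result within 32 bits, so the same helper also covers the case where the image ends up below its preferred address.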

loc_41000F:
    mov eax, 4096                   ; 0x1000
    add eax, 6                      ; 0x1006
    sub ebx, eax                    ; subtract 0x1006 from the actual address of loc_410006, giving something like 0x40f000 + delta
    mov [ebp+410961h], ebx          ; store that address at the actual address of some global variable
                                    ; most likely the same variable that is called varX inside your function
    mov edx, offset aGetmodulehandl ; "GetModuleHandleA"
    add edx, ebp                    ; again, compute the actual address of the string "GetModuleHandleA"
    mov ecx, [ebp+410C16h]          ; load the value of some other variable (uninitialized?) into ecx
    push ebp                        ; save the delta on the stack
    call sub_410156
    pop ebp                         ; and restore it afterwards
    cmp eax, -1                     ; check the value returned by the function
    jz short loc_41009C

As a result, just before the function is called, ecx holds the value of some variable and edx holds the address of the string "GetModuleHandleA".
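To make the register bookkeeping concrete, here is a rough Python model of the code from loc_410006 up to the call. It is a sketch under assumptions: STRING_LINK_ADDR is a hypothetical link-time address for aGetmodulehandl (the listing does not show the real one), and the memory read into ecx is left out because the contents of that variable are unknown.

```python
MASK = 0xFFFFFFFF
LABEL_LINK_ADDR = 0x410006    # loc_410006, from the listing
STRING_LINK_ADDR = 0x410A00   # hypothetical address of aGetmodulehandl

def registers_before_call(label_runtime_addr):
    """Return (ebp, ebx, edx) as they stand just before `call sub_410156`."""
    ebp = (label_runtime_addr - LABEL_LINK_ADDR) & MASK  # the delta
    ebx = (label_runtime_addr - (4096 + 6)) & MASK       # 0x40F000 + delta
    edx = (STRING_LINK_ADDR + ebp) & MASK                # actual string address
    return ebp, ebx, edx

# Unrelocated image, then the relocated example used earlier in the answer.
for addr in (0x410006, 0xC50006):
    print([hex(r) for r in registers_before_call(addr)])
```

With no relocation ebx comes out as exactly 0x40F000, which is the address examined in the rest of this answer.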

Position-independent code like this can be embedded in another program without adding entries to the relocation table.

Now for what is located at 0x40f000. A small experiment shows that the header of the executable file sits just before the first section. Here is the experimental code (assembled with flat assembler):

    format PE GUI 4.0
    include 'win32ax.inc'

    .code
    start:
        xor eax, eax
        mov ax, [start - 0x1000]
        mov dword [buf], eax
        invoke MessageBox, HWND_DESKTOP, buf, "Title", MB_OK
        invoke ExitProcess,0

    .data
    buf rb 4

    .end start

The message box displays MZ, which means the file header starts there. As for where this data comes from: as I understand it, the loader first reads the header into memory, then parses it, and then loads the file's sections. After the sections are loaded, the header remains in memory.

Inside sub_410156 there is a check for the MZ and PE signatures in the header (it is not clear why, given that the file was launched successfully), followed by what looks like a check of the imports from KERNEL32.DLL (at least in the test file, offset +128 from the start of the PE header is exactly where the Import Directory entry of the data directory is located).
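A check of this kind might look roughly like the following Python sketch. It is not a decompilation of sub_410156: the function name import_directory_entry and the synthetic header used to exercise it are invented for illustration. It assumes a 32-bit (PE32) image, where the Import Directory entry sits 128 bytes past the 'PE' signature, and it takes the offset of the PE header from the e_lfanew field at 0x3C.

```python
import struct

def import_directory_entry(image):
    """Return (rva, size) of the import directory, or None if not a PE32 image."""
    if len(image) < 0x40:
        return None
    # 0x5A4D is the bytes 'M', 'Z' read as a little-endian word;
    # IDA prints that constant as 'ZM'.
    (magic,) = struct.unpack_from("<H", image, 0)
    if magic != 0x5A4D:
        return None
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if pe_offset + 136 > len(image):
        return None
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    # Optional header magic 0x10B marks PE32; other layouts are not handled.
    (opt_magic,) = struct.unpack_from("<H", image, pe_offset + 24)
    if opt_magic != 0x10B:
        return None
    return struct.unpack_from("<II", image, pe_offset + 128)

# A tiny synthetic header (hypothetical values) to exercise the function.
header = bytearray(0x200)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x80)
header[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", header, 0x80 + 24, 0x10B)
struct.pack_into("<II", header, 0x80 + 128, 0x2000, 0x28)
print(import_directory_entry(bytes(header)))
```

The same function can be pointed at the first bytes of a real executable read from disk, since the file header has the same layout on disk as in memory.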

To analyze what the function checks in the header of the executable file, tick the "Manual load" checkbox when loading the file into IDA Pro and answer yes to every prompt. IDA then loads the file header as well, along with, for example, the import section in raw form and the overlay (the "tail" of the file that does not belong to any section), if there is one.

Answer 2 · 2017-06-06T08:16:51

I would venture to suggest that the call $+5 / pop ebp pair puts the address of an instruction in start into the ebp register, from which its RVA is then subtracted. As a result, ebp holds the load address of the module. As was rightly said above, graphs are evil. The debugger, on the other hand, is your friend: it lets you answer questions like this in a couple of minutes.
