After executing mul ebx, I get the product in edx:eax. I save this product on the stack: push edx, push eax. Later I need to multiply the current edx:eax by the product saved on the stack. How do I do this? If I just write mul [esp], then, as I understand it, the result will be wrong...
2 answers
You want to multiply two 64-bit numbers using 32-bit operations. Let's first look at it in decimal, letting one decimal digit stand for one 32-bit word:
54 * 68 = 5 * 6 * 100 + 5 * 8 * 10 + 4 * 6 * 10 + 4 * 8
The multiplications by 10 and 100 show exactly which word of the result each partial product lands in.
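The same scheme can be checked in binary. Below is a minimal sketch in C rather than assembly, so it can be compared against the compiler's own multiplication. As an assumption made only for this illustration, a "digit" here is 16 bits rather than 32, so that the whole result fits in a uint64_t; the function name mul32_schoolbook is made up.

```c
#include <stdint.h>

/* Schoolbook 32x32 -> 64 multiply built from 16-bit "digits".
   Same shape as 54 * 68, with base 2^16 in place of base 10. */
uint64_t mul32_schoolbook(uint32_t x, uint32_t y)
{
    uint32_t xh = x >> 16, xl = x & 0xFFFFu;   /* x = xh * 2^16 + xl */
    uint32_t yh = y >> 16, yl = y & 0xFFFFu;   /* y = yh * 2^16 + yl */

    uint64_t hh = (uint64_t)xh * yh;   /* weight 2^32, like 5 * 6 * 100 */
    uint64_t hl = (uint64_t)xh * yl;   /* weight 2^16, like 5 * 8 * 10  */
    uint64_t lh = (uint64_t)xl * yh;   /* weight 2^16, like 4 * 6 * 10  */
    uint64_t ll = (uint64_t)xl * yl;   /* weight 1,    like 4 * 8       */

    /* The shifts play the role of the multiplications by 100 and 10. */
    return (hh << 32) + (hl << 16) + (lh << 16) + ll;
}
```

For any inputs the return value should equal (uint64_t)x * y; the 64-bit case works the same way with 32-bit digits and a 128-bit result.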
Let's not discard the overflow, and write the multiplication so that it produces a full 128-bit result stored in memory. If the high part is not needed for some reason, you can simply not use or store it; the multiplication of the two high parts can then be skipped.
    res0 DD 0       ; bits 0-31 of the result
    res1 DD 0       ; bits 32-63
    res2 DD 0       ; bits 64-95
    res3 DD 0       ; bits 96-127
    ...
    ; edx:eax - first factor X
    ; ebx:ecx - second factor Y
    push edx        ; leave the high part of X alone for now
    push eax        ; save the low part for later
    mul ecx         ; low X * low Y
    mov res0, eax
    mov res1, edx   ; store the product
    pop eax         ; restore low X
    mul ebx         ; low X * high Y
    add res1, eax
    adc res2, edx   ; res2 = high part of the product + carry from the add into res1
    pop eax         ; high X into eax
    push eax        ; and save it again for later
    mul ecx         ; high X * low Y
    add res1, eax
    adc res2, edx   ; a carry can occur here again
    adc res3, 0     ; which we store in the topmost word of the result
    pop eax         ; restore high X
    mul ebx         ; high X * high Y
    add res2, eax
    adc res3, edx

Something like this. I can't give a 100% guarantee, since I have nothing to test it on.
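Since the assembly above is untested, here is a rough C model of the same four partial products and carry chain that can be used to cross-check it. This is a sketch, not a translation of the exact instruction sequence: it sums each 32-bit column in a 64-bit accumulator instead of using add/adc, and the name mul64x64_128 and the res[] array (standing in for res0..res3) are assumptions made for the example.

```c
#include <stdint.h>

/* 64x64 -> 128 multiply using only 32x32 -> 64 products.
   res[0] holds bits 0-31, res[3] holds bits 96-127. */
void mul64x64_128(uint64_t x, uint64_t y, uint32_t res[4])
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t yl = (uint32_t)y, yh = (uint32_t)(y >> 32);

    uint64_t ll = (uint64_t)xl * yl;   /* low X  * low Y  */
    uint64_t lh = (uint64_t)xl * yh;   /* low X  * high Y */
    uint64_t hl = (uint64_t)xh * yl;   /* high X * low Y  */
    uint64_t hh = (uint64_t)xh * yh;   /* high X * high Y */

    res[0] = (uint32_t)ll;

    /* Column 1: three 32-bit values, so the sum fits easily in 64 bits. */
    uint64_t col1 = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;
    res[1] = (uint32_t)col1;

    /* Column 2: carry out of column 1 plus the next three pieces. */
    uint64_t col2 = (col1 >> 32) + (lh >> 32) + (hl >> 32) + (uint32_t)hh;
    res[2] = (uint32_t)col2;

    /* Column 3: cannot overflow, because the full product is below 2^128. */
    res[3] = (uint32_t)((col2 >> 32) + (hh >> 32));
}
```

Running both versions on the same inputs and comparing res[0..3] with res0..res3 is one way to test the assembly. If only the low 64 bits are needed, hh and the last two columns can be dropped.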
Taking into account the comments on the question, we proceed as follows.
Multiply the low 32 bits of the first factor by the low 32 bits of the second. Then multiply the low 32 bits of the first factor by the high 32 bits of the second, and add the low half of that result to the high half of the previous product. Now symmetrically: multiply the high 32 bits of the first factor by the low 32 bits of the second, and add the low 32 bits of that result to the high 32 bits of the first product.
That makes 3 multiplications and 2 additions, giving the low 64 bits of the product. Schematically:
       a b
       c d
    -------
       b d
     a d
     b c
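The three-multiplication scheme can be sketched in C as follows, using the letters from the diagram: a and b are the high and low words of the first factor, c and d those of the second. This is an illustration under the assumption that only the low 64 bits of the product are wanted; the function name mul64_lo is made up.

```c
#include <stdint.h>

/* Low 64 bits of a 64x64 product from three 32-bit multiplications. */
uint64_t mul64_lo(uint64_t x, uint64_t y)
{
    uint32_t b = (uint32_t)x, a = (uint32_t)(x >> 32);   /* x = a:b */
    uint32_t d = (uint32_t)y, c = (uint32_t)(y >> 32);   /* y = c:d */

    uint64_t bd = (uint64_t)b * d;   /* full 64-bit product, as mul gives in edx:eax */
    uint32_t ad = a * d;             /* only the low 32 bits are kept */
    uint32_t bc = b * c;             /* only the low 32 bits are kept */

    /* Two additions into the high word; carries out of it are discarded. */
    uint32_t hi = (uint32_t)(bd >> 32) + ad + bc;

    return ((uint64_t)hi << 32) | (uint32_t)bd;
}
```

The a * c product is never computed because it only affects bits 64 and above. For any inputs the result should match the compiler's wrapping x * y on uint64_t.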
A single mul cannot multiply 64 bits by 64 bits, because the result is 128 bits wide. If you do not care about overflow, you might think it is enough to multiply the low 32 bits of one factor by the low 32 bits of the other. But that result will be wrong: only the low 32 bits of the 128-bit result will be correct. - Zealint