Which implementation will run faster in VC++?

Question

There is an array unsigned char B[2], and these two bytes need to be converted into a variable of type unsigned short. Which of the approaches below will do this faster?

unsigned short A=B[0]+B[1]*255;
unsigned short A; CopyMemory (&A,B,2);

And a second, additional question: what if you need to convert 8 bytes to a double? Which approach will win then? And the most important thing: never, ever repeat this experiment at home! :)

Cool, but not quite clear, and neither of the two questions is answered!
And then we interpret the 2-byte value as a number of type uint16_t.
The speed of both options will depend on the platform and on the implementation of the CopyMemory function.
On systems without hardware multiplication, or with slow multiplication, copying will win.
In the first variant: fetch B[1], multiply by 255, fetch B[0], add, store in A. Few operations.
With copying, a loop will cost some cycles unless the compiler unrolls it.
Plus the copy itself. If the function is not inlined, there is also the stack handling on entry and exit.

Accepted Answer · 2011-05-30T23:34:21

 unsigned short A=B[0]+B[1]*255;

This essentially reduces to B[0] + (B[1] << 8), i.e. two main operations: a shift and an addition. There is also a mistake in the multiplication: you need to multiply by 0x100, i.e. 256, not 255. If variable A really has to live in memory, add a store to memory. In practice, variable A will most likely be optimized away and the value taken straight from a processor register.
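A minimal sketch of the shift-and-add variant, assuming B[0] is the low byte as in the question. The function name combine_le is hypothetical and used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: combine two bytes into a 16-bit value,
// treating b[0] as the low byte and b[1] as the high byte.
inline std::uint16_t combine_le(const unsigned char* b) {
    // The parentheses matter: + binds tighter than <<, so
    // b[0] + b[1] << 8 would be parsed as (b[0] + b[1]) << 8.
    return static_cast<std::uint16_t>(b[0] + (b[1] << 8));
}
```

For the bytes {0x34, 0x12} this gives 0x1234, whereas multiplying by 255 instead of 256 would give 0x1222.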

 unsigned short A; CopyMemory (&A,B,2);

A function call. In total: setting up the stack (push / pop, loading registers, passing parameters), plus the code of the function itself. If the function is inlined, things are better, but the speed will probably still be worse than add + shift, since it works through memory, and that is harder for the compiler to optimize away.
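A sketch of the copying variant, with the standard std::memcpy assumed here as a stand-in for the Windows CopyMemory macro. The name load_u16 is hypothetical. Whether a real call is emitted depends on the compiler and its settings, so the assembly listing is the thing to check.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy two bytes into a 16-bit variable,
// the same idea as CopyMemory(&A, B, 2).
inline std::uint16_t load_u16(const unsigned char* b) {
    std::uint16_t a;
    // The bytes are copied as they lie in memory, so the numeric
    // result depends on the byte order of the machine.
    std::memcpy(&a, b, sizeof a);
    return a;
}
```

On a little-endian machine such as x86 this yields the same value as B[0] + (B[1] << 8).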

 uint16_t A = *(uint16_t*)&B[0];

and its variants. Only two moves (memory -> register, register -> memory). At best, it is optimized and the value of A will again be taken from a register afterwards, i.e. in fact a single read from memory and no writes to memory.

In general, you really have to look at the assembly listing. All compilers optimize nowadays, and they do not crudely translate one language statement into a fixed set of one or several processor instructions. Instead, the most efficient variant is chosen according to one of the criteria. There are essentially two of them: speed and size. And the optimization rules are different for each processor.

Regarding double:

 double A = *(double*)&B[0];

But with double I would be careful. The point is that integers are stored simply as integers, bit by bit, with the bytes one after another. So an expression like

 unsigned short A=B[0]+B[1]*256;
 unsigned long C=B[0]+(B[1]<<8)+(B[2]<<16)+((unsigned long)B[3]<<24);

works. The internal representation of a double is much more complicated: mantissa, exponent, sign... Ugh. Simply pulling out particular digits of a double is harder.
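To illustrate why a double cannot be assembled by simple byte arithmetic, here is a sketch that copies 8 bytes into a double and splits it into its fields. It assumes an 8-byte double in IEEE 754 binary64 format. The names load_double, DoubleParts and split_double are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");

// Hypothetical helper: reinterpret 8 bytes as a double by copying them.
inline double load_double(const unsigned char* b) {
    double d;
    std::memcpy(&d, b, sizeof d);
    return d;
}

// Fields of an IEEE 754 binary64 value (assumed format of double).
struct DoubleParts {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 11 bits, stored with a bias of 1023
    std::uint64_t mantissa;  // 52 bits, the leading 1 is not stored
};

// Hypothetical helper: split a double into sign, exponent and mantissa.
inline DoubleParts split_double(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    DoubleParts p;
    p.sign = static_cast<unsigned>(bits >> 63);
    p.exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    p.mantissa = bits & ((std::uint64_t(1) << 52) - 1);
    return p;
}
```

Under these assumptions, split_double(1.0) gives sign 0, exponent 1023 and mantissa 0, which shows that the stored bytes bear no simple positional relation to the decimal value.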

And if it is not too much trouble, could you explain how this line is constructed: double A = *(double*)&B[0];
double A: everything is clear here. *(double*)&B[0]: the asterisk to the left of the parenthesis is a dereference, that is also clear; it is only (double*) that is not entirely clear. &B[0] is the address of the first element of the array.
&B[0] is the address of the first (index 0) element of array B, i.e. in fact a pointer.
(double*)&B[0]: here we cast the pointer to type double*.
*(double*)&B[0]: and here we dereference the resulting pointer, i.e. we get a value of type double at the address of cell B[0].
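The same one-liner can be written out one step at a time. This is only a sketch: the function name read_double_step_by_step is hypothetical, and the cast assumes that the buffer is suitably aligned for double and that the compiler tolerates this kind of type punning.

```cpp
// Hypothetical helper: double A = *(double*)&B[0]; split into its steps.
inline double read_double_step_by_step(unsigned char* B) {
    unsigned char* first = &B[0];        // address of element 0 of the array
    double* as_double = (double*)first;  // same address, now typed as double*
    double A = *as_double;               // read sizeof(double) bytes from that address
    return A;
}
```

If the alignment or aliasing assumptions are in doubt, copying the bytes into a double variable is the more cautious route.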
