I have a dword value that represents an RGBA pixel. I need to add it to another pixel using saturating arithmetic, and to do this through the xmm registers by packing the dwords into 128 bits.
    unsigned int A0 = 0xFF99AA00;
    unsigned int B0 = 0xFF80AA00;
    unsigned int C0 = 0xFF60AA00;
    unsigned int D0 = 0xFF70AA00;
    unsigned int A1 = 0xFF90AA00;
    unsigned int B1 = 0xFF80AA00;
    unsigned int C1 = 0xFFB0AA00;
    unsigned int D1 = 0xFFC0AA00;
    _asm {
        /*........*/
        PADDUSB xmm1, xmm2
    }
How do I pack the 4 + 4 dwords so that they are added as 16 + 16 bytes? In place of /*........*/ I need the appropriate SSE2 assembler instructions. The sum of each pair of bytes must not exceed 255.
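For illustration, here is a minimal sketch of one possible approach, written with SSE2 intrinsics from <emmintrin.h> instead of inline assembler. The helper name add4_saturated is hypothetical. _mm_adds_epu8 corresponds to PADDUSB. _mm_set_epi32 takes its arguments from the highest dword to the lowest; compilers typically build it from MOVD and PUNPCKLDQ/PUNPCKLQDQ, or a single MOVDQU can be used if the four dwords already sit next to each other in memory.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: adds four RGBA pixels to four other pixels byte by
   byte, clamping every byte sum at 255 (unsigned saturation). */
static void add4_saturated(const uint32_t a[4], const uint32_t b[4],
                           uint32_t out[4])
{
    /* _mm_set_epi32 lists the dwords from highest to lowest,
       so a[0] ends up in the low 32 bits of the register. */
    __m128i va = _mm_set_epi32((int)a[3], (int)a[2], (int)a[1], (int)a[0]);
    __m128i vb = _mm_set_epi32((int)b[3], (int)b[2], (int)b[1], (int)b[0]);

    /* PADDUSB: 16 independent unsigned byte additions with saturation. */
    __m128i sum = _mm_adds_epu8(va, vb);

    /* MOVDQU: store all 16 bytes; out does not need to be aligned. */
    _mm_storeu_si128((__m128i *)out, sum);
}
```

With the values from the question (A0..D0 in a, A1..D1 in b), every output dword comes out as 0xFFFFFF00, because the three upper bytes of each pair overflow and are clamped to 0xFF.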
And another small question: how could I pack 16 chars, i.e. if each color component were stored in a separate variable?
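For the 16-char case, a possible sketch (again with intrinsics, and with hypothetical helper names pack16 and unpack16) is to use _mm_setr_epi8, which places its first argument in the lowest byte of the register. SSE2 has no single instruction that gathers 16 separate byte variables, so in assembler the simplest equivalent is probably to write the bytes into a 16-byte buffer and load it with MOVDQU.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: packs 16 separate byte variables into one xmm value.
   b0 becomes the lowest byte, b15 the highest. The casts to char only
   reinterpret the bit pattern on the usual x86 compilers. */
static __m128i pack16(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3,
                      uint8_t b4, uint8_t b5, uint8_t b6, uint8_t b7,
                      uint8_t b8, uint8_t b9, uint8_t b10, uint8_t b11,
                      uint8_t b12, uint8_t b13, uint8_t b14, uint8_t b15)
{
    return _mm_setr_epi8((char)b0, (char)b1, (char)b2, (char)b3,
                         (char)b4, (char)b5, (char)b6, (char)b7,
                         (char)b8, (char)b9, (char)b10, (char)b11,
                         (char)b12, (char)b13, (char)b14, (char)b15);
}

/* Hypothetical helper: copies the 16 bytes of v into out, lowest byte first. */
static void unpack16(__m128i v, uint8_t out[16])
{
    _mm_storeu_si128((__m128i *)out, v);
}
```

Two values packed this way can be added with _mm_adds_epu8 (PADDUSB) exactly as in the dword case, since the instruction only sees 16 bytes per register.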