x86 dsputil: provide SSE2/SSSE3 versions of bswap_buf
author Christophe Gisquet <christophe.gisquet@gmail.com>
Thu, 19 Jan 2012 20:48:39 +0000 (21:48 +0100)
committer Diego Biurrun <diego@biurrun.de>
Mon, 30 Jan 2012 09:19:55 +0000 (10:19 +0100)
commit 6b039003822a03add20c7ba91fc857dca52b0a03
tree 66ed7686c3377bce8accbed2fbc471c9a5931dbb
parent a846202343af7c56bf444ec47d4bb26a5d2b83ce

With SSSE3, pshufb allows emulating bswap on XMM registers; SSE2 needs
more shuffling to get the same result. Alignment is critical for speed,
so separate codepaths are provided for aligned and unaligned data.
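
As an illustration of the SSE2 approach described above, here is a minimal
sketch in C intrinsics rather than the yasm code this commit touches. The names
bswap32x4_sse2 and bswap_buf_sketch are hypothetical, and the shuffle sequence
is one possible way to do it, assumed for the example and not taken from the
patch. It swaps the 16-bit halves of each 32-bit word, then the bytes within
each half, and takes the aligned load/store path only when both pointers are
16-byte aligned. Without SSE2 it falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar reference: reverse the byte order of one 32-bit word. */
static uint32_t bswap32_scalar(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

#if defined(__SSE2__)
/* Hypothetical helper: byte-swap four 32-bit words held in one XMM register. */
static __m128i bswap32x4_sse2(__m128i v)
{
    /* 0xB1 = 10 11 00 01b: swap the two 16-bit halves of every 32-bit word. */
    v = _mm_shufflelo_epi16(v, 0xB1);
    v = _mm_shufflehi_epi16(v, 0xB1);
    /* Swap the two bytes inside every 16-bit half. */
    return _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
}
#endif

/* Hypothetical name; a sketch, not the function added by this commit. */
static void bswap_buf_sketch(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    if ((((uintptr_t)dst | (uintptr_t)src) & 15) == 0) {
        /* Both buffers are 16-byte aligned: use aligned loads and stores. */
        for (; i + 4 <= n; i += 4) {
            __m128i v = _mm_load_si128((const __m128i *)(src + i));
            _mm_store_si128((__m128i *)(dst + i), bswap32x4_sse2(v));
        }
    } else {
        /* At least one buffer is unaligned: use unaligned accesses. */
        for (; i + 4 <= n; i += 4) {
            __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
            _mm_storeu_si128((__m128i *)(dst + i), bswap32x4_sse2(v));
        }
    }
#endif
    /* Scalar tail, and the whole buffer when SSE2 is unavailable. */
    for (; i < n; i++)
        dst[i] = bswap32_scalar(src[i]);
}
```

With SSSE3, the two shuffles, two shifts and the OR in the helper could
collapse into a single byte shuffle (_mm_shuffle_epi8 with a constant mask),
which is the saving the message attributes to pshufb.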

For the huffyuv sequence "angels_480-huffyuvcompress.avi":
C (using bswap instruction): ~ 55k cycles
SSE2:                        ~ 40k cycles
SSSE3 using unaligned loads: ~ 35k cycles
SSSE3 using aligned loads:   ~ 30k cycles

Signed-off-by: Diego Biurrun <diego@biurrun.de>
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_yasm.asm