x86: float dsp: unroll SSE versions
author Christophe Gisquet <christophe.gisquet@gmail.com>
Fri, 14 Feb 2014 15:03:12 +0000 (15:03 +0000)
committer Janne Grunau <janne-libav@jannau.net>
Thu, 20 Feb 2014 13:18:05 +0000 (14:18 +0100)
commit 996697e266c8adc0ad9b7fc7568406c7529c97cf
tree 4b143794e0a28c92722d81a62b8c55b9cbd00cc1
parent ef010f08ae53479c54e2f16be5a7e1a809a9e268

vector_fmul and vector_fmac_scalar are guaranteed input lengths that are
multiples of 16 elements, but their SSE versions only process 8 at a time.

Therefore, unroll them a bit.
This goes from 299 to 261 cycles (about 13% fewer) for 256 elements in
vector_fmac_scalar on Arrandale/Win64.
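
The change described above can be modelled in portable C. This is only a
sketch, not the code in float_dsp.asm: fmac4 is a hypothetical helper that
stands in for one 4-float XMM operation, and fmac_scalar_by8 and
fmac_scalar_by16 are illustrative names for the loop before and after
unrolling. It assumes the usual vector_fmac_scalar semantics,
dst[i] += src[i] * mul.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one 4-float XMM operation:
 * dst[0..3] += src[0..3] * mul. */
static void fmac4(float *dst, const float *src, float mul)
{
    for (int i = 0; i < 4; i++)
        dst[i] += src[i] * mul;
}

/* Before: two 4-float groups (8 elements) per loop iteration. */
void fmac_scalar_by8(float *dst, const float *src, float mul, size_t len)
{
    assert(len % 8 == 0);
    for (size_t i = 0; i < len; i += 8) {
        fmac4(dst + i,     src + i,     mul);
        fmac4(dst + i + 4, src + i + 4, mul);
    }
}

/* After: four 4-float groups (16 elements) per loop iteration, which
 * halves the number of loop-counter updates and branches. This is only
 * valid because callers guarantee len is a multiple of 16. */
void fmac_scalar_by16(float *dst, const float *src, float mul, size_t len)
{
    assert(len % 16 == 0);
    for (size_t i = 0; i < len; i += 16) {
        fmac4(dst + i,      src + i,      mul);
        fmac4(dst + i + 4,  src + i + 4,  mul);
        fmac4(dst + i + 8,  src + i + 8,  mul);
        fmac4(dst + i + 12, src + i + 12, mul);
    }
}
```

Both loops compute the same result; the unrolled one simply spends fewer
instructions on loop overhead per element processed.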

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
libavutil/x86/float_dsp.asm