OSDN Git Service

android-x86/external-bluetooth-sbc.git
Johan Hedberg [Wed, 19 Oct 2011 08:09:13 +0000 (11:09 +0300)]
sbc: Reduce for-loop induced indentation in sbc_unpack_frame

Siarhei Siamashka [Mon, 17 Oct 2011 01:24:38 +0000 (04:24 +0300)]
sbc: overflow bugfix and audio decoding quality improvement

The "(((audio_sample << 1) | 1) << frame->scale_factor[ch][sb])"
part of expression
    "frame->sb_sample[blk][ch][sb] =
        (((audio_sample << 1) | 1) << frame->scale_factor[ch][sb]) /
        levels[ch][sb] - (1 << frame->scale_factor[ch][sb])"
in "sbc_unpack_frame" function can sometimes overflow 32-bit signed int.
This problem can be reproduced by first using bitpool 128 and encoding
some random noise data, and then feeding it to sbc decoder. The obvious
thing to do would be to change "audio_sample" variable type to uint32_t.

However the problem is a little bit more complicated. According
to the section "12.6.2 Scale Factors" of A2DP spec:
    scalefactor[ch][sb] = pow(2.0, (scale_factor[ch][sb] + 1))

And according to "12.6.4 Reconstruction of the Subband Samples":
    sb_sample[blk][ch][sb] = scalefactor[ch][sb] *
        ((audio_sample[blk][ch][sb]*2.0+1.0) / levels[ch][sb]-1.0);

Hence the current code for calculating "sb_sample[blk][ch][sb]" is
not quite correct, because it loses the least significant bit of
sample data and passes sample values that are half as large to the
synthesis filter (the filter also deviates from the spec to compensate
for this).
This all has quite a noticeable impact on audio quality. Moreover,
it makes sense to keep a few extra bits of precision here in order
to minimize rounding errors. So the proposed patch introduces a new
SBCDEC_FIXED_EXTRA_BITS constant and uses uint64_t data type
for intermediate calculations in order to safeguard against
overflows. This patch intentionally addresses only the quality
issue, but performance can also be improved later (for example by
replacing division with multiplication by a reciprocal).
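
The reconstruction described above can be sketched in C roughly as
follows. This is only an illustration, not the patch itself: the function
name reconstruct_sample, its parameter layout, and the value 2 chosen for
the extra-bits constant are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed value for illustration; the commit message only names the constant. */
#define SBCDEC_FIXED_EXTRA_BITS 2

/*
 * Fixed-point form of the spec formula
 *   sb_sample = 2^(scale_factor + 1) * ((audio_sample * 2 + 1) / levels - 1)
 * The shifted numerator can exceed 32 bits, so it is computed as uint64_t.
 * The result carries SBCDEC_FIXED_EXTRA_BITS extra fractional bits.
 */
static int32_t reconstruct_sample(uint32_t audio_sample, uint32_t levels,
                                  unsigned scale_factor)
{
    unsigned shift = scale_factor + 1 + SBCDEC_FIXED_EXTRA_BITS;
    uint64_t numerator = ((((uint64_t) audio_sample) << 1) | 1) << shift;

    return (int32_t) (numerator / levels) - (int32_t) (1 << shift);
}
```

With a 16-bit sample (levels = 65535) and scale factor 15, the numerator
needs about 35 bits, which is why a 32-bit intermediate is not enough.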

A test comparing the sbc encoding/decoding roundtrip against the
original audio file (joint stereo, bitpool 128, 8 subbands, using the
http://media.xiph.org/sintel/sintel-master-st.flac sample)
demonstrates some quality improvement:

=== before ===
    --- comparing original / sbc_encoder.exe + sbcdec ---
    stddev:    4.64 PSNR: 82.97 bytes:170495708/170496000
=== after ===
    --- comparing original / sbc_encoder.exe + sbcdec ---
    stddev:    1.95 PSNR: 90.50 bytes:170495708/170496000

Maarten Bosmans [Tue, 6 Sep 2011 07:40:44 +0000 (10:40 +0300)]
sbc: Use __asm__ keyword

There are two reasons for this change:

First: consistency. __asm__ was already used elsewhere in the files, so
using that throughout is cleaner.

Second: both asm and __asm__ are GCC-specific extensions, not defined in
the C standard. When compiling with --std=gnu99 both are recognized, but
with --std=c99 only __asm__ is recognized, which makes it perfectly
clear that you're not using a standard C99 construct but a GCC
extension.
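
As a small illustration of the second point, the following sketch should
compile with GCC or Clang under -std=c99, whereas the same code written
with plain asm would be rejected in that mode. The function name is
hypothetical and the empty assembly template does nothing except route
the value through a register.

```c
/* Returns x unchanged; the empty asm only forces x through a register. */
static int pass_through_register(int x)
{
    __asm__ ("" : "+r" (x));
    return x;
}
```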

Szymon Janc [Wed, 18 May 2011 06:42:47 +0000 (08:42 +0200)]
sbc: Fix empty parameter list in usage() declaration

Johan Hedberg [Sat, 14 May 2011 22:56:11 +0000 (01:56 +0300)]
sbc: Remove unused variable

Siarhei Siamashka [Mon, 28 Mar 2011 22:57:39 +0000 (01:57 +0300)]
sbc: better compatibility with ARM thumb/thumb2

ARM assembly optimizations fail to compile in thumb mode, but are fine
for thumb2. Update ifdefs in the code to make use of ARM assembly only
when it is safe and also make sure that no optimizations are missed
when compiling for thumb2.

The problem was reported by Paul Menzel:
https://tango.0pointer.de/pipermail/pulseaudio-discuss/2011-February/009022.html

Luiz Augusto von Dentz [Wed, 22 Dec 2010 09:35:48 +0000 (11:35 +0200)]
sbc: detect when bitpool has changed

The A2DP spec allows bitpool changes midstream, which is why the sbc
configuration has a range of bitpool values that the encoder can use and
the decoder must support.

Bitpool changes do not affect the state of the encoder/decoder, so they
do not need to be reinitialized when this happens and the impact is
fairly small. What does change is the frame length, so encoders may
change the bitpool to use the link more efficiently.
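
To illustrate how the bitpool drives the frame length, here is a sketch
of the frame length formula from the A2DP specification for the stereo
and joint stereo channel modes. The helper name and its parameters are
hypothetical and are not part of libsbc.

```c
#include <stddef.h>

/*
 * Frame length in bytes for stereo (joint = 0) or joint stereo (joint = 1):
 * a 4-byte header, 4 bits of scale factor per subband and channel, then
 * the joint flags (one bit per subband) plus blocks * bitpool sample bits.
 */
static size_t demo_frame_length_stereo(int subbands, int blocks,
                                       int bitpool, int joint)
{
    size_t header = 4 + (4 * subbands * 2) / 8;
    size_t bits = (joint ? subbands : 0) + blocks * bitpool;

    return header + (bits + 7) / 8;   /* round the bit count up to bytes */
}
```

For 8 subbands, 16 blocks and joint stereo, lowering the bitpool from 53
to 32 shrinks the frame from 119 to 77 bytes, with no other encoder
state involved.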

Keith Mok [Thu, 18 Nov 2010 13:33:16 +0000 (21:33 +0800)]
sbc: Add iwmmxt optimization for sbc for pxa series cpu

Add iwmmxt optimization for sbc for pxa series cpu.

Benchmarked on ARM PXA platform:
===  Before (4 bands) ====
$ time  ./sbcenc_orig  -s 4     long.au  > /dev/null
real    0m 2.44s
user    0m 2.39s
sys     0m 0.05s
===  After (4 bands) ====
$ time  ./sbcenc  -s 4     long.au  > /dev/null
real    0m 1.59s
user    0m 1.49s
sys     0m 0.10s

===  Before (8 bands) ====
$ time  ./sbcenc_orig   -s 8     long.au  > /dev/null
real    0m 4.05s
user    0m 3.98s
sys     0m 0.07s
===  After (8 bands) ====
$ time  ./sbcenc  -s 8     long.au  > /dev/null
real    0m 1.48s
user    0m 1.41s
sys     0m 0.06s

===  Before (a2dp usage) ====
$ time  ./sbcenc_orig   -b53 -s8 -j    long.au  > /dev/null
real    0m 4.51s
user    0m 4.41s
sys     0m 0.10s
===  After (a2dp usage) ====
$ time  ./sbcenc   -b53 -s8 -j    long.au  > /dev/null
real    0m 2.05s
user    0m 1.99s
sys     0m 0.06s

Siarhei Siamashka [Thu, 11 Nov 2010 09:29:42 +0000 (11:29 +0200)]
sbc: added "cc" to the clobber list of mmx inline assembly

In the case of scale factors calculation optimizations, the inline
assembly code has instructions which update flags register, but
"cc" was not mentioned in the clobber list. When optimizing code,
gcc theoretically is allowed to do a comparison before the inline
assembly block, and a conditional branch after it which would lead
to a problem if the flags register gets clobbered. While this is
apparently not happening in practice with the current versions of
gcc, the clobber list needs to be corrected.

As for the other inline assembly blocks: while a quick review suggests
it is most likely unnecessary, "cc" is also added to their clobber
lists because it should have no impact on performance in practice.
It's kind of cargo cult, but relieves us from the need to track the
potential updates of the flags register in all these places.
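
A minimal sketch of what such a clobber declaration looks like is shown
below. It is not taken from the SBC sources: the function is hypothetical,
the x86 add instruction merely stands in for any instruction that updates
the flags register, and other architectures fall back to plain C.

```c
/* Adds b to a. On x86 the add instruction modifies the flags register,
 * so "cc" is listed as clobbered to tell the compiler about it. */
static int add_with_flags_clobber(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
#else
    a += b;   /* portable fallback for other architectures */
#endif
    return a;
}
```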

Siarhei Siamashka [Fri, 2 Jul 2010 12:25:42 +0000 (15:25 +0300)]
sbc: ARMv6 optimized version of analysis filter for SBC encoder

The optimized filter gets enabled when the code is compiled
with -mcpu=/-march options set to target the processors which
support ARMv6 instructions. This code is also disabled when
NEON is used (which is a much better alternative). For additional
safety, ARM EABI is required and thumb mode should not be used.

Benchmarks from ARM11:

== 8 subbands ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m 35.65s
user    0m 34.17s
sys     0m 1.28s

$ time ./sbcenc.armv6 -b53 -s8 -j test.au > /dev/null

real    0m 17.29s
user    0m 15.47s
sys     0m 0.67s

== 4 subbands ==

$ time ./sbcenc -b53 -s4 -j test.au > /dev/null

real    0m 25.28s
user    0m 23.76s
sys     0m 1.32s

$ time ./sbcenc.armv6 -b53 -s4 -j test.au > /dev/null

real    0m 18.64s
user    0m 15.78s
sys     0m 2.22s

Siarhei Siamashka [Fri, 2 Jul 2010 12:25:41 +0000 (15:25 +0300)]
sbc: faster 'sbc_calculate_bits' function

By using SBC_ALWAYS_INLINE trick, the implementation of 'sbc_calculate_bits'
function is split into two branches, each having 'subband' variable value
known at compile time. It helps the compiler to generate more optimal code
by saving at least one extra register, and also provides more obvious
opportunities for loop unrolling.
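
The general shape of this trick can be sketched as follows. The macro
and function names are hypothetical stand-ins (the real code uses
SBC_ALWAYS_INLINE and does far more work per subband). The point is only
that each call site passes a literal, so the inlined copy sees a
compile-time constant loop bound.

```c
#define DEMO_ALWAYS_INLINE inline __attribute__((always_inline))

/* Generic body; 'subbands' becomes a constant in each inlined copy. */
static DEMO_ALWAYS_INLINE int demo_sum_internal(const int *bits, int subbands)
{
    int total = 0;

    for (int sb = 0; sb < subbands; sb++)
        total += bits[sb];
    return total;
}

/* Dispatcher: two branches, each with a known subband count. */
static int demo_sum(const int *bits, int subbands)
{
    if (subbands == 4)
        return demo_sum_internal(bits, 4);
    return demo_sum_internal(bits, 8);
}
```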

Benchmarked on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m3.989s
user    0m3.602s
sys     0m0.391s

samples  %        image name               symbol name
26057    32.6128  sbcenc                   sbc_pack_frame
20003    25.0357  sbcenc                   sbc_analyze_4b_8s_neon
14220    17.7977  sbcenc                   sbc_calculate_bits
8498     10.6361  no-vmlinux               /no-vmlinux
5300      6.6335  sbcenc                   sbc_calc_scalefactors_j_neon
3235      4.0489  sbcenc                   sbc_enc_process_input_8s_be_neon
2172      2.7185  sbcenc                   sbc_encode

== After: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m3.652s
user    0m3.195s
sys     0m0.445s

samples  %        image name               symbol name
26207    36.0095  sbcenc                   sbc_pack_frame
19820    27.2335  sbcenc                   sbc_analyze_4b_8s_neon
8629     11.8566  no-vmlinux               /no-vmlinux
6988      9.6018  sbcenc                   sbc_calculate_bits
5094      6.9994  sbcenc                   sbc_calc_scalefactors_j_neon
3351      4.6044  sbcenc                   sbc_enc_process_input_8s_be_neon
2182      2.9982  sbcenc                   sbc_encode

Siarhei Siamashka [Fri, 2 Jul 2010 12:25:40 +0000 (15:25 +0300)]
sbc: slightly faster 'sbc_calc_scalefactors_neon'

The previous variant was basically derived from the C and MMX
implementations. The new variant makes use of the 'vmax' instruction,
which is available in NEON and can do this job faster. The same method
for calculating scale factors is also used in
'sbc_calc_scalefactors_j_neon'.

Benchmarked without joint stereo on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m3.851s
user    0m3.375s
sys     0m0.469s

samples  %        image name               symbol name
26260    34.2672  sbcenc                   sbc_pack_frame
20013    26.1154  sbcenc                   sbc_analyze_4b_8s_neon
13796    18.0027  sbcenc                   sbc_calculate_bits
8388     10.9457  no-vmlinux               /no-vmlinux
3229      4.2136  sbcenc                   sbc_enc_process_input_8s_be_neon
2408      3.1422  sbcenc                   sbc_calc_scalefactors_neon
2093      2.7312  sbcenc                   sbc_encode

== After: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m3.796s
user    0m3.344s
sys     0m0.438s

samples  %        image name               symbol name
26582    34.8726  sbcenc                   sbc_pack_frame
20032    26.2797  sbcenc                   sbc_analyze_4b_8s_neon
13808    18.1146  sbcenc                   sbc_calculate_bits
8374     10.9858  no-vmlinux               /no-vmlinux
3187      4.1810  sbcenc                   sbc_enc_process_input_8s_be_neon
2027      2.6592  sbcenc                   sbc_encode
1766      2.3168  sbcenc                   sbc_calc_scalefactors_neon

Siarhei Siamashka [Fri, 2 Jul 2010 12:25:39 +0000 (15:25 +0300)]
sbc: ARM NEON optimizations for input permutation in SBC encoder

Using SIMD optimizations for 'sbc_enc_process_input_*' functions provides
a modest, but consistent speedup in all SBC encoding cases.

Benchmarked on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m4.389s
user    0m3.969s
sys     0m0.422s

samples  %        image name               symbol name
26234    29.9625  sbcenc                   sbc_pack_frame
20057    22.9076  sbcenc                   sbc_analyze_4b_8s_neon
14306    16.3393  sbcenc                   sbc_calculate_bits
9866     11.2682  sbcenc                   sbc_enc_process_input_8s_be
8506      9.7149  no-vmlinux               /no-vmlinux
5219      5.9608  sbcenc                   sbc_calc_scalefactors_j_neon
2280      2.6040  sbcenc                   sbc_encode
661       0.7549  libc-2.10.1.so           memcpy

== After: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m3.989s
user    0m3.602s
sys     0m0.391s

samples  %        image name               symbol name
26057    32.6128  sbcenc                   sbc_pack_frame
20003    25.0357  sbcenc                   sbc_analyze_4b_8s_neon
14220    17.7977  sbcenc                   sbc_calculate_bits
8498     10.6361  no-vmlinux               /no-vmlinux
5300      6.6335  sbcenc                   sbc_calc_scalefactors_j_neon
3235      4.0489  sbcenc                   sbc_enc_process_input_8s_be_neon
2172      2.7185  sbcenc                   sbc_encode

Siarhei Siamashka [Fri, 2 Jul 2010 12:25:38 +0000 (15:25 +0300)]
sbc: ARM NEON optimized joint stereo processing in SBC encoder

Improves SBC encoding performance when joint stereo is used, which
is a typical A2DP configuration.

Benchmarked on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m5.239s
user    0m4.805s
sys     0m0.430s

samples  %        image name               symbol name
26083    25.0856  sbcenc                   sbc_pack_frame
21548    20.7240  sbcenc                   sbc_calc_scalefactors_j
19910    19.1486  sbcenc                   sbc_analyze_4b_8s_neon
14377    13.8272  sbcenc                   sbc_calculate_bits
9990      9.6080  sbcenc                   sbc_enc_process_input_8s_be
8667      8.3356  no-vmlinux               /no-vmlinux
2263      2.1765  sbcenc                   sbc_encode
696       0.6694  libc-2.10.1.so           memcpy

== After: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m4.389s
user    0m3.969s
sys     0m0.422s

samples  %        image name               symbol name
26234    29.9625  sbcenc                   sbc_pack_frame
20057    22.9076  sbcenc                   sbc_analyze_4b_8s_neon
14306    16.3393  sbcenc                   sbc_calculate_bits
9866     11.2682  sbcenc                   sbc_enc_process_input_8s_be
8506      9.7149  no-vmlinux               /no-vmlinux
5219      5.9608  sbcenc                   sbc_calc_scalefactors_j_neon
2280      2.6040  sbcenc                   sbc_encode
661       0.7549  libc-2.10.1.so           memcpy

Johan Hedberg [Wed, 30 Jun 2010 08:55:11 +0000 (11:55 +0300)]
sbc: Fix signedness of libsbc parameters

The written parameter of sbc_encode can be negative so it should be
ssize_t instead of size_t.
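
The reason the type matters can be shown with a small stand-in. The
function below is hypothetical and is not the sbc_encode signature. It
only demonstrates that an error value reported through an out-parameter
needs a signed type such as ssize_t.

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical encoder stand-in: stores the number of bytes written in
 * *written, or a negative value if the output buffer is too small. */
static void demo_encode(size_t input_len, size_t output_len, ssize_t *written)
{
    if (output_len < input_len)
        *written = -1;                  /* only representable when signed */
    else
        *written = (ssize_t) input_len;
}
```

With size_t, the -1 would silently become a very large positive number
and a check such as "written < 0" could never be true.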

Siarhei Siamashka [Tue, 29 Jun 2010 13:48:47 +0000 (16:48 +0300)]
sbc: ARM NEON optimization for scale factors calculation

Improves SBC encoding performance when joint stereo is not used.
Benchmarked on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m4.756s
user    0m4.313s
sys     0m0.438s

samples  %        image name               symbol name
2569     27.6296  sbcenc                   sbc_pack_frame
1934     20.8002  sbcenc                   sbc_analyze_4b_8s_neon
1386     14.9064  sbcenc                   sbc_calculate_bits
1221     13.1319  sbcenc                   sbc_calc_scalefactors
996      10.7120  sbcenc                   sbc_enc_process_input_8s_be
878       9.4429  no-vmlinux               /no-vmlinux
204       2.1940  sbcenc                   sbc_encode
56        0.6023  libc-2.10.1.so           memcpy

== After: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m4.220s
user    0m3.797s
sys     0m0.422s

samples  %        image name               symbol name
2563     31.3249  sbcenc                   sbc_pack_frame
1892     23.1239  sbcenc                   sbc_analyze_4b_8s_neon
1368     16.7196  sbcenc                   sbc_calculate_bits
961      11.7453  sbcenc                   sbc_enc_process_input_8s_be
836      10.2176  no-vmlinux               /no-vmlinux
262       3.2022  sbcenc                   sbc_calc_scalefactors_neon
199       2.4322  sbcenc                   sbc_encode
49        0.5989  libc-2.10.1.so           memcpy

Siarhei Siamashka [Tue, 29 Jun 2010 13:48:46 +0000 (16:48 +0300)]
sbc: MMX optimization for scale factors calculation

Improves SBC encoding performance when joint stereo is not used.
Benchmarked on Pentium-M:

== Before: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m1.439s
user    0m1.336s
sys     0m0.104s

samples  %        image name               symbol name
8642     33.7473  sbcenc                   sbc_pack_frame
5873     22.9342  sbcenc                   sbc_analyze_4b_8s_mmx
4435     17.3188  sbcenc                   sbc_calc_scalefactors
4285     16.7331  sbcenc                   sbc_calculate_bits
1942      7.5836  sbcenc                   sbc_enc_process_input_8s_be
322       1.2574  sbcenc                   sbc_encode

== After: ==

$ time ./sbcenc -b53 -s8 test.au > /dev/null

real    0m1.319s
user    0m1.220s
sys     0m0.084s

samples  %        image name               symbol name
8706     37.9959  sbcenc                   sbc_pack_frame
5740     25.0513  sbcenc                   sbc_analyze_4b_8s_mmx
4307     18.7972  sbcenc                   sbc_calculate_bits
1937      8.4537  sbcenc                   sbc_enc_process_input_8s_be
1801      7.8602  sbcenc                   sbc_calc_scalefactors_mmx
307       1.3399  sbcenc                   sbc_encode

Siarhei Siamashka [Tue, 29 Jun 2010 13:48:45 +0000 (16:48 +0300)]
sbc: new 'sbc_calc_scalefactors_j' function added to sbc primitives

The code for scale factors calculation with joint stereo support has
been moved to a separate function. It can get platform-specific
SIMD optimizations later for best possible performance.

But even this change in C code improves performance because of the
use of __builtin_clz() instead of loops, similar to what was done
to sbc_calc_scalefactors earlier. Technically it also does loop
unrolling by processing two channels at once, which might be either
good or bad for performance (if register pressure is increased
and more data is spilled to memory). But the benchmark from a 32-bit
x86 system (Pentium-M) shows that it got clearly faster:

$ time ./sbcenc.old -b53 -s8 -j test.au > /dev/null

real    0m1.868s
user    0m1.808s
sys     0m0.048s

$ time ./sbcenc.new -b53 -s8 -j test.au > /dev/null

real    0m1.742s
user    0m1.668s
sys     0m0.064s

Gustavo F. Padovan [Sat, 5 Jun 2010 10:14:28 +0000 (07:14 -0300)]
sbc: Fix redundant null check on calling free()

Issues found by smatch static check: http://smatch.sourceforge.net/
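
For context, the C standard defines free(NULL) as a no-op, so a guard
such as "if (ptr) free(ptr);" adds nothing. A minimal sketch of the
simplified pattern, using a hypothetical context struct rather than the
real libsbc types:

```c
#include <stdlib.h>

struct demo_ctx {
    void *priv;
};

/* Releases the private data; safe to call even if priv is already NULL. */
static void demo_finish(struct demo_ctx *ctx)
{
    free(ctx->priv);    /* free(NULL) is defined to do nothing */
    ctx->priv = NULL;
}
```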

Johan Hedberg [Thu, 7 Jan 2010 09:02:51 +0000 (11:02 +0200)]
sbc: Update Nokia copyrights

Marcel Holtmann [Sat, 2 Jan 2010 01:08:17 +0000 (17:08 -0800)]
sbc: Update copyright information

Siarhei Siamashka [Fri, 17 Apr 2009 15:27:38 +0000 (18:27 +0300)]
sbc: added saturated clipping of decoder output to 16-bit

This prevents overflows and audible artefacts for audio files which
originally had loudness maximized. Music from audio CDs is an
example of such files, see http://en.wikipedia.org/wiki/Loudness_war
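
Saturated clipping of this kind is typically written along the following
lines. The helper name is an assumption for the example and the actual
decoder code may differ in detail.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate sample to the signed 16-bit range instead
 * of letting it wrap around when it is stored as int16_t. */
static inline int16_t demo_clip16(int32_t s)
{
    if (s > 0x7FFF)
        return 0x7FFF;
    if (s < -0x8000)
        return -0x8000;
    return (int16_t) s;
}
```

Without the clamp, a value such as 40000 would wrap to a large negative
number when truncated to 16 bits, which is heard as a click.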

Marcel Holtmann [Thu, 16 Apr 2009 23:55:42 +0000 (01:55 +0200)]
sbc: Do some coding style cleanups

Lennart Poettering [Mon, 23 Mar 2009 15:44:11 +0000 (16:44 +0100)]
sbc: fix up sbc.h prototypes to use const/size_t wherever applicable

Luiz Augusto von Dentz [Wed, 1 Apr 2009 13:47:39 +0000 (10:47 -0300)]
sbc: Remove unused variable.

Siarhei Siamashka [Mon, 16 Mar 2009 00:27:26 +0000 (02:27 +0200)]
sbc: ensure 16-byte buffer position alignment for 4 subbands encoding

The buffer position in the X array was not always 16-byte aligned.
Strict 16-byte alignment is required for PowerPC AltiVec SIMD
optimizations because AltiVec does not support unaligned vector
loads at all.

Luiz Augusto von Dentz [Fri, 20 Mar 2009 21:40:43 +0000 (18:40 -0300)]
sbc: Fix misuse of 'frame.joint' when estimating the frame length.

'frame.joint' is not the flag for joint stereo mode; it is a set of bits
which show for which subbands channel joining was actually used.
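
The distinction can be sketched like this. The struct, the enum and the
bit order (bit sb for subband sb) are hypothetical simplifications and do
not reproduce the real sbc_frame layout.

```c
#include <stdint.h>

enum demo_mode { DEMO_MONO, DEMO_DUAL_CHANNEL, DEMO_STEREO, DEMO_JOINT_STEREO };

struct demo_frame {
    enum demo_mode mode;   /* channel mode of the stream */
    uint8_t joint;         /* per-subband bits: joining used in this frame */
};

/* The right question when estimating the frame length. */
static int demo_is_joint_stereo(const struct demo_frame *f)
{
    return f->mode == DEMO_JOINT_STEREO;
}

/* What the bitmask actually answers. */
static int demo_subband_joined(const struct demo_frame *f, int sb)
{
    return (f->joint >> sb) & 1;
}
```

A joint stereo frame in which no subband happened to be joined has
joint == 0, so testing the bitmask as if it were the mode flag gives the
wrong frame length.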

Johan Hedberg [Thu, 12 Mar 2009 19:33:14 +0000 (16:33 -0300)]
sbc: Fix a couple of other places that should use size_t and ssize_t

Marc-André Lureau [Tue, 17 Feb 2009 20:46:41 +0000 (22:46 +0200)]
sbc: don't dereference sbc pointer if NULL

Marc-André Lureau [Mon, 16 Feb 2009 13:59:51 +0000 (15:59 +0200)]
sbc: provide implementation info as a readable string

This is mainly useful for logging and debugging.

Lennart Poettering [Mon, 2 Feb 2009 00:57:14 +0000 (01:57 +0100)]
sbc: make check_mmx_support() a proper C function

Signed-off-by: Lennart Poettering <lennart@poettering.net>
Marcel Holtmann [Thu, 29 Jan 2009 23:02:58 +0000 (00:02 +0100)]
sbc: Fix SBC to compile cleanly with -Wsign-compare

Siarhei Siamashka [Thu, 29 Jan 2009 16:15:31 +0000 (18:15 +0200)]
sbc: Fix for SBC encoding with block sizes other than 16

Thanks to Christian Hoene for finding and reporting the
problem. This regression was introduced in commit
19af3c49e61aa046375497108e05a3a0605da158

Marcel Holtmann [Thu, 29 Jan 2009 16:32:58 +0000 (17:32 +0100)]
sbc: Add -Wno-sign-compare for the library and fix the other warnings

Siarhei Siamashka [Thu, 29 Jan 2009 00:17:36 +0000 (02:17 +0200)]
sbc: SBC encoder scale factors calculation optimized with __builtin_clz

A count-leading-zeros operation is often implemented with a dedicated
instruction on various architectures (at least this is true for ARM
and x86). Using the __builtin_clz gcc intrinsic makes it possible to
eliminate the innermost loop in the scale factors calculation and
improve performance. The scale factors calculation can also be
optimized further using SIMD instructions.
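
The idea can be illustrated with a simplified sketch that finds the
index of the highest set bit, first with a loop and then with
__builtin_clz. The function names are hypothetical, the fixed-point
offsets used by the real scale factor code are left out, and a 32-bit
unsigned int is assumed.

```c
#include <stdint.h>

/* Loop version: shift until nothing is left. */
static int demo_highest_bit_loop(uint32_t x)
{
    int n = 0;

    while (x >>= 1)
        n++;
    return n;
}

/* clz version: no loop. The "| 1" avoids calling __builtin_clz(0),
 * whose result is undefined, without changing any other answer. */
static int demo_highest_bit_clz(uint32_t x)
{
    return 31 - __builtin_clz(x | 1);
}
```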

Siarhei Siamashka [Tue, 27 Jan 2009 16:57:35 +0000 (18:57 +0200)]
sbc: Performance optimizations for input data processing in SBC encoder

Channel deinterleaving, endian conversion and sample reordering
are done in one pass, avoiding the use of an intermediate buffer.
This code is also implemented as a new "performance primitive", which
allows further platform-specific optimizations (ARMv6 and ARM NEON
should gain quite a lot from assembly optimizations here).
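
A reduced sketch of the single-pass idea, covering only deinterleaving
and endian conversion for big-endian 16-bit stereo input, might look
like this. The function name and buffer layout are assumptions. The
sample reordering done by the real primitive is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Splits interleaved big-endian S16 stereo bytes into two native-endian
 * channel arrays in a single pass, with no intermediate buffer. */
static void demo_deinterleave_s16be(const uint8_t *pcm, size_t frames,
                                    int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frames; i++) {
        const uint8_t *p = pcm + 4 * i;

        left[i] = (int16_t) ((p[0] << 8) | p[1]);
        right[i] = (int16_t) ((p[2] << 8) | p[3]);
    }
}
```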

Siarhei Siamashka [Wed, 21 Jan 2009 19:08:34 +0000 (21:08 +0200)]
sbc: Use of -funroll-loops option to improve SBC encoder performance

Added the use of the -funroll-loops gcc option for SBC. Also, to
get a better effect, the 'sbc_pack_frame' function body was moved
to an inline function, which gets instantiated for 4 different
subbands/channels combinations, so that the 'frame_subbands' and
'frame_channels' arguments become compile-time constants and can
be better optimized by the compiler.

Siarhei Siamashka [Wed, 21 Jan 2009 22:12:40 +0000 (00:12 +0200)]
sbc: Audio quality improvement for 16-bit fixed point SBC encoder

Multiplying the first part of the analysis filter constant tables
by some coefficients and dividing the second part by the same
coefficients is a transformation which should produce the same
results if rounding errors are not taken into account. These
additional C0/C1/... coefficients can be varied in a certain
range (the requirement is that we still do not get overflows).
The 'magic' values for these coefficients are selected in such
a way that the rounding errors are minimized (rounding errors
are unavoidable when putting all the floating constants into
16-bit tables and losing some of the fractional part).

Also, the non-SIMD variant of the analysis filter is dropped because
keeping it would require applying a similar change to its tables,
which is a bit tricky and just increases maintenance overhead.

Siarhei Siamashka [Sun, 18 Jan 2009 21:10:00 +0000 (23:10 +0200)]
sbc: Fix sbcenc breakage when au file header size is larger than 24 bytes

Siarhei Siamashka [Fri, 16 Jan 2009 15:23:54 +0000 (17:23 +0200)]
sbc: Performance optimizations for sbcenc utility

Read and write buffer sizes increased, memmove overhead eliminated.
The nonportable cast from 'unsigned char *' to 'struct au_header *' is
now also resolved as part of the changes.
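
A portable alternative to such a cast is to assemble each big-endian
header field from individual bytes, which avoids both alignment and
host-endianness problems. The sketch below uses hypothetical helper
names and is not the sbcenc code. The only facts it relies on are that
AU headers start with the magic ".snd" followed by the data offset, both
stored as big-endian 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_AU_MAGIC 0x2e736e64u   /* ".snd" */

/* Reads a big-endian 32-bit value without any pointer casts. */
static uint32_t demo_read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Returns the offset of the audio data, or 0 if this is not an AU header.
 * The offset may be larger than the 24-byte minimum header. */
static uint32_t demo_au_data_offset(const unsigned char *buf, size_t len)
{
    if (len < 24 || demo_read_be32(buf) != DEMO_AU_MAGIC)
        return 0;
    return demo_read_be32(buf + 4);
}
```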

Siarhei Siamashka [Sat, 17 Jan 2009 18:30:40 +0000 (20:30 +0200)]
sbc: Coding style fixes

Johan Hedberg [Fri, 16 Jan 2009 18:29:43 +0000 (20:29 +0200)]
sbc: Fix indentation to use only tabs

Siarhei Siamashka [Thu, 15 Jan 2009 18:25:49 +0000 (20:25 +0200)]
sbc: MMX and ARM NEON optimized versions of analysis filter for SBC encoder

Siarhei Siamashka [Thu, 15 Jan 2009 17:45:36 +0000 (19:45 +0200)]
sbc: SBC arrays and constant tables aligned at 16 byte boundary for SIMD

Most SIMD instruction sets benefit from data being naturally aligned.
And even if it is not strictly required, performance is usually better
with the aligned data. ARM NEON and SSE2 have different instruction
variants for aligned/unaligned memory accesses.
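
With GCC, alignment of this kind is usually requested through the
aligned attribute. The macro and table below are hypothetical examples,
not the actual SBC tables.

```c
#include <stdint.h>

#define DEMO_ALIGNED __attribute__((aligned(16)))

/* Constant table placed on a 16-byte boundary for SIMD loads. */
static const int16_t demo_table[8] DEMO_ALIGNED = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* Returns nonzero if the pointer is 16-byte aligned. */
static int demo_is_aligned16(const void *p)
{
    return ((uintptr_t) p & 15) == 0;
}
```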

Siarhei Siamashka [Thu, 15 Jan 2009 17:11:23 +0000 (19:11 +0200)]
sbc: SIMD-friendly variant of SBC encoder analysis filter

Added SIMD-friendly C implementation of SBC analysis filter (the
structure of code had to be changed a bit and constants in the
tables reordered). This code can be used as a reference for
developing platform specific SIMD optimizations. These functions
are put into a new file 'sbc_primitives.c', which is going to
contain all the basic stuff for SBC codec.

Siarhei Siamashka [Wed, 7 Jan 2009 12:28:48 +0000 (14:28 +0200)]
sbc: Fix for big endian problems in SBC codec

Christian Hoene [Mon, 5 Jan 2009 12:26:08 +0000 (13:26 +0100)]
sbc: Fixed correct handling of frame sizes in the encoder

Siarhei Siamashka [Sun, 4 Jan 2009 01:11:12 +0000 (03:11 +0200)]
sbc: Use of constant shift in SBC quantization code to make it faster

The result of a 32x32->64 unsigned multiplication is returned
in two registers (high and low 32-bit parts) on many 32-bit
architectures. For these architectures, a constant right shift by
32 bits is optimized by the compiler to just taking the high
32-bit part. Also, some data needed at the quantization stage is
precalculated beforehand to improve performance.
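
The pattern in question looks like the following sketch. The helper name
is hypothetical and the operands in the real quantization code are of
course different.

```c
#include <stdint.h>

/* High 32 bits of a 32x32->64 unsigned product. On many 32-bit targets
 * the shift costs nothing: the compiler just uses the high register. */
static uint32_t demo_mul_hi32(uint32_t a, uint32_t b)
{
    return (uint32_t) (((uint64_t) a * b) >> 32);
}
```

Casting one operand to uint64_t before multiplying is what keeps the
full 64-bit product; multiplying two uint32_t values directly would
discard the high half.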

Marcel Holtmann [Thu, 1 Jan 2009 18:33:20 +0000 (19:33 +0100)]
sbc: Update copyright information

Siarhei Siamashka [Wed, 31 Dec 2008 07:14:25 +0000 (09:14 +0200)]
sbc: Added possibility to analyze 4 blocks at once in SBC encoder

This change is needed for SIMD optimizations which will follow
shortly. And even for non-SIMD capable platforms it may still
be useful to have the possibility to merge several analyzing functions
together into one for better code scheduling or reusing loaded
constants. Also, analysis filter functions are now called using
function pointers, which allows the default implementation to be
overridden at runtime (with a high precision variant or MMX/SSE2/NEON
optimized code).
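
A stripped-down sketch of this kind of runtime dispatch is shown below.
All names are hypothetical and the "analysis" functions are trivial
placeholders, so it shows only the mechanism, not the real encoder
state.

```c
/* Signature shared by all implementations of the placeholder operation. */
typedef int (*demo_analyze_fn)(int x);

static int demo_analyze_generic(int x) { return x * 2; }
static int demo_analyze_fast(int x)    { return x + x; }

struct demo_state {
    demo_analyze_fn analyze;   /* selected once, called many times */
};

/* Install the default, then override it if a better variant is usable. */
static void demo_init(struct demo_state *s, int have_simd)
{
    s->analyze = demo_analyze_generic;
    if (have_simd)
        s->analyze = demo_analyze_fast;
}
```

Callers always go through s->analyze, so adding another optimized
variant only touches the init function.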

Siarhei Siamashka [Sun, 28 Dec 2008 01:22:59 +0000 (03:22 +0200)]
sbc: New SBC analysis filter function to replace current broken code

This code is heavily based on the patch submitted by Jaska Uimonen.
Additional changes include preserving extra bits in the output of the
filter function for better precision and support for both 16-bit and
32-bit fixed point implementations. The sign of some table values was
changed in order to preserve a regular code structure and have
multiply-accumulate operations only. No additional optimizations
were applied as this code is intended to be some kind of "reference"
implementation. Platform-specific optimizations may require
different tricks and can be branched off from this implementation.
Some extra information about this code can be found in the
linux-bluetooth mailing list archive for December 2008.

Siarhei Siamashka [Sat, 27 Dec 2008 17:36:14 +0000 (19:36 +0200)]
sbc: Fixed subbands selection for joint-stereo in SBC encoder

Marcel Holtmann [Tue, 23 Dec 2008 22:56:32 +0000 (23:56 +0100)]
sbc: Add more options to control encoding methods

Marcel Holtmann [Tue, 23 Dec 2008 22:41:38 +0000 (23:41 +0100)]
sbc: Don't decode a frame if it is too small

Luiz Augusto von Dentz [Thu, 18 Dec 2008 22:22:31 +0000 (19:22 -0300)]
sbc: Remove unnecessary code and fix a coding style.

Siarhei Siamashka [Wed, 17 Dec 2008 20:32:11 +0000 (22:32 +0200)]
sbc: Fix for overflow bug in SBC quantization code

The result of multiplication does not always fit into 32-bits. Using 64-bit
calculations helps to avoid overflows and sound quality problems in encoded
audio. Overflows are more likely to show up when using high values for
bitpool setting.

Siarhei Siamashka [Thu, 11 Dec 2008 19:21:28 +0000 (21:21 +0200)]
sbc: Bitstream writing optimization for SBC encoder

SBC encoder performance improvement of up to 1.5x for ARM11
and almost twice as fast for Intel Core2 in some cases.

Marcel Holtmann [Fri, 31 Oct 2008 23:14:46 +0000 (00:14 +0100)]
sbc: Add more options to SBC encoder and decoder

Marcel Holtmann [Fri, 31 Oct 2008 22:55:13 +0000 (23:55 +0100)]
sbc: Fix SBC gain mismatch

Marcel Holtmann [Thu, 30 Oct 2008 19:01:06 +0000 (20:01 +0100)]
sbc: Fix SBC decoding handling

Marcel Holtmann [Sun, 26 Oct 2008 00:04:44 +0000 (02:04 +0200)]
sbc: Let the decoder write Sun/NeXT audio S16_BE files

Marcel Holtmann [Sat, 25 Oct 2008 23:32:52 +0000 (01:32 +0200)]
sbc: Add bitpool option to encoder

Marcel Holtmann [Sat, 25 Oct 2008 22:26:20 +0000 (00:26 +0200)]
sbc: Fix missing encoding of last frame

Marcel Holtmann [Mon, 30 Jul 2012 02:34:44 +0000 (19:34 -0700)]
sbc: Add low-complexity, subband codec support

Marcel Holtmann [Wed, 11 Jul 2012 12:51:00 +0000 (09:51 -0300)]
Initial revision