OSDN Git Service
Paul B Mahol [Sat, 28 Jan 2017 16:23:31 +0000 (17:23 +0100)]
avformat/sccdec: attempt to fix valgrind issue
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Chris Moeller [Fri, 27 Jan 2017 21:20:31 +0000 (13:20 -0800)]
avformat: fix ID3v2 parser for v2.2 comment frames
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Aaron Colwell [Fri, 27 Jan 2017 17:33:29 +0000 (09:33 -0800)]
mov: Fix spherical metadata_source parsing
Signed-off-by: James Almer <jamrial@gmail.com>
Michael Niedermayer [Sat, 21 Jan 2017 22:01:50 +0000 (23:01 +0100)]
avfilter/vf_gblur: Increase supported pixel count from 31bit to 32bit in filter_postscale()
Fixes CID1396252
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Sasi Inguva [Thu, 26 Jan 2017 00:41:44 +0000 (16:41 -0800)]
ffmpeg.c: Add output file index and stream index to vstats file.
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Sasi Inguva [Thu, 26 Jan 2017 19:26:46 +0000 (11:26 -0800)]
lavf/matroskaenc.c: Free dyn bufs in mkv_free. Fixes memory leaks when muxing fails.
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Wed, 25 Jan 2017 21:28:48 +0000 (22:28 +0100)]
fate: add SCC test
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Fri, 27 Jan 2017 12:37:00 +0000 (13:37 +0100)]
avfilter/avf_showspectrum: fix 2 possible crashes
Make sure no division by zero is done.
Make sure there are actually samples available.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Fri, 27 Jan 2017 11:13:42 +0000 (12:13 +0100)]
doc/filters: mention recently added option
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Carl Eugen Hoyos [Fri, 27 Jan 2017 07:31:07 +0000 (08:31 +0100)]
lavf/img2dec: Reduce the probe score for incomplete jpgs.
Ensures that probing doesn't finish prematurely for small files.
Michael Niedermayer [Thu, 26 Jan 2017 23:14:02 +0000 (00:14 +0100)]
avcodec/h264dec: Clear ref_count on slice header processing failure
Fixes using freed memory
Introduced in 744801989099df26e90b00062c645969c5347533
Fixes: 471/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_H264_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
James Almer [Thu, 26 Jan 2017 22:28:09 +0000 (19:28 -0300)]
avformat/matroskadec: ProjectionPrivate is optional on Equirectangular projections
This reflects a recent change to the spec draft.
Signed-off-by: James Almer <jamrial@gmail.com>
Joel Cunningham [Mon, 9 Jan 2017 20:54:47 +0000 (14:54 -0600)]
tcp: set socket buffer sizes before listen/connect/accept
From e24d95c0e06a878d401ee34fd6742fcaddeeb95f Mon Sep 17 00:00:00 2001
From: Joel Cunningham <joel.cunningham@me.com>
Date: Mon, 9 Jan 2017 13:37:51 -0600
Subject: [PATCH] tcp: set socket buffer sizes before listen/connect/accept
Attempting to set SO_RCVBUF and SO_SNDBUF on TCP sockets after connection
establishment is incorrect, and some stacks ignore the set call on the socket at
this point. This has been observed on MacOS/iOS. Windows 7 has some peculiar
behavior where setting SO_RCVBUF after connection establishment applies only if
the buffer is increasing from the default, while decreases are ignored. This is
possibly how the incorrect usage has gone unnoticed.
Unix Network Programming Vol. 1: The Sockets Networking API (3rd edition, section 7.5):
"When setting the size of the TCP socket receive buffer, the ordering of the
function calls is important. This is because of TCP's window scale option,
which is exchanged with the peer on SYN segments when the connection is
established. For a client, this means the SO_RCVBUF socket option must be
set before calling connect. For a server, this means the socket option must
be set for the listening socket before calling listen. Setting this option
for the connected socket will have no effect whatsoever on the possible window
scale option because accept does not return with the connected socket until
TCP's three-way handshake is complete. This is why the option must be set on
the listening socket. (The sizes of the socket buffers are always inherited from
the listening socket by the newly created connected socket)"
Signed-off-by: Joel Cunningham <joel.cunningham@me.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
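The ordering described in the tcp commit above can be sketched roughly as follows. This is a minimal illustration using plain POSIX sockets, not the code from libavformat/tcp.c; the helper names tcp_socket_with_rcvbuf and tcp_get_rcvbuf are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket and request a receive buffer
 * size BEFORE connect() or listen() is called on it.
 * Returns the socket descriptor, or -1 on failure. */
static int tcp_socket_with_rcvbuf(int rcvbuf_bytes)
{
    int fd;

    assert(rcvbuf_bytes > 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Set the option while the socket is still unconnected, so that the
     * window scale exchanged on the SYN segments can take it into account. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &rcvbuf_bytes, sizeof(rcvbuf_bytes)) < 0) {
        close(fd);
        return -1;
    }
    return fd; /* the caller goes on to connect(), or bind() + listen() */
}

/* Hypothetical helper: read back the size the stack actually applied
 * (stacks may round or clamp the requested value). */
static int tcp_get_rcvbuf(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &value, &len) < 0)
        return -1;
    return value;
}
```

For a server, the same call would be made on the listening socket, since per the quoted passage the accepted sockets inherit the buffer sizes from it.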
Paul B Mahol [Thu, 5 May 2016 10:15:39 +0000 (12:15 +0200)]
avfilter: add abitscope multimedia filter
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Frank Liberato [Tue, 24 Jan 2017 18:58:17 +0000 (10:58 -0800)]
avformat/flacdec: Check avio_read result when reading flac block header.
Return AVERROR_INVALIDDATA if all four bytes aren't present.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
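The check described in the flacdec commit above amounts to treating a short read as invalid data. A minimal sketch with stdio instead of avio_read; ERR_INVALID_DATA is a stand-in for AVERROR_INVALIDDATA and read_block_header is a hypothetical name, not the function in flacdec.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ERR_INVALID_DATA (-1) /* stand-in for AVERROR_INVALIDDATA */

/* Read a 4-byte block header; fail unless all four bytes are present. */
static int read_block_header(FILE *f, uint8_t header[4])
{
    size_t got;

    assert(f && header);

    got = fread(header, 1, 4, f);
    if (got != 4)
        return ERR_INVALID_DATA; /* truncated input: do not use header */
    return 0;
}
```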
Sasi Inguva [Tue, 24 Jan 2017 16:23:54 +0000 (08:23 -0800)]
ffmpeg_opt.c: Introduce a -vstats_version option and document the existing -vstats format.
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Wed, 25 Jan 2017 10:00:13 +0000 (11:00 +0100)]
avcodec/ccaption_dec: remove extra word from long codec description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Tue, 24 Jan 2017 15:34:29 +0000 (16:34 +0100)]
avformat: add Scenarist Closed Captions demuxer
Fixes #4767.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Sat, 21 Jan 2017 11:29:44 +0000 (12:29 +0100)]
avformat: add Sample Dump eXchange demuxer
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Carl Eugen Hoyos [Wed, 25 Jan 2017 10:49:04 +0000 (11:49 +0100)]
lavf/mov: Unscramble dref debug output.
compn [Wed, 25 Jan 2017 04:46:38 +0000 (23:46 -0500)]
isom: map xalg and avlg to h264, fixes ticket #6099
Michael Niedermayer [Tue, 24 Jan 2017 23:20:19 +0000 (00:20 +0100)]
avcodec/utils: correct align value for interplay
Fixes out of array access
Fixes: 452/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_INTERPLAY_VIDEO_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Carl Eugen Hoyos [Tue, 24 Jan 2017 23:55:36 +0000 (00:55 +0100)]
Cosmetics: Reindent after last commit.
Carl Eugen Hoyos [Sat, 10 Dec 2016 15:43:00 +0000 (16:43 +0100)]
lavd/v4l2: Avoid setting frame_size to a negative value.
Carl Eugen Hoyos [Sat, 14 Jan 2017 18:17:09 +0000 (19:17 +0100)]
lavf/rtmpproto: Make bytes_read variables 64bit.
When bytes_read overflowed, last_bytes_read did not yet overflow
and no bytes-read report was created leading to a timeout.
Analyzed-by: Thomas Bernhard
Fixes ticket #5836.
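The failure mode in the rtmpproto commit above can be illustrated as follows. This sketch assumes a check of the form "bytes_read > last_bytes_read + report_size", which is a guess at the shape of the comparison rather than a quote from rtmpproto.c, and it models the overflow with unsigned 32 bit wraparound so that the behavior is well defined in C. The function names are hypothetical.

```c
#include <stdint.h>

/* 32 bit counters: once bytes_read wraps around, it compares as smaller
 * than last_bytes_read and the report is no longer triggered. */
static int report_due_32(uint32_t bytes_read, uint32_t last_bytes_read,
                         uint32_t report_size)
{
    return bytes_read > last_bytes_read + report_size;
}

/* 64 bit counters: the same comparison keeps working past 4 GiB. */
static int report_due_64(uint64_t bytes_read, uint64_t last_bytes_read,
                         uint64_t report_size)
{
    return bytes_read > last_bytes_read + report_size;
}
```

With a true total of 4295000000 bytes read, a last report at 4294000000 and a report size of 100000, the 32 bit variant sees the wrapped value 32704 and returns 0, while the 64 bit variant returns 1.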
Marton Balint [Mon, 26 Dec 2016 01:03:37 +0000 (02:03 +0100)]
avfilter/formats: do not allow unknown layouts in ff_parse_channel_layout if nret is not set
Current code returned the number of channels as channel layout in that case,
and if nret is not set then unknown layouts are typically not supported.
Also use the common parsing code. Use a temporary workaround to parse an
unknown channel layout such as '13c'; after a 1 year grace period only '13C'
will work.
Signed-off-by: Marton Balint <cus@passwd.hu>
Marton Balint [Mon, 26 Dec 2016 00:19:34 +0000 (01:19 +0100)]
avutil/channel_layout: add av_get_extended_channel_layout
Return a channel layout and the number of channels based on the specified name.
This function is similar to av_get_channel_layout(), but can also parse unknown
channel layout specifications.
Unknown channel layout specifications are a decimal number and a capital 'C'
suffix, in order to not break compatibility with the lowercase 'c' suffix,
which is used for a guessed channel layout with the specified number of
channels.
Signed-off-by: Marton Balint <cus@passwd.hu>
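The suffix convention described in the channel_layout commit above can be sketched as a small standalone parser. This is not the libavutil implementation; parse_channel_count_spec, the spec_kind enum and the limit of 64 channels are all assumptions made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum spec_kind {
    SPEC_INVALID,
    SPEC_GUESSED_LAYOUT, /* "<n>c": guess a layout for n channels */
    SPEC_UNKNOWN_LAYOUT  /* "<n>C": n channels, layout unknown */
};

/* Hypothetical parser for "<decimal>c" and "<decimal>C" specifications. */
static enum spec_kind parse_channel_count_spec(const char *name,
                                               int *nb_channels)
{
    char *end;
    long n;

    assert(name && nb_channels);

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (name[0] < '0' || name[0] > '9')
        return SPEC_INVALID;

    errno = 0;
    n = strtol(name, &end, 10);
    if (errno || n <= 0 || n > 64) /* 64 is an arbitrary sanity limit */
        return SPEC_INVALID;

    /* Exactly one suffix character must follow the number. */
    if (end[0] == '\0' || end[1] != '\0')
        return SPEC_INVALID;
    if (end[0] != 'c' && end[0] != 'C')
        return SPEC_INVALID;

    *nb_channels = (int)n;
    return end[0] == 'C' ? SPEC_UNKNOWN_LAYOUT : SPEC_GUESSED_LAYOUT;
}
```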
Marton Balint [Mon, 26 Dec 2016 00:06:22 +0000 (01:06 +0100)]
avutil/channel_layout: fix remains of old syntax in docs and comments
Signed-off-by: Marton Balint <cus@passwd.hu>
Carl Eugen Hoyos [Mon, 23 Jan 2017 12:39:56 +0000 (13:39 +0100)]
lavc/svq3: Fail for media key encryption.
Tested-by: ami_stuff
Fixes a part of ticket #6094.
Michael Niedermayer [Tue, 24 Jan 2017 21:21:25 +0000 (22:21 +0100)]
avcodec/vp56: Check for the bitstream end, pass error codes on
Fixes timeout
Fixes: 446/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_VP6_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Martin Storsjö [Thu, 5 Jan 2017 10:52:06 +0000 (12:52 +0200)]
aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter
This work is sponsored by, and copyright, Google.
This is similar to the arm version, but due to the larger registers
on aarch64, we can do 8 pixels at a time for all filter sizes.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                             ARM  AArch64
vp9_loop_filter_h_4_8_10bpp_neon:          213.2    172.6
vp9_loop_filter_h_8_8_10bpp_neon:          281.2    244.2
vp9_loop_filter_h_16_8_10bpp_neon:         657.0    444.5
vp9_loop_filter_h_16_16_10bpp_neon:       1280.4    877.7
vp9_loop_filter_mix2_h_44_16_10bpp_neon:   397.7    358.0
vp9_loop_filter_mix2_h_48_16_10bpp_neon:   465.7    429.0
vp9_loop_filter_mix2_h_84_16_10bpp_neon:   465.7    428.0
vp9_loop_filter_mix2_h_88_16_10bpp_neon:   533.7    499.0
vp9_loop_filter_mix2_v_44_16_10bpp_neon:   271.5    244.0
vp9_loop_filter_mix2_v_48_16_10bpp_neon:   330.0    305.0
vp9_loop_filter_mix2_v_84_16_10bpp_neon:   329.0    306.0
vp9_loop_filter_mix2_v_88_16_10bpp_neon:   386.0    365.0
vp9_loop_filter_v_4_8_10bpp_neon:          150.0    115.2
vp9_loop_filter_v_8_8_10bpp_neon:          209.0    175.5
vp9_loop_filter_v_16_8_10bpp_neon:         492.7    345.2
vp9_loop_filter_v_16_16_10bpp_neon:        951.0    682.7
This is significantly faster than the ARM version in almost
all cases except for the mix2 functions.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-3x.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Tue, 3 Jan 2017 12:35:54 +0000 (14:35 +0200)]
aarch64: Add NEON optimizations for 10 and 12 bit vp9 itxfm
This work is sponsored by, and copyright, Google.
Compared to the arm version, on aarch64 we can keep the full 8x8
transform in registers, and for 16x16 and 32x32, we can process
it in slices of 4 pixels instead of 2.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                                ARM   AArch64
vp9_inv_adst_adst_4x4_sub4_add_10_neon:       111.0     109.7
vp9_inv_adst_adst_8x8_sub8_add_10_neon:       914.0     733.5
vp9_inv_adst_adst_16x16_sub16_add_10_neon:   5184.0    3745.7
vp9_inv_dct_dct_4x4_sub1_add_10_neon:          65.0      65.7
vp9_inv_dct_dct_4x4_sub4_add_10_neon:         100.0      96.7
vp9_inv_dct_dct_8x8_sub1_add_10_neon:         111.0     119.7
vp9_inv_dct_dct_8x8_sub8_add_10_neon:         618.0     494.7
vp9_inv_dct_dct_16x16_sub1_add_10_neon:       295.1     284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon:      2303.2    1883.9
vp9_inv_dct_dct_16x16_sub8_add_10_neon:      2984.8    2189.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon:     3890.0    2799.4
vp9_inv_dct_dct_32x32_sub1_add_10_neon:      1044.4    1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon:     13333.7    9695.1
vp9_inv_dct_dct_32x32_sub16_add_10_neon:    18531.3   12459.8
vp9_inv_dct_dct_32x32_sub32_add_10_neon:    24470.7   16160.2
vp9_inv_wht_wht_4x4_sub4_add_10_neon:          83.0      79.7
The larger transforms are significantly faster than the corresponding
ARM versions.
The speedup vs C code is smaller than in 32 bit mode, probably
because the 64 bit intermediates in the C code can be expressed
more efficiently in aarch64.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Wed, 14 Dec 2016 21:48:35 +0000 (23:48 +0200)]
aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC
This work is sponsored by, and copyright, Google.
This has mostly got the same differences to the 8 bit version as
in the arm version. For the horizontal filters, we do 16 pixels
in parallel as well. For the 8 pixel wide vertical filters, we can
accumulate 4 rows before storing, just as in the 8 bit version.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                          ARM   AArch64
vp9_avg4_10bpp_neon:                     35.7      30.7
vp9_avg8_10bpp_neon:                     93.5      84.7
vp9_avg16_10bpp_neon:                   324.4     296.6
vp9_avg32_10bpp_neon:                  1236.5    1148.2
vp9_avg64_10bpp_neon:                  4639.6    4571.1
vp9_avg_8tap_smooth_4h_10bpp_neon:      130.0     128.0
vp9_avg_8tap_smooth_4hv_10bpp_neon:     440.0     440.5
vp9_avg_8tap_smooth_4v_10bpp_neon:      114.0     105.5
vp9_avg_8tap_smooth_8h_10bpp_neon:      327.0     314.0
vp9_avg_8tap_smooth_8hv_10bpp_neon:     918.7     865.4
vp9_avg_8tap_smooth_8v_10bpp_neon:      330.0     300.2
vp9_avg_8tap_smooth_16h_10bpp_neon:    1187.5    1155.5
vp9_avg_8tap_smooth_16hv_10bpp_neon:   2663.1    2591.0
vp9_avg_8tap_smooth_16v_10bpp_neon:    1107.4    1078.3
vp9_avg_8tap_smooth_64h_10bpp_neon:   17754.6   17454.7
vp9_avg_8tap_smooth_64hv_10bpp_neon:  33285.2   33001.5
vp9_avg_8tap_smooth_64v_10bpp_neon:   16066.9   16048.6
vp9_put4_10bpp_neon:                     25.5      21.7
vp9_put8_10bpp_neon:                     56.0      52.0
vp9_put16_10bpp_neon/armv8:             183.0     163.1
vp9_put32_10bpp_neon/armv8:             678.6     563.1
vp9_put64_10bpp_neon/armv8:            2679.9    2195.8
vp9_put_8tap_smooth_4h_10bpp_neon:      120.0     118.0
vp9_put_8tap_smooth_4hv_10bpp_neon:     435.2     435.0
vp9_put_8tap_smooth_4v_10bpp_neon:      107.0      98.2
vp9_put_8tap_smooth_8h_10bpp_neon:      303.0     290.0
vp9_put_8tap_smooth_8hv_10bpp_neon:     893.7     828.7
vp9_put_8tap_smooth_8v_10bpp_neon:      305.5     263.5
vp9_put_8tap_smooth_16h_10bpp_neon:    1089.1    1059.2
vp9_put_8tap_smooth_16hv_10bpp_neon:   2578.8    2452.4
vp9_put_8tap_smooth_16v_10bpp_neon:    1009.5     933.5
vp9_put_8tap_smooth_64h_10bpp_neon:   16223.4   15918.6
vp9_put_8tap_smooth_64hv_10bpp_neon:  32153.0   31016.2
vp9_put_8tap_smooth_64v_10bpp_neon:   14516.5   13748.1
These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.
The speedup vs C code is around 4-9x.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Wed, 14 Dec 2016 21:38:02 +0000 (23:38 +0200)]
aarch64: vp9dsp: Restructure the bpp checks
This work is sponsored by, and copyright, Google.
This is more in line with how it will be extended for more bitdepths.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Thu, 5 Jan 2017 10:51:08 +0000 (12:51 +0200)]
arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter
This work is sponsored by, and copyright, Google.
This is pretty much similar to the 8 bpp version, but in some senses
simpler. All input pixels are 16 bits, and all intermediates also fit
in 16 bits, so there's no lengthening/narrowing in the filter at all.
For the full 16 pixel wide filter, we can only process 4 pixels at a time
(using an implementation very much similar to the one for 8 bpp),
but we can do 8 pixels at a time for the 4 and 8 pixel wide filters with
a different implementation of the core filter.
Examples of relative speedup compared to the C version, from checkasm:
Cortex                                        A7     A8     A9    A53
vp9_loop_filter_h_4_8_10bpp_neon:           1.83   2.16   1.40   2.09
vp9_loop_filter_h_8_8_10bpp_neon:           1.39   1.67   1.24   1.70
vp9_loop_filter_h_16_8_10bpp_neon:          1.56   1.47   1.10   1.81
vp9_loop_filter_h_16_16_10bpp_neon:         1.94   1.69   1.33   2.24
vp9_loop_filter_mix2_h_44_16_10bpp_neon:    2.01   2.27   1.67   2.39
vp9_loop_filter_mix2_h_48_16_10bpp_neon:    1.84   2.06   1.45   2.19
vp9_loop_filter_mix2_h_84_16_10bpp_neon:    1.89   2.20   1.47   2.29
vp9_loop_filter_mix2_h_88_16_10bpp_neon:    1.69   2.12   1.47   2.08
vp9_loop_filter_mix2_v_44_16_10bpp_neon:    3.16   3.98   2.50   4.05
vp9_loop_filter_mix2_v_48_16_10bpp_neon:    2.84   3.64   2.25   3.77
vp9_loop_filter_mix2_v_84_16_10bpp_neon:    2.65   3.45   2.16   3.54
vp9_loop_filter_mix2_v_88_16_10bpp_neon:    2.55   3.30   2.16   3.55
vp9_loop_filter_v_4_8_10bpp_neon:           2.85   3.97   2.24   3.68
vp9_loop_filter_v_8_8_10bpp_neon:           2.27   3.19   1.96   3.08
vp9_loop_filter_v_16_8_10bpp_neon:          3.42   2.74   2.26   4.40
vp9_loop_filter_v_16_16_10bpp_neon:         2.86   2.44   1.93   3.88
The speedup vs C code measured in checkasm is around 1.1-4x.
These numbers are quite inconclusive though, since the checkasm test
runs multiple filterings on top of each other, so later rounds might
end up with different codepaths (different decisions on which filter
to apply, based on input pixel differences).
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-4x.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 17 Dec 2016 22:22:15 +0000 (00:22 +0200)]
arm: Add NEON optimizations for 10 and 12 bit vp9 itxfm
This work is sponsored by, and copyright, Google.
This is structured similarly to the 8 bit version. In the 8 bit
version, the coefficients are 16 bits, and intermediates are 32 bits.
Here, the coefficients are 32 bit. For the 4x4 transforms for 10 bit
content, the intermediates also fit in 32 bits, but for all other
transforms (4x4 for 12 bit content, and 8x8 and larger for both 10
and 12 bit) the intermediates are 64 bit.
For the existing 8 bit case, the 8x8 transform fit all coefficients in
registers; for 10/12 bit, when the coefficients are 32 bit, the 8x8
transform also has to be done in slices of 4 pixels (just as 16x16 and
32x32 for 8 bit).
The slice width also shrinks from 4 elements to 2 elements in parallel
for the 16x16 and 32x32 cases.
The 16 bit coefficients from idct_coeffs and similar tables also need
to be lengthened to 32 bit in order to be used in multiplication with
vectors with 32 bit elements. This leads to the fixed coefficient
vectors needing more space, leading to more cases where they have to
be reloaded within the transform (in iadst16).
This technically would need testing in checkasm for subpartitions
in increments of 2, but that slows down normal checkasm runs
excessively.
Examples of relative speedup compared to the C version, from checkasm:
Cortex                                          A7     A8     A9    A53
vp9_inv_adst_adst_4x4_sub4_add_10_neon:       4.83  11.36   5.22   6.77
vp9_inv_adst_adst_8x8_sub8_add_10_neon:       4.12   7.60   4.06   4.84
vp9_inv_adst_adst_16x16_sub16_add_10_neon:    3.93   8.16   4.52   5.35
vp9_inv_dct_dct_4x4_sub1_add_10_neon:         1.36   2.57   1.41   1.61
vp9_inv_dct_dct_4x4_sub4_add_10_neon:         4.24   8.66   5.06   5.81
vp9_inv_dct_dct_8x8_sub1_add_10_neon:         2.63   4.18   1.68   2.87
vp9_inv_dct_dct_8x8_sub4_add_10_neon:         4.52   9.47   4.24   5.39
vp9_inv_dct_dct_8x8_sub8_add_10_neon:         3.45   7.34   3.45   4.30
vp9_inv_dct_dct_16x16_sub1_add_10_neon:       3.56   6.21   2.47   4.32
vp9_inv_dct_dct_16x16_sub2_add_10_neon:       5.68  12.73   5.28   7.07
vp9_inv_dct_dct_16x16_sub8_add_10_neon:       4.42   9.28   4.24   5.45
vp9_inv_dct_dct_16x16_sub16_add_10_neon:      3.41   7.29   3.35   4.19
vp9_inv_dct_dct_32x32_sub1_add_10_neon:       4.52   8.35   3.83   6.40
vp9_inv_dct_dct_32x32_sub2_add_10_neon:       5.86  13.19   6.14   7.04
vp9_inv_dct_dct_32x32_sub16_add_10_neon:      4.29   8.11   4.59   5.06
vp9_inv_dct_dct_32x32_sub32_add_10_neon:      3.31   5.70   3.56   3.84
vp9_inv_wht_wht_4x4_sub4_add_10_neon:         1.89   2.80   1.82   1.97
The speedup compared to the C functions is around 1.3 to 7x for the
full transforms, even higher for the smaller subpartitions.
Signed-off-by: Martin Storsjö <martin@martin.st>
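The need for 64 bit intermediates described in the arm itxfm commit above can be illustrated in scalar C. This is only a sketch of the arithmetic, not the NEON code: the helper name mul_round_shift14 is hypothetical, 11585 is the usual 14 bit fixed-point approximation of cos(pi/4), and the input magnitude used in the note below is chosen purely to show the overflow.

```c
#include <stdint.h>

/* Multiply a 32 bit transform coefficient by a 14 bit fixed-point constant
 * and round the result back down. The product is kept in 64 bits because
 * it may not fit in 32 bits once the coefficients themselves are 32 bit.
 * Note: right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift. */
static int32_t mul_round_shift14(int32_t coef, int16_t constant)
{
    int64_t product = (int64_t)coef * constant;

    return (int32_t)((product + (1 << 13)) >> 14);
}
```

For example, with coef = 1 << 20 and constant = 11585 the product is 12147752960, which exceeds the 32 bit signed range, while the rounded result 741440 fits comfortably again.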
Martin Storsjö [Thu, 8 Dec 2016 21:35:31 +0000 (23:35 +0200)]
arm: Add NEON optimizations for 10 and 12 bit vp9 MC
This work is sponsored by, and copyright, Google.
The plain pixel put/copy functions are used from the 8 bit version,
for the double size (e.g. put16 uses ff_vp9_copy32_neon), and a new
copy128 is added.
Compared with the 8 bit version, the filters can no longer use the
trick to accumulate in 16 bit with only saturation at the end, but now
the accumulators need to be 32 bit. This avoids the need to keep track
of which filter index is the largest though, reducing the size of the
executable code for these filters.
For the horizontal filters, we only do 4 or 8 pixels wide in parallel
(while doing two rows at a time), since we don't have enough register
space to filter 16 pixels wide.
For the vertical filters, we still do 4 and 8 pixels in parallel just
as in the 8 bit case, but we need to store the output after every 2
rows instead of after every 4 rows.
Examples of relative speedup compared to the C version, from checkasm:
Cortex                                    A7     A8     A9    A53
vp9_avg4_10bpp_neon:                    2.25   2.44   3.05   2.16
vp9_avg8_10bpp_neon:                    3.66   8.48   3.86   3.50
vp9_avg16_10bpp_neon:                   3.39   8.26   3.37   2.72
vp9_avg32_10bpp_neon:                   4.03  10.20   4.07   3.42
vp9_avg64_10bpp_neon:                   4.15  10.01   4.13   3.70
vp9_avg_8tap_smooth_4h_10bpp_neon:      3.38   6.22   3.41   4.75
vp9_avg_8tap_smooth_4hv_10bpp_neon:     3.89   6.39   4.30   5.32
vp9_avg_8tap_smooth_4v_10bpp_neon:      5.32   9.73   6.34   7.31
vp9_avg_8tap_smooth_8h_10bpp_neon:      4.45   9.40   4.68   6.87
vp9_avg_8tap_smooth_8hv_10bpp_neon:     4.64   8.91   5.44   6.47
vp9_avg_8tap_smooth_8v_10bpp_neon:      6.44  13.42   8.68   8.79
vp9_avg_8tap_smooth_64h_10bpp_neon:     4.66   9.02   4.84   7.71
vp9_avg_8tap_smooth_64hv_10bpp_neon:    4.61   9.14   4.92   7.10
vp9_avg_8tap_smooth_64v_10bpp_neon:     6.90  14.13   9.57  10.41
vp9_put4_10bpp_neon:                    1.33   1.46   2.09   1.33
vp9_put8_10bpp_neon:                    1.57   3.42   1.83   1.84
vp9_put16_10bpp_neon:                   1.55   4.78   2.17   1.89
vp9_put32_10bpp_neon:                   2.06   5.35   2.14   2.30
vp9_put64_10bpp_neon:                   3.00   2.41   1.95   1.66
vp9_put_8tap_smooth_4h_10bpp_neon:      3.19   5.81   3.31   4.63
vp9_put_8tap_smooth_4hv_10bpp_neon:     3.86   6.22   4.32   5.21
vp9_put_8tap_smooth_4v_10bpp_neon:      5.40   9.77   6.08   7.21
vp9_put_8tap_smooth_8h_10bpp_neon:      4.22   8.41   4.46   6.63
vp9_put_8tap_smooth_8hv_10bpp_neon:     4.56   8.51   5.39   6.25
vp9_put_8tap_smooth_8v_10bpp_neon:      6.60  12.43   8.17   8.89
vp9_put_8tap_smooth_64h_10bpp_neon:     4.41   8.59   4.54   7.49
vp9_put_8tap_smooth_64hv_10bpp_neon:    4.43   8.58   5.34   6.63
vp9_put_8tap_smooth_64v_10bpp_neon:     7.26  13.92   9.27  10.92
For the larger 8tap filters, the speedup vs C code is around 4-14x.
Signed-off-by: Martin Storsjö <martin@martin.st>
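The point about 32 bit accumulators in the arm MC commit above can be shown with a scalar 8-tap filter. This is a sketch of the arithmetic only, not the NEON implementation; filter_8tap is a hypothetical name, and the rounding ((sum + 64) >> 7) assumes taps that sum to 128.

```c
#include <assert.h>
#include <stdint.h>

/* Apply one 8-tap filter to 8 consecutive high bit depth pixels.
 * With 10 or 12 bit pixels a single product such as 1023 * 64 = 65472
 * already exceeds the signed 16 bit range, so the sum is kept in 32 bits
 * and only clipped to the pixel range at the end. */
static uint16_t filter_8tap(const uint16_t *src, const int16_t taps[8],
                            int bit_depth)
{
    int32_t sum = 0;
    int32_t max;
    int i;

    assert(bit_depth == 10 || bit_depth == 12);
    max = (1 << bit_depth) - 1;

    for (i = 0; i < 8; i++)
        sum += (int32_t)src[i] * taps[i];

    if (sum < 0)
        sum = 0;               /* clip low before the rounding shift */
    sum = (sum + 64) >> 7;     /* round, taps are scaled by 128 */
    if (sum > max)
        sum = max;             /* clip high to the pixel range */
    return (uint16_t)sum;
}
```

With illustrative taps { -3, -1, 32, 64, 38, 1, -3, 0 } (which sum to 128) and eight pixels of value 1023, the accumulator reaches 130944 before the shift and the function returns 1023.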
Martin Storsjö [Thu, 8 Dec 2016 21:23:44 +0000 (23:23 +0200)]
arm: vp9dsp: Restructure the bpp checks
This work is sponsored by, and copyright, Google.
This is more in line with how it will be extended for more bitdepths.
Signed-off-by: Martin Storsjö <martin@martin.st>
Clément Bœsch [Tue, 24 Jan 2017 18:35:33 +0000 (19:35 +0100)]
Merge commit 'fd5e6a095f69495c558069315d6b36ea410c31fa'
* commit 'fd5e6a095f69495c558069315d6b36ea410c31fa':
x86util: Extend SPLATW for avx2
This commit is a noop, see 1ace9573dce509e2b25165199c3b658667860ecf
(only libavutil/x86/x86util.asm chunk).
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 18:32:12 +0000 (19:32 +0100)]
Merge commit '37961044c6'
* commit '37961044c6':
checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR
checkasm/arm: remove NEON instructions from checkasm_checked_call_vfp
checkasm: arm: Don't start new const blocks for each string
This merge is a noop: the changes were included in 9f1c81e5ec.
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 18:29:35 +0000 (19:29 +0100)]
Merge commit '5ece6911010b3464d2fdacfa8031c15b5bd83418'
* commit '5ece6911010b3464d2fdacfa8031c15b5bd83418':
apichanges: Fill in missing hashes and dates
This commit is a noop as we need to fill with our own hashes.
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 18:26:51 +0000 (19:26 +0100)]
Merge commit 'facdfe40805559963b5875931af9406ed5ddcd5c'
* commit 'facdfe40805559963b5875931af9406ed5ddcd5c':
swscale: Add proper ff_ prefix to init functions
This commit is a noop, see e8c37160640952ab036e643156add9638c062536
I'm keeping our ff_sws_ vs ff_ since we use ff_sws_ in other places in
swscale.
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 18:23:48 +0000 (19:23 +0100)]
Merge commit 'c0fd2fb27bebd1d5ab028e6df6bca9119d269122'
* commit 'c0fd2fb27bebd1d5ab028e6df6bca9119d269122':
swscale: Rename sws_context_class to ff_sws_context_class
This commit is a noop, see 8bfbc8c5e504ef3ae914499646d450987b419385
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 18:17:38 +0000 (19:17 +0100)]
Merge commit '71a0472114574993df7035f4de9aa007e03817b8'
* commit '71a0472114574993df7035f4de9aa007e03817b8':
checkasm: arm: report the first clobbered register in checkasm_checked_call
Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking
too much stuff.
Merged-by: Clément Bœsch <u@pkh.me>
Michael Niedermayer [Tue, 24 Jan 2017 15:13:05 +0000 (16:13 +0100)]
avcodec/mjpegdec: Check remaining bitstream in ljpeg_decode_yuv_scan()
Fixes timeout
Fixes: 445/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_MJPEG_fuzzer
Fixes: 456/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_JPEGLS_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Clément Bœsch [Tue, 24 Jan 2017 15:34:00 +0000 (16:34 +0100)]
Merge commit 'a8fce24b9c5a87187f5bd864b18f5b3e575f8c3d'
* commit 'a8fce24b9c5a87187f5bd864b18f5b3e575f8c3d':
avconv_dxva2: support HEVC Main10 decoding
This commit is a noop, see 1ec14612a5bd3506ffd755f8d42db776bfa40ae8
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 24 Jan 2017 15:31:10 +0000 (16:31 +0100)]
Merge commit '33f6690eb4e21acc4b581688eecfc4cc5ea9515e'
* commit '33f6690eb4e21acc4b581688eecfc4cc5ea9515e':
hevc: offer DXVA2 for 10bit 420
This commit is a noop, see ccb94789e2968329947f1c2e00d019f387f9c409
Merged-by: Clément Bœsch <u@pkh.me>
Clément Bœsch [Tue, 17 Jan 2017 14:26:38 +0000 (15:26 +0100)]
Merge commit '38efff92f1ef81f3de20ff0460ec7b70c253d714'
* commit '38efff92f1ef81f3de20ff0460ec7b70c253d714':
FATE: add a test for H.264 with two fields per packet
h264: fix decoding multiple fields per packet with slice threads
This merge includes two commits because the FATE test was useful in
order to make proper testing.
The merge gets rid of the now unused:
- SLICE_SINGLETHREAD and SLICE_SKIPED macros
- max_contexts
- "again" label in decode_nal_units()
This commit also includes the fix from d3e4d406b.
Thanks to wm4 and Michael Niedermayer for their testing.
Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Steven Liu [Tue, 24 Jan 2017 14:25:29 +0000 (22:25 +0800)]
avformat/hlsenc: improve writing of the m3u8 header block
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Michael Niedermayer [Mon, 23 Jan 2017 21:33:27 +0000 (22:33 +0100)]
avcodec/h264dec: Fix regression with "make fate-h264-attachment-631 THREADS=8"
This treats the case of no slices like no frames which it basically is.
The field is added to the context as other nal related fields are also there
and passing the has_slices field per *arguments is ugly and not consistent
Found-by: ubitux
Approved-by: ubitux
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Sat, 14 Jan 2017 18:04:54 +0000 (19:04 +0100)]
avfilter: add EIA-608 line extractor
Signed-off-by: Dave Rice <dave@dericed.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Steven Liu [Tue, 24 Jan 2017 04:31:36 +0000 (12:31 +0800)]
avformat/flvenc: refine the flvenc shift_data code
refine the flvenc shift_data move data option
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Steven Liu [Tue, 24 Jan 2017 04:29:01 +0000 (12:29 +0800)]
avformat/hlsenc: improve code readability for the time unit
Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Felipe Astroza [Mon, 23 Jan 2017 17:55:31 +0000 (14:55 -0300)]
libavformat/tee: tee was passing a wrong option name for fifo's format_options
If fifo is enabled on tee muxer, ffmpeg exits because of an unknown option passed to fifo muxer.
Option name "format_options" was replaced by "format_opts" on tee muxer.
Signed-off-by: Felipe Astroza <felipe@astroza.cl>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Pavel Koshevoy [Sun, 22 Jan 2017 23:20:05 +0000 (16:20 -0700)]
avcodec/cuvid: fail early if GPU can't handle video resolution
CUVID on GeForce GT 730 and GeForce GTX 1060 does not report any error when
decoding 8K h264 packets. However, it does return an error during
cuvidCreateDecoder call if the indicated video resolution is not
supported.
Given that stream resolution is typically known as a result of probing
it is better to use this information during avcodec_open2 call to fail
immediately, rather than proceeding to decode and never receiving any
frames from the decoder nor receiving any indication of decode failure.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
wm4 [Mon, 16 Jan 2017 15:43:13 +0000 (16:43 +0100)]
hwcontext_cuda: implement frames_get_constraints
Copied and modified from hwcontext_qsv.c.
Rodger Combs [Sat, 21 Jan 2017 02:15:03 +0000 (20:15 -0600)]
lavf/segment: fix crash when failing to open segment list
This happens because segment_end() returns an error, so seg_write_packet
never proceeds to segment_start(), and seg->avf->pb is never re-set,
so we crash with a null pb when av_write_trailer flushes the packet
queue.
This doesn't seem to be clearly recoverable, so I'm just failing more
gracefully.
Repro:
ffmpeg -i input.ts -f segment -c copy -segment_list /noaxx.m3u8 test-%05d.ts
(assuming you don't have write access to /)
Michael Niedermayer [Mon, 23 Jan 2017 00:25:27 +0000 (01:25 +0100)]
avcodec/pngdec: Fix off by 1 size in decode_zbuf()
Fixes out of array access
Fixes: 444/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_PNG_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Michael Niedermayer [Sun, 22 Jan 2017 20:43:06 +0000 (21:43 +0100)]
avcodec/error_resilience: update indentation after last commit
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Michael Niedermayer [Sun, 22 Jan 2017 20:14:05 +0000 (21:14 +0100)]
avcodec/error_resilience: Optimize motion recovery code by using block lists
This makes the code 7 times faster with the testcase from libfuzzer
and should reduce the amount of timeouts we hit in automated fuzzing.
(for example 438/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_RV40_fuzzer)
The code is also faster with more realistic input, though the difference
is small there, as that is far from the worst cases the fuzzers pick out.
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Marton Balint [Sun, 15 Jan 2017 16:43:15 +0000 (17:43 +0100)]
ffplay: fix indentation after last commit
Signed-off-by: Marton Balint <cus@passwd.hu>
Marton Balint [Sun, 27 Jul 2014 20:11:38 +0000 (22:11 +0200)]
ffplay: do not preallocate video texture
Since the uploads happen in the main display function, it does not matter much.
Signed-off-by: Marton Balint <cus@passwd.hu>
Paul B Mahol [Fri, 20 Jan 2017 16:01:31 +0000 (17:01 +0100)]
avformat: add MIDI Sample Dump Standard demuxer
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Jonathan Campbell [Sat, 3 Sep 2016 10:34:01 +0000 (03:34 -0700)]
avcodec/ac3dec: add consistent noise generation option.
use av_lfg_init_from_data() to seed AC-3 dithering from the AC-3 frame
data to make it consistent given the same AC-3 frame, if option is set.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Jonathan Campbell [Sat, 3 Sep 2016 10:29:29 +0000 (03:29 -0700)]
libavutil: add av_lfg_init_from_data() function
seeds an AVLFG from binary data.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
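The idea behind the two commits above (seed the noise generator from the frame data so that the same frame always yields the same dither) can be sketched as follows. This is not the libavutil code: it swaps in an FNV-1a hash and a xorshift32 generator as stand-ins, and seed_from_data and next_random are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derive a 32 bit seed from a block of binary data (FNV-1a hash,
 * used here only as a stand-in for the real seeding scheme). */
static uint32_t seed_from_data(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    size_t i;

    assert(data || len == 0);

    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h ? h : 1u; /* xorshift must not start from zero */
}

/* Advance a xorshift32 state and return the next pseudo-random value. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}
```

Seeding twice from the same frame bytes gives the same state and therefore the same sequence of values, which is the property the ac3dec option relies on.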
Michael Niedermayer [Sat, 21 Jan 2017 22:44:51 +0000 (23:44 +0100)]
avfilter/af_hdcd: Fix leak of memory allocated by ff_make_format_list()
Fixes CID1396265
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Mark Thompson [Sat, 21 Jan 2017 23:02:21 +0000 (23:02 +0000)]
vaapi_mpeg4: Restore changes overwritten by merge
From 2aa8e33d7d86fbc4a4060c363a5733067c160654.
Michael Niedermayer [Sat, 21 Jan 2017 22:07:02 +0000 (23:07 +0100)]
avfilter/avf_showspectrum: Fix memleak of text allocated by av_asprintf()
Fixes CID1396261
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Michael Niedermayer [Sat, 21 Jan 2017 21:09:03 +0000 (22:09 +0100)]
avfilter/vf_palettegen: Fix leak and simplify code
Fixes CID1270818
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Thu, 19 Jan 2017 22:14:27 +0000 (23:14 +0100)]
avcodec/fraps: add support for PAL8
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Michael Niedermayer [Thu, 22 Dec 2016 14:29:08 +0000 (15:29 +0100)]
avcodec: Add FF_CODEC_CAP_SKIP_FRAME_FILL_PARAM to most h263 based codecs
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Michael Niedermayer [Sat, 21 Jan 2017 00:35:52 +0000 (01:35 +0100)]
avfilter/avfiltergraph: Add assert to write down in machine readable form what is assumed about sample rates in swap_samplerates_on_filter()
Fixes CID1397292
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Matthieu Bouron [Fri, 20 Jan 2017 16:29:09 +0000 (17:29 +0100)]
lavc/h264dec: re-indent after previous commit
Matthieu Bouron [Fri, 20 Jan 2017 16:24:52 +0000 (17:24 +0100)]
lavc/h264dec: make sure a slice is decoded before finishing setup
Fixes regression in fate-h264-attachment-631 with THREADS=8 introduced
by bdbbb8f11edbf10add874508c5125c174d8939be.
Paul B Mahol [Thu, 19 Jan 2017 19:43:02 +0000 (20:43 +0100)]
avformat/wavdec: enable seeking with XMA2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Thu, 19 Jan 2017 19:42:14 +0000 (20:42 +0100)]
avcodec/wmaprodec: add xma_flush for seeking in XMA2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Thu, 19 Jan 2017 19:43:40 +0000 (20:43 +0100)]
avcodec: add XMA2 parser
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Fri, 20 Jan 2017 12:47:44 +0000 (13:47 +0100)]
avcodec/wmaprodec: unbreak XMA mono decoding
Signed-off-by: Paul B Mahol <onemda@gmail.com>
bnnm [Thu, 19 Jan 2017 19:53:12 +0000 (20:53 +0100)]
avcodec/atrac3: allow 6 channels (non-joint stereo)
Raises max channels to 6 (for non joint-stereo only);
there is no difference between decoding 1 and N discrete channels.
Fixes trac issue #5840
Signed-off-by: bnnm <bananaman255@gmail.com>
Daniil Cherednik [Wed, 18 Jan 2017 14:26:27 +0000 (17:26 +0300)]
dcaenc: Use Huffman codes for Bit Allocation Index
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Timo Rothenpieler [Wed, 18 Jan 2017 22:22:28 +0000 (23:22 +0100)]
avcodec/nvenc: add logging for more error cases
Timo Rothenpieler [Wed, 18 Jan 2017 22:01:28 +0000 (23:01 +0100)]
avcodec/nvenc: make gpu indices independent of supported capabilities
Steven Liu [Fri, 20 Jan 2017 04:12:02 +0000 (12:12 +0800)]
avformat/hlsenc: fix too many open files bug
When the HTTP method is used to delete old segments,
io_open is called but io_close never is;
this patch fixes that.
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
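The leak pattern described in the commit above (an open with no matching close, repeated once per deleted segment) can be illustrated with a small mock. Everything here is hypothetical: MockIO, mock_io_open(), mock_io_close() and delete_segment() are invented for the sketch and are not the hlsenc code or the libavformat I/O API. The counter stands in for the process's open file descriptors.

```c
#include <assert.h>

/* Hypothetical I/O layer: counts how many handles are currently open. */
static int open_handles = 0;

typedef struct MockIO {
    int in_use;
} MockIO;

static int mock_io_open(MockIO *io, const char *url)
{
    (void)url;                 /* a real implementation would connect here */
    io->in_use = 1;
    open_handles++;
    return 0;
}

static void mock_io_close(MockIO *io)
{
    if (io->in_use) {
        io->in_use = 0;
        open_handles--;
    }
}

/* Delete one segment: every successful open is matched by a close. */
static int delete_segment(const char *url)
{
    MockIO io = {0};
    int ret = mock_io_open(&io, url);   /* stands in for the delete request */

    if (ret < 0)
        return ret;
    mock_io_close(&io);                 /* without this, handles accumulate */
    return 0;
}
```

With the close in place the counter returns to zero after any number of deletions. Removing the mock_io_close() call makes it grow by one per segment, which on a long-running stream would eventually hit the descriptor limit.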
Paul B Mahol [Mon, 16 Jan 2017 12:07:45 +0000 (13:07 +0100)]
avcodec/exr: export writer info into frame metadata
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Mon, 16 Jan 2017 11:36:11 +0000 (12:36 +0100)]
avcodec/exr: make it aware of 2 additional compressions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Aleksandr Slobodeniuk [Wed, 18 Jan 2017 10:11:48 +0000 (13:11 +0300)]
avcodec/avcodec: fix lil typo in comment
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Michael Niedermayer [Thu, 19 Jan 2017 18:15:42 +0000 (19:15 +0100)]
avcodec/speedhq: Fix warning about "initialization from incompatible pointer type"
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Thu, 19 Jan 2017 14:42:47 +0000 (15:42 +0100)]
avcodec/wmaprodec: check number of channels for XMA streams
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Thu, 19 Jan 2017 12:32:21 +0000 (13:32 +0100)]
avcodec/pixlet: use av_clip_uintp2_c explicitly
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Thu, 19 Jan 2017 12:19:10 +0000 (13:19 +0100)]
avcodec/pixlet: use av_clip_uintp2()
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Paul B Mahol [Thu, 19 Jan 2017 11:49:41 +0000 (12:49 +0100)]
avcodec/pixlet: clip chroma before shifting
Fixes artifacts.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
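Why the order of clipping and shifting matters can be shown with a short C sketch. It assumes, without having checked the pixlet source, that a chroma sample of `depth` bits is scaled up to 16 bits with a left shift. clip_unsigned_bits() is a plain stand-in with the same intent as av_clip_uintp2(), and clip_then_shift() and shift_then_truncate() are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a signed value to the unsigned range [0, 2^bits - 1]. */
static uint32_t clip_unsigned_bits(int32_t v, int bits)
{
    assert(bits > 0 && bits < 31);
    const int32_t max = (int32_t)((1u << bits) - 1u);

    if (v < 0)
        return 0;
    if (v > max)
        return (uint32_t)max;
    return (uint32_t)v;
}

/* Clip first, then scale up to 16 bits: the result always fits. */
static uint16_t clip_then_shift(int32_t v, int depth)
{
    assert(depth > 0 && depth <= 16);
    return (uint16_t)(clip_unsigned_bits(v, depth) << (16 - depth));
}

/* Shift first and truncate: out-of-range input wraps around. */
static uint16_t shift_then_truncate(int32_t v, int depth)
{
    assert(depth > 0 && depth <= 16);
    return (uint16_t)((uint32_t)v << (16 - depth));
}
```

For an in-range 10-bit value such as 512 both helpers agree. For an overshoot such as 1500, clip_then_shift() saturates at the maximum while shift_then_truncate() wraps to a much darker value, which is the kind of visible artifact the commit refers to.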
Paul B Mahol [Thu, 19 Jan 2017 11:29:41 +0000 (12:29 +0100)]
avcodec/wmapro: redone stream selection for XMA1/2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Clément Bœsch [Wed, 18 Jan 2017 17:13:02 +0000 (18:13 +0100)]
lavc/h264: simplify find_unused_picture()
Piotr Bandurski [Wed, 18 Jan 2017 13:49:19 +0000 (14:49 +0100)]
avformat/caf: add 'aacl' codec tag
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Tobias Rapp [Wed, 18 Jan 2017 09:27:01 +0000 (10:27 +0100)]
ffmpeg: pass output stream duration as a hint to the muxer
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Paul B Mahol [Tue, 17 Jan 2017 14:54:57 +0000 (15:54 +0100)]
avcodec/wmaprodec: >2 channel support for XMA
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Steven Liu [Wed, 18 Jan 2017 23:09:22 +0000 (07:09 +0800)]
avfilter/vf_drawtext: add line_spacing parameter
add line_spacing parameter to set the space between two lines
Based on an idea by: Leandro Santiago <leandrosansilva@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
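A minimal sketch of what a line spacing value does to text layout, assuming each new line is advanced by the line height plus the extra spacing. That assumption has not been checked against the drawtext source, and line_top_y() is a hypothetical helper, not a function from the filter.

```c
#include <assert.h>

/* Top y coordinate of line `index` (0-based) in a multi-line text block. */
static int line_top_y(int y0, int line_height, int line_spacing, int index)
{
    assert(index >= 0 && line_height > 0);
    return y0 + index * (line_height + line_spacing);
}
```

With a spacing of 0 the lines sit directly under each other. A positive value pushes every later line down by that many pixels per preceding line.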
Steven Liu [Wed, 18 Jan 2017 23:06:50 +0000 (07:06 +0800)]
avformat/hlsenc: fix bug of hlsenc http delete old segments
When pushing HLS to an HTTP server, the old segments cannot be deleted by the hls muxer,
so add the HTTP option to hls_delete_old_segments.
Reported-by: Yin Jiaoyuan <yinjiaoyuan@163.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Clément Bœsch [Tue, 17 Jan 2017 09:50:01 +0000 (10:50 +0100)]
lavc/h264dec: remove flush goto in decode callback
Steven Liu [Wed, 18 Jan 2017 15:18:41 +0000 (23:18 +0800)]
avformat/hlsenc: remove debug message logged at error level
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>