doc/ffmpeg-3.0.2/filter_design.txt

Filter design
=============

This document explains guidelines that should be observed (or ignored with
good reason) when writing filters for libavfilter.

In this document, the word “frame” indicates either a video frame or a group
of audio samples, as stored in an AVFilterBuffer structure.


Format negotiation
==================

  The query_formats method should set, for each input and each output link,
  the list of supported formats.

  For video links, that means pixel format. For audio links, that means
  channel layout, sample format (the sample packing is implied by the sample
  format) and sample rate.

  The lists are not just lists, they are references to shared objects. When
  the negotiation mechanism computes the intersection of the formats
  supported at each end of a link, all references to both lists are replaced
  with a reference to the intersection. And when a single format is
  eventually chosen for a link amongst the remaining list, again, all
  references to the list are updated.

  That means that if a filter requires that its input and output have the
  same format amongst a supported list, all it has to do is use a reference
  to the same list of formats.
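
  The sharing described above can be pictured with a small self-contained
  sketch. It does not use libavfilter: ToyFormats, toy_formats_ref and
  toy_formats_merge are hypothetical names, and the fixed-size arrays are a
  simplification. Each list remembers every pointer that refers to it, so
  merging two lists can redirect all of those pointers to the intersection.

```c
#define TOY_MAX 8  /* arbitrary capacity for this sketch; no bounds checking */

/* Hypothetical stand-in for a shared list of supported formats. */
typedef struct ToyFormats {
    int fmts[TOY_MAX];                 /* supported format identifiers */
    int nb_fmts;
    struct ToyFormats **refs[TOY_MAX]; /* every pointer that refers to this list */
    int nb_refs;
} ToyFormats;

/* Make *slot refer to list, and let the list remember that slot. */
static void toy_formats_ref(ToyFormats *list, ToyFormats **slot)
{
    *slot = list;
    list->refs[list->nb_refs++] = slot;
}

/* Reduce a to the intersection of a and b, then redirect every reference
 * to b so that it points to a. Returns the number of common formats. */
static int toy_formats_merge(ToyFormats *a, ToyFormats *b)
{
    int i, j, k = 0;

    if (a == b)
        return a->nb_fmts;
    for (i = 0; i < a->nb_fmts; i++) {
        for (j = 0; j < b->nb_fmts; j++) {
            if (a->fmts[i] == b->fmts[j]) {
                a->fmts[k++] = a->fmts[i];
                break;
            }
        }
    }
    a->nb_fmts = k;
    for (i = 0; i < b->nb_refs; i++) {
        *b->refs[i] = a;
        a->refs[a->nb_refs++] = b->refs[i];
    }
    b->nb_refs = 0;
    return k;
}
```

  In this model, a filter that stores the same list in both its input slot
  and its output slot sees both narrowed at once when the input side is
  merged with an upstream list.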

  query_formats can leave some formats unset and return AVERROR(EAGAIN) to
  cause the negotiation mechanism to try again later. That can be used by
  filters with complex requirements to use the format negotiated on one link
  to set the formats supported on another.
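
  A minimal sketch of the retry idea, written outside libavfilter: ToyCtx
  and toy_query_formats are hypothetical, and -EAGAIN stands in for
  AVERROR(EAGAIN). The output format is only set once the input link has
  been narrowed to a single format; until then the function asks to be
  called again.

```c
#include <errno.h>

/* Hypothetical per-filter state: the formats still possible on the input
 * link, and the format the filter will announce on its output. */
typedef struct ToyCtx {
    int in_fmts[4];
    int nb_in_fmts;
    int out_fmt;      /* -1 while unset */
} ToyCtx;

/* Returns 0 when the output format could be set, or -EAGAIN (standing in
 * for AVERROR(EAGAIN)) when the input is not settled yet. */
static int toy_query_formats(ToyCtx *c)
{
    if (c->nb_in_fmts != 1) {
        c->out_fmt = -1;     /* leave the output unset for now */
        return -EAGAIN;
    }
    /* Assumption for the example: the output format depends on the input
     * format; here it is simply the input identifier plus 100. */
    c->out_fmt = c->in_fmts[0] + 100;
    return 0;
}
```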


Buffer references ownership and permissions
===========================================

  Principle
  ---------

    Audio and video data are voluminous; the buffer and buffer reference
    mechanism is intended to avoid, as much as possible, expensive copies of
    that data while still allowing the filters to produce correct results.

    The data is stored in buffers represented by AVFilterBuffer structures.
    They must not be accessed directly, but through references stored in
    AVFilterBufferRef structures. Several references can point to the
    same buffer; the buffer is automatically deallocated once all
    corresponding references have been destroyed.

    The characteristics of the data (resolution, sample rate, etc.) are
    stored in the reference; different references for the same buffer can
    show different characteristics. In particular, a video reference can
    point to only a part of a video buffer.

    A reference is usually obtained as input to the start_frame or
    filter_frame method or requested using the ff_get_video_buffer or
    ff_get_audio_buffer functions. A new reference on an existing buffer can
    be created with the avfilter_ref_buffer function. A reference is
    destroyed using the avfilter_unref_bufferp function.
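
    The buffer and reference relationship can be illustrated with a toy
    model. This is a sketch under stated assumptions, not the libavfilter
    implementation: ToyBuffer, ToyBufferRef, toy_get_buffer, toy_ref_buffer
    and toy_unref_bufferp are hypothetical names, and perms is treated as a
    plain bit mask. The characteristics (w, h) live in the reference, and
    the buffer is freed when its last reference is destroyed.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for AVFilterBuffer and AVFilterBufferRef. */
typedef struct ToyBuffer {
    unsigned char *data;
    int refcount;          /* number of references pointing here */
} ToyBuffer;

typedef struct ToyBufferRef {
    ToyBuffer *buf;
    int w, h;              /* characteristics are stored per reference */
    int perms;
} ToyBufferRef;

static int toy_live_buffers; /* instrumentation for the example only */

/* Allocate a buffer and return the first reference to it. */
static ToyBufferRef *toy_get_buffer(int w, int h, int perms)
{
    ToyBuffer *buf = malloc(sizeof(*buf));
    ToyBufferRef *ref = malloc(sizeof(*ref));

    if (buf)
        buf->data = calloc((size_t)w * (size_t)h, 1);
    if (!buf || !ref || !buf->data) {
        if (buf)
            free(buf->data);
        free(buf);
        free(ref);
        return NULL;
    }
    buf->refcount = 1;
    ref->buf = buf;
    ref->w = w;
    ref->h = h;
    ref->perms = perms;
    toy_live_buffers++;
    return ref;
}

/* New reference to the same buffer; pmask selects the permissions to keep. */
static ToyBufferRef *toy_ref_buffer(const ToyBufferRef *ref, int pmask)
{
    ToyBufferRef *copy = malloc(sizeof(*copy));

    if (!copy)
        return NULL;
    *copy = *ref;
    copy->perms &= pmask;
    copy->buf->refcount++;
    return copy;
}

/* Destroy a reference and reset the pointer; the last one frees the buffer. */
static void toy_unref_bufferp(ToyBufferRef **ref)
{
    if (!ref || !*ref)
        return;
    if (--(*ref)->buf->refcount == 0) {
        free((*ref)->buf->data);
        free((*ref)->buf);
        toy_live_buffers--;
    }
    free(*ref);
    *ref = NULL;
}
```

    Taking a pointer to the reference in the unref function lets it clear
    the caller's pointer, which makes accidental reuse easier to catch.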

  Reference ownership
  -------------------

    At any time, a reference “belongs” to a particular piece of code,
    usually a filter. With a few caveats that will be explained below, only
    that piece of code is allowed to access it. It is also responsible for
    destroying it, although this is sometimes done automatically (see the
    section on link reference fields).

    Here are the (fairly obvious) rules for reference ownership:

    * A reference received by the filter_frame method (or its deprecated
      start_frame version) belongs to the corresponding filter.

      Special exception for video references: the reference may be used
      internally for automatic copying and must not be destroyed before
      end_frame; it can be given away to ff_start_frame.

    * A reference passed to ff_filter_frame (or the deprecated
      ff_start_frame) is given away and must no longer be used.

    * A reference created with avfilter_ref_buffer belongs to the code that
      created it.

    * A reference obtained with ff_get_video_buffer or ff_get_audio_buffer
      belongs to the code that requested it.

    * A reference given as return value by the get_video_buffer or
      get_audio_buffer method is given away and must no longer be used.

  Link reference fields
  ---------------------

    The AVFilterLink structure has a few AVFilterBufferRef fields. The
    cur_buf and out_buf were used with the deprecated
    start_frame/draw_slice/end_frame API and should no longer be used.
    src_buf and partial_buf are used by libavfilter internally
    and must not be accessed by filters.

  Reference permissions
  ---------------------

    The AVFilterBufferRef structure has a perms field that describes what
    the code that owns the reference is allowed to do to the buffer data.
    Different references for the same buffer can have different permissions.

    For video filters that implement the deprecated
    start_frame/draw_slice/end_frame API, the permissions only apply to the
    parts of the buffer that have already been covered by the draw_slice
    method.

    The value is a binary OR of the following constants:

    * AV_PERM_READ: the owner can read the buffer data; this is essentially
      always true and is there for self-documentation.

    * AV_PERM_WRITE: the owner can modify the buffer data.

    * AV_PERM_PRESERVE: the owner can rely on the fact that the buffer data
      will not be modified by previous filters.

    * AV_PERM_REUSE: the owner can output the buffer several times, without
      modifying the data in between.

    * AV_PERM_REUSE2: the owner can output the buffer several times and
      modify the data in between (useless without the WRITE permissions).

    * AV_PERM_ALIGN: the owner can access the data using fast operations
      that require data alignment.

    The READ, WRITE and PRESERVE permissions are about sharing the same
    buffer between several filters to avoid expensive copies without them
    doing conflicting changes on the data.

    The REUSE and REUSE2 permissions are about special memory for direct
    rendering. For example, a buffer directly allocated in video memory must
    not be modified once it is displayed on screen, or it will cause
    tearing; it will therefore not have the REUSE2 permission.

    The ALIGN permission is about extracting part of the buffer, for
    copy-less padding or cropping for example.


    References received on input pads are guaranteed to have all the
    permissions stated in the min_perms field and none of the permissions
    stated in the rej_perms.
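
    The min_perms and rej_perms guarantee amounts to a simple bit test. The
    sketch below is illustrative only: the TOY_PERM_* values and the
    toy_perms_ok helper are hypothetical and do not reproduce the actual
    AV_PERM_* definitions.

```c
/* Illustrative bit values; the real AV_PERM_* constants are defined by
 * libavfilter and are not reproduced here. */
enum {
    TOY_PERM_READ     = 1 << 0,
    TOY_PERM_WRITE    = 1 << 1,
    TOY_PERM_PRESERVE = 1 << 2,
    TOY_PERM_REUSE    = 1 << 3,
    TOY_PERM_REUSE2   = 1 << 4,
    TOY_PERM_ALIGN    = 1 << 5
};

/* Nonzero if perms contains every bit of min_perms and no bit of rej_perms. */
static int toy_perms_ok(int perms, int min_perms, int rej_perms)
{
    return (perms & min_perms) == min_perms && (perms & rej_perms) == 0;
}
```

    For instance, a reference carrying READ and WRITE satisfies a pad that
    requires both, but not a pad that requires PRESERVE or one that rejects
    WRITE.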

    References obtained by ff_get_video_buffer and ff_get_audio_buffer are
    guaranteed to have at least all the permissions requested as argument.

    References created by avfilter_ref_buffer have the same permissions as
    the original reference minus the ones explicitly masked; the mask is
    usually ~0 to keep the same permissions.

    Filters should remove permissions on references they give to output
    whenever necessary. This can be done automatically by setting the
    rej_perms field on the output pad.

    Here are a few guidelines corresponding to common situations:

    * Filters that modify and forward their frame (like drawtext) need the
      WRITE permission.

    * Filters that read their input to produce a new frame on output (like
      scale) need the READ permission on input and must request a buffer
      with the WRITE permission.

    * Filters that intend to keep a reference after the filtering process
      is finished (after filter_frame returns) must have the PRESERVE
      permission on it and remove the WRITE permission if they create a new
      reference to give it away.

    * Filters that intend to modify a reference they have kept after the end
      of the filtering process need the REUSE2 permission and must remove
      the PRESERVE permission if they create a new reference to give it
      away.


Frame scheduling
================

  The purpose of these rules is to ensure that frames flow in the filter
  graph without getting stuck and accumulating somewhere.

  Simple filters that output one frame for each input frame should not have
  to worry about it.

  filter_frame
  ------------

    This method is called when a frame is pushed to the filter's input. It
    can be called at any time except in a reentrant way.

    If the input frame is enough to produce output, then the filter should
    push the output frames on the output link immediately.

    As an exception to the previous rule, if the input frame is enough to
    produce several output frames, then the filter only needs to output at
    least one per link. The additional frames can be left buffered in the
    filter; these buffered frames must be flushed immediately if a new input
    produces new output.

    (Example: frame rate-doubling filter: filter_frame must (1) flush the
    second copy of the previous frame, if it is still there, (2) push the
    first copy of the incoming frame, (3) keep the second copy for later.)
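
    The three steps of the frame rate-doubling example can be sketched as
    follows. This is a toy model, not libavfilter code: ToyDoubler, toy_push
    and toy_filter_frame are hypothetical, frames are plain integers, and a
    small array stands in for the output link.

```c
#define TOY_OUT_MAX 16

/* Hypothetical state of a frame rate-doubling filter. */
typedef struct ToyDoubler {
    int pending;            /* second copy waiting to be output */
    int has_pending;
    int out[TOY_OUT_MAX];   /* stands in for the output link */
    int nb_out;
} ToyDoubler;

/* Record a frame as pushed on the output; silently ignored when full. */
static void toy_push(ToyDoubler *s, int frame)
{
    if (s->nb_out < TOY_OUT_MAX)
        s->out[s->nb_out++] = frame;
}

static int toy_filter_frame(ToyDoubler *s, int frame)
{
    if (s->has_pending) {          /* (1) flush the leftover second copy */
        toy_push(s, s->pending);
        s->has_pending = 0;
    }
    toy_push(s, frame);            /* (2) push the first copy */
    s->pending = frame;            /* (3) keep the second copy for later */
    s->has_pending = 1;
    return 0;
}
```

    In this model the pending copy leaves the filter either on the next
    toy_filter_frame call or when something else asks for queued output.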

    If the input frame is not enough to produce output, the filter must not
    call request_frame to get more. It must just process the frame or queue
    it. The task of requesting more frames is left to the filter's
    request_frame method or the application.

    If a filter has several inputs, the filter must be ready for frames
    arriving randomly on any input. Therefore, any filter with several inputs
    will most likely require some kind of queuing mechanism. It is perfectly
    acceptable to have a limited queue and to drop frames when the inputs
    are too unbalanced.
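
    One possible shape for such a limited queue is a fixed-size FIFO per
    input. The sketch below is only an illustration: ToyQueue, toy_queue_add
    and toy_queue_take are hypothetical, frames are plain integers, and
    dropping the oldest frame when full is an assumed policy, not one the
    text prescribes.

```c
#define TOY_QUEUE_SIZE 4   /* arbitrary limit for the sketch */

/* Hypothetical bounded FIFO, one per input. */
typedef struct ToyQueue {
    int frames[TOY_QUEUE_SIZE];
    int head;      /* index of the oldest frame */
    int count;
    int dropped;   /* how many frames were discarded */
} ToyQueue;

/* Queue a frame; if the queue is full, discard the oldest one first. */
static void toy_queue_add(ToyQueue *q, int frame)
{
    if (q->count == TOY_QUEUE_SIZE) {
        q->head = (q->head + 1) % TOY_QUEUE_SIZE;
        q->count--;
        q->dropped++;
    }
    q->frames[(q->head + q->count) % TOY_QUEUE_SIZE] = frame;
    q->count++;
}

/* Take the oldest frame; returns 0 on success, -1 if the queue is empty. */
static int toy_queue_take(ToyQueue *q, int *frame)
{
    if (!q->count)
        return -1;
    *frame = q->frames[q->head];
    q->head = (q->head + 1) % TOY_QUEUE_SIZE;
    q->count--;
    return 0;
}
```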

  request_frame
  -------------

    This method is called when a frame is wanted on an output.

    For an input, it should directly call filter_frame on the corresponding
    output.

    For a filter, if there are queued frames already ready, one of these
    frames should be pushed. If not, the filter should request a frame on
    one of its inputs, repeatedly until at least one frame has been pushed.

    Return values:
    if request_frame could produce a frame, or at least make progress
    towards producing a frame, it should return 0;
    if it could not for temporary reasons, it should return AVERROR(EAGAIN);
    if it could not because there are no more frames, it should return
    AVERROR_EOF.

    The typical implementation of request_frame for a filter with several
    inputs will look like this:

        if (frames_queued) {
            /* A frame is already buffered: push it and report progress. */
            push_one_frame();
            return 0;
        }
        /* Nothing queued: ask the input that needs a frame the most. */
        input = input_where_a_frame_is_most_needed();
        ret = ff_request_frame(input);
        if (ret == AVERROR_EOF) {
            /* No more frames will arrive on this input. */
            process_eof_on_input();
        } else if (ret < 0) {
            return ret;
        }
        return 0;

    Note that, except for filters that can have queued frames, request_frame
    does not push frames: it requests them from its input, and as a
    reaction, the filter_frame method possibly will be called and do the
    work.
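
    The pull-then-push flow in the note above can be pictured with a toy
    chain made of a source, a pass-through filter and a sink. Everything
    here is hypothetical (ToySource, ToySink, TOY_EOF and the toy_*
    functions), -EAGAIN stands in for AVERROR(EAGAIN), and frames are plain
    integers. The filter's request function pushes nothing itself; the frame
    reaches the sink only because the source calls filter_frame in response.

```c
#include <errno.h>

#define TOY_EOF (-0x1000)   /* hypothetical stand-in for AVERROR_EOF */

typedef struct ToySink {
    int frames[8];
    int nb_frames;
} ToySink;

typedef struct ToySource {
    int next, count;   /* next frame number and total number of frames */
    int ready;         /* zero while the source is temporarily unable to output */
} ToySource;

/* filter_frame of a pass-through filter: forwards the frame to the sink. */
static int toy_filter_frame(ToySink *sink, int frame)
{
    if (sink->nb_frames == 8)
        return -ENOMEM;
    sink->frames[sink->nb_frames++] = frame;
    return 0;
}

/* request_frame of a source: calls filter_frame on its output directly. */
static int toy_source_request_frame(ToySource *src, ToySink *sink)
{
    if (src->next >= src->count)
        return TOY_EOF;             /* no more frames */
    if (!src->ready)
        return -EAGAIN;             /* temporary failure, try again later */
    return toy_filter_frame(sink, src->next++);
}

/* request_frame of the pass-through filter: it only forwards the request
 * to its input and lets filter_frame do the work. */
static int toy_filter_request_frame(ToySource *src, ToySink *sink)
{
    return toy_source_request_frame(src, sink);
}
```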

Legacy API
==========

  Until libavfilter 3.23, the filter_frame method was split:

  - for video filters, it was made of start_frame, draw_slice (that could be
    called several times on distinct parts of the frame) and end_frame;

  - for audio filters, it was called filter_samples.