FFmpeg multithreading methods
==============================================

FFmpeg provides two methods for multithreading codecs, controlled by
AVCodecContext thread_type:

Slice threading decodes multiple parts of a frame at the same time, using
execute().

Frame threading decodes more than one frame at once at the cost of additional
decoder delay. Given X threads, it queues the first X submitted frames before
returning any output; from then on, each call returns the frame submitted X
calls earlier while the following frames are decoded on separate threads.

Restrictions on clients
==============================================

Slice threading -
* If the client uses draw_horiz_band, it must handle it being called from
  separate threads.
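
As a rough illustration of what "handle it being called from separate threads"
means for a client, the sketch below delivers bands of one frame from several
threads into a callback that guards its shared state with a mutex. BandSink,
sink_draw_band(), SliceJob and deliver_bands() are hypothetical names invented
for this example; they are not part of FFmpeg, and the callback signature is
deliberately simpler than the real draw_horiz_band.

```c
#include <pthread.h>

#define SKETCH_SLICES 4

/* Hypothetical client-side state shared by all band callbacks. */
typedef struct BandSink {
    pthread_mutex_t lock;
    int rows_received;      /* total rows delivered so far */
} BandSink;

/* Hypothetical callback in the spirit of draw_horiz_band: it may run on
 * any thread, so every access to shared state happens under the lock. */
static void sink_draw_band(BandSink *sink, int y, int h)
{
    (void)y; /* a real client would display or copy rows y..y+h-1 here */
    pthread_mutex_lock(&sink->lock);
    sink->rows_received += h;
    pthread_mutex_unlock(&sink->lock);
}

typedef struct SliceJob {
    BandSink *sink;
    int y, h;
} SliceJob;

static void *slice_worker(void *arg)
{
    SliceJob *job = arg;
    sink_draw_band(job->sink, job->y, job->h);
    return NULL;
}

/* Split `height` rows into SKETCH_SLICES bands and deliver each band from
 * its own thread. Returns the number of rows the sink saw, or -1 if a
 * thread could not be created. */
static int deliver_bands(int height)
{
    BandSink sink;
    SliceJob jobs[SKETCH_SLICES];
    pthread_t threads[SKETCH_SLICES];
    int started = 0, ok = 1, total;
    int rows_per_slice = (height + SKETCH_SLICES - 1) / SKETCH_SLICES;

    pthread_mutex_init(&sink.lock, NULL);
    sink.rows_received = 0;

    for (int i = 0; i < SKETCH_SLICES; i++) {
        int y = i * rows_per_slice;
        int h = height - y < rows_per_slice ? height - y : rows_per_slice;
        if (h < 0)
            h = 0;          /* more slices than rows: empty band */
        jobs[i] = (SliceJob){ &sink, y, h };
        if (pthread_create(&threads[i], NULL, slice_worker, &jobs[i]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    total = sink.rows_received;
    pthread_mutex_destroy(&sink.lock);
    return ok ? total : -1;
}
```

Calling deliver_bands(1080) should report all 1080 rows regardless of the order
in which the worker threads happen to run; without the mutex the row counter
could be updated by two threads at once.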

Frame threading -
* Restrictions with slice threading also apply.
* get_buffer and release_buffer will be called by separate threads, but are
  protected by a mutex, so do not need to be reentrant.
* There is one frame of delay added for every thread. Use of reordered_opaque
  will help with A/V sync problems. Once all frames have been submitted, a
  call that returns no frame does not mean that no frames are left; clients
  must not assume that it does.
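
The delay described above can be pictured with a toy model. The sketch below is
not FFmpeg code: ToyDecoder, toy_decode() and toy_decode_all() are hypothetical
names, frames are plain integers, a negative id stands in for a flush request,
and the delay is assumed to be exactly one frame per thread as stated above.

```c
#include <stddef.h>

#define TOY_MAX_THREADS 16

/* Hypothetical toy decoder: holds back `delay` frames before output. */
typedef struct ToyDecoder {
    int    queue[TOY_MAX_THREADS]; /* ids of frames still in flight */
    size_t head;                   /* index of the oldest queued frame */
    size_t count;                  /* number of queued frames */
    size_t delay;                  /* assumed: one frame per thread */
} ToyDecoder;

static void toy_decoder_init(ToyDecoder *d, size_t threads)
{
    if (threads < 1)
        threads = 1;
    if (threads > TOY_MAX_THREADS)
        threads = TOY_MAX_THREADS;
    d->head  = 0;
    d->count = 0;
    d->delay = threads;
}

/* Remove and return the oldest queued frame; the queue must not be empty. */
static int toy_pop(ToyDecoder *d)
{
    int id = d->queue[d->head];
    d->head = (d->head + 1) % d->delay;
    d->count--;
    return id;
}

/* Submit frame_id (>= 0), or pass a negative value to flush.
 * Returns a finished frame id, or -1 if no frame is available. */
static int toy_decode(ToyDecoder *d, int frame_id)
{
    int out = -1;

    if (frame_id < 0)                       /* flush request */
        return d->count ? toy_pop(d) : -1;

    if (d->count == d->delay)               /* queue full: oldest is done */
        out = toy_pop(d);
    d->queue[(d->head + d->count) % d->delay] = frame_id;
    d->count++;
    return out;
}

/* Client loop: submit nframes frames, then keep flushing until the
 * decoder is empty. Returns the number of frames written to out[]. */
static size_t toy_decode_all(size_t threads, int nframes, int *out)
{
    ToyDecoder d;
    size_t n = 0;
    int id;

    toy_decoder_init(&d, threads);
    for (int i = 0; i < nframes; i++)
        if ((id = toy_decode(&d, i)) >= 0)
            out[n++] = id;
    while ((id = toy_decode(&d, -1)) >= 0)  /* drain delayed frames */
        out[n++] = id;
    return n;
}
```

With three threads, the first three submissions return nothing in this model,
and the last three frames only appear during the flush loop. A client that
stopped after the final submission would lose them.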

Restrictions on codecs
==============================================

Slice threading -
None.

Frame threading -
* Relying on previous contents of buffers no longer works. This includes using
  reget_buffer() and not copying skipped macroblocks (MBs). Buffers will have
  age set to INT_MAX, so this won't be a problem in most cases.
* Accepting randomly truncated packets (CODEC_FLAG_TRUNCATED) no longer works.
* Some codecs (such as ffv1) can't be multithreaded.
* If the codec uses draw_edges, it must be called before
  ff_report_frame_progress() is called on any row.

Porting codecs to frame threading
==============================================

1. Fix the above restrictions.

2. Find all the context variables that are needed by the next frame, and make
sure they aren't changed after the actual decoding process starts. Code that
does this can either be moved up, put under if (!USE_FRAME_THREADING()) and
later copied into update_context(), or changed to work on a copy of the
variables it changes.

3. If the codec allocates writable tables in its init(), add an init_copy()
which re-allocates them. If it uses inter-frame compression, add an
update_context() which copies everything necessary for the next frame and does
whatever operations would otherwise be done at the end of decoding the
previous frame.

Add CODEC_CAP_FRAME_THREADS to the codec's capabilities. At this stage there
won't be any speed gain, but decoding should work.
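
The split between init_copy() and update_context() in step 3 might look
roughly like the sketch below. It uses a hypothetical SketchContext rather than
a real codec context, and the sketch_*() signatures are invented for
illustration; the actual callback prototypes in libavcodec differ. Copying the
table contents in sketch_init_copy() and incrementing frame_num in
sketch_update_context() are assumptions chosen to keep the example small.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread codec context; not an FFmpeg structure. */
typedef struct SketchContext {
    int            width, height; /* parameters needed by the next frame */
    int            frame_num;     /* inter-frame state */
    size_t         table_size;
    unsigned char *table;         /* writable table allocated in init() */
} SketchContext;

/* init(): allocate the writable table. table_size must be > 0. */
static int sketch_init(SketchContext *c, int width, int height,
                       size_t table_size)
{
    memset(c, 0, sizeof(*c));
    c->width      = width;
    c->height     = height;
    c->table_size = table_size;
    c->table      = calloc(table_size, 1);
    return c->table ? 0 : -1;
}

/* init_copy(): the struct copy duplicates the table pointer, so the table
 * is re-allocated to keep two threads from writing to the same memory. */
static int sketch_init_copy(SketchContext *dst, const SketchContext *src)
{
    *dst = *src;
    dst->table = malloc(src->table_size);
    if (!dst->table)
        return -1;
    memcpy(dst->table, src->table, src->table_size);
    return 0;
}

/* update_context(): copy what the next frame needs from the previous
 * thread's context, and do the end-of-frame bookkeeping (here: advancing
 * the frame counter) that would otherwise run after decoding. */
static void sketch_update_context(SketchContext *dst,
                                  const SketchContext *src)
{
    dst->width     = src->width;
    dst->height    = src->height;
    dst->frame_num = src->frame_num + 1;
    memcpy(dst->table, src->table, src->table_size);
}

static void sketch_close(SketchContext *c)
{
    free(c->table);
    c->table = NULL;
}
```

After sketch_init_copy(), writes to one context's table no longer show up in
the other, which is the property the per-thread copies rely on.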

4. After decoding some part of a frame, call ff_report_frame_progress(). Units
don't matter - MB rows work for most codecs, but pixel rows may be better if
the codec uses a deblocking filter. Codecs using MpegEncContext should make
sure they call ff_draw_horiz_band() correctly.

Before accessing a reference frame, call ff_await_frame_progress().

5. Call ff_report_frame_setup_done() as soon as possible. This will start the
next thread.
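
The report/await pairing from step 4 can be pictured as a counter protected by
a mutex and a condition variable. The sketch below is a generic pthread
illustration under that assumption, not the libavcodec implementation:
RowProgress, the row_progress_*() functions, RefJob and decode_dependent_row()
are hypothetical names, and progress is counted in rows.

```c
#include <pthread.h>

/* Hypothetical progress tracker for one reference frame. */
typedef struct RowProgress {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             row;    /* highest finished row, -1 before any report */
} RowProgress;

static void row_progress_init(RowProgress *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->cond, NULL);
    p->row = -1;
}

static void row_progress_destroy(RowProgress *p)
{
    pthread_cond_destroy(&p->cond);
    pthread_mutex_destroy(&p->lock);
}

/* Producer side: announce that all rows up to `row` are decoded. */
static void row_progress_report(RowProgress *p, int row)
{
    pthread_mutex_lock(&p->lock);
    if (row > p->row) {
        p->row = row;
        pthread_cond_broadcast(&p->cond);
    }
    pthread_mutex_unlock(&p->lock);
}

/* Consumer side: block until `row` is decoded; returns the progress seen. */
static int row_progress_await(RowProgress *p, int row)
{
    int seen;
    pthread_mutex_lock(&p->lock);
    while (p->row < row)
        pthread_cond_wait(&p->cond, &p->lock);
    seen = p->row;
    pthread_mutex_unlock(&p->lock);
    return seen;
}

typedef struct RefJob {
    RowProgress *progress;
    int rows;
} RefJob;

/* Stands in for the thread decoding the reference frame row by row. */
static void *decode_reference_frame(void *arg)
{
    RefJob *job = arg;
    for (int r = 0; r < job->rows; r++)
        row_progress_report(job->progress, r);
    return NULL;
}

/* Stands in for the next frame's thread: it waits until the reference
 * row it needs is available. Returns the observed progress, or -1 if the
 * request is out of range or the thread could not be started. */
static int decode_dependent_row(int ref_rows, int needed_row)
{
    RowProgress progress;
    RefJob job = { &progress, ref_rows };
    pthread_t thread;
    int seen;

    if (needed_row < 0 || needed_row >= ref_rows)
        return -1;          /* would otherwise wait forever */
    row_progress_init(&progress);
    if (pthread_create(&thread, NULL, decode_reference_frame, &job) != 0) {
        row_progress_destroy(&progress);
        return -1;
    }
    seen = row_progress_await(&progress, needed_row);
    pthread_join(thread, NULL);
    row_progress_destroy(&progress);
    return seen;
}
```

The waiting thread only proceeds once the producer has reported at least the
requested row, so the value returned by decode_dependent_row() is never smaller
than needed_row. Reporting progress early and in small units shortens the time
the dependent thread spends blocked.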
