[parsec-users] Is x264 really pipeline parallelism?
nchen.dev at mac.com
Tue Aug 2 17:44:43 EDT 2011
According to the PARSEC technical report <http://parsec.cs.princeton.edu/doc/parsec-report.pdf>, dedup, ferret and x264 are parallelized using pipeline parallelism. With dedup and ferret it was pretty obvious what the pipelines and their stages/filters were. That is not the case with x264.
From the technical report:
"The parallel algorithm of x264 uses the pipeline model with one stage per input
video frame. This results in a virtual pipeline with as many stages as there are
input frames. x264 processes a number of pipeline stages equal to the number of
encoder threads in parallel, resulting in a sliding window which moves from the
beginning of the pipeline to its end."
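As I read the quoted passage, the "sliding window" can be sketched roughly as below. This is only my interpretation of the report's wording, not code from x264: the function name sliding_windows is made up, and I am assuming the window advances by exactly one frame per step.

```python
def sliding_windows(num_frames, num_threads):
    """Yield the frame indices that are in flight at each window position.

    Assumption (not from the x264 source): the window holds one frame per
    encoder thread and advances by one frame at a time.
    """
    width = min(num_frames, num_threads)
    if width <= 0:
        return
    for start in range(num_frames - width + 1):
        yield list(range(start, start + width))


if __name__ == "__main__":
    for window in sliding_windows(num_frames=5, num_threads=3):
        print(window)
```

With 5 frames and 3 threads this gives three window positions, the first covering frames 0 to 2 and the last covering frames 2 to 4.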
When looking at the code, it is not obvious that the frames actually form the stages of a pipeline. My understanding is that a pipeline is a linear series of interconnected stages/filters. Each stage/filter performs some operation on its input token and emits an output token for the next stage. For instance, in ferret the stages are:
Input -> Segmentation -> Extraction -> Indexing -> Ranking -> Output
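To make clear what I mean by a pipeline, here is a minimal sketch with one thread per stage and a queue between neighbouring stages. It is not ferret's code: run_stage, run_pipeline and the placeholder stage functions are hypothetical names, and the real stages obviously do far more work than appending a label.

```python
import queue
import threading

_DONE = object()  # sentinel that tells a stage the input stream has ended


def run_stage(work, inbox, outbox):
    """Apply `work` to each token from `inbox` and pass the result on."""
    while True:
        token = inbox.get()
        if token is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(work(token))


def run_pipeline(stage_funcs, tokens):
    """Run tokens through a linear chain of stages, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stage_funcs)
    ]
    for t in threads:
        t.start()
    for token in tokens:
        queues[0].put(token)
    queues[0].put(_DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Placeholder stages named after the ferret stages listed above.
    names = ["segmentation", "extraction", "indexing", "ranking"]
    stages = [lambda tok, n=n: tok + [n] for n in names]
    print(run_pipeline(stages, [["img0"], ["img1"]]))
```

The point is that the stages are the nodes doing the work and the tokens are the data flowing through them, so several tokens can be in different stages at the same time.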
In x264, by contrast, a thread is spawned for each frame. Each frame is classified as an I, B or P frame. The thread then performs some operations (residual calculation, quantization, etc.) on the macroblocks in the frame and finally encodes it. Because frames and macroblocks have inter- and intra-frame dependencies on one another, these threads are not embarrassingly parallel. From this description, the Pthreads version of x264 looks more like task parallelism.
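A rough sketch of the thread-per-frame structure as I understand it follows. This is a simplification for illustration and not x264's actual code: encode_frames and the deps mapping are hypothetical, the "encoding" is just recording the frame number, and I assume a thread waits for its whole reference frames to finish, which is coarser than whatever synchronization the real encoder uses.

```python
import threading


def encode_frames(deps):
    """Spawn one thread per frame; each waits for the frames it references.

    `deps` maps a frame index to the list of frame indices it depends on.
    Every referenced frame must also be a key of `deps`.
    Returns the order in which frames finished "encoding".
    """
    done = {frame: threading.Event() for frame in deps}
    finished = []
    lock = threading.Lock()

    def worker(frame):
        for ref in deps[frame]:
            done[ref].wait()          # block until the reference frame is ready
        with lock:
            finished.append(frame)    # stand-in for the real encoding work
        done[frame].set()             # let dependent frames proceed

    threads = [threading.Thread(target=worker, args=(f,)) for f in deps]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


if __name__ == "__main__":
    # Frame 0 is an I frame, frame 2 a P frame referencing 0,
    # frame 1 a B frame referencing both 0 and 2.
    print(encode_frames({0: [], 1: [0, 2], 2: [0]}))
```

Structured this way, it reads as a set of tasks with a dependency graph between them, which is why it looks like task parallelism to me.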
I don't see how a stage can be equated with a frame. A stage/filter is a node that performs some operation; it cannot be a data object like a frame. If you wanted to parallelize x264 with pipeline parallelism, the stages/filters would be something like:
Input -> Predict -> Residual (macroblocks) -> Quantization (macroblocks) -> Encode -> Output
and the tokens would be the incoming frames. However, because frames may depend on other frames, it is not easy to express this as a simple pipeline.
Am I missing something? Perhaps someone could elaborate on why x264 was identified as pipeline parallelism.
This post is related to the thread "x264 Threading Model" <https://lists.cs.princeton.edu/pipermail/parsec-users/2011-April/001089.html>