High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is the state-of-the-art video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC). It achieves up to 50% better data compression than its predecessor (H.264/AVC) at the same level of video quality. One of the main challenges of HEVC is its overall performance, as the improved quality and bit rate of the standard come with a significant performance overhead. This has motivated significant research into reducing the overall complexity. In this paper, we explore the effect of applying downsampling and upsampling of coded video on the performance of HEVC. Downsampling is applied to all video frames before the encoding process begins, and upsampling is applied to all video frames after the decoding process is completed. We use an average filter for downsampling and a machine-learning-based super-resolution network (SRCNN) for upsampling. In contrast to other methods, the downsampling and upsampling are applied at the frame level, outside the encoding/decoding processes, rather than at the block level. Our experiments show that downsampling and upsampling can improve HEVC encoding/decoding performance by up to 50% in some sequences, with limited impact on the output encoded bit rate and decoded video quality. The performance comparison is done using different quantization values.
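The frame-level pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the average-filter downsampler matches the abstract's description, but the nearest-neighbour upsampler is only a stand-in marking where the SRCNN network would plug in.

```python
import numpy as np

def downsample_avg(frame, factor=2):
    """Average-filter downsampling: each output pixel is the mean of a
    factor x factor block (applied per frame, before encoding)."""
    h = frame.shape[0] // factor * factor
    w = frame.shape[1] // factor * factor
    f = frame[:h, :w].astype(np.float64)
    return f.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample_nn(frame, factor=2):
    """Placeholder upsampler (nearest-neighbour). The paper uses an SRCNN
    super-resolution network here; this stand-in only shows where it fits."""
    return np.repeat(np.repeat(frame, factor, axis=0), factor, axis=1)

# Frame-level pipeline: downsample -> (HEVC encode/decode, unchanged) -> upsample
frame = np.arange(16, dtype=np.float64).reshape(4, 4)
small = downsample_avg(frame)    # 2x2 frame fed to the encoder
restored = upsample_nn(small)    # 4x4 frame after decoding
```

Because both steps operate on whole frames outside the codec, the encoder and decoder themselves are untouched; they simply see smaller frames.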
Recent developments in hardware and software have enabled a new generation of video quality. However, developments in networking and digital communication are lagging behind. This prompted the establishment of the Joint Collaborative Team on Video Coding (JCT-VC), with the objective of developing a new high-performance video coding standard. A primary reason for developing HEVC was to enable efficient processing and transmission of HD videos, which normally contain large smooth areas; therefore, HEVC utilizes larger encoding blocks than the previous standard to enable more effective encoding, while smaller blocks are still exploited to encode fast/complex areas of video more efficiently. Hence, the encoder implementation investigates all possible block sizes. This, together with the many features added in the new standard, has led to a significant increase in the complexity of the encoding process. Furthermore, there is no automated process to decide when large or small blocks should be exploited. To overcome this problem, this research proposes a set of optimization tools to reduce the encoding complexity while maintaining the same quality and compression rate. The method automates this process through a set of hierarchical steps while still using the standard's refined coding tools.
KEYWORDS: Video, Video coding, Cameras, Computer programming, Standards development, Video compression, Optical engineering, Motion estimation, 3D acquisition, 3D displays
The H.264/multiview video coding (MVC) standard has been developed to enable efficient coding of three-dimensional and multiple-viewpoint video sequences. The inter-view statistical dependencies are utilized and an inter-view prediction is employed to provide more efficient coding; however, this increases the overall encoding complexity. Motion homogeneity is exploited here to selectively enable inter-view prediction, and to reduce complexity in the motion estimation (ME) and mode selection processes. This has been accomplished by defining situations that relate macroblocks’ motion characteristics to the mode selection and inter-view prediction processes. When comparing the proposed algorithm to the H.264/MVC reference software and other recent work, the experimental results demonstrate a significant reduction in ME time while maintaining similar rate-distortion performance.
The H.264 video coding standard achieves high-performance compression and image quality at the expense of increased encoding complexity. Consequently, several fast mode decision and motion estimation techniques have been developed to reduce the computational cost. These approaches successfully reduce the computational time at the cost of reduced image quality and/or increased bitrate. In this paper, we propose a novel fast mode decision and motion estimation technique. The algorithm utilizes preprocessing frequency-domain motion estimation in order to accurately predict the best mode and the search range. Experimental results show that the proposed algorithm significantly reduces the motion estimation time by up to 97%, while maintaining similar rate-distortion performance when compared to the Joint Model software.
The objective of scalable video coding is to enable the generation of a single bitstream that can adapt to various bitrates,
transmission channels, and display capabilities. Scalability is categorised as temporal, spatial, and quality
scalability. Effective Rate Control (RC) has important ramifications for coding efficiency, as well as for channel bandwidth and
buffer constraints in real-time communication.
The main target of RC is to reduce the disparity between the actual and target bit rates. In order to meet the target bit rate,
a predicted Mean Absolute Difference (MAD) between frames is used in a rate-quantisation model to obtain the
Quantisation Parameter (QP) for encoding the current frame.
The encoding process exploits the interdependencies between video frames; therefore, the MAD does not change abruptly
unless the scene changes significantly. After a scene change, the MAD maintains a stable, slow increase or
decrease. Based on this observation, we developed a simplified RC algorithm. The scheme is divided into two steps:
first, we predict scene changes; second, in order to avoid visual-quality fluctuations, we limit the change in QP value
between two consecutive frames to an adaptive range. This restricts the use of the rate-quantisation model to those situations
where the scene changes significantly.
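The two-step scheme above can be sketched as follows. The linear MAD predictor mirrors the standard H.264-style rate-control model; the QP-clamping rule and the constants here are illustrative assumptions, not the paper's tuned values.

```python
def predict_mad(prev_mad, a1=1.0, a2=0.0):
    """Linear MAD prediction used in H.264-style rate control:
    MAD_pred = a1 * MAD_prev + a2, with a1, a2 typically updated by
    regression over previously encoded frames."""
    return a1 * prev_mad + a2

def next_qp(prev_qp, model_qp, scene_change, delta=2):
    """Illustrative QP update (a sketch, not the exact paper algorithm):
    fall back to the full rate-quantisation model only on a scene change;
    otherwise clamp the QP change to an adaptive range [-delta, +delta]
    to avoid visual-quality fluctuations between consecutive frames."""
    if scene_change:
        return model_qp
    return max(prev_qp - delta, min(prev_qp + delta, model_qp))
```

Since the QP is clamped on most frames, the rate-quantisation model only needs to be evaluated around predicted scene changes, which is where the complexity saving comes from.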
To assess the proposed algorithm, comprehensive experiments were conducted. The experimental results show that the
proposed algorithm significantly reduces encoding time whilst maintaining similar rate distortion performance,
compared to both the H.264/SVC reference software and recently reported work.
Multiview Video Coding (MVC) is an extension of the H.264/MPEG-4 AVC video compression standard, developed
jointly by MPEG/VCEG to enable efficient encoding of sequences captured simultaneously from multiple
cameras using a single video stream. The design is therefore aimed at exploiting inter-view dependencies in addition to
reducing temporal redundancies. However, this further increases the overall encoding complexity.
In this paper, the high correlation between a macroblock and its enclosed partitions is utilised to estimate motion
homogeneity, and based on the result, inter-view prediction is selectively enabled or disabled. Moreover, if MVC
motion prediction is divided into three layers (the first being the full- and sub-pixel motion search, the second
the mode selection process, and the third the repetition of the first two for inter-view prediction), the
proposed algorithm significantly reduces the complexity in all three layers.
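A minimal sketch of the selective-enabling idea follows. The homogeneity criterion here (spread of the partition motion vectors around their mean) and its threshold are our illustrative assumptions; the paper's exact criterion may differ.

```python
def is_motion_homogeneous(partition_mvs, threshold=1.0):
    """Illustrative homogeneity test: a macroblock is treated as
    motion-homogeneous when the motion vectors (x, y) of its enclosed
    partitions cluster tightly around their mean."""
    n = len(partition_mvs)
    mean_x = sum(mv[0] for mv in partition_mvs) / n
    mean_y = sum(mv[1] for mv in partition_mvs) / n
    spread = sum((mv[0] - mean_x) ** 2 + (mv[1] - mean_y) ** 2
                 for mv in partition_mvs) / n
    return spread <= threshold

def enable_interview_prediction(partition_mvs):
    """Skip the costly inter-view search when motion is homogeneous,
    on the assumption that temporal prediction alone then suffices."""
    return not is_motion_homogeneous(partition_mvs)
```

Disabling inter-view prediction for homogeneous macroblocks removes the third layer's repetition of the motion search and mode selection, which is where most of the saving accrues.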
To assess the proposed algorithm, a comprehensive set of experiments were conducted. The results show that the
proposed algorithm significantly reduces the motion estimation time whilst maintaining similar Rate Distortion
performance, when compared to both the H.264/MVC reference software and recently reported work.
The objective of scalable video coding is to enable the generation of a single bitstream that can adapt to various bitrates,
transmission channels, and display capabilities. Scalability is categorised as temporal, spatial, and quality
scalability. To improve encoding efficiency, the SVC scheme incorporates inter-layer prediction mechanisms, which
increase the complexity of the overall encoding.
In this paper, several conditional probabilities are established relating motion estimation characteristics to the mode
distribution at different layers of H.264/SVC. An evaluation of these probabilities is used to structure a low-complexity
prediction algorithm for Group of Pictures (GOP) in H.264/SVC, reducing computational complexity whilst
maintaining similar performance. When compared to the JSVM software, this algorithm achieves a significant reduction
of encoding time, with a negligible average PSNR loss and bit-rate increase in temporal, spatial and SNR scalability.
Experiments are conducted to provide a comparison between our method and a recently developed fast mode selection
algorithm. These demonstrate that our method achieves appreciable time savings for scalable spatial and scalable quality
video coding, while maintaining similar PSNR and bit rate.
The H.264 video coding standard achieves high performance compression and image quality at the expense of increased
encoding complexity, due to the very refined Motion Estimation (ME) and mode decision processes. This paper focuses
on decreasing the complexity of the mode selection process by effectively applying a novel fast mode decision
algorithm.
First, the phase correlation is analysed between a macroblock and its prediction obtained from the previously encoded
adjacent block. Relationships are established between the correlation value, the object size, and the best-fit motion
vector. From this, a novel fast mode decision and motion estimation technique has been developed, utilising a preprocessing
frequency-domain ME stage in order to accurately predict the best mode and the search range. We measure the
correlation between a macroblock and the corresponding prediction. Based on the result, we select the best mode, or
limit the mode selection process to a subset of modes. Moreover, the correlation result is also used to select an
appropriate search range for the ME stage.
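The phase-correlation measurement itself is standard and can be sketched directly; how its peak value and location map onto mode subsets and search ranges is the paper's contribution and is not reproduced here.

```python
import numpy as np

def phase_correlation(block, prediction):
    """Phase correlation between a macroblock and its prediction.
    The cross-power spectrum is normalised to keep phase only; its inverse
    FFT yields a surface whose peak is sharp and close to 1 for a good
    translational match, and whose location gives the candidate shift."""
    F1 = np.fft.fft2(block)
    F2 = np.fft.fft2(prediction)
    cross = F1 * np.conj(F2)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep phase only
    surface = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return surface[peak], peak
```

A high, concentrated peak suggests a single dominant translation (favouring large-block modes and a small search range), while a diffuse surface suggests complex motion, in line with the mode-subset and search-range selection described above.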
Experimental results show that the proposed algorithm significantly reduces the motion estimation time whilst
maintaining similar Rate Distortion performance, when compared to both the H.264/AVC Joint Model (JM) reference
software and recently reported work.
We propose a fast subpixel motion estimation algorithm for the H.264 Advanced Video Coding (AVC) standard. The algorithm utilizes the correlation of the spatial interpolation effect on the full-pixel motion estimation best matches between different block sizes in order to reduce the computational cost of the overall motion estimation process. Experimental results show that the proposed algorithm significantly reduces the CPU cycles in the various motion estimation schemes by up to 16%, with similar rate-distortion performance when compared to the H.264/AVC standard.
In this paper, we propose an algorithm for the selective application of sub-pixel Motion Estimation and the
Hadamard transform in the H.264/AVC video coding standard. The algorithm exploits the spatial interpolation effect of
the reference slices on the best matches of different block sizes in order to increase the computational efficiency of the
overall motion estimation process. Experimental results show that the proposed algorithm significantly reduces the CPU
cycles in the Fast-Full-Search Motion Estimation Scheme by up to 8.2% with similar RD performance, as compared to the
H.264/AVC standard.