One of the most frequently used coding modes in H.264 is the skip mode. In the conventional approach, a macroblock is switched to skip mode only after the best rate-distortion (RD) mode has been computed and all coefficients of the resulting prediction-error block have been quantized to zero. This wastes computational resources, because skip mode requires neither the forward transform nor quantization. In this paper, the skip-mode condition is checked for each macroblock prior to multi-block motion estimation; if the condition is satisfied, motion estimation is not performed, which drastically reduces computation. The condition exploits the zero-block property after the 4x4 block transform and quantization, and caters for the noise inherent in natural video images. In addition, the color components are also taken into consideration in the skip-mode decision. Experimental results show that the approach greatly improves encoder speed with negligible bit-rate increase or PSNR degradation.
KEYWORDS: Motion estimation, Computer programming, Video, Video coding, Quantization, Diamond, Standards development, Video processing, Distortion, Digital signal processing
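The early-skip idea can be sketched as a cheap bound check performed before any motion search. The sketch below is an illustrative simplification, not the paper's exact criterion: the quantizer-step approximation, the SAD-based sufficient condition and the `noise_margin` parameter are all assumptions.

```python
import numpy as np

def all_zero_after_quant(residual_4x4, qp, noise_margin=1.0):
    """Conservative test: will this 4x4 residual quantize to all zeros?

    Hypothetical simplification: bound the transform coefficients by the
    block SAD, which is cheap to compute before any transform is applied,
    and compare against a threshold derived from the quantizer step.
    """
    qstep = 0.625 * 2 ** (qp / 6.0)   # approximate H.264 quantizer step size
    sad = np.abs(residual_4x4).sum()
    # noise_margin loosens the bound to tolerate sensor noise (assumption)
    return sad < noise_margin * 4.0 * qstep

def try_skip_mode(mb_luma, pred_luma, mb_chroma, pred_chroma, qp):
    """Declare SKIP before motion estimation if every 4x4 luma block and
    both chroma components pass the all-zero test."""
    for by in range(0, 16, 4):
        for bx in range(0, 16, 4):
            r = mb_luma[by:by+4, bx:bx+4] - pred_luma[by:by+4, bx:bx+4]
            if not all_zero_after_quant(r, qp):
                return False
    for comp, pred in zip(mb_chroma, pred_chroma):   # chroma also checked
        for by in range(0, 8, 4):
            for bx in range(0, 8, 4):
                r = comp[by:by+4, bx:bx+4] - pred[by:by+4, bx:bx+4]
                if not all_zero_after_quant(r, qp):
                    return False
    return True
```

When the test passes, the encoder can skip both motion estimation and transform/quantization for the macroblock entirely.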
It is well known that motion estimation is the most computationally intensive processing unit of the H.264 video
encoder. Various fast motion estimation algorithms have been proposed to reduce its complexity. Generally, these
approaches achieve speedup by reducing the number of candidate search points within the search window. In this paper,
we propose a new method, which uses the Sum-of-Absolute-Differences mapping (SAD map) to dynamically cache the
SAD values and then reuse them for different block sizes. Experimental results on standard video sequences verified
that the proposed method is capable of increasing the encoder speed by up to 15% without any loss in PSNR value or
increase in bit rate. Due to its generic nature, this method can be applied to any fast motion estimation method, although it is especially effective with the full-search method.
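The core of the SAD-map idea can be sketched as follows: compute the sixteen 4x4 SADs once per candidate position, then obtain the SAD of every larger partition by summing cached entries rather than recomputing pixel differences. The function names and dictionary layout below are illustrative, not the paper's implementation.

```python
import numpy as np

def sad_map_4x4(cur_mb, ref_blk):
    """Cache the SAD of each of the sixteen 4x4 sub-blocks of a 16x16
    macroblock at one candidate position (the 'SAD map')."""
    diff = np.abs(cur_mb.astype(np.int32) - ref_blk.astype(np.int32))
    # view the 16x16 difference as a 4x4 grid of 4x4 blocks, sum each block
    return diff.reshape(4, 4, 4, 4).sum(axis=(1, 3))

def sads_all_partitions(m):
    """Reuse the cached 4x4 SADs for the larger H.264 partition sizes."""
    return {
        '4x4':   m,                                        # sixteen values
        '8x8':   m.reshape(2, 2, 2, 2).sum(axis=(1, 3)),   # four values
        '16x8':  np.array([m[:2].sum(), m[2:].sum()]),     # top, bottom
        '8x16':  np.array([m[:, :2].sum(), m[:, 2:].sum()]),  # left, right
        '16x16': m.sum(),
    }
```

Because every partition SAD is a sum over the same cached 4x4 entries, the pixel-level work is done exactly once per candidate, which is why the technique composes with any search strategy.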
Interpolation of half and quarter pixels is another computationally intensive component of the H.264 video encoder. Compared with integer-pixel motion estimation, this "finer" interpolation provides a better block match; however, the improved motion compensation performance comes at the expense of increased complexity. Building on our previous work, this paper presents an improved fast and adaptive interpolation method that further reduces the complexity of the video encoding process. Experimental results on typical video sequences demonstrate that the proposed method increases encoder speed by 10% to 22% compared with our previous work, without any PSNR loss or bit-rate increase.
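For context, H.264 derives half-pel luma samples with a six-tap filter with taps (1, -5, 20, 20, -5, 1), and quarter-pel samples by averaging neighbouring samples. The 1-D sketch below illustrates the half-pel filtering step whose cost adaptive interpolation methods seek to reduce; it is a simplified illustration, not the paper's adaptive scheme.

```python
import numpy as np

def half_pel_row(row):
    """Half-pixel samples between integer pixels of a 1-D row, using the
    H.264 six-tap filter (1, -5, 20, 20, -5, 1) with rounding and >> 5."""
    taps = np.array([1, -5, 20, 20, -5, 1])
    # edge-pad so each output sample sees six integer neighbours
    padded = np.pad(row.astype(np.int32), (2, 3), mode='edge')
    out = np.empty(len(row), dtype=np.int32)
    for i in range(len(row)):
        acc = int((padded[i:i + 6] * taps).sum())
        out[i] = np.clip((acc + 16) >> 5, 0, 255)   # round, divide by 32, clip
    return out
```

Since the filter taps sum to 32, a constant region interpolates to the same constant, which is a quick sanity check on any implementation.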
We propose a method of providing error-resilient H.264 video over 802.11 wireless channels using a feedback mechanism that does not incur the additional delay typically found in ARQ-type feedback. Our system uses the TCP/IP and UDP/IP protocols, located between the 802.11 medium access control (MAC) layer and the H.264 video application layer. UDP is used to transfer time-sensitive video data without delay; however, packet losses introduce excessive artifacts that propagate to subsequent frames. Error resilience is achieved by a feedback mechanism: the decoder conveys packet-loss information to the video source as small TCP packets carrying negative acknowledgements. Using multiple reference frames, slice-based coding and timely intra refresh, the encoder exploits this feedback to perform subsequent temporal prediction without propagating the error to future frames. We take static measurements of the actual channel and use the observed packet-loss and delay patterns to test our algorithms. Simulations show an improvement of 0.5 to 5 dB in PSNR over plain UDP-based video transmission. Our method improves the overall quality of service of interactive video transmission over a wireless LAN and can serve as a model for future media-aware wireless network protocol designs.
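The feedback-driven reference selection described above can be modelled in a few lines. The class below is a toy model under stated assumptions, not the paper's implementation: it assumes frames are believed intact until a NACK arrives, and that a NACK for one frame invalidates it and everything predicted after it.

```python
class FeedbackChannel:
    """Toy model of NACK-driven reference selection: the decoder reports
    lost packets over TCP; the encoder then predicts only from frames
    known to have arrived intact, stopping error propagation."""

    def __init__(self, num_refs=5):
        self.intact = []            # frame numbers believed intact
        self.num_refs = num_refs    # H.264 multiple-reference buffer depth

    def on_frame_sent(self, frame_no):
        self.intact.append(frame_no)
        self.intact = self.intact[-self.num_refs:]   # sliding window

    def on_nack(self, lost_frame_no):
        # the lost frame and anything predicted from it are suspect
        self.intact = [f for f in self.intact if f < lost_frame_no]

    def best_reference(self):
        # newest frame still believed intact; None means force intra refresh
        return self.intact[-1] if self.intact else None
```

Because the NACK only steers the choice of reference frame, the encoder never stalls waiting for retransmission, which is the delay advantage over ARQ.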
The EU FP6 WCAM (Wireless Cameras and Audio-Visual Seamless Networking) project aims to study, develop and validate a wireless, seamless and secured end-to-end networked audio-visual system for video surveillance and multimedia distribution applications. This paper describes the video transmission aspects of the project, with contributions in the areas of H.264 video delivery over wireless LANs.
The planned demonstrations under WCAM include transmission of H.264 coded material over 802.11b/g networks with TCP/IP and UDP/IP being employed as the transport and network layers over unicast and multicast links. UDP based unicast and multicast transmissions pose the problem of packet erasures while TCP based transmission is associated with long delays and the need for a large jitter buffer. This paper presents measurement data that have been collected at the trial site along with analysis of the data, including characterisation of the channel conditions as well as recommendations on the optimal operating parameters for each of the above transmission scenarios (e.g. jitter buffer sizes, packet error rates, etc.). Recommendations for error resilient coding algorithms and packetisation strategies are made in order to moderate the effect of the observed packet erasures on the quality of the transmitted video. Advanced error concealment methods for masking the effects of packet erasures at the receiver/decoder are also described.
This paper addresses two issues related to motion estimation using block matching algorithms (BMA): (1) determining the reliability of the motion vector of each block, and (2) imposing a smoothness constraint on the motion vector field. We introduce a new robust reliability measure that represents the confidence level of a motion vector based on the distribution of the cost function, and propose a novel algorithm that incorporates the smoothness constraint into the motion vector field evaluation through a priority-queue structure ordered by the reliability measure. In this framework, a smooth motion vector field is evaluated in a single pass, without the iterations typical of many existing optical flow estimation algorithms. It is therefore fast and can easily be incorporated into real-time applications for video compression as well as image segmentation.
KEYWORDS: Motion estimation, Motion models, Computer simulations, Video coding, Computer programming, Video compression, Performance modeling, Distortion, Digital signal processing, Roads
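The single-pass priority-queue idea can be sketched as follows. This is an illustration of the general scheme, not the paper's exact algorithm: it assumes reliabilities are normalised to [0, 1] and uses a simple reliability-weighted blend with already-finalized neighbours as the smoothness step.

```python
import heapq
import numpy as np

def smooth_mv_field(mvs, reliability):
    """Single-pass smoothing sketch: process blocks in decreasing
    reliability; low-reliability blocks blend their BMA vector with the
    average of neighbours that were already finalized.

    mvs:         (H, W, 2) array of block motion vectors from the BMA
    reliability: (H, W) confidence values, assumed in [0, 1]
    """
    h, w = reliability.shape
    # min-heap on negated reliability -> most reliable blocks pop first
    heap = [(-reliability[y, x], y, x) for y in range(h) for x in range(w)]
    heapq.heapify(heap)
    out = np.zeros_like(mvs, dtype=float)
    done = np.zeros((h, w), dtype=bool)
    while heap:
        neg_r, y, x = heapq.heappop(heap)
        nbrs = [(y + dy, x + dx) for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= y + dy < h and 0 <= x + dx < w and done[y + dy, x + dx]]
        if nbrs:
            avg = np.mean([out[ny, nx] for ny, nx in nbrs], axis=0)
            w_self = -neg_r   # reliability weights the block's own vector
            out[y, x] = w_self * mvs[y, x] + (1 - w_self) * avg
        else:
            out[y, x] = mvs[y, x]   # first (most reliable) block: keep as-is
        done[y, x] = True
    return out
```

Each block is visited exactly once, so the cost is O(N log N) in the number of blocks, with no iterative relaxation.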
This paper proposes a low-complexity sub-pixel refinement of motion estimation based on the full-search block matching algorithm (BMA) at integer-pixel accuracy. The algorithm eliminates the need to produce interpolated reference frames, which may be too memory- and processor-intensive for some real-time mobile applications. It assumes that the BMA is carried out at integer-pixel resolution and that the sums of absolute differences (SADs) of the candidate motion vector and its neighbouring vectors are available for each block. The proposed method then models the SAD distribution around the candidate motion vector and its neighbouring points, and the actual minimum at sub-pixel resolution is computed according to the model used. Three variations of the parabolic model are considered, and simulations using the H.263 standard encoder on several test sequences reveal an improvement of 1.0 dB over integer-accuracy motion estimation. Despite its simplicity, the method comes close, in some test cases, to the results obtained with actual interpolated reference frames.
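One standard form of the parabolic model fits a 1-D parabola through the three SAD samples at offsets -1, 0 and +1 around the winning integer vector and takes the parabola's minimum as the sub-pixel offset. The sketch below shows this generic fit, applied independently to each component; the paper's three variants may differ in detail.

```python
def parabolic_offset(s_m1, s_0, s_p1):
    """Minimum of the parabola through (-1, s_m1), (0, s_0), (+1, s_p1).

    For f(x) = a*x^2 + b*x + c the minimum lies at
        x = (s_m1 - s_p1) / (2 * (s_m1 - 2*s_0 + s_p1)).
    """
    denom = 2.0 * (s_m1 - 2.0 * s_0 + s_p1)
    if denom <= 0:   # degenerate or non-convex fit: keep the integer vector
        return 0.0
    return (s_m1 - s_p1) / denom

def refine_subpel(mvx, mvy, sad):
    """sad[dy][dx] holds the SADs of the winning integer vector (centre,
    sad[1][1]) and its eight neighbours; refine each component separately."""
    dx = parabolic_offset(sad[1][0], sad[1][1], sad[1][2])
    dy = parabolic_offset(sad[0][1], sad[1][1], sad[2][1])
    return mvx + dx, mvy + dy
```

Since the nine SADs are already computed by the integer-pel full search, the refinement costs only a few arithmetic operations per block and no reference-frame interpolation.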
This paper proposes a novel low-complexity region-based video-coding algorithm that automatically identifies moving foreground objects, compresses them with higher quality than the background, and efficiently encodes the video in an H.263+ compliant bitstream. Global motion estimation is first performed using the MSE algorithm. The original sequence is then segmented into foreground and background regions using global and local motion information predicted from the previous frame. This enables the separation of moving objects from a static background, even in the presence of camera motion. A modified TMN8 rate control algorithm is proposed to assign more bits to the foreground region, and the segmented video is then encoded into an H.263+ compliant bitstream. As block-matching motion estimation is used to obtain the local motion field, and foreground/background identification is also block-based, the proposed algorithm has lower complexity than previously proposed pixel-based algorithms. Hence it can easily be implemented in software or in ASIC-based real-time applications. It is also particularly useful for mobile applications, where bandwidth is highly constrained and low power requirements restrict processing complexity.