PoLin Lai
polinlai AT usc DOT edu
 
Research - Multiview Video Coding (MVC)
(Papers, patents, and standard contributions are listed under Publications.)
   
Advisor: Professor Antonio Ortega
Collaboration with THOMSON Corporate Research

Multiview video systems use multiple cameras to capture a scene simultaneously from different viewpoints. They provide digital video data that can be used in a new generation of video services such as free-viewpoint TV (FTV) and 3DTV. Multiview video coding (MVC) has recently become an active research area for efficient storage and transmission of multiview video data. Our work focuses on developing coding tools that improve coding efficiency and address complexity issues. By understanding the characteristics of multiview video data and exploiting these properties, we have achieved encouraging results, which we summarize in two categories: (1) adaptive reference filtering and focus mismatch related work, and (2) other MVC contributions from earlier work.
 
Adaptive reference filtering (ARF), focus mismatch related works

Compared to conventional monoscopic video, frames from different views are prone to mismatches other than simple displacement, due to heterogeneous camera settings and/or shooting positions. In particular, we consider focus mismatch, which results in blurriness/sharpness discrepancies among views. We analyze focus mismatch in terms of camera parameters and scene depth/disparity, and propose coding algorithms that identify localized focus mismatches within a frame and adaptively create filtered references that provide higher coding efficiency (adaptive reference filtering, ARF). For cross-view prediction such as in MVC, since focus mismatch is depth-dependent, we use disparity vectors as estimates of depth to partition a frame into depth regions for filter design [VCIP 07, CSVT 07]. For temporal prediction in monoscopic video, where no disparity is available, we estimate simple block-wise mismatch kernels and then design filters for blocks with similar estimated kernels [ICASSP 07]. For video sequences exhibiting strong focus mismatch, gains of up to 0.8 to 1 dB over H.264/AVC are achieved in both cases. Furthermore, by combining ARF with illumination compensation techniques, we proposed a new coding scheme that jointly treats focus and illumination mismatches, achieving up to 1.3 dB gain over H.264/AVC [CSVT 07].
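
The idea of designing one filter per depth region can be illustrated with a small sketch. It is not the published ARF algorithm: it assumes a 1-D signal, a reference that is already disparity-compensated, and a one-parameter symmetric kernel [a, 1 - 2a, a] fitted by least squares. The function names, the block size, and the disparity thresholds are hypothetical.

```python
from collections import defaultdict

def laplacian(signal, i):
    """Second difference at index i, with edge samples replicated."""
    left = signal[max(i - 1, 0)]
    right = signal[min(i + 1, len(signal) - 1)]
    return left - 2.0 * signal[i] + right

def depth_class(disparity, thresholds):
    """Index of the depth region a block's disparity falls into."""
    return sum(disparity >= t for t in thresholds)

def design_filters(cur, ref, disparities, block_size, thresholds):
    """Fit one filter strength `a` per depth class by least squares.

    Filtering with [a, 1 - 2a, a] equals ref + a * laplacian(ref), so the
    best `a` minimizes sum((cur - ref - a * lap) ** 2) over the class.
    a > 0 blurs the reference, a < 0 sharpens it.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for b, d in enumerate(disparities):
        c = depth_class(d, thresholds)
        for i in range(b * block_size, (b + 1) * block_size):
            lap = laplacian(ref, i)
            num[c] += (cur[i] - ref[i]) * lap
            den[c] += lap * lap
    return {c: (num[c] / den[c] if den[c] > 0 else 0.0) for c in den}

def apply_filter(ref, a):
    """Create a filtered reference with the kernel [a, 1 - 2a, a]."""
    return [ref[i] + a * laplacian(ref, i) for i in range(len(ref))]

# Toy data: the first two blocks (small disparity) are blurred in the
# current frame, the last two blocks (large disparity) are not.
ref = [0, 10, 0, 10, 0, 10, 0, 10, 5, 5, 9, 1, 7, 3, 8, 2]
cur = apply_filter(ref, 0.25)[:8] + ref[8:]
filters = design_filters(cur, ref, [1, 1, 6, 6], block_size=4, thresholds=[4])
```

Each entry of `filters` maps a depth class to its estimated strength; an encoder following this sketch would then offer `apply_filter(ref, a)` for each class as an additional reference.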
 
Inspired by the focus mismatch analysis, which demonstrates both view (camera) dependency and depth (disparity) dependency, we study the correlation of ARF filters across time considering the depth composition of the scene, and also evaluate the rate-distortion (RD) performance of ARF across views. This effort leads to techniques that reduce ARF complexity by more than 65% with only 0.05 dB degradation [VCIP 08]. To extend ARF to bi-predictive coding (B frames), in which references from two different directions may contain different types of focus mismatch relative to the frame being encoded, we investigate the interaction between filter estimation and bi-predictive search. We propose an ARF method for B frames that compensates mismatches from both references and integrates well with conventional bi-predictive search schemes [ICIP 08].
 
Our latest work involves further analysis of focus mismatch to gain a better quantitative understanding of how the mismatch kernel (i.e., the filter needed) changes under different camera settings and scene depths. We are preparing a journal paper submission with the new results.
 
Other MVC contributions (Earlier works)

By exploiting the similarity among motion fields in different views and the similarity among disparity fields at different timestamps, we propose efficient search schemes for MVC [VCIP 06]. The across-time similarity among disparity fields also led us to create a "disparity skip mode", which reuses disparity information when encoding a block to reduce the number of bits transmitted. Gains of up to 0.15 dB are reported for a generic MVC structure with motion/disparity compensation [MPEG m12969, Jan. 2006].
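
One way to picture how across-time similarity of disparity can shorten a search is to seed the search with the disparity found for the same block at the previous timestamp and examine only a small window around it. The sketch below is an illustration under that assumption, not the scheme from [VCIP 06]; it works on 1-D signals, uses the sum of absolute differences (SAD) as the cost, and all names are hypothetical.

```python
def seeded_search(cur, ref, start, size, seed, radius):
    """Return the disparity in [seed - radius, seed + radius] with lowest SAD.

    `seed` is assumed to be the disparity of the same block at the previous
    timestamp. Returns None if no candidate lies inside the reference.
    """
    best_d, best_cost = None, None
    for d in range(seed - radius, seed + radius + 1):
        if start + d < 0 or start + d + size > len(ref):
            continue  # candidate block falls outside the reference
        cost = sum(abs(cur[start + i] - ref[start + d + i]) for i in range(size))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Toy data: the current view equals the reference view shifted by 2 samples.
ref = [0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0]
cur = [0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0]

# Seeded with the previous disparity (assumed to be 1), 3 candidates suffice.
d_seeded = seeded_search(cur, ref, start=1, size=5, seed=1, radius=1)
# An unseeded search around 0 needs a larger window to reach the same result.
d_full = seeded_search(cur, ref, start=1, size=5, seed=0, radius=4)
```

In this toy case both calls find the same disparity, but the seeded call evaluates far fewer candidates, which is where the complexity saving would come from.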
 
In MVC, while disparity compensation provides higher coding efficiency, it also introduces view dependency in the decoding process: to decode certain requested view(s), frames from other view(s) must also be decoded. We study the rate-distortion (RD) performance of MVC by explicitly considering view dependency in partial decoding scenarios (i.e., not all views are of interest to the user, such as in view-switching applications). We propose a new RD metric for application-specific multiview codecs that helps determine proper MVC prediction structures [MPEG m13318, Apr. 2006].
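
The view dependency described above can be made concrete with a short sketch that lists every view a decoder must reconstruct to serve a request. It does not implement the proposed RD metric; the dependency map and the three-view prediction structure are hypothetical examples.

```python
def views_to_decode(requested, depends_on):
    """Return the set of views that must be decoded to show `requested`.

    `depends_on` maps each view to the views it uses as cross-view
    references; dependencies are followed transitively.
    """
    needed = set()
    stack = list(requested)
    while stack:
        view = stack.pop()
        if view in needed:
            continue
        needed.add(view)
        stack.extend(depends_on.get(view, ()))
    return needed

# Hypothetical structure: view 0 is coded independently, view 2 is
# predicted from view 0, and view 1 is predicted from views 0 and 2.
depends_on = {0: [], 1: [0, 2], 2: [0]}

needed_for_1 = views_to_decode([1], depends_on)  # all three views
needed_for_2 = views_to_decode([2], depends_on)  # views 0 and 2 only
```

Comparing the size of the returned set with the number of requested views gives a rough sense of the decoding overhead a prediction structure imposes in a view-switching scenario.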
 
 
 
 
 