----------------------

Research on Multimedia Coding and Communication

(1) Network-Adaptive Video Streaming

(2) Distributed Video Coding (Joint Work with Ngai-Man Cheung)

(3) Other Research Projects

(4) USC Course Projects

----------------------

Network-Adaptive Video Streaming

Multiple Description Layered Coding (MDLC)

Layered coding (LC) and multiple description coding (MDC) have been proposed as two different kinds of "quality adaptation" schemes for video delivery over the current Internet or wireless networks. To combine the advantages of LC and MDC, we present a new approach, Multiple Description Layered Coding (MDLC), that provides reliable video communication over a wider range of network scenarios and application requirements. MDLC improves on LC by introducing redundancy in each layer, so that the chance of receiving at least one description of the base layer is greatly increased. Although LC and MDC each perform well only in limited cases (e.g., long end-to-end delay for LC vs. short delay for MDC), the proposed MDLC system also addresses the intermediate cases. As in an LC system with retransmission, the MDLC system can use a feedback channel to indicate which descriptions have been correctly received. A low-redundancy MDLC system can therefore be implemented with our proposed runtime packet scheduling system, which is driven by this feedback information.
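The benefit of per-layer redundancy can be illustrated with a small calculation. The sketch below assumes that each description is lost independently with the same probability, which is a simplification introduced here (real channels often show bursty loss); the function name and the loss rates are hypothetical.

```python
def p_base_layer_received(loss_rate: float, num_descriptions: int) -> float:
    """Probability that at least one base-layer description arrives.

    Assumes independent losses with identical probability per description.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    if num_descriptions < 1:
        raise ValueError("need at least one description")
    # The base layer is lost only if every description is lost.
    return 1.0 - loss_rate ** num_descriptions


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(p_base_layer_received(0.1, n), 4))
```

Under these assumptions, a 10% loss rate leaves a single-description base layer undecodable 10% of the time, while two descriptions reduce that to 1%.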

Rate-Distortion Optimized Video Streaming for Video Codecs with Multiple Decoding Paths

Previous work on rate-distortion (R-D) optimized video streaming (particularly packet scheduling) has mainly focused on encoding techniques, such as layered coding, that generate only one set of dependent packets. However, such a single encoded stream cannot easily provide the optimal operational R-D points over variable channel environments (e.g., at different channel bandwidths and packet loss rates). On the other hand, it is sometimes impractical for a live encoder to adapt to varying channel conditions on the fly by completely switching between different modes of operation in a very short time. In our research, we therefore extend the streaming framework to include an important case: a codec that produces multiple redundant versions of the media data to enhance adaptation flexibility. For example, we proposed a new coding technique, Multiple Description Layered Coding, which combines the advantages of layered coding and multiple description coding. To handle codecs of this kind, which have redundancy between data units, we introduced a new source model, the Directed Acyclic HyperGraph (DAHG), to represent the data dependencies and correlation between different video data units, from which the expected end-to-end distortion can be accurately estimated. With the DAHG model, we applied Lagrangian optimization to adjust the system's real-time redundancy to match the channel behavior and thus achieve the best expected performance. The end-to-end performance achieved by optimal scheduling on a redundant source codec shows a consistent improvement over traditional source codecs that generate only one decoding path. Both source redundancy and transport redundancy were studied with respect to error robustness and bandwidth adaptation.
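The idea of multiple decoding paths and Lagrangian selection can be illustrated with a toy example. The sketch below is not the DAHG formulation from this work: it is a simplified stand-in in which each data unit lists alternative parent sets (any one set suffices for decoding), losses are independent, and the transmission policy is found by brute force. All unit names, sizes, and distortion values are hypothetical.

```python
from itertools import combinations, product

# Hypothetical source: two redundant base-layer descriptions, one enhancement unit.
# name -> (size in bits, alternative parent sets; empty list = no dependencies)
UNITS = {
    "B1": (400, []),
    "B2": (400, []),
    "E": (600, [{"B1"}, {"B2"}]),  # E is decodable through either B1 or B2
}
# (alternative units, distortion removed if any one of them is decodable)
GAINS = [({"B1", "B2"}, 80.0), ({"E"}, 15.0)]
D0 = 100.0  # distortion when nothing can be decoded


def decodable(unit, received):
    """A unit is decodable if it arrived and one parent set is fully decodable."""
    if unit not in received:
        return False
    parent_sets = UNITS[unit][1]
    return not parent_sets or any(
        all(decodable(p, received) for p in alt) for alt in parent_sets
    )


def distortion(received):
    removed = sum(g for alts, g in GAINS if any(decodable(u, received) for u in alts))
    return D0 - removed


def expected_distortion(sent, loss):
    """Average distortion over all loss patterns (independent losses assumed)."""
    sent = sorted(sent)
    total = 0.0
    for pattern in product([True, False], repeat=len(sent)):
        prob = 1.0
        received = set()
        for unit, arrived in zip(sent, pattern):
            prob *= (1.0 - loss) if arrived else loss
            if arrived:
                received.add(unit)
        total += prob * distortion(received)
    return total


def best_policy(loss, lam):
    """Pick the subset of units minimizing J = E[D] + lam * R by exhaustive search."""
    names = sorted(UNITS)
    best = (set(), expected_distortion(set(), loss))
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            rate = sum(UNITS[u][0] for u in subset)
            cost = expected_distortion(subset, loss) + lam * rate
            if cost < best[1]:
                best = (set(subset), cost)
    return best


if __name__ == "__main__":
    for loss in (0.0, 0.2):
        print(loss, best_policy(loss, lam=0.01))
```

In this toy setting, a lossless channel favors sending one base description plus the enhancement unit, while a 20% loss rate makes the second, redundant description worth its rate. This mirrors the qualitative point that redundancy should track channel behavior.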

BSD Transformation for Format-Agnostic Bit-Stream Adaptation Using MPEG-21 DIA

(Joint work with Debargha Mukherjee and Sam Liu at HP Labs)

Part 7 of MPEG-21, entitled Digital Item Adaptation (DIA), deals with descriptions and protocols that enable content adaptation for different networks and terminals. Scalable representations enable efficient and secure content adaptation along the delivery path without re-encoding the video content. Mukherjee et al. at HP Labs have proposed Structured Scalable Meta-formats (SSM) for scalable bit-stream adaptation, which allow content to be downscaled by simply deleting segments and making minor edits to the bit stream. The framework contains a decision-making module cascaded with a bit-stream adaptation module that models the adaptation process as an XML transformation operating on a high-level syntax description of the bit stream. The previous generic Bit-stream Syntax Description (gBSD) transformation architecture is based on the DOM API, which is memory-intensive because the full input bit stream and gBSD must be loaded into memory for processing. This architecture does not work in a streaming environment, where the full bit stream and gBSD are not available, or in a constrained environment, where a large enough high-speed cache is not available. Instead of parsing and storing the gBSD XML as an in-memory DOM tree, we proposed a new architecture based on SAX events. The event handler receives SAX event notifications as the gBSD is parsed, which allows the output bit stream and the adapted gBSD to be generated on the fly by successively processing the input SAX events with the BSDTrI stylesheet.
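The contrast between DOM-based and SAX-based processing can be sketched with Python's standard `xml.sax` module. This is only an illustration of event-driven filtering, not the architecture described above: the element name `unit`, the attribute `layer`, and the rule "drop units above a layer threshold" are hypothetical, and the sketch adapts only the description, not the bit stream itself.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropHighLayers(XMLFilterBase):
    """Forward SAX events, skipping every <unit> whose 'layer' exceeds max_layer."""

    def __init__(self, parent, max_layer):
        super().__init__(parent)
        self.max_layer = max_layer
        self._skip_depth = 0  # > 0 while inside a dropped element

    def startElement(self, name, attrs):
        if self._skip_depth:
            self._skip_depth += 1  # nested element inside a dropped unit
            return
        if name == "unit" and int(attrs.get("layer", "0")) > self.max_layer:
            self._skip_depth = 1
            return
        super().startElement(name, attrs)

    def endElement(self, name):
        if self._skip_depth:
            self._skip_depth -= 1
            return
        super().endElement(name)

    def characters(self, content):
        if not self._skip_depth:
            super().characters(content)


def adapt_description(xml_text, max_layer):
    """Stream the description through the filter without building a tree."""
    out = io.StringIO()
    reader = DropHighLayers(xml.sax.make_parser(), max_layer)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


if __name__ == "__main__":
    doc = (
        '<bsd><unit layer="0" start="0" length="100"/>'
        '<unit layer="2" start="100" length="50"/></bsd>'
    )
    print(adapt_description(doc, max_layer=1))
```

Because each event is handled as it arrives, memory use depends on nesting depth rather than on document size, which is the property that makes the approach suitable for streaming.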

----------------------

Distributed Video Coding

Wyner-Ziv Scalable Video Coding

We proposed a practical video coding framework based on distributed source coding principles, with the goal of achieving efficient, low-complexity scalable coding. Starting from a standard predictive coder as the base layer (an MPEG-4 baseline video coder in our implementation), the proposed Wyner-Ziv scalable (WZS) coder achieves higher coding efficiency by selectively exploiting the high-quality reconstruction of the previous frame in the enhancement-layer coding of the current frame. This creates a multi-layer Wyner-Ziv prediction "link" that connects the same bit-plane level between successive frames, providing improved temporal prediction compared with MPEG-4 FGS while keeping encoder complexity reasonable. Since the temporal correlation varies in time and space, a block-based adaptive mode selection algorithm is designed for each bit-plane, so that the coder can switch between different coding modes. Experimental results show coding efficiency improvements of 3 to 4.5 dB over MPEG-4 FGS for video sequences with high temporal correlation.
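Block-based mode selection can be illustrated with a deliberately simplified sketch. It does not reproduce the per-bit-plane algorithm of this work: it assumes the choice is between two predictors (the current frame's base-layer reconstruction and the previous frame's enhanced reconstruction), uses the sum of absolute differences (SAD) as a stand-in for the correlation estimate, and treats a frame as a flat list of pixel values. All names and sample values are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_modes(original, base_recon, prev_enh_recon, block_size):
    """Per block, choose the predictor that is closer to the original.

    Returns one label per block: "temporal" (previous enhanced frame)
    or "base" (current base-layer reconstruction).
    """
    modes = []
    for start in range(0, len(original), block_size):
        block = slice(start, start + block_size)
        cost_base = sad(original[block], base_recon[block])
        cost_temporal = sad(original[block], prev_enh_recon[block])
        modes.append("temporal" if cost_temporal < cost_base else "base")
    return modes


if __name__ == "__main__":
    original = [10, 12, 14, 16, 50, 52, 54, 56]
    base_recon = [8, 12, 16, 16, 48, 50, 52, 54]
    # First block: static content; second block: changed since the previous frame.
    prev_enh_recon = [10, 12, 14, 17, 20, 20, 20, 20]
    print(select_modes(original, base_recon, prev_enh_recon, block_size=4))
```

In the sample data the first block is temporally well predicted and the second is not, so the two blocks receive different modes.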

Video Compression with Flexible Playback Order Based on Distributed Source Coding

Some emerging applications may require flexible playback features for time-based media, such as video, that cannot be directly supported by current compression standards, because these standards allow frames to be decoded only in a predetermined order. An example would be a video application where both backward and forward frame-by-frame playback are to be supported. A standard codec could support this by decoding complete GOPs in the desired order and then playing back one frame at a time. Backward playback therefore requires potentially significant added delay and memory; these can be lowered by choosing small GOP sizes, at the cost of reduced coding efficiency. In this work we address flexible playback by showing that it becomes feasible when a particular data unit (e.g., a video frame) can be decoded using information from any one of several other data units (e.g., in the video case, the next frame or the previous frame). Note that this is different from structures such as bi-directionally predicted frames, which require both predictor frames to be available at the decoder. We cast this problem as one of source coding with uncertainty about decoder side information and propose a solution based on distributed source coding. In addition, we propose macroblock-based mode switching algorithms in the context of distributed video coding to improve coding efficiency. Using forward/backward playback as an example, our results show that the proposed solution can achieve good coding efficiency without incurring additional delay and memory overhead. Other example applications where flexible playback may be desirable include switching between different views in multiview video coding and accessing individual spectral bands in hyperspectral imagery.
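The delay and memory cost of backward playback with a conventional codec can be made concrete with a small calculation. The sketch assumes a closed GOP with an I-P-P-... structure, in which each frame depends only on the frame before it; the function name and the GOP size are hypothetical.

```python
def reverse_playback_cost(gop_size: int, buffer_gop: bool):
    """Cost of showing one closed I-P-P-... GOP in reverse order.

    Returns (frames decoded, frames held in memory at once).
    """
    if gop_size < 1:
        raise ValueError("gop_size must be positive")
    if buffer_gop:
        # Decode the whole GOP once, keep every frame, then display backwards.
        return gop_size, gop_size
    # No GOP buffer: to display frame k, decode frames 0..k again each time.
    decoded = sum(k + 1 for k in range(gop_size))
    return decoded, 1


if __name__ == "__main__":
    print(reverse_playback_cost(15, buffer_gop=True))
    print(reverse_playback_cost(15, buffer_gop=False))
```

For a 15-frame GOP, buffering costs 15 frames of memory and a 15-frame start-up delay, while the unbuffered alternative decodes 120 frames to display 15. A shorter GOP reduces both figures but inserts intra-coded frames more often.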

----------------------

Other Research Projects

Multiview Video Coding 

(Joint work with George Chen, Kim Ng, and Clifford Stein at La Jolla Lab, STMicroelectronics)

- We implemented a multi-view video coder using an MPEG-4 temporal scalability based approach. The main reference stream is coded as the base layer, and the frames from secondary views are coded as the temporal-scalable enhancement layers to exploit the correlation with the main stream. The output file is written in a RIFF data structure that contains all the information necessary for the decoder to decode a multi-view video stream including the general configuration, camera settings and compressed video streams.

- We proposed spatio-temporal graph-segmentation encoding for multiple video streams.
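The RIFF container mentioned above has a simple generic layout: a `RIFF` header with a little-endian size and a four-character form type, followed by chunks that each carry their own four-character code and size. The sketch below writes that generic layout only. The form type `MVVC` and the chunk codes `conf`, `cams`, `vid0`, and `vid1` are hypothetical names chosen for illustration, not the identifiers used in the actual coder.

```python
import struct


def riff_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Build one chunk: 4-byte code, 32-bit little-endian size, data, pad byte."""
    if len(fourcc) != 4:
        raise ValueError("chunk code must be exactly 4 bytes")
    # Chunks are word-aligned; the pad byte is not counted in the size field.
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad


def riff_file(form_type: bytes, chunks) -> bytes:
    """Wrap chunks in a RIFF header; the size covers everything after it."""
    if len(form_type) != 4:
        raise ValueError("form type must be exactly 4 bytes")
    body = form_type + b"".join(chunks)
    return b"RIFF" + struct.pack("<I", len(body)) + body


if __name__ == "__main__":
    blob = riff_file(
        b"MVVC",
        [
            riff_chunk(b"conf", bytes([2])),          # e.g. number of views
            riff_chunk(b"cams", b"camera settings"),  # placeholder payload
            riff_chunk(b"vid0", b"base-layer stream"),
            riff_chunk(b"vid1", b"enhancement stream"),
        ],
    )
    print(len(blob), blob[:12])
```

A reader can skip any chunk it does not understand by jumping ahead by the stored size, rounded up to an even number of bytes.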

3-D Audio Rendering 

(Joint work with Sam Dicker and Jean-Marc Jot at Creative Technology Ltd.)

Arrays of loudspeakers are used to simulate audio playback in three dimensions. However, many systems use only two channels and provide only two loudspeaker signals. We proposed an encoding system that implements "Dolby Stereo" matrix encoding: it encodes multiple sound channels into two channels, using all-pass filters to introduce frequency-dependent phase shifts between the sound channels.
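The role of the phase shift in matrix encoding can be illustrated in the phasor domain, where each channel is a complex amplitude at a single frequency. The sketch uses the commonly cited form of a 4:2 matrix (centre and surround attenuated by 3 dB, surround shifted by +90 and -90 degrees in the two outputs). It replaces the all-pass filters with an ideal fixed 90-degree shift, which is an assumption made here; the coefficients are not taken from the system described above.

```python
import math

G = 1 / math.sqrt(2)  # -3 dB gain applied to centre and surround


def encode(left, right, centre, surround):
    """Matrix-encode four channel phasors into a two-channel (Lt, Rt) pair.

    Multiplying by 1j models an ideal +90 degree phase shift.
    """
    lt = left + G * centre + 1j * G * surround
    rt = right + G * centre - 1j * G * surround
    return lt, rt


def passive_decode(lt, rt):
    """Sum and difference recover centre-like and surround-like signals."""
    return lt + rt, lt - rt


if __name__ == "__main__":
    # Surround-only input: Lt and Rt are in anti-phase, so the sum cancels.
    print(passive_decode(*encode(0, 0, 0, 1)))
    # Centre-only input: Lt and Rt are identical, so the difference cancels.
    print(passive_decode(*encode(0, 0, 1, 0)))
```

The opposite phase shifts make the surround signal cancel in the sum of the two outputs and the centre signal cancel in their difference, which is what lets a decoder separate them again.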

----------------------

USC Course Projects

Multimedia Compression

Projects included entropy coder design (Huffman, arithmetic, QM, LZW, and run-length coding) and lossy compression techniques such as fractal coding, VQ, wavelet coding (EZW, LZC), JPEG, MPEG-1/2/4, and H.26x.

 

Operating Systems

Projects were based on Nachos, an operating system simulator developed at the University of California, Berkeley, and covered multiprogramming, multithreading, virtual memory, and a hierarchical file system. We also studied advanced operating system topics such as inter-process communication (IPC), naming and security, file systems (SUN NFS, Andrew, Locus, Athena, USC computing system), and kernels (V kernel, SPIN, X kernel, Mach, Unix/Linux kernel).

Computer Communications

Projects included an implementation of a landmark routing protocol using socket programming for inter-process communication (IPC), and dynamic source routing (DSR) in ad hoc wireless networks.

 
