Design of Digital Video Coding Systems
Signal Processing and Communications
Series Editor
K.J.Ray Liu
University of Maryland
College Park, Maryland
Editorial Board
Maurice G.Bellanger, Conservatoire National
des Arts et Métiers (CNAM), Paris
Ezio Biglieri, Politecnico di Torino, Italy
Sadaoki Furui, Tokyo Institute of Technology
Yih-Fang Huang, University of Notre Dame
Nikhil Jayant, Georgia Tech University
Aggelos K.Katsaggelos, Northwestern University
Mos Kaveh, University of Minnesota
P.K.Raja Rajasekaran, Texas Instruments
John Aasted Sorenson, IT University of Copenhagen
1. Digital Signal Processing for Multimedia Systems, edited by Keshab K.Parhi and Takao Nishitani
2. Multimedia Systems, Standards, and Networks, edited by Atul Puri and Tsuhan Chen
3. Embedded Multiprocessors: Scheduling and Synchronization, Sundararajan Sriram and Shuvra S.Bhattacharyya
4. Signal Processing for Intelligent Sensor Systems, David C.Swanson
5. Compressed Video over Networks, edited by Ming-Ting Sun and Amy R.Reibman
6. Modulated Coding for Intersymbol Interference Channels, Xiang-Gen Xia
7. Digital Speech Processing, Synthesis, and Recognition: Second Edition, Revised and Expanded, Sadaoki Furui
8. Modern Digital Halftoning, Daniel L.Lau and Gonzalo R.Arce
9. Blind Equalization and Identification, Zhi Ding and Ye (Geoffrey) Li
10. Video Coding for Wireless Communication Systems, King N.Ngan, Chi W.Yap, and Keng T.Tan
11. Adaptive Digital Filters: Second Edition, Revised and Expanded, Maurice G.Bellanger
12. Design of Digital Video Coding Systems, Jie Chen, Ut-Va Koc, and K.J.Ray Liu
Additional Volumes in Preparation
Pattern Recognition and Image Preprocessing: Second Edition, Revised and Expanded, Sing-Tze Bow
Programmable Digital Signal Processors: Architecture, Programming, and Applications, edited by Yu Hen Hu
Signal Processing for Magnetic Resonance Imaging and Spectroscopy, edited by Hong Yan
Design of Digital Video Coding Systems
A Complete Compressed Domain Approach
Jie Chen
Flarion Technologies
Bedminster, New Jersey
Ut-Va Koc
Lucent Technologies
Murray Hill, New Jersey
K.J.Ray Liu
University of Maryland
College Park, Maryland
MARCEL DEKKER, INC. NEW YORK • BASEL
This edition published in the Taylor & Francis eLibrary, 2005.
To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.
ISBN 0203904184 Master ebook ISBN
ISBN (OEB Format)
ISBN: 0824706560 (Print Edition)
Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212–696–9000; fax: 212–685–4540
Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41–61–261–8482; fax: 41–61–261–8896
World Wide Web
http://www.dekker.com
The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters
address above.
Copyright © 2002 by Marcel Dekker, Inc.
All Rights Reserved.
Neither this book nor any part may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, microfilming, and recording, or
by any information storage and retrieval system, without permission in writing from the
publisher.
The LORD has blessed me with a wonderful family to whom
this book is dedicated.
To my parents, my dear wife Allison, and our lovely
daughter Grace.
Jie Chen
To the memory of my dear father, Kok long Ip,
and to my dear mother Chong I Mui, my beloved wife
Wen-Ling, and our lovely children Irene and Jeffrey.
Ut-Va Koc
To Lynne, Jeffry, Joanne,
and our doggie Reo
K.J.Ray Liu
Series Introduction
Over the past 50 years, digital signal processing has evolved as a major engineering discipline. The fields of signal processing have grown from the origin of the fast Fourier
transform and digital filter design to statistical spectral analysis and array processing, and image, audio, and multimedia processing, and have shaped developments in
high-performance VLSI signal processor design. Indeed, there are few fields that enjoy so many applications: signal processing is everywhere in our lives.
When one uses a cellular phone, the voice is compressed, coded, and modulated using signal processing techniques. As a cruise missile winds along hillsides
searching for the target, the signal processor is busy processing the images taken along the way. When we are watching a movie in HDTV, millions of audio and video
data are being sent to our homes and received with unbelievable fidelity. When scientists compare DNA samples, fast pattern recognition techniques are being used.
On and on, one can see the impact of signal processing in almost every engineering and scientific discipline.
Because of the immense importance of signal processing and the fast-growing demands of business and industry, this series on signal processing serves to report
up-to-date developments and advances in the field. The topics of interest include but are not limited to the following:
● Signal theory and analysis
● Statistical signal processing
● Speech and audio processing
● Image and video processing
● Multimedia signal processing and technology
● Signal processing for communications
● Signal processing architectures and VLSI design
I hope this series will provide the interested audience with high-quality, state-of-the-art signal processing literature through research monographs, edited books, and
rigorously written textbooks by experts in their fields.
K.J.Ray Liu
Preface
The hybrid DCT motion-compensated approach for video coding has been the core of almost all recent multimedia standards such as MPEG-1, MPEG-2, H.261,
H.263, and even MPEG-4. Therefore, an efficient, high-performance, cost-effective design of a digital video encoder and decoder relies on a good design of the hybrid
DCT motion-compensated Codec.
The concept of a hybrid DCT motion-compensated Codec comes mainly from two parts. One is to employ the discrete cosine transform (DCT), as in the
well-known still image standard JPEG, as a means to remove spatial redundancy within an image frame through transform coding. The other is to perform motion estimation
and compensation to remove temporal redundancy among image frames through prediction. Naturally, such a concept leads to an encoder architecture
in which the temporal redundancy is removed first, by taking the difference between the current image frame and its prediction, obtained by motion estimation
and compensation of the previous frame. The difference is then processed by the DCT to remove the spatial redundancy. This architecture, commonly used
nowadays, has a performance-critical feedback loop consisting of a DCT, a quantization unit and a dequantization unit, an inverse DCT, and a spatial-domain motion
estimation/compensation unit. Note that the DCT and motion estimation/compensation together consume most of the computational resources of a digital video encoder. Such a
heavily loaded feedback loop not only increases the overall complexity of the encoder but also limits the throughput, becoming the bottleneck in designing a real-time,
high-performance, cost-effective digital video system.
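The conventional loop described above can be sketched for a single block as follows. This is a minimal illustration, not the authors' design: the 4x4 block size, the uniform quantizer with step qstep, and the names encode_block, dct2, and idct2 are assumptions made for the example, and the prediction is taken as given rather than produced by an actual motion search.

```python
import math

N = 4  # illustrative block size (assumption; the standards use 8x8 blocks)

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that a 1-D transform is C times x."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

C = dct_matrix(N)
CT = transpose(C)

def dct2(block):
    """Separable 2-D DCT: C * X * C^T."""
    return matmul(matmul(C, block), CT)

def idct2(coef):
    """Inverse 2-D DCT: C^T * Y * C (C is orthonormal)."""
    return matmul(matmul(CT, coef), C)

def encode_block(current, prediction, qstep):
    """One pass through the conventional loop for one block.

    `prediction` stands in for the motion-compensated previous frame.
    Returns the quantized levels and the reconstructed block that the
    encoder would store as the reference for the next frame.
    """
    # Temporal redundancy: difference between current block and prediction.
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(current, prediction)]
    # Spatial redundancy: transform, then a uniform quantizer (assumption).
    coef = dct2(residual)
    levels = [[round(v / qstep) for v in row] for row in coef]
    # Feedback path: dequantize, inverse DCT, add the prediction back.
    dequant = [[level * qstep for level in row] for row in levels]
    recon_residual = idct2(dequant)
    reconstructed = [[p + r for p, r in zip(pr, rr)]
                     for pr, rr in zip(prediction, recon_residual)]
    return levels, reconstructed

current = [[52, 55, 61, 66], [63, 59, 55, 90],
           [62, 59, 68, 113], [63, 58, 71, 122]]
prediction = [[50, 54, 60, 68], [60, 60, 57, 85],
              [61, 60, 66, 110], [65, 57, 70, 120]]
levels, reconstructed = encode_block(current, prediction, qstep=2.0)
```

Every block passes through the DCT, quantizer, dequantizer, and inverse DCT before the next prediction can be formed, which is why the loop limits throughput.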
Is there a better way to design the video encoder? This is the question we have been trying to answer. In this monograph, we present an encoder structure that, by
combining transform coding, motion estimation, and compensation completely in the DCT domain, can reduce the complexity inside the loop significantly. The question
is: can we perform motion estimation and compensation in the DCT domain efficiently, i.e., with lower overall complexity and higher data throughput? We have
developed a motion estimation scheme that works completely in the DCT domain. At first look, it may seem that such a scheme, because it needs other transforms of a
similar family, requires higher computational complexity from an algorithmic point of view. Nevertheless, we can show that with an efficient design of a signal
processing architecture, those transforms can be generated together naturally, with little or no hardware penalty compared to the basic hardware cost of the
DCT. In fact, through the generation of those transforms, the operations of motion estimation are inherently performed. As such, both the DCT and motion
estimation are combined into a single, unified component. Therefore, the answer to the question of finding a better way to design a digital video encoder
comes not only from the domain of algorithms but also from its interaction with our understanding of architecture and hardware issues.
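One elementary fact that makes a transform-domain loop possible is that the DCT is linear, so the prediction residual can be formed from DCT coefficients directly. The sketch below checks this numerically on a 1-D signal. It illustrates only the residual step: the sample values and the helper dct_1d are assumptions for the example, and the DCT-domain motion estimation and compensation developed in this book are not shown.

```python
import math

def dct_1d(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out

# Hypothetical sample rows from a current frame and its prediction.
current = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
prediction = [50.0, 54.0, 60.0, 68.0, 69.0, 60.0, 66.0, 70.0]

# Conventional order: subtract in the spatial domain, then transform.
via_spatial = dct_1d([c - p for c, p in zip(current, prediction)])

# Transform-domain order: transform both, then subtract coefficients.
via_transform = [a - b for a, b in zip(dct_1d(current), dct_1d(prediction))]

# The two results agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(via_spatial, via_transform))
```

If the prediction is already available as DCT coefficients, no inverse DCT is needed inside the loop to form the residual; obtaining that prediction in the DCT domain is the harder part.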
In fact, given today’s optical technology, the repeated computation of those required transforms can be easily handled by an optical engine with almost no loss of
time. Therefore, the proposed complete transform domain approach can gain incredible advantages over conventional electronic designs in areas such as broadband
fiber optical multimedia communications where speed is of the essence. If the optical engine can be cost-effective, then the proposed approach can even be employed
to deliver low-cost, real-time personal video encoders everywhere.
This book contains part of the research we have been conducting in search of a better design of digital video encoders. The full picture, as it relates to
the interactions and evolution of algorithms and architectures, cannot be easily presented and understood through scattered technical publications, each constrained
by page limits. This book is therefore devoted to readers who are interested in designing a new class of high-performance, low-power digital video
encoders. This is just the starting point of the journey, as readers may find that there are many possibilities and unanswered questions. We hope this book can serve as a
seed planted in readers’ minds to germinate into an idea: perhaps there is a better way to design and implement digital video encoders.
In order to prepare readers with different backgrounds to understand the materials, there are four parts in this book. Part I covers fundamental material on the
background and standards of digital video. In Part II, the algorithmic aspects are considered, followed by the discussion of design and implementation in Part III.
Finally, in Part IV an application to the SONET optical transcoder is presented.
Part I contains Chapters 1, 2, and 3. We devote Chapter 2 to the basics of the motion-compensated DCT video coding approach (MC-DCT). Various MC-DCT-based
video coding standards such as H.261, H.263, MPEG-1, and MPEG-2 are presented in Chapter 3. After the introduction of the commonly used MC-DCT
approach in Chapter 2, the disadvantages of the conventional block-based motion estimation and compensation video coder structure used in all the coding standards
are also pointed out. To overcome those disadvantages, the idea of a fully DCT-based coder design is presented.
Part II is from Chapter 4 to Chapter 7. To be able to realize transform domain