INTRODUCTION TO
DIGITAL AUDIO CODING
AND STANDARDS
THE KLUWER INTERNATIONAL SERIES
IN ENGINEERING AND COMPUTER SCIENCE
Marina Bosi
Stanford University
Richard E. Goldberg
The Brattle Group
SPRINGER SCIENCE+BUSINESS MEDIA, LLC
Library of Congress Cataloging-in-Publication Data
Bosi, Marina
Introduction to Digital Audio Coding and Standards / by Marina Bosi, Richard E. Goldberg.
p. cm. - (The Kluwer international series in engineering and computer science; SECS 721)
Includes bibliographical references and index.
ISBN 978-1-4613-5022-4 ISBN 978-1-4615-0327-9 (eBook)
DOI 10.1007/978-1-4615-0327-9
1. Sound—Recording and reproducing—Digital techniques—Standards. 2. Data
compression (Computer science) I. Goldberg, Richard E. II. Title. III. Series.
TK7881.4 .B59 2002
621.38932--dc21
2002040686
Copyright © 2003 by Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers in 2003
Softcover reprint of the hardcover 1st edition 2003
All rights reserved. No part of this work may be reproduced, stored in a
retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, microfilming, recording, or otherwise, without the
written permission from the Publisher, with the exception of any material
supplied specifically for the purpose of being entered and executed on a
computer system, for exclusive use by the purchaser of the work.
Printed on acid-free paper.
Contents
FOREWORD ................................................................................................ xiii
PREFACE .................................................................................................... xvii
PART I: AUDIO CODING METHODS
Chapter 1. INTRODUCTION
1. Representation of Audio Signals ................................................ 3
2. What is a Digital Audio Coder? ................................................ 4
3. Audio Coding Goals ................................................................... 5
4. The Simplest Coder - PCM ....................................................... 7
5. The Compact Disk ..................................................................... 8
6. Potential Coding Errors .............................................................. 9
7. A More Complex Coder ........................................................... 10
8. References ................................................................................ 12
9. Exercises .................................................................................. 12
Chapter 2. QUANTIZATION
1. Introduction .............................................................................. 13
2. Binary Numbers ....................................................................... 14
3. Quantization ............................................................................. 20
4. Quantization Errors .................................................................. 34
5. Entropy Coding ........................................................................ 38
6. Summary .................................................................................. 43
7. References ................................................................................ 44
8. Exercises .................................................................................. 44
Chapter 3. REPRESENTATION OF AUDIO SIGNALS
1. Introduction .............................................................................. 47
2. Notation .................................................................................... 48
3. Dirac Delta ............................................................................... 49
4. The Fourier Transform ............................................................. 51
5. Summary Properties of Audio Signals ..................................... 53
6. The Fourier Series .................................................................... 59
7. The Sampling Theorem ............................................................ 61
8. Prediction ................................................................................. 63
9. Summary .................................................................................. 68
10. Appendix - Exact Reconstruction of a Band-Limited,
Periodic Signal from Samples within One Period .................... 68
11. References ................................................................................ 69
12. Exercises .................................................................................. 70
Chapter 4. TIME TO FREQUENCY MAPPING PART I: THE PQMF
1. Introduction .............................................................................. 75
2. The Z Transform ...................................................................... 77
3. Two-Channel Perfect Reconstruction Filter Banks .................. 84
4. The Pseudo-QMF Filter Bank, PQMF ..................................... 90
5. Summary .................................................................................. 99
6. References .............................................................................. 100
7. Exercises ................................................................................ 101
Chapter 5. TIME TO FREQUENCY MAPPING PART II: THE MDCT
1. Introduction ............................................................................ 103
2. The Discrete Fourier Transform ............................................. 104
3. The Overlap-and-Add Technique .......................................... 113
4. The Modified Discrete Cosine Transform, MDCT ................ 124
5. Summary ................................................................................ 143
6. References .............................................................................. 144
7. Exercises ................................................................................ 146
Chapter 6. INTRODUCTION TO PSYCHOACOUSTICS
1. Introduction ............................................................................ 149
2. Sound Pressure Levels ........................................................... 150
3. Loudness ................................................................................ 150
4. Hearing Range ........................................................................ 151
5. Hearing Threshold .................................................................. 153
6. The Masking Phenomenon ..................................................... 156
7. Measuring Masking Curves ................................................... 160
8. Critical Bandwidths ................................................................ 164
9. How Hearing Works .............................................................. 168
10. Summary ................................................................................ 174
11. References .............................................................................. 175
12. Exercises ................................................................................ 177
Chapter 7. PSYCHOACOUSTIC MODELS FOR AUDIO CODING
1. Introduction ............................................................................ 179
2. Excitation Patterns and Masking Models ............................... 180
3. The Bark Scale ....................................................................... 182
4. Models for the Spreading of Masking .................................... 183
5. Masking Curves ..................................................................... 190
6. "Addition" of Masking ........................................................... 192
7. Modeling the Effects of Non-Simultaneous (Temporal)
Masking .................................................................................. 195
8. Perceptual Entropy ................................................................. 196
9. Masked Thresholds and Allocation of the Bit Pool ............... 197
10. Summary ................................................................................ 198
11. References .............................................................................. 198
12. Exercises ................................................................................ 200
Chapter 8. BIT ALLOCATION STRATEGIES
1. Introduction ............................................................................ 201
2. Coding Data Rates ................................................................. 202
3. A Simple Allocation of the Bit Pool ...................................... 204
4. Optimal Bit Allocation ........................................................... 205
5. Time-Domain Distortion ........................................................ 214
6. Optimal Bit Allocation and Perceptual Models ..................... 216
7. Summary ................................................................................ 218
8. References .............................................................................. 219
9. Exercises ................................................................................ 219
Chapter 9. BUILDING A PERCEPTUAL AUDIO CODER
1. Introduction ............................................................................ 221
2. Overview of the Coder Building Blocks ................................ 221
3. Computing Masking Curves .................................................. 223
4. Bitstream Format .................................................................... 230
5. Business Models and Coding Secrets ..................................... 233
6. References .............................................................................. 235
7. Exercises ................................................................................ 235
Chapter 10. QUALITY MEASUREMENT OF PERCEPTUAL AUDIO CODECS
1. Introduction ............................................................................ 237
2. Audio Quality ......................................................................... 239
3. Systems with Small Impairments ........................................... 240
4. Objective Perceptual Measurements of Audio Quality .......... 251
5. What Are We Listening For? ................................................. 255
6. Summary ................................................................................ 257
7. References .............................................................................. 257
8. Exercises ................................................................................ 261
PART II: AUDIO CODING STANDARDS
Chapter 11. MPEG-1 AUDIO
1. Introduction ............................................................................ 265
2. Brief History of MPEG Standards ......................................... 266
3. MPEG-1 Audio ...................................................................... 268
4. Time to Frequency Mapping .................................................. 273
5. MPEG Audio Psychoacoustic Models ..................................... 278
6. MPEG-1 Audio Syntax .......................................................... 296
7. Stereo Coding ......................................................................... 307
8. Summary ................................................................................ 310
9. References .............................................................................. 310
Chapter 12. MPEG-2 AUDIO
1. Introduction ............................................................................ 315
2. MPEG-2 LSF, "MPEG-2.5" and MP3 ................................... 315
3. Introduction to Multichannel Audio ....................................... 318
4. MPEG-2 Multichannel BC ..................................................... 321
5. Summary ................................................................................ 330
6. References .............................................................................. 330
Chapter 13. MPEG-2 AAC
1. Introduction ............................................................................ 333
2. Overview ................................................................................ 333
3. Gain Control ........................................................................... 338
4. Filter Bank .............................................................................. 340
5. Prediction ............................................................................... 343
6. Quantization and Coding ........................................................ 346
7. Noiseless Coding .................................................................... 350
8. Bitstream Multiplexing .......................................................... 353
9. Temporal Noise Shaping ........................................................ 355
10. Joint Stereo Coding ................................................................ 358
11. Test Results ............................................................................ 359
12. Decoder Complexity Evaluation ............................................ 363
13. Summary ................................................................................ 367
14. References .............................................................................. 367
Chapter 14. DOLBY AC-3
1. Introduction ............................................................................ 371
2. Main Features ......................................................................... 372
3. Overview of the Encoding Process ........................................ 374
4. Filter Bank .............................................................................. 377
5. Spectral Envelope Coding ...................................................... 382
6. Multichannel Coding .............................................................. 385
7. Bit Allocation ......................................................................... 390
8. Quantization ........................................................................... 394
9. Bitstream Syntax .................................................................... 395
10. Performance ........................................................................... 396
11. Summary ................................................................................ 397
12. References .............................................................................. 398
Chapter 15. MPEG-4 AUDIO
1. Introduction ............................................................................ 401
2. MPEG-4: What is it? .............................................................. 402
3. MPEG-4 Audio Goals and Functionalities ............................. 405
4. MPEG-4 Audio Tools and Profiles ........................................ 408
5. MPEG-1 and 2 Versus MPEG-4 Audio ................................. 422
6. The Performance of the MPEG-4 Audio Coding Tools ......... 424
7. Intellectual Property and MPEG-4 ......................................... 425
8. Summary ................................................................................ 426
9. References .............................................................................. 426
INDEX ....................................................................................................... 431
About the Authors
Marina Bosi is a Consulting Professor at Stanford University's Center for
Computer Research in Music and Acoustics (CCRMA) and Chief
Technology Officer of MPEG LA®, a firm specializing in the licensing of
multimedia technology. Past president of the Audio Engineering Society,
Dr. Bosi is the author of numerous articles and the holder of several patents
in audio technology. Dr. Bosi has been involved in the development of
MPEG, Dolby, and DTS audio coders.
Richard E. Goldberg is a Partner at The Brattle Group, a management
consulting firm specializing in economics and finance issues. Dr.
Goldberg's practice focuses on business valuation and risk management. Dr.
Goldberg has a Ph.D. in Physics from Stanford University and an A.B. in
Astrophysics from Princeton University. Audio coding technology and
related business applications have long been areas of interest for him.