Design of MPEG-4 AAC Encoder

Authors: Chi-Min Liu, Wen-Chieh Lee, Chung-Han Yang, Kang-Yan Peng, Ting Chiou, Tzu-Wen Chang, Yu-Hua Hsiao, Hen-Wen Hue and Chu-Ting Chien

Outline
  Introduction
  Psychoacoustic Model
  M/S Coding
  Window Switch
  Temporal Noise Shaping
  Experiments & Demonstration
  Conclusion

1. Introduction: NCTU-AAC Encoder
  [Block diagram: Audio in, W-Switch, Psychoacoustic Model, Filterbank, TNS, M/S, Bit Reservoir, Bit Allocation, Quantization, VLC, Bit-Stream Packing]

1. Introduction
  Modules
    Psychoacoustic Model
    M/S Coding
    Window Switch
    Temporal Noise Shaping
  Objective
    Theoretical Frameworks
    Quality
    Complexity

2. Psychoacoustic Model
  Approach
    MDCT-based instead of FFT-based.
    New Masking Models
      Detection of tonal attack band.
      Detection of tone-rich signal.

2. Psychoacoustic Model (c.1)
  MDCT and FFT
    Similar spectrum.
    The MDCT spectrum is chaotic due to aliasing.
    Using the MDCT gives a consistent spectrum for both the analysis and the encoding process.

2. Psychoacoustic Model (c.2)
  DCT Spectrum
  Q-Bands instead of Lines or P-Bands
  Tone/Noise information based on Band Flatness instead of Frame Predictivity

    $\mathrm{flatness}_b = \dfrac{GM_b}{AM_b}, \qquad GM_b = \left( \prod_{i=0}^{N_b-1} x_i \right)^{1/N_b}, \qquad AM_b = \dfrac{1}{N_b} \sum_{i=0}^{N_b-1} x_i$

    For a tone-rich signal in band b, $\mathrm{flatness}_b$ approaches 0.
    For a noise-rich signal in band b, $\mathrm{flatness}_b$ approaches 1.

2. Psychoacoustic Model: Adaptive TMN and NMT offset
  Utilization
    Human perception is insensitive at high frequencies.
    The masking effect at high frequencies is stronger than at low frequencies.
  [Figure: Offset (vertical axis, 0 to 4) plotted against index 1 to 49]
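The remark on slide (c.1) that the MDCT spectrum looks chaotic compared with the FFT can be illustrated with a small experiment. The sketch below is not taken from the NCTU-AAC encoder: the frame size `N`, the sine window, the test tone at 10.3 cycles per frame, and the helper names `mdct` and `spread` are all assumptions chosen for illustration. It feeds a stationary sinusoid through overlapping frames and compares how much the peak-bin magnitude fluctuates from frame to frame in each transform.

```python
import numpy as np

def mdct(frame, window):
    """Direct O(N^2) MDCT of one 2N-sample frame (textbook definition)."""
    two_n = len(frame)
    n_coef = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_coef)
    # Basis: cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    phase = np.pi / n_coef * np.outer(k + 0.5, n + 0.5 + n_coef / 2)
    return np.cos(phase) @ (frame * window)

N = 64                                   # assumed number of coefficients per frame
window = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))  # sine window

# Stationary test tone: 10.3 cycles per 2N-sample frame (illustrative choice).
t = np.arange(21 * N)
signal = np.cos(2 * np.pi * 10.3 * t / (2 * N))

# 20 frames of length 2N with hop N (50% overlap).
frames = [signal[m * N : m * N + 2 * N] for m in range(20)]

mdct_mag = np.abs(np.array([mdct(f, window) for f in frames]))
fft_mag = np.abs(np.array([np.fft.rfft(f * window) for f in frames]))

# Track the bin with the largest average magnitude in each transform.
k_mdct = int(np.argmax(mdct_mag.mean(axis=0)))
k_fft = int(np.argmax(fft_mag.mean(axis=0)))

def spread(values):
    """Relative frame-to-frame fluctuation: standard deviation over mean."""
    return float(np.std(values) / np.mean(values))

mdct_spread = spread(mdct_mag[:, k_mdct])
fft_spread = spread(fft_mag[:, k_fft])
print("MDCT peak-bin spread:", mdct_spread)
print("FFT  peak-bin spread:", fft_spread)
```

Because the MDCT is a real transform, each coefficient of a pure tone scales with the cosine of the frame's phase, so its magnitude swings between frames even though the tone itself does not change. The FFT magnitude at the peak bin stays almost constant. A psychoacoustic model built on the MDCT has to cope with this fluctuation, which is consistent with the slides moving from per-line to per-band measures.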
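The band flatness measure on slide (c.2), the ratio of geometric mean to arithmetic mean within a band, can be sketched as follows. The slides do not say what the values x_i are, so this example assumes they are non-negative spectral energies (for instance squared MDCT coefficients). The function name `band_flatness`, the `band_edges` layout, and the small `eps` guard against log(0) are hypothetical choices, not part of the original design.

```python
import numpy as np

def band_flatness(x, band_edges, eps=1e-12):
    """Return GM_b / AM_b for each band b.

    x          : non-negative spectral values (assumed here to be energies)
    band_edges : band b covers x[band_edges[b] : band_edges[b + 1]]
    eps        : small constant added so the logarithm never sees zero
    """
    x = np.asarray(x, dtype=float)
    flatness = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = x[lo:hi] + eps
        gm = np.exp(np.mean(np.log(band)))   # geometric mean via the log domain
        am = np.mean(band)                   # arithmetic mean
        flatness.append(gm / am)
    return np.array(flatness)

# Two illustrative 8-line bands: one dominated by a single line, one flat.
tonal_band = [1e-6] * 7 + [1.0]
noisy_band = [1.0] * 8
flat = band_flatness(tonal_band + noisy_band, band_edges=[0, 8, 16])
print(flat)  # first entry is close to 0, second entry is close to 1
```

Computing the geometric mean in the log domain avoids underflow when a band holds many small values. Since GM never exceeds AM for non-negative data, the result always lies between 0 and 1, matching the tone-rich and noise-rich limits stated on the slide.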