Master’s Thesis Audio- and Visual Cues in Mixing and Mastering Lukas Knoebl Graz, April 18, 2016 Institute of Electronic Music and Acoustics Graz University of Music and Performing Arts Graz University of Technology Advisor: Msc. Ph.D. Georgios Marentakis Assessor: O.Univ.-Prof. Mag.art. DI Dr.techn. Robert H¨oldrich Abstract With the emergence of digital technology into the audio market, computer-based music systems have become an integral part for the creation of professional audio productions. Most software in that area is still operated with mouse and keyboard and often involves complex graphical user interfaces. The goal of this master thesis is to investigate, whether and to what extent the visualisation of audio-effect param- eters can influence the mixing decisions of audio engineers. For this purpose, test subjects performed two common mixing and mastering tasks by using just the aural modality, just the visual modality or both modalities at the same time. The first experiment was designed to investigate whether visual cues affected test subjects when matching the loudness of various test signals. Similarly, test subjects were asked to match the spectral characteristics of two test signals by using an equal- izer in a second experiment. Though software such as sequencers and audio effect plug-ins have become indispensable tools for most audio engineers, the influences of audio- and visual cues in mixing and mastering have hardly been researched sci- entifically. This thesis should give some first indications about the interaction and implications of aural and visual modalities in conjunction with common tasks in mixing and mastering. Zusammenfassung Seit dem Einzug der Digitaltechnik in den Audiobereich werden computerbasier- te Musiksysteme fu¨r viele Arbeitsschritte einer professionellen Tontr¨agerproduktion herangezogen. Die Bedienung solcher Software erfolgt meist traditionell mit Maus und Tastatur und erfordert daher die Bereitstellung von oftmals komplexen gra- fischen Benutzeroberfl¨achen. Das Ziel dieser Masterarbeit ist, zu untersuchen, ob und in welchem Ausmaß die Visualisierung von Audioeffekt-Parametern die Ent- scheidungen von Mixing- oder Mastering-Ingenieuren beeinflusst. Zu diesem Zwecke wurden H¨orversuche entworfen, in denen die Testpersonen typische Mixing- und Mastering Aufgaben unter unterschiedlichen Testbedingungen durchgefu¨hrt haben. Im ersten Versuch sollten jeweils zwei Testsignale unter rein auditiven, rein visuellen sowie audio-visuellen Versuchsbedingungen in ihrer Lautheit angepasst werden. In einem zweiten Experiment sollte die Kernfrage der Arbeit auch in Bezug auf die Angleichung der spektralen Charakteristik zweier Signale untersucht werden. Trotz der breiten Anwendung computerbasierter Systeme in der Musikproduktion wurde der Einfluss der damit einhergehenden Visualisierungen auf die kreativen Entschei- dungen Tonschaffender bisher kaum untersucht. Diese Masterarbeit soll erste Auf- schlu¨sse u¨ber das Zusammenspiel von Sehen und H¨oren in Verbindung mit typischen Mixing- und Mastering Aufgaben liefern. Acknowledgements I would like to thank my supervisors O.Univ.-Prof. Mag.art. DI Dr.techn. Robert Ho¨ldrich and Msc. Ph.D. Georgios Marentakis for their persistent support and many insightful discussions. Thanks to DI Ph.D. Matthias Frank for a very helpful intro- duction to Pure Data. My audio-engineering colleagues at the Graz University of Technology, who have truly made my university years a friendly and vibrant experience. Thanks for all the fun we have had in the last years. Special thanks to those who willingly shared their precious time and participated in my experiments, this work would never have been possible without their efforts. Lastly, Ithankmyfriendsandlovedones, whohavesupportedmethroughtheentire process and inspired my passion for music. My family - Mum, Dad and Lydia - for their love, patience and for providing me with a peaceful environment to live, grow up and learn. Thank-you for inciting me to strive towards my goals and for ongoing support on the new roads I am taking. Contents 1 Introduction 1 2 Audio Production 3 2.1 The Evolution Of Mixing . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 The Mixing Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.3 The Mastering Process . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 The Digital Audio Workstation . . . . . . . . . . . . . . . . . . . . . 12 2.4.1 Audio Plug-ins . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3 Interface Designs For Audio Processing 16 3.1 Interface Conventions in Digital Audio Workstations . . . . . . . . . 18 3.2 Influence Of Graphical User Interface Design . . . . . . . . . . . . . . 19 4 Cross Modal Interaction 22 4.1 Cross-Modality: Vision and Sound . . . . . . . . . . . . . . . . . . . 22 4.1.1 Spatial Domain Effects . . . . . . . . . . . . . . . . . . . . . . 23 4.1.2 Time Domain Effects . . . . . . . . . . . . . . . . . . . . . . . 24 4.1.3 Other Audio-Visual Effects . . . . . . . . . . . . . . . . . . . . 26 4.1.4 Cross-Modal Attention . . . . . . . . . . . . . . . . . . . . . . 27 4.1.5 Mixing with Eyes Closed . . . . . . . . . . . . . . . . . . . . . 29 5 Existing Studies 30 5.1 Summary of Literature Review . . . . . . . . . . . . . . . . . . . . . 33 i 6 Experiment 1: Loudness Matching 35 6.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 6.1.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 6.1.2 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 6.2 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 6.2.1 Procedure for Condition 1 - Auditory . . . . . . . . . . . . . . 40 6.2.2 Procedure for Condition 2 - Visual . . . . . . . . . . . . . . . 41 6.2.3 Procedure for Condition 3 - Audio-Visual . . . . . . . . . . . . 42 6.2.4 Summary: Loudness Matching Experimental Design . . . . . . 43 6.3 Results: Loudness Matching . . . . . . . . . . . . . . . . . . . . . . . 44 6.3.1 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 6.3.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . 45 6.3.3 Evaluation of Unbiased Data . . . . . . . . . . . . . . . . . . . 46 6.3.4 Evaluation of Biased Data . . . . . . . . . . . . . . . . . . . . 48 6.3.5 A Characteristic Measure for Visual Influence . . . . . . . . . 54 6.3.6 Response time . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 6.3.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 7 Experiment 2: Spectral Matching 61 7.1 Equalizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 7.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 7.2.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 7.2.2 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 7.2.3 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 7.2.4 Procedure for Condition 1 - Auditory . . . . . . . . . . . . . . 70 7.2.5 Procedure for Condition 2 - Visual . . . . . . . . . . . . . . . 71 7.2.6 Procedure for Condition 3 - Audio-Visual . . . . . . . . . . . . 72 ii 7.2.7 Summary: Spectral Matching Experimental Design . . . . . . 74 7.3 Results: Spectral Matching . . . . . . . . . . . . . . . . . . . . . . . . 75 7.3.1 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 7.3.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . 75 7.3.3 Evaluation of Unbiased Data - Frequency Error . . . . . . . . 76 7.3.4 Evaluation of Unbiased Data - Gain Error . . . . . . . . . . . 79 7.3.5 Evaluation of Biased Data . . . . . . . . . . . . . . . . . . . . 80 7.3.6 A Characteristic Measure for the Degree of Visual Influence . 84 7.3.7 Evaluation of the Mean Cumulative Error . . . . . . . . . . . 86 7.3.8 Response Time . . . . . . . . . . . . . . . . . . . . . . . . . . 91 7.3.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 8 Conclusion 96 Appendices 109 iii List of Figures 2.1 Common production chain of recorded music. Reprinted from [42] . . 7 2.2 Common production chain of sequenced music. Reprinted from [42] . 7 2.3 The iterative process of mixing. Reprinted from [42] . . . . . . . . . . 9 2.4 The iterative coarse-to-fine approach while mixing. Reprinted from [42] 10 2.5 Norman’s seven-staged model of interaction . Reprinted from [73] . . 11 2.6 Major steps in Compact Disc production [46] . . . . . . . . . . . . . . 11 2.7 Graphical user interface of Wave’s CLA-2A compressor plug-in . . . . 14 6.1 Loudness matching experiment setup . . . . . . . . . . . . . . . . . . 36 6.2 Routing and general setup in Experiment 1 . . . . . . . . . . . . . . . 38 6.3 Test signals, waveforms . . . . . . . . . . . . . . . . . . . . . . . . . . 39 6.4 Schematic illustration of the experimental design for the unimodal aural condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6.5 Graphical user interface of the PPMultor Plus plug-in by zplane in EBU-R128 mode with short time metering . . . . . . . . . . . . . . . 41 6.6 Schematic illustration of the experimental design for the unimodal visual condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.7 Schematic illustration of the experimental design for the bi-modal audiovisual condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6.8 Evaluation of responses collected under unbiased conditions. Error- bars represent the standard error of them mean. . . . . . . . . . . . . 47 6.9 Mean gain settings related to the degree of (visual) offset for different stimuli pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 6.10 Characteristic measure for the degree of visual influence depending on test signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 iv 6.11 Response times per trial for the individual conditions of the experiment 57 7.1 The frequency response of a parametric equalizer boosting 15 dB at 1 kHz with Q-Settings of (0.25/0.50/0.75/1.0). Reprinted from [1] . . 62 7.2 Measured frequency response of Fabfilter’s Pro Q2 . . . . . . . . . . . 63 7.3 Spectral Matching experimental setup . . . . . . . . . . . . . . . . . . 64 7.4 General experimental setup for Experiment 2 . . . . . . . . . . . . . 68 7.5 Spectrum of spectral matching test signals . . . . . . . . . . . . . . . 69 7.6 Schematic illustration of the experimental design for the unimodal aural condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 7.7 Schematic illustration of the experimental design for the unimodal visual condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 7.8 Example of displayed EQ-Plug-ins on the screen . . . . . . . . . . . . 72 7.9 Schematic illustration of the experimental design for the multimodal audio-visual condition . . . . . . . . . . . . . . . . . . . . . . . . . . 73 7.10 Mean Absolute Frequency Error for unbiased conditions . . . . . . . . 77 7.11 Mean gain error (a) and mean absolute gain error (b) for unbiased conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 7.12 Mean frequency error as a function of visual bias . . . . . . . . . . . 82 7.13 Mean absolute gain error as a function of visual bias . . . . . . . . . 84 7.14 Characteristic measure for the degree of visual influence as a function of test signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 7.15 Derivation of Cumulative Error with corresponding 95th percentiles . 87 7.16 Mean Cumulative Error . . . . . . . . . . . . . . . . . . . . . . . . . 88 7.17 Mean Maximum Cumulative Error . . . . . . . . . . . . . . . . . . . 90 7.18 Response times per trial for the individual conditions of the experiment 91 1 Mean Frequency Error for unbiased conditions . . . . . . . . . . . . . 110 v List of Tables 6.1 Illustration of key parameters for the three conditions in Experiment 1 44 6.2 Two main data groups, consisting of 9 and 21 subgroups ´a 12 re- sponses respectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 6.3 Results of the 2-way repeated measures ANOVA . . . . . . . . . . . . 48 6.4 Pink-Noise vs. Pink-Noise multiple comparisons . . . . . . . . . . . . 51 6.5 Pop A vs. Pop B multiple comparisons . . . . . . . . . . . . . . . . . 53 6.6 Pop A vs. Classic loudness matching: Multiple comparisons revealed statistically significantly different gain settings between offsets of left and right columns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 7.1 Frequency-regions for center frequencies . . . . . . . . . . . . . . . . . 66 7.2 Illustration of key parameters for the three conditions in Experiment 1 74 7.3 Two main data groups . . . . . . . . . . . . . . . . . . . . . . . . . . 76 7.4 Unbiased data, Results of the 3-way repeated measures ANOVA . . . 78 7.5 Biased Data, Results of the 3-way repeated measures ANOVA . . . . 81 vi
Description: