MODELS AND THEORIES OF SPEECH PRODUCTION

EDITED BY: Adamantios Gafos and Pascal van Lieshout
PUBLISHED IN: Frontiers in Psychology and Frontiers in Communication

Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers.

The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only.
For further information please read Frontiers’ Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714
ISBN 978-2-88963-928-1
DOI 10.3389/978-2-88963-928-1

Frontiers in Psychology | August 2020 | Models and Theories of Speech Production

Topic Editors:
Adamantios Gafos, University of Potsdam, Germany
Pascal van Lieshout, University of Toronto, Canada

Citation: Gafos, A., van Lieshout, P., eds. (2020). Models and Theories of Speech Production. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-928-1

Table of Contents

Editorial: Models and Theories of Speech Production
Adamantios Gafos and Pascal van Lieshout

Emergence of an Action Repository as Part of a Biologically Inspired Model of Speech Processing: The Role of Somatosensory Information in Learning Phonetic-Phonological Sound Features
Bernd J. Kröger, Tanya Bafna and Mengxue Cao

Variability and Central Tendencies in Speech Production
D. H. Whalen and Wei-Rong Chen

Modeling Dimensions of Prosodic Prominence
Simon Roessig and Doris Mücke

The Emergence of Discrete Perceptual-Motor Units in a Production Model That Assumes Holistic Phonological Representations
Maya Davis and Melissa A. Redford

Motoric Mechanisms for the Emergence of Non-local Phonological Patterns
Sam Tilsen

Bridging Dynamical Systems and Optimal Trajectory Approaches to Speech Motor Control With Dynamic Movement Primitives
Benjamin Parrell and Adam C. Lammert

Modeling Sensory Preference in Speech Motor Planning: A Bayesian Modeling Framework
Jean-François Patri, Julien Diard and Pascal Perrier

The Morphogenesis of Speech Gestures: From Local Computations to Global Patterns
Khalil Iskarous

Economy of Effort or Maximum Rate of Information? Exploring Basic Principles of Articulatory Dynamics
Yi Xu and Santitham Prom-on

Native Language Influence on Brass Instrument Performance: An Application of Generalized Additive Mixed Models (GAMMs) to Midsagittal Ultrasound Images of the Tongue
Matthias Heyne, Donald Derrick and Jalal Al-Tamimi

Noggin Nodding: Head Movement Correlates With Increased Effort in Accelerating Speech Production Tasks
Mark Tiede, Christine Mooshammer and Louis Goldstein

Spatially Conditioned Speech Timing: Evidence and Implications
Jason A. Shaw and Wei-rong Chen

The Role of Temporal Modulation in Sensorimotor Interaction
Louis Goldstein

Spoken Language Development and the Challenge of Skill Integration
Aude Noiray, Anisia Popescu, Helene Killmer, Elina Rubertus, Stella Krüger and Lisa Hintermeier

A Simple 3-Parameter Model for Examining Adaptation in Speech and Voice Production
Elaine Kearney, Alfonso Nieto-Castañón, Hasini R. Weerathunge, Riccardo Falsini, Ayoub Daliri, Defne Abur, Kirrie J. Ballard, Soo-Eun Chang, Sara-Ching Chao, Elizabeth S. Heller Murray, Terri L. Scott and Frank H. Guenther

Timing Evidence for Symbolic Phonological Representations and Phonology-Extrinsic Timing in Speech Production
Alice Turk and Stefanie Shattuck-Hufnagel

Speech Sound Disorders in Children: An Articulatory Phonology Perspective
Aravind Kumar Namasivayam, Deirdre Coleman, Aisling O’Dwyer and Pascal van Lieshout

EDITORIAL
published: 19 June 2020
doi: 10.3389/fpsyg.2020.01238

Editorial: Models and Theories of Speech Production

Adamantios Gafos¹* and Pascal van Lieshout²*

¹ Department of Linguistics and Excellence Area of Cognitive Sciences, University of Potsdam, Potsdam, Germany, ² Department of Speech-Language Pathology, Oral Dynamics Laboratory, University of Toronto, Toronto, ON, Canada

Keywords: speech production, motor control, dynamical models, phonology, speech disorders, timing

Editorial on the Research Topic
Models and Theories of Speech Production

Spoken language is conveyed via well-coordinated speech movements, which act as coherent units of control referred to as gestures. These gestures and their underlying movements show several distinctive properties in terms of lawful relations among the parameters of duration, relative timing, range of motion, target accuracy, and speed.
However, no existing theory successfully accounts for all properties of these movements. Even though models in speech motor control in the last 40 years have consistently taken inspiration from general movement science, some of the comparisons remain ill-informed. For example, our present knowledge on whether widely known principles that apply to limb movements (e.g., the speed-accuracy tradeoff known as Fitts’ law) also hold true for speech movements is still very limited. An understanding of the principles that apply to speech movements is key to defining the somewhat elusive concept of speech motor skill and to assessing and interpreting different levels of that skill in populations with and without diagnosed speech disorders. The latter issue taps into fundamental debates about whether speech pathology assessment paradigms need to be restricted to control regimes that are specific to those underlying typical speech productions. Resolution of such debates crucially relies on our understanding of the nature of speech processes and the underlying control units.

Unlike movements in locomotion or oculomotor function, speech movements when combined into gestures are not mere physical instantiations of organs moving in space and time but also have intrinsic symbolic function. Language-particular systems, or phonological grammars, are involved in the patterning of these gestures. Grammar constraints regulate the permissible symbolic combinations, as evidenced via eliciting judgments on whether any given sequence is well-formed in any particular language (the same sequence can be acceptable in one language but not in another). In what ways these constraints shape speech gestures and how these fit with existing general principles of motor control is also not clearly understood.

Furthermore, speech gestures are parts of words and thus one window into understanding the nature of the speech production¹ system is to observe speech movements as parts of words or larger chunks of speech such as phrases or sentences. The intention to produce a lexical item involves activating sequences of gestures that are part of the lexical item. The regulation in time of the units in such sequences raises major questions for speech motor control theories (but also for theories of cognition and sequential action in general). Major challenges are met in the inter-dependence among different time scales related to gestural planning, movement execution and coordination within and across domains of individual lexical items. How these different time scales interact and how their interaction affects the observed movement properties is for the most part still unknown.

¹ One of our reviewers notes that in the field of psycholinguistics the term speech production is used more broadly (than in the use of the term implied by the contributions to this Research Topic) and points out the need, aptly stated, “to bridge the gap between psycholinguistically informed phonetics and phonetically informed psycholinguistics.” We fully concur and look forward to future research efforts and perhaps Research Topics devoted to such bridging. For a recent special issue on psycholinguistic approaches to speech production, see Meyer et al. (2019) and for a more focused review of the issues pertinent to “phonetic encoding” (a term in psycholinguistics roughly equivalent to our use of the term speech production in the present Research Topic) see Laganaro (2019).

In this special issue, we present a variety of theoretical and empirical contributions which explore the nature of the dynamics of speech motor control. For practical purposes, we separate these contributions into two major themes:

1) Models and theories of speech production.
2) Applications.

Following is a short description of each paper as listed under these themes.

1) Models and theories of speech production

The speech signal is simultaneously expressed in two information-encoding systems: articulation and acoustics. Goldstein’s contribution addresses the relation between representations in these two parallel manifestations of speech while focusing not on static properties but on patterns of change over time (temporal co-modulation) in these two channels. To do so, Goldstein quantifies the relation between rates of change in the parallel acoustic and articulatory representations of the same utterance, produced by various speakers, based on x-ray microbeam data. Analysis of this relation indicates that the two representations are correlated via a pulse-like modulation structure, with local correlations being stronger than global ones. This modulation seems linked to the fundamental unit of the syllable.

It is widely assumed that acoustic parameters for vowels are normally distributed, but it is rarely demonstrated that this might be the case. Whalen and Chen quantified the distributions of F1 and F2 values of /i/ and /o/ in the English words “heed,” “geek,” “ode”/“owed,” and “dote” produced by a single speaker on three different days. Analysis based on a high number of repetitions of these vowels in different consonantal contexts indicates that distributions are generally normal, which in turn suggests consistent vowel-specific targets across different contextual environments. The results add weight to the widely-held assumption that speech targets follow a normal distribution and the authors discuss the implications for theories of speech targets.

Turk and Shattuck-Hufnagel address the nature of timing in speech, with special attention given to movement endpoints, which as they argue relate to the goals of these movements. The argument is presented that these points require dedicated control regimes. Evidence for this argument is derived from work in both speech and non-speech motor control. It is also argued that in contrast to the Articulatory Phonology/Task Dynamics view, where gestural durations are determined by an intrinsic dynamics, duration must be an independently controlled variable in speech. A phonology-extrinsic component is thus proposed to be necessary and a call is made for developing and testing models of speech where a component of abstract, symbolic phonological representations is kept apart from the way(s) in which these representations are implemented in quantitative terms, which include surface duration specifications and attendant timing mechanisms for achieving these.

Shaw and Chen investigated to what degree timing between gestures is stable across variations in the spatial positions of individual articulators, as predicted in Articulatory Phonology. Using Electromagnetic Articulography with a group of Mandarin speakers producing CV monosyllables, they found a correlation between the initial position of the tongue gesture for the vowel and C-V timing. In contrast to the original hypothesis, this indicates that inter-gestural timing is sensitive to the position of the articulators, suggesting a critical role for somatosensory feedback.

Roessig and Mücke study tonal and kinematic profiles of different degrees of prominence (unaccented, broad, narrow and contrastive focus) from 27 speakers of German. Parameters in both the tonal and kinematic dimensions are shown to vary systematically across degrees of prominence. A dynamical approach is put forward in modeling these findings. This approach embraces the multidimensionality of prosody while at the same time showing how both discrete and continuous modifications in focus marking can be expressed within one formal language. The model captures qualitatively the observed patterns in the data by tuning of an abstract control variable which shapes the attractor landscape over the parameter space of kinematic and tonal dimensions considered in this work.

Iskarous provides a computational approach to explain the nature of spatiotemporal particulation of the vocal tract, as evidenced in the production of speech gestures. Based on a set of reaction-diffusion equations with simultaneous Turing and Hopf patterns, the critical characteristics of speech gestures related to vocal tract constrictions can be replicated in support of the notion that motor processes can be seen as the emergence of low degree of freedom descriptions from high degree of freedom systems.

Patri et al. address individual differences in responses to auditory or somatosensory perturbation in speech production. Two accounts are entertained. The first reduces individual differences to differences in acuity of the sensory specifications while the second leaves sensory specifications intact and, instead, modulates the sensitivity of match between motor commands and their auditory consequences. While simulation results show that both accounts lead to similar results, it is argued that maintaining intact sensory specifications is more flexible, enabling a more encompassing approach to speech variability where cognitive, attentional and other factors can modulate responses to perturbations.

One of the foundational ideas of phonology and phonetics is that produced and perceived utterances are decomposed into sequences of discrete units. However, evidence from development indicates that in child speech utterances are holistic rather than segmented. The contribution by Davis and Redford offers a theoretical demonstration along with attendant modeling that the posited units can emerge from a stage of speech where words or phrases start off as time-aligned motoric and perceptual trajectories. As words are added and repeatedly rehearsed by the learner, motoric trajectories begin to develop recurrent articulatory configurations which, when coupled with their corresponding perceptual representations, give rise to perceptual-motor units claimed to characterize mature speech production.

In their contribution, Kearney et al. present a simplified version of the DIVA model, focusing on three fitting parameters related to auditory feedback control, somatosensory feedback control, and feedforward control. The model is tested through computer simulations that identify optimal model fits to six existing sensorimotor adaptation datasets, showing excellent fits to real data across different types of perturbations and experimental paradigms.

An active area in phonological theory is the investigation of long-distance assimilation where features of a phoneme assimilate to features of another non-adjacent phoneme. Tilsen seeks to identify mechanisms for the emergence of such non-local assimilations in speech planning and production models. Two mechanisms are proposed. The first is one where a gesture is either anticipatorily selected in an earlier epoch or is not suppressed (after being selected) so that its influence extends to later epochs. The second is one where gestures which may be active in one epoch of a planning-level dynamics, even though not selected during execution, may still influence production in a different epoch. Evidence for these mechanisms is found in both speech and non-speech movement preparation paradigms. The existence of these two mechanisms is argued to account for the major dichotomy between assimilation phenomena that have been described as involving the extension of an assimilating property vs. those that cannot be so described.

Xu and Prom-on contrast two principles assumed to underlie the dynamics of movement control: economy of effort and maximum rate of information. They present data from speakers of American English on repetitive syllable sequences who were asked to imitate recordings of the same sequences that had been artificially accelerated and to produce meaningful sentences containing the same syllables at normal and fast speaking rates. The results show that the characteristics of the formant trajectories they analyzed fit best the notion of the maximum rate of information principle.

Kröger et al.’s contribution offers a demonstration that a learning model based on self-organizing maps can serve as a bridge between models of the mental lexicon and models of sensorimotor control and that such a model can learn (from semantic, auditory and somatosensory information) representational units akin to phonetic-phonological features. At a broad level, few efforts have been made to bridge theory and modeling of the lexicon and motor control. The proposed model aims at addressing that gap and makes predictions about the specificity and rate of growth of such representational features under different training conditions (auditory only vs. auditory and somatosensory training modes).

Parrell and Lammert develop a synthesis of the dynamic movement primitives model of motor control (Schaal et al., 2007; Ijspeert et al., 2013) with the task dynamics model of speech production (Saltzman and Munhall, 1989). A key element in achieving this synthesis is the incorporation of a learnable forcing term into the task dynamics’ point-attractor system. The presence of such a tunable term endows task dynamics with flexibility in movement trajectories. The proposed synthesis also establishes a link to optimization approaches to motor control where the forcing term can be seen to minimize a cost function over the timespan of the movement under consideration (e.g., minimizing total energy expended during a reaching movement). The dynamics of the proposed synthesis model are explicitly described and their effects are demonstrated in the form of proof of concept simulations showing the consequences of perturbations on jaw movement trajectories.

2) Applications

Noiray et al. present a study in which they examined whether phonemic awareness correlates with coarticulation degree, commonly used as a metric for estimating the size of children’s production units. A speech production task was designed to test for developmental differences in intra-syllabic coarticulation degree in 41 German children from 4 to 7 years of age, using ultrasound imaging. The results suggest that the process of developing spoken language fluency involves dynamical interactions between cognitive and speech motor domains.

Tiede et al. describe a study in which they tracked movements of the head and speech articulators during an alternating word pair production task driven by an accelerating rate metronome. The results show that as production effort increased, so did speaker head nodding, and that nodding increased abruptly following errors. The strongest entrainment between head and articulators was observed at the fastest rate under coda alternation conditions.

Namasivayam et al. present an Articulatory Phonology approach for understanding the nature of Speech Sound Disorders (SSDs) in children, aiming to reconcile the traditional phonetic-phonology dichotomy with the concept of interconnectedness between these levels. They present evidence supporting the notion of articulatory gestures at the level of speech production and how this is reflected in control processes in the brain. They add an overview of how an articulatory “gesture”-based approach can account for articulatory behaviors in typical and disordered speech production, concluding that the Articulatory Phonology approach offers a productive strategy for further research in this area.

Heyne et al. address the relation between speech and another oral motor skill, trombone playing. Using ultrasound, they recorded midsagittal tongue shapes from New Zealand English and Tongan-speaking trombone players. Tongue shapes from the two language groups were estimated via fits with generalized additive mixed models, while these speakers/players produced vowels (in their native languages) and sustained notes at different pitches and intensities. The results indicate that, while airflow production and requisite acoustics largely constrain vocal tract configuration during trombone playing, evidence for a secondary influence from speech motor configurations can be discerned in that the two groups tended to use different tongue configurations resembling distinct vocalic monophthongs in their respective languages.

The papers assembled for this Research Topic attest to the advantages of combining theoretical and empirical approaches to the study of speech production. They also attest to the value of formal modeling in addressing long-standing issues in speech development and the relationship between motor control and phonological patterns; to the importance of somatosensory and auditory feedback in planning and monitoring speech production and the importance of integrating speech production models with other aspects of cognition; and finally, to the potential of theoretical models in informing applications of speech production in disordered speech and motor skills in other oral activities such as playing musical instruments.

AUTHOR CONTRIBUTIONS

All authors listed have made equal contributions to the work and approve it for publication.

ACKNOWLEDGMENTS

AG’s work has been supported by the European Research Council (AdG 249440) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Project ID 317633480 - SFB 1287, Project C04.

REFERENCES

Ijspeert, A. J., Nakanishi, J., Hoffmann, H., Pastor, P., and Schaal, S. (2013). Dynamical movement primitives: learning attractor models for motor behaviors. Neural Computation 25, 328–73. doi: 10.1162/NECO_a_00393

Laganaro, M. (2019). Phonetic encoding in utterance production: a review of open issues from 1989 to 2018. Language Cognit. Neurosci. 34, 1193–1201. doi: 10.1080/23273798.2019.1599128

Meyer, A. S., Roelofs, A., and Brehm, L. (2019). Thirty years of speaking: an introduction to the Special Issue. Language Cognit. Neurosci. 34, 1073–1084. doi: 10.1080/23273798.2019.1652763

Saltzman, E. L., and Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology 1, 333–82. doi: 10.1207/s15326969eco0104_2

Schaal, S., Mohajerian, P., and Ijspeert, A. J. (2007). Dynamics systems vs. optimal control: a unifying view. In Progress in Brain Research 165, eds P. Cisek, T. Drew, and J. F. Kalaska, 425–45. doi: 10.1016/S0079-6123(06)65027-9

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Gafos and van Lieshout. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Edited and reviewed by: Niels O. Schiller, Leiden University, Netherlands
*Correspondence: Adamantios Gafos [email protected]; Pascal van Lieshout [email protected]
Specialty section: This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology
Received: 14 April 2020
Accepted: 12 May 2020
Published: 19 June 2020
Citation: Gafos A and van Lieshout P (2020) Editorial: Models and Theories of Speech Production. Front. Psychol. 11:1238. doi: 10.3389/fpsyg.2020.01238

Frontiers in Psychology | www.frontiersin.org | June 2020 | Volume 11 | Article 1238

ORIGINAL RESEARCH
published: 10 July 2019
doi: 10.3389/fpsyg.2019.01462

Emergence of an Action Repository as Part of a Biologically Inspired Model of Speech Processing: The Role of Somatosensory Information in Learning Phonetic-Phonological Sound Features

Bernd J. Kröger¹*, Tanya Bafna² and Mengxue Cao³

¹ Neurophonetics Group, Department of Phoniatrics, Pedaudiology, and Communication Disorders, Medical School, RWTH Aachen University, Aachen, Germany, ² Medical School, RWTH Aachen University, Aachen, Germany, ³ School of Chinese Language and Literature, Beijing Normal University, Beijing, China

Edited by: Adamantios Gafos, Universität Potsdam, Germany

A comprehensive model of speech processing and speech learning has been established. The model comprises a mental lexicon, an action repository and an articulatory-acoustic module for executing motor plans and generating auditory and somatosensory feedback information (Kröger and Cao, 2015). In this study a “model language” based on three auditory and motor realizations of 70 monosyllabic words has been trained in order to simulate early phases of speech acquisition (babbling and imitation).
We were able to show that (i) the emergence of phonetic-phonological features results from an increasing degree of ordering of syllable representations within the action repository and that (ii) this ordering or arrangement of syllables is mainly shaped by auditory information. Somatosensory information helps to increase the speed of learning. Consonantal features like place of articulation, in particular, are learned earlier if auditory information is accompanied by somatosensory information. It can be concluded that somatosensory information, as it is generated already during the babbling and the imitation phase of speech acquisition, is very helpful, especially for learning features like place of articulation. After learning is completed, acoustic information together with semantic information is sufficient for determining the phonetic-phonological information from the speech signal. Moreover, it is possible to learn phonetic-phonological features like place of articulation from auditory and semantic information only, but not as fast as when somatosensory information is also available during the early stages of learning.

Keywords: neural model simulation, speech production and acquisition, speech perception, neural self-organization, connectionism and neural nets

Reviewed by: Joana Cholin, Bielefeld University, Germany; Jason W. Bohland, Boston University, United States
*Correspondence: Bernd J. Kröger [email protected]; [email protected]
Specialty section: This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology
Received: 09 January 2019
Accepted: 07 June 2019
Published: 10 July 2019
Citation: Kröger BJ, Bafna T and Cao M (2019) Emergence of an Action Repository as Part of a Biologically Inspired Model of Speech Processing: The Role of Somatosensory Information in Learning Phonetic-Phonological Sound Features. Front. Psychol. 10:1462. doi: 10.3389/fpsyg.2019.01462

INTRODUCTION

Speaking starts with a message which the speaker wants to communicate, followed by an activation of concepts. This process is called initiation. Subsequently, concepts activate words which may be inflected and ordered within a sentence with respect to their grammatical and functional role. This process is called formulation and starts with the activation of lemmas in the mental lexicon