Statistical Techniques for Talker Identification

By P. D. BRICKER, R. GNANADESIKAN, M. V. MATHEWS, Miss S. PRUZANSKY, P. A. TUKEY, K. W. WACHTER and J. L. WARNER

(Manuscript received November 6, 1970)

THE BELL SYSTEM TECHNICAL JOURNAL, APRIL 1971

This paper presents an overview of work on statistical formulations and analyses concerned with the problems of identifying persons on the basis of spectral energy representations of acoustical utterances. The investigation was largely empirical, and the specific features of the statistical techniques [...] have been developed in the context of analyzing two specific bodies of data. The problems and procedures to be discussed include: (i) data condensation and summary statistics; (ii) [...] practical criteria for classification and discrimination; and (iii) strategies for handling identification of talkers in relatively large populations.

Most of us can perhaps recall the experience of identifying a friend on the telephone from a relatively short utterance such as the word "hello." This suggests that even short utterances contain sufficient information for identification, and it is an interesting problem to inquire whether valid, objective, automatic and concise methods can be developed for talker recognition. The authors of at least [...] papers in the last [...] years have reported experiments [...]. Using a variety of approaches to different aspects of the problem, these experimenters have met with strikingly uniform success [...].

Previous studies may be classified into two groups according to whether the problem considered was verification (is the speaker who he claims to be?) or identification (assign an unknown utterance to one person in a given group of speakers). While two studies of the first kind involved 84 voices or less, the third and largest (108 voices) was quite successful, achieving an average of 98 percent correct verifications. Four studies of the second type used quite different
bases for identifying small (10-30 voices) populations, ranging from spectral analyses averaged over a whole word to computations of [...]. Whereas all the above studies required the talkers to utter prescribed text, three others have achieved from 90 to 100 percent correct identification with no restrictions on what the speaker says, provided that a sufficient quantity of speech from each talker is available. Again the procedures employed in these last three studies have differed widely: spectral analyses of the speech, [...] and [...] points in various frequency bands were used with their different recognition schemes.

These results with small populations suggest that the speech signal contains so much information about the talker that he can be identified from among a few by a variety of procedures, and that we can learn from this experience relatively little about ways of representing the signal and reaching a decision. The only one of these studies that used a hundred or more voices required only that each unknown be assigned to one of two groups (genuine or impostor); there have been no tests of identification in large populations.

This paper describes the evolution of work addressed to both the small and large population identification problems by a group of people, including the present authors, over the last few years. Aside from the present authors, others who have assisted in different facets of this work are: Mrs. M. [...], Miss [...] P. Hughes, T. L. DeGasine, B. S. Pinkham and M. B. Wilk.

The work to be described here evolved empirically and experimentally in the context of analyzing two bodies of data.
With no general theory being available on which to design a procedure for talker identification, this work relied heavily on analyses of data, not only to suggest ideas and techniques of possible relevance but also to assess the performance of any scheme. Thus the performance criteria are observed proportions of correct recognition in the specific data sets utilized in the procedures, rather than any general theoretical optimality properties. The overall approach has proved to be practical and productive, and most of the resulting ideas and methods are fairly obvious, especially after the fact.

The presentation of the data analysis and decision processes may be seen in four parts:

(i) The data: the two bodies of data studied will be described, the basic digital format of an acoustical utterance will be described and displayed, and features of the data will be discussed.

(ii) Data condensation: several primitive procedures will be discussed for deriving lower-dimensional representations from the original data.

(iii) Definition of a space and metrics: the unknown, and the various candidates for assigning it to, may be represented in the space of the summary statistics, and several metrics may be specified for measuring the distance between the unknown and the candidates for identification.

(iv) Classification schemes and strategies for identification: procedures for assigning an unknown in a relatively small population of contending speakers, as well as statistical strategies for identification in relatively large populations of speakers.

The two bodies of data involved in the study both dealt with repeated utterances of single words. The first set of data (cf. S.
Pruzansky for a description) is from ten talkers, each of whom yielded several repetitions of ten words commonly used in telephone conversations. The actual utterances were excerpted from sentences in which the words were embedded and the talkers, in fact, read the sentences. For most talker-word combinations there were seven replications, with [...]; 698 utterances were available instead of 700 (= 10 × 10 × 7).

The second body of data, which was collected subsequent to promising results obtained with the first set of data, deals with a population of 172 speakers, each of whom repeated each of five digit names (one, two, three, four and [...]) five times. The words were uttered in isolation rather than being embedded in sentences. The second body of data involved many more speakers, fewer words and fewer replications relative to the first set of data.

Whereas the first set of recordings was made under carefully controlled conditions (see Ref. 5), the second set was made in an [...] booth in a busy [...]. Although a high-quality microphone was used, it was housed in a telephone handset, held at a short (but variable) distance from the lips. Automatic equipment controlled a display in the booth which cued the talkers as to which digit name to say and when. In both cases, all utterances by a given talker were recorded in [...].

In the present report, the displays and examples are drawn from analyses of both bodies of data, and the presentation will switch back and forth between the two sets of data.

The most basic form of the data is just the audio recordings. However, for purposes of analysis, the audio recordings were fed into an analog filter bank and the filter outputs were sampled at frequent intervals of time ([...]-millisecond intervals in the first body of data and [...]-millisecond intervals in the second set).
In the first set of data, the outputs from 17 frequency channels covering a range of 100 to 7000 Hz were obtained; the first 16 channels were approximately equally spaced along a Koenig scale from 200 to 4000 Hz, while the 17th covered the range 4000 to 7000 Hz. In the second set of data the outputs from 20 frequency channels spanning the range [...] to [...] Hz were obtained; the upper and lower cutoff frequencies of each of the 20 filters are shown on the abscissa of Fig. 8. Each audio utterance input thus yielded a certain number (17 in the first set of data and 20 in the second) of discrete time series as output, with each series representing the energy in a specific frequency band as it varies across time. Together the series represent the short-time spectrum of the utterance.

Thus the basic digital form of the data for an utterance consists of a matrix of spectral energies classified according to frequency band at each of a sequence of time intervals. (See Pruzansky & Mathews for a description of energy-frequency-time quantization.) Table I is an example of a data matrix from the second set of data.

One can obtain pictorial representations of such a matrix. The classic representation is the sound spectrogram, which is unfortunately not in a form easily read by computers. Figure 1 shows a contour plot of log energy as a function of time and frequency; it was obtained via a computer program from a data matrix. Although derived in a straightforward way from computer-readable data, this plot conveys some of the visual aspects of the sound spectrogram.

[Table I: Data matrix for an utterance [...]; table body not recoverable.]

Some comments on certain aspects of the data are in order:

(i) The total volume of data is large.

(ii) The basic digital representation of the data for an utterance is intractably high in dimensionality for performing statistical analyses. (For the first set of data the matrices were 17 × 60 (approx.), or 1020-dimensional, per utterance; for the second they were 20 × 273 (approx.), or 5460-dimensional, per utterance.)
(iii) The general level of the energies may shift from utterance to utterance of even the same speaker due to extraneous reasons. (Loudness level may vary, for example, because of varying proximity to the microphone, etc.)

(iv) There is no natural time origin for the data and its specification is arbitrary in that it [...] does not depend on the actual commencement of the utterance; this implies lack of alignment of the data for different utterances of a given word even by the same speaker.

The conjunction of the above four issues conveys certain implications for the data analyses. First, it is essential, even for exploratory investigations, to pay attention to practicality and efficiency in computer procedures. Second, it is crucial to find effective lower-dimensional representations of the data using methods of summarization that will be of general utility both for different persons and for different words. Finally, adjustment must be provided for extraneous effects, such as energy-level variation and arbitrariness of the time origin. Such adjustments may be accomplished either by treating the data prior to analysis or by using statistical procedures which make provisions for the extraneous effects. Thus, for example, energy-level variation can be handled either by normalizing the energies so that their sum is unity for each utterance or by using classification procedures which allow for level changes amongst the replicated utterances of a speaker (cf. [...] & M. B. Wilk). Similarly, the arbitrariness of time origin may be handled either by aligning the utterances by some criterion, such as the one used by Pruzansky with the first body of data, or by using only origin-invariant time information in later analyses.

The high dimensionality of the basic quantitative representation of an utterance (viz., the matrix of spectral energies) is not only computationally untenable and conceptually difficult but also perhaps unnecessary. One would expect that the high physical and statistical correlations among the energies should imply redundancy.
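As a minimal sketch of the first option mentioned above (normalizing each utterance so that its energies sum to unity), the following Python fragment may help. It assumes that the energies are on a linear scale and that a change in level acts as one common multiplicative factor over the whole matrix; the function name normalize_energies and the toy values are illustrative only and do not come from the original study.

```python
def normalize_energies(matrix):
    """Scale a channels-by-frames energy matrix so that all entries sum to 1.

    Assumption: energies are non-negative and on a linear (not log) scale.
    """
    total = sum(sum(row) for row in matrix)
    if total <= 0:
        raise ValueError("energy matrix must have a positive total")
    return [[value / total for value in row] for row in matrix]


# Toy utterance: 3 frequency channels by 4 time frames (made-up values).
utterance = [
    [1.0, 2.0, 3.0, 2.0],
    [0.5, 1.5, 2.5, 1.0],
    [0.2, 0.3, 0.6, 0.4],
]

# The same utterance at a higher overall level, modeled here as a
# common gain applied to every energy (an assumption of this sketch).
louder = [[4.0 * value for value in row] for row in utterance]

norm_a = normalize_energies(utterance)
norm_b = normalize_energies(louder)
```

Under these assumptions the two normalized matrices agree entry by entry, which is the sense in which the normalization removes an overall level difference between replicated utterances. It does nothing about the arbitrary time origin, which needs a separate adjustment.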
The limited number of replications available would, moreover, pose a mathematical constraint on usable dimensionality. For all these reasons, some summarization is necessary. The choices for summary statistics are many and the consequences important.

Various schemes for condensing the information in terms of manageably low-dimensional statistics were studied. Table II shows a list of some of the types of information summaries that were investigated. For instance, summarizing via the time margin means considering the energies (normalized) collapsed onto the time scale alone without any frequency breakdown. Similarly, frequency margin means that the energies (normalized) are summed over all the time intervals, thus eliminating the information about time variation of the spectrum. Looking at frequency slices implies the consideration of the energy [...] in each of the frequency channels.

With each of these ways of looking at the data, several alternate methods were investigated for summarizing the information. For instance, in studying the time margin, both the energies themselves and characterizations of their distribution over time in terms of certain low-order moments (mean, standard deviation, etc.) were investigated. The distribution of energy within a frequency slice, however, was typified either by the locations of its two tertiles (the time values which divide the energy distribution into three equal parts) or by the median time and the intertertile distance. These two time-dependent characterizations [...] origin invariant. (See [...] for more details concerning the reduction and analysis of the first set of data.)

[Table II: Representations of data. Entries named in the text include the time margin, the frequency margin and frequency slices; the remaining entries are not recoverable.]

One of the important summaries from the standpoint of performance in identification proved to be the frequency margin of normalized energies. This is a 17-dimensional representation with the first body of data and a 20-dimensional representation in the second set. To illustrate how this summary representation may look, Figs.
2a and b each show the normalized energies in the frequency margin for all the utterances of a word by a specific talker. Figure 2a is for one speaker and Fig. 2b is for another. Qualitative and quantitative differences between these speakers are evident as viewed against the relative cohesiveness of the different utterances within a speaker.

Each scheme for summarizing the data led to a set of input statistics whose values for each utterance yield a vector corresponding to that utterance. The studies involved designating certain of the utterances from each speaker as unknown and treating the remaining utterances as the reference set of known utterances to be used for purposes of statistical estimation of the features of the reference population.

Thus, as shown in Table III, corresponding to the uth reference utterance (i.e., the talker is known) of a specific word by the ith talker, one would have a p-dimensional vector of input statistics, $Y_{iu}$, with $Y_{iu}' = (y_{iu1}, y_{iu2}, \ldots, y_{iup})$, where the jth element of the vector is the value of the jth input statistic for the uth utterance of the ith talker. There are [...] talkers in all and [...] known utterances from the ith talker. The known utterances of the ith talker may then be used, as shown in Table III, to obtain the p-dimensional centroid, $\bar{Y}_i$, and the p × p covariance matrix, $S_i$, for the ith talker.

Corresponding to an unknown utterance (i.e., the talker is unknown and is to be identified), which is known only to be an utterance of some one of the talkers in all, one would similarly have a p-dimensional representation, shown in Table III as $Z' = (z_1, z_2, \ldots, z_p)$.

Also shown in Table III are the overall centroid $\bar{Y}$ and two matrices B and W. B is a measure of the dispersion of the speaker centroids in p-space and is called the between-talkers covariance matrix. W is a pooled measure of dispersion of the reference known utterances around the talker centroids and is called the within-talkers covariance matrix.

If a metric or distance measure were defined in the p-dimensional space of the input statistics, then one could calculate the distance of the unknown, viz., $Z$, from each of the centroids, viz.,
$\bar{Y}_i$'s, of the different talkers and then use these distances to assign the unknown to one of the talkers.

[Fig. 2: Frequency margin of normalized energies for all utterances of a word by a talker; panels (a) and (b) show two different talkers.]

All the measures of squared distance used in our work were positive semidefinite quadratic forms, a class whose typical member may be algebraically defined as shown in eq. (0) of Table IV. This class includes not only the usual unweighted Euclidean squared distance (M = I) and the weighted Euclidean squared distance, which makes allowances for unequal variances of the different variables, but also measures of squared distance which allow for correlations among the variables. Figure 3, dealing with the case of two variables, shows an appropriate manner of measuring squared distance when the correlation is positive. According to such an elliptical measure of squared distance, points like P1 and P2 which lie on the same ellipse are considered to be the same distance away from the center C of the ellipse, whereas points like [...] which lie on the different ellipses numbered 1, 2 and 3 are considered to be at increasing distances away from C. The way to reflect this choice formally in the definition of squared distance is to use for M the inverse of an estimate of the covariance matrix of the variables.

Table IV also shows three specifications of the matrix M that lead to three squared distance measures $D_1$, $D_2$ and $D_3$ shown, respectively, as equations (1), (2) and (3).

The choice of M that leads to $D_1$ uses each talker's own covariance matrix in measuring the distance of the unknown from that talker's

[Table III: Notation and formulas for reference items; table body not recoverable.]