
BSTJ 60: 7. September 1981: Digital Signal Processor: Architecture and Performance. (Boddie, J.R.; Daryanani, G.T.; Eldumiati, I.I.; Gadenz, R.N.; Thompson, J.S.; Walters, S.M.) PDF



Digital Signal Processor: Architecture and Performance

By J. R. BODDIE, G. T. DARYANANI, I. I. ELDUMIATI, R. N. GADENZ, J. S. THOMPSON, and S. M. WALTERS

(Manuscript received July 14, 1981)

This paper describes the DSP, a recently developed integrated circuit implementing a programmable digital signal processor. The single-chip device is fabricated in depletion-load NMOS and is packaged in a 40-pin DIP. It has the speed, precision, and flexibility for a variety of telecommunication applications. The processor can decode an instruction, fetch data, perform a 16- by 20-bit multiplication and a full 36-bit product accumulation in one machine cycle of 800 ns. This permits the realization of signal processing functions of such applications as dual-tone multifrequency receivers or low-speed data modems with a single device. The arithmetic precision of the processor is also sufficient for many voice signal applications.

I. INTRODUCTION

Digital signal processing has become more and more important in telecommunications. As new products and services are offered, the amount of requisite signal processing continues to increase.
In addition, signals are becoming digital, especially in applications where the superior stability and accuracy of these signals is either necessary or more attractive. Digital signal processing is also prompted by the introduction of digital switching offices and digital transmission systems. It is made possible by the continuous, rapid growth of the silicon LSI and VLSI capabilities. The latter have made it inexpensive to build complex processors, so inexpensive that it is cost-effective even to use A/D conversion and digital signal processing in some analog systems. We indeed visualize the extension of the digital network all the way to the subscriber phone.

In this paper, we describe a single-chip digital signal processor recently developed for Bell System use. The device, known as Digital Signal Processor (DSP), is a general-purpose building block which can be programmed to perform a variety of digital signal processing functions. Examples of these are filtering, equalization, modulation, tone detection, speech coding, and Fast Fourier Transform. The DSP is fabricated in depletion-load NMOS and packaged in a 40-pin DIP. It is customized to perform specific signal processing functions by means of an on-chip read-only memory (ROM) containing the program and fixed data. The device also contains a random access memory (RAM) for writing and reading variable data, a Control Unit, an Arithmetic Unit (AU), an Address Arithmetic Unit (AAU), and appropriate Input/Output (I/O) circuitry. The DSP functions in a stand-alone manner, requiring only an external 5-MHz resonator or clock, or it may be directly interfaced with other processors to achieve a greater degree of signal processing capability.

The DSP programmability makes the device useful for a variety of telecommunication applications, and results in a shorter and less expensive system development cycle. Key elements in digital signal processing are adequate numerical precision and high computation rates. The DSP offers both.
Its 16- by 20-bit multiplier and 40-bit adder, operating at 1.25 million operations per second, are unparalleled in other [illegible] processors.

The general DSP architecture is described in Section II. Section III centers on the DSP programming and includes a brief description of the instruction set. An example of a simple program is also given to illustrate the style of the input language. In Section IV, the DSP I/O interface is described. Finally, an overview of the DSP performance in typical filtering applications is given in Section V.

II. ARCHITECTURE

This section presents a description of the DSP architecture. As shown in Fig. 1, the principal features are as follows:

(i) a 1024-word by 16-bit ROM for instructions and fixed data;
(ii) a 128-word by 20-bit RAM for variable data;
(iii) an AAU which generates addresses for the ROM and RAM memories and performs post-modification arithmetic on these addresses;
(iv) an AU which accepts a 16-bit and a 20-bit operand to form a 36-bit product, accumulates the product with a 40-bit accumulator, and rounds the accumulator to a 20-bit word (with overflow protection) for storage or output;
(v) an I/O section which serially receives and transmits either µ255-law or linear PCM signal samples; and
(vi) a Control Unit for instruction decoding and overall system coordination.

1450 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1981

The DSP is also able to access a 1024-word by 16-bit external ROM, with no reduction in processing speed. This feature is especially convenient during program development and testing. It is also useful for small-volume applications in which the expense of programming the on-chip ROM is not justifiable.

The analysis of many digital signal processing algorithms reveals that they basically perform multiplications and additions. Therefore, the AU was designed to implement these operations efficiently. In its simplest form, the expression evaluated by the AU is

[equation illegible in the scan]

where x is the 16-bit coefficient in register X, and y is the 20-bit data word in register Y.
Again, the word lengths for x and y were determined by examining the requirements of a variety of telecommunication applications. A good compromise was established between the hardware required to implement a given precision and the need for a general part, like the DSP, to cover most applications.

The 36-bit product, p, is summed with the 36 least significant bits of the contents, a, of the 40-bit accumulator, A, and the result is written in A. When the AU is required to output to the data bus (e.g., to write to memory), the contents of A are truncated or rounded, overflow-corrected (if necessary), then stored in the 20-bit AU output register, W. The contents of W can then be transferred to other parts of the DSP via the data bus, or used as data for another arithmetic operation.

The AU is pipelined in three stages [remainder of sentence illegible in the scan]. Thus, while this transfer is in progress for any one expression, the addition of the product in p to the contents of A for the next expression is also being performed, and the formation of the product of the following expression is taking place. This pipelined structure keeps all parts of the AU busy at all times and allows the processor to maintain high throughput.

The full capability of the AU is described by the more general expression

[equation illegible in the scan]

where

x = 16-bit coefficient which is read into register X from the 16 most-significant bits of the 20-bit data bus. This coefficient is normally fetched from ROM, but could also be fetched from RAM or the input buffer.

y = 20-bit data word which is fetched from RAM or the input buffer, and is read into register Y from the data bus.
[One sentence illegible in the scan.]

a = contents of the 40-bit accumulator A. The four extra bits are provided for overflow protection.

w = rounded or truncated 20-bit AU output word which is stored in register W. The content of W can be written on the data bus for storage in RAM or for output through the output buffer, or can be used instead of y in another arithmetic operation. The least significant bit of w corresponds to bit 14 of a. This selection is consistent with the assumption that y and w are integers and that x is usually restricted by -2 ≤ x < 2. However, other choices are possible by shifting a before rounding.

[symbol illegible] = linear or nonlinear function of either y or w, such as the actual value, the absolute value, or the sign function (signum) of one of these variables.

f = arithmetic function of a (e.g., scaling of a by 2 or 8) or a logical function of a and p.

The 16-bit processor instructions are stored in the ROM. When coefficients are fixed, they will also reside in ROM. Data for the algorithm, whether it comes from the input or is generated by the algorithm, must be stored in the 20-bit-wide RAM. In some applications (e.g., adaptive filters as required in echo cancelers), the coefficients are variable and are stored in the 16 most-significant bits of RAM locations.

Addresses for memory references are generated in the AAU. Four memory addresses, required to access the instruction, the coefficient, and the data (both read and write), are multiplexed onto the [illegible]-bit address bus in each processor cycle, as the corresponding information is multiplexed onto the 20-bit data bus. The program in ROM is accessed by the address stored in register PC, the program counter. Fixed coefficients in ROM can also be addressed by PC. Alternatively, coefficients can also be addressed by the auxiliary register RX, which can point to either ROM or RAM. Data, which is read from RAM, can be addressed by RY or by an auxiliary register RYA, while AU results can be written to RAM by using addresses in RD or RDA.
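Returning to the arithmetic unit described earlier in this section, its data path can be illustrated with a short Python sketch. This is only a behavioral model under stated assumptions, not the device's actual logic: the word lengths (16-bit x, 20-bit y, 40-bit accumulator, 20-bit w aligned to bit 14 of a) come from the text, while the function names, the round-half-up rule, and the use of saturation as the "overflow correction" are assumptions made for illustration.

```python
X_BITS, Y_BITS, ACC_BITS, W_BITS = 16, 20, 40, 20
W_LSB = 14  # accumulator bit that lines up with the LSB of w (from the text)


def wrap(value, bits):
    """Interpret value as a two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, x, y):
    """One multiply-accumulate step: a <- a + x * y.

    x is a 16-bit coefficient, y a 20-bit data word; their product fits
    in 36 bits and is added into the 40-bit accumulator.
    """
    product = wrap(x, X_BITS) * wrap(y, Y_BITS)
    return wrap(acc + product, ACC_BITS)


def to_output(acc, rounding=True):
    """Form the 20-bit output word w from the accumulator.

    Assumptions: rounding adds half an output LSB before the shift, and
    overflow is handled by saturating to the 20-bit range.
    """
    if rounding:
        acc += 1 << (W_LSB - 1)
    w = acc >> W_LSB  # arithmetic shift: Python ints keep the sign
    hi = (1 << (W_BITS - 1)) - 1
    lo = -(1 << (W_BITS - 1))
    return max(lo, min(hi, w))


if __name__ == "__main__":
    coeff = int(1.5 * 2**14)  # x = 1.5 with 14 fractional bits
    acc = mac(0, coeff, 1000)  # y = 1000 as an integer
    print(to_output(acc))
```

With x scaled by 2^14, the coefficient range is -2 ≤ x < 2 and the result w is again an integer, which is the alignment the text describes.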
The primary use of the auxiliary registers RYA and RDA is to allow manipulation of temporary results in a separate section of data memory.

The AAU also provides a selection of possible increments for post-modification of these addresses. Under the direction of a given instruction, the contents of the address registers are applied to the address bus and then incremented in the AAU adder before being restored for subsequent use. The program and coefficients can be structured in ROM so that the contents of PC need only be incremented by +1. The contents of other address registers can be incremented by the amount 0, +1, or -1, or by the contents of the I, J, or K registers, as specified by the instruction. The program return register (PR) shown in Fig. 2 is used to provide a single-level subroutine capability. The LC register is an [illegible]-bit loop counter used to provide looping within an algorithm. All the registers mentioned above can be set to arbitrary values. This can be done unconditionally or subject to a particular condition being met.

Instructions from ROM are latched into the instruction register IR and subsequently decoded in the Control Unit. In some auxiliary instructions, such as a register set from ROM or a register load from RAM, a 16-bit number follows the instruction; this argument goes to register X or Y, respectively. The decoded signals are transferred from the Control Unit to the AU, the AAU, the I/O, and the various registers, as needed. Arithmetic control information that is relatively invariant within an algorithm (e.g., the type of rounding arithmetic used in transferring data from A to W, or the built-in scale factor used in some multiply operations) is stored in the AUC register. The IOC register stores a similar type of information for the I/O (e.g., the I/O rate, or the size and format of the input and output data words).

The data interaction between the DSP and the outside world is carried out through the I/O structure. Input and output are processed by the buffer registers IBUF and OBUF, respectively.
The I/O interface accommodates a serial transfer of 8- to 20-bit words under the control of either the DSP or a variety of external devices (e.g., codecs, microprocessors). Different rates and formats are available to the programmer to facilitate this interfacing. Additional details will be given in Section IV.

The setting (under program control) of register SYC allows the user to suspend the DSP operation until a condition specified by one of the fields of this register is met. This can be used to synchronize the processor program with the data sample rate. The available conditions are input buffer full, output buffer empty, or the status of one of the two dedicated logical inputs, C0 and C1. A control input, [name illegible], can be used to latch internally the values of C0 and C1. The setting of register SYC also allows the user to output directly one or two logical signals and/or a synchronization pulse [signal names illegible].

The serial I/O and its control require another ten pins; they are described in Section IV. Sixteen of the 40 pins (DB00-DB15) are dedicated to the external data bus, which is used to access external ROM in place of the internal, mask-programmable ROM. The remaining eight pins are allotted as follows: (i) two for the +5 V power supply (VCC) and ground (VSS); (ii) three for the crystal connection, the clock input, and the clock output; (iii) one for resetting the DSP to a starting point; and (iv) two for external memory control. The external ROM is accessed by setting one of the external memory control pins low; the other, combined with [illegible], allows the generation of signals needed to latch the address coming out of the DSP through the bus pins, latch the data fetched from the external ROM, and enable these data onto the data bus pins.

III. PROGRAMMING

The DSP has two types of instructions: normal and auxiliary. The normal instructions control processor computations in the AU and evaluate the general expression given in the last section.
The three AU operations of product formation, accumulation, and transfer to the AU output register w (if required) are fully completed in one cycle of the processor. The operations are performed in parallel, each one corresponding to the partial evaluation of a different expression.

For a normal instruction, a typical symbolic assembler input line consists of up to four expressions indicating
(i) the source and destination of the data to be transferred out of the AU, with the destination address register increment,
(ii) the control of the AU output register,
(iii) the function to be performed by the accumulator, and
(iv) the product to be formed by the multiplier, including a specification of the operand address registers and increments.

When program constants are used for product operands, they may be indicated directly in the expression rather than indirectly through an address register.

At the machine level, the 16-bit instruction has fields that control the above-mentioned functions, including the information needed to read the coefficient and data required in a later AU operation, and to store the result of a previous AU operation. Constants to be loaded into the x register are also 16 bits wide and are stored in ROM following the corresponding instruction.

Auxiliary instructions generally control noncomputational aspects of the DSP, such as initialization of address registers and [illegible] of certain processor functions. They can also perform an additional set of computational operations for the AU, such as compressed/linear conversions or large shifts of the accumulator contents, which do not fit the general argument list used in normal computations. The assembler input for these instructions [illegible] the special functions that they specify, as can be seen in the register set instructions of the example below. At the machine level, a 16-bit auxiliary instruction is always followed by a 16-bit argument which is interpreted either as an extension of the instruction itself or as data associated with the instruction.
Both normal and auxiliary instructions have common fields that allow writing of previous results or fetching of information required for later [illegible].

Many features of the DSP are illustrated in a simple example of a fourth-order recursive filter shown schematically in Fig. 2 and in the assembler input code below. The filter has a µ-law input from the input buffer, two four-multiply second-order sections, and a linear output to the output buffer. The program begins at line 1 with a series of auxiliary instructions for initializing the DSP. The first seven instructions are unconditional register set operations. The constants IOC and AUC, to be written into the corresponding registers, reflect the desired conditions for I/O and AU operation. The next three registers are set to 1, 1, and -3, respectively. Registers RY and RD are set to 0, the address of the first RAM location. The constant [illegible], to be written into the respective register, reflects the desired condition for suspending the DSP operation.

[Listing: Assembler input code for fourth-order recursive filter. The listing is not legible in the scan.]

The instruction in line 8 is the first instruction in the main operating loop of the program. (This loop processes each sample through the filter.) Its function is to suspend the processor until the selected external synchronizing event occurs. This is the method used in this example for synchronization with an external sample rate clock. Lines 9 and 10 are auxiliary instructions which perform the µ-law to linear conversion. This conversion is done on data which was fetched from IBUF. The accumulation, transfer to w, and write to OBUF in lines 9, 10, and 11 refer to the operations that were begun at the end of the loop.
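Since the assembler listing itself did not survive, the filter structure named in the text (a fourth-order recursive filter built from two four-multiply second-order sections) can be sketched in Python. This is a minimal sketch under assumptions: it uses floating-point arithmetic rather than the DSP's fixed-point data path, it assumes each section is a direct-form II biquad with unity gain on the undelayed term (which gives exactly four multiplies and two state variables per section), and the names `make_cascade`, `a1`, `a2`, `b1`, `b2` are hypothetical and do not come from the paper.

```python
def make_cascade(sections):
    """Build a per-sample filter from second-order sections.

    sections: list of (a1, a2, b1, b2) coefficient tuples (hypothetical
    names). Each section keeps two state variables, so two sections give
    a fourth-order filter with four state variables in total.
    """
    state = [[0.0, 0.0] for _ in sections]  # [z1, z2] for each section

    def step(sample):
        for (a1, a2, b1, b2), z in zip(sections, state):
            # Recursive part: two multiplies on the delayed state.
            w = sample - a1 * z[0] - a2 * z[1]
            # Non-recursive part: two more multiplies on the same state.
            sample = w + b1 * z[0] + b2 * z[1]
            # Shift the delay line for the next sample.
            z[1], z[0] = z[0], w
        return sample

    return step


if __name__ == "__main__":
    # Example coefficients chosen only to make the response easy to follow.
    filt = make_cascade([(-0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)])
    impulse = [1.0, 0.0, 0.0]
    print([filt(s) for s in impulse])
```

Each call to `step` plays the role of one pass through the main loop: one input sample in, one output sample out, with the state variables carried between passes.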
The practice of meshing the tail of the loop with its head is essential for writing low-overhead code for this pipelined machine.

The RAM memory organization for this program is shown below:

Location    Variable
0           Z [subscript illegible]
1           Z [subscript illegible]
2           Z [subscript illegible]
3           Z [subscript illegible]

where the Z's are the state variables shown in Fig. 2. The values in registers I and J are used to modify the addresses in registers RY and RD so that these variable locations may be referenced. The K register resets these addresses after they are used for the last time in the loop with no additional overhead. The filter coefficients (B12, B11, ...) are stored in line with the code.

The instruction that sets the PC for the end-of-loop branch is in line 19 instead of at the actual end of the loop. This is because of the pipelined architecture. When the machine is executing the branch instruction at line 19, it is already decoding the instruction at line 20 and fetching the instruction at line 21. Therefore, the next instruction to be fetched will be at line 8. In this example, there are only two DSP cycles of overhead in the loop for setting of SYC and PC. The total loop has 14 cycles and can accommodate a sample rate of up to 89 kHz (14 cycles of 800 ns each take 11.2 µs per sample).

IV. INPUT/OUTPUT INTERFACE

The DSP architecture is designed to facilitate system interface with a minimum number of external components, if any. The I/O transfer of information is performed serially. The DSP I/O structure provides serial-to-parallel conversion of input data, and parallel-to-serial conversion of output data. Input and output operations are carried out in independent sections, thus permitting them to be asynchronous with respect to each other, as well as with respect to the program execution.

Several signals control the I/O transmissions. Five DSP pins are dedicated to the input serial transfer and its control, and five pins are dedicated to the output. The beginning of a serial transfer is indicated by a synchronization signal present at the ISY (input synchronization) pin for an input, or at the OSY (output synchronization) pin for an output.
Input data bits are received at pin DI and advanced into the IBUF register of the DSP by the clock signal present at pin ICK (input clock). Output data bits are available at pin DO and are shifted out of the OBUF register of the DSP by the clock signal present at pin OCK (output clock). The two enable lines, CTR (not clear to read) and CTS (not clear to send), can be used to activate the input and output sections, respectively. A high level on one of these pins causes the DSP to be inactive on all the pins associated with that particular section, and tristates the corresponding off-chip drivers. This allows several DSPs to be switched on and off a single external bus. The flags IBF (input buffer full) and OBE (output buffer empty) indicate the status of the respective buffers. These flags can be used [illegible] the DSP and its peripherals. They can also be internally tested by [illegible].

The DSP I/O unit is programmable via the 10-bit IOC register. This register configures the input and output sections of the DSP to either generate or accept the clock and synchronization signals. If a section of the DSP generates these signals, it is said to be in the ACTIVE mode; otherwise, it is in the PASSIVE mode. The IOC also controls the length of the serial data transfer to be either 8, 16, or 20 bits. In addition, the IOC controls the I/O clock rate for active mode. The rate can be either [illegible] or [illegible] of the DSP input clock. Finally, for both input and
