ebook img

Methods and apparatus for compiling instructions for a data processor PDF

85 Pages·2013·3.9 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Methods and apparatus for compiling instructions for a data processor

US008166450B2 (12) United States Patent (10) Patent No.: US 8,166,450 B2 Fuhler et a1. (45) Date of Patent: Apr. 24, 2012 (54) METHODS AND APPARATUS FOR 4,827,427 A 5/1989 Hyduke _ COMPILING INSTRUCTIONS FORA DATA 2 1g; [lime-tell a1 a a SuIIll e . PROCESSOR 5,088,034 A 2/1992 Ihara et al. 5,105,353 A 4/1992 Charles et al. (75) Inventors: Richard A. Fuhler, Santa Cruz, CA 5,247,668 A 9/1993 Smith et al, (US); Thomas J. Pennello, Santa Cruz, 5,249,295 A 9/1993 Briggs et a1. CA (Us); Michael Lee Jalkut’ Santa 5,274,818 A 12/1993 Vasilevsky et a1. Cruz, CA (US); Peter Warnes, London (Continued) (GB) FOREIGN PATENT DOCUMENTS (73) Assignee: (Sgrslppsys, Inc., Mountain View, CA W0 W0 03/091914 A1 11/2003 ( * ) Notice: Subject to any disclaimer, the term of this OTHER PUBLICATIONS patent is eXIended 01‘ adjusted under 35 G. J. Chaitin, “Register Allocation $ Spilling Via Graph Coloring”, U.S.C. 154(b) by 989 days. 1982 ACM 0-89791-074-5/82/006/0098.* (21) Appl. No.: 11/906,519 (Continued) (22) Filed: Oct. 1, 2007 _ _ Primary Examiner * Philip Wang (65) Prior Publication Data (74) Attorney, Agent, or Firm * FenWick & West LLP US 2008/0320246 A1 Dec. 25, 2008 (57) ABSTRACT Related US. Application Data (62) Division of application NO 10/330,632, ?led on Dec‘ Methods and apparatus optimized for compiling instructions 26 2002 HOW Pat' No‘ 7 278 137' in a data processor are disclosed. In one aspect, a method of _ _ _ _ address calculation is disclosed, comprising operating a com (60) PIOVlSlOnal 81313110201011 NO- 60/343,730, ?led 011 Dee- piler to generate at least one instruction; canonicalizing the 26, 2001- address calculation in a plurality of different approaches: in one exemplary embodiment, the ?rst approach comprises (51) Int- Cl- canonicalizing the “regular” 32-bit instruction addressing G06F 9/4 4 (2006-01) modes, and the second for the “compressed” 16-bit instruc G06F 9/4 5 (2006-01) tion addressing modes. In another aspect, a plurality of func U-S. Cl. ....................... .. {ions (up to and including all available functions) are Called (58) Field of Classi?cation Search ................ .. 717/101, indirectly to alloW addresses to be placed in a constant pool. 717/ 140, 146 Improved methods for instruction selection, register alloca See application ?le for complete search history. tion and spilling, and instruction compression are provided. An improved SoC integrated circuit device having an opti (56) References Cited mized 32-bit/16-bit processor core implementing at least one of the foregoing improvements is also disclosed. U.S. PATENT DOCUMENTS 4,571,678 A 2/1986 Chaitin 4,763,242 A 8/1988 Lee et a1. 20 Claims, 8 Drawing Sheets @ ATFEMPT TO COLOR GR W/N REGIST REASSIGN REGISTERS REMAI GPRs 1 mm COPIES BEIWEEN BANKS "CLEAN uP" SPILL; US 8,166,450 B2 D>D>D>D>D>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Page 2 U.S. PATENT DOCUMENTS 2002/0116603 A1 8/2002 Kissell 2003/0140337 A1 7/2003 Aubury 5,287,510 2/1994 Hall et al. 2003/0225998 A1 12/2003 Khan et al. 5,293,631 3/1994 Rau et al. 5,339,424 8/1994 Fushimi OTHER PUBLICATIONS 5,428,793 6/1995 Odnert et al. 5,448,746 9/1995 Eickemeyer et al. Intel Pentium Processor Family Developer’s Manualivol. 3: Archi 5,450,585 9/1995 Johnson tecture and Programing Manual (1995) (cover pages, pp. 3-15 to 5,481,723 1/1996 Harris et al. 5,488,710 1/1996 Sato et al. 3-18, pp. 25-189 to 25-192). 5,546,552 8/1996 Coon et al. Article entitled “The JCN Embedded Processor” by AT&T Labora 5,555,417 9/1996 Odnert et al. tories Cambridge (modi?ed Jun. 6, 2002, © 2001) (3 pages) (http:// 5,586,020 12/1996 Isozaki www.uk.research.att.com/jen/). 5,636,352 6/1997 Bealkowski et al. The History of Language Technology in IBM by EB. Allen, IBM J. 5,638,525 6/1997 Hammond et al. Res. Develop, vol. 25, No. 5, Sep. 1981, pp. 535-548. 5,659,754 8/1997 Grove et al. ................ .. 717/158 5,740,461 4/1998 Jaggar Heads and Tails: A Variable-Length Instruction Format Supporting 5,752,035 5/1998 Trimberger ................. .. 717/153 Parallel Fetch and Decode by Heidi Pan and Krste Asanovic, Nov. 5,790,854 8/1998 Spielman et al. 16-17, 2001, 8 pages. 5,790,877 8/1998 Nishiyama et al. SilAria Puts the Squeeze on Code With Tuned Instruction Set by 5,794,010 8/1998 Worrell et al. Chris Edwards, EE Times, 2 pages, Oct. 17, 2001, http://www.eetuk. 5,809,272 9/1998 Thusoo et al. com/tech/news/OEG20011017S0018. 5,809,306 9/1998 Suzuki et al. “An Introduction to Thumb”, Version 2.0, Mar. 1995. 5,819,058 10/1998 Miller et al. The Evolution of RISC Technology at IBM by John Cocke and V. 5,828,884 10/1998 Lee et al. Markstein, IBM J. Res. Develop. vol. 34, No. 1, Jan. 1990, pp. 4-11. 5,828,886 10/1998 Hayashi ...................... .. 717/159 Improving Register Allocation for Subscripted Variables by David 5,845,127 12/1998 Isozaki 5,850,551 12/1998 Takayama et al. Callahan, Steve Carr and Ken Kennedy, copyright 2003, 2 pages. 5,864,704 1/1999 Battle et al. “TriCore 1.3, Architecture Overview Handbook”, V1.3.3, May 2002. 5,867,681 2/1999 Worrell et al. “Embedded Control Problems, Thumb, and the ARM7TDMI”; 5,870,576 2/1999 Faraboschi et al. Segas, et al.; 1995; IEEE. 5,878,267 3/1999 Hampapuram et al. Xtensa: A Con?gurable and Extensible Processor; Gonzalez; 2000; 5,881,260 3/1999 Raje et al. IEEE. 5,884,057 3/1999 Blomgren et al. An Overview of the PL. 8 Compiler by M. Auslander and M. Hopkins, 5,896,519 4/1999 Worrell IBM T.J. Watson Research Center, 1982, pp. 22-31. 5,905,893 5/1999 Worrell IEEE Std. 1076.3-1997 IEEE StandardVHDL Synthesis Packagesi 5,946,491 8/1999 Aizikowitz et al. Description, Abstract only, copyright 2004, 2 pages. 5,946,492 8/1999 Bates Global Optimization by Suppression of Partial Redundancies by E. 5,948,100 9/1999 Hsu et al. 5,961,632 10/1999 Shiell et al. Morel and C. Renvoise, CII Honeywell Bull, Communications of the 6,090,156 7/2000 MacLeod ACM, Feb. 1979, vol. 22, No. 2, pp. 96-103. 6,158,046 12/2000 Yoshida et al. The 801 Minicomputer by George Radin, IBM J. Res. Develop. vol. 6,209,079 3/2001 Otani et al. 44 No. 1/2 Jan/Mar. 2000, pp. 37-46. 6,260,189 7/2001 Batten et al. The GNU 64-bit PL8 Compiler: Toward an Open Standard Environ 6,275,830 8/2001 Muthukkaruppan et al. ment for Firmware Development by W. Gellerich, et al., IBM J. Res. 6,282,633 8/2001 Killian et al. & Dev. vol. 48 No. 3/4 May/Jul. 2004, pp. 543-556. 6,292,940 9/2001 Sato Motorola Product Brief MCF5202 Embedded Microprocessor, 1996 6,305,013 10/2001 Miyamoto Motorola, Inc., 6 pages. 6,308,323 10/2001 Douniwa PICO: Automatically Designing Custom Computers by Vinod 6,334,212 12/2001 Nakajima Kathail, et al., Hewlett-Packard Laboratories, Sep. 2002, 17 pages, 6,367,071 4/2002 Cao et al. http://www.hpl.hp.com/research/papers/2002/pico.pdf. 6,408,382 6/2002 Pechanek et al. High Performance, Variable-Length Instruction Encodings by Heidi 6,412,066 6/2002 Worrell et al. Pan, submitted to the Dept. of Electrical Engineering and Computer 6,425,070 7/2002 Zou et al. Science, Mass. Instit. of Technology, May 2002, 53 pages. 6,463,520 10/2002 Otani et al. Drinic, et al, “Code Optimization for Code Compression”, IEEE, pp. 6,463,581 10/2002 Bacon et al. 315-324, 2003. 6,477,683 11/2002 Killian et al. Ozturk et al, “Access Pattern-Based Code Compression for Memory 6,477,697 11/2002 Killian et al. Constrained Systems”, ACM Transactions on Design Automation of 6,564,314 5/2003 May et al. Electronic Systems, vol. 13, No. 4, Article 62; pp. 60:1-60:30, Sep. 6,678,818 1/2004 Co?er et al. 2008. 6,694,511 2/2004 Yokote Ozturk, et al, “Access Pattern-Based Code Compression for 6,701,515 3/2004 Wilson et al. Memory-Constrained Embedded Systems”, IEEE, Proceedings of 6,732,238 5/2004 Evans et al. the Design, Automation and Test in Europe Conference and Exhibi 6,760,888 7/2004 Killian et al. 6,763,327 7/2004 Songer et al. tion (DATE’05), pp. 1-6, 2005. Ros, et al, “A Post-Compilation Register Reassignment Technique 6,826,681 11/2004 Kissell et al. ............... .. 712/228 for Improving Hamming Distance Code Compression”, CASES’05, 6,862,563 3/2005 Hakewill et al. 6,889,313 5/2005 Co?er et al. ACM, pp. 97-104, Sep. 24-27, 2005. 7,103,881 9/2006 Stone Avaz Product Information Sheets on Mozart2 Fully Synthesizable 7,120,903 10/2006 Toi et al. DSP Core, 2 pages, [Online] [Retrieved on Jun. 30, 2003] Retrieved 7,140,008 11/2006 Chilimbi et al. from the Internet<URL :http :/ /w ww. avaz .com/p roducts/ core s/ dspcore/mozaIt2/>. 7,203,935 4/2007 Chakradhar et al. 7,278,137 10/2007 Fuhler et al. Chaitin, G., “Register Allocation and Spilling Via Graph Coloring,” 7,558,462 7/2009 Kurita Best of PLI 1979-1999, ACM SIGPLAN, 1982, pp. 66-74. 2001/0013093 8/2001 Banno et al. ................ .. 712/210 Collin, M. et al., “Low Power Instruction Fetch Using Pro?led Vari 2002/0004897 1/2002 Kao et al. able Length Instructions,” Dept. of Microelectronics and Information 2002/0073299 6/2002 Pechanek et al. Technology, 2003 Proceedings of the IEEE International Systems 2002/0073407 A1 6/2002 Takayama et al. on-Chip Conference, Sep. 17-20, 2003, 17 pages. US 8,166,450 B2 Page 3 Intel 80386 Programmer’s Reference ManualiSection 17.1, “17.1 Richardson, N. et al., “Chapter 16iThe iCoreTM 520MHz Operand-SiZe and Address-Size Attributes,” [Online] [Retrieved on SynthesiZable CPU Code,” Closing the Gap Between ASIC & Cus Jun. 30, 2003] Retrieved from the Internet<URL:http://Webster.cs. tom, 2004, pp. 361-381, Part 3. ucr.edu>. Slechta, B. et al., “Dynamic Optimization of Micro-Operations,” Krishna, R. et al. “Ef?cient Software Decoder Design,” Advanced Computer Architecture Laboratory, IEEE Computer Society Techni Center for Reliable and High-Performance Computing, Dept. of cal, 2001, 10 pages. Electrical and Computer Engineering, Univ. of Illinois at Urbana Liao, S. et al., “Storage Assignment to Decrease Code Size,” ACM Champaign, The Ninth International Symposium on High Perfor Transactions on Programming Languages and Systems, May 1996, mance Computer Architecture (HPCA’03), Feb. 8-12, 2003, 12 pp. 235-253, vol. 18, No. 3. pages. Omondi, A.R. “The Microarchitecture of Pipelined and Superscalar Computers” 1999, pp. 98-99, Kluwer Academic Publishers Boston. * cited by examiner US. Patent Apr. 24, 2012 Sheet 1 of8 US 8,166,450 B2 100 CANONICALIZE / ADDRESS ~ 104 CALCULATION IN PLURALITY OF WAYS 106 l / OPTIMIZE? APPLY OPHMIZATION DETERMINE WHICH FORM ~1Q8 GENERATES FEWEST INSTR. l SELECT BEST ~ 110 FORM (10) FIG. 1 US. Patent Apr. 24, 2012 Sheet 2 of8 US 8,166,450 B2 D_T.SrEA_ R E_SEcSUN v AN8MGDC U O"SR NDR S WUEEI MU__LBO||U . FSIU TUA_ES Du_S_ F_ _FA0 FIG. 1a US. Patent Apr. 24, 2012 Sheet 3 0f 8 US 8,166,450 B2 PROVIDE ZOO PLURALITY OF ~202 )/ CALLABLE FUNCTIONS I DETERMINE NUMBER OF ~ 204 CALLS FOR 2 1 FUNCTION(S) I DETERMINE RELATIONSHIP ~ 206 OF NUMBER TO CRITERIA CRITERIA USE MET? DIRECT CALL USE INDIRECT CALL F ADDRESS TO ~212 CONSTANT POOL t— FIG. 2 US. Patent Apr. 24, 2012 Sheet 4 0f 8 US 8,166,450 B2 READ THE ~ INSTRUCTIONS 302 I TRANSLATE EACH INSTRUCTION “ 304 MAKE SURE THE TRANSLATION 18 ~30‘? COMPLETE AND VALID I CHOSE THE SMALLER ~ 308 INSTRUCTION VERSION I RETURN THE BEST VERSION OF THE INSTRUCTIONS FIG. 3 US. Patent Apr. 24, 2012 Sheet 5 of8 US 8,166,450 B2 C START D l EXAMINE CODE; COUNT ~402 LOADS TO 400 GPRS / I EVAL'EJATE ~ 403 l 55%“ L CTED ~ CONSTANTS IN 404 POOL l N FIG. 4 US. Patent Apr. 24, 2012 Sheet 6 0f8 US 8,166,450 B2 IDENHFY "CMP " FOLLOWED ~ 502 BYBRANCH 500 504 ,/ MATCHING PAIR‘? EVALUATE BRANCH ~ 506 LABEL J, IF LABEL EFLIITMSI,N ATE ~ 508 ll ll l REWRITE BRANCH TO ~ 510 REMOVE DELAY ~ 512 SLOT i FIG. 5 US. Patent Apr. 24, 2012 Sheet 7 0f 8 US 8,166,450 B2 ATTEMPT TO COLOR GRAPH ~ 602 W/ N REGISTERS SPILLING REQ’D? REASSIGN REGISTERS ~ 606 TO REMAINING GPRs I INSERT COPIES N 608 BETWEEN BANKS I "CLEAN UP" SPILLS FIG. 6

Description:
Eickemeyer et al. Johnson. Harris et al DSP Core, 2 pages, [Online] [Retrieved on Jun. 30, 2003] able Length Instructions,” Dept. of Microelectronics and Information Computer Architecture Laboratory, IEEE Computer Society Techni .. time in Which the ?rst nested conditional construct Was initi.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.