Lecture Notes in Computer Science 4415
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Moshe Y. Vardi, Rice University, Houston, TX, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany

Paul Lukowicz, Lothar Thiele, Gerhard Tröster (Eds.)

Architecture of Computing Systems - ARCS 2007
20th International Conference
Zurich, Switzerland, March 12-15, 2007
Proceedings

Volume Editors

Paul Lukowicz
University of Passau, IT-Center/International House
Innstraße 43, 94032 Passau, Germany
E-mail: [email protected]

Lothar Thiele
Swiss Federal Institute of Technology Zurich, Computer Engineering and Networks Laboratory
Gloriastrasse 35, 8092 Zurich, Switzerland
E-mail: [email protected]

Gerhard Tröster
Swiss Federal Institute of Technology Zurich, Electronics Laboratory
Gloriastrasse 35, 8092 Zürich, Switzerland
E-mail: [email protected]

Library of Congress Control Number: 2007922097
CR Subject Classification (1998): C.2, C.5.3, D.4, D.2.11, H.3.5, H.4, H.5.2
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-540-71267-4 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-71267-1 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 12031580 06/3142 543210

Preface

The ARCS series of conferences has over 30 years of tradition reporting high-quality results in computer architecture and operating systems research. While the conference is proud of its long tradition, it is also proud to represent a dynamic, evolving community that closely follows new research trends and topics. Thus, over the last few years, ARCS has evolved towards a strong focus on system aspects of pervasive computing and self-organization techniques (organic and autonomic computing). At the same time it has expanded from its roots as a German Informatics Society (GI/ITG) conference to an international event. This is reflected by the composition of the Technical Program Committee (TPC), which included over 30 renowned scientists from 10 different countries. The conference attracted 83 submissions from 16 countries across 4 continents. Of those, 20 have been accepted, which amounts to an acceptance rate below 25%.

The 20th ARCS event was a special anniversary conference. It is only fitting that it was held at a special place: ETH Zurich. It combines one of the leading information technology schools in Europe with a beautiful location.

I would like to express my gratitude to all those who made this year’s conference possible.
This includes the General Chairs Lothar Thiele and Gerhard Tröster from ETH, the Tutorials and Workshops Chair Marco Platzner from the University of Paderborn, the members of the “Fachausschuss ARCS” of the GI/ITG (the Steering Committee), the members of the Technical Program Committee, the Reviewers, and most of all, all the authors who submitted their work to ARCS 2007. I would also like to thank IFIP, ITG/Electrosuisse, VDE and the ARTIST2 Project for their support of the conference.

January 2007
Paul Lukowicz

Organization

Organizing Committee

Conference Chairs: Lothar Thiele (ETH Zurich, Switzerland), Gerhard Tröster (ETH Zurich, Switzerland)
Program Chair: Paul Lukowicz (University of Passau, Germany)
Workshops and Tutorials: Marco Platzner (University of Paderborn, Germany)

Program Committee

Nader Bagherzadeh, University of California, Irvine, USA
Michael Beigl, University of Braunschweig, Germany
Michael Berger, Siemens AG, Munich, Germany
Don Chiarulli, University of Pittsburgh, USA
Giovanni Demicheli, EPFL Lausanne, Switzerland
Koen De Bosschere, Ghent University, Belgium
Alois Ferscha, University of Linz, Austria
Mike Hazaas, Lancaster University, UK
Ernst Heinz, UMIT Hall i. Tirol, Austria
Paolo Ienne, EPFL Lausanne, Switzerland
Wolfgang Karl, University of Karlsruhe, Germany
Spyros Lalis, University of Thessaly, Greece
Koen Langendoen, Delft University of Technology, The Netherlands
Tom Martin, Virginia Tech, USA
Hermann de Meer, University of Passau, Germany
Erik Maehle, University of Luebeck, Germany
Peter Marwedel, University of Dortmund, Germany
Christian Müller-Schloer, University of Hanover, Germany
Stephane Vialle, Supelec, France
Joe Paradiso, MIT Media Lab, USA
Daniel Roggen, ETH Zurich, Switzerland
Pascal Sainrat, Université Paul Sabatier, Toulouse, France
Heiko Schuldt, University of Basel, Switzerland
Hartmut Schmeck, University of Karlsruhe, Germany
Karsten Schwan, Georgia Tech, Atlanta, USA
Bernhard Sick, University of Passau, Germany
Juergen Teich, University of Erlangen, Germany
Pedro Trancoso, University of Cyprus, Cyprus
Theo Ungerer, University of Augsburg, Germany
Stamatis Vassiliadis, Delft University of Technology, The Netherlands
Lucian Vintan, Lucian Blaga University of Sibiu, Romania
Klaus Waldschmidt, University of Frankfurt, Germany

Additional Reviewers

Henoc Agbota, Holger Harms, Hooman Parizi, Mohammed Al-Loulah, Sabine Hauert, Tom Parker, Muneeb Ali, Wim Heirman, Neal Patwari, Ioannis Avramopoulos, Jörg Henkel, Andy Pimentel, Gonzalo Bailador, Michael Hinchey, Thilo Pionteck, David Bannach, Alexander Hofmann, Laura Pozzi, Juergen Becker, Ulrich Hofmann, Robert Pyka, Andrey Belenky, Amir Kamalizad, Markus Ramsauer, Mladen Berekovic, Dimitrios Katsaros, Thomas Schwarzfischer, Uwe Brinkschulte, Bernd Klauer, André Seznec, Rainer Buchty, Manfred Kunde, Enrique Soriano, Georg Carle, Kai Kunze, Ioannis Sourdis, Supriyo Chatterjea, Christoph Langguth, Michael Springmann, Marcelo Cintra, Marc Langheinrich, Mathias Stäger, Philippe Clauss, Baochun Li, Yannis Stamatiou, Joshua Edmison, Lei Liu, Kyriakos Stavrou, Werner Erhard, Paul Lokuciejewski, Walter Stiehl, Philippe Faes, Clemens Lombriser, Mototaka Suzuki, Diego Federici, Thanasis Loukopoulos, Joseph
Sventek, Dietmar Fey, Jonas Maebe, Jie Tao, Mamoun Filali Amine, Rene Mayrhofer, Karl-Heinz Temme, Stefan Fischer, Lotfi Mhamdi, Sascha Uhrig, Pierfrancesco Foglia, Jörg Mische, Miljan Vuletic, Thomas Fuhrmann, Florian Moesch, Jamie Ward, Martin Gaedke, Thorsten Möller, Ralph Welge, Marco Goenne, Katell Morin-Allory, Lars Wolf, Werner Grass, Sanaz Mostaghim, Bernd Wolfinger, Jan Haase, Leyla Nazhandali, Markus Wulff, Erik Hagersten, Afshin Niktash, Olivier Zendra, Jörg Hähner, Pasquale Pagano, Peter Zipf, Gertjan Halkes, Thomas Papakostas

Table of Contents

ARCS 2007

A Reconfigurable Processor for Forward Error Correction
  Afshin Niktash, Hooman T. Parizi, and Nader Bagherzadeh

FPGA-Accelerated Deletion-Tolerant Coding for Reliable Distributed Storage
  Peter Sobe and Volker Hampel

LIRAC: Using Live Range Information to Optimize Memory Access
  Peng Li, Dongsheng Wang, Haixia Wang, Meijuan Lu, and Weimin Zheng

Optimized Register Renaming Scheme for Stack-Based x86 Operations
  Xuehai Qian, He Huang, Zhenzhong Duan, Junchao Zhang, Nan Yuan, Yongbin Zhou, Hao Zhang, Huimin Cui, and Dongrui Fan

A Customized Cross-Bar for Data-Shuffling in Domain-Specific SIMD Processors
  Praveen Raghavan, Satyakiran Munaga, Estela Rey Ramos, Andy Lambrechts, Murali Jayapala, Francky Catthoor, and Diederik Verkest

Customized Placement for High Performance Embedded Processor Caches
  Subramanian Ramaswamy and Sudhakar Yalamanchili

A Multiprocessor Cache for Massively Parallel SoC Architectures
  Jörg-Christian Niemann, Christian Liß, Mario Porrmann, and Ulrich Rückert

Improving Resource Discovery in the Arigatoni Overlay Network
  Raphaël Chand, Luigi Liquori, and Michel Cosnard

An Effective Multi-hop Broadcast in Vehicular Ad-Hoc Network
  Tae-Hwan Kim, Won-Kee Hong, and Hie-Cheol Kim

Functional Knowledge Exchange Within an Intelligent Distributed System
  Oliver Buchtala and Bernhard Sick

Architecture for Collaborative Business Items
  Till Riedel, Christian Decker, Phillip Scholl, Albert Krohn, and Michael Beigl

Autonomic Management Architecture for Flexible Grid Services Deployment Based on Policies
  Edgar Magaña, Laurent Lefevre, and Joan Serrat

Variations and Evaluations of an Adaptive Accrual Failure Detector to Enable Self-healing Properties in Distributed Systems
  Benjamin Satzger, Andreas Pietzowski, Wolfgang Trumler, and Theo Ungerer

Self-organizing Software Components in Distributed Systems
  Ichiro Satoh

Toward Self-adaptive Embedded Systems: Multi-objective Hardware Evolution
  Paul Kaufmann and Marco Platzner

Measurement and Control of Self-organised Behaviour in Robot Swarms
  Moez Mnif, Urban Richter, Jürgen Branke, Hartmut Schmeck, and Christian Müller-Schloer

Autonomous Learning of Load and Traffic Patterns to Improve Cluster Utilization
  Andrew Sohn, Hukeun Kwak, and Kyusik Chung

Parametric Architecture for Function Calculation Improvement
  María Teresa Signes Pont, Juan Manuel García Chamizo, Higinio Mora Mora, and Gregorio de Miguel Casado

Design Space Exploration of Media Processors: A Generic VLIW Architecture and a Parameterized Scheduler
  Guillermo Payá-Vayá, Javier Martín-Langerwerf, Piriya Taptimthong, and Peter Pirsch

Modeling of Interconnection Networks in Massively Parallel Processor Architectures
  Alexey Kupriyanov, Frank Hannig, Dmitrij Kissler, Jürgen Teich, Julien Lallet, Olivier Sentieys, and Sébastien Pillement

Invited Talk: Expanding Software Product Families: From Integration to Composition
  Jan Bosch

Author Index

A Reconfigurable Processor for Forward Error Correction

Afshin Niktash, Hooman T. Parizi, and Nader Bagherzadeh
536 Engineering Tower, Henry Samueli School of Engineering, University of California, Irvine, CA 92697-2625, USA
{aniktash,hparizi,nader}@ece.uci.edu

Abstract. In this paper, we introduce a reconfigurable processor optimized for the implementation of Forward Error Correction (FEC) algorithms and provide implementation results for the Viterbi and Turbo decoding algorithms. In this architecture, an array of processing elements is employed to perform the required operations in parallel. Each processing element encapsulates multiple functional units which are highly optimized for FEC algorithms. A data buffer coupled with a high-bandwidth interconnection network facilitates pumping the data to the array and collecting the results. A processing element controller orchestrates the operation and the data movement. Different FEC algorithms like Viterbi, Turbo, Reed-Solomon and LDPC are widely used in digital communication and could be implemented on this architecture. Unlike traditional approaches to programmable FEC architectures, this architecture is instruction-level programmable, which results in the ultimate flexibility and programmability.

Keywords: Reconfigurable Processor, Processing Element, Forward Error Correction, Viterbi, Turbo.

1 Introduction

Reconfigurable architectures customize the same piece of silicon for multiple applications. While general-purpose processors cannot meet the processing requirements of many new applications, traditional custom ASICs dominate the design space.
In wireless communication, a DSP processor is usually responsible for low data rate signal processing and is coupled with customized silicon to perform the medium and high data rate processing. The main drawback of a custom design is its long and costly design cycle, which requires a high initial investment and results in a long time-to-market. Furthermore, the lack of flexibility and programmability of traditional solutions causes frequent design changes and tape-outs for emerging and developing standards. Reconfigurable architectures, on the other hand, are very flexible and programmable and could significantly shorten the design cycle of new products while even extending the life cycle of existing products. Tracking new standards is simplified to software upgrades which could be performed on-the-fly.

P. Lukowicz, L. Thiele, and G. Tröster (Eds.): ARCS 2007, LNCS 4415, pp. 1 – 13, 2007. © Springer-Verlag Berlin Heidelberg 2007

One of the challenging applications of a reconfigurable architecture is channel coding. Almost any digital communication system benefits from at least one form of error correction coding [1]. There are four main algorithms widely used in wireless and wired communications: Viterbi, Turbo, Reed-Solomon and LDPC. However, multiple variations of each of these algorithms are employed in standards. More specifically, every standard uses a different configuration of an algorithm, which makes it unique to that standard. For example, the Turbo code used in the W-CDMA standard has a different polynomial, block size, rate and termination scheme from that used in WiMAX. The Viterbi codes employed in W-LAN, W-CDMA and WiMAX are not the same either. This translates to having a plurality of coding accelerators for different coding algorithms and configurations, which is very common in industry. In the conventional approach, a separate coprocessor is employed for every FEC algorithm.
Nevertheless, even one coprocessor is not programmable enough to cover all existing configurations of an FEC algorithm for multiple standards.

In this paper, we introduce RECFEC, a REConfigurable processor optimized for Forward Error Correction algorithms. RECFEC combines the programmability of a DSP processor with the performance of dedicated hardware and is architected to enable effective software implementation of FEC algorithms.

The organization of this paper is as follows. Section 2 reviews related work. Section 3 describes the RECFEC architecture and programming model. Section 4 presents two examples of algorithm implementation, Viterbi and Turbo coding, and Section 5 concludes the paper.

2 Related Works

There are considerable research efforts to develop prototypes of reconfigurable architectures for channel coding. In this section, we present the features of those architectures.

A reconfigurable signal processor is introduced in [2], using an FPGA-based reconfigurable processing board to implement a programmable Turbo decoder. Viturbo [3] is among the first contributions trying to integrate Viterbi and Turbo decoders into a single architecture. Viturbo is a runtime reconfigurable architecture designed and implemented on an FPGA. The architecture can be reconfigured to decode a range of convolutionally coded data and can also be reconfigured to decode Turbo coded data. SOVA is the algorithm implemented for Turbo decoding. The target applications are W-LAN, 3GPP and GSM.

A dual mode Viterbi/Turbo decoder is introduced in [4]. The component decoder in this architecture has two modes and some of the modules are shared. In Viterbi mode, the Log Likelihood Ratio (LLR) processors are turned off. Input symbols are sent from the Branch Metrics Unit (BMU) processor to the Add Compare Select Unit (ACSU) processor and decoded bits are sent out after tracing back.
When in Turbo mode, the decoder works as a Maximum A posteriori Probability (MAP) decoder and only the Trace Back Unit (TBU) is turned off. A Turbo decoder on a dynamically reconfigurable architecture is introduced in [5]. The decoder is optimized for FPGA. The key power-saving technique in this design is the use of decoder run-time dynamic reconfiguration in response to variations in the channel conditions. If less favorable channel conditions are detected, a more powerful, less power-efficient decoder is swapped into the FPGA hardware to maintain a fixed bit error rate.
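
To make the decoder stages named in the discussion of [4] more concrete, the following is a minimal Python sketch of a hard-decision Viterbi decoder split into a branch-metric step (BMU), an add-compare-select step (ACSU) and a trace-back step (TBU). It is not the architecture of [4] or of RECFEC. The rate-1/2, constraint-length-3 code with generators 7 and 5 (octal) and all function names are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed example code (not taken from the paper): rate 1/2, K = 3, generators (7, 5) octal.
K = 3
GENERATORS = (0b111, 0b101)
NUM_STATES = 1 << (K - 1)
INF = float("inf")


def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode_step(state: int, bit: int) -> Tuple[int, Tuple[int, ...]]:
    """One encoder transition: returns (next_state, output symbol)."""
    reg = (bit << (K - 1)) | state  # newest bit sits in the MSB of the register
    symbol = tuple(parity(reg & g) for g in GENERATORS)
    return reg >> 1, symbol


def encode(bits: Sequence[int]) -> List[int]:
    """Encode bits and append K-1 zero bits so the trellis ends in state 0."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        state, symbol = encode_step(state, b)
        out.extend(symbol)
    return out


def branch_metric(received: Sequence[int], expected: Sequence[int]) -> int:
    """BMU: Hamming distance between received and expected symbol."""
    return sum(r != e for r, e in zip(received, expected))


def viterbi_decode(received: Sequence[int]) -> List[int]:
    """Hard-decision Viterbi decoding of a zero-terminated codeword."""
    n = len(GENERATORS)
    metrics = [0] + [INF] * (NUM_STATES - 1)  # encoder starts in state 0
    history = []  # per step: (previous state, input bit) for each state
    for i in range(0, len(received), n):
        symbol = received[i:i + n]
        new_metrics = [INF] * NUM_STATES
        decisions = [(0, 0)] * NUM_STATES
        for state in range(NUM_STATES):
            if metrics[state] == INF:
                continue  # state not reachable yet
            for bit in (0, 1):
                nxt, expected = encode_step(state, bit)
                # ACSU: add the branch metric, compare, select the survivor.
                candidate = metrics[state] + branch_metric(symbol, expected)
                if candidate < new_metrics[nxt]:
                    new_metrics[nxt] = candidate
                    decisions[nxt] = (state, bit)
        metrics = new_metrics
        history.append(decisions)
    # TBU: walk the survivor path backwards from the known final state 0.
    state, bits = 0, []
    for decisions in reversed(history):
        state, bit = decisions[state]
        bits.append(bit)
    bits.reverse()
    return bits[:len(bits) - (K - 1)]  # drop the termination bits
```

As a usage example, encoding a short message with `encode`, flipping a single coded bit and passing the result to `viterbi_decode` should return the original message, because the assumed (7, 5) code has free distance 5. A per-standard configuration in the sense discussed in the Introduction would correspond here to changing `K` and `GENERATORS`.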