Free/Open Source Software Development

Stefan Koch
Vienna University of Economics and Business Administration, Austria

IDEA GROUP PUBLISHING
Hershey • London • Melbourne • Singapore

Acquisitions Editor: Mehdi Khosrow-Pour
Senior Managing Editor: Jan Travers
Managing Editor: Amanda Appicello
Development Editor: Michele Rossi
Copy Editor: Jane Conley
Typesetter: Sara Reed
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.

Published in the United States of America by
Idea Group Publishing (an imprint of Idea Group Inc.)
701 E. Chocolate Avenue, Suite 200
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: [email protected]
Web site: http://www.idea-group.com

and in the United Kingdom by
Idea Group Publishing (an imprint of Idea Group Inc.)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 3313
Web site: http://www.eurospan.co.uk

Copyright © 2005 by Idea Group Inc. All rights reserved. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.

Library of Congress Cataloging-in-Publication Data
Free/open source software development / Stefan Koch, Editor.
p. cm.
ISBN 1-59140-369-3 -- ISBN 1-59140-370-7 (pbk.) -- ISBN 1-59140-371-5 (ebook)
1. Computer software--Development. 2. Open source software. I. Koch, Stefan.
QA76.76.S46F74 2004
005.1--dc22
2004003748

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

Table of Contents

Preface

Section I: F/OSS Development - “Intensive Analysis”

Chapter I
Do Not Check in on Red: Control Meets Anarchy in Two Open Source Projects
Jesper Holck, Copenhagen Business School, Denmark
Niels Jørgensen, Roskilde University, Denmark

Chapter II
Analyzing the Anatomy of GNU/Linux Distributions: Methodology and Case Studies (Red Hat and Debian)
Jesús M. González-Barahona, Universidad Rey Juan Carlos, Spain
Gregorio Robles, Universidad Rey Juan Carlos, Spain
Miguel Ortuño-Pérez, Universidad Rey Juan Carlos, Spain
Luis Rodero-Merino, Universidad Rey Juan Carlos, Spain
José Centeno-González, Universidad Rey Juan Carlos, Spain
Vicente Matellán-Olivera, Universidad Rey Juan Carlos, Spain
Eva Castro-Barbero, Universidad Rey Juan Carlos, Spain
Pedro de-las-Heras-Quirós, Universidad Rey Juan Carlos, Spain

Chapter III
The Co-Evolution of Systems and Communities in Free and Open Source Software Development
Yunwen Ye, University of Colorado at Boulder, USA and SRA Key Technology Lab, Japan
Kumiyo Nakakoji, University of Tokyo, Japan
Yasuhiro Yamamoto, University of Tokyo, Japan
Kouichi Kishida, SRA Key Technology Lab, Japan

Section II: F/OSS Development and Software Engineering Practices - “Extensive Analysis”

Chapter IV
The Role of Modularity in Free/Open Source Software Development
Alessandro Narduzzo, Università di Bologna, Italy
Alessandro Rossi, Università di Trento, Italy

Chapter V
A Quantitative Study of the Adoption of Design Patterns by Open Source Software Developers
Michael Hahsler, Vienna University of Economics and Business Administration, Austria

Section III: F/OSS Projects as Social Constructs

Chapter VI
Coordination and Social Structures in an Open Source Project: VideoLAN
Thomas Basset, Centre de Sociologie des Organisations, France and Ecole Normale Supérieure de Cachan, France

Chapter VII
Free Software Development: Cooperation and Conflict in a Virtual Organizational Culture
Margaret S. Elliott, University of California, Irvine, USA
Walt Scacchi, University of California, Irvine, USA

Section IV: Simulating F/OSS Development - “Dynamic Swarms”

Chapter VIII
Dynamical Simulation Models of the Open Source Development Process
I.P. Antoniades, Aristotle University of Thessaloniki, Greece
I. Samoladas, Aristotle University of Thessaloniki, Greece
I. Stamelos, Aristotle University of Thessaloniki, Greece
L. Angelis, Aristotle University of Thessaloniki, Greece
G.L. Bleris, Aristotle University of Thessaloniki, Greece

Chapter IX
Modeling the Free/Open Source Software Community: A Quantitative Investigation
Gregory Madey, University of Notre Dame, USA
Vincent Freeh, North Carolina University, USA
Renee Tynan, University of Notre Dame, USA

Section V: F/OSS Development Interacting with Commercial and Public Organizations

Chapter X
Benefits and Pitfalls of Open Source in Commercial Contexts
Jiayin Hang, Siemens Business Services GmbH & Co. OHG, Germany
Heidi Hohensohn, Siemens Business Services GmbH & Co. OHG, Germany
Klaus Mayr, IFS IT GmbH, Germany
Thomas Wieland, University of Applied Sciences Coburg, Germany

Chapter XI
Experiences Enhancing Open Source Security in the POSSE Project
Jonathan M. Smith, University of Pennsylvania, USA
Michael B. Greenwald, University of Pennsylvania, USA
Sotiris Ioannidis, University of Pennsylvania, USA
Angelos D. Keromytis, Columbia University, USA
Ben Laurie, AL Digital, Ltd., USA
Douglas Maughan, Defense Advanced Research Projects Agency, USA
Dale Rahn, University of Pennsylvania, USA
Jason Wright, University of Pennsylvania, USA

Section VI: Implications of the F/OSS Development Model - “The Broad Picture”

Chapter XII
The Impact of Open Source Development on the Social Construction of Intellectual Property
Bernd Carsten Stahl, De Montfort University, UK

Chapter XIII
The Social Production of Ethics in Debian and Free Software Communities: Anthropological Lessons for Vocational Ethics
E. Gabriella Coleman, University of Chicago, USA
Benjamin Hill, Debian Project, USA

About the Editor
About the Authors
Index

Preface

In the last few years, free and open source software has gathered increasing interest, both from the business and the academic worlds.
As projects in different application domains, such as Linux together with the suite of GNU utilities, GNOME, KDE, Apache, sendmail, bind, and several programming languages, have achieved huge success in their respective markets, they have demonstrated that this new development paradigm can produce output of considerable quality. This has led to massive business interest, has given rise to new corporations like Red Hat or VA Software (which, under its former name VA Linux, achieved a record-breaking IPO in 1999 with a 700 percent gain on its first day of trading), and has spurred organizations both small and large (like IBM, Sun Microsystems or Netscape) to invest in this new field in one way or another.

Academic interest in this new form of collaborative software development has also grown, arising from very different backgrounds including software engineering, sociology, management, and psychology. It has gained increasing prominence, as can be seen from the number of international journals, such as Management Science, Information Systems Journal, Electronic Markets or Research Policy, and conferences, such as ICSE, that dedicate special issues, workshops, and tracks to this new field of research. The approaches taken and the issues tackled are as diverse as the backgrounds of the researchers. Current research in this field ranges from quantitative analysis of source code or other artifacts of software development, which aims to uncover programming practices and the efficiency of this development model, to sociological field work that uses interviews to learn how coordination and communication are accomplished in these virtual teams.

This book will try to give an overview of current research. It aims to be an up-to-date inventory of research approaches and outlooks.
As yet, an edited volume of academic papers dealing with free and open source software development has not been available, and it is hoped that this book will provide a first step towards giving this line of research the prominence and credibility it deserves, given the high-quality output produced, which any interested reader can now witness.

A note is in order about the title of this book. In a first version, the proposed title was Open Source Software Development. Its announcement led to an e-mail response from Richard Stallman, founder of the GNU project, in which he argued that by using this title all relevant and important work of the Free Software community would be subsumed by the Open Source movement (which of course was never intended), and its very existence denied. Following his reasoning, the title was changed. The new title does not go further into the ideological differences in outlook between the two communities, but explicitly acknowledges the inspiring work done by both. An additional term often used in this context, especially in Europe, is libre software. It was not included, both to maintain readability and because it is more of an artificial term with which no larger group of developers identifies.

ORGANIZATION OF THE BOOK

The organization of this book is intended to reflect the very different research approaches taken in the field of free and open source software development. Therefore, the chapters have been grouped into no fewer than six parts, each one dealing with a slightly different focus or outlook. With this, the book provides an overview of this very active field of research, and an interested reader or researcher might be able to identify the approach or focus he or she would like to take in future reading or work.
Although there are many different possibilities for classifying the chapters presented here, and even more for all the research currently done, the organization presented here is meant as a starting point for discussing and developing a coherent framework for the diverse activities in the field of free and open source software development.

Section I: F/OSS Development - “Intensive Analysis”

Section I contains three chapters that provide what seems to constitute the heart and most important first step of current research. They all deal with a small number of projects and detail several facets of these. Therefore, the term intensive analysis is used here, to be distinguished from the extensive analyses performed in the chapters forming Section II.

The first chapter, “Do Not Check in on Red: Control Meets Anarchy in Two Open Source Projects” by Jesper Holck and Niels Jørgensen, is a prototypical example in which the authors describe the most important elements of the software development process in the Mozilla and FreeBSD projects. For each of these elements, the struggle for optimal balance between control (supposedly necessary for producing high-quality software) and anarchy (supposedly necessary for attracting and keeping voluntary developers) is discussed. The authors give a superb picture of the free and open source software development process, documenting and analyzing similarities between free and open source projects and commercial software development.

The second chapter, “Analyzing the Anatomy of GNU/Linux Distributions: Methodology and Case Studies (Red Hat and Debian)” by Jesús M. González-Barahona, Gregorio Robles, and colleagues, presents a quantitative and longitudinal study of one of the most important and visible aspects of free and open source software: GNU/Linux distributions.
Using a total of nine different versions of the Red Hat and Debian distributions, the authors base their analysis on the source code itself, and also include a detailed description of the methodology applied. This approach allows the study of the evolution of parameters like total distribution size, size and version of different packages, or usage of different programming languages. It thus offers insights into the free and open source projects providing the foundation for these distributions, as well as into the process of producing a distribution, which poses an enormous integration task and is therefore a large project in itself.

Chapter III is titled “The Co-Evolution of Systems and Communities in Free and Open Source Software Development” and is written by Yunwen Ye, Kumiyo Nakakoji, Yasuhiro Yamamoto, and Kouichi Kishida. In this chapter, the authors look beyond the software product and document its relationship to the development community, analyzing the co-evolution that results. Using four projects in which the authors are involved as case studies, they find that projects co-evolve differently depending on the goal of the system and the structure of the community. This leads to a proposed classification scheme for free and open source projects and to practical implications of recognizing the co-evolution and the type of project.

Section II: F/OSS Development and Software Engineering Practices - “Extensive Analysis”

The two chapters forming Section II have two things in common: each focuses on the role of a single concept from “traditional” software engineering literature in free and open source software development, and each analyzes several projects, thereby providing an example of extensive research.

The first chapter in this part is by Alessandro Narduzzo and Alessandro Rossi and is titled “The Role of Modularity in Free/Open Source Software Development,” which very adequately describes its theme.
The authors discuss in particular the development of different free and open Unix systems, applying the theory of modularity to both the software architecture and the organization of the projects.

Chapter V, “A Quantitative Study of the Adoption of Design Patterns by Open Source Software Developers” by Michael Hahsler, describes a large-scale quantitative analysis of the usage of design patterns. The analysis is based on data extracted from the version-control system of the SourceForge hosting site and encompasses almost 1,000 projects. Results indicate that design patterns, as proposed by software engineering literature, are indeed used as a practice to improve communication, as larger projects tend towards increased adoption rates and more productive community members. In addition to these results, this chapter serves as an excellent example of the “testbed” function that free and open source software projects might provide for software engineering researchers because their publicly available data allows for large-scale studies.

Section III: F/OSS Projects as Social Constructs

Section III of this book, “F/OSS Projects as Social Constructs,” contains two chapters that take a distinctly sociological position and foremost view free and open