computer science/software engineering

Software Engineering for Internet Applications
Eve Andersson, Philip Greenspun, and Andrew Grumet
After completing this self-contained course on server-based Internet applications software, students who
start with only the knowledge of how to write and debug a computer program will have learned how to build
Web-based applications on the scale of Amazon.com. Unlike the desktop applications that most students
have already learned to build, server-based applications have multiple simultaneous users. This fact, coupled
with the unreliability of networks, gives rise to the problems of concurrency and transactions, which students
learn to manage by using the relational database system.
After working their way to the end of the book, students will have the skills to take vague and ambitious
specifications and turn them into a system design that can be built and launched in a few months. They
will be able to test prototypes with end-users and refine the application design. They will understand how
to meet the challenge of extreme business requirements with automatic code generation and the use of
open-source toolkits where appropriate. Students will understand HTTP, HTML, SQL, mobile browsers, VoiceXML,
data modeling, page flow and interaction design, server-side scripting, and usability analysis.
The book, which originated as the text for an MIT course, is suitable for classroom use and will be a useful
reference for software professionals developing multi-user Internet applications. It will also help managers
evaluate such commercial software as Microsoft Sharepoint or Microsoft Content Management Server.
Eve Andersson is Senior Vice President and Chair of the Bachelor of Science in Computer Science at Neumont
University, Salt Lake City. Philip Greenspun, a software developer, author, teacher, pilot, and photographer,
originated the Software Engineering for Internet Applications course at MIT. He is the author of Philip and
Alex’s Guide to Web Publishing. Andrew Grumet received his Ph.D. in Electrical Engineering and Computer
Science from MIT and builds Web applications as an independent software developer.
“Filled with practical advice for elegant and effective Web sites.”
—Edward Tufte,author of The Visual Display of Quantitative Information
The MIT Press
Massachusetts Institute of Technology
Cambridge, Massachusetts 02142
http://mitpress.mit.edu
0-262-51191-6
Cover design by Erin Hasley
Software Engineering for Internet
Applications
Eve Andersson, Philip Greenspun, and Andrew Grumet
The MIT Press
Cambridge, Massachusetts
London, England
© 2006 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or information storage
and retrieval) without permission in writing from the publisher.
MIT Press books may be purchased at special quantity discounts for business or sales
promotional use. For information, please email special_sales@mitpress.mit.edu or write
to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA
02142.
This book was set in Times New Roman on 3B2 by Asco Typesetters, Hong Kong, and
printed and bound in the United States of America.
Library of Congress Cataloging-in-Publication Data
Andersson, Eve Astrid.
Software engineering for Internet applications / Eve Andersson, Philip Greenspun, and
Andrew Grumet.
p. cm.
Includes bibliographical references and index.
ISBN 0-262-51191-6 (pbk. : alk. paper)
1. Internet programming. 2. Application software. 3. Software engineering. I.
Greenspun, Philip. II. Grumet, Andrew. III. Title.
QA76.625.A55 2006 005.2076—dc22 2005049144
10 9 8 7 6 5 4 3 2 1
Contents
Preface
Acknowledgments
1 Introduction
2 Basics
3 Planning
4 Software Structure
5 User Registration and Management
6 Content Management
7 Software Modularity
8 Discussion
9 Adding Mobile Users to Your Community
10 Voice (VoiceXML)
11 Scaling Gracefully
12 Search
13 Planning Redux
14 Distributed Computing with HTTP, XML, SOAP, and WSDL
15 Metadata (and Automatic Code Generation)
16 User Activity Analysis
17 Writeup
Reference Chapters
A HTML
B Engagement Management by Cesar Brea
C Grading Standards
Glossary
To the Instructor
Sample Contract (between Student Team and Client)
About the Authors
Index
Preface
This is the textbook for the MIT course “Software Engineering for Internet
Applications.” The course is intended for juniors and seniors in computer
science. We assume that they know how to write a computer program and
debug it. We do not assume knowledge of any particular programming
languages, standards, or protocols. The most concise statement of the course
goal is that “The student finishes knowing how to build amazon.com by him
or herself.”
Other people who might find this book useful include the following:
• professional software developers building online communities or other
multi-user Internet applications
• managers who are evaluating packaged software aimed at supporting online
communities; various chapters contain criteria for judging the features of
products such as Microsoft Sharepoint or Microsoft Content Management
Server
• university students and faculty looking to add some structure to a “capstone”
project at the end of a computer science degree
If you’re confused by the “student knows how to build amazon.com”
statement, we can break it down in terms of principles and skills. The fundamental
difference between server-based Internet applications and the desktop
applications that students have already learned to build is that server-based
applications have multiple simultaneous users. Coupled with the unreliability of
networks, this gives rise to the problems of concurrency and transactions.
Stateless communications protocols such as HTTP mean that the student must
learn how to build a stateful user experience on top of stateless protocols. For
persistence between clicks and management of concurrency and transactions,
the student needs to learn how to use the relational database management
system. Finally, though this goes beyond the simple stand-alone amazon.com-style
service, students ought to learn about object-oriented distributed computing
where each object is a Web service.
In addition to learning these principles, we’d like the student to learn some
skills. This is a laboratory course, and we want students who graduate to be
competent software engineers. We’d like our students to be able to take vague
and ambitious specifications and turn them into a system design that can be
built and launched within a few months, with the features most important to
users and easiest to develop built first and the difficult bells and whistles
deferred to a second version. We’d like our students to know how to test
prototypes with end-users and refine their application design once or twice within
even a three-month project. When business requirements are extreme, for
example, “build me amazon.com by yourself in three months,” we want our
students to understand how to cope with the challenge via automatic code
generation and use of open-source toolkits where appropriate.
We can recast the “student knows how to build amazon.com” statement in
terms of technologies used. By the time someone has finished reading and doing
the exercises in this book, he or she will understand HTTP, HTML, SQL,
mobile browsers on telephones, VoiceXML, data modeling, page flow and
interaction design, server-side scripting, and usability analysis.
Eve Andersson, Philip Greenspun, Andrew Grumet
Cambridge, Massachusetts
December 2005
Acknowledgments
The book is an outgrowth of six semesters of teaching experience at MIT and
other universities. So our first thanks must go to our students, who taught us
what worked and what didn’t work. It is a privilege to teach at MIT, and every
instructor should have the opportunity once in a lifetime.
We did not teach alone. Hal Abelson and the late Michael Dertouzos were
our partners on the lecture podium. Hal was Mr. Pedagogy and also pushed
the distributed computing ideas to the fore. Michael gave us an early push
into voice applications. Lydia Sandon was our first teaching assistant. Ben
Adida was our teaching assistant at MIT in the fall of 2003 when this book
took its final pre-print shakedown cruise.
In semesters where we did not have a full-time teaching assistant, the
students’ most valuable partners were their industry mentors, most of whom were
MIT alumni volunteering their time: David Abercrombie, Tracy Adams, Ben
Adida, Mike Bonnet, Christian Brechbuhler, James Buszard-Welcher, Bryan
Che, Bruce Keilin, Chris McEniry, Henry Minsky, Neil Mayle, Dan Parker,
Richard Perng, Lydia Sandon, Mike Shurpik, Steve Strassman, Jessica Wong,
and certainly a few more whose names have slipped from our memory.
We’ve gotten valuable feedback from instructors at other universities using
these materials, notably Aurelius Prochazka at Caltech and Oscar Bonilla at
Universidad Galileo.