XML IN A NUTSHELL

Other XML resources from O’Reilly

Related titles
    .NET and XML
    Content Syndication with RSS
    Java and XML
    Learning XML
    Learning XSLT
    Office 2003 XML
    Practical RDF
    RELAX NG
    XForms Essentials
    XML Hacks
    XML Pocket Reference
    XML Publishing with AxKit
    XML Schema

XML Books Resource Center
    xml.oreilly.com is a complete catalog of O’Reilly’s books on XML and related technologies, including sample chapters and code examples.
    XML.com helps you discover XML and learn how this Internet technology can solve real-world problems in information management and electronic commerce.

Conferences
    O’Reilly brings diverse innovators together to nurture the ideas that spark revolutionary industries. We specialize in documenting the latest tools and systems, translating the innovator’s knowledge into useful skills for those in the trenches. Visit conferences.oreilly.com for our upcoming events.

Safari Bookshelf (safari.oreilly.com) is the premier online reference library for programmers and IT professionals. Conduct searches across more than 1,000 books. Subscribers can zero in on answers to time-critical questions in a matter of seconds. Read the books on your Bookshelf from cover to cover or simply flip to the page you need. Try it today with a free trial.

XML IN A NUTSHELL
Third Edition
Elliotte Rusty Harold and W. Scott Means

Beijing • Cambridge • Farnham • Köln • Paris • Sebastopol • Taipei • Tokyo

XML in a Nutshell, Third Edition
by Elliotte Rusty Harold and W. Scott Means

Copyright © 2004, 2002, 2001 O’Reilly Media, Inc. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 [email protected].
Editor: Simon St.Laurent
Production Editor: Marlowe Shaeffer
Cover Designer: Ellie Volckhausen
Interior Designer: Melanie Wang

Printing History:
    January 2001: First Edition.
    June 2002: Second Edition.
    September 2004: Third Edition.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. The In a Nutshell series designations, XML in a Nutshell, the image of a peafowl, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

This book uses RepKover™, a durable and flexible lay-flat binding.

ISBN-10: 0-596-00764-7
ISBN-13: 978-0-596-00764-5
[M] [8/07]

Table of Contents

Preface

Part I. XML Concepts

1. Introducing XML
    The Benefits of XML
    What XML Is Not
    Portable Data
    How XML Works
    The Evolution of XML

2. XML Fundamentals
    XML Documents and XML Files
    Elements, Tags, and Character Data
    Attributes
    XML Names
    References
    CDATA Sections
    Comments
    Processing Instructions
    The XML Declaration
    Checking Documents for Well-Formedness

This is the Title of the Book, eMatter Edition Copyright © 2007 O’Reilly & Associates, Inc. All rights reserved.

3. Document Type Definitions (DTDs)
    Validation
    Element Declarations
    Attribute Declarations
    General Entity Declarations
    External Parsed General Entities
    External Unparsed Entities and Notations
    Parameter Entities
    Conditional Inclusion
    Two DTD Examples
    Locating Standard DTDs

4. Namespaces
    The Need for Namespaces
    Namespace Syntax
    How Parsers Handle Namespaces
    Namespaces and DTDs

5. Internationalization
    Character-Set Metadata
    The Encoding Declaration
    Text Declarations
    XML-Defined Character Sets
    Unicode
    ISO Character Sets
    Platform-Dependent Character Sets
    Converting Between Character Sets
    The Default Character Set for XML Documents
    Character References
    xml:lang

Part II. Narrative-Like Documents

6. XML as a Document Format
    SGML’s Legacy
    Narrative Document Structures
    TEI
    DocBook
    OpenOffice
    WordprocessingML
    Document Permanence
    Transformation and Presentation

7. XML on the Web
    XHTML
    Direct Display of XML in Browsers
    Authoring Compound Documents with Modular XHTML
    Prospects for Improved Web Search Methods

8. XSL Transformations (XSLT)
    An Example Input Document
    xsl:stylesheet and xsl:transform
    Stylesheet Processors
    Templates and Template Rules
    Calculating the Value of an Element with xsl:value-of
    Applying Templates with xsl:apply-templates
    The Built-in Template Rules
    Modes
    Attribute Value Templates
    XSLT and Namespaces
    Other XSLT Elements

9. XPath
    The Tree Structure of an XML Document
    Location Paths
    Compound Location Paths
    Predicates
    Unabbreviated Location Paths
    General XPath Expressions
    XPath Functions

10. XLinks
    Simple Links
    Link Behavior
    Link Semantics
    Extended Links
    Linkbases
    DTDs for XLinks
    Base URIs

11. XPointers
    XPointers on URLs
    XPointers in Links
    Shorthand Pointers
    Child Sequences
    Namespaces
    Points
    Ranges

12. XInclude
    The include Element
    Including Text Files
    Content Negotiation
    Fallbacks
    XPointers

13. Cascading Style Sheets (CSS)
    The Levels of CSS
    CSS Syntax
    Associating Stylesheets with XML Documents
    Selectors
    The Display Property
    Pixels, Points, Picas, and Other Units of Length
    Font Properties
    Text Properties
    Colors

14. XSL Formatting Objects (XSL-FO)
    XSL Formatting Objects
    The Structure of an XSL-FO Document
    Laying Out the Master Pages
    XSL-FO Properties
    Choosing Between CSS and XSL-FO

15. Resource Directory Description Language (RDDL)
    What’s at the End of a Namespace URL?
    RDDL Syntax
    Natures
    Purposes