XML Topic Maps: Creating and Using Topic Maps for the Web
By Jack Park (Editor), Sam Hunting (Technical Editor)

Publisher: Addison Wesley
Pub Date: July 16, 2002
ISBN: 0-201-74960-2
Pages: 640

The explosive growth of the World Wide Web is fueling the need for a new generation of technologies for managing information flow, data, and knowledge. This developer's overview and how-to book provides a complete introduction and application guide to the world of topic maps, a powerful new means of navigating the World Wide Web's vast sea of information. With contributed chapters written by today's leading topic map experts, XML Topic Maps is designed to be a "living document" for managing information across the Web's interconnected resources.

The book begins with a broad introduction and a tutorial on topic maps and XTM technology. The focus then shifts to strategies for creating and deploying the technology. Throughout, the latest theoretical perspectives are offered, alongside discussions of the challenges developers will face as the Web continues to evolve. Looking forward, the book's concluding chapters provide a road map to the future of topic map technology and the Semantic Web in general.

Specific subjects explored in detail include:

• Topic mapping and the XTM specification
• Using XML Topic Maps to build knowledge repositories
• Knowledge Representation, ontological engineering, and topic maps
• Transforming an XTM document into a Web page
• Creating enterprise Web sites with topic maps and XSLT
• Open source topic map software
• XTM, RDF, and topic maps
• Semantic networks and knowledge organization
• Using topic maps in education
• Topic maps, pedagogy, and future perspectives

Featuring the latest perspectives from today's leading topic map experts, XML Topic Maps provides the tools, techniques, and resources necessary to plot the changing course of information management across the World Wide Web.

Table of Contents

Copyright
Foreword
Preface
Acknowledgments
Contributors

Chapter 1. Let There Be Light
  Opening Salvo
  Resources
  What's in Here?

Chapter 2. Introduction to the Topic Maps Paradigm
  Managing Complex Knowledge Networks
  Primary Constructs
  The Big Picture: Merging Information and Knowledge
  Design Principles for XTM
  From ISO/IEC 13250 to XTM
  Summary
  Acknowledgments

Chapter 3. A Perspective on the Quest for Global Knowledge Interchange
  Information Is Interesting Stuff
  Information and Structure Are Inseparable
  Formal Languages Are Easier to Compute Than Natural Languages
  Generic Markup Makes Natural Languages More Formal
  A Brief History of the Topic Maps Paradigm
  Data and Metadata: The Resource-Centric View
  Subjects and Data: The Subject-Centric View
  Understanding Sophisticated Markup Vocabularies
  The Topic Maps Attitude
  Summary

Chapter 4. The Rise and Rise of Topic Maps
  Milestones in Standards and Specifications
  Milestones in Software
  The Future of Topic Maps

Chapter 5. Topic Maps from Representation to Identity: Conversation, Names, and Published Subject Indicators
  What Is the Conversation About?
  So What about Published Subject Indicators?
  Back to the Conversation Subject

Chapter 6. How to Start Topic Mapping Right Away with the XTM Specification
  XTM Topic Mapping
  Why Topic Maps?
  Appetizer
  Main Course
  Dessert
  Brandy, Cigars
  Summary
  Acknowledgments
  Resources

Chapter 7. Knowledge Representation, Ontological Engineering, and Topic Maps
  Knowledge as Interpretation
  Data, Knowledge, and Information
  Knowledge Issues: Acquisition, Representation, and Manipulation
  The Roots of Ontological Engineering: Knowledge Technologies
  New Knowledge Technology Branches: Toward Ontological Engineering
  Ontological Engineering
  Ontologies and Topic Maps
  Summary
  Acknowledgments
  References
  Selected Information and Research Sites

Chapter 8. Topic Maps in the Life Sciences
  A Literature Review
  The Need for Classification
  The Five Kingdoms
  Kingdom Animalia
  Creating Topic Maps for a Web Site
  Summary
  Resources for More Information on the Life Sciences

Chapter 9. Creating and Maintaining Enterprise Web Sites with Topic Maps and XSLT
  The XTM Framework for the Web
  XTM as Source Code for Web Sites
  HTML Visualization of Topic Map Constructs
  Topics
  XSLT Layers
  The XSLT Layout Layer
  The XSLT Back-End and Presentation Layers
  Summary
  Acknowledgments
  References

Chapter 10. Open Source Topic Map Software
  About Open Source Software
  Four Projects
  SemanText
  XTM Programming with TM4J
  Nexist Topic Map Testbed
  GooseWorks Toolkit

Chapter 11. Topic Map Visualization
  Requirements for Topic Map Visualization
  Visualization Techniques
  Summary
  References

Chapter 12. Topic Maps and RDF
  A Sample Application: The Family Tree
  RDF and Topic Maps
  Modeling RDF Using Topic Map Syntax
  Summary
  References

Chapter 13. Topic Maps and Semantic Networks
  Semantic Networks: The Basics
  Comparing Topic Maps, RDF, and Semantic Networks
  Building Semantic Networks from Topic Maps
  Harvesting the Knowledge Identified in Markup
  Identifying and Interpreting the Knowledge Found within Documents
  Summary
  References

Chapter 14. Topic Map Fundamentals for Knowledge Representation
  A Simple KR Example
  A Quick Review of Concepts for Topic Maps and KR
  Topic Map Templates
  Class Hierarchies
  Association Properties
  Inference Rules
  Consistency Constraints
  Summary
  References

Chapter 15. Topic Maps in Knowledge Organization
  Suggestions for Reading This Chapter
  What Is KO?
  KO as a Use Case for TMs
  Illustrative Examples
  A Look into the Future: Toward Innovative TM-Based Information Services
  Summary
  Acknowledgments
  Selected Abbreviations
  References

Chapter 16. Prediction: A Profound Paradigm Shift
  Language
  Transmitting the Word
  Lightness of Being
  A Brief History of Knowledge Representation and Education
  The Ephemeral Nature of Many New Ideas
  What the Research Suggests about Knowledge Representation and Learning
  A Paradigm Shift: Patterning Speech to Patterning Thought
  Summary
  Acknowledgments
  References

Chapter 17. Topic Maps, the Semantic Web, and Education
  What Is the Semantic Web?
  How Can Topic Maps Play an Important Role in the Semantic Web?
  What's Next?
  Closing Salvo
  References

Glossary
Appendix A. Tomatoes Topic Map
Appendix B. Topic Map for Chapter 9
Appendix C. XSLT Style Sheet for Chapter 9
Appendix D. Genealogical Topic Map

Copyright

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions.
No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

W3C specifications and code copyright © 2003 World Wide Web Consortium (Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University). All Rights Reserved. http://www.w3.org/Consortium/Legal/.

The publisher offers discounts on this book when ordered in quantity for bulk purchases and special sales. For more information, please contact:

U.S. Corporate and Government Sales
(800) 382-3419
[email protected]

For sales outside of the U.S., please contact:

International Sales
(317) 581-3793
[email protected]

Visit Addison-Wesley on the Web: www.awprofessional.com

Library of Congress Cataloging-in-Publication Data

Park, Jack.
XML topic maps : creating and using topic maps for the Web / Jack Park and Sam Hunting.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-74960-2 (paperback)
1. XML (Document markup language) 2. Metadata. I. Hunting, Sam. II. Title.
QA76.76.H94 P376 2002
005.7'2--dc21
2002003679

Copyright © 2003 Pearson Education, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher. Printed in the United States of America. Published simultaneously in Canada.

For information on obtaining permission for use of material from this work, please submit a written request to:

Pearson Education, Inc.
Rights and Contracts Department
75 Arlington Street, Suite 300
Boston, MA 02116
Fax: (617) 848-7047

Text printed on recycled paper

1 2 3 4 5 6 7 8 9 10--CRS--0605040302

First printing, July 2002

Foreword

In 1962 I wrote a paper, "Augmenting Human Intellect: A Conceptual Framework," in which I laid out my vision for how humanity can tackle its most complex, urgent problems. I proposed a framework driven by a simple premise: As problems get harder, we need to get collectively smarter.

As I considered ways to increase our collective intellectual capabilities, I thought about language and the symbols that humans use to create conceptual models of the world. Our most basic conceptual structures have been evolving for thousands of years. Alphabets evolved from pictographs, followed by white space and punctuation. The introduction of the printing press led to conceptual structures such as paragraphs, page numbers, footnotes, concordance indices, and tables of contents. I realized that computers offered radical new ways of portraying and manipulating conceptual structures, and that further evolving these symbols and techniques could greatly augment our capabilities.

Although one idea proposed in that paper, hypertext, has become pervasive today in simple form, I have been waiting for 40 years for the active exploration of concept mapping. As a result, I am delighted to see the work being done with Topic Maps, and I wholeheartedly support this book, which was edited by my friend and colleague Jack Park.

In order to achieve the full potential of Topic Maps, we need tools to integrate these conceptual maps with our vast repositories of documents and recorded dialog, as well as tools for manipulating and viewing these structures in different ways. I hope that this book is a first step in that direction, and that you, the reader, will help make these possibilities reality.

Douglas C. Engelbart

Preface

In a former life, I built microprocessor-based data acquisition systems, originally for locating and monitoring wind and solar energy systems. I suppose it is fair to say that I have long been involved in roaming solution space. Along the way, farmers, on whose land the energy systems were often situated, discovered that my monitoring tools helped them form better predictions of fruit frost, irrigation needs, and pesticide needs. My program, which ran on an Apple II computer that had telephone access to the distributed monitoring stations, printed out large piles of data. Epiphany happened on the day that a manager of one of those monitoring systems came to me and asked, "What else is this data good for?" That was the day I entered the field of artificial intelligence, looking for ways to organize all that data and mine it for new knowledge.

A recent discussion on National Public Radio focused on the nature and future of literature. Listening to that conversation while navigating the perils of Palo Alto traffic, I heard two comments that I shall paraphrase, with emphasis placed according to my own whims, as follows:

In the past, we turned to the great works of literature to ponder what is life. Today, we turn to the great works of science to ponder the same issues.

In some sense, the message I pulled out of that is that we (the really big we) tend to appeal to science and technology to find comfort and solutions to our daily needs. In that same sense, I found justification for this book and the vision I had when the book was conceived. Make no mistake here: I already had plenty of justification for the vision and the book. As is often pontificated by many, we are engulfed in a kind of information overload that threatens to choke off our ability to solve major problems that face all of humanity.

No, the vision is not an expression of doom and gloom.
Rather, it is an expression of my own deep and optimistic belief that it is through education, through an enriched human intellect, that solutions will be found, or at least, the solution space will become a more productive environment in which to operate. The vision expressed here is well grounded in the need to organize and mine data, all part of the solution space.

While walking along a corridor at an XML conference in San Jose early in the year 2000, I noticed a sign that said "Topic Maps," with an arrow pointing to the right. I proceeded immediately to execute a personal "column right" command, entered a room, and met Steve Newcomb. The rest all makes sense. While in Paris later that year, I saw the need to take the XTM technology to the public. This book was then conceived at XML 2000 in Paris, and several authors signed on immediately.

This book came with a larger vision than simply taking XTM to the public. I saw topic maps as an important tool in solution space. The vision included much more; topic maps are just one of many tools in that space. I wanted to start a book series, one that is thematically associated with my view of solution space.

This book is the first in that series, flying under the moniker Open Knowledge Systems. By using the word open, I am saying that the series is about making the tools and information required to operate in solution space completely open and available to all who would participate. Open implies that each book in the series intends to include an Open Source Software project, one that enables all readers to immediately "play in the sandbox" and, hopefully, go beyond by extending the software and contributing that new experience to solution space.
Each contribution to the Open Knowledge Systems series is intended to be a living document, meaning that each work will be available at http://www.nexist.org.[1] The entire contents of this Web site will be browsable and supported with an online forum so that topics discussed in the books can be further discussed online.

[1] As this book is going into print, the Web site is going online.

This book is about topic maps, particularly topic maps implemented in the XTM Version 1.0 specification format, as conceived by the XTM Authoring Group, which was started by an experienced group of individuals along with the vision and guidance of Steve Newcomb and Michel Biezunski, both contributing authors for this book.

As with many new technologies, the XTM specification is, in most regards, not yet complete. In fact, a standard like XTM can never be complete simply because such standards must coevolve with the environment in which they are applied. In the same vein, a book such as this cannot be a coherent work simply because much of what is evolving now is subject to differing opinions, views, and so forth.

There are a few assumptions made by all of the authors who contributed to this book. Mostly, the assumptions presume some minimal familiarity with Extensible Markup Language (XML), Extensible Stylesheet Language (XSL and XSLT), and Resource Description Framework (RDF).

Please keep in mind that the book presents many Web site references. Web sites occasionally disappear. While the links presented were tested during the writing phase and again during final manuscript editing, do not be surprised if some of them fail to remain in service. Since this book will remain a living document on the Web, we hope to keep all links up-to-date on the book's Web site.

Because of my view that solution space itself is coevolving along with the participants in that space, I have adopted an editorial management style that I suspect should be explained.
My style is based on the understanding that I am combining contributions from many different individuals, each with a potentially different worldview and each with a different writing style. The content focus of this book is, of course, on topic maps, but I believe that it is not necessary to force a coherent worldview on the different authors; it is my hope that readers and, indeed, solution space will profit by way of exposure to differing views and opinions. There will, by the very nature of this policy, be controversy. Indeed, we are exploring the vast universe of discourse on the topic of knowledge, and there exists plenty of controversy just in that sandbox alone.

There is also the possibility of overlap. Some chapters are likely to offer the same or similar (or even differing) points of view on the same point. Case in point: knowledge representation. This book has several chapters on that topic: one on ontological engineering, one on knowledge representation, and one on knowledge organization. Two chapters talk in some detail about semantic networks, and other chapters discuss how people learn. It's awfully easy to see just how these can overlap, and they do. My management style has been that which falls out of research in chaos theory: use the least amount of central management, and let the authors sort it out for themselves. History will tell us whether this approach works.

Acknowledgments

Producing this book turned out to be much harder than I expected. It's true, I was warned in advance that I was biting off more than I could chew, but such warnings never stopped me in the past. Let me tell you what was hard about the project. It wasn't what people warned, that coordinating the efforts of many authors would be difficult. I chose some of the best authors in the world, and nobody let me down. I strongly believe that the results prove that.
The difficulty was this: coordinating the manuscript with the rapidly changing technological landscape was a killer.

Readers may also think I experienced difficulty in coordinating the various writing styles of a diverse authoring community. Actually, that was not a difficulty at all. I simply decided up front that the nature of this book would be "style permissive," and the result is a book with chapters of varying length and content. I decided very early that this book was not intended to be a "cookbook" for building topic maps. I believed that, given the rapidity with which the nature of topic maps technology might evolve, a "cookbook" approach would be premature.

This manuscript was first proposed in Paris during one of the earliest XTM Authoring Group meetings. Fat chance I had there to anticipate just how much our thinking would evolve over time. The manuscript was well developed by the time the first working version of XTM was made public. That's when the technological landscape started to evidence the massive convulsions of a magnitude-8 earthquake. Nevertheless, my team of coauthors persisted, and Sam Hunting jumped in recently and contributed an additional chapter (Chapter 4), which provides a bridge between the latest activities in the XTM community and the presentations of the other chapters.

Sam and I gratefully acknowledge the assistance of Steve Newcomb and Michel Biezunski in developing the glossary. I gratefully acknowledge Sam's "hero's effort" in helping me to bring this book to completion. Working with Chrysta Meadowbrooke at Stillwater Publishing Services to massage this manuscript into shape was an enormous pleasure. I thank Kathy Glidden of Stratford Publishing Services for keeping this project on track.

Now, let me tell you what was, at once, easy and fun about this project.

VerticalNet funded all of my early work on the XTM project, with the full and enthusiastic support of Hugo Daley and Adam Cheyer. I am very grateful for that support.
The production of this book was made possible by the incredible enthusiasm and efforts of each of the coauthors who submitted a chapter for me to include and by the assistance of Mary T. O'Brien and Alicia Carey, both at Addison-Wesley. Mary O'Brien agreed with me that this book should be a "living document" with a Web presence and the ability to be kept up-to-date.

Perhaps, for me, the most profound influence on this project came from the two individuals who started topic maps in the first place, Steve Newcomb and Michel Biezunski. Along the way, by personal contact and by way of e-mail lists involved with topic maps, several other individuals have, in many ways, also contributed to this work. I am sure I will miss some names, but those who are pounding their way to visibility include Glen and Helen Haydon, Douglas Engelbart, Mary Keeler, Murray Altheim, Simon Buckingham Shum, Bernard Vatant, John Sowa, Robert Barta, Scott Tsao, Ann Wrightson, Steve Heckler, Sunthar Visuvalingam, Steve Pepper, Jeff Conklin, Kathleen Fisher, Alex Shapiro, Eugene Kim, Eric Armstrong, Rod Welch, and Peter Jones.

I am also pleased to acknowledge and thank the reviewers of this manuscript who made many valuable comments and suggestions.

This book would not exist without the enthusiastic support of my wife, Helen, and the support of our children, John and Nefer, who also teamed up to contribute a chapter to the manuscript.

Jack Park
Brownsville, California
March 2002

Contributors

Kal Ahmed
Founder, Techquila

Kal Ahmed is a consultant specializing in XML document and knowledge management solutions. He has long experience with XML and SGML document management systems and more recently has worked extensively with Topic Maps, both as a founding member of TopicMaps.org and as a contributor to the XTM 1.0 specification.
Kal is the lead developer of the open source topic map toolkit TM4J and hosts other topic map and metadata processing tools on his site, http://www.techquila.com.

Michel Biezunski
Consultant, Coolheads Consulting

Michel Biezunski is an editor of the ISO/IEC Topic Maps standard. He holds a Ph.D. in the history of physics. He has been at the origin of the topic maps paradigm, together with Steven Newcomb, and is still actively involved in the design of its Reference Model. He is helping corporations and government agencies to implement topic map applications.

Kathleen M. Fisher
Professor of Biology and Director, Center for Research in Mathematics and Science Education, San Diego State University

Dr. Kathleen Fisher has worked in biology education research and development for 30 years. Her recent book with coauthors J. Wandersee and D. Moody, Mapping Biology Knowledge, is now available in paperback from Kluwer. She developed the SemNet learning and knowledge construction tool with the SemNet Research Group. The Semantica software series knowledge transfer tools, successors to the SemNet software, are now being produced and marketed by Semantic Research, Inc., 1055 Shafter Street, San Diego, CA. The Semantica 2.1 authoring tool and the Semantica 3.0 Reader will be released in summer 2002.

Eric Freese
Senior Consultant, Chair, TopicMaps.org

Eric Freese has 15 years of experience in the area of information, document, and knowledge management. His experience includes research, analysis, specification, design, development, testing, implementation, integration, and management of database systems and computer technologies in business, education, and government environments. Eric is also the chief architect and developer of x