INFORMATION-DRIVEN BUSINESS
How to Manage Data and Information for Maximum Advantage

ROBERT HILLARD

John Wiley & Sons, Inc.

Copyright © 2010 by Robert Hillard. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. For more information about Wiley products, visit our Web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Hillard, Robert, 1968–
Information-driven business : how to manage data and information for maximum advantage / Robert Hillard.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-62577-4 (cloth); ISBN 978-0-470-64943-5 (ebk); ISBN 978-0-470-64945-9 (ebk); ISBN 978-0-470-64946-6 (ebk)
1. Technological innovations—Management. 2. Information technology—Management. 3. Management information systems. 4. Industrial management. I. Title.
HD45.H45 2010
658.4′038–dc22
2010007798

Printed in the United States of America.

Contents

Preface
Acknowledgments

Chapter 1: Understanding the Information Economy
    Did the Internet Create the Information Economy?
    Origins of Electronic Data Storage
    Stocks and Flows
    Business Data
    Changing Business Models
    Information Sharing versus Infrastructure Sharing
    Governing the New Business
    Success in the Information Economy
    Notes

Chapter 2: The Language of Information
    Structured Query Language
    Statistics
    XQuery Language
    Spreadsheets
    Documents and Web Pages
    Knowledge, Communications, and Information Theory
    Notes

Chapter 3: Information Governance
    Information Currency
    Economic Value of Data
    Goals of Information Governance
    Organizational Models
    Ownership of Information
    Strategic Value Models
    Repackaging of Information
    Life Cycle
    Notes

Chapter 4: Describing Structured Data
    Networks and Graphs
    Brief Introduction to Graphs
    Relational Modeling
    Relational Concepts
    Cardinality and Entity-Relationship Diagrams
    Normalization
    Impact of Time and Date on Relational Models
    Applying Graph Theory to Data Models
    Directed Graphs
    Normalized Models
    Note

Chapter 5: Small Worlds Business Measure of Data
    Small Worlds
    Measuring the Problem and Solution
    Abstracting Information as a Graph
    Metrics
    Interpreting the Results
    Navigating the Information Graph
    Information Relationships Quickly Get Complex
    Using the Technique
    Note

Chapter 6: Measuring the Quantity of Information
    Definition of Information
    Thermal Entropy
    Information Entropy
    Entropy versus Storage
    Enterprise Information Entropy
    Decision Entropy
    Conclusion and Application
    Notes

Chapter 7: Describing the Enterprise
    Size of the Undertaking
    Enterprise Data Models Are All or Nothing
    The Data Model as a Panacea
    Metadata
    The Metadata Solution
    Master Data versus Metadata
    The Metadata Model
    XML Taxonomies
    Metadata Standards
    Collaborative Metadata
    Metadata Technology
    Data Quality Metadata
    History
    Executive Buy-in
    Notes

Chapter 8: A Model for Computing Based on Information Search
    Function-Centric Applications
    An Information-Centric Business
    Enterprise Search
    Security
    Metadata Search Repository
    Building the Extracts
    The Result
    Note

Chapter 9: Complexity, Chaos, and System Dynamics
    Early Information Management
    Simple Spreadsheets
    Complexity
    Chaos Theory
    Why Information Is Complex
    Extending a Prototype
    System Dynamics
    Data as an Algorithm
    Virtual Models and Integration
    Chaos or Complexity
    Notes

Chapter 10: Comparing Data Warehouse Architectures
    Data Warehousing
    Contrasting the Inmon and Kimball Approaches
    Quantity Implications
    Usability Implications
    Historical Data
    Summary
    Notes

Chapter 11: Layered View of Information
    Information Layers
    Are They Real?
    Turning the Layers into an Architecture
    The User Interface
    Selling the Architecture

Chapter 12: Master Data Management
    Publish and Subscribe
    About Time
    Granularity, Terminology, and Hierarchies
    Rule 1: Consistent Terminology
    Rule 2: Everyone Owns the Hierarchies
    Rule 3: Consistent Granularity
    Reconciling Inconsistencies
    Slowly Changing Dimensions
    Customer Data Integration
    Extending the Metadata Model
    Technology

Chapter 13: Information and Data Quality
    Spreadsheets
    Referencing
    Fit for Purpose
    Measuring Structured Data Quality
    A Scorecard
    Metadata Quality
    Extended Metadata Model
    Notes

Chapter 14: Security
    Cryptography
    Public Key Cryptography
    Applying PKI
    Predicting the Unpredictable
    Protecting an Individual’s Right to Privacy
    Securing the Content versus Securing the Reference

Chapter 15: Opening Up to the Crowd
    A Taxonomy for the Future
    Populating the Stakeholder Attributes
    Reducing E-mail Traffic within Projects
    Managing Customer E-mail
    General E-mail
    Preparing for the Unknown
    Third-Party Data Charters
    Information Is Dynamic
    Power of the Crowd Can Improve Your Data Quality
    Note

Chapter 16: Building Incremental Knowledge
    Bayesian Probabilities
    Information from Processes
    The MIT Beer Game
    Hypothesis Testing and Confidence Levels
    Business Activity Monitoring
    Note

Chapter 17: Enterprise Information Architecture
    Web Site Information Architecture
    Extending the Information Architecture
    Business Context
    Users
    Content
    Top-Down/Bottom-Up
    Presentation Format
    Project Resourcing
    Information to Support Decision Making
    Notes

Looking to the Future
About the Author
Index

Preface

This book is aimed at anyone who is in any way responsible for information. Executives, managers, and technical staff all need to understand how to manage this most valuable resource. I wrote this book based on the observation that the concept of information overload is permeating every business that I deal with. At the same time, the global economy is moving from products to services that are described almost entirely electronically. Even those businesses that are traditionally associated with making things are less concerned with the management of the manufacturing process (which is largely outsourced) than they are with the management of their intellectual property. Increasingly, information doesn’t provide a window on the business. It is the business. It’s a simple equation. Intellectual property is tied up in the data on computers. If it is the subject of focused management, then greater value is extracted from that data. If the intellectual property is a significant proportion of the value of the business, then such a focused effort will have a dramatic effect on the value of the business as a whole.
Such an effort will also make the organization much more enjoyable to work in with less time lost searching for information that should be readily available and less time sifting through irrelevant data that should never have hit the e-mail inbox.

As business has become more complex, techniques are appearing almost every day that seek to simplify the task of managing a large, multifaceted organization. Their quest is similar to a physicist looking for the single unifying equation that will define the universe. Any approach that recommends focusing on one part of the business must use a limited set of measures that aggregate complex data from across the enterprise. In providing a simple answer, detail and differentiation must be lost. A simple set of metrics by itself is no longer enough to sum up the millions or billions of moving parts that define the enterprise. Perhaps, then, it is time to gain a better understanding of the role of information in business.

While large quantities of information have been with us for as long as humans have gathered in groups, it has taken on a whole new dynamic form. The quantity of data has grown dramatically since the cost of computer storage dropped as it did at the end of the twentieth century. The growth has taken business management by surprise and the techniques that we use have not been able to keep up.

With little differentiation in the bricks-and-mortar assets, business needs to enhance its service and differentiate using the informational resources at its disposal. The winners tailor their product to the needs of their markets. Successful leaders have a deep insight into the running of their business. Such an insight can come only from accurate information.

In almost every organization, one or more executives have been assigned accountability for information governance, quality, or records.
Similarly, technologists are being asked to make sense of the mountains of data that exist in databases, file systems, and other repositories. This is a book about becoming an information-centric business and achieving significant benefits as a result.

Over many years, I have had the opportunity to work with hundreds of organizations in the private and government sectors. The issues that they face handling business information have a common theme of complexity. Questions that should be simple to answer take too long, reconciliations that should be exact aren’t, privacy that should be perfect isn’t, and security that should be tight is porous. Treating information as something that needs to be managed in its own right allows a profession of information managers to develop a common approach to information management. Without common techniques, many organizations have been ad hoc in their approach. The most successful, though, have borrowed approaches from other disciplines and been part of the evolution of a form of professional consensus. For that reason, I have been pleased over a number of years to be part of the leadership of the MIKE2.0 initiative. MIKE2.0 (Method for the Implementation of a Knowledge Enterprise) is an open collaboration of information management professionals from a variety of organizations seeking to develop a common approach. The content is entirely free under the Creative Commons licensing model. MIKE2.0 can be found at www.openmethodology.org. I have applied the techniques in this book in some of the world’s largest companies and government departments. They have also been effectively adopted in midsized and even small businesses. As a field grows in sophistication, so the knowledge needed by practitioners also increases. This book provides sufficient detail to allow anyone who deals with information to identify the right approach to apply without trying to be a step-by-step guide.
Armed with the knowledge within these pages, the reader can then adopt comprehensive methodologies like MIKE2.0 to develop detailed project plans or establish programs of work. Each chapter introduces a concept and in many cases provides both strategic and tactical advice. The strategic advice will help shape the future enterprise. The tactical advice will help solve immediate challenges. The reader should be left with the overwhelming message that information management is not the responsibility of the information technology department, nor is it able to be governed by any one line of business. Information is an asset with a very real economic value. It is the responsibility of everyone who in any way creates, handles, stores, or exploits this asset to ensure that they achieve the greatest possible value for the enterprise as a whole.

This is not the final book that will be written on this subject. The discipline will continue to develop as we all find better and more effective ways to run organizations to better create, handle, and exploit information. There is no single answer to the question of how you should manage your information resources, so apart from the MIKE2.0 site, I also encourage readers of this book to check in at www.infodrivenbusiness.com where additional references and comments will be posted.