Text copyright © 2017 by Garson O’Toole

All rights reserved.

No part of this work may be reproduced, or stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without express written permission of the publisher.

Published by Little A, New York
www.apub.com

Amazon, the Amazon logo, and Little A are trademarks of Amazon.com, Inc., or its affiliates.

ISBN-13: 9781503933415 (hardcover)
ISBN-10: 1503933415 (hardcover)
ISBN-13: 9781503933408 (paperback)
ISBN-10: 1503933407 (paperback)

Cover design by Rex Bonomelli

First edition

In memory of my brother, Stephen

CONTENTS

START READING
INTRODUCTION: THE DETECTIVE, THE DATABASE, AND THE MECHANISMS OF ERROR
I GROUP ERROR
SYNTHESIS
VENTRILOQUY
PROVERBIAL WISDOM
II READER ERROR
TEXTUAL PROXIMITY
REAL-WORLD PROXIMITY
SIMILAR NAMES
III AUTHOR ERROR
CONCOCTIONS
HISTORICAL FICTION
IV FINDERS KEEPERS
CAPTURE
HOST
INDEX (BY NAME)
INDEX (BY QUOTATION)
ABOUT THE AUTHOR

INTRODUCTION
THE DETECTIVE, THE DATABASE, AND THE MECHANISMS OF ERROR

How did I begin investigating the dubious origins of familiar quotations? I will tell you: In the 1990s I developed an enthusiasm for electronic books. I felt they had enormous potential to advance access to the world’s knowledge cheaply and efficiently. Massive digital libraries, every book in the public domain, shared worldwide at low cost on the Internet: the thought excited in me a desire to learn more about the potential applications of such technology.

Meanwhile, a pioneer in electronic publishing named Brad Templeton had assembled a groundbreaking CD-ROM, Hugo and Nebula Anthology 1993, a digital collection of five novels and numerous short stories, at the time the first e-book of contemporary writing.
The works were all nominees for the Hugo and Nebula Awards, the top literary prizes for science fiction. The CD-ROM cost about as much as a hardcover book these days, $29.95. The forward-thinking fans of science fiction were ideal customers for this innovative collection, and as one of them, I couldn’t resist. I purchased the CD-ROM as the harbinger of the future. But Templeton’s project was many years too early and, sadly, never caught on.

In the early 2000s I read and commented on articles at a website devoted to e-reading called TeleRead. At the prompting of the founder, David Rothman, I began to contribute articles. Rothman envisioned a “well-stocked national digital library.” He had been propounding the idea for over a decade in op-ed pieces published in periodicals such as Computerworld, and I was glad to find someone who shared the same viewpoint about the magnificent potential of electronic reading.

It would soon turn out that a central part of this idea was already being pursued by a former graduate student of Stanford University. Larry Page, the cocreator of Google, shared this dream of constructing a searchable digital library, one that might contain all the books of the world. In time he found himself in a position to do something about it. Google engineers developed efficient machines using multiple cameras and sensors to create scans of page after page of one volume after another, and in 2002 the machines were put to work. Beginning at his undergraduate alma mater at the University of Michigan, Page would pursue the stacks of the major research libraries in the United States and the United Kingdom. The Google Books database now contains more than thirty million searchable books.

The fantasy became reality very quickly. Despite complex questions of copyright that threatened, for a time, to shut the database down, the futurist library had been built. I wondered even then how one could prove to others the power and value of such an expansive library.
The whole of linguistic history at one’s fingertips: What is one to do? Searching for words, phrases, and statements would reveal a trove of connections and citations. But what could one learn?

To test the utility of the search procedure, I decided to explore the history of a hex that only sounds like a blessing: May you live in interesting times. The saying fit the project. Robert F. Kennedy had once labeled the statement a “Chinese curse” when he employed it during a commencement speech given at the University of Cape Town, South Africa, in 1966. Indeed, others have called it “ancient.” (More recently, Hillary Clinton included the saying in her memoir Living History.)

When I sought in 2007 to answer the question of where the saying had come from, I discovered that many people had already explored its history. Wikipedia volunteers had created an entry on the topic with an initial citation in 1950, and this entry functioned as my benchmark. If I could determine that the statement had appeared before 1950, then I would write an article for TeleRead illustrating the effectiveness and potency of performing research with Google Books.

It took only a few strokes of the keyboard for me to locate within the Google Books database a 1930 citation for a short story in a magazine called Astounding Science Fiction. Had it really taken only my single search to best Wikipedia? No. This was my rocky introduction to the complexity and difficulty of searching within large textual databases like Google Books. While cross-checking the date of the issue in question, I determined that the story I had found, “U-Turn” by Duncan H. Munro, had in fact been published in 1950. I was befuddled. Why was Google Books supplying incorrect dates? The tight