植 物 分 类 与 资 源 学 报 ,33 ( ): 摇 2011 1 39~45 Plant Diversity and Resources : 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 DOI 10.3724/SP.J.1143.2011.10239 Botanical Literature Goes Global The Biodiversity : Heritage Library * Judith A.WARNEMENT BotanyLibraries HarvardUniversityHerbaria ( , ,22 DivinityAve.,Cambridge,MA,USA) Abstract : Scholars in the natural sciences rely on historic literature more than any other branch of science. Yet much of this material has limited global distribution and much of it is available in only a few select libraries. This wealth ofknowledgeisavailableonlytothosefewwhocangaindirectaccesstosignificantlibrarycollections,asitu鄄 ation that is considered one of the chief impediments to the efficiency of research in the field. Community support andnewtechnologiesledtotheformationoftheBiodiversityHeritageLibrary. TheBHLisaninternationalcollabora鄄 tion of naturalhistorylibrariesworkingtogethertomakebiodiversityliteratureavailableforusebythewidestpossible audience through open access and sustainable management. Key words : Biodiversity Heritage Library; Encyclopedia of Life; Botanical libraries; Digital libraries; Taxonomic intelligence; Taxonomic literature CLC number Document Code Article ID : G25摇 摇 摇 摇 摇 : A摇 摇 摇 摇 摇 摇 :2095-0845(2011)01-039-07 Background Missouri Botanical Garden agreeing to support the The biodiversity community is at the forefront of development of the technical infrastructure. By Feb鄄 developing international standards and applying new ruary of 2007 a formal organizational meeting was technologies to merge and expand historic datasets hosted by Harvard忆s Museum of Comparative Zoolo鄄 with current research, as exemplified by the Biodi鄄 gy, where the governance structure, operational versity Heritage Library (BHL) program. The idea plans, and working committees for the BHL were for this project started in March, 2005, as scien鄄 formed. Charter members included natural history tists, informatics experts, and librarians convened at museum libraries (American Museum of Natural a session entitled “Libraries and Laboratories冶 hos鄄 History; Field Museum; Natural History Museum, ted by the Natural History Museum in London to London; and Smithsonian Institution), botanical share ideas, goals, and concerns. One outcome was garden libraries (Missouri Botanical Garden; New a shared vision to build an integrated digital biodi鄄 York Botanical Garden; and Royal Botanic Gardens, versity library modeled after Botanicus, Missouri Bo鄄 Kew), as well as academic and research libraries tanical Garden忆s digital library. A follow鄄up organi鄄 (Harvard University忆s Botany Libraries; Ernst Mayr zational meeting was hosted by the Smithsonian Insti鄄 Library of the Museum of Comparative Zoology; and tution Library (SIL) in June of2006, where librari鄄 Marine Biological Laboratory/Woods Hole Oceano鄄 ans from major natural history, botanical garden, graphic Institution Library). Libraries representing and research institutions in the United States and the Academy of Natural Sciences (Philadelphia) Great Britain were invited to participate in a consor鄄 and California Academy of Sciences joined in 2008. tium that would build a global digital collection. All The Biodiversity Heritage Library portal (www. of the participants agreed to move forward, with the biodiversitylibrary. org; see Fig. 1) was officially * Authorforcorrespondence;E鄄mail:[email protected];phone(617) 495-2366;Fax (617) 495-8654 Receiveddate: 2010-12-13, Accepteddate: 2010-12-30 植 物 分 类 与 资 源 学 报 第 卷 摇40摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 33 launched in May,2007,with Botanicus as the under鄄 The Biodiversity Heritage Library is not a legal pinning of a rapidly expanding biodiversity library. entity, but a federation of libraries bound by memo鄄 randa of understanding. Members have signed agree鄄 ments and the library directors represent their re鄄 spective institutions on an Institutional Council. An elected Executive Committee conducts routine busi鄄 ness, and works closely with the three salaried posi鄄 tions that include an executive director, a technical director, and a collections coordinator. There are weekly teleconferences scheduled by the Executive Committee, monthly calls scheduled with the Institu鄄 tional Council, and there is one face鄄to鄄face meeting held each year. These two groups oversee policy and funding decisions, while the details are managed by Fig.1摇 TheBiodiveristyHeritageLibraryportal a variety of broadly representative committees. The Materials and methods scanning staff members teleconference weekly by phone and have been instrumental in developing the Partners in the Biodiversity Heritage Library tools that manage bidding, workflow, and quality (BHL) program are working together to digitize the control protocols. A collections committee monitors published literature of biodiversity held in their re鄄 the overall cohesiveness of BHL content, refines in鄄 spective collections, and to make that literature a鄄 gest criteria, and reviews all collections鄄related is鄄 vailable for open access as a part of a global “biodi鄄 sues. An active technical committee designs all as鄄 versity commons冶. The BHL program is the scan鄄 pects of the BHL global infrastructure, explores and ning and digitization component of the Encyclopedia engages in partnerships that will advance the project忆 of Life (EOL) (www. eol. org), Harvard Universi鄄 s mission. The project is supported by the Encyclo鄄 ty忆s Professor Edward O. Wilson忆s vision to create a pedia of Life budget with grants from the John D. web page for every species of the earth忆s biota. An鄄 and Catherine T. MacArthur Foundation and the Al鄄 other key collaborator is the Internet Archive (IA) fred P. Sloan Foundation, funds from EOL忆s five (www.archive.org), which is dedicated to “univer鄄 anchor institutions, the partner institutions, and oth鄄 sal access to human knowledge冶 and provides most er grants. BHL partners with low cost mass scanning, archival The BHL consortium is working with the global storage of files, image processing, and technology taxonomic community, rights holders and other inter鄄 development. Scanning facilities have been opened ested parties to ensure that this biodiversity heritage in London, Boston, New York, and Washington, is available to all and contributes to the International D. C. to assist with the BHL and other scanning Convention on Biological Diversity (CBD) and the projects. The IA also allows the BHL to “ingest冶 Global Biodiversity Information Facility (GBIF). other natural history content contributed by non鄄BHL “Taxonomic intelligence冶 is the inclusion of taxo鄄 partners like the California Digital Library, the Uni鄄 nomic practices, skills and knowledge within infor鄄 versity of Illinois at Urbana鄄Champaign, the Univer鄄 matics services to manage information about organ鄄 sity of Toronto, and the Boston Library Consortium. isms. Dubbed the Universal Biological Indexer and The partnership enriches the BHL collection and le鄄 Organizer, or uBio, BHL is using a sophisticated al鄄 verages limited scanning dollars. gorithm to locate likely name strings in OCR text, 期 1 摇 摇 摇 摇 Judith A.WARNEMENT: Botanical Literature Goes Global: The Biodiversity Heritage Library摇 摇 摇 摇 摇 摇 4摇1 has “discovered冶 10.7 million name strings in and yet essential to taxonomists忆 research. NameBank (Fig.2) (www.ubio.org/index.php? pa鄄 The technical team developed tools to coordi鄄 gename=namebank),and serves as a name thesaurus. nate scanning efforts and avoid duplication. A mer鄄 The link between the name service and the BHL col鄄 ged database of members忆 serials holdings was crea鄄 lection creates a powerful new tool for scholars. ted as a “bid list冶 so that each library can indicate the titles it intends to scan. If problems are discov鄄 ered, such as missing volumes, or pages, or illus鄄 trations, then a call goes out to other libraries to scan those volumes. Monographs are selected by each library along subject areas, and the BHL col鄄 lection is checked prior to scanning to avoid duplica鄄 tion. All items are barcoded and shipping manifests are created using a tool called WonderFetch (biodi鄄 versitylibrary.blogspot.com/2008/06/wonderfetchtm鄄 ia鄄metaxml鄄fields.htm). The partner libraries can populate fields with data that would not normally be populated as part of the standard IA process, and Fig.2摇 NameBankportal then store those values alongside each scanned item in the IA repository. The impetus for implementing Systematists and taxonomists need access to the WonderFetch was not just to automate the inclusion historic literature to support current research. The of essential data elements like the volume and issue cited half鄄life of publications in taxonomy and the information for serials, but to also capture due dili鄄 “decay rate冶 are longer than in any other scientific gence, rights, and licensing information related to discipline so the “current冶 biodiversity literature each item. Partner libraries underwrite all of the spans more than250 years. At the outset of the pro鄄 costs associated with identifying, processing, and ject, the BHL needed to calculate the scope of the shipping materials, and BHL grants support the costs biodiversity domain. OCLC (www.oclc. org/us/en/ associated with scanning and digital processing. default.htm), the major international library utility, Results and Discussion supported the BHL by merging all of the partners忆 catalog records into a database so OCLC忆s collection The BHL portal currently offers more than 44 analysis tool could be used to profile the overall col鄄 000 titles represented by nearly 86 000 volumes de鄄 lection. The outcome showed that biodiversity litera鄄 livering more than32 million pages of content. Users ture is represented by 1. 3 million catalog records. can search by simply browsing by author, title, or More than 800000 records describe monographs and subject, or can use the novel language, year of pub鄄 40 000 records describe journal titles, with 12 500 lication, and source map options (Fig.3). More re鄄 records representing current titles. About forty per鄄 fined searches can be achieved by using the search cent of the material was published prior to 1923, box that allows the user to search for a specific au鄄 generally placing it the public domain. Sixty鄄three thor, title, subject, or species names. It is the de鄄 percent of the records were for works in English, livery of search results that is unique to BHL. Spe鄄 and German was the second most frequent language cies names results are delivered as a bibliography (9%). The BHL忆s scanning efforts have focused on that cites the source title, author, date, and pages, the pre鄄1923 content that is not readily available, and includes a link to the NameBank record. The 植 物 分 类 与 资 源 学 报 第 卷 摇42摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 33 pages of any volume selected are automatically download the bibliographic record, selected pages, scanned by the uBio search feature for species images, or the entire volume. The menu also fea鄄 names. The results appear in the “Names on this tures PDF or OCR download options, and links to page冶 box on the lower left鄄hand corner of the views via other portals. In order to download select鄄 screen (Fig. 4). Links to EOL species pages are ed pages, users supply an email address, and a cita鄄 highlighted, and clicking on any of the discovered tion for the request, and then select up to one hun鄄 names will generate a species name bibliography. dred pages. These documents are retained when ap鄄 Searchers can click on any name in the uBio box to propriate metadata is provided and are made availa鄄 see a bibliography of all other occurrences of the ble to other users through CiteBank (citebank.org). name in the entire portal. For example, selecting Citebank (Fig.7) is still under development, but in Rhododendron indicum will generate the bibliography addition to saving the BHL鄄selected documents, it is shown in Fig.5 that includes links to all source ma鄄 intended to provide robust search and browse capa鄄 terials. bilities to biodiversity publications stored in multiple The “Download/About this book冶 tab appears international repositories and aggregate content from (Fig.6) when a title is displayed. Users are able to as many systems as possible, so that biodiversity ANaturalist Fig.3摇 PortraitandtitlepageinErnestH. Wilsons Fig.4摇 SpeciesnamesdiscoveredbyuBioappearinboxinlower inWesternChina . (London:Methuen,1913) leftcorner. NotethelinkstoEOLspeciespages Rhododendronindicum Fig.5摇 uBiogeneratedbibliographyfor Fig.6摇 BHLDownload/Aboutthisbooktab 期 1 摇 摇 摇 摇 Judith A.WARNEMENT: Botanical Literature Goes Global: The Biodiversity Heritage Library摇 摇 摇 摇 摇 摇 4摇3 researchers have a single point of access to published na (Fig. 10) (www. bhl鄄china. org/cms/en), and materials. CiteBank is also intended to provide a the Internet Archive installed a scanner in Beijing in storage platform for articles and documents that are the summer of 2010 to help build the BHL鄄China Australia digitized, but not yet online and offer a common sys鄄 collection. In , The Atlas of Living Aus鄄 tem for researchers to share specialized bibliogra鄄 tralia, funded by the Australian government忆s Na鄄 phies. Users will be able to upload, edit, and share tional Collaborative Research Infrastructure Strategy their own personal lists of references and citations, program joined BHL in June 2010 (www. ala. org. and these references will be linked to scanned con鄄 au). Additional partnerships, policies, tools, and tent in the Biodiversity Heritage Library portal. In tutorials are being explored and developed to refine addition, BHL offers to scan professional societies忆 the BHL to increasingly extend its global reach. publications or other publishers content at BHL忆s ex鄄 A great deal has been achieved through conven鄄 pense and integrate the content into the BHL. A鄄 tional mass scanning technologies and practices, but greements with nearly forty publishers have added a鄄 a significant portion of early biodiversity literature is bout one hundred titles to the collection. quite rare and valuable, sometimes fragile, and of鄄 The Biodiversity Heritage Library wiki (Fig.8) ten the book is too large or has folded maps or illus鄄 (biodivlib.wikispaces.com) presents a wealth of in鄄 trations that do not fit on conventional scanning Retooling Special Collec鄄 formation about the project, detailed instructions and beds. A planning grant, tions in the Age of Mass Digitization tutorials for using the various features, and lists of , awarded by the members and BHL staff. Developers忆 tools are de鄄 Institute of Museum and Library Services (IMLS) in scribed and documented, and the BHL忆 s opt鄄in 2008 allowed BHL partners to identify and develop a copyright feature is explained. The BHL also uses cost鄄effective and efficient large鄄scale digitization popular social media tools to connect with the pub鄄 workflow and to explore ways to enhance metadata lic, including Twitter, Facebook, and the BHL blog for library materials that are designated as “special (biodiversitylibrary. blogspot. com). Each site at鄄 collections.冶 The group held a series of meetings, tracts and supports a varied community of scientists communicated by email, and established a wiki to and the general public. record meetings, track progress, and share docu鄄 Interest and support in the BHL has grown at an ments about costs, statistics and workflows, and astonishing rate. In less than five years the BHL has small鄄scale scanning tests. The report included ex鄄 grown into an international partnership that mirrors tensive cost analyses and recommendations for e鄄 the global nature of biodiversity research. Formal quipment configurations to scan rare and oversized BHL agreements are in place in Europe, China, and materials. Australia, and there is strong interest in South A鄄 BHL partners are also exploring ways to intro鄄 Europe merica and Egypt. In , colleagues at twenty鄄 duce other essential content to the BHL portal. Col鄄 eight European institutions have obtained funding lectors忆 field notes, plant lists, and diaries often from the European Union eContentplus Program to hold important information that supplements content establish a BHL鄄Europe (www. bhl鄄europe. eu), found on specimen labels and published accounts. which is developing the technical infrastructure and Access to this primary source material is even more tools to deliver content from many scanning projects problematic to scholars because most archival collec鄄 throughout the continent. In the United Kingdom tions, if catalogued, are not described in very fine work is also proceeding via the BHL and Europeana detail. The United States National Herbarium and China (Fig.9) (www. europeana. eu/portal). In , Smithsonian Institution Archives have received a the Chinese Academy of Sciences supports BHL鄄Chi鄄 Cataloging Hidden Special Collections and Archives 植 物 分 类 与 资 源 学 报 第 卷 摇44摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 摇 33 Fig.7摇 CiteBankhomepage Fig.8摇 BHLwiki Fig.9摇 Europeanaportal Fig.10摇 BHL鄄Chinaportal ature Grant, from the Council on Library and Information will develop a system for integrating biological Resources (CLIR) to catalog all the field books, researchers忆 field and specimen notes with museum unpublished journals, loose notes, and sketches that specimens and related electronically published litera鄄 document field research related to all disciplines of ture. The enhanced and integrated access to biologi鄄 Exposing Biodiversity Field biology. The grant, cal data will serve a wide variety of users, and will Books and Original Expedition Journals at the Smith鄄 connect to other ongoing projects such as the Biodi鄄 sonian Institution , will also will build a cataloging versity Heritage Library. tool to and create a central repository so that other The Biodiversity Heritage Library will soon ben鄄 institutions can contribute their holdings. The en鄄 efit from another new collaboration. The Internation鄄 hanced level of description will improve access to al Association of Plant Taxonomists (IAPT) has giv鄄 these important research materials that are frequently en their permission to rescan and integrate with the difficult to discover and access remotely. BHL the monumental fifteen volume botanical bibli鄄 Taxonomic Literature nd Several BHL partners have been awarded an ography, , 2 ed. (TL鄄2). IMLS grant as a companion grant to the Smithsonian忆s The Smithsonian Institution Library has been awar鄄 Connecting Content A Collaboration CLIR proposal. : ded an Atherton Seidell Grant to accomplish the to Link Field Notes to Specimens and Published Liter鄄 scanning and design the schema. The BHL envisions 期 1 摇 摇 摇 摇 Judith A.WARNEMENT: Botanical Literature Goes Global: The Biodiversity Heritage Library摇 摇 摇 摇 摇 摇 4摇5 a dynamically linked “TL鄄3冶 that will connect cita鄄 the Biodiversity Heritage Library faces many challen鄄 tions to published references and allow for correc鄄 ges in the near future. Initial sources of funding end tions and the addition of new and expanded content. in 2012, and a plan for financial and digital sustain鄄 The Biodiversity Heritage Library has achieved ability must be formulated. The rapid international remarkable success in its relatively short existence. expansion of BHL presents new governance issues, The partners have demonstrated that independent increases the need for clear and focused standards, and geographically dispersed institutions can collabo鄄 and strategies to avoid duplication of effort. BHL is rate effectively, and have proven their ability to gen鄄 working to ensure the technical infrastructure for de鄄 erate significant financial support. The technical ac鄄 livering and preserving content through digitization complishments by a small team of talented and dedi鄄 and retrospective ingestion, as well as the ability to cated informatics specialists and the efficient and continue to deliver new services as needed by the collegial intra鄄institutional working groups are appar鄄 community. ent in the array of tools and services currently deliv鄄 Acknowledgements ered and under development via the various BHL in鄄 : The author wishes to acknowledge all terfaces. On June 27, 2010, the American Library of the BHL partners for their collegiality and dedication, and Dr. JinshuangMaandtheShanghaiChenshanBotanicalGar鄄 Association忆s Association for Library Collections & den and Plant Science Research Center for their support. Technical Services (ALCTS) awarded their Out鄄 standing Collaboration Citation to the BHL in recog鄄 References : nition of their outstanding collaborative partnership. Gwinn NE, Rinaldo C, 2009. The Biodiversity Heritage Library: The project has generated excitement in the in鄄 IFLA Jour鄄 SharingBiodiversity literature with the world [J]. ternational community and many opportunities to de鄄 nal 35 , :25—34摇 DOI:10.1177/0340035208102032 velop new partnerships and sources of funding. Soci鄄 InternationalAssociationofAquaticandMarineScienceLibrariesand ety journal publishers are enthusiastic about partici鄄 InformationCentersConference(35th:2009:Brugge,Belgium). IAMSLICProceedings摇 http://hdl.handle.net/1912/3787 pation in the BHL opt鄄in copyright model. The por鄄 Rinaldo C, 2009. The Biodiversity Heritage Library: exposing the tal has recorded nearly1.5 million visits since Janu鄄 Journal of Agricultural & Food Infor鄄 taxonomicliterature [J]. ary of2008, the taxonomic intelligence tool is highly mation 10 , :259—265 摇 DOI:10.1080/10496500903014669 effective, and there are high levels of OCR accuracy Rinaldo C, Norton C,2010. The Biodiversity Heritage Library: an in late 19th and 20th century printing. However, expandinginternationalcollaboration