STRUCTURE OF THE GLOBAL NANOSCIENCE AND NANOTECHNOLOGY RESEARCH LITERATURE

Dr. Ronald N. Kostoff
Office of Naval Research
875 N. Randolph St.
Arlington, VA 22217
Phone: 703-696-4198
Fax: 703-696-8744
Internet: [email protected]

Mr. Ray Koytcheff
Office of Naval Research
975 N. Randolph St.
Arlington, VA 22217

Dr. Clifford GY Lau
Institute for Defense Analyses
4850 Mark Center Drive
Alexandria, VA 22311

KEYWORDS
Nanoparticle; Nanotube; Nanostructure; Nanocomposite; Nanowire; Nanocrystal; Nanofiber; Nanofibre; Nanosphere; Nanorod; Nanotechnology; Nanocluster; Nanocapsule; Nanomaterial; Nanofabrication; Nanopore; Nanoparticulate; Nanophase; Nanopowder; Nanolithography; Nano-Particle; Nanodevice; Nanodot; Nanoindent; Nanolayer; Nanoscience; Nanosize; Nanoscale; Information Technology; Text Mining; Bibliometrics; Citation Analysis; Computational Linguistics; Document Clustering; Correlation Map; Factor Matrix.

DISCLAIMER
(The views in this paper are solely those of the authors, and do not represent the views of the Department of the Navy or any of its components, or the Institute for Defense Analyses)

Report Documentation Page (Standard Form 298, Rev. 8-98; Form Approved OMB No. 0704-0188)
Report date: 2006
Report type: N/A
Title and subtitle: Structure of the Global Nanoscience and Nanotechnology Research Literature
Performing organization: Office of Naval Research, Dr. Ronald N. Kostoff, 875 N. Randolph St., Arlington, VA 22217
Distribution/availability statement: Approved for public release, distribution unlimited
Supplementary notes: The original document contains color images.
Security classification of report, abstract, and this page: unclassified
Limitation of abstract: SAR
Number of pages: 1491

ABSTRACT
Text mining was used to extract technical intelligence from the open source global nanotechnology and nanoscience research literature. An extensive nanotechnology/nanoscience-focused query was applied to the Science Citation Index/Social Science Citation Index (SCI/SSCI) databases. The nanotechnology/nanoscience research literature technical structure (taxonomy) was obtained using computational linguistics, document clustering, and factor analysis. The nanotechnology/nanoscience research literature infrastructure (prolific authors, key journals/institutions/countries, most cited authors/journals/documents) for each of the clusters generated by the document clustering algorithm was obtained using bibliometrics.
Another novel addition was the use of phrase auto-correlation maps to show technical thrust areas based on phrase co-occurrence in Abstracts, and the use of phrase-phrase cross-correlation maps to show technical thrust areas based on phrase relations due to the sharing of common co-occurring phrases. The use of factor matrices quantified further the strength of the linkages among institutions and among countries, and validated the co-publishing networks shown graphically on the maps.

The ~400 most cited nanotechnology papers since 1991 were grouped, and their characteristics generated. Whereas the main analysis provided technical thrusts of all nanotechnology papers retrieved, analysis of the most cited papers allowed their unique characteristics to be displayed.

The instrumentation literature associated with nanoscience and nanotechnology research was examined. About 65000 nanotechnology records for 2005 were retrieved from the Science Citation Index/Social Science Citation Index (SCI/SSCI), and ~27000 of those were identified as instrumentation-related. All the diverse instruments were identified, and the relationships among the instruments, and among the instruments and the quantities they measure, were obtained. Metrics associated with research literatures for specific instruments/instrument groups were generated.

The Applications literature associated with nanoscience and nanotechnology research was examined. Through visual inspection of the Abstract phrases of the same ~65000 downloaded 2005 records, all the diverse non-medical Applications were identified, and the relationships among the non-medical Applications, both direct and indirect, were obtained. Metrics associated with research literatures for specific Applications/Applications groups were generated.

For medical Applications, a fuzzy clustering algorithm was applied to the ~65000 downloaded 2005 records. A sub-network that encompassed all the medical Applications was identified.
Again, metrics associated with research literatures for specific medical applications were generated.

EXECUTIVE SUMMARY

Introduction

Nanotechnology is booming! In the global fundamental nanotechnology research literature, as represented by the Science Citation Index/Social Science Citation Index (SCI/SSCI) (SCI, 2006), global nanotechnology publications grew dramatically in the last two decades. Because of this exponential growth of the global open nanotechnology literature, there is a need for an integrated quantitative perspective on the state of this literature.

In 2003-2005, a comprehensive text mining study was performed to overview the technical structure and infrastructure of the global nanotechnology research literature, as well as the seminal nanotechnology literature (Kostoff et al., 2005a, 2005b, 2006a, 2006b). Based on the wide-scale interest generated by these reports, it was decided to update and expand the study using more recent data, a much more comprehensive query, and more sophisticated analytical tools.

In the updated study, text mining was used to extract technical intelligence from the open source global nanotechnology and nanoscience research literature (SCI/SSCI databases). Identified were: (1) the nanotechnology/nanoscience research literature infrastructure (prolific authors, key journals/institutions/countries, most cited authors/journals/documents); (2) the technical structure (pervasive technical thrusts and their inter-relationships); (3) nanotechnology instruments and their relationships; (4) potential nanotechnology applications; and (5) potential health impacts and applications. A comprehensive literature survey of the seminal works in nanotechnology is contained in Appendix 1.

The results of this updated text mining study are divided into four main sections: Infrastructure; Technical Structure; Instrumentation; and Applications. In turn, Applications are divided into non-medical and medical.
The results will be presented in the order listed above. Infrastructure describes the performers of nanoscience/nanotechnology research at different levels, ranging from individual to national performers, and it includes the archived literature as well. Technical Structure identifies the pervasive technical thrusts (and their inter-relationships) of the nanoscience/nanotechnology literature. Instrumentation provides both the infrastructure and technical structure of the sub-set of the nanoscience/nanotechnology literature that addresses specific instruments. Finally, Applications provides the infrastructure and taxonomy of the sub-set of the nanoscience/nanotechnology literature that addresses specific non-medical and medical applications.

ES1. INFRASTRUCTURE

ES1.1. Country Publications

• Global nanotechnology research article production has exhibited exponential growth for more than a decade (see Figure ES1).
• The most rapid growth over that time period has come from East Asian nations, notably China and South Korea (see Figure ES2).
• Some of this apparent rapid growth (in China, for example) is due to 1) a country's researchers publishing a non-negligible fraction of total papers in domestic low Impact Factor journals, and 2) these journals being accessed recently by the SCI/SSCI, rather than to increased sponsorship or productivity.
• China's representation in high Impact Factor journals is small, but increasing.
• From 1998 to 2002, China's ratio of high impact nanotechnology papers to total nanotechnology papers doubled, placing China at parity for this metric with the advanced nations of Japan, Italy, and Spain.
• The US remains the leader in aggregate nanotechnology research article production.
• In some selected nanotechnology sub-areas, China has achieved parity or taken the lead (see Figure ES3 for nanocomposites example).
• South Korea started even further behind than China in both total nanotechnology publications and highly cited papers, but it has advanced rapidly to become a second-tier contender in total and highly cited papers.

FIGURE ES1 – SCI/SSCI ARTICLES VS TIME
[Figure: total records retrieved, SCI articles vs. time. x-axis: year; y-axis: number of articles, 0 to 70000.]

FIGURE ES2 – COUNTRY COMPARISON TIME TREND (number of articles vs. time)
[Figure: series for USA, Japan, Germany, France, England, PR China, and South Korea. x-axis: 1991, 1998, 2005; y-axis: number of articles, 0 to 16000.]

FIGURE ES3 – # PAPERS CONTAINING "NANOCOMPOSITE*"
[Figure: series for China, USA, Japan, and South Korea. x-axis: time (1991, 1994, 1998, 2002, 2005); y-axis: % of papers, 0.0% to 45.0%.]

ES1.2. Country Citations

• There is a clear distinction between the publication practices of the three most prolific Western nations and the three most prolific East Asian nations. The Western nations publish in journals with almost twice the weighted average Impact Factors of the East Asian nations. Much of the difference stems from the East Asian nations publishing a non-negligible amount in domestic low Impact Factor journals, while the Western nations publish in higher Impact Factor international journals.
• The two countries that lead in production of the most cited nanotechnology papers are the US (126) and Germany (31). Together, the US and Germany account for forty percent of the most cited nanotechnology papers.
• The high paper volume East Asian countries of China and South Korea account for two percent of the most cited nanotechnology papers.
• Despite the increased paper productivity from East Asian countries, the US continues to generate the most cited nanotechnology papers.

ES1.3. Institution and Journal Citations

• Of the thirty institutions publishing the most nanotechnology papers, four are from the US, whereas of the twenty-five institutions producing the most cited nanotechnology papers, twenty-one are in the US.
• The top-tier institutions producing cited papers are Harvard University (27), University of California Berkeley (23), Rice University (17), and University of California Santa Barbara (16).
• The two journals that overwhelmingly contain the most cited nanotechnology papers since 1991 are Science (56) and Nature (37).

ES1.4. Country Collaborations

• The dominant country co-publishing network is a complex web of mainly European nations roughly following geographic lines: Nordic, Central Europe, Eastern Europe, and a Western Europe/Latin American group of Romance language nations. There is also a UK component country network, but it is not linked to the interconnected continental members of the European Union (see Figure ES4).
• Correlation of countries by common thematic interest shows two major poles: the US and China. The US pole is strongly connected thematically to a densely connected network of English-speaking North American representatives, Western/Central European nations, and most of the East Asian allies. China is relatively isolated except for India, and the Eastern European and Latin American representatives are outside the main network as well.
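
As a rough illustration of how a country co-publishing network like the one described in ES1.4 might be tabulated, the minimal Python sketch below counts the papers shared by each pair of countries and normalizes each count by the geometric mean of the two countries' paper totals. The function names, the sample records, and the normalization are all assumptions made for illustration; they are not the method or the data used in this study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def copublication_counts(records):
    """Count papers per country and papers shared by each country pair.

    `records` is a list of country lists, one list per paper
    (a hypothetical input format, assumed for this sketch).
    """
    pair_counts = Counter()
    paper_counts = Counter()
    for countries in records:
        unique = sorted(set(countries))  # one vote per country per paper
        paper_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return pair_counts, paper_counts


def link_strength(pair_counts, paper_counts):
    """Normalize shared-paper counts by the geometric mean of paper totals.

    This cosine-style normalization is an assumption of the sketch.
    """
    return {
        pair: shared / sqrt(paper_counts[pair[0]] * paper_counts[pair[1]])
        for pair, shared in pair_counts.items()
    }


# Made-up records, for illustration only.
records = [
    ["Sweden", "Norway"],
    ["Sweden", "Norway", "Denmark"],
    ["Spain", "Mexico"],
    ["USA"],
    ["Spain", "Mexico"],
    ["Sweden", "Denmark"],
]

pairs, papers = copublication_counts(records)
strength = link_strength(pairs, papers)

# List the links from strongest to weakest.
for pair, value in sorted(strength.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

Pairs with high normalized strength would be drawn as close neighbors on a co-publishing map, while a country that never co-publishes (the single-country record here) contributes no links at all.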