
The Fifteenth Text Retrieval Conference TREC 2006

2007·10.2 MB·English

NIST Special Publication 500-272
National Institute of Standards and Technology
U.S. Department of Commerce

Information Technology: The Fifteenth Text Retrieval Conference TREC 2006

Ellen M. Voorhees and Lori P. Buckland, Editors
Information Technology Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899

The National Institute of Standards and Technology was established in 1988 by Congress to "assist industry in the development of technology ... needed to improve product quality, to modernize manufacturing processes, to ensure product reliability ... and to facilitate rapid commercialization ... of products based on new scientific discoveries."

NIST, originally founded as the National Bureau of Standards in 1901, works to strengthen U.S. industry's competitiveness; advance science and engineering; and improve public health, safety, and the environment. One of the agency's basic functions is to develop, maintain, and retain custody of the national standards of measurement, and provide the means and methods for comparing standards used in science, engineering, manufacturing, commerce, industry, and education with the standards adopted or recognized by the Federal Government.

As an agency of the U.S. Commerce Department, NIST conducts basic and applied research in the physical sciences and engineering, and develops measurement techniques, test methods, standards, and related services. The Institute does generic and precompetitive work on new and advanced technologies. NIST's research facilities are located at Gaithersburg, MD 20899, and at Boulder, CO 80303. Major technical operating units and their principal activities are listed below. For more information visit the NIST Website at http://www.nist.gov, or contact the Publications and Program Inquiries Desk, 301-975-NIST.
Office of the Director
• Baldrige National Quality Program
• Public and Business Affairs
• Civil Rights and Diversity
• International and Academic Affairs

Technology Services
• Standards Services
• Measurement Services
• Information Services
• Weights and Measures

Advanced Technology Program
• Economic Assessment
• Information Technology and Electronics
• Chemistry and Life Sciences

Manufacturing Extension Partnership Program
• Center Operations
• Systems Operation
• Program Development

Electronics and Electrical Engineering Laboratory
• Semiconductor Electronics
• Optoelectronics¹
• Quantum Electrical Metrology
• Electromagnetics

Materials Science and Engineering Laboratory
• Intelligent Processing of Materials
• Ceramics
• Materials Reliability¹
• Polymers
• Metallurgy
• NIST Center for Neutron Research

NIST Center for Neutron Research

Nanoscale Science and Technology

Chemical Science and Technology Laboratory
• Biochemical Science
• Process Measurements
• Surface and Microanalysis Science
• Physical and Chemical Properties²
• Analytical Chemistry

Physics Laboratory
• Electron and Optical Physics
• Atomic Physics
• Optical Technology
• Ionizing Radiation
• Time and Frequency¹
• Quantum Physics¹

Manufacturing Engineering Laboratory
• Precision Engineering
• Manufacturing Metrology
• Intelligent Systems
• Fabrication Technology
• Manufacturing Systems Integration

Building and Fire Research Laboratory
• Materials and Construction Research
• Building Environment
• Fire Research

Information Technology Laboratory
• Mathematical and Computational Sciences²
• Advanced Network Technologies
• Computer Security
• Information Access
• Software Diagnostics and Conformance Testing
• Statistical Engineering

¹ At Boulder, CO 80303
² Some elements at Boulder, CO

NIST Special Publication 500-272

Information Technology: The Fifteenth Text Retrieval Conference TREC 2006

Ellen M. Voorhees and Lori P. Buckland, Editors
Information Access Division
Information Technology Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899

October 2007

U.S. Department of Commerce
Carlos M. Gutierrez, Secretary

National Institute of Standards and Technology
James M. Turner, Acting Director

Reports on Information Technology

The Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) stimulates U.S. economic growth and industrial competitiveness through technical leadership and collaborative research in critical infrastructure technology, including tests, test methods, reference data, and forward-looking standards, to advance the development and productive use of information technology. To overcome barriers to usability, scalability, interoperability, and security in information systems and networks, ITL programs focus on a broad range of networking, security, and advanced information technologies, as well as the mathematical, statistical, and computational sciences. This Special Publication 500-series reports on ITL's research in tests and test methods for information technology, and its collaborative activities with industry, government, and academic organizations.

National Institute of Standards and Technology Special Publication 500-272
Natl. Inst. Stand. Technol. Spec. Publ. 500-272, 177 pages (October 2007)

Certain commercial entities, equipment, or materials may be identified in this document in order to describe an experimental procedure or concept adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the entities, materials, or equipment are necessarily the best available for the purpose.

Foreword

This report constitutes the proceedings of the 2006 Text REtrieval Conference, TREC 2006, held in Gaithersburg, Maryland, November 14-17, 2006. The conference was co-sponsored by the National Institute of Standards and Technology (NIST) and the Disruptive Technology Office (DTO). Approximately 175 people attended the conference, including representatives from 17 countries.
The conference was the fifteenth in an ongoing series of workshops to evaluate new technologies for text retrieval and related information-seeking tasks. The workshop included plenary sessions, discussion groups, a poster session, and demonstrations.

Because the participants in the workshop drew on their personal experiences, they sometimes cite specific vendors and commercial products. The inclusion or omission of a particular company or product implies neither endorsement nor criticism by NIST. Any opinions, findings, and conclusions or recommendations expressed in the individual papers are the authors' own and do not necessarily reflect those of the sponsors.

I gratefully acknowledge the tremendous work of the TREC program committee and the track coordinators.

Ellen Voorhees
September 24, 2007

TREC 2006 Program Committee

Ellen Voorhees, NIST, chair
James Allan, University of Massachusetts at Amherst
Chris Buckley, Sabir Research, Inc.
Gordon Cormack, University of Waterloo
Susan Dumais, Microsoft
Donna Harman, NIST
Bill Hersh, Oregon Health & Science University
David Lewis, David Lewis Consulting
John Prager, IBM
Steve Robertson, Microsoft
Mark Sanderson, University of Sheffield
Ian Soboroff, NIST
Karen Sparck Jones, University of Cambridge, UK
Richard Tong, Tarragon Consulting
Ross Wilkinson, CSIRO

TREC 2006 Proceedings

Foreword
Listing of contents of Appendix
Listing of papers, alphabetical by organization
Listing of papers, organized by track
Abstract

Overview Papers

Overview of TREC 2006
E.M. Voorhees, National Institute of Standards and Technology (NIST)

Overview of the TREC 2006 Blog Track
I. Ounis, C. Macdonald, University of Glasgow
M. de Rijke, G. Mishne, University of Amsterdam
I. Soboroff, NIST

Overview of the TREC 2006 Enterprise Track
I. Soboroff, NIST
A.P. de Vries, CWI
N. Craswell, Microsoft Cambridge

TREC 2006 Genomics Track Overview
W. Hersh, A.M. Cohen, P. Roberts, H.K. Rekapalli, Oregon Health & Science University

TREC 2006 Legal Track Overview
J.R. Baron, National Archives and Records Administration
D.D. Lewis, David D. Lewis Consulting
D.W. Oard, University of Maryland

Overview of the TREC 2006 Question Answering Track
H.T. Dang, NIST
J. Lin, University of Maryland, College Park
D. Kelly, University of North Carolina, Chapel Hill

TREC 2006 Spam Track Overview
G. Cormack, University of Waterloo

The TREC 2006 Terabyte Track
S. Buttcher, C.L.A. Clarke, University of Waterloo

Other Papers
(Contents of these papers are found on the TREC 2006 Proceedings CD.)

ASU at TREC 2006 Genomics Track
L. Tari, G. Gonzalez, R. Leaman, S. Nikkila, R. Wendt, C. Baral, Arizona State University

BUPT at TREC 2006: Enterprise Track
Z. Ru, Q. Li, W. Xu, J. Guo, Beijing University of Posts and Telecommunications

BUPT at TREC 2006: Spam Track
Z. Yang, W. Xu, B. Chen, W. Xu, J. Guo, Beijing University of Posts and Telecommunications

Knowledge Transfer and Opinion Detection in the TREC 2006 Blog Track
H. Yang, J. Callan, Carnegie Mellon University
L. Si, Purdue University

Case Western Reserve University at the TREC 2006 Enterprise Track
A.D. Troy, G.-Q. Zhang, Case Western Reserve University

Combining Language Model with Sentiment Analysis for Opinion Retrieval of Blog-Post
X. Liao, D. Cao, S. Tan, Y. Liu, G. Ding, X. Cheng, Chinese Academy of Sciences

Social Network Structure Behind the Mailing Lists: ICT-IIIS at TREC 2006 Expert Finding Track
H. Chen, H. Shen, J. Xiong, S. Tan, X. Cheng, Chinese Academy of Sciences

PSM: A New Re-Ranking Algorithm for Named-Page
J. Guo, L. Ding, G. Zhang, Y. Liu, X. Cheng, Chinese Academy of Sciences

Window-based Enterprise Expert Search
W. Lu, H. Zhao, Wuhan University, China and City University
S. Robertson, Microsoft Research
S. Robertson, A. Macfarlane, City University, London

BioKI, A General Literature Navigation System at TREC Genomics 2006
S. Bergler, J. Schuman, J. Dubuc, A. Lebedev, Concordia University

Concordia University at the TREC 15 QA Track
L. Kosseim, A. Beaudoin, A. Keighbadi, M. Razmara, Concordia University

Seven Hypothesis about Spam Filtering
The CRM114 Team

Deep Context with a Sense-of-Self
R. McArthur, CSIRO ICT Team

Using Semantic Relations with World Knowledge for Question Answering
K. Kan Lo, W. Lam, The Chinese University of Hong Kong

Correlating Topic Rankings and Person Rankings to Find Experts
T. Westerveld, CWI

MonetDB/X100 at the 2006 TREC Terabyte Track
S. Heman, M. Zukowski, A. de Vries, P. Boncz, CWI

DalTREC 2006 QA System Jellyfish: Regular Expressions Mark-and-Match Approach to Question Answering
V. Keselj, T. Abou-Assaleh, N. Cercone, Dalhousie University

DUTIR at TREC 2006: Genomics and Enterprise Tracks
Z. Yang, H. Lin, Y. Li, L. Xu, Y. Pan, B. Liu, Dalian University of Technology

QACTIS Enhancements in TREC QA 2006
P. Schone, U.S. Department of Defense
G. Ciany, Dragon Development Corporation
R. Cutts, Henggeler Computer Consultants
P. McNamee, J. Mayfield, T. Smith, Johns Hopkins Applied Physics Laboratory

Question Answering by Diggery at TREC 2006
S. Tomlinson, Diggery

Dublin City University at the TREC 2006 Terabyte Track
P. Ferguson, A.F. Smeaton, P. Wilkins, Dublin City University

Fuzzy Term Proximity With Boolean Queries at 2006 TREC Terabyte Task
A. Mercier, M. Beigbeder, Ecole Nationale Superieure des Mines de Saint Etienne

Concept Based Document Retrieval for Genomics Literature
D. Trieschnigg, University of Twente
W. Kraaij, TNO
M. Schuemie, Erasmus MC

OSBF-Lua - A Text Classification Module for Lua: The Importance of the Training Method
Fidelis Assis

Judging Expertise-WIM at Enterprise
C. Lin, J. Niu, Fudan University Shanghai

Using Profile Matching and Text Categorization for Answer Extraction in TREC Genomics
H. Zheng, C. Lin, L. Huang, J. Xu, J. Zheng, Q. Sun, J. Niu, Fudan University

FDUQA on TREC 2006 QA Track
Y. Zhou, X. Yuan, J. Cao, X. Huang, L. Wu, Fudan University

InsunQA06 on QA Track of TREC 2006
Y. Zhao, Z.-M. Xu, P. Li, Y. Guan, Harbin Institute of Technology

SVM-Based Spam Filter with Active and Online Learning
Q. Wang, Y. Guan, X. Wang, Harbin Institute of Technology

Highly Scalable Discriminative Spam Filtering
M. Bruckner, P. Haider, T. Scheffer, Humboldt Universität zu Berlin

Juru at TREC 2006: TAAT versus DAAT in the Terabyte Track
D. Carmel, E. Amitay, IBM Haifa Research Lab

IBM in TREC 2006 Enterprise Track
J. Chu-Carroll, G. Averboch, P. Duboue, D. Gondek, J.W. Murdock, J. Prager, IBM T.J. Watson Research Center
P. Hoffmann, J. Wiebe, University of Pittsburgh

I2R at TREC 2006 Genomics Track
N. Yu, Y. Lingpeng, Z. Tie, S. Jian, J. Donghong, Institute for Infocomm Research

IIT TREC 2006: Genomics Track
J. Urbain, N. Goharian, O. Frieder, Illinois Institute of Technology

WIDIT in TREC 2006 Blog Track
K. Yang, N. Yu, A. Valerio, H. Zhang, Indiana University, Bloomington

Reconstructing DIOGENE: ITC-irst at TREC 2006
M. Negri, M. Kouylekov, B. Magnini, B. Coppola, ITC-irst

Towards Practical PPM Spam Filtering: Experiments for the TREC 2006 Spam Track
A. Bratko, Jozef Stefan Institute and Klika
B. Filipic, Jozef Stefan Institute
B. Zupan, University of Ljubljana

Combining Vector-Space and Word-Based Aspect Models for Passage Retrieval
R. Wan, I. Takigawa, H. Mamitsuka, Kyoto University
V. Ngoc Anh, The University of Melbourne

L3S Research Center at TREC 2006 Enterprise Track
S. Chernov, G. Demartini, J. Gaugaz, L3S Research Center

Question Answering with LCC's CHAUCER at TREC 2006
A. Hickl, J. Williams, J. Bensley, K. Roberts, Y. Shi, B. Rink, Language Computer Corporation

A Temporally Enhanced PowerAnswer in TREC 2006
D. Moldovan, M. Bowden, M. Tatu, Language Computer Corporation

LexiClone Lexical Cloning Systems
I.S. Geller, LexiClone

AnswerFinder at TREC 2006
D. Molla, M. van Zaanen, L. Pizzato, Macquarie University

IO-Top-k at TREC 2006: Terabyte Track
H. Bast, D. Majumdar, R. Schenkel, M. Theobald, G. Weikum, Max-Planck-Institut für Informatik

Question Answering Experiments and Resources
B. Katz, G. Marton, S. Felshin, D. Loreto, B. Lu, F. Mora, O. Uzuner, M. McGraw-Herdeg, N. Cheung, A. Radul, Y. Shen, G. Zaccak, MIT Computer Science and Artificial Intelligence Laboratory
