Professional
Microsoft® Search
SharePoint® 2007 and Search Server 2008
Thomas Rizzo
Richard Riley
Shane Young
Wiley Publishing, Inc.
Professional Microsoft® Search:
SharePoint® 2007 and Search Server 2008
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2008 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-27933-5
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
Library of Congress Cataloging-in-Publication Data
Rizzo, Thomas, 1972-
Professional Microsoft SharePoint search / Thomas Rizzo, Richard Riley, Shane Young.
p. cm.
Includes index.
ISBN 978-0-470-27933-5 (paper/website)
1. Querying (Computer science)—Computer programs. 2. Business enterprises—Computer networks.
3. Intranet programming. 4. Microsoft SharePoint (Electronic resource) 5. Search engines— Computer
programs. 6. Internet searching—Computer programs. I. Riley, Richard, 1973- II. Young, Shane, 1977-
III. Title.
QA76.625.R58 2008
006.7'6—dc22
2008029091
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections
107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or
authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood
Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be
addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317)
572-3447, fax (317) 572-4355, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties
with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties,
including without limitation warranties of fitness for a particular purpose. No warranty may be created or
extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for
every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal,
accounting, or other professional services. If professional assistance is required, the services of a competent
professional person should be sought. Neither the publisher nor the author shall be liable for damages arising
herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source
of further information does not mean that the author or the publisher endorses the information the organization or
Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites
listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services please contact our Customer Care Department within
the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Wrox Programmer to Programmer, and related trade dress
are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other
countries, and may not be used without written permission. Microsoft and SharePoint are registered trademarks of
Microsoft Corporation in the United States and/or other countries. All other trademarks are the property of their
respective owners. Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic books.
About the Authors
Tom Rizzo is a director in the Microsoft SharePoint product management team. Before joining the
SharePoint team, Tom worked in the Microsoft Exchange and SQL Server product management teams.
Tom is the author of six development books on a range of Microsoft technologies.
Richard Riley is a senior technical product manager in the Microsoft SharePoint product management
team. He is responsible for driving Technical Readiness, both inside and outside Microsoft, and
specializes in Search Server 2008 and the Search features of SharePoint Server 2007. He has more than
seven years of experience at Microsoft and has worked as a consultant in Microsoft Consultancy
Services, and as a technical specialist in sales. He has over 10 years of industry experience and is a
frequent speaker at Microsoft Technical Events.
Shane Young is the owner of SharePoint911. He has over 12 years of experience designing and
administering large-scale server farms using Microsoft enterprise technologies. For the past three years,
he has been working exclusively with SharePoint products and technologies as a consultant and trainer
for www.SharePoint911.com. Shane has been recognized by Microsoft as an authority on SharePoint
and is among an elite group of Microsoft Office SharePoint Server 2007 MVPs. Shane also maintains a
popular SharePoint-focused blog, http://msmvps.com/blogs/shane, which contains a lot of beneficial
technical information about SharePoint administration.
About the Technical Editor
Andrew Edney has been an IT professional for more than twelve years and has worked for a range of
high-tech companies, including Microsoft, Hewlett-Packard, and Fujitsu Services. He has a wide range
of experience in virtually all aspects of Microsoft’s computing solutions, having designed and built large
enterprise solutions for government and private-sector customers. Andrew is also a well-known speaker
and presenter on a wide range of information systems subjects. He has appeared at the annual Microsoft
Exchange Conference in Nice. Andrew is currently involved in numerous Microsoft beta programs,
including next-generation Windows operating systems and next-generation Microsoft Office products,
and he actively participates in all Windows Media Center beta programs. In addition, Andrew has
written a number of books, including Windows Home Server User’s Guide (Apress, 2007), Pro LCS: Live
Communications Server Administration (Apress, 2007), Getting More from Your Microsoft Xbox 360 (Bernard
Babani, 2006), How to Set Up Your Home or Small Business Network (Bernard Babani, 2006), Using Microsoft
Windows XP Media Center 2005 (Bernard Babani, 2006), Windows Vista: An Ultimate Guide (Bernard Babani,
2007), PowerPoint 2007 in Easy Steps (Computer Step, 2007), Windows Vista Media Center in Easy Steps
(Computer Step, 2007) and Using Ubuntu Linux (Bernard Babani, 2007).
Credits
Acquisitions Editor
Katie Mohr
Development Editor
Christopher J. Rivera
Technical Editor
Andrew Edney
Production Editor
Debra Banninger
Copy Editor
Foxxe Editorial Services
Editorial Manager
Mary Beth Wakefield
Production Manager
Tim Tate
Vice President and Executive Group Publisher
Richard Swadley
Vice President and Executive Publisher
Joseph B. Wikert
Project Coordinator, Cover
Lynsey Stanford
Proofreader
Nancy Carrasco
Indexer
Jack J. Lewis
Acknowledgments
There are a lot of folks to acknowledge who helped make this book possible. If I miss anyone, I
apologize! First, I want to thank Jim Minatel, Katie Mohr, and Christopher Rivera at Wiley. The three of
them made this book possible and also pushed us along in the process at the right times. I also want to
thank our production editor Debra Banninger and our technical editor Andrew Edney. Both of them
made our words and technical concepts crystal clear. I also want to thank my coauthors who went on
this exciting and chaotic journey with me. Finally, I want to thank the SharePoint search team at
Microsoft. They are one of the most dedicated teams in delivering high-quality, customer-centric
solutions and are always willing to answer questions or provide feedback.
— Tom Rizzo
Writing a book takes much more than one person and a keyboard, and this one is no exception. I’d like to
say a huge thank you to the very patient team at Wiley, particularly Katie Mohr and Christopher Rivera,
and my coauthors, who I’m sure were all quietly tearing their hair out at my habitual lateness with
content (including this page). I’d also like to say a heartfelt thanks to my colleagues in the Search team at
Microsoft, whom I’ve repeatedly peppered with questions: Puneet Narula, Keller Smith, Sage Kitamorn,
Sid Shah, Dan Blood, Michal Gideoni, Dmitriy Meyerzon, Karen Beattie Massey, Dan Evers, and Brenda
Carter. Last, but definitely not least, a thank you to Steve Caravajal, who rescued me from a deep hole
with the People Search chapter — I owe you one.
— Richard Riley
I would like to thank the SharePoint MVPs, my friends on the Microsoft product team, and the awesome
staff at SharePoint911. I want to send out a special thanks to my wife, Nicola. Without her understanding
and support, writing two books at the same time would never have been possible. Also, I have to send a
shout out to my two dogs, Tyson and Pugsley. I am sure I missed out on several rounds of throwing the
ball while I was busy typing away, but through thick and thin, they lay at my feet. I love you, little
Sparky!
— Shane Young
Contents
Introduction xxvii
Chapter 1: Introduction to Enterprise Search 1
Why Enterprise Search 1
A Tale of Two Content Types 1
Security, Security, Security 2
Algorithms to the Rescue 2
We All Love the Web and HTTP 3
Conclusion 4
Chapter 2: Overview of Microsoft Enterprise Search Products 5
Enterprise Search Product Overviews 5
Windows Desktop Search/Windows Vista 5
Features in Windows Vista Search 6
Windows SharePoint Services 11
SharePoint Search Architecture 11
Crawling Content 12
Searching Content 13
Configuring Search 14
Platform Services 14
Microsoft Search Server 2008 16
Simplified Setup and Administration 16
Federation Capabilities 18
Different Editions of Search Server 2008 19
What about WSS and Microsoft Office SharePoint Server? 20
Microsoft Office SharePoint Server 20
People Search 20
Business Data Catalog 21
The Microsoft Filter Pack 23
Connectors for Documentum and FileNet 23
Windows Live 24
FAST and SharePoint 25
Other Server Products (Exchange, SQL) 25
Conclusion 25
Chapter 3: Planning and Deploying an Enterprise Search Solution 27
Key Components 27
The Index Role 27
The Query Role 28
The Shared Services Provider 28
The Database Server 29
Search Topologies 29
Single Server 29
A Small Farm 30
A Three-Server Farm 31
A Medium Server Farm 31
Larger Farms 32
Search Software Boundaries 33
Hardware Sizing Considerations 34
The Index Server 35
Query Servers 37
Database Servers 37
Testing 38
Performance Monitoring 39
Search Backups 41
Index Server Recovery Options 42
Using Federation to Scale? 43
Conclusion 44
Chapter 4: Configuring and Administering Search 45
Configuring Search from Central Administration 45
The Search Services 45
The Office SharePoint Server Search Service 46
Windows SharePoint Services Search 50
Manage Search Service 52
Manage Content Database — Search Server 2008 57
Configuring Search from the Shared Services Provider 58
Creating or Editing the SSP Settings 59
SSP Search Administration 59
The Default Content Source 60
Full versus Incremental Crawls 62
Search Schedule 63
Additional Content Sources 64
Interacting with a Content Source 65
Crawl Rules 66
Crawl Logs 70
File Types 72
Reset All Crawled Content 72
Search Alerts 72
Authoritative Pages 72
Federated Locations 73
Managed Properties 73
Shared Search Scopes 75
Server Name Mappings 77
Search Result Removal 78
Search Reporting 78
The Other Search Settings 79
Configuring Search Locally on the Server 80
IFilters 80
Installing the Microsoft Filter Pack 80
Maximum Crawl Size 82
Reset the Search Services 82
Crawling Case-Sensitive Web Sites 82
Diacritic-Sensitive Search 82
Conclusion 83
Chapter 5: Searching LOB Systems with the BDC 85
BDC Architecture and Benefits 85
The Application Definition File 87
XSD Schema File 87
BDC Definition Editor Tool 87
BDC Metadata Model Overview 88
MetadataObject Base Class 89
LobSystem 89
LobSystemInstances and LobSystemInstance 92
Entities and Entity Element 94
Identifiers and Identifier Element 95
Methods and Method Element 96
Parameters and Parameter Element 98
FilterDescriptors and FilterDescriptor Element 98
Actions, Action, and ActionParameter Elements 101
MethodInstance Element 101
TypeDescriptors, TypeDescriptor, DefaultValue Elements 106
Associations and Association Element 108
Complete BDC XML Samples 109
BDC Web Parts, Lists, and Filters 109
Business Data List Web Part 110
Business Data Related List Web Part 110
Business Data Item Web Part 111
Business Data Actions Web Part 112
Business Data Item Builder Web Part 112
BDC in SharePoint Lists 113
Modifying Your BDC Profile Page 116
Searching the BDC 117
Adding a Content Source for Crawling BDC Data 117
Mapping Crawled Properties 119
Create a Search Scope 119
SharePoint Designer and the BDC 120
The BDC API 122
The BDC Assemblies 122
The Microsoft.Office.Server Namespaces 123
Putting It Together: Building Custom Applications for the BDC 123
Connecting to the Shared Services Database 125
Displaying LOBSystemInstances 125
Working with Entities 125
Working with an Entity – Finders, Fields, and Methods 126
Executing a Method and Displaying the Results 126
Working with Associations and Actions 128
Troubleshooting the BDC 129
Conclusion 129
Chapter 6: User Profiles and People Search 131
User Profiles 131
Managing User Profiles 133
Profile Services Connections 133
Configuring Profile Imports 134
Profile Properties 138
Configuring Profile Properties 139
BDC Supplemental Properties 142
Configuring for BDC Import 143
People Search 145
The Search Center 145
People Search Page and Tab 146
Results Page 147