Working with Microsoft® FAST™ Search Server 2010 for SharePoint®

Book Description

Build robust, scalable search solutions to fit the way your business works

Deliver powerful search tools to your clients—using Microsoft FAST Search Server 2010 for SharePoint. Led by three search experts, you’ll learn how to deliver advanced intranet search capabilities and build custom, search-driven applications for your business. Use your skills as a SharePoint architect or developer to configure and program this server for different search scenarios, based on real-world examples.

Discover how to:

  • Integrate FAST Search Server for SharePoint into your existing SharePoint architecture

  • Use best practices to develop solutions specific to your business

  • Enable users to search millions of SharePoint documents efficiently

  • Master powerful indexing and data modification techniques

  • Expand document processing capabilities to handle data more effectively

  • Develop custom search applications and web parts

  • Configure your server for current content volume, and plan for future expansion

  • Manage search operations and monitor performance directly from SharePoint

Table of Contents

    1. Working with Microsoft® FAST™ Search Server 2010 for SharePoint®
    2. A Note Regarding Supplemental Files
    3. Foreword
    4. Introduction
      1. Who Should Read This Book
        1. Assumptions
      2. Who Should Not Read This Book
      3. Organization of This Book
        1. Finding Your Best Starting Point in This Book
      4. Conventions and Features in This Book
      5. System Requirements
      6. Code Samples
        1. Installing the Code Samples
        2. Using the Code Samples
      7. Acknowledgments from All the Authors
        1. Mikael Svenson’s Acknowledgments
        2. Marcus Johansson’s Acknowledgments
        3. Robert Piddocke’s Acknowledgments
      8. Errata & Book Support
      9. We Want to Hear from You
      10. Stay in Touch
    5. I. What You Need to Know
      1. 1. Introduction to FAST Search Server 2010 for SharePoint
        1. What Is FAST?
          1. Past
          2. Present
          3. Future
          4. Versions
            1. FSIS
              1. CTS, IMS, and FAST Search Designer
              2. Search Business Manager and IMS UI Toolkit
            2. FSIA
            3. FS4SP
        2. SharePoint Search vs. Search Server Versions, and FS4SP
          1. Features at a Glance
            1. Scalability
          2. Explanation of Features
            1. Item Processing
            2. Document Processing/Format Conversion
            3. Property Extraction
            4. Advanced Query Language (FQL)
            5. Duplicate Collapsing
            6. Linguistics
            7. Refiners
        3. What Should I Choose?
          1. Evaluating Search Needs
            1. Environment
              1. How many users do you have and how often do you expect each one to search?
              2. What type of and how much content do you have?
              3. How can you make your users take action on the search results?
              4. Where are you coming from and where do you want to go?
              5. How skilled are your users at searching?
            2. Corpus
              1. How much content do you have to search?
              2. Does everything need to be searchable?
              3. Are there multiple content sources?
              4. How much and in what divisions is the different content being consumed?
            3. Content
              1. What is the quality of the content?
              2. Is there rich metadata?
              3. What document types are there?
            4. Organization
              1. Does the content have an existing organization (taxonomy, site structure)?
              2. How difficult is it to navigate to the most relevant content vs. search?
          2. Decision Flowchart
          3. Features Scorecard
            1. Cost Estimator
        4. Conclusion
      2. 2. Search Concepts and Terminology
        1. Overview
          1. Relevancy
            1. Recall and Precision
            2. Query Expansion
            3. Corpus
            4. Rank Tuning
              1. Static rank tuning
              2. Dynamic rank tuning
            5. Linguistics
              1. Tokenization
              2. Keyword rank
            6. Rank Profiles
          2. SharePoint Components
            1. Web Front End
            2. Central Administration
            3. Search Service Applications
            4. Federated Search Object Model
            5. Query Web Service
            6. Query RSS Feed
            7. Query Object Model
            8. Web Parts
            9. People Search
        2. Content Processing
          1. Content Sources
          2. Crawling and Indexing
            1. Indexing
            2. Federated Search
            3. Document Processor
            4. FIXML
            5. Indexing Dispatcher
          3. Metadata
          4. Index Schema
        3. Query Processing
          1. QR Server
          2. Refiners (Faceted Search)
          3. Query Language
            1. Token Expressions
            2. Property Specifications
            3. Boolean Operators
          4. Search Scopes
          5. Security Trimming
          6. Claims-Based Authentication
        4. Conclusion
      3. 3. FS4SP Architecture
        1. Overview
        2. Server Roles and Components
          1. FS4SP Architecture
            1. SharePoint Search Service Applications
            2. FAST Search Connector (FAST Content SSA)
            3. FAST Search Query (FAST Query SSA)
            4. Crawl Components and Crawl Databases
          2. Search Rows, Columns, and Clusters
            1. Scenario
          3. FS4SP Index Servers
          4. FS4SP Query Result Servers/QR Server
        3. Conclusion
      4. 4. Deployment
        1. Overview
        2. Hardware Requirements
          1. Storage Considerations
            1. Disk Speed
            2. Disk Layout
            3. Using a SAN
            4. Using NAS
            5. Using SSD
          2. FS4SP and Virtualization
        3. Software Requirements
        4. Installation Guidelines
          1. Before You Start
          2. Software Prerequisites
          3. FS4SP Preinstallation Configuration
          4. FS4SP Update Installation
          5. FS4SP Slipstream Installation
          6. Single-Server FS4SP Farm Configuration
          7. Deployment Configuration
          8. Multi-Server FS4SP Farm Configuration
          9. Manual and Automatic Synchronization of Configuration Changes
          10. Certificates and Security
          11. Creating FAST Content SSAs and FAST Query SSAs
          12. Enabling Queries from SharePoint to FS4SP
          13. Creating a Search Center
          14. Scripted Installation
          15. Advanced Filter Pack
          16. IFilter
        5. Replacing the Existing SharePoint Search with FS4SP
        6. Development Environments
          1. Single-Server Farm Setup
          2. Multi-Server Farm Setup
          3. Physical Machines
          4. Virtual Machines
          5. Booting from a VHD
        7. Production Environments
          1. Content Volume
          2. Failover and High Availability
          3. Query Throughput
          4. Freshness
          5. Disk Sizing
          6. Server Load Bottleneck Planning
        8. Conclusion
      5. 5. Operations
        1. Introduction to FS4SP Operations
          1. Administration in SharePoint
          2. Administration in Windows PowerShell
          3. Other Means of Administration
            1. Using Configuration Files
            2. Using Command-Line Tools
        2. Basic Operations
          1. The Node Controller
            1. Starting and Stopping a Single-Server Installation
            2. Starting and Stopping a Multi-Server Installation
            3. Starting and Stopping Internal Processes
            4. Relationship Between the Node Controller and the FS4SP Windows Services
            5. Adding and Removing Document Processors
            6. A Note on Node Controller Internals
          2. Indexer Administration
            1. Suspending and Resuming Indexing Operations
            2. Rebuilding a Corrupt Index
          3. Search Administration
          4. Search Click-Through Analysis
            1. Checking the Status of SPRel
            2. Reconfiguring SPRel
          5. Link Analysis
            1. Checking the Status of the Web Analyzer
            2. Forcing the Web Analyzer to Run
            3. Listing the Relevance Data for a Specific Item
        3. Server Topology Management
          1. Modifying the Topology on the FS4SP Farm
            1. Adding One or More Index Columns Without Recrawling
          2. Modifying the Topology on the SharePoint Farm
          3. Changing the Location of Data and Log Files
        4. Logging
          1. General-Purpose Logs
            1. Windows Event Logs
            2. SharePoint Unified Logging Service (ULS)
            3. Internal FS4SP Logs
          2. Functional Logs
            1. Item Processing Logging
              1. Inspect runtime statistics
              2. Turn on debug and tracing for the running document processors
              3. Inspect crawled properties in the indexing pipeline by using FFDDumper
            2. Crawl Logging
            3. Query Logging
        5. Performance Monitoring
          1. Identifying Whether an FS4SP Farm Is an Indexing Bottleneck
          2. Identifying Whether the Document Processors Are the Indexing Bottleneck
          3. Identifying Whether Your Disk Subsystem Is a Bottleneck
        6. Backup and Recovery
          1. Prerequisites
          2. Backup and Restore Configuration
            1. Performing a Configuration Backup
            2. Performing a Configuration Restore
          3. Full Backup and Restore
            1. Performing a Full Backup
            2. Performing a Full Restore
            3. Incremental Backup
            4. Speeding Up Backups
        7. Conclusion
    6. II. Creating Search Solutions
      1. 6. Search Configuration
        1. Overview of FS4SP Configuration
          1. SharePoint Administration
          2. Windows PowerShell Administration
          3. Code Administration
            1. Code Example Assumptions
          4. Other Means of Administration
        2. Index Schema Management
          1. The Index Schema
          2. Crawled and Managed Properties
            1. SharePoint
            2. Windows PowerShell
            3. .NET
          3. Full-Text Indexes and Rank Profiles
            1. SharePoint
            2. Windows PowerShell
            3. .NET
          4. Managed Property Boosts
            1. Windows PowerShell
            2. .NET
          5. Static Rank Components
            1. Windows PowerShell
        3. Collection Management
          1. Windows PowerShell
          2. .NET
        4. Scope Management
          1. SharePoint
          2. Windows PowerShell
          3. .NET
        5. Property Extraction Management
          1. Built-in Property Extraction
            1. SharePoint
            2. Windows PowerShell
            3. .NET
        6. Keyword, Synonym, and Best Bet Management
          1. Keywords
            1. SharePoint
            2. Windows PowerShell
            3. .NET
          2. Site Promotions and Demotions
            1. SharePoint
            2. Windows PowerShell
            3. .NET
          3. FQL-Based Promotions
            1. Windows PowerShell
        7. User Context Management
          1. SharePoint
          2. Windows PowerShell
          3. Adding More Properties to User Contexts
        8. Conclusion
      2. 7. Content Processing
        1. Introduction
        2. Crawling Source Systems
          1. Crawling Content by Using the SharePoint Built-in Connectors
            1. Content Source Types
              1. SharePoint sites
              2. Web Sites
              3. File Shares
              4. Exchange Public Folders
              5. BCS
              6. Custom Repository
              7. Documentum
              8. Lotus Notes
            2. Crawl Rules Management
            3. Crawler Impact Rules Management
          2. Crawling Content by Using the FAST Search Specific Connectors
            1. Other FAST Search Specific Functionality for Indexing Content
              1. The Content API
              2. The docpush command-line tool
            2. FAST Search Web Crawler
              1. Basic operations
              2. Crawl configuration
              3. Crawled properties
            3. FAST Search Database Connector
              1. Basic operations
              2. Crawl configuration
              3. Crawled properties
            4. FAST Search Lotus Notes Connector
              1. Basic operations
              2. Crawl configuration
              3. Crawled properties
          3. Choosing a Connector
        3. Item Processing
          1. Understanding the Indexing Pipeline
          2. Optional Item Processing
            1. The OptionalProcessing.xml Configuration File
            2. Built-in Property Extraction
              1. Person names
              2. Locations
              3. Companies
            3. Custom Property Extraction
            4. Custom XML Item Processing
            5. Offensive Content Filtering
            6. Debugging Item Processing
            7. Extended Metadata Extraction of Word and PowerPoint Documents
            8. Document Conversion
          3. Integrating an External Item Processing Component
            1. Configuration
            2. Developing an External Item Processing Component
            3. The Mythical PEWS Framework
        4. Conclusion
      3. 8. Querying the Index
        1. Introduction
        2. Query Languages
          1. Keyword Query Syntax
          2. FQL
            1. Overview
            2. Usage
              1. Field/Scope specification
              2. Wildcard expressions
              3. Reserved words and characters
              4. Simple Query Language
              5. XRANK
            3. Examples
          3. Search Center and RSS URL Syntax
        3. Search APIs
          1. Querying a QR Server Directly
          2. Federated Search Object Model
            1. SharedQueryManager and QueryManager
            2. LocationList
            3. Location
            4. FASTSearchRuntime
            5. Executing a Query by Using QueryManager, LocationList, and Location
            6. Getting the Correct Total When Using Duplicate Trimming
            7. Executing a Query by Using FASTSearchRuntime
          3. Query Object Model
            1. Keyword Query
            2. Executing a Query by Using the KeywordQuery Class
            3. Executing a Query by Using FQL with the KeywordQuery Class
          4. Query Web Service
            1. Executing a Query by Using the Query Web Service
          5. Query via RSS
          6. Choosing Which API to Use
        4. Conclusion
      4. 9. Useful Tips and Tricks
        1. Searching Inside Nondefault File Formats
          1. Installing Third-Party IFilters
        2. Extending the Expiration Date of the FS4SP Self-Signed Certificate
        3. Replacing the Default FS4SP Certificate with a Windows Server CA Certificate
        4. Removing the FAST Search Web Crawler
        5. Upgrading from SharePoint Search to FS4SP
          1. Reducing the Downtime When Migrating from SharePoint Search to FS4SP
        6. Improving the Built-in Duplicate Removal Feature
        7. Returning All Text for an Indexed Item
        8. Executing Wildcard Queries Supported by FQL
        9. Getting Relevancy with Wildcards
        10. Debugging an External Item Processing Component
          1. Inspecting Crawled Properties by Using the Spy Processor
          2. Using the Visual Studio Debugger to Debug a Live External Item Processing Component
        11. Using the Content of an Item in an External Item Processing Component
        12. Creating an FQL-Enabled Core Results Web Part
        13. Creating a Refinement Parameter by Using Code
        14. Improving Query Suggestions
          1. Adding, Removing, and Blocking Query Suggestions
          2. Security Trimming Search Suggestions
          3. Displaying Actual Results Instead of Suggestions
          4. Creating a Custom Search Box and Search Suggestion Web Service
        15. Preventing an Item from Being Indexed
          1. Using List, Library, and Site Permission to Exclude Content
          2. Using Crawl Rules
          3. Creating Custom Business Rules
        16. Creating a Custom Property Extractor Dictionary Based on a SharePoint List
        17. Crawling a Password-Protected Site with the FAST Search Web Crawler
        18. Configuring the FAST Search Database Connector to Detect Database Changes
        19. Conclusion
      5. 10. Search Scenarios
        1. Productivity Search
          1. Introduction to Productivity Search
          2. Contoso Productivity Search
            1. Setting Up a Search Center
            2. Setting Up Department User Context
            3. Creating a Department Refiner
            4. Promoting Items for Coworkers by Using an FQL-Enabled Web Part
            5. Promoting Items for Coworkers by Using Predefined Scopes
            6. Demoting Site Collection Root and Site Home
            7. Adding a Products Refiner
            8. Using Visual Best Bets to Redirect to a Landing Page
            9. Adding Relevant Documents to a Page by Using Enterprise Keywords
          3. Productivity Search Example Wrap-Up
        2. E-Commerce Search
          1. Introduction to E-Commerce Search
          2. Adventure Works E-Commerce
            1. Setting Up an External Content Type for Indexing
            2. Setting Up a New Content Source
            3. Defining Properties, Full-Text Index, and Rank Profile
            4. Setting Up a Basic Search Storefront
            5. Promoting High-Margin Products
              1. Alternative 1: Extend the Search Core Results Web Part with formula sort
              2. Alternative 2: Add the product margin as a static rank component
            6. Setting Up a Price Refiner Conditional on the Product Language
            7. Promoting Items
            8. Multiple Search Setting Groups
          3. E-Commerce Example Wrap-Up
    7. A. About the Authors
    8. Index
    9. About the Authors
    10. Copyright