Best Practices for Ranking Optimization
You can optimize the ranking of results returned by CES using a few CES features. Document authors and people responsible for archiving documents can also adjust their document management process to improve document ranking.
Ranking optimization using CES features
-
CES ranking parameters
CES allows you to fine-tune ranking parameters according to specific needs or requirements of your organization (see Customizing Search Results Ranking).
-
Multiple sources
In CES, you can assign a ranking score to each source according to the general estimated relevance of the documents it contains. Rating a source above others favors documents found in this source.
Example: The source for the content in a legacy content management system (CMS) might be given a lower ranking score than a source for similar content in the new CMS.
As much as possible, divide your content into multiple sources rather than one large source. This way you can set the rating separately for each source and refine the ranking.
Example: When indexing a network file server containing folders for each department, rather than creating one source for the whole file server, create one source for each department folder.
-
Top results
Some frequent, badly formulated queries may fail regardless of how the ranking is tuned. A CES mechanism named Top Results may help overcome this problem (see About Top Results in .NET Search Interfaces and Adding Top Results to a CES Index). This feature allows you to set a specific document or item to appear at the top of the results for one query or a set of queries. The rest of the search results list is ranked normally.
-
Query Ranking Expressions
You can add query ranking expressions (QRE) to a search interface to adjust the search results ranking only for this interface (see What Are Query Ranking Expressions?).
Ranking optimization by adjusting the document management process
-
File and folder naming conventions
Some of the ranking parameters are related to the document path. The path is the concatenation of several folder names and one filename. To ensure ranking accuracy, choose suitable and meaningful names for folders and files. A simple way to clarify names is to insert separating characters between words within file and folder names.
Example: The query super would not be matched against c:\Superaudio\docs but it would against c:\Super_audio\docs or c:\Super audio\docs.
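The matching behavior in this example can be illustrated with a minimal Python sketch. It assumes that a path is split into whole terms at any non-alphanumeric character and that a query matches only whole terms. This is an assumption made for illustration, not a description of the actual CES tokenizer, and the function names path_terms and matches are hypothetical.

```python
import re


def path_terms(path):
    """Split text into lowercase terms at any non-alphanumeric character.

    Assumption: separators such as '\\', '_', ':' and spaces end a term.
    """
    return {t.lower() for t in re.split(r"[^0-9A-Za-z]+", path) if t}


def matches(query, path):
    """Return True when every query word is a whole term of the path."""
    return path_terms(query) <= path_terms(path)


print(matches("super", r"c:\Superaudio\docs"))   # False: only "superaudio" is a term
print(matches("super", r"c:\Super_audio\docs"))  # True: "_" separates "super" and "audio"
print(matches("super", r"c:\Super audio\docs"))  # True: the space separates the words
```

Under this assumption, Superaudio is indexed as the single term superaudio, so the query super finds no whole-term match until a separator is added.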
Whenever possible, in the path or filename, use both the acronym and its corresponding meaning, such as c:\companies\Super car audio – SCA\docs. As a result, both the query Super car audio and SCA match this path.
-
Organizational vocabulary and metadata
There are often equivalences among the terms used by people within an organization.
Example: The acronym SCA might be used instead of Super car audio. Nonetheless, the query SCA would not match any document that only contains Super car audio and never mentions the acronym SCA.
To remedy this problem, you can create metadata to be indexed whenever such terms are used interchangeably (see What Is Metadata/Meta-Information?). Many document formats (HTML, PDF, Word…) provide ways to include metadata. Refer to the documentation of the software used to create each file format to learn how to include appropriate metadata and set appropriate values.
Example: An HTML file that contains SCA in the body text can have a META KEYWORDS tag set to Super car audio.
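As a minimal sketch of this example, the following Python snippet defines a small hypothetical HTML page that mentions only SCA in its body and carries Super car audio in a META KEYWORDS tag, then reads the keywords back with the standard library html.parser module. The page content and the KeywordExtractor class are illustrative assumptions; the snippet does not show how CES itself extracts metadata.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collect the content attribute of every <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            self.keywords.append(attr.get("content") or "")


# Hypothetical page: the body uses only the acronym, the metadata spells it out.
PAGE = """<html>
<head>
  <title>SCA order form</title>
  <meta name="keywords" content="Super car audio">
</head>
<body><p>Order your SCA products here.</p></body>
</html>"""

parser = KeywordExtractor()
parser.feed(PAGE)
print(parser.keywords)  # ['Super car audio']
```

With the keywords indexed alongside the body text, both SCA and Super car audio become terms associated with the same document.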
-
Document titles
Users naturally search for a document expecting to find a specific document title, so authors must choose short and accurate titles. CES uses the document title separately from the rest of the content in the ranking process. In the title score calculation, the proportion of the title that matches the query is taken into account.
Example: For the query Order Form, a document entitled Super Car Audio – SCA Car Audio – International Order Form Access Page would get fewer title points than a document entitled International Order Form, since a larger part of the title of the latter document matches the query.
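The effect of title length in this example can be made concrete with a minimal Python sketch that computes the fraction of title words that also appear in the query. The actual CES title score formula is not given here, so this ratio and the function name title_match_ratio are assumptions used only to illustrate the proportion idea.

```python
import re


def words(text):
    """Return the lowercase alphanumeric words of a text, in order."""
    return re.findall(r"[0-9a-z]+", text.lower())


def title_match_ratio(query, title):
    """Fraction of title words that also appear in the query (illustrative only)."""
    title_words = words(title)
    if not title_words:
        return 0.0
    query_words = set(words(query))
    matched = sum(1 for w in title_words if w in query_words)
    return matched / len(title_words)


long_title = "Super Car Audio – SCA Car Audio – International Order Form Access Page"
short_title = "International Order Form"

print(round(title_match_ratio("Order Form", long_title), 2))   # 0.18 (2 of 11 words)
print(round(title_match_ratio("Order Form", short_title), 2))  # 0.67 (2 of 3 words)
```

Both titles contain the two query words, but they make up a much larger share of the short title, which is why a concise title is favored.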
Authors must also properly set the title in the properties of each file format. When no title metadata exists in a document, CES guesses that the first sentence is the title. With a proper title property, a query specifically searching for a given document title is almost guaranteed to return the right result in the first positions.
Example: In Microsoft Word, the title must be entered in the Title field in document properties.