
Enabling and Customizing Result Clustering

Result clustering groups the most relevant search results by similarity so that users can quickly identify pertinent documents on the same subject. Clusters are created by an algorithm that identifies similarities between document concepts, excerpts, or summaries (depending on the Clustering Source selected).

You can specify the minimum and maximum number of clusters to create as well as the maximum number of results to analyze for clustering. However, the exact number of clusters produced is determined by the variety of keywords and document types identified by the result clustering algorithm.
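
The relationship between the configured bounds and the number of clusters actually produced can be pictured with a minimal sketch. It assumes the algorithm first finds some "natural" number of groups in the results and then keeps that number within the configured minimum and maximum. The function name and the clamping behavior are illustrative assumptions, not the documented internals of the product.

```python
def effective_cluster_count(natural_groups: int, minimum: int, maximum: int) -> int:
    """Keep a hypothetical 'natural' group count within the configured bounds.

    natural_groups is an assumed intermediate value: the number of groups
    the algorithm would find if no bounds were configured.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Similar results yield few natural groups, so the minimum applies;
    # numerous, heterogeneous results yield many, so the maximum applies.
    return max(minimum, min(natural_groups, maximum))
```

Under this assumption, a query whose results fall into 6 natural groups yields 6 clusters with bounds of 3 and 10, while very similar results are raised to 3 clusters and very varied results are capped at 10.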

By default, result clustering is disabled in the index. When you enable it, the most relevant search results are grouped by similarity so that users can quickly identify pertinent documents on the same subject.

Note: Result clustering is used transparently in the .NET search interface (i.e. groups of results are not identified).

To enable and customize result clustering

  1. On the Coveo server, access the Administration Tool (see Opening the Administration Tool).

  2. In the Administration Tool, select Index > Result Clustering.

  3. In the Result Clustering page:

    1. In the Enabled section, select the Result clustering enabled check box.

    2. In the Minimum Number of Clusters box, enter the minimum number of clusters to create. The value must be between 1 and 1000. This minimum applies when the documents returned by a query are similar to one another.

    3. In the Maximum Number of Clusters box, enter the maximum number of clusters to create. The value must be between 1 and 1000. This maximum applies when the documents returned by a query are numerous and heterogeneous.

    4. In the Maximum Number of Results for Clustering box, enter the maximum number of results to analyze for clustering. The value must be between 1 and 1000. Documents that are not analyzed are sorted by relevance score. If a query returns fewer results than the number entered, all returned documents are analyzed for clustering.

      Note: If the value in the Minimum Number of Clusters box is high, the value in the Maximum Number of Results for Clustering box must also be high; otherwise, clusters containing only one result can be created, which makes it look to users as if clustering is not in use.

      Note: Clustering requires considerable CPU resources; analyzing more results than the default value of 100 can therefore slow down Coveo Enterprise Search (CES).

    5. In the Clustering Source drop-down list, select the source to use for clustering.

      The result clustering algorithm compares concepts, excerpts or summaries of documents:

      • Concepts are words recognized as important by linguistic algorithms.

      • Excerpts are groups of passages containing the queried terms.

      • Summaries are complete sentences recognized as important by an advanced linguistic algorithm. They are more precise than concepts, but extracting them requires more CPU resources. Summarization can be disabled for certain sources in order to speed up indexing.

    6. Click Apply Changes.
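
Before entering values in the Result Clustering page, it can help to check them against the constraints described in the steps above. The following is a minimal sketch only: the class, field names, and the default values for the cluster bounds are hypothetical placeholders and do not correspond to a CES API or to product defaults. The 1 to 1000 range, the default of 100 results, and the three clustering sources come from this topic. The check that the minimum does not exceed the maximum is an assumption, and the single-result warning assumes that at least the minimum number of clusters is created and that none is empty.

```python
from dataclasses import dataclass

LOWER, UPPER = 1, 1000        # range stated for all three numeric boxes
DEFAULT_MAX_RESULTS = 100     # default value mentioned in the CPU note
SOURCES = ("concepts", "excerpts", "summaries")


@dataclass
class ClusteringSettings:
    """Hypothetical container mirroring the fields of the Result Clustering page."""
    enabled: bool = False
    min_clusters: int = 2     # placeholder value, not a product default
    max_clusters: int = 10    # placeholder value, not a product default
    max_results: int = DEFAULT_MAX_RESULTS
    source: str = "concepts"


def check_settings(s: ClusteringSettings) -> list[str]:
    """Return errors and warnings for the settings; an empty list means none."""
    findings = []
    for name in ("min_clusters", "max_clusters", "max_results"):
        value = getattr(s, name)
        if not LOWER <= value <= UPPER:
            findings.append(f"error: {name}={value} is outside {LOWER}-{UPPER}")
    if s.min_clusters > s.max_clusters:
        # Assumption: the page does not state this rule explicitly.
        findings.append("error: min_clusters exceeds max_clusters")
    if s.source not in SOURCES:
        findings.append(f"error: unknown clustering source {s.source!r}")
    if s.max_results < 2 * s.min_clusters:
        # With fewer than two results per cluster on average, at least one
        # non-empty cluster must contain a single result.
        findings.append("warning: single-result clusters are likely")
    if s.max_results > DEFAULT_MAX_RESULTS:
        findings.append("warning: analyzing more than 100 results can slow down CES")
    return findings
```

For example, `check_settings(ClusteringSettings(min_clusters=80, max_clusters=90))` returns only the single-result warning, because 100 analyzed results cannot fill 80 clusters with two or more results each.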

What's Next?

Once result clustering is enabled in the index, you can refine search results in a .NET search interface using the Refine by Cluster facet (see Refining a Search by Cluster in a .NET Search Interface).
