
Limiting the Size of the Coveo index

CES is designed to help organizations index millions of documents. When disk space is limited, you can customize CES to index only pertinent information (e.g., the roots of words) while leaving out non-essential data (e.g., plural variations of words). Because each organization has different requirements, the index options can be fully customized. CES also provides maintenance tools to keep the index as compact as possible.

Note: Coveo Platform 7 features a self-optimizing index, so you no longer need to schedule index optimization tasks (see About the Index Self-Optimization Process).

The following table describes different actions that you can take in the Administration Tool to limit the index size.

Where What How
Add Source page

Index > Sources and Collections page > Sources section, click Add to get to the Add Source page (see Adding a Source).

When indexing Local/Network Files, SharePoint, Documentum, Exchange Crawler, Enterprise Vault or ODBC sources, do not index subfolders or subsites.
  1. In the Source Type drop-down list, select the appropriate source type.

  2. In the Options section, clear the Index subfolders check box.

  3. Complete the indexing of the source (see Adding a Source).

Do not generate a Quick View version of the indexed documents. In the Options section, clear the Generate a cached HTML version of indexed documents check box.
Select a field set that contains no custom fields or only a limited number of pertinent custom fields. In the Fields section, select the appropriate field set (see What Are Field Sets? and Adding or Modifying Custom Fields).
Filters page
  1. Access the Sources and Collections page (Index > Sources and Collections).

  2. In the Sources section, expand the appropriate source drop-down list.

  3. Select Edit Filters.

Add exclusion filters to exclude documents that do not need to be indexed.
  1. Click Add an Exclusion Filter.

  2. In the Excluded Pattern box, enter the addresses of the documents to exclude (one entry per line, use wildcards if necessary).

  3. Click Save.

(see Adding or Modifying Source Filters)
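As a rough illustration of how wildcard exclusion patterns narrow down what gets indexed, the following Python sketch filters a list of document addresses. It assumes shell-style wildcards as implemented by Python's `fnmatch` module; the actual matching rules used by CES may differ, and the addresses, patterns, and the `is_excluded` helper are hypothetical.

```python
from fnmatch import fnmatchcase


def is_excluded(address, patterns):
    """Return True if the address matches at least one exclusion pattern.

    Assumption: patterns use shell-style wildcards ('*' matches any run of
    characters). This is an illustration, not the CES matching algorithm.
    """
    return any(fnmatchcase(address, pattern) for pattern in patterns)


# Hypothetical exclusion patterns, one entry per line as in the Excluded Pattern box.
patterns = [
    "file://server/share/archive/*",
    "*.tmp",
]

# Hypothetical document addresses found by the crawler.
addresses = [
    "file://server/share/reports/q1.docx",
    "file://server/share/archive/2009/old.docx",
    "file://server/share/reports/~lock.tmp",
]

# Only addresses that match no exclusion pattern would be indexed.
kept = [a for a in addresses if not is_excluded(a, patterns)]
print(kept)
```

Testing patterns against a sample of real addresses in this way, before saving the filter, can help confirm that a wildcard is neither too broad nor too narrow.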

Advanced page of Web sources
  1. Access the Sources and Collections page (Index > Sources and Collections).

  2. In the Sources section, expand the appropriate Web source drop-down list.

  3. Select Edit Advanced.

Restrict crawling to one level (only the main page) or two levels (the main page and the pages directly linked to it). In the Crawling section, select Restrict crawling to X levels and enter the appropriate number of levels (see Modifying Advanced Source Parameters).
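To make the meaning of crawl levels concrete, here is a minimal Python sketch of a breadth-first traversal that stops at a given level, where level 1 is the main page and level 2 adds the pages it links to directly. The link graph and the `pages_within_levels` function are hypothetical and only model the idea; they do not reflect how the CES Web crawler is implemented.

```python
from collections import deque


def pages_within_levels(links, start, max_levels):
    """Return the pages reachable from start within max_levels levels.

    Level 1 is the start page itself; level 2 adds the pages it links to,
    and so on. `links` maps each page to the pages it links to.
    """
    if max_levels < 1:
        return []
    seen = {start}
    queue = deque([(start, 1)])
    visited = []
    while queue:
        page, level = queue.popleft()
        visited.append(page)
        if level == max_levels:
            continue  # Do not follow links beyond the last allowed level.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, level + 1))
    return visited


# Hypothetical site structure.
links = {
    "home": ["products", "about"],
    "products": ["product-a"],
    "about": [],
    "product-a": [],
}

print(pages_within_levels(links, "home", 1))  # ['home']
print(pages_within_levels(links, "home", 2))  # ['home', 'products', 'about']
```

In this toy site, restricting to two levels leaves out `product-a`, which shows how a lower level setting reduces the number of pages that end up in the index.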
Query History page (Reports > Query History)

or

Index History page (Reports > Index History)

Export the reports to Excel, and then delete the report history on the Coveo server.
  1. In the Fixed drop-down list, select the appropriate time period.

  2. Click Update. The report appears.

  3. Click Export. The File Download box appears.

  4. Click Save.

  5. Access the Settings page (Reports > Settings).

  6. In the Index History or Query History section, beside Delete index history older than or Delete query history older than, select the appropriate date (so that only the exported reports are deleted).

  7. Click Delete index history older than or Delete query history older than.

Settings page (Logs > Settings) Keep logs for less than 90 days. In the Log Archive section, select the Keep logs for the last X days option and enter the appropriate number of days.
Languages page

Access the Converter Managers page (Configuration > Converters) and click Languages in the navigation panel on the left.

Do not index documents if their language is not recognized by CES (see Supported Languages). In the Language Detection section, select Reject the document (see Modifying How Documents Written in Unrecognized Languages Are Indexed).