
Modifying How CES Handles a Document Type

Document types describe how CES handles each file encountered during indexing. In a document type set, you can add or remove document types and, for each document type, configure what CES does with files of that type.

To modify how CES handles files of a specific document type

  1. On the Coveo server, access the Administration Tool (see Opening the Administration Tool).

  2. Access the Document Type Sets page (Configuration > Document Types).

  3. In the Document Type Sets page, click the document type set that you want to modify.

  4. In the Document Types page, click the document type that you want to modify.

  5. In the configuration page for the selected document type:

    1. Modify the appropriate parameters:

      File Extensions

      Enter one or more file extensions corresponding to the document type, separated by semicolons.

      Example: For help files: .chm;.hlp
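To illustrate the semicolon-separated format, here is a minimal Python sketch that splits such a value into individual extensions. It is not CES code: the function name `parse_extensions` is hypothetical, and the normalization it applies (trimming spaces, lowercasing, adding a missing leading dot, dropping duplicates) is an assumption, not documented CES behavior.

```python
def parse_extensions(value: str) -> list[str]:
    """Split a semicolon-separated extension list into normalized entries.

    Hypothetical helper: the normalization rules are assumptions made for
    this illustration, not a description of how CES reads the field.
    """
    extensions: list[str] = []
    for part in value.split(";"):
        ext = part.strip().lower()
        if not ext:
            continue  # skip empty entries, e.g. after a trailing semicolon
        if not ext.startswith("."):
            ext = "." + ext  # assumed: extensions are stored with a leading dot
        if ext not in extensions:
            extensions.append(ext)  # keep first occurrence, preserve order
    return extensions


# The help-file example from this page.
help_file_extensions = parse_extensions(".chm;.hlp")
```

Checking a value this way before pasting it into the File Extensions box can catch stray spaces or doubled semicolons.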

      Action

      Select the appropriate indexing action taken by CES for this document type (see What Is the Difference between Indexing by Reference and Indexing by Content?):

      Index entire document

      Indexes the whole content of the document. This is called indexing by content.

      Index file information only

      Only indexes file metadata. This is called indexing by reference.

      Reject document

      Does not index the document.

      Indexing Failure Action

      Select the action taken when the document is corrupted and cannot be indexed:

      Index file information only

      Only indexes file metadata.

      Reject document

      Does not index the document.
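The interaction between Action and Indexing Failure Action can be sketched as a small decision function. This is an illustrative model only: the names `Action` and `resolve_outcome` are hypothetical, and the sketch assumes the failure action comes into play only when CES attempts to index the entire document, which this page does not state explicitly.

```python
from enum import Enum


class Action(Enum):
    # Labels mirror the options shown in the Administration Tool.
    INDEX_ENTIRE_DOCUMENT = "Index entire document"          # indexing by content
    INDEX_FILE_INFO_ONLY = "Index file information only"     # indexing by reference
    REJECT_DOCUMENT = "Reject document"


def resolve_outcome(action: Action, failure_action: Action, is_corrupted: bool) -> Action:
    """Return what effectively happens to one document (hypothetical model)."""
    if failure_action is Action.INDEX_ENTIRE_DOCUMENT:
        # The Indexing Failure Action list offers only the two other options.
        raise ValueError("failure action must be file information only or reject")
    # Assumption: the fallback applies only when content indexing was attempted.
    if action is Action.INDEX_ENTIRE_DOCUMENT and is_corrupted:
        return failure_action
    return action
```

For example, with Action set to Index entire document and Indexing Failure Action set to Index file information only, a corrupted file would still be findable by its metadata under this model.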

      Converter

      Select one of the two options to specify which converter to use to process documents that belong to the document type (see Administration Tool - Converters Menu).

      Use a default converter

      Select to use one of the built-in CES converters, and then select the desired converter in the drop-down list. Select Detect to let CES automatically select the appropriate converter based on the detected file type.

      Use an open converter

      Select to use an open converter instead, and then select the appropriate open converter in the drop-down list (see Custom Converters and Adding an Open Converter).

      Content Types

      Optionally enter the type of content returned by custom connectors for this document type.

      Example: binarydata.

      sysfiletype Field Value

      Select the value for the sysfiletype field (see Administration Tool - Fields Menu).

      Use the value set by the converter

      By default, select this option to let the selected converter set the field value.

      Use this value

      Select to set a custom value, independent of the converter, and then enter the desired value in the box.

      Quick View

      Select how the cached HTML version of indexed documents of this type is generated (see About the Enhanced Quick View):

      Note: The Quick View option is available starting from the CES 7.0.6547 March 2014 monthly release.

      • Default

        Select to create the cached HTML version with the original Quick View feature.

        The original Quick View is a simple HTML version that has low impact on index size and indexing performance, but still allows end users to quickly locate searched terms in the document. When the document contains images or is graphically rich, the original Quick View may not be a visually representative version of the original document.

        It is recommended to select Default for document types that contain only text, or whose graphical content is not needed to understand the document.

      • LibreOffice and PDF2HTMLEx

        Select to create enhanced Quick Views for this document type using LibreOffice to convert documents to PDF format, and PDF2HTMLEx to convert the PDF to HTML format. This option produces the most accurate HTML reproduction of the original documents, but requires significant server resources and significantly increases the index size.

        It is recommended to select LibreOffice and PDF2HTMLEx only for document types with important and meaningful graphical content, such as Microsoft PowerPoint documents.

      • LibreOffice Only (CES 7.0.6607+, April 2014)

        Select to create enhanced Quick Views for this document type, but only using LibreOffice to convert documents directly to HTML format. This option produces a less accurate HTML reproduction of the original documents, but also requires fewer server resources and does not increase the index size as much.

        LibreOffice Only is likewise recommended only for document types with important and meaningful graphical content. Select it over LibreOffice and PDF2HTMLEx when you can trade some HTML reproduction quality for lower server resource requirements.

      Notes: When LibreOffice and PDF2HTMLEx or LibreOffice Only is selected:

      • If the conversion of a given document fails, the original Quick View is created as a fallback and is available for the corresponding search results in the search interface.

      • If a PDF document is indexed, PDF2HTMLEx is used to generate the HTML.
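The Quick View options and the two notes above can be summarized in a short Python sketch. It models the documented behavior only: the function names, the `run_enhanced` callback, and the returned labels are hypothetical and do not correspond to any CES API.

```python
from typing import Callable

ENHANCED_OPTIONS = ("LibreOffice and PDF2HTMLEx", "LibreOffice Only")


def enhanced_tools(option: str, is_pdf: bool) -> list[str]:
    """List the conversion tools used for an enhanced Quick View, if any."""
    if option == "Default":
        return []  # original Quick View, no enhanced conversion
    if option not in ENHANCED_OPTIONS:
        raise ValueError(f"unknown Quick View option: {option}")
    if is_pdf:
        return ["PDF2HTMLEx"]  # PDFs go straight to PDF2HTMLEx for either enhanced option
    if option == "LibreOffice and PDF2HTMLEx":
        return ["LibreOffice", "PDF2HTMLEx"]  # document -> PDF -> HTML
    return ["LibreOffice"]  # document -> HTML directly


def build_quick_view(option: str, is_pdf: bool,
                     run_enhanced: Callable[[list[str]], bool]) -> str:
    """Return which Quick View ends up cached: 'enhanced' or 'original'.

    run_enhanced is a hypothetical stand-in for the real conversion; it
    returns True when the conversion succeeds.
    """
    tools = enhanced_tools(option, is_pdf)
    if tools and run_enhanced(tools):
        return "enhanced"
    return "original"  # Default option, or fallback after a failed conversion
```

The fallback branch reflects the first note: a failed enhanced conversion still leaves the original Quick View available for the search result.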

      Options

      When indexing attachments, index the parent document

      Select this check box to index email attachments or archived documents (e.g., documents in .zip files) with their parent document. This option is not selected by default.

      Inherits source options

      Select this check box to apply the Disable document summarization and Open results with cached version options selected for the parent source to the document type (see Modifying General Source Parameters). This option is selected by default. Clear this check box to customize the following two options.

      Disable document summarization

      Select this check box to apply a different Disable document summarization option to the document type than the one selected for the parent source. To make this option available, clear the Inherits source options check box. The Disable document summarization option is not selected by default (see What Is a Summary?).

      Open results with cached version

      Select this check box to force search result items for this document type to open in a Quick View with a cached version, independently from the Open results with cached version option set for the parent source. To make this option available, clear the Inherits source options check box. The Open results with cached version option is not selected by default.

      Title Selection Sequence

      Use the arrows to set the order of actions taken to attempt to set the document title independently from the parent source. When CES fails to extract a title using the first option, it proceeds to the second one and so on. The title appears in the search result list. To make this option available, clear the Inherits source options check box.

      Title Metadata Name

      Specify a Title Metadata Name for the document type that differs from the one selected for the parent source. To make this option available, clear the Inherits source options check box.

    2. Click Apply Changes.
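The Inherits source options check box and the Title Selection Sequence both follow simple precedence rules, sketched below in Python. All names are hypothetical, and the two title extractors (a `title` metadata entry and the file name) are placeholders chosen for illustration; this page does not list the actual actions available in the sequence.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class SummaryAndCacheOptions:
    disable_summarization: bool = False      # not selected by default
    open_with_cached_version: bool = False   # not selected by default


def effective_options(inherits_source_options: bool,
                      source_options: SummaryAndCacheOptions,
                      type_options: SummaryAndCacheOptions) -> SummaryAndCacheOptions:
    """Pick the options that apply to a document type."""
    # When the check box is selected, the parent source settings win.
    return source_options if inherits_source_options else type_options


# Hypothetical extractors standing in for entries of the selection sequence.
def from_metadata(document: dict) -> Optional[str]:
    return document.get("metadata", {}).get("title")


def from_file_name(document: dict) -> Optional[str]:
    return document.get("file_name")


def select_title(extractors: Sequence[Callable[[dict], Optional[str]]],
                 document: dict) -> Optional[str]:
    """Try each extractor in order and return the first non-empty title."""
    for extract in extractors:
        title = extract(document)
        if title:
            return title
    return None  # no extractor produced a title
```

Reordering the `extractors` sequence plays the same role as using the arrows in the Administration Tool: the first entry that yields a title determines what appears in the search result list.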

What's Next?

Ensure that the modified document type set is associated with the appropriate source or sources (see Modifying the Document Type Set Used by a Source).
