Associating the OCR Open Converter to Document Types
To associate the OCR open converter to document types
-
On the Coveo server, access the Administration Tool (see Opening the Administration Tool).
-
Select Index > Sources and Collections.
-
In the Sources and Collections page, select the source on which the OCR open converter script will be used. The Status page is displayed.
-
In the navigation panel on the left, click Document Types.
-
In the Document Types page, click Add.
-
In the Add Document Type Set page that appears:
-
In the Name box, enter a name of your choice for the document type set to be used for indexing using the OCR module.
Example: OCR Document Types
-
In the Description box, optionally enter a description of the usage of the document type set.
-
Click Save.
-
Back in the Document Types page:
-
In the Document Type Set drop-down list, ensure that the document type set you just created is selected.
-
Click Edit.
-
In the page that appears, for each document type to be indexed using the OCR module:
-
Click the document type.
Example: Click Adobe Acrobat Documents.
Note: The document formats supported by the OCR module are: .tiff, .tiff-fx, .pcx, .dcx, .bmp, .jpeg, .png, .max, .gif, .pbm, and .pdf.
-
In the page that appears for this document type:
-
In the Action drop-down list, select Index entire document.
-
In the Converter section, select Use an open converter, and then select the name of the converter that you created for this purpose (see Adding an OCR Open Converter).
-
Click Apply Changes.
-
Open the CES Console (see Using the CES Console).
-
Back in the Administration Tool, rebuild the sources that use the new document type set.
-
In the CES Console, follow the rebuild activities. Documents are crawled and converted, and the resulting transactions are applied to the index.
End-users can search the OCR-indexed document content once the transactions are applied to the index.
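Before choosing which document types to associate with the OCR open converter, it can help to check which files in a source have one of the formats listed in the note above. The following is a minimal sketch, not part of the Coveo product: the function names are hypothetical, and it assumes that matching on the file extension, ignoring case, is a good enough approximation of how the formats are recognized.

```python
from pathlib import PurePath

# Formats copied from the note on this page. Variants such as .jpg or .tif
# are deliberately left out because the page does not list them.
OCR_FORMATS = {
    ".tiff", ".tiff-fx", ".pcx", ".dcx", ".bmp", ".jpeg",
    ".png", ".max", ".gif", ".pbm", ".pdf",
}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file extension is one of the listed OCR formats."""
    # Assumption: extension comparison is case-insensitive.
    return PurePath(filename).suffix.lower() in OCR_FORMATS


def split_candidates(filenames):
    """Split file names into (OCR candidates, other files), keeping order."""
    candidates, others = [], []
    for name in filenames:
        (candidates if is_ocr_candidate(name) else others).append(name)
    return candidates, others


if __name__ == "__main__":
    files = ["invoice.pdf", "scan.TIFF", "notes.docx", "photo.png", "README"]
    ocr_files, other_files = split_candidates(files)
    print("OCR candidates:", ocr_files)
    print("Other files:", other_files)
```

A check like this only suggests which document types are worth configuring. The association itself is still made in the Administration Tool as described in the procedure.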