Influencing the Word Corrector Lexicon Algorithm

The Word Corrector Lexicon (WCL) that powers the Query Correction (Did You Mean) feature can be influenced with a bias file that contains word/bias pairs (see Query Correction Feature).

Example: You add your product names to the bias file to ensure that your products are suggested when a customer misspells them in the search box of your e-commerce site. Before your changes, the Did You Mean feature suggested only common English words with a high degree of similarity.

To influence the Word Corrector Lexicon Algorithm

  1. Ensure your CES index meets the requirements of the Query Correction feature (see Query Correction Feature).

  2. List the DidYouMean suggested terms that you want to influence along with their number of occurrences in your index.

    Tip: To test, query important terms for your organization with one or two letters misspelled, and verify that the DidYouMean suggestions are correct.

  3. Create a WordCorrectorBias.txt file that uses the following format, with one word/bias pair per line:

    Word1 [-]n

    Word2 [-]n

    Wordn [-]n

    Note: The number (n) next to each term is the bias, expressed as an absolute number of occurrences in the lexicon. A negative value decreases the number of occurrences and thus reduces the chance of seeing the term as a DidYouMean suggestion. A positive value, on the contrary, increases the number of occurrences and makes the term more likely to be suggested. The number of occurrences to add or subtract depends on the index content.

    Example: You want to make sure that a query containing the "verison" typo suggests "Verizon" rather than "version", so you enter the following:

    verizon 100000

    version -20000

  4. In the CES 7 config.txt file (located in the C:\CES7\Config folder), in the PhysicalIndex section, add the WordCorrectorBiasFilePath tag with the path to the WordCorrectorBias.txt file.

    Example: <WordCorrectorBiasFilePath>C:\CES7\Config\WordCorrectorBias.txt</WordCorrectorBiasFilePath>

    Note: You can also make the algorithm output suggestions containing ligated characters, which is otherwise impossible because the index does not contain any ligated characters.

    Example: To make the index output the "Œsophage" suggestion, add the term with a bias of 0 so that it has the same number of occurrences as "oesophage":

    Œsophage 0

  5. Restart the CES service.

  6. Once you are satisfied with your bias file, rebuild the Word Corrector Lexicon to make sure the WordCorrectorBias.txt file is used:

    1. On the Coveo server, access the Administration Tool (see Opening the Administration Tool).

    2. Select Index > Advanced.

    3. On the Advanced page, click Rebuild Word Corrector Lexicon.
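
The bias arithmetic described in the note of step 3 and the file format can be sketched in a short script. The following Python example is only an illustration under stated assumptions: the occurrence counts are invented, flooring an adjusted count at zero is an assumption, each term is assumed to be a single word without spaces, and the UTF-8 encoding should be checked against your CES version. The helper names (apply_bias, format_bias_line, write_bias_file) are hypothetical and are not part of CES. How CES actually ranks candidate suggestions is not modeled here.

```python
import tempfile
from pathlib import Path


def apply_bias(occurrences, biases):
    """Add each bias to the term's occurrence count (floored at 0, an assumption)."""
    adjusted = dict(occurrences)
    for word, bias in biases.items():
        adjusted[word] = max(0, adjusted.get(word, 0) + bias)
    return adjusted


def format_bias_line(word, bias):
    """Return one 'word bias' line; a negative bias keeps its minus sign."""
    word = word.strip()
    # Assumption: a term is a single word, since a space separates it from the bias.
    if not word or any(ch.isspace() for ch in word):
        raise ValueError(f"expected a single term without spaces: {word!r}")
    return f"{word} {int(bias)}"


def write_bias_file(path, biases, encoding="utf-8"):
    """Write one word/bias pair per line and return the lines written."""
    lines = [format_bias_line(word, bias) for word, bias in biases.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding=encoding)
    return lines


# Hypothetical occurrence counts, as you might collect them in step 2.
index_counts = {"version": 15000, "verizon": 40}
# Biases taken from the example in step 3.
biases = {"verizon": 100000, "version": -20000}

print(apply_bias(index_counts, biases))

# Written to a temporary folder here; on a CES server the target would be
# the path given in the WordCorrectorBiasFilePath tag.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "WordCorrectorBias.txt"
    for line in write_bias_file(target, biases):
        print(line)
```

With these invented counts, the adjusted number of occurrences is 100040 for "verizon" and 0 for "version". Because the right bias values depend on the index content, recompute them from the counts you gathered in step 2 rather than reusing the numbers shown here.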
