
Managing Text Analytics Pipeline Configurations

Once you have successfully deployed your Text Analytics module and understand what a pipeline is, you can start creating, running, and fine-tuning your own text analytics pipelines.

Creating a Custom Run or Job from a Template

Starting from a template, you can easily create a typical pipeline configuration in which many parameter values are pre-populated. Run and job template files contain placeholder variables that are automatically replaced with appropriate values when you create a configuration from them.

To create a custom run or job from a template

  1. Start TAnGO (see Starting TAnGO).

  2. In TAnGO, click Create Configuration File.

  3. In the Create New Configuration File dialog box:

    1. In the Text Analytics section, under Paths, in the Configuration Template box, select the run or job template file on which you want to base your custom configuration.

    2. Click Save.

    3. In the Save Configuration File as dialog box, browse to the [Text_Analytics_Path]\Config\ folder, in the File name box enter a name of your choice for your custom pipeline (for example, MyFirstPipeline), and then click Save.

  4. Using a text editor:

    1. Open the pipeline configuration file that you just created.

    2. While respecting the XML format of the file and of the available plugins, modify or remove existing plugins, or add new ones, to achieve the desired results (see Text Analytics Run Plugins and Predefined Text Analytics Job Plugins).

    3. Save the file.
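As an illustration, the edited configuration file is an XML document that lists the plugins making up the pipeline. The following sketch is hypothetical: the element and attribute names shown here are assumptions for illustration only, not the actual schema, so always base your edits on the structure of the template file you started from and on the plugin reference topics.

```
<!-- Hypothetical sketch of a pipeline configuration file.
     Element and attribute names are illustrative, not the actual schema;
     use your template file as the authoritative starting point. -->
<Pipeline name="MyFirstPipeline">
  <!-- A fetcher plugin scopes the set of documents to process -->
  <Plugin type="Fetcher">
    <!-- parameter values pre-populated from the template -->
  </Plugin>
  <!-- Modify, remove, or add plugin entries here to shape the output -->
</Pipeline>
```

Because the file must remain well-formed XML, make sure every element you add or edit is properly closed before saving.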

Running a Custom Run or Job

To run a custom run or job

  1. Start TAnGO (see Starting TAnGO).

  2. In TAnGO:

    1. Click Register.

    2. In the Open Configuration File dialog box, select the configuration file for the custom run or job that you want to execute, and then click Open.

      The selected pipeline runs once, or at the scheduled time intervals.

    3. In the Job Logs panel, review text analytics activity logs to verify that the run progresses without errors.

Fine-Tuning the Text Analytics Output

Once you have run your custom pipeline, fine-tuning the output is often an iterative process of adding and tuning plugins until you achieve the desired output.

To fine-tune the text analytics output

  1. Inspect the output of your text analytics pipeline.

    Example: In your search interface, create facets based on the fields populated by your text analytics pipeline, and then inspect the items of these facets (see Adding or Customizing a Facet With the .NET Interface Editor).

  2. In a run, when irrelevant or excessive content is processed:

    1. Consider tuning the fetcher plugin to better scope the document set to process (see Predefined Text Analytics Fetcher).

    2. Consider using filters to exclude part of the fetched content before proceeding with the extraction (see Predefined Text Analytics Filters).

  3. When you want to find and tag documents containing specific text strings, consider creating a whitelist that includes these values and adding a whitelister plugin to the pipeline (see Whitelister).

  4. When unwanted values appear in extracted metadata, consider creating one or more blacklist files and adding one or more blacklister plugins to eliminate the unwanted occurrences (see MetadataBlackLister).

  5. When two or more values of extracted metadata correspond to a unique element, consider adding one or more normalizer plugins to homogenize metadata values (see MetadataNormalizer).
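The tuning steps above typically translate into additional plugin entries in your pipeline configuration file. The sketch below is illustrative only: the element and attribute names, file names, and values are assumptions; only the plugin names (Whitelister, MetadataBlackLister, MetadataNormalizer) come from this topic. See the corresponding plugin reference topics for the exact syntax.

```
<!-- Illustrative only: element and attribute names are hypothetical. -->

<!-- Tag documents containing the strings listed in a whitelist file -->
<Plugin type="Whitelister">
  <WhiteListFile>MyWhiteList.txt</WhiteListFile>
</Plugin>

<!-- Eliminate unwanted extracted metadata values -->
<Plugin type="MetadataBlackLister">
  <BlackListFile>MyBlackList.txt</BlackListFile>
</Plugin>

<!-- Homogenize variant metadata values that refer to a unique element -->
<Plugin type="MetadataNormalizer">
  <NormalizationFile>MyNormalizations.txt</NormalizationFile>
</Plugin>
```

After each change, run the pipeline again and re-inspect the output, repeating until the extracted metadata meets your needs.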
