Indexing Secure Web Pages Using Forms
In the Add a Form Configuration page of the Administration Tool, you can retrieve form parameters from a website or enter them manually, and then configure CES to fill in the form automatically. This works for forms served over both HTTP and HTTPS.
Retrieving the form parameters from a website
- On the Coveo server, access the Administration Tool (see Opening the Administration Tool).
- In the Administration Tool, select Index > Sources and Collections.
- In the Sources and Collections page, in the Sources section, expand the drop-down list of the source that you want to modify, and then select Edit Forms.
- In the Forms page, click Add.
- In the Add a Form Configuration page:
- In the Form Parameters drop-down list, select Get the form parameters from a Web address.
- In the Form Address box, enter the URI of the form.
  Example: http://www.coveo.com/en/Products/Default.aspx
- Click Retrieve parameters from URL.
  The Form to Use, Name, Form Address, Method and Action are automatically retrieved.
- Enter the appropriate parameters. For more information, refer to the following table.
Section / Description
Form to Use
Indicates which form to use if the Get the form parameters from a Web address action has encountered more than one form at the specified address.
Name
Identifies the form.
Form Address
Indicates the address where the form is located.
Method
Indicates the method used to submit form information (either Get or Post).
Action
Indicates the address where the form information is submitted.
Parameters
Identifies the type, name, and value of each parameter. The Type indicates the nature of the information, whereas the Name identifies the field in which the Value is submitted.
Example: To enter Coveo in the Username box, the type would be Text, the name Username and the value Coveo.
To add parameters, click Add.
The parameter types are:
Text: String value entered in a text box (e.g., a username).
Password: String value entered in a password box. For security reasons, the value is displayed as dots (●●●).
Checkbox: True or false (i.e. selected or unselected) value applied to a check box.
Radio: True or false (i.e. selected or unselected) value applied to a radio button.
Submit: Submit function applied to previously entered parameters.
Reset: Reset function applied to the previously entered parameters.
File: File attached to the form.
Hidden: Value entered in a hidden box.
Image: Image file attached to the form.
Button: Button (other than Submit or Reset) clicked.
Addresses Using This Form
Indicates the addresses accessed using this form. Use wildcards if necessary (see What Are Wildcards?).
Failed Authentication Result Addresses
Indicates the address of the page to which CES is redirected when authentication fails. Instead of indexing that page, CES attempts to re-authenticate.
Options
Indicates whether to re-authenticate each time a secure page is accessed or use authentication cookies. Because re-authentication slows down the indexing process, the Always authenticate when crawling a document option should be selected only if the secure pages do not support cookies.
Test Form
Indicates the address used to test the form. When Apply Changes and Test the Form Using This Address is clicked, CES tries to access this page. If it succeeds, the form is considered valid. If it fails, form parameters must be modified.
- Click Apply Changes and Test the Form Using This Address to test the form. If the test fails, verify the validity of each parameter.
- When the test succeeds, click Save.
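To picture what retrieving the parameters from a Web address involves, the following minimal Python sketch extracts the method, action, and input fields of each form found in an HTML page. It is only an illustration and not how CES is implemented: the FormCollector class, the SAMPLE_PAGE markup, and the field names are all hypothetical.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect each <form> with its method, action, and input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form found on the page
        self._current = None  # form being parsed, or None outside a form

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "name": attrs.get("name") or "",
                # HTML forms default to GET when no method is given.
                "method": (attrs.get("method") or "get").upper(),
                "action": attrs.get("action") or "",
                "parameters": [],
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            self._current["parameters"].append({
                # HTML inputs default to the text type.
                "type": (attrs.get("type") or "text").capitalize(),
                "name": attrs.get("name") or "",
                "value": attrs.get("value") or "",
            })

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical login page used only for this illustration.
SAMPLE_PAGE = """
<form name="login" method="post" action="/login.aspx">
  <input type="text" name="Username" value="">
  <input type="password" name="Password" value="">
  <input type="hidden" name="ReturnUrl" value="/secure/">
  <input type="submit" name="Login" value="Sign in">
</form>
"""

collector = FormCollector()
collector.feed(SAMPLE_PAGE)
form = collector.forms[0]
print(form["name"], form["method"], form["action"])
for parameter in form["parameters"]:
    print(parameter["type"], parameter["name"], repr(parameter["value"]))
```

A page containing several forms would yield several entries in collector.forms, which corresponds to the situation where the Form to Use section asks which form to pick.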
Entering the form parameters manually
- On the Coveo server, access the Administration Tool (see Opening the Administration Tool).
- In the Administration Tool, select Index > Sources and Collections.
- In the Sources and Collections page, in the Sources section, expand the drop-down list of the source that you want to modify, and then select Edit Forms.
- In the Forms page, click Add.
- In the Add a Form Configuration page:
- In the Form Parameters drop-down list, select Enter the form parameters manually.
- Enter the appropriate parameters. For more information, refer to the table in the previous section.
- Click Apply Changes and Test the Form Using This Address to test the form. If the test fails, verify the validity of each parameter.
- When the test succeeds, click Save.
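When entering parameters manually, it can help to picture how the Method, Action, and Parameters values combine into a single request. The Python sketch below shows one plausible way a browser-like client would do this. It is an assumption-laden illustration, not a description of CES internals: build_submission is a hypothetical helper, the example.com address and the credentials are made up, and skipping Reset and Button parameters is an assumption based on ordinary HTML form behavior.

```python
from urllib.parse import urlencode, urljoin


def build_submission(form_address, method, action, parameters):
    """Return (url, body) for a form submission.

    parameters is a list of (type, name, value) tuples, mirroring the
    Type, Name, and Value of each entry in the Parameters section.
    """
    # Assumption: Reset and Button parameters carry no submitted value,
    # as in a plain HTML form.
    fields = [(name, value) for ptype, name, value in parameters
              if ptype not in ("Reset", "Button")]
    # A relative Action is resolved against the address of the form.
    target = urljoin(form_address, action)
    encoded = urlencode(fields)
    if method.upper() == "GET":
        # GET: parameters travel in the query string, with no body.
        separator = "&" if "?" in target else "?"
        return target + separator + encoded, None
    # POST: parameters travel in the request body.
    return target, encoded.encode("ascii")


# Hypothetical values used only for this illustration.
params = [
    ("Text", "Username", "Coveo"),
    ("Password", "Password", "s3cret"),
    ("Submit", "Login", "Sign in"),
]
post_url, post_body = build_submission(
    "http://www.example.com/login/Default.aspx", "Post", "Login.aspx", params)
get_url, get_body = build_submission(
    "http://www.example.com/login/Default.aspx", "Get", "Login.aspx", params)
print(post_url, post_body)
print(get_url, get_body)
```

If the test of a manually entered form fails, comparing each Name and Value against the fields the website actually expects is usually the first thing to check.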
Important: Unless Always authenticate when crawling a document is selected, CES keeps authentication cookies in memory. If authentication fails, the cause may therefore be expired cookie information; delete the cookies to force CES to re-authenticate using the form (see Deleting the authentication cookies). If this procedure does not solve the problem, the form information has probably been modified; create a new form.
Deleting the authentication cookies
- On the Coveo server, access the Administration Tool (see Opening the Administration Tool).
- In the Administration Tool, select Index > Sources and Collections.
- In the Sources and Collections page, in the Sources section, expand the drop-down list of the source that you want to modify, and then select Edit Forms.
- In the Forms page, under Authentication Cookies, click Delete Source Authentication Cookies.
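The interplay between authentication cookies, the Always authenticate when crawling a document option, and cookie deletion can be sketched as follows. This is a simplified Python model under stated assumptions, not the CES implementation: FormSession, make_cookie, and uses_form are hypothetical names, the example.com addresses are made up, and treating wildcards like shell-style patterns (via fnmatch) is an assumption about how Addresses Using This Form might be matched.

```python
from fnmatch import fnmatchcase
from http.cookiejar import Cookie, CookieJar


def uses_form(url, address_patterns):
    """True if url matches one of the 'Addresses Using This Form' patterns."""
    return any(fnmatchcase(url, pattern) for pattern in address_patterns)


def make_cookie(name, value, domain):
    """Build a session cookie for the illustration."""
    return Cookie(
        version=0, name=name, value=value, port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True, secure=False, expires=None,
        discard=True, comment=None, comment_url=None, rest={})


class FormSession:
    """Toy model of the authentication state kept for one source."""

    def __init__(self, always_authenticate=False):
        self.always_authenticate = always_authenticate
        self.cookies = CookieJar()

    def must_submit_form(self):
        # The form is submitted when re-authentication is forced
        # or when no authentication cookie is held.
        return self.always_authenticate or len(self.cookies) == 0

    def delete_cookies(self):
        # Models the effect of Delete Source Authentication Cookies.
        self.cookies.clear()


patterns = ["http://www.example.com/secure/*"]  # hypothetical pattern
session = FormSession()
print(uses_form("http://www.example.com/secure/report.aspx", patterns))
print(session.must_submit_form())   # no cookie yet
session.cookies.set_cookie(make_cookie("auth", "token", "www.example.com"))
print(session.must_submit_form())   # cookie is reused
session.delete_cookies()
print(session.must_submit_form())   # form is submitted again
```

In this model, a stale cookie keeps being reused until it is deleted, which is why deleting the cookies is the first remedy when authentication starts failing.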