Configuring and Indexing a Microsoft SharePoint Source
Notes:
-
In an environment with more than one Microsoft SharePoint Web Application, it is recommended to define one source for each Microsoft SharePoint Web Application that you want to index, and to index user profiles only once to avoid creating duplicates in your index (see Modifying Hidden Microsoft SharePoint Source Parameters).
-
CES 7.0.6830+ (July 2014) The SharePoint source type is for the second generation SharePoint connector. If you still use the original SharePoint connector to create your SharePoint source, use the SharePoint Legacy source type instead (see Configuring and Indexing a Microsoft SharePoint Source With the Legacy Connector).
To configure and index a Microsoft SharePoint source
-
On the Coveo server, access the Administration Tool (see Opening the Administration Tool).
-
Select Index > Sources and Collections.
-
In the Collections section:
-
Select an existing collection in which you want to add the new source.
OR
-
Click Add to create a new collection (see Adding a Collection).
-
-
In the Sources section, click Add.
The Add Source page that appears is organized in three sections.
-
In the General Settings section of the Add Source page:
-
Enter the appropriate value for the following required parameters:
-
Name
-
Enter a descriptive name of your choice for this source.
Example: When you have more than one SharePoint site to index, you can include information in the name to help distinguish between them.
SharePoint 2016 Intranet
SharePoint 2013 Extranet
-
Source Type
-
The connector used by this source. In this case, select SharePoint.
Note: CES 7.0.6767– (June 2014) The SharePoint type corresponds to what is now the Legacy SharePoint source type (see Configuring and Indexing a Microsoft SharePoint Source With the Legacy Connector).
-
Addresses
-
List of specific SharePoint farm sections that you want to index. If you need to index more than one section, enter one URL per line.
Note: CES 7.0.6942 (August 2014) Starting addresses must end with /.
Examples:
-
For the whole farm:
https://farm/
-
For a specific Web Application:
https://farm:8080/
-
For a specific site collection:
https://farm:8080/sites/Support/default.aspx
-
For a specific website:
https://farm:8080/sites/Support/subsite/default.aspx
-
For a specific document library:
https://farm:8080/Document Library/
-
For a specific list:
https://farm:8080/sites/Support/Lists/Contacts/AllItems.aspx
Important: A specific folder in a list is not supported.
-
For SharePoint Online:
https://domain.sharepoint.com
Note: You can also use the source Crawling Scope parameter to control more precisely the content to crawl (see below).
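The address rules above can be checked before pasting values into the source configuration. The following is a minimal Python sketch, not part of CES: the function names are hypothetical, and treating a URL whose last path segment contains a dot (such as default.aspx) as a page address that does not need a trailing slash is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def parse_starting_addresses(box_text):
    """Split the content of the addresses box into one URL per non-empty line."""
    addresses = []
    for line in box_text.splitlines():
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")
        addresses.append(url)
    return addresses


def needs_trailing_slash(url):
    """Return True when a non-page address does not end with '/'.

    Assumption: a last path segment containing a dot (e.g. default.aspx)
    is a page address and is left as-is.
    """
    if url.endswith("/"):
        return False
    last_segment = urlsplit(url).path.rsplit("/", 1)[-1]
    return "." not in last_segment


if __name__ == "__main__":
    box = "https://farm/\nhttps://farm:8080/sites/Support/default.aspx\nhttps://farm:8080\n"
    for address in parse_starting_addresses(box):
        if needs_trailing_slash(address):
            print(f"Add a trailing slash: {address}")
```

Running the sketch on the sample box content reports only the https://farm:8080 line.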
-
-
Fields
-
Select the field set that you created for this source (see Microsoft SharePoint Connector Deployment Overview).
-
Refresh Schedule
-
Time interval at which the source is automatically refreshed to keep the index content up-to-date.
Note: The default Every Day option is typically appropriate. However, when your SharePoint content changes frequently within a day, after creating your source, schedule an incremental refresh at a significantly shorter interval to continuously index ongoing SharePoint content changes (see Scheduling a Source Incremental Refresh). You can then consider refreshing the source weekly by selecting the Every Sunday option.
-
-
Review the value for the following parameters that often do not need to be modified:
-
Rating
-
Change this value only when you want to globally change the rating associated with all items in this source relative to the rating of other sources (see Understanding Search Results Ranking).
Example: If this source was for a legacy Intranet, you may want to set this parameter to Low, so that in the search interface, results from this source appear later in the list compared to those from other sources.
-
Document Types
-
If you defined custom document type sets, ensure that you select the most appropriate one for this source (see What Are Document Type Sets?).
-
Active Languages
-
If you defined custom active language sets, ensure that you select the most appropriate one for this source.
-
-
-
In the Specific Connector Parameters & Options section of the Add Source page, review if you need to change the parameter default values:
-
In the Number of Refresh Threads box, when your Coveo server has available CPU cores, consider increasing the number to significantly improve crawling performance. The default value is 2.
-
In the Mapping File box, leave the default value to use the default mapping file (Coveo.CES.CustomCrawlers.SharePoint.MappingFile.xml).
When you identify that some custom SharePoint content is not indexed or not properly mapped, consider creating a custom mapping file, and then enter the full path to the file (see Creating and Using a Custom SharePoint Mapping File).
-
CES 7.0.6830+ (July 2014) In the Crawling Scope drop-down box, select the option for the content type that you want to crawl in relation to the source Addresses that you specified (see above).
Select WebApplication, the default value and highest element type in the SharePoint farm (tenant in SharePoint Online) hierarchy, to crawl everything.
Value | Content to crawl
WebApplication | All site collections of the specified web application
SiteCollection | All web sites of the specified site collection
WebAndSubWebs | Only the specified web site and its sub webs
List | Only the specified list or document library
-
In the Authentication Type drop-down list, refer to the following table to select the authentication type value corresponding to your SharePoint environment and the type of User Identity that you assigned to this source (see Microsoft SharePoint Connector Deployment Overview).
SharePoint environment | User identity type | Option to select
Classic | Windows account (SharePoint 2010 default) | WindowsClassic
Claims | Windows account (SharePoint 2013 and 2016 default) | WindowsUnderClaims
Claims | ADFS federated account | AdfsUnderClaims
Claims | Okta | Okta
Online | Native Office 365 account | SpOnlineNative
Online | Single Sign-On Office 365 account | SpOnlineFederated
-
In the Parameters section, click Add Parameter when you want to show and configure advanced hidden source parameters (see Modifying Hidden Microsoft SharePoint Source Parameters).
Examples:
-
In the case of an ADFS environment, when the Authentication Type parameter value is either AdfsUnderClaims or SpOnlineFederated, you must add ADFS related hidden parameters (see ADFS Related Parameters).
Note: You can configure the security provider to operate when multiple ADFS servers are used to authenticate users in SharePoint.
-
CES 7.0.8541+ (September 2016) When you create a SharePoint search service application to list your user profiles, you must add the following hidden parameters (see LoadUserProfiles and UsePeopleSearchForUserProfiles).
-
CES 7.0.9272+ (March 2018) When your SharePoint instance uses Okta as a single sign-on provider, you must add the OktaRealm and OktaSignInUrl parameters, and the corresponding values that you previously retrieved (see Okta Single Sign-On Provider for SharePoint On-Premises).
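The relationship between the Authentication Type value and the hidden parameters to add can be summarized as a small checklist. The Python sketch below is illustrative only and is not a CES API: the function name is hypothetical, the link between the Okta authentication type and the Okta parameters is an assumption drawn from the examples above, and the individual ADFS parameter names are deliberately not listed because they are documented in ADFS Related Parameters.

```python
# Authentication types that require ADFS related hidden parameters.
ADFS_AUTHENTICATION_TYPES = ("AdfsUnderClaims", "SpOnlineFederated")

# Hidden parameters named in this procedure for an Okta single sign-on provider.
OKTA_PARAMETERS = ("OktaRealm", "OktaSignInUrl")


def hidden_parameter_checklist(authentication_type, configured_parameters):
    """Return reminders for hidden parameters that still need to be added.

    configured_parameters is a dict of the hidden parameters already added
    to the source (name -> value). Hypothetical helper for illustration.
    """
    reminders = []
    if authentication_type in ADFS_AUTHENTICATION_TYPES:
        # The exact names are listed in the ADFS Related Parameters topic.
        reminders.append(
            "Add the ADFS related hidden parameters (see ADFS Related Parameters)."
        )
    if authentication_type == "Okta":
        for name in OKTA_PARAMETERS:
            if name not in configured_parameters:
                reminders.append(f"Add the {name} hidden parameter.")
    return reminders


if __name__ == "__main__":
    for reminder in hidden_parameter_checklist("Okta", {"OktaRealm": "example-realm"}):
        print(reminder)
```

With only OktaRealm configured, the sketch reports that OktaSignInUrl is still missing.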
-
-
In the Option section:
-
Index Subfolders
-
Keep this check box selected (recommended). By doing so, all subfolders from the specified server address are indexed.
-
Index the document's metadata
-
When selected, CES indexes all the document metadata, even metadata that are not associated with a field. The orphan metadata are added to the body of the document so that they can be searched using free text queries.
When cleared (default), only the values of system and custom fields that have the Free Text Queries attribute selected will be searchable without using a field query (see Adding a Field to Search On and What Are Field Queries and Free Text Queries?).
Example: A document has two metadata:
-
LastEditedBy containing the value Hector Smith
-
Department containing the value RH
In CES, the custom field CorpDepartment is bound to the metadata Department and its Free Text Queries attribute is selected.
When the Index the document's metadata option is cleared, searching for RH returns the document because a field is indexing this value. Searching for hector does not return the document because no field is indexing this value.
When the Index the document's metadata option is selected, searching for hector also returns the document because CES indexed orphan metadata.
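The behavior in this example can be modeled in a few lines. The Python sketch below is a simplified illustration, not how CES indexes content: the function names are hypothetical, and a case-insensitive substring test stands in for real free text query matching.

```python
def free_text_values(metadata, field_bindings, index_all_metadata):
    """Return the lowercase values that a free text query can reach.

    field_bindings maps a metadata name to a tuple
    (field name, Free Text Queries attribute selected).
    """
    values = []
    for name, value in metadata.items():
        binding = field_bindings.get(name)
        if binding is not None:
            _field_name, free_text_selected = binding
            if free_text_selected:
                values.append(value.lower())
        elif index_all_metadata:
            # Orphan metadata (no field bound): added to the document body.
            values.append(value.lower())
    return values


def matches(query, metadata, field_bindings, index_all_metadata):
    """Simplified match: case-insensitive substring test on searchable values."""
    needle = query.lower()
    searchable = free_text_values(metadata, field_bindings, index_all_metadata)
    return any(needle in value for value in searchable)


if __name__ == "__main__":
    document = {"LastEditedBy": "Hector Smith", "Department": "RH"}
    bindings = {"Department": ("CorpDepartment", True)}
    print(matches("RH", document, bindings, index_all_metadata=False))
    print(matches("hector", document, bindings, index_all_metadata=False))
    print(matches("hector", document, bindings, index_all_metadata=True))
```

The three calls correspond to the three searches described in the example: RH with the option cleared, hector with the option cleared, and hector with the option selected.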
-
-
Document's addresses are case-sensitive
-
Leave the check box cleared. This parameter needs to be selected only in rare cases for case-sensitive systems in which distinct documents may have the same file name but with different casing.
-
Generate a cached HTML version of indexed documents
-
When you select this check box (recommended), at indexing time, CES creates HTML versions of indexed documents. In the search interfaces, users can then more rapidly review the content by clicking the Quick View link rather than opening the original document with the original application. Consider clearing this check box only if you do not want to use Quick View links.
-
Open results with cached version
-
Leave this check box cleared (recommended) so that in the search interfaces, the main search result link opens the original document with the original application. Consider selecting this check box only when you do not want users to be able to open the original document but only see the HTML version of the document as a Quick View. In this case, you must also select Generate a cached HTML version of indexed documents.
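The dependency between these two check boxes can be expressed as a simple consistency check. The Python sketch below is illustrative only; the function and argument names are hypothetical and do not correspond to CES settings keys.

```python
def check_cache_options(generate_cached_html, open_results_with_cached_version):
    """Return a list of problems with the two cached HTML options.

    Opening results with the cached version only works when a cached
    HTML version is generated at indexing time.
    """
    problems = []
    if open_results_with_cached_version and not generate_cached_html:
        problems.append(
            "Open results with cached version requires "
            "Generate a cached HTML version of indexed documents."
        )
    return problems


if __name__ == "__main__":
    # Recommended combination from this procedure: generate the cached
    # version, but open the original document from the main result link.
    print(check_cache_options(True, False))
    # Inconsistent combination: nothing cached to open.
    print(check_cache_options(False, True))
```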
-
-
-
In the Security section of the Add Source page:
-
In the Authentication drop-down list, select the user identity that you created for the Microsoft SharePoint farm (tenant in SharePoint Online) (see Microsoft SharePoint Connector Deployment Overview).
-
In the Security Provider drop-down list, select the SharePoint security provider that you created for this SharePoint source.
-
Click Save to save the source configuration, and consider revising advanced source parameters before you start indexing the new source (see Modifying Hidden Microsoft SharePoint Source Parameters).
OR
-
Click Save and Start to save and start indexing immediately.
-
Note: When your SharePoint Web Application uses Claims, the first time the SharePoint search interface is accessed, the first-time setup page appears so that you can enter your Claims information and allow access to the search interface (see Coveo .NET Front-End First Time Setup).
What's Next?
Set an incremental refresh schedule for your source (see Scheduling a Source Incremental Refresh).