
Handling Exceptions

Exceptions can arise during a crawling run for a variety of reasons. Handling them correctly is important for the connector to be robust and efficient. Keep in mind that you do not want an exception to stop the crawling in the middle of the process (especially if you do not have Pause/Resume). Generally, connector exceptions fall into two groups:

  • Fatal exceptions: These are usually connection errors outside the connector's control. When one occurs, it is normal for the crawling process to log a message about it and stop. It is also possible to implement retries and delays in case the issue is temporary (see the sketch after this list).

  • Ignorable exceptions: These represent minor issues that should not stop the crawling process, such as access being denied to a specific document or a folder being ignored. Skip the item and move on to the next one. Corrupted items or intermittent errors in the target system's API can also be ignored.
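
For fatal exceptions caused by temporary connection problems, a simple retry loop with a delay can make the crawling more resilient. The following is only a sketch: it reuses the Crawl and CrawlerFatalException names from the example further below, and the retry count and delay values are arbitrary.

private void CrawlWithRetries(string p_Uri)
{
    const int MaxAttempts = 3;
    for (int attempt = 1; attempt <= MaxAttempts; attempt++) {
        try {
            Crawl(p_Uri);
            return;
        } catch (CrawlerFatalException) {
            if (attempt == MaxAttempts) {
                // Still failing after all attempts: let the exception stop the crawling.
                throw;
            }
            // Possibly a temporary connection issue: wait before retrying.
            System.Threading.Thread.Sleep(5000 * attempt);
        }
    }
}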

The CustomCrawler class offers a few exception classes that can be useful when developing a connector:

Example:

private void HandleExceptions(string p_Uri)
{
    try {
        Crawl(p_Uri);
    } catch (CrawlerIgnorableException ex) {
        // Log a warning for the skipped item and continue crawling.
        Context.LogMessage(ex.Message, p_Uri, Severity.Warning, Operation.Unspecified);
    } catch (CrawlerFatalException) {
        // Fatal error: rethrow so the crawling process stops.
        throw;
    }
}

Of course, these are only suggestions. You can also create custom exception classes to suit specific needs. What matters is that exceptions are handled correctly so the connector remains robust and efficient.
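
For example, a connector could define its own ignorable exception for documents that exceed a size limit, so they are logged and skipped like any other ignorable error. The class name below is purely illustrative, and it assumes CrawlerIgnorableException exposes a constructor that accepts a message string.

// Illustrative custom ignorable exception; assumes the base class accepts a message string.
public class DocumentTooLargeException : CrawlerIgnorableException
{
    public DocumentTooLargeException(string p_Uri, long p_SizeInBytes)
        : base(string.Format("Document '{0}' is too large ({1} bytes) and was skipped.", p_Uri, p_SizeInBytes))
    {
    }
}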

See also: How to Crawl Content