Removing URLs From Monitoring

Product

Essentials
Growth
Enterprise

When monitoring a website, Conductor Monitoring finds and indexes URLs based on many different types of relations, such as incoming links or sitemap references. For more about how Conductor Monitoring finds URLs, see URL finding.

Once Conductor Monitoring finds and indexes a URL, it keeps monitoring that URL. Conductor Monitoring doesn’t automatically remove URLs from its index, even if the page's content is removed, the page starts returning a 404, or the URL no longer has any incoming relations.

However, there are two ways in which you can remove URLs from Conductor Monitoring’s index:

  • The URL Exclusion List
  • The Purge orphan pages feature

Important

When you use the methods below to remove certain URLs from monitoring, all of the existing data Conductor Monitoring has collected related to these URLs will be deleted.

URL Exclusion List

The URL Exclusion List allows you to exclude certain parts of the website from Conductor Monitoring’s monitoring based on URL patterns and essentially works as a virtual robots.txt file for Conductor Monitoring.

The URL Exclusion List lets you disallow practically any file or page, with three exceptions: the robots.txt file, sitemaps in the default locations (/sitemap.xml and /sitemap_index.xml), and the homepage.

Conductor Monitoring does not follow the directives in the website's actual robots.txt file. This lets it access the whole website and report indexability and relations data accurately.

However, you can easily import the directives from your robots.txt file to the URL Exclusion List in Conductor Monitoring.

Setting up the URL Exclusion List

You can import directives to your exclusion list and add your own exclusion rules.

Import directives from the robots.txt file to the URL Exclusion List

  1. Go to the website's Settings.
  2. At the left, click Set up URL Exclusion List.
  3. Import the directives from the website’s robots.txt file (if there is one) to the URL Exclusion List.
     
  4. Click Import exclusions. If you don’t want to import any directives from the robots.txt file, click Skip.
  5. If you have imported existing robots.txt directives, they are shown here.
  6. The URL Exclusion List follows the robots.txt format and supports both Disallow and Allow directives. The order of the directives doesn’t matter in the URL Exclusion List.
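
To make the import step more concrete, here is a minimal Python sketch of how Allow and Disallow directives could be pulled out of a robots.txt file. It is an illustration only, not Conductor Monitoring's actual import logic: the function name extract_directives is hypothetical, and the sketch assumes that directives from every User-agent group are collected and that empty Disallow lines are skipped.

```python
def extract_directives(robots_txt: str) -> list[tuple[str, str]]:
    """Return (directive, pattern) pairs for every Allow/Disallow line.

    Assumption: all User-agent groups are treated alike, and lines with an
    empty pattern (e.g. "Disallow:") are ignored.
    """
    rules = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field.capitalize(), value))
    return rules


sample = """\
User-agent: *
Disallow: /admin/   # back office
Allow: /admin/help
Sitemap: https://example.com/sitemap.xml
"""

print(extract_directives(sample))
# [('Disallow', '/admin/'), ('Allow', '/admin/help')]
```

The User-agent and Sitemap lines are ignored because only Disallow and Allow directives are supported in the URL Exclusion List.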

Add rules to the URL Exclusion List

As the next step, you can also add your own URL patterns to exclude from monitoring. If you have imported existing robots.txt directives, they are shown here as well.

When adding rules to the URL Exclusion List, the Disallow directive is added to the patterns by default.

Here are the pattern basics and a few common examples of custom exclusion rules:

  • An asterisk (*) is a wildcard that matches any sequence of characters.
  • /admin/ excludes all URLs whose path starts with /admin/
  • *?filter= excludes all URLs containing ?filter=
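
The patterns above can be checked against sample paths with a short Python sketch. It assumes robots.txt-style matching, where a pattern is compared against the start of the URL path (including the query string) and each asterisk stands for any sequence of characters. The helper name pattern_matches is hypothetical, and the sketch is not Conductor Monitoring's own matcher.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt-style pattern matches the start of path."""
    # Escape regex metacharacters, then turn each escaped "*" into ".*".
    regex = re.escape(pattern).replace(r"\*", ".*")
    # re.match anchors at the beginning of the path, like a prefix rule.
    return re.match(regex, path) is not None


print(pattern_matches("/admin/", "/admin/users"))        # True
print(pattern_matches("/admin/", "/blog/admin/"))        # False
print(pattern_matches("*?filter=", "/shoes?filter=red"))  # True
print(pattern_matches("*?filter=", "/shoes?sort=asc"))    # False
```

Without a leading asterisk, a pattern only matches from the start of the path, which is why /blog/admin/ is not excluded by /admin/.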

If you want Conductor Monitoring to keep monitoring a specific subdirectory inside an excluded part of the website, you can do so using the Allow directive.

The Allow directive is used to override the Disallow directive and allows Conductor Monitoring to monitor specific paths within excluded subdirectories:

/media/
Allow: /media/press
Allow: /media/blog

The example above excludes the /media/ subdirectory except for /media/press and /media/blog.
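
The /media/ example can be expressed as a small Python sketch. The article states that the order of directives doesn't matter but does not say how conflicts are resolved, so this sketch assumes the most specific (longest) matching rule wins and that Allow wins a tie, which is a common robots.txt convention. The function name is_excluded is hypothetical, and wildcards are left out to keep the example short.

```python
def is_excluded(path: str, rules: list[tuple[str, str]]) -> bool:
    """Decide whether a path is excluded by a list of (directive, prefix) rules.

    Assumption: the longest matching prefix wins; on a tie, Allow wins.
    """
    matching = [
        (len(prefix), directive == "Allow")
        for directive, prefix in rules
        if path.startswith(prefix)
    ]
    if not matching:
        return False  # no rule applies, so the URL stays monitored
    _, allowed = max(matching)  # longest prefix first, then Allow over Disallow
    return not allowed


rules = [
    ("Disallow", "/media/"),   # Disallow is the default directive for a pattern
    ("Allow", "/media/press"),
    ("Allow", "/media/blog"),
]

print(is_excluded("/media/videos/intro", rules))  # True
print(is_excluded("/media/press/2024", rules))    # False
print(is_excluded("/about", rules))               # False
```

Because the decision depends on rule length rather than position, reordering the rules list gives the same results.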

Once everything is set up as you wish, click Apply changes. The exclusion rules take effect within the next few minutes:

  • All monitored URLs matching the exclusion rules will be immediately removed from Conductor Monitoring’s index.
  • Conductor Monitoring will not crawl the URLs matching the pattern at all, unless you remove the exclusion rule from the URL Exclusion List.

Purging orphan pages

The Purge orphan pages feature removes all URLs without any incoming relations from Conductor Monitoring's index.

This feature is useful, for example, when you have removed certain pages from your website and their URLs are no longer linked from within the website.

If you don't want or need Conductor Monitoring to monitor such orphaned URLs anymore, you can use this feature to remove them from Conductor Monitoring's index.

What is an orphaned page

A URL is considered orphaned in Conductor Monitoring if:

  • It doesn't have any incoming links.
  • It doesn't have any incoming redirects.
  • It doesn't have any incoming canonicals.
  • It isn't referenced in the XML sitemap.

If a URL has any incoming relations pointing to it, or is referenced in an XML sitemap, it is not considered orphaned and the Purge orphan pages feature doesn’t apply to it.
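
The orphan definition above amounts to a simple all-conditions check, sketched here in Python. The UrlRelations class and its field names are hypothetical and only serve to illustrate the rule. They do not reflect how Conductor Monitoring stores relation data.

```python
from dataclasses import dataclass


@dataclass
class UrlRelations:
    """Hypothetical summary of the relations pointing at one URL."""
    incoming_links: int = 0
    incoming_redirects: int = 0
    incoming_canonicals: int = 0
    in_xml_sitemap: bool = False


def is_orphaned(rel: UrlRelations) -> bool:
    """A URL is orphaned only if all four conditions hold at once."""
    return (
        rel.incoming_links == 0
        and rel.incoming_redirects == 0
        and rel.incoming_canonicals == 0
        and not rel.in_xml_sitemap
    )


print(is_orphaned(UrlRelations()))                       # True
print(is_orphaned(UrlRelations(incoming_redirects=1)))   # False
print(is_orphaned(UrlRelations(in_xml_sitemap=True)))    # False
```

A single incoming link, redirect, canonical, or sitemap reference is enough to keep a URL out of the purge.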

Using the Purge orphan pages feature

To purge orphaned pages:

  1. Go to the website’s Settings.
  2. On the left side, click Purge orphan pages. A pop-up window will appear.
  3. Select both checkboxes to confirm that you are aware that orphaned URLs will be purged and no longer monitored by Conductor Monitoring.
  4. Click Purge orphans.