Advanced Search


Full-text search covers a lot of ground, but often you need more control. InterroBot captures a variety of field data that can be queried independently. The syntax to search against a particular field is as follows:

fieldname: query

While full-text always searches for text/strings, field search supports one of three data types: string, number, or boolean, depending on the field used. This precision is essential for auditing technical SEO, tracking down issues, or managing content.

Filtering to a path (aka directory).

HTTP Headers and URL Filtering

HTTP headers carry useful information about caching, content types, security flags, and more. They provide crucial signals about how browsers and search engines should handle your content.

Full-text search is often an inadequate tool for filtering. If you wanted to find all PDFs hosted on a website, querying headers for application/pdf is going to be the best option. The name/value structure of header data makes it less prone to false positives than full-text search.

headers: application/pdf

URL filtering is ostensibly about finding all pages/assets within a particular HTTP path. This, in itself, can be a practical tool. More useful, however, is the ability to use URL filtering with other queries (including full-text!). Field queries can be combined with an uppercase AND.

headers: application/pdf AND url: /archive/

This pattern is particularly valuable when auditing site migrations or tracking down resources that should have been moved or removed. For understanding how InterroBot categorizes different asset types, see Webpages and Assets.

HTTP Status, Download Size, and Response Time

Where searching against headers and full-text requires strings, status, download size, and response time values are captured as numbers. This means you can search using greater-than, less-than, and equality operators.

If you've used the Client Errors and Server Errors buttons, you are already familiar with searching status. You can search for either a particular status code, or a range. Status codes follow the HTTP standard, and understanding them is crucial for diagnosing crawlability issues. If you wanted all errors, both client and server, you could search the following:

status: >=400
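To make the range concrete, here is a minimal Python sketch of which codes a comparison like `status: >=400` would take in. It only mirrors the numeric comparison and the standard HTTP status classes; the function names are hypothetical and none of this is InterroBot code.

```python
def classify_status(code: int) -> str:
    """Label a status code using the standard HTTP error classes."""
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "no error"


def matches_all_errors(code: int) -> bool:
    """Mirror of the numeric comparison behind 'status: >=400'."""
    return code >= 400


# A few common codes, to show where the >=400 threshold falls.
for code in (200, 301, 404, 503):
    print(code, classify_status(code), matches_all_errors(code))
```

Narrowing to client errors alone would presumably mean bounding the range from both sides (for example, combining `>=400` with `<500` using AND), though whether two comparisons on the same field can be combined that way is an assumption here.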

For a complete breakdown of what different status codes mean and how to fix them, see Finding Broken Links.

Filtering on download size is an easy way to identify bloated or uncompressed images, HTML, and other assets. Page speed is a confirmed ranking factor, and oversized resources are often the culprit behind slow load times. The search is in bytes (1,048,576 bytes per megabyte). If you wanted to find all images over roughly half a megabyte (500,000 bytes), you could search:

size: >500000 AND headers: image
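Because the threshold is given in bytes, it can help to compute it rather than estimate it. The sketch below is a hypothetical Python helper (not part of InterroBot) that converts megabytes to bytes using the 1,048,576 figure quoted above and formats the result as a size query. It also shows that 500,000 bytes is slightly under half a megabyte by that definition.

```python
from typing import Optional

BYTES_PER_MEGABYTE = 1_048_576  # figure quoted in the text above


def size_query(megabytes: float, header_fragment: Optional[str] = None) -> str:
    """Build a 'size: >N' query string, with N expressed in bytes.

    header_fragment is an optional string to match against headers,
    appended with an uppercase AND.
    """
    threshold = int(megabytes * BYTES_PER_MEGABYTE)
    query = f"size: >{threshold}"
    if header_fragment:
        query += f" AND headers: {header_fragment}"
    return query


print(size_query(0.5, "image"))                 # size: >524288 AND headers: image
print(round(500_000 / BYTES_PER_MEGABYTE, 3))   # 0.477
```

For most audits the round number is close enough; the exact figure matters mainly when a size budget is defined precisely.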

Narrowing down to large images using field search.

Response time filtering works similarly and helps identify slow-loading resources that could be impacting your site's Core Web Vitals performance metrics.

Redirects and Robots Exclusion

Redirected resources and robots exclusion are queried as booleans (true or false). Redirect chains waste crawl budget and can dilute link signals, so identifying and consolidating them is an important SEO maintenance task. Say you wanted to clean up all redirects that were the result of links to HTTP (presumably forwarding to HTTPS). This could be achieved with the following query:

redirect: true AND url: http

The robots exclusion filter helps you verify that important pages aren't accidentally being blocked from search engine crawlers. For more on how InterroBot respects robots.txt directives, see Getting Started.

Advanced Boolean Search

The NOT operator and parentheses are supported, in addition to AND and OR. This allows for complex queries like finding all large images that aren't in your CDN:

size: >500000 AND headers: image AND NOT url: cdn.example.com

Or identifying all HTML pages with specific content that aren't returning errors:

status: 200 AND headers: text/html AND ("product-widget" OR "legacy-component")
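When queries like these are generated by a script rather than typed by hand, small string helpers keep the operators and parentheses consistent. The following Python sketch reproduces the two queries above; the helper names (`all_of`, `any_of`, `negate`, `quoted`) are hypothetical and simply assemble text in the syntax shown on this page.

```python
def quoted(term: str) -> str:
    """Wrap a full-text term in double quotes."""
    return f'"{term}"'


def all_of(*clauses: str) -> str:
    """Join clauses with an uppercase AND."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Join clauses with OR and group them in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def negate(clause: str) -> str:
    """Prefix a clause with NOT."""
    return f"NOT {clause}"


# Large images that are not served from the CDN host.
large_images_off_cdn = all_of(
    "size: >500000",
    "headers: image",
    negate("url: cdn.example.com"),
)

# HTML pages returning 200 that mention either component name.
healthy_pages = all_of(
    "status: 200",
    "headers: text/html",
    any_of(quoted("product-widget"), quoted("legacy-component")),
)

print(large_images_off_cdn)
print(healthy_pages)
```

Grouping the OR terms inside `any_of` keeps the parentheses explicit, so the result does not depend on how AND and OR are prioritized relative to each other.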

For developers building automated workflows around these queries, see API and Plugin Development. If you're integrating search results with AI tools for content analysis, check out AI Data Access.


InterroBot is a web crawler and developer tool for Windows, macOS, Linux, iOS, and Android.