Frequently Asked Questions
How to Deal with an Excess of Site-Internal Norobots?
If you are encountering pages in norobots while crawling with InterroBot, a workaround is to modify your site's robots.txt to globally allow InterroBot across your own site. As robots.txt is read top to bottom, it is best to add the allow rule at the bottom of the file, which reduces the potential for other rules to override it.
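As a rough illustration, the sketch below appends an allow group to the end of a hypothetical robots.txt and checks the result with Python's standard-library parser. The user-agent token "InterroBot", the /private/ rule, and the example.com URLs are assumptions made for the example; Python's parser is only a stand-in, and InterroBot's own matching may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: existing rules first, the InterroBot group added last.
# The "InterroBot" user-agent token is an assumption; confirm it before relying on it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: InterroBot
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/private/page.html"
    print(is_allowed(ROBOTS_TXT, "InterroBot", page))  # True
    print(is_allowed(ROBOTS_TXT, "OtherBot", page))    # False
```

With this layout, other crawlers remain bound by the existing rules, and only the InterroBot group is opened up.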
For sites beyond your control, there is no way to override the behavior. InterroBot does not party as an uninvited guest.
When is it OK to Use the Personal Edition?
The Personal Edition is intended for non-commercial use, or for commercial use by a sole proprietor or corporation whose annual gross receipts total less than $80,000 USD. Gross receipts represent pre-tax, pre-expense (inclusive of salary) revenue.
This makes sense to me, but if you think you've hit an edge case (a letter vs. spirit scenario), contact me via the in-app contact form.
What are the Page Limits of the Crawler?
There are no hard limits, only the limits of the host machine. Most sites with fewer than 100,000 pages should be fine. Storage is a consideration, however. The more pages you crawl, the more disk space you will need. It is not unheard of for a large site to use a gigabyte of storage.
I Found a Bug?
Great! Let me know. There's a contact form in the client application, on the Options page (the gear icon).