AI Data Access

Feeding web content into an AI model is just a couple of clicks away with InterroBot. There are three common ways to connect an LLM to InterroBot data: WARC export, the Model Context Protocol (MCP), and spreadsheet export.

LLM via Exporter Plugin

The Exporter Plugin provides a straightforward way to package your crawled content into Web Archive (WARC) format, ready for AI consumption. There are two WARC export options to choose from, each with its own advantages.

The content analysis WARC export strips away HTML complexity, converting pages to markdown format. This approach significantly increases how much content you can feed into an AI system at once. While you lose some of the technical nuance of the original HTML, you gain the ability to work with larger portions of your site's content.

For those needing complete technical analysis, the HTML WARC export preserves the full webpage structure. This gives you everything (tags, attributes, and page semantics), but the tradeoff is size. Keep an eye on your page count here, as HTML can be verbose. Exports under 200 pages are typically manageable for foundation models, though your mileage may vary depending on how complex your pages are, your AI subscription, and similar limits. If your entire website can be loaded this way, it may be all you need.
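As a rough way to judge whether an export is likely to fit, the sketch below estimates a token count from character counts. The 4-characters-per-token ratio is a common rule of thumb for English text, not a measured value for any particular model. The 200,000-token budget and the function names are illustrative assumptions, not InterroBot settings.

```python
# Rough sizing sketch. The ratio and budget are assumptions, not values
# published by InterroBot or by any model provider.
CHARS_PER_TOKEN = 4        # rule of thumb for English prose
TOKEN_BUDGET = 200_000     # hypothetical context window size


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its length."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_budget(pages: list[str], budget: int = TOKEN_BUDGET) -> tuple[int, bool]:
    """Return (estimated total tokens, whether the total is within budget)."""
    total = sum(estimate_tokens(page) for page in pages)
    return total, total <= budget
```

Markup-heavy HTML may tokenize less efficiently than prose, so the estimate is likely optimistic for the HTML export.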

Exporting crawl data to WARC format using the Exporter plugin.

Note that binary content, such as images and videos, is not included in either export type. InterroBot optimizes crawls using HTTP HEAD requests where possible, capturing metadata without downloading large binary files. This keeps exports focused on textual content while still preserving important details like file sizes and HTTP headers. InterroBot-produced WARC is intended for LLM consumption, with WARC chosen because it is a standard, well-understood format. It should not be used for archival purposes.
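If you want to inspect an export before handing it to a model, a WARC file can be walked with a few lines of code. This is a minimal sketch of a reader for the general WARC record layout (a version line, headers, a blank line, then a body of Content-Length bytes). It assumes the file is uncompressed, and it makes no claim about which record types or headers InterroBot writes. The function name is hypothetical.

```python
def iter_warc_records(stream):
    """Yield (headers, body) for each record in an uncompressed WARC stream.

    `stream` must be a binary file object. Header names are lowercased.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected WARC version line, got {line!r}")

        # Read "Name: value" header lines until the blank line before the body.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()

        # The body is exactly Content-Length bytes long.
        body = stream.read(int(headers.get("content-length", "0")))
        yield headers, body
```

Opening the export with `open(path, "rb")` and looping over `iter_warc_records` gives a quick page count and a look at the headers of each record.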

Model Context Protocol (MCP)

Windows and macOS only, requires Claude Desktop

WARC exports give you your entire website in one package—load it once and you're done. MCP takes a different approach, providing an interactive connection to your website data. Think of it as having a live conversation with your content—you can explore and analyze in real-time, following wherever your investigation leads.

This interactive approach shines with larger sites. While WARC loads everything at once, MCP lets you explore your content dynamically. You might start with a specific section, dive into related pages, or pull random samples to get a feel for different areas. The AI can guide this exploration, helping craft queries that reveal new insights as you go.

For the technically inclined, you're getting direct SQL access to InterroBot's structured data. This means you can leverage the full power of SQL queries, from simple content pulls to complex analysis combining full-text search, metadata, and crawl statistics. Your AI assistant can help craft these queries, often discovering patterns and relationships you might not have considered.
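To get a feel for what is queryable, you can ask SQLite itself to describe the database. The sketch below opens a database file read-only and lists each table with its columns. It assumes nothing about InterroBot's schema, and the function names are illustrative.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open a SQLite file read-only so queries cannot modify it."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_tables(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name to its column names, as reported by SQLite."""
    tables = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in rows:
        quoted = name.replace('"', '""')
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        tables[name] = [column[1] for column in columns]
    return tables
```

Pointing `open_read_only` at your interrobot.v2.db file and printing the result of `list_tables` gives you, or your AI assistant, the table and column names needed to start writing real queries.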

Connecting InterroBot's database through MCP for dynamic content access.

Avoid writing to the InterroBot database over the MCP connection. If you must, or were going to do it anyway, remember that database backups (available through application options) are key to getting out of a bind. Claude will eventually surprise you with an unexpected drop or overwrite.
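Alongside the backup option in the application, a copy of the database file can also be made from a script before an experimental session. This is a minimal sketch using the backup method in Python's standard sqlite3 module. The function name and paths are placeholders, and it is meant to complement the built-in backups, not replace them.

```python
import sqlite3


def backup_database(source_path: str, backup_path: str) -> None:
    """Copy a SQLite database to backup_path using SQLite's online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # copies all pages of the source into the target
    finally:
        target.close()
        source.close()
```

Note that `sqlite3.connect` creates an empty file if the source path does not exist, so double-check the path before relying on the copy.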

Windows users: The configuration presented in the InterroBot application options may be insufficient. If your content uses extended characters (anything beyond windows-1252 encoding, even a single emoji), you'll need to explicitly specify UTF-8 in the connection through the env entry. Replace username in the path with your Windows username.

{
  "mcpServers": {
    "InterroBot": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path",
          "C:\\Users\\username\\Documents\\InterroBot\\interrobot.v2.db"],
      "env": {"PYTHONIOENCODING": "utf-8"}
    }
  }
}
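Hand-edited JSON is easy to get subtly wrong, so a quick check before restarting Claude Desktop can save a round of troubleshooting. The sketch below parses a configuration string and checks that every entry under mcpServers has a command. That check also catches an env block that has slipped outside its server entry. The function name is hypothetical, and the checks are a sanity test, not a complete schema.

```python
import json


def check_mcp_config(config_text: str) -> list[str]:
    """Return a list of problems found in an mcpServers config (empty if none)."""
    config = json.loads(config_text)  # raises json.JSONDecodeError on bad syntax
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ['top-level "mcpServers" object is missing']

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict) or "command" not in entry:
            problems.append(f'server "{name}" has no "command"')
            continue
        if not isinstance(entry.get("args", []), list):
            problems.append(f'server "{name}": "args" should be a list')
        if not isinstance(entry.get("env", {}), dict):
            problems.append(f'server "{name}": "env" should be an object')
    return problems
```

An empty list means the structure looks plausible. It does not confirm that the database path exists or that uvx is installed.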

Extending the Utility of Spreadsheets

Every corner of InterroBot offers data export capabilities. Whether you're looking at a spell check report, checking links, or running a custom report, there's usually a CSV or Excel export nearby. These aren't just for spreadsheet analysis; they're also perfect fodder for LLM assistance.

AI can help you make sense of these reports in ways traditional analysis might miss. Feed a keyword cannibalization report into an AI assistant and you might discover content patterns that weren't immediately obvious. Or let AI analyze your link checker results to suggest priority fixes based on user impact. Even exported spell check CSV can benefit from AI review, helping separate true errors from industry jargon or intentional styling.

The tabular format of these exports makes them particularly AI-friendly. You're not just getting raw numbers; you're getting structured data that AI can parse, analyze, and contextualize. This means you can have meaningful conversations about your data, asking questions and getting insights that might not be apparent from looking at the raw spreadsheet.

Exported CSV from reports can be investigated further with AI.
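When a chat interface accepts pasted text but not file uploads, one option is to convert the CSV into a compact Markdown table first. The sketch below does that with Python's standard csv module and caps the row count to keep the prompt small. The function names and the 50-row default are illustrative assumptions, and it makes no assumptions about the columns in any particular InterroBot report.

```python
import csv
import io


def _markdown_row(cells, width):
    """Format one table row, padding or trimming it to `width` cells."""
    cells = (list(cells) + [""] * width)[:width]
    # Escape pipes and flatten newlines so cell text cannot break the table.
    cleaned = [cell.replace("|", "\\|").replace("\n", " ") for cell in cells]
    return "| " + " | ".join(cleaned) + " |"


def csv_to_markdown(csv_text: str, max_rows: int = 50) -> str:
    """Render CSV text as a Markdown table with at most max_rows data rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, data = rows[0], rows[1 : max_rows + 1]
    width = len(header)
    lines = [_markdown_row(header, width), _markdown_row(["---"] * width, width)]
    lines += [_markdown_row(row, width) for row in data]
    return "\n".join(lines)
```

For large reports, sending the first rows together with a question about the column meanings is often a reasonable opening prompt before sharing more data.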

Prompting

These suggestions highlight ways to leverage AI with your InterroBot database:

  • Website Content Management. Allow the AI to identify all instances where price, policy, or representative information appears across your website. After updates, you can trigger a recrawl and have the AI verify the changes were comprehensive.
  • Technical SEO Enhancement. The AI can analyze your site structure and provide customized SEO recommendations based on your specific business goals and target audience. It can help prioritize improvements for maximum impact.
  • Accessibility Assessment. While AI shouldn't replace professional accessibility audits, it can identify major accessibility issues that directly impact user experience, helping you address the most pressing concerns first.
  • Performance Optimization. AI can analyze performance metrics and site structure to identify the root causes of slow loading times or excessive resource usage. It can then suggest a prioritized list of optimizations based on potential impact.
  • Marketing Strategy Development. Use the AI to analyze your website content as a comprehensive data source, both for generating marketing ideas and ensuring accuracy in product and service descriptions.

InterroBot is a web crawler and developer tool for Windows, macOS, and Android.