Crawl Content from Your Website

What type of site crawl is best for you? Learn how to gather website information to power your chatbot.

Site crawls

💡 Use Site Crawl to bring in all available information from your site at once.

  1. Paste in the URL of your website.
  2. We’ll automatically scan the site and extract content from on-page text and PDFs.
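If you are curious what "extracting on-page text" can look like, here is a minimal Python sketch. It is an illustration only, not the product's actual crawler: the class name, function name, and sample page are all made up for this example, and it only shows the general idea of keeping visible text while skipping scripts and styles.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects on-page text, skipping script and style contents.

    Hypothetical helper for illustration; not part of the product.
    """

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_visible_text(html):
    """Return the visible text of an HTML string as one line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up sample page for demonstration.
sample_page = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Opening hours</h1><p>Open daily from 9 to 5.</p>"
    "<script>track()</script></body></html>"
)
print(extract_visible_text(sample_page))
```

The heading and paragraph text survive, while the stylesheet and script contents are dropped. That is roughly the kind of content a chatbot can answer from.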
     

RSS feeds

💡 Use the RSS Feed option to connect event-related or time-sensitive content. Ideal for sites with frequently updated calendars or announcements.

  1. Paste in your RSS feed URL. 
  2. We’ll pull in information like event names, dates, times, and locations.
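To preview what your feed exposes before connecting it, a short Python sketch like the one below can help. It assumes a standard RSS 2.0 feed with `item`, `title`, `pubDate`, and `description` elements; the sample feed and the `parse_feed_items` function are invented for illustration and do not reflect how the product reads feeds internally.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample feed; a real feed would be downloaded from your RSS URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Community Events</title>
    <item>
      <title>Spring Open House</title>
      <pubDate>Sat, 12 Apr 2025 10:00:00 GMT</pubDate>
      <description>Main campus, Building A</description>
    </item>
    <item>
      <title>Summer Registration Deadline</title>
      <pubDate>Thu, 01 May 2025 17:00:00 GMT</pubDate>
      <description>Online only</description>
    </item>
  </channel>
</rss>"""


def parse_feed_items(feed_xml):
    """Return a list of dicts with title, date, and description per item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        raw_date = item.findtext("pubDate", default="").strip()
        items.append({
            "title": item.findtext("title", default="").strip(),
            # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
            "date": parsedate_to_datetime(raw_date) if raw_date else None,
            "description": item.findtext("description", default="").strip(),
        })
    return items


for entry in parse_feed_items(SAMPLE_FEED):
    print(entry["title"], "|", entry["date"], "|", entry["description"])
```

If the titles, dates, or locations you expect are missing from this kind of preview, they are probably missing from the feed itself, so it is worth fixing the feed before connecting it.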

Custom URLs

💡 Great for curated content strategies or partial site ingestion.

Use Custom URLs when:

  • You want to enter a few URLs manually
  • You have content spread across multiple domains or microsites
  • You have a list of important pages to add via Excel upload

You can either:

  • Manually enter individual URLs
  • Upload an Excel file with all URLs listed in a single column
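Before building your upload file, it can help to tidy the URL list. The Python sketch below is one possible way to do that, assuming you have already copied the column of URLs into a list of strings. The `clean_url_column` function and its rules (http/https only, exact duplicates removed) are assumptions made for this example; the product's own validation rules are not described here.

```python
from urllib.parse import urlsplit


def clean_url_column(rows):
    """Trim whitespace, drop non-URLs, and remove exact duplicates.

    Hypothetical helper: the rules here are illustrative assumptions.
    """
    seen = set()
    cleaned = []
    for raw in rows:
        url = raw.strip()
        parts = urlsplit(url)
        # Keep only absolute http(s) URLs that include a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url in seen:
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Made-up column contents spanning two domains.
column = [
    "https://example.com/pricing",
    " https://example.com/pricing ",
    "not a url",
    "https://docs.example.org/faq",
    "",
]
print(clean_url_column(column))
```

The cleaned list keeps its original order, so the pages you listed first stay first in the column.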


     

File crawls

💡 Perfect for internal manuals, onboarding guides, or product specs.

Use the From Files option when you have documents you want the chatbot to learn from:

  1. Upload PDFs, text files (.txt), or Word documents (.docx).
  2. We’ll extract the content and make it available to the chatbot for response generation.
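If you have a large folder of documents, a quick check like the Python sketch below can show which files match the types listed above. The `split_by_support` function is a hypothetical helper, and treating extensions as case-insensitive is an assumption for this example rather than documented product behavior.

```python
from pathlib import PurePath

# File types named in the steps above.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".docx"}


def split_by_support(filenames):
    """Split file names into (supported, unsupported) by extension."""
    supported, unsupported = [], []
    for name in filenames:
        # Assumption: compare extensions case-insensitively.
        if PurePath(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


# Made-up file names for demonstration.
ready, needs_conversion = split_by_support([
    "Onboarding Guide.PDF",
    "specs.docx",
    "notes.txt",
    "legacy.doc",
    "diagram.png",
])
print("Ready to upload:", ready)
print("Convert first:", needs_conversion)
```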

💡 Why this matters: All of these options help ensure your chatbot pulls from reliable, relevant information, whether it’s live on your site, sitting in a document, or published in an RSS feed. The better the crawl, the better the answers.