If Crawlers are books from the public library (the web), Documents are the private files in your office. This is where you upload your confidential manuals, Excel price lists or internal PDF guides. You give the agent direct access to specific files so it becomes an instant expert in your business.

📥 What can I upload here? (Magic formats)

Your agent can read the following formats:
  • 📄 PDF: Manuals, catalogues, brochures.
  • 📊 XLSX / CSV: Price tables, stock lists, simple databases.
  • 📝 TXT / JSONL: Quick notes, logs or structured data.

⚙️ How to manage your files (The Upload Zone)

It’s as easy as it looks:
  1. Drag & Drop: Drag your file into the box or click to browse your computer.
  2. Processing: The system will read (“parse”) the text so the agent can understand it.
  3. Ready! Once processed, you’ll see the file in the list below.

List Controls:

  • Name: The file name. Keep it clear (e.g. Prices_2024.pdf instead of doc1.pdf).
  • Toggle (The Switch): The most useful feature.
    • 🔵 ON: The agent reads and uses this file.
    • OFF: The file is stored but ignored by the agent. Perfect for drafts or older versions you don’t want to delete completely.
  • Bin 🗑️: Permanently removes the file from the agent’s memory.

⚠️ Golden Rules (What NOT to do)

To keep everything running smoothly, keep this in mind:

1. Empty files = Error 🚫

If you upload a PDF that contains only scanned images (no selectable text) or a corrupted file, you’ll see an error like [MARKITDOWN] pdf is empty.
Solution: Run OCR on scanned PDFs first so they contain selectable text, and check that Excel files contain actual data.
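
If you want to check a file before uploading it, the idea can be sketched in a few lines of Python. This is a minimal sketch, not part of the platform: it assumes you have already extracted the text with whatever tool you use, and the function names and the 20-character threshold are hypothetical.

```python
import csv
import io


def has_usable_text(extracted: str, min_chars: int = 20) -> bool:
    """Return True if the text has at least min_chars non-whitespace characters.

    The threshold is an arbitrary assumption for illustration.
    """
    visible = "".join(extracted.split())
    return len(visible) >= min_chars


def csv_has_data(csv_text: str) -> bool:
    """Return True if any row after the header contains a non-blank cell."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return any(cell.strip() for row in rows[1:] for cell in row)
```

A scanned PDF with no text layer would typically give an empty string here, so `has_usable_text` would return False and you would know to run OCR first.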

2. Watch the size (Split it up!) ✂️

There is a word limit per file to avoid overloading the agent’s brain.
  • Does your manual have 500 pages? Don’t upload it as one file.
  • Split it: Upload Manual_Part1.pdf, Manual_Part2.pdf, etc.
  • The agent is smart and will combine information from all active files to give a coherent answer.
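
For plain-text files, the splitting can be automated. The sketch below assumes you choose the word limit yourself, since the exact limit is not stated here; `split_by_words` and `part_names` are hypothetical helpers, not platform features.

```python
def split_by_words(text: str, max_words: int) -> list[str]:
    """Split text into consecutive parts of at most max_words words each.

    Note: words are re-joined with single spaces, so the original
    line breaks are not preserved.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def part_names(stem: str, count: int, ext: str = ".txt") -> list[str]:
    """Build names like Manual_Part1.txt, Manual_Part2.txt, ..."""
    return [f"{stem}_Part{n}{ext}" for n in range(1, count + 1)]
```

Each returned part can then be saved under the matching name and uploaded as its own file.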

🧠 How does the agent reason with this?

Imagine you upload three files and switch them ON:
  1. Washing_Machine_Manual.pdf
  2. Spare_Parts_Prices.xlsx
  3. Warranty_Policy.txt
Now suppose a user asks: “How much does part X cost and is it covered by the warranty?”
The agent will:
  1. Look up the price in the Excel.
  2. Read the conditions in the TXT.
  3. Combine both and reply: “The part costs £20 and yes, it is covered under clause 3.” 🤯
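
To make the idea concrete, here is a toy illustration in Python. It is purely conceptual and does not show how the agent works internally: the dictionaries, the made-up data and the `answer` function are all assumptions standing in for the two active files.

```python
# Toy stand-ins for two active documents (made-up data for illustration).
prices = {"Part X": 20}                       # as if read from Spare_Parts_Prices.xlsx
warranty_clauses = {3: ["Part X", "Part Y"]}  # clause number -> covered parts


def answer(part: str) -> str:
    """Combine a price lookup with a warranty lookup into one reply."""
    price = prices.get(part)
    if price is None:
        return f"I couldn't find a price for {part}."
    # Find the first clause that lists this part, if any.
    clause = next(
        (n for n, parts in warranty_clauses.items() if part in parts), None
    )
    if clause is None:
        return f"{part} costs £{price}; it is not covered by the warranty."
    return f"{part} costs £{price} and is covered under clause {clause}."
```

The point is only that the reply draws on more than one source: one lookup supplies the price, the other supplies the warranty clause.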

🎓 Best Practices Summary (Cheat Sheet)

  • Descriptive names: Help the agent (and yourself) understand what’s inside. Wedding_Menu.pdf is better than scan001.pdf.
  • Updates: If you upload a new version (Prices_V2.xlsx), remember to switch OFF or delete the old one (Prices_V1.xlsx) so the agent doesn’t find two conflicting prices.
  • Clean data: Avoid uploading Excel files with thousands of empty rows or PDFs with heavy watermarks. The cleaner the text, the smarter the answer.
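
If you export your spreadsheet to CSV, empty rows can be stripped out before uploading. This is a minimal sketch using only Python's standard library; `drop_empty_rows` is a hypothetical helper, not something the platform provides.

```python
import csv
import io


def drop_empty_rows(csv_text: str) -> str:
    """Return the CSV text with rows removed where every cell is blank."""
    reader = csv.reader(io.StringIO(csv_text))
    # Keep a row only if at least one cell has non-whitespace content.
    kept = [row for row in reader if any(cell.strip() for cell in row)]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(kept)
    return out.getvalue()
```

The header row is kept as long as it has at least one non-blank cell, so the column names survive the clean-up.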
Start filling your agent’s brain with your best documents! 📂✨