Upload documents to a dataset
Documents are the unit of content inside a dataset. Each file is parsed, chunked, embedded, and added to the index. Agents retrieve chunks at runtime.
What you'll learn
- Which file formats Dezifi accepts
- How to upload via drag-and-drop or file picker
- How long indexing takes and how to monitor it
- How to view, replace, and delete documents
Supported formats
Dezifi accepts PDF, DOCX, TXT, CSV, and JSON files. PDF and DOCX files go through layout-aware extraction. CSV files are parsed row by row and JSON files path by path, so each structured record stays retrievable as a unit. The maximum file size is 50 MB per document; split larger files before uploading.
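A local check before uploading can catch the two limits described above. The following is a minimal sketch, not part of Dezifi: the `preflight` helper and its constants are hypothetical, and it assumes "50 MB" means 50,000,000 bytes, the more conservative reading.

```python
from pathlib import Path

# Formats and size limit taken from the section above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".csv", ".json"}
MAX_BYTES = 50_000_000  # assumption: decimal megabytes


def preflight(path: Path, max_bytes: int = MAX_BYTES) -> list[str]:
    """Return the reasons this file would likely be rejected (empty if none)."""
    problems = []
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if path.is_file():
        size = path.stat().st_size
        if size > max_bytes:
            problems.append(f"too large: {size} bytes (limit {max_bytes})")
    return problems
```

An empty list means the file passes both checks. The extension test looks only at the file name, so it cannot detect a corrupt or password-protected file.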
Upload a file
1. Open the dataset
From Knowledge Base, click the dataset you want to load into. You'll see the document list, which is empty on a new dataset.
2. Drag files in or click Upload
Drag one or more files onto the page, or click Upload to open the file picker. Multi-select is supported, and you can mix formats in the same batch.
3. Watch indexing progress
Each row shows a status: Queued, Parsing, Embedding, Indexed, or Failed. A small PDF indexes in seconds; a 50 MB file may take a few minutes. The dataset is queryable as soon as the first document reaches Indexed.
4. Handle failures
A Failed row shows the reason, usually an unreadable PDF, a password-protected file, or a file over the size limit. Fix the problem and click Retry, or delete the row and re-upload a clean copy.
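The row statuses above can be modeled as a small enumeration, which may help when reasoning about a batch. This is an illustrative sketch only: `DocStatus` and `summarize` are hypothetical names, and nothing here calls a Dezifi API.

```python
from enum import Enum


class DocStatus(Enum):
    QUEUED = "Queued"
    PARSING = "Parsing"
    EMBEDDING = "Embedding"
    INDEXED = "Indexed"
    FAILED = "Failed"


# Statuses that do not change again without user action.
TERMINAL = {DocStatus.INDEXED, DocStatus.FAILED}


def summarize(rows: dict[str, DocStatus]) -> dict[str, object]:
    """Summarize a batch of document rows keyed by file name."""
    return {
        # The dataset is queryable once at least one document is Indexed.
        "queryable": any(s is DocStatus.INDEXED for s in rows.values()),
        "in_progress": sorted(n for n, s in rows.items() if s not in TERMINAL),
        "needs_attention": sorted(n for n, s in rows.items() if s is DocStatus.FAILED),
    }


rows = {
    "a.pdf": DocStatus.INDEXED,
    "b.csv": DocStatus.EMBEDDING,
    "c.pdf": DocStatus.FAILED,
}
print(summarize(rows))
# {'queryable': True, 'in_progress': ['b.csv'], 'needs_attention': ['c.pdf']}
```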
View and manage documents
1. Open a document
Click any row to see the parsed text and the chunks Dezifi extracted. This is what the agent will retrieve.
2. Replace a document
Upload a new version with the same file name to overwrite the existing document. The old chunks are removed from the index and replaced atomically.
3. Delete a document
Use the row action menu. Deletion is immediate: chunks are pulled from the index, so agents stop retrieving them on the next query.
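The replace and delete behavior described above can be pictured with a toy in-memory index. This is a conceptual sketch, not a description of how Dezifi is implemented; `ToyIndex` and its methods are hypothetical.

```python
class ToyIndex:
    """In-memory stand-in for a dataset index, keyed by document name."""

    def __init__(self) -> None:
        self._chunks: dict[str, list[str]] = {}

    def upload(self, name: str, chunks: list[str]) -> None:
        # Uploading under an existing name overwrites it. The new list is
        # built first and swapped in with one assignment, so a search never
        # sees a mix of old and new chunks.
        self._chunks[name] = list(chunks)

    def delete(self, name: str) -> None:
        # After this call, the document's chunks no longer appear in searches.
        self._chunks.pop(name, None)

    def search(self, term: str) -> list[str]:
        """Return every chunk containing the term, ignoring case."""
        needle = term.lower()
        return [
            chunk
            for chunks in self._chunks.values()
            for chunk in chunks
            if needle in chunk.lower()
        ]


index = ToyIndex()
index.upload("policy.txt", ["Refunds within 14 days"])
index.upload("policy.txt", ["Refunds within 30 days"])  # same name: replaces
print(index.search("refunds"))  # ['Refunds within 30 days']
```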
Frequently asked questions
- Can I upload a folder or zip?
  Yes. Zip uploads are expanded automatically and each file inside is indexed individually. Folder uploads work in browsers that support directory selection. Each contained file must be one of the supported formats.
- How are PDFs with images or tables handled?
  Text is extracted through layout-aware parsing. Tables become CSV-like chunks. Embedded images are currently skipped; for image-heavy content, export to a text-first DOCX or pre-process with OCR.
- Are my documents used to train any model?
  No. Uploaded documents are stored in your workspace's vector store and used only for retrieval by your agents. They are not used to train base models or shared across tenants.
- Why is one file stuck in Parsing?
  The most common causes are a malformed PDF or an unusually large file. Check the row's error detail. If a file under 50 MB stays in Parsing for more than ten minutes, delete it and re-upload.
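For the zip question above, an archive can be inspected locally before upload to spot members that are unlikely to index. This is a hedged sketch using Python's standard `zipfile` module; `check_zip` is a hypothetical helper, and it assumes the 50 MB per-document limit applies to each file inside the archive, measured by uncompressed size.

```python
import zipfile
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".csv", ".json"}
MAX_BYTES = 50_000_000  # assumption: decimal megabytes, applied per member


def check_zip(zip_path: str) -> dict[str, str]:
    """Map each problematic archive member to the reason it may be rejected."""
    problems = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries hold no content
            suffix = PurePosixPath(info.filename).suffix.lower()
            if suffix not in SUPPORTED_EXTENSIONS:
                problems[info.filename] = "unsupported format"
            elif info.file_size > MAX_BYTES:  # uncompressed size
                problems[info.filename] = "over the size limit"
    return problems
```

An empty dictionary means every member has a supported extension and fits under the assumed limit.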