Document Engine
Upload Any Document. AI Reads the Rest.
PDFs, Word files, CSVs, scanned images — uploaded, parsed, chunked, and indexed automatically. Your team searches content, not file systems.
Capabilities
From raw files to searchable knowledge in minutes.
Every Format That Matters
PDF, DOCX, CSV, XLSX, and scanned images via OCR. One upload endpoint handles all of them.
Structure-Aware Chunking
Documents split at paragraph and section boundaries — never mid-sentence, never across page breaks.
Background Processing
Upload and walk away. Documents process asynchronously with real-time status updates in the dashboard.
Vector Indexing
1024-dimension embeddings stored in pgvector with HNSW indexing. The index can be built on an empty table, so similarity search is fast from the first upload.
Rich Metadata
File name, page numbers, chunk positions, upload timestamps — all preserved and searchable.
Per-Client Isolation
Each client's documents live in a separate database schema. Structural isolation, not just access control.
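To make the isolation model concrete, here is a minimal sketch of how a per-client schema could be provisioned in PostgreSQL. It is an illustration under stated assumptions, not the product's actual code: the `client_` prefix, the slug rule for client IDs, the `chunks` table, and its column names are all hypothetical, chosen to mirror the metadata listed above.

```python
import re

# Hypothetical naming rule: client IDs are lowercase slugs such as "acme_corp".
_CLIENT_ID = re.compile(r"[a-z][a-z0-9_]{0,39}")


def schema_for(client_id: str) -> str:
    """Map a client ID to its own schema name, rejecting anything unsafe."""
    if not _CLIENT_ID.fullmatch(client_id):
        raise ValueError(f"invalid client id: {client_id!r}")
    return f"client_{client_id}"


def provisioning_sql(client_id: str) -> list[str]:
    """Return the statements that create one client's schema and chunk table.

    Table and column names are assumptions made for this example.
    """
    schema = schema_for(client_id)
    return [
        f'CREATE SCHEMA IF NOT EXISTS "{schema}"',
        f'CREATE TABLE IF NOT EXISTS "{schema}".chunks ('
        "id bigserial PRIMARY KEY, "
        "file_name text NOT NULL, "
        "page_number integer, "
        "chunk_position integer NOT NULL, "
        "uploaded_at timestamptz NOT NULL DEFAULT now(), "
        "content text NOT NULL)",
    ]
```

Because every statement is qualified with the client's schema name, a query issued for one client cannot reach another client's rows by accident. Validating the ID before it is interpolated matters here, since schema names cannot be passed as ordinary query parameters.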
The Processing Pipeline
Four steps from raw file to searchable knowledge.
Upload
Drop files via the dashboard or send them by email. The API accepts single files or batches.
Parse
Unstructured.io extracts text, tables, and structure from any supported format — including OCR for scans.
Chunk
Our RecursiveChunker splits content at natural boundaries, preserving headings, lists, and page context.
Index
Mistral Embed generates vectors. pgvector stores them with HNSW indexing for sub-second retrieval.
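The chunking step can be sketched in a few lines. This is a simplified, two-level illustration (paragraphs first, then sentences) and not the RecursiveChunker itself; the function names, the 500-character default, and the naive sentence splitter are assumptions made for the example.

```python
import re


def split_sentences(paragraph: str) -> list[str]:
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def chunk_page(text: str, max_chars: int = 500) -> list[str]:
    """Split one page into chunks without ever cutting a sentence.

    Paragraphs (separated by blank lines) are kept whole when they fit.
    Oversized paragraphs fall back to sentence boundaries. A single sentence
    longer than max_chars is emitted as its own chunk rather than cut.
    """
    chunks: list[str] = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= max_chars:
            chunks.append(paragraph)
            continue
        current = ""
        for sentence in split_sentences(paragraph):
            candidate = f"{current} {sentence}" if current else sentence
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks


def chunk_document(pages: list[str], max_chars: int = 500) -> list[dict]:
    """Chunk page by page, so no chunk spans a page break."""
    out = []
    for page_number, page in enumerate(pages, start=1):
        for position, chunk in enumerate(chunk_page(page, max_chars)):
            out.append({"page": page_number, "position": position, "text": chunk})
    return out
```

Processing each page separately is what keeps chunks from crossing page breaks, and it also gives every chunk a page number and position to store as metadata.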
Specs at a Glance
For the engineers doing due diligence.
Formats: PDF, DOCX, CSV, XLSX, images (OCR)
Embeddings: 1024 dimensions, Mistral Embed
Index: HNSW via pgvector, works on empty tables
Processing: Async with real-time status via API
Chunking: Recursive, structure-aware, page-boundary safe
Isolation: PostgreSQL per-client schema isolation
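For engineers who want to see what the index spec looks like in practice, here is a hedged sketch of the pgvector side. The `chunks` table, the index name, and the choice of cosine distance are assumptions for illustration; the `%s` placeholders assume a psycopg-style driver. Note that the HNSW index is created right after the table, before any rows exist.

```python
EMBEDDING_DIM = 1024  # dimension stated in the specs above

# Setup statements. Table and index names are hypothetical.
SETUP_SQL = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE IF NOT EXISTS chunks ("
    "id bigserial PRIMARY KEY, "
    "content text NOT NULL, "
    f"embedding vector({EMBEDDING_DIM}) NOT NULL)",
    # HNSW needs no training data, so it can be built on the empty table.
    "CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw "
    "ON chunks USING hnsw (embedding vector_cosine_ops)",
]

# Nearest-neighbour query: <=> is pgvector's cosine distance operator.
SEARCH_SQL = (
    "SELECT id, content FROM chunks "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)


def to_vector_literal(embedding: list[float]) -> str:
    """Format an embedding as pgvector's text form, e.g. '[0.1,0.2,...]'."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}"
        )
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

A caller would run the setup statements once, then execute `SEARCH_SQL` with `(to_vector_literal(query_embedding), 5)` to fetch the five closest chunks. Checking the dimension before the query is sent turns a mismatched embedding into a clear Python error instead of a database error.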
See document processing in action.
Upload a sample document during your demo and watch it get indexed and become searchable in real time.
Or email us at contact@ailoopwise.com