Knowledge Base
The Knowledge Base enables Retrieval-Augmented Generation (RAG): grounding AI responses in your organization's actual documents, policies, and data.
How It Works
flowchart LR
    subgraph Ingestion["Document Ingestion"]
        Upload["Upload"] --> Chunk["Chunking"] --> Embed["Embedding"] --> Store["Vector Store"]
    end
    subgraph Retrieval["Query-Time Retrieval"]
        Query["User Question"] --> QEmbed["Embed Query"] --> Search["Similarity Search"] --> Context["Top-K Context"]
    end
    Store -.-> Search
    Context --> LLM["LLM + Context"] --> Response["Grounded Response"]
Document Ingestion
Supported Formats
| Format | Description |
|---|---|
| PDF | Technical documents, policies, reports |
| Text | Plain text files, Markdown |
| CSV | Structured data, catalogs, inventories |
Ingestion Pipeline
Documents are processed through Apache Camel's integration engine:
- Upload: via the API or the admin UI
- Tenant isolation: documents are tagged with the tenant ID
- Chunking: documents are split into manageable chunks (default: 1000 tokens with a 200-token overlap)
- Embedding: each chunk is converted to a vector using the configured embedding model
- Storage: vectors are stored in MongoDB Atlas Vector Search
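The chunking step above can be illustrated with a minimal sliding-window sketch. It assumes the document has already been turned into a list of tokens; the function name `chunk_tokens` and the whitespace split used as a stand-in tokenizer are illustrative assumptions, not the splitter the pipeline actually uses.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=200):
    """Split a token list into overlapping windows.

    Hypothetical helper: the defaults mirror the documented 1000/200 values,
    but the real pipeline's splitter may behave differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


# Assumption: whitespace splitting stands in for a real tokenizer.
tokens = "one two three four five six seven eight nine ten".split()
small_chunks = chunk_tokens(tokens, chunk_size=4, overlap=1)
```

With the defaults, consecutive chunks share 200 tokens, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.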
API
# Upload a document
POST /api/v1/kb/documents
Content-Type: multipart/form-data
file: @document.pdf
sourceType: MANUAL
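As a rough illustration of the request above, the following sketch builds the multipart body with the Python standard library and posts it with `urllib`. The base URL, the helper names `build_multipart` and `upload_document`, and the absence of authentication headers are all assumptions; only the path and the `file` and `sourceType` field names come from the request shown above.

```python
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    file_content_type="application/octet-stream"):
    """Encode form fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    parts.append(f"Content-Type: {file_content_type}\r\n\r\n".encode())
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_document(base_url, path):
    """POST a local file to the documents endpoint.

    base_url is hypothetical; any required auth headers are not shown
    in the source and would need to be added.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            {"sourceType": "MANUAL"}, "file", path, fh.read()
        )
    request = urllib.request.Request(
        f"{base_url}/api/v1/kb/documents",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call might look like `upload_document("http://localhost:8080", "document.pdf")`, where the host and port are placeholders for your deployment.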
Vector Search
MongoDB Atlas Vector Search
Synaptiq uses MongoDB Atlas Vector Search for semantic similarity:
- Index type: HNSW (Hierarchical Navigable Small World)
- Dimensions: 768 (nomic-embed-text) or 1536 (OpenAI)
- Similarity metric: cosine similarity
- Top-K retrieval: configurable (default: 5 chunks)
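To make the similarity metric and Top-K setting concrete, here is a small brute-force sketch in plain Python. It computes exact cosine similarity over every stored vector, whereas an HNSW index returns an approximate result without scanning everything; the function names and the `(chunk_id, vector)` tuple layout are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=5):
    """Return the k (chunk_id, score) pairs with the highest similarity.

    stored: list of (chunk_id, vector) tuples, a hypothetical in-memory
    stand-in for the vector store.
    """
    scored = [(chunk_id, cosine_similarity(query_vector, vec))
              for chunk_id, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The dimension check matters in practice: a query embedded with a 768-dimension model cannot be compared against vectors stored at 1536 dimensions, so the query and the stored chunks must be embedded with the same model.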
Search API
Embedding Models
| Provider | Model | Dimensions | Notes |
|---|---|---|---|
| Ollama | nomic-embed-text | 768 | Local, free, recommended for dev |
| Google | Gemini Embedding | 768 | Cloud-based, production-ready |
| OpenAI | text-embedding-3-small | 1536 | Cloud-based, high quality |
The embedding model is configured in application-dev.yml.
RAG in Chat
When a knowledge base is configured, chat responses are automatically augmented:
- The user's message is embedded
- The Top-K most similar chunks are retrieved from the vector store
- The retrieved context is injected into the system prompt
- The LLM generates a response grounded in your documents
- Source citations are included in the response
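The context-injection step can be sketched as follows. This is only an illustration of the general pattern: the prompt wording, the function name `build_system_prompt`, and the `source` and `text` keys on each chunk are assumptions, since the actual prompt template is not shown here.

```python
def build_system_prompt(base_prompt, retrieved_chunks):
    """Append numbered, source-labelled context to a base system prompt.

    retrieved_chunks: list of dicts with hypothetical 'source' and 'text' keys.
    """
    if not retrieved_chunks:
        return base_prompt  # nothing retrieved, so leave the prompt unchanged

    context_lines = [
        f"[{index}] ({chunk['source']}) {chunk['text']}"
        for index, chunk in enumerate(retrieved_chunks, start=1)
    ]
    context = "\n".join(context_lines)
    return (
        f"{base_prompt}\n\n"
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}"
    )


prompt = build_system_prompt(
    "You are a helpful support assistant.",
    [{"source": "Returns Policy v2.3 §4.2",
      "text": "Electronics can be returned within 30 days of purchase."}],
)
```

Numbering each chunk and carrying its source label into the prompt is one way to let the model emit citations that map back to the retrieved documents.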
RAG-Augmented Response
User: "What is our return policy for electronics?"
Response: Based on the Company Returns Policy (v2.3, Section 4.2), electronics can be returned within 30 days of purchase in original packaging with receipt. Opened items are subject to a 15% restocking fee. Defective items can be exchanged within 90 days under warranty.
Sources: Returns Policy v2.3 §4.2, Warranty Terms §2.1
Best Practices
Document Quality
- Use well-structured documents with clear headings and sections
- Avoid scanned PDFs without OCR; use text-based PDFs
- Keep documents up to date; outdated context leads to outdated answers
Chunk Size Optimization
- Smaller chunks (500 tokens): better for precise Q&A
- Larger chunks (2000 tokens): better for context-heavy responses
- Adjust overlap to prevent losing context at chunk boundaries
Domain-Specific Vocabularies
- Upload glossaries and terminology guides
- The RAG pipeline will ground responses in your domain's language