spector-rag 🤖¶

Zero-dependency Retrieval-Augmented Generation (RAG) context assembly and prompt pipelines.

spector-rag implements zero-dependency RAG context orchestration. It executes semantic/hybrid searches on SpectorEngine, ranks and filters the top retrieved passages using listwise re-rankers, assembles context packages, and templates LLM prompts.

🏗️ Core Architecture & Roles¶

Context Assembler (ContextAssembler): Aggregates text passages from hybrid search, deduplicates entries, and trims contexts based on maximum LLM token budgets.

Prompt Template Controller (PromptTemplate): Maps search results to custom markdown templates:

Answer the query based ONLY on the following context:
---
[Context Passages]
---
Query: [User Query]

RAG Client (RagClient): Handles remote connection streams with Ollama or external OpenAI-compatible services.

🚀 Key APIs¶

Generating a Prompt Template¶

// Retrieve context using SpectorEngine
SearchResponse response = engine.hybridSearch(query, queryVector, 5);

// Format prompt based on Markdown template
String contextPrompt = PromptTemplate.DEFAULT
    .withQuery(query)
    .withResults(response.results())
    .assemble();

// Send prompt directly to your LLM provider
String llmResponse = RagClient.generate(contextPrompt);