spector-embed-ollama 🤖¶

Out-of-the-box Ollama embedding integration, fallback handling, and parallel batch calling for Spector.

spector-embed-ollama implements the EmbeddingProvider contract for local Ollama instances. It supports parallel API calls, high-throughput batching, automatic JSON escape handling, and resilient connection timeout fallbacks.

🏗️ Core Architecture & Roles¶

OllamaEmbeddingProvider: Connects to local or remote Ollama HTTP servers (e.g. http://localhost:11434/api/embed) using asynchronous JDK HTTP Clients.
Parallel GPU Batching: Splits large text collections into optimal GPU batches (e.g., 500 vectors) to saturate local GPU accelerators.
Resiliency Fallbacks: Manages connection pooling, HTTP request timeouts, and automatically retries failed batches to ensure ingestion pipeline safety.

🚀 Key APIs¶

Configuring Ollama Provider¶

// Connect to a local Ollama service running qwen3-embedding
EmbeddingProvider provider = new OllamaEmbeddingProvider(
    "http://localhost:11434",
    "qwen3-embedding"
);

// Single vector generation
float[] vector = provider.embed("Spector uses Panama FFM");

// Batch generation
List<String> sentences = List.of("First sentence", "Second sentence");
float[][] batchVectors = provider.embedBatch(sentences);