❓ FAQ¶

Quick answers to the most common questions about Spector. Can't find what you're looking for? Check GitHub Discussions or the specific wiki pages linked throughout.

🌟 General¶

What Java version do I need?¶

JDK 25 or later. Spector uses the Java Vector API (incubator module) for SIMD acceleration and Panama FFM for off-heap memory. OpenJDK builds include these by default.

Does it work without a GPU?¶

Yes, completely. GPU is optional. Without a GPU, Spector uses CPU SIMD acceleration (AVX2/AVX-512/NEON) which delivers sub-millisecond search at 100K documents. GPU helps primarily for high-concurrency batch workloads.

Tip

See GPU Acceleration for details on when GPU adds value (spoiler: batch sizes > 32).

Can I use it as an embedded library?¶

Absolutely! Spector runs in two modes:

Mode	Description	Overhead
Embedded	Add JAR to classpath, create `SpectorEngine`	Zero network overhead
Server	REST API with auth, CORS, metrics	HTTP overhead

try (var engine = new SpectorEngine(SpectorConfig.DEFAULT.withDimensions(384))) {
    engine.ingest("id", "content", vector);
    var results = engine.hybridSearch("query", queryVector, 10);
}

What about persistence? Do I lose data on restart?¶

No! Spector supports persistence through memory-mapped files. The HNSW index uses a page-aligned binary format that loads instantly via mmap — no deserialization needed. Vector data survives restarts.

How does it compare to Elasticsearch?¶

Aspect	⚡ Spector	Elasticsearch
Vector search latency	0.13 ms (100K, in-process)	2–10 ms
Hybrid search latency	1.01 ms (100K, in-process)	10–30 ms
Deployment	Embedded JAR or server	Cluster only
Dependencies	Zero (JDK only)	JVM + heavy stack
GPU support	✅ CUDA	❌
IVF-PQ compression	✅ 32×	❌

Elasticsearch excels at distributed full-text search with a mature query language and ecosystem. Spector excels at raw in-process performance, embedded use, and modern JVM features. The latency advantage is largest for in-process embedded use; network-bound deployments narrow the gap.

Does it support filtering/metadata queries?¶

Yes. The Spring AI integration supports filter expressions:

vectorStore.similaritySearch(
    SearchRequest.query("search algorithms")
        .withFilterExpression("category == 'indexing' && version > 2")
);

What embedding models work with Spector?¶

Any model that produces float32 vectors. Set dimensions to match:

Model	Dimensions	Provider
all-MiniLM-L6-v2	384	Sentence Transformers / Ollama
e5-base-v2	768	Sentence Transformers
text-embedding-ada-002	1536	OpenAI
nomic-embed-text	768	Ollama
mxbai-embed-large	1024	Ollama

Note

Spector includes an Ollama embedding provider out of the box. Implement the EmbeddingProvider SPI for any other source.

🔧 Technical¶

What similarity functions are supported?¶

Function	Best For
COSINE (default)	Normalized embeddings (most models)
DOT_PRODUCT	Unnormalized embeddings, magnitude matters
EUCLIDEAN	Spatial/geometric data

What's the maximum dataset size?¶

Mode	Scale
Single node	Up to 10 million documents
IVF-PQ mode	Billions of vectors (32× compression)
Distributed mode	Scale horizontally (2–256 shards)

How does the LLM re-ranking work?¶

flowchart LR
    A["🔍 Search<br/>Top-N candidates"] --> B["🤖 LLM (Ollama)<br/>Listwise scoring"]
    B --> C["✨ Re-ranked<br/>Top-K results"]

Vector/hybrid search retrieves top-N candidates (default: 20)
Candidates sent to Ollama for listwise relevance scoring
LLM reorders based on semantic relevance
Final top-K results reflect LLM judgment

Warning

Adds 100–500ms latency but significantly improves precision for ambiguous queries.

What are virtual threads and why do they matter?¶

Virtual threads (Project Loom) are lightweight threads that don't map 1:1 to OS threads:

✅ Handle millions of concurrent requests without pool tuning
✅ No synchronized blocks that pin platform threads
✅ Near-zero scheduling overhead
✅ Linear scaling (4.5× at 16 threads measured)

How does zero-copy storage work?¶

Vectors are stored in memory-mapped files using Panama's MemorySegment:

OS maps file directly into process address space
SIMD kernels read vectors without copying to Java heap
Zero garbage collection pressure
Instant startup (no deserialization)
Supports datasets larger than available RAM

What's the difference between HNSW and IVF-PQ?¶

Aspect	🌐 HNSW	🗜️ IVF-PQ
Speed	Fastest (0.05ms)	Fast (nprobe-dependent)
Memory	Full vectors (1.5KB/vec @ 384-dim)	32× compressed (48 bytes/vec)
Recall	High (configurable)	Moderate (nprobe-dependent)
Scale	Up to millions	Up to billions
Use case	Default for most workloads	Memory-constrained, billion-scale

Can I run benchmarks in CI?¶

Yes! JSON output + baseline regression detection:

mvn -pl spector-bench exec:java -Dexec.args="-rf json -rff results.json"

⚙️ Operations¶

What ports does Spector use?¶

Port	Protocol	Purpose
7070	HTTP	REST API (configurable)
9090	gRPC	Cluster communication (distributed mode)

How do I monitor Spector?¶

curl http://localhost:7070/health          # Health check
curl http://localhost:7070/api/v1/status    # Engine status
curl http://localhost:7070/api/v1/metrics   # Request metrics

What JVM arguments should I use in production?¶

java \
  --add-modules jdk.incubator.vector \
  --enable-native-access=ALL-UNNAMED \
  -XX:+UseZGC -XX:+ZGenerational \
  -Xmx4g -Xms4g \
  -jar spector.jar

How do I upgrade without downtime?¶

Distributed mode: 1. Drain one node (stop routing requests) 2. Upgrade the node binary 3. Restart and wait for replica sync 4. Repeat for each node

Embedded mode: Standard application deployment with new Spector version.

Is there authentication?¶

Yes. Set an API key at server startup:

# via environment variable
SPECTOR_API_KEY=my-secret-key mvn -Psynapse -pl synapse/spector-synapse spring-boot:run

# or via Spring Boot argument
mvn -Psynapse -pl synapse/spector-synapse spring-boot:run -- --spector.api-key=my-secret-key

Clients include X-API-Key: my-secret-key in requests. Without a key configured, all requests are allowed.

🔗 See Also¶

Getting Started — Quick start guide
What is Spector — Product overview
Configuration Guide — All parameters
Performance Tuning — Optimization strategies