spector-gpu 🖥️
GPU acceleration for Spector using JNI-free Panama FFM interop with CUDA.
spector-gpu accelerates batch vector similarity by offloading distance calculations to NVIDIA GPUs. Using Project Panama's Foreign Function & Memory (FFM) API, it loads the CUDA driver library (nvcuda.dll on Windows, libcuda.so on Linux) and binds memory buffers directly to GPU contexts, with no JNI C++ glue code.
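As a rough illustration of the JNI-free loading step, the sketch below opens the CUDA driver library through FFM and calls the driver API function `cuInit(0)`. The class `CudaDriverProbe` and its methods are hypothetical names chosen for this example, not part of spector-gpu. It assumes Java 22 or later, where `java.lang.foreign` is a final API, and it returns an empty result when no driver library can be found.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only; not the spector-gpu API.
final class CudaDriverProbe {

    // Picks the driver library file name for the given os.name value.
    static String driverLibraryName(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? "nvcuda.dll" : "libcuda.so";
    }

    // Calls cuInit(0) and returns its CUresult code,
    // or Optional.empty() if the driver library cannot be loaded.
    static Optional<Integer> initDriver() {
        String name = driverLibraryName(System.getProperty("os.name"));
        try (Arena arena = Arena.ofConfined()) {
            // Throws IllegalArgumentException when the library is not found.
            SymbolLookup cuda = SymbolLookup.libraryLookup(name, arena);
            MemorySegment cuInit = cuda.find("cuInit").orElseThrow();

            // C signature: CUresult cuInit(unsigned int Flags);
            MethodHandle handle = Linker.nativeLinker().downcallHandle(
                    cuInit,
                    FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT));

            int result = (int) handle.invokeExact(0);
            return Optional.of(result);
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // no CUDA driver on this machine
        } catch (Throwable t) {
            throw new IllegalStateException("cuInit call failed", t);
        }
    }
}
```

A result of `0` corresponds to `CUDA_SUCCESS`; any other value is a CUDA driver error code. The confined arena unloads the library when the probe finishes, so a real loader would likely keep a longer-lived arena.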
🏗️ Core Architecture & Roles
- CUDA Kernel Loader (`CudaKernelLoader`): Loads compiled CUDA PTX/SASS kernels and issues host/device commands through FFM rather than JNI.
- GPU Vector Store (`GpuVectorStore`): Allocates page-locked host memory (pinned RAM) and copies vector blocks directly to device memory (VRAM).
- Batch Similarity (`GpuSimilarityKernel`): Executes parallel matrix-multiplication kernels on GPU cores, achieving up to 4× speedups over AVX-512 for batch queries of size \(N \geq 100{,}000\).
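To make the matrix-multiplication framing concrete: if the queries form a matrix \(Q\) (one row per query) and the stored vectors form a matrix \(V\) (one row per vector), all pairwise dot-product scores are the entries of \(Q V^\top\). The plain-Java sketch below computes that product on the CPU as a reference for what a GPU kernel would parallelize. `BatchSimilarityReference` and the flat row-major layout are assumptions made for this example; the actual `GpuSimilarityKernel` API is not shown here.

```java
// Hypothetical CPU reference for illustration only; not the spector-gpu API.
final class BatchSimilarityReference {

    // Returns scores in row-major order: scores[q * numVectors + v]
    // is the dot product of query q and stored vector v.
    static float[] dotProducts(float[] queries, int numQueries,
                               float[] vectors, int numVectors, int dim) {
        if (queries.length != numQueries * dim || vectors.length != numVectors * dim) {
            throw new IllegalArgumentException("array length does not match shape");
        }
        float[] scores = new float[numQueries * numVectors];
        for (int q = 0; q < numQueries; q++) {
            for (int v = 0; v < numVectors; v++) {
                float sum = 0f;
                // On a GPU, each (q, v) pair would typically map to its own thread.
                for (int d = 0; d < dim; d++) {
                    sum += queries[q * dim + d] * vectors[v * dim + d];
                }
                scores[q * numVectors + v] = sum;
            }
        }
        return scores;
    }
}
```

For example, with queries `[1, 0]` and `[0, 1]` and stored vectors `[1, 2]`, `[3, 4]`, `[5, 6]`, the first query scores `1, 3, 5` and the second scores `2, 4, 6`. Every output cell is independent of the others, which is why this workload suits a GPU for large batches.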