๐ Synapse โ Tags & Scoring¶
Package:
com.spectrayan.spector.memory.synapseBiological Analog: In neuroscience, the Synaptic Tagging and Capture (STC) hypothesis (Frey & Morris, 1997) describes how synapses are "tagged" during learning with lightweight chemical markers. These tags don't contain the memory itself โ they identify what the memory is about and when it was formed, enabling the brain to route consolidation activity efficiently.
Versioned Header Layouts¶
Every cognitive memory record begins with a synaptic header โ the digital equivalent of a synaptic tag. The header format is versioned via the HeaderLayout sealed interface, supporting three layout sizes:
classDiagram
class HeaderLayout {
<<sealed interface>>
+headerBytes() int
+version() int
+readHeader(segment, offset) CognitiveHeader
+writeHeader(segment, offset, header)
+forVersion(int) HeaderLayout$
+defaultLayout() HeaderLayout$
}
class HeaderLayoutV1 {
+headerBytes() = 32
+version() = 1
}
class HeaderLayoutV2 {
+headerBytes() = 48
+version() = 2
}
class HeaderLayoutV3 {
+headerBytes() = 64
+version() = 3
}
HeaderLayout <|.. HeaderLayoutV1 : permits
HeaderLayout <|.. HeaderLayoutV2 : permits
HeaderLayout <|.. HeaderLayoutV3 : permits
V1 โ Core Layout (32 bytes)¶
The original layout, still supported for backward compatibility. Contains all fields required for the 6-Phase Scoring Pipeline.
Offset Size Field Description
โโโโโโ โโโโ โโโโโ โโโโโโโโโโโ
0 8B timestamp_ms Unix epoch ms when memory was formed
8 8B synaptic_tags 64-bit Bloom filter of contextual markers
16 4B exact_norm L2 norm of original float vector
20 4B importance Cognitive importance (0.05 โ 10.0)
24 4B recall_count Times recalled (LTP reconsolidation counter)
28 2B centroid_id IVF centroid assignment (max 65,535)
30 1B valence Emotional coloring (signed: -128 to +127)
31 1B flags Bit flags (see below)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Total: 32 bytes (1ร AVX2 register)
Why 32 bytes?
The V1 header is exactly one AVX2 register width (256 bits). The entire header can be loaded in a single SIMD instruction for bulk scanning operations.
V2 โ Extended Layout (48 bytes)¶
Adds arousal and storage strength for emotional modulation and the future Two-Factor Memory Strength model.
Offset Size Field Description
โโโโโโ โโโโ โโโโโ โโโโโโโโโโโ
0 32B [V1 core] All V1 fields (timestamp through flags)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ V2 extension โโโโโโโโโโโโโโโโโโโโโโโ
32 1B arousal Emotional intensity (unsigned: 0-255)
33 3B [padding] Alignment padding
36 4B storage_strength Durability factor S(t) for Two-Factor model
40 8B [reserved] Future use (zeroed)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Total: 48 bytes (1.5ร AVX2 registers)
New fields:
| Field | Type | Range | Purpose |
|---|---|---|---|
arousal |
unsigned byte | 0 (calm) โ 255 (extreme) | Modulates decay curve โ high-arousal memories resist forgetting |
storage_strength |
float | 0.0 โ 5.0 | Two-Factor model durability (default: 1.0). Reserved for Labs |
V3 โ Full Cache-Line Layout (64 bytes) โญ Default¶
The default for all new stores. Extends V2 with a 16-byte future buffer, aligned to a full CPU cache line (64 bytes) for optimal sequential scan performance.
Offset Size Field Description
โโโโโโ โโโโ โโโโโ โโโโโโโโโโโ
0 32B [V1 core] All V1 fields (timestamp through flags)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ V2 extension โโโโโโโโโโโโโโโโโโโโโโโ
32 1B arousal Emotional intensity (unsigned: 0-255)
33 3B [padding] Alignment padding
36 4B storage_strength Durability factor S(t)
40 8B [reserved_1] Future use (zeroed)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ V3 extension โโโโโโโโโโโโโโโโโโโโโโโ
48 16B [reserved_2] Future expansion buffer (zeroed)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Total: 64 bytes (1ร cache line, 2ร AVX2)
Why V3 is the default
Cache-line alignment eliminates split-line reads during sequential scans. When the scorer iterates over 1M records, each header read hits exactly one cache line โ no partial line loads, no false sharing. The 16 bytes of reserved space cost ~1.5% total memory overhead but prevent future migration costs when new fields are added.
Version Comparison¶
| Property | V1 (32B) | V2 (48B) | V3 (64B) |
|---|---|---|---|
| Core fields | โ | โ | โ |
| Arousal | โ (default: 0) | โ | โ |
| Storage strength | โ (default: 1.0) | โ | โ |
| Future buffer | โ | โ | โ (16B) |
| Cache-line aligned | โ | โ | โ |
| Memory per 1M records | 32 MB | 48 MB | 64 MB |
| SIMD reads per header | 1 | 2 | 2 |
Backward Compatibility¶
When a V3 reader encounters a V1 file, the missing fields return safe defaults:
// V1 โ V3 transparent upgrade
CognitiveHeader header = layout.readHeader(segment, offset);
header.arousal(); // โ 0 (neutral โ no arousal effect)
header.storageStrength(); // โ 1.0 (default durability)
No data migration is required for reads. The CognitiveScorer checks headerBytes > 32 to determine whether arousal is available and skips the arousal read on V1 segments.
HeaderMigrator โ One-Time Version Upgrades¶
The HeaderMigrator performs atomic, one-time migration of store files between header versions.
Supported Paths¶
Upgrade (lossless):
V1 (32B) โโโ V2 (48B) โ
New fields filled with defaults
V1 (32B) โโโ V3 (64B) โ
New fields filled with defaults
V2 (48B) โโโ V3 (64B) โ
Existing V2 fields preserved
Downgrade (lossy):
V3 (64B) โโโ V2 (48B) โ ๏ธ Reserved buffer lost
V3 (64B) โโโ V1 (32B) โ ๏ธ Arousal + storage_strength lost
V2 (48B) โโโ V1 (32B) โ ๏ธ Arousal + storage_strength lost
Atomic Migration Process¶
flowchart LR
A["Original Store<br/>store.dat"] --> B["Write to temp<br/>store.dat.migrating"]
B --> C["Verify temp<br/>record count match"]
C --> D["Backup original<br/>store.dat.bak"]
D --> E["Atomic rename<br/>temp โ store.dat"]
C -->|"Verify failed"| F["Delete temp<br/>Abort migration"]
style A fill:#3498db,color:white
style E fill:#27ae60,color:white
style F fill:#e74c3c,color:white
- Write โ Records are read from source, headers expanded/shrunk, written to
store.dat.migrating - Verify โ Record count in temp file must match source exactly
- Backup โ Original file renamed to
store.dat.bak - Rename โ Temp file atomically renamed to
store.dat - Cleanup โ On startup, orphaned
.migratingfiles are detected and deleted
Usage¶
HeaderMigrator migrator = new HeaderMigrator();
// Upgrade V1 store to V3
migrator.migrate(
Path.of("/data/episodic.dat"),
HeaderLayout.forVersion(1), // source layout
HeaderLayout.forVersion(3), // target layout
quantizedVecBytes // vector payload size
);
Flags Bitfield¶
The flags byte at offset 31 encodes per-record state:
Bit Name Description
โโโ โโโโ โโโโโโโโโโโ
0 tombstone Record is logically deleted (pruned by Deep Sleep)
1-2 memory_type 2-bit type: 0=WORKING, 1=EPISODIC, 2=SEMANTIC, 3=PROCEDURAL
3 consolidated Has been reflected into Semantic tier
4 pinned Exempt from decay and pruning (flashbulb memories)
5 resolved Zeigarnik Effect โ resolved tasks return to normal decay
6-7 reserved Future use
Zeigarnik Effect (Bit 5)¶
Unresolved memories (bit 5 = 0) resist time-decay โ their decay bucket is clamped to 0, keeping them perpetually "fresh." This models the psychological phenomenon where incomplete tasks remain more accessible than completed ones.
// In CognitiveScorer Phase 4:
if (!isResolved(flags) && !isPinned(flags)) {
adjustedBucket = 0; // acts like the memory was just formed
}
// Agent marks task complete:
memory.markResolved("task-123"); // bit 5 โ 1, normal decay resumes
SynapticTagEncoder โ The Inline Bloom Filter¶
The synaptic_tags field is a 64-bit inline Bloom filter rather than a discrete bitmap. This enables encoding thousands of unique tag strings across the system while each individual record holds 5-50 tags with negligible false positive rates.
How It Works¶
public static long encode(String... tags) {
long filter = 0L;
for (String tag : tags) {
filter |= encodeTag(tag);
}
return filter;
}
private static long encodeTag(String tag) {
long h = murmurHash64(tag);
long h1 = h;
long h2 = h >>> 32 | h << 32; // Swap halves for second hash
long filter = 0L;
for (int i = 0; i < K; i++) { // K = 3 hash functions
int bitIndex = Math.abs((int) ((h1 + (long) i * h2) % M)); // M = 64
filter |= (1L << bitIndex);
}
return filter;
}
Key properties:
| Property | Value |
|---|---|
| Filter size | 64 bits (fits in a single CPU register) |
| Hash functions | k = 3 (MurmurHash3-inspired double hashing) |
| Bits per tag | 3 |
| Match operation | (record & query) == query (containment check) |
| Cost | 1 CPU cycle (single long read + bitwise AND) |
False Positive Rates¶
| Tags per Record | FPR | Assessment |
|---|---|---|
| 5 tags | 0.03% | Excellent โ 1 false match per 3,000 records |
| 10 tags | 0.2% | Excellent โ 1 false match per 500 records |
| 20 tags | 2.3% | Good โ vector distance rejects false matches |
| 50 tags | 12% | Acceptable โ still useful for coarse gating |
System vs. Record Tags
The system can have thousands of unique tag strings. But any single record should have at most 10-50 tags for the Bloom filter to remain effective. This is a natural fit โ a single memory rarely has more than 5-15 contextual associations.
Tag Overlap Scoring¶
Beyond binary gating, the SynapticTagEncoder also computes a fractional overlap ratio for weighted tag relevance in Phase 6:
public static float overlapRatio(long recordTags, long queryMask) {
if (queryMask == 0) return 0f;
int overlapBits = Long.bitCount(recordTags & queryMask);
int queryBits = Long.bitCount(queryMask);
return (float) overlapBits / queryBits;
}
This ratio is used as a multiplier in the scoring formula: finalScore = baseScore ร (1 + tagOverlap ร tagRelevanceBoost). A record matching 3 of 5 query tags gets a 60% tag boost vs 100% for a full match.
CognitiveRecordLayout โ Binary Format¶
The CognitiveRecordLayout class manages reading/writing headers and quantized vectors to/from off-heap MemorySegment. It delegates header operations to the active HeaderLayout:
public final class CognitiveRecordLayout {
private final HeaderLayout headerLayout;
private final int quantizedVecBytes;
/**
* Record stride = header bytes + vector payload.
* V1: 32 + vecBytes, V2: 48 + vecBytes, V3: 64 + vecBytes.
*/
public int stride() {
return headerLayout.headerBytes() + quantizedVecBytes;
}
/**
* Offset where the quantized vector begins within a record.
*/
public long vectorOffset(long recordOffset) {
return recordOffset + headerLayout.headerBytes();
}
public void writeHeader(MemorySegment segment, long offset, CognitiveHeader header) {
headerLayout.writeHeader(segment, offset, header);
}
public CognitiveHeader readHeader(MemorySegment segment, long offset) {
return headerLayout.readHeader(segment, offset);
}
}
CognitiveHeader Record¶
The header data is represented as a Java record with all fields from all versions:
public record CognitiveHeader(
long timestampMs, // when the memory was formed
long synapticTags, // 64-bit Bloom filter
float exactNorm, // L2 norm of original vector
float importance, // cognitive importance (0.05 โ 10.0)
int recallCount, // LTP reconsolidation counter
short centroidId, // IVF partition routing ID
byte valence, // emotional coloring (-128 to +127)
byte flags, // bit field (tombstone, type, consolidated, pinned, resolved)
byte arousal, // V2+: emotional intensity (unsigned 0-255)
float storageStrength // V2+: Two-Factor durability S(t)
) {
/**
* V1-compatible constructor โ fills V2+ fields with safe defaults.
*/
public CognitiveHeader(long timestampMs, long synapticTags, float exactNorm,
float importance, int recallCount, short centroidId,
byte valence, byte flags) {
this(timestampMs, synapticTags, exactNorm, importance,
recallCount, centroidId, valence, flags,
(byte) 0, // arousal: neutral
1.0f); // storageStrength: default durability
}
}
DecayStrategy โ SIMD-Friendly Temporal Decay¶
The exp() Problem
The naive decay formula Math.exp(-ฮปยทage) costs 50-100ns per call and is a scalar operation โ it cannot be SIMD-vectorized. At 1M memories, this adds 50-100ms of pure overhead, destroying the SIMD advantage.
The Solution: Precomputed Decay Buckets¶
DecayStrategy quantizes time into discrete buckets and uses a precomputed lookup table:
// Precomputed โ zero Math.exp() calls at query time
private static final float[] DECAY_TABLE = {
1.00f, // Bucket 0: 0-1 hours
0.95f, // Bucket 1: 1-6 hours
0.85f, // Bucket 2: 6-24 hours
0.70f, // Bucket 3: 1-3 days
0.50f, // Bucket 4: 3-7 days
0.30f, // Bucket 5: 1-2 weeks
0.15f, // Bucket 6: 2-4 weeks
0.05f, // Bucket 7: 1-3 months
0.01f // Bucket 8+: 3+ months
};
public static float decay(int bucket) {
return DECAY_TABLE[Math.min(bucket, DECAY_TABLE.length - 1)];
}
Reconsolidation Adjustment¶
Every 3 recalls shifts the bucket back by 1, simulating Long-Term Potentiation:
public static int adjustForReconsolidation(int rawBucket, int recallCount) {
return Math.max(0, rawBucket - (recallCount / 3));
}
A memory recalled 12 times is 4 buckets "younger" than its actual age โ it resists forgetting.
Arousal-Modulated Decay¶
Emotionally intense memories resist forgetting. The arousal byte (V2+ headers) modulates the decay curve through a 4-bucket lookup table:
private static final int[] AROUSAL_THRESHOLDS = {64, 128, 192};
private static final float[] AROUSAL_MODIFIERS = {1.0f, 1.15f, 1.35f, 1.65f};
| Arousal Range | Bucket | Modifier | Biological Basis |
|---|---|---|---|
| 0-63 (neutral) | 0 | 1.00ร | Normal forgetting โ routine memories |
| 64-127 (mild) | 1 | 1.15ร | Slightly persistent โ mildly emotional |
| 128-191 (moderate) | 2 | 1.35ร | Noticeably persistent โ significant events |
| 192-255 (extreme) | 3 | 1.65ร | Very hard to forget โ flashbulb memories |
The modifier multiplies the base decay factor, slowing the decay rate. A production outage at arousal=200 decays 1.65ร slower than a routine log entry at arousal=0.
/**
* Computes decay with arousal modulation.
* Higher arousal โ slower decay โ memory persists longer.
*/
public static float computeDecayWithArousal(int bucket, byte arousal) {
float baseFactor = decay(bucket);
float modifier = arousalModifier(arousal);
return Math.min(1.0f, baseFactor * modifier);
}
/**
* Returns the arousal modifier for a given arousal byte (unsigned 0-255).
*/
public static float arousalModifier(byte arousal) {
int unsigned = Byte.toUnsignedInt(arousal);
for (int i = AROUSAL_THRESHOLDS.length - 1; i >= 0; i--) {
if (unsigned >= AROUSAL_THRESHOLDS[i]) return AROUSAL_MODIFIERS[i + 1];
}
return AROUSAL_MODIFIERS[0];
}
Automatic arousal derivation: When arousal is not explicitly set by the LLM, it is auto-derived from valence at ingestion time:
This means both extremely positive (valence=+100) and extremely negative (valence=-100) memories are equally arousing โ matching the psychological finding that emotional intensity, not polarity, drives memory persistence.
Wiring in CognitiveScorer¶
The scorer reads arousal from the header and applies the modifier to both standard and lateral scoring paths:
// In CognitiveScorer, after Phase 4 (temporal/importance pre-screen):
// Read arousal โ only available on V2+ layouts
byte arousal = hasArousal
? segment.get(LAYOUT_AROUSAL, offset + OFFSET_AROUSAL)
: (byte) 0; // V1 fallback: no arousal effect
// Phase 6: Standard scoring
float decay = DecayStrategy.decay(adjustedBucket) * DecayStrategy.arousalModifier(arousal);
decay = Math.min(1.0f, decay);
float baseScore = alpha * similarity + beta * importance * decay;
Next Steps¶
- Dopamine โ Surprise Detection โ auto-importance scoring
- Cortex โ Tier Stores โ the 4-tier architecture
- 6-Phase Scoring Pipeline โ how scoring uses the header
- Labs โ Research Roadmap โ Two-Factor Memory, Dynamic Quantization