Binary Sparse Distributed Codes

BSDC uses sparse binary vectors in which only a small fraction of bits are set to 1. Sparsity gives memory efficiency and biological plausibility while preserving the robustness and near-orthogonality that high-dimensional representations rely on.

Properties

Property       Value
Binding        XOR
Inverse        Self (exact for raw XOR; approximate once sparsity is restored)
Commutative    Yes
Self-inverse   Yes
Space          Sparse binary
Sparsity       p = 1/√d (optimal)

When to Use

  • Memory-constrained systems
  • Very high dimensions (d > 10,000)
  • Biologically plausible models (cortical codes)
  • Sparse data structures available
  • When storage scales with active bits

Theory

Vector Space

BSDC vectors are sparse binary:

\[v_i \in \{0, 1\}, \quad ||v||_0 = k\]

where k is the number of active (1) bits.
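As a concrete illustration, such a vector can be generated by choosing k distinct positions. This is a minimal NumPy sketch, independent of any particular library; the values of `d` and `k` are illustrative:

```python
import numpy as np

d, k = 10_000, 100                           # dimension and number of active bits
rng = np.random.default_rng(0)

v = np.zeros(d, dtype=np.uint8)
v[rng.choice(d, size=k, replace=False)] = 1  # exactly k distinct positions set to 1

print(int(v.sum()))                          # ||v||_0 = k -> 100
```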

Optimal Sparsity

The optimal sparsity maximizes information capacity:

\[p^* = \frac{1}{\sqrt{d}}\]

Dimension   Sparsity   Active Bits
1,000       3.2%       ~32
10,000      1.0%       ~100
100,000     0.32%      ~316
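The table follows directly from the formula: since \(p^* d = \sqrt{d}\), the active-bit count is just the square root of the dimension. A quick check in plain NumPy:

```python
import numpy as np

# p* = 1/sqrt(d), so the expected active-bit count p* * d equals sqrt(d)
for d in (1_000, 10_000, 100_000):
    p = 1 / np.sqrt(d)
    print(f"d={d:>7}  p*={p:.2%}  active bits ~{round(p * d)}")
```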

Binding Operation

Binding uses element-wise XOR, which keeps the result sparse, though roughly twice as dense as the inputs:

\[c = a \oplus b\]

Expected sparsity of result: \(2p(1-p) \approx 2p\) for small p.
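The density claim can be checked empirically. This is a NumPy sketch with illustrative parameters, not the library API:

```python
import numpy as np

d, p = 100_000, 0.01
rng = np.random.default_rng(1)
a = (rng.random(d) < p).astype(np.uint8)   # Bernoulli(p) bits
b = (rng.random(d) < p).astype(np.uint8)

c = a ^ b                                  # binding: c = a XOR b
print(f"{a.mean():.4f}  {b.mean():.4f}  {c.mean():.4f}")
# density of c concentrates around 2p(1-p) = 0.0198, close to 2p
```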

Bundling Operation

Bundling uses majority voting with sparsity preservation:

  1. Sum all vectors element-wise
  2. Select top-k positions with highest counts
  3. k chosen to maintain target sparsity
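The three steps above can be sketched in plain NumPy (the function name and parameters are illustrative, not the library's API):

```python
import numpy as np

def bundle(vectors, k):
    """Sum element-wise, then keep the k positions with the highest counts."""
    counts = np.sum(vectors, axis=0)
    out = np.zeros_like(vectors[0])
    out[np.argsort(counts)[-k:]] = 1       # ties at the cutoff are broken arbitrarily
    return out

d, k = 10_000, 100
rng = np.random.default_rng(2)
xs = []
for _ in range(3):
    x = np.zeros(d, dtype=np.uint8)
    x[rng.choice(d, size=k, replace=False)] = 1
    xs.append(x)

s = bundle(xs, k)
print(int(s.sum()), [int(np.sum(s & x)) for x in xs])  # k ones; each input keeps many bits
```

Positions where inputs collide (count ≥ 2) always survive the cut, which is what makes the bundle similar to each of its members.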

Similarity Measure

Overlap coefficient:

\[\text{sim}(a, b) = \frac{|a \cap b|}{k}\]

where k is the expected number of active bits.
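In NumPy terms (hypothetical helper; `&` is bitwise AND, so the sum counts shared active bits):

```python
import numpy as np

def overlap_sim(a, b, k):
    """Fraction of the k expected active bits shared by a and b."""
    return np.sum(a & b) / k

d, k = 10_000, 100
rng = np.random.default_rng(3)
a = np.zeros(d, dtype=np.uint8); a[rng.choice(d, size=k, replace=False)] = 1
b = np.zeros(d, dtype=np.uint8); b[rng.choice(d, size=k, replace=False)] = 1

print(overlap_sim(a, a, k))    # 1.0 -- a vector is maximally similar to itself
print(overlap_sim(a, b, k))    # near 0 -- random pairs share only ~k^2/d = 1 bit
```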

Capacity Analysis

BSDC has lower bundle capacity than dense codes due to sparse overlap dynamics:

\[n_{max} \approx 0.01 \cdot d\]

Dimension   Max Items
2,048       ~20
4,096       ~40
10,000      ~100

Note

While capacity per dimension is lower, BSDC vectors are far cheaper to store, so much higher dimensions are practical. At d = 100,000 with 1% sparsity (1,000 active bits), a bundle holds on the order of 1,000 items while each vector needs only ~2 KB of index storage.
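The capacity claim can be probed with a small experiment: bundle n items via the top-k rule described earlier, then compare how strongly stored versus novel vectors overlap the bundle. Plain NumPy, illustrative parameters:

```python
import numpy as np

d, k, n = 10_000, 100, 50                  # 50 items, inside the ~0.01*d = 100 limit
rng = np.random.default_rng(4)

def rand_sparse():
    v = np.zeros(d, dtype=np.uint8)
    v[rng.choice(d, size=k, replace=False)] = 1
    return v

items = [rand_sparse() for _ in range(n)]
counts = np.sum(items, axis=0)
memory = np.zeros(d, dtype=np.uint8)
memory[np.argsort(counts)[-k:]] = 1        # top-k bundle of all n items

stored = np.mean([np.sum(memory & v) for v in items])
novel  = np.mean([np.sum(memory & rand_sparse()) for _ in range(n)])
print(f"stored: {stored:.1f} bits overlap, novel: {novel:.1f} (chance ~ k^2/d = 1)")
```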

Memory Efficiency

Dimension    Dense Storage   Sparse Storage   Ratio
10,000       10 KB           200 B            50×
100,000      100 KB          600 B            160×
1,000,000    1 MB            2 KB             500×

Sparse storage holds only the indices of active bits (e.g., 100 × 16-bit indices = 200 bytes); dimensions above 65,536 need wider indices. Dense figures assume one byte per element.

Code Examples

Basic Usage

```python
from holovec import VSA

# Create BSDC model with default optimal sparsity
model = VSA.create('BSDC', dim=10000)

print(f"Sparsity: {model.sparsity:.4f}")  # ~0.01
print(f"Expected ones: {int(model.sparsity * 10000)}")  # ~100

# Generate sparse hypervectors
a = model.random(seed=1)
b = model.random(seed=2)

print(f"Actual ones in a: {a.sum()}")  # ~100
```

Custom Sparsity

```python
# More sparse (less memory, lower capacity)
sparse_model = VSA.create('BSDC', dim=10000, sparsity=0.005)

# Less sparse (more memory, higher capacity)
dense_model = VSA.create('BSDC', dim=10000, sparsity=0.02)
```

Measuring Sparsity

```python
# After binding, sparsity changes slightly
c = model.bind(a, b)
actual_sparsity = model.measure_sparsity(c)
print(f"Result sparsity: {actual_sparsity:.4f}")

# Rehash to restore target sparsity
c_rehashed = model.rehash(c)
print(f"After rehash: {model.measure_sparsity(c_rehashed):.4f}")
```

Sparse Storage

```python
import numpy as np

# Get indices of active bits (sparse representation)
a = model.random(seed=1)
active_indices = np.where(a == 1)[0]
print(f"Storage: {len(active_indices) * 2} bytes")  # 2 bytes per index
```

Biological Plausibility

BSDC mirrors cortical neural codes:

Property    BSDC          Cortex
Sparsity    ~1%           ~1-5%
Encoding    Distributed   Distributed
Binding     XOR-like      Synaptic modulation

Comparison with Similar Models

vs Model    BSDC Advantage           BSDC Disadvantage
BSC         Much less memory         Slightly lower capacity
MAP         Sparse storage           Approximate operations
BSDC-SEG    More flexible sparsity   No segment structure

References

  • Kanerva, P. (1988). Sparse Distributed Memory
  • Rachkovskij, D. A., & Kussul, E. M. (2001). Binding and Normalization of Binary Sparse Distributed Representations by Context-Dependent Thinning
  • Kleyko, D., et al. (2023). A Survey on Hyperdimensional Computing

See Also