SynthDef Parser and Storage Implementation¶
Overview¶
This document describes the SuperCollider SynthDef parsing and storage implementation for audiomancer.
Components Implemented¶
1. SynthDef Parser (src/audiomancer/analyzers/synthdef.py)¶
A robust parser for SuperCollider SynthDef files with:
- Primary parsing: Uses regex-based extraction (sclang subprocess parsing prepared but not fully implemented)
- Fallback mechanism: Graceful degradation when sclang is unavailable
- Security: No shell=True; always sets subprocess timeouts
- Comprehensive extraction:
- SynthDef name
- Control parameters with default values
- UGen usage detection
- Gate and envelope detection
- Output channel count
- Source code preservation
- File hash for deduplication
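The regex-based extraction listed above might look roughly like the following sketch. The patterns, the function names `parse_controls` and `extract_synthdef_fields`, the returned dict layout, and the choice of SHA-256 for the hash are all assumptions made for illustration; the actual code in synthdef.py may differ, and the sample source is not one of the project's fixtures.

```python
import hashlib
import re

# Hypothetical patterns; real SynthDef syntax has more variants than these cover.
NAME_RE = re.compile(r"""SynthDef\(\s*[\\"'](\w+)""")
PIPE_ARGS_RE = re.compile(r"\{\s*\|([^|]*)\|")
ARG_KEYWORD_RE = re.compile(r"\{\s*arg\s+([^;]*);")
UGEN_RE = re.compile(r"\b([A-Z]\w*)\.(?:ar|kr)\b")


def parse_controls(arg_text: str) -> dict[str, float]:
    """Split 'freq=440, amp=0.1' into a name -> default mapping.

    Non-numeric or missing defaults fall back to 0.0 (an assumption);
    array-valued defaults containing commas are not handled.
    """
    controls: dict[str, float] = {}
    for part in arg_text.split(","):
        name, _, default = part.partition("=")
        name = name.strip()
        if not name:
            continue
        try:
            controls[name] = float(default)
        except ValueError:
            controls[name] = 0.0
    return controls


def extract_synthdef_fields(source: str) -> dict:
    """Pull name, controls, UGens, and a content hash out of SynthDef source."""
    name_match = NAME_RE.search(source)
    if name_match is None:
        raise ValueError("no SynthDef declaration found")
    args_match = PIPE_ARGS_RE.search(source) or ARG_KEYWORD_RE.search(source)
    controls = parse_controls(args_match.group(1)) if args_match else {}
    ugens = sorted(set(UGEN_RE.findall(source)))
    return {
        "name": name_match.group(1),
        "controls": controls,
        "ugens_used": ugens,
        "has_gate": "gate" in controls,
        "has_envelope": "EnvGen" in ugens,
        "file_hash": hashlib.sha256(source.encode("utf-8")).hexdigest(),
    }


# Illustrative source only, not the simple_sine.scd fixture.
EXAMPLE = r"""
SynthDef(\demo_sine, { |out=0, freq=440, amp=0.1, gate=1|
    var env = EnvGen.kr(Env.asr(0.01, 1, 0.3), gate, doneAction: 2);
    Out.ar(out, SinOsc.ar(freq) * env * amp);
}).add;
"""

fields = extract_synthdef_fields(EXAMPLE)
print(fields["name"], fields["controls"], fields["ugens_used"])
```

Hashing the source text means two files with identical content map to the same hash, which is what makes the hash usable for deduplication.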
Key Functions¶
Main parsing function with timeout protection.
Intelligent categorization based on UGens and controls:
- bass: Synths with filters (MoogFF, RLPF)
- lead: Pitched synths with envelopes and gate
- pad: Long sustained synths (uses ASR envelopes)
- drum: Percussive synths without gate
- fx: Effect processors, noise generators
Data Structures¶
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SynthControl:
    name: str
    default_value: float
    spec: Optional[str] = None
    description: Optional[str] = None

@dataclass
class SynthDefInfo:
    name: str
    file_path: str
    file_hash: str
    num_channels: int
    has_gate: bool
    has_envelope: bool
    ugens_used: list[str]
    controls: list[SynthControl]
    source_code: str
    category: Optional[str] = None
    tags: list[str] = field(default_factory=list)
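The categorization rules listed under Key Functions could be expressed along these lines. This is a sketch only: the function name `categorize`, the `uses_asr` flag, the extra UGen sets, and above all the order in which the rules are checked are assumptions, since the document does not say how overlapping cases are resolved.

```python
from typing import Optional

FILTER_UGENS = {"MoogFF", "RLPF"}                         # named in the rules
NOISE_UGENS = {"WhiteNoise", "PinkNoise", "BrownNoise"}   # assumed set
INPUT_UGENS = {"In", "SoundIn"}                           # assumed marker of an effect


def categorize(
    ugens: set[str],
    controls: set[str],
    has_gate: bool,
    has_envelope: bool,
    uses_asr: bool = False,
) -> Optional[str]:
    """Return a category name, or None when no rule applies.

    Rule order is a guess: effects first, then filters, then envelope shape.
    """
    is_pitched = "freq" in controls
    if ugens & INPUT_UGENS or (ugens & NOISE_UGENS and not is_pitched):
        return "fx"
    if ugens & FILTER_UGENS:
        return "bass"
    if has_gate and uses_asr:
        return "pad"
    if has_gate and has_envelope and is_pitched:
        return "lead"
    if has_envelope and not has_gate:
        return "drum"
    return None


acid = categorize({"Saw", "MoogFF", "EnvGen", "Out"}, {"freq", "cutoff", "gate"}, True, True)
kick = categorize({"SinOsc", "EnvGen", "Out"}, {"freq", "amp"}, False, True)
print(acid, kick)
```

Because a synth can satisfy several rules at once (a gated, enveloped synth with a MoogFF filter, for instance), any real implementation has to pick a precedence; the one shown here is only one plausible choice.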
2. SynthStore (src/audiomancer/storage/synth_store.py)¶
SQLite-based storage for SynthDef metadata following the same patterns as SampleStore:
- CRUD operations: add, get, update, delete
- Retrieval methods: by ID, name, path, or hash
- Search & filtering: by category, name, has_gate
- Pagination: limit and offset support
- Lineage tracking: parent-child synth relationships
- JSON serialization: for complex fields (controls, characteristics, categorization)
- Atomic operations: proper transaction handling with rollback
Key Methods¶
add(synth: dict) -> str
get(synth_id: str) -> Optional[dict]
get_by_name(name: str) -> Optional[dict]
get_by_path(file_path: str) -> Optional[dict]
get_by_hash(file_hash: str) -> Optional[dict]
update(synth_id: str, updates: dict) -> bool
delete(synth_id: str) -> bool
search(query, category, has_gate, limit, offset) -> list[dict]
count(query, category, has_gate) -> int
add_lineage(synth_id, parent_synth_id, contribution_weight) -> None
get_lineage(synth_id) -> list[dict]
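To make the JSON serialization, rollback, and pagination behaviour concrete, here is a minimal stand-in built on the standard-library `sqlite3` module. It is not the real SynthStore: the document states that the actual implementation uses SQLAlchemy, and the class name `MiniSynthStore`, the trimmed table, and the use of `ValueError` for duplicates are assumptions for this sketch.

```python
import json
import sqlite3
from datetime import datetime, timezone


class MiniSynthStore:
    """Illustrative stand-in for SynthStore; not the real implementation."""

    JSON_FIELDS = ("controls", "characteristics", "categorization")

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.row_factory = sqlite3.Row
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS synths (
                   id TEXT PRIMARY KEY,
                   name TEXT UNIQUE NOT NULL,
                   file_hash TEXT UNIQUE NOT NULL,
                   controls TEXT,
                   characteristics TEXT,
                   categorization TEXT,
                   created_at TEXT NOT NULL
               )"""
        )

    def add(self, synth: dict) -> str:
        # Complex fields are stored as JSON text.
        row = {
            "id": synth["id"],
            "name": synth["name"],
            "file_hash": synth["file_hash"],
            "created_at": datetime.now(timezone.utc).isoformat(),
        }
        for key in self.JSON_FIELDS:
            row[key] = json.dumps(synth.get(key))
        try:
            with self.conn:  # commits on success, rolls back on exception
                self.conn.execute(
                    "INSERT INTO synths (id, name, file_hash, controls,"
                    " characteristics, categorization, created_at)"
                    " VALUES (:id, :name, :file_hash, :controls,"
                    " :characteristics, :categorization, :created_at)",
                    row,
                )
        except sqlite3.IntegrityError as exc:
            raise ValueError(f"duplicate synth: {exc}") from exc
        return row["id"]

    def search(self, query: str = "", limit: int = 50, offset: int = 0) -> list[dict]:
        rows = self.conn.execute(
            "SELECT * FROM synths WHERE name LIKE ? ORDER BY name LIMIT ? OFFSET ?",
            (f"%{query}%", limit, offset),
        ).fetchall()
        return [self._decode(r) for r in rows]

    def _decode(self, row: sqlite3.Row) -> dict:
        record = dict(row)
        for key in self.JSON_FIELDS:
            record[key] = json.loads(record[key])
        return record


store = MiniSynthStore()
store.add({"id": "synth_a", "name": "acid", "file_hash": "h1",
           "controls": [{"name": "freq", "default": 440.0}],
           "categorization": {"category": "bass"}})
store.add({"id": "synth_b", "name": "pad_warm", "file_hash": "h2"})
print([s["name"] for s in store.search(limit=1, offset=1)])
```

Values are always bound as parameters rather than interpolated into the SQL string, which mirrors the parameterized-query approach the real store relies on.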
Database Schema¶
The implementation uses the existing Synth and SynthLineage tables from db.py:
CREATE TABLE synths (
    id TEXT PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    file_path TEXT UNIQUE NOT NULL,
    file_hash TEXT UNIQUE NOT NULL,
    characteristics TEXT, -- JSON
    categorization TEXT, -- JSON
    source_code TEXT NOT NULL,
    controls TEXT, -- JSON array
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
CREATE TABLE synth_lineage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    synth_id TEXT NOT NULL,
    parent_synth_id TEXT NOT NULL,
    contribution_weight REAL DEFAULT 0.5,
    created_at TEXT NOT NULL,
    FOREIGN KEY(synth_id) REFERENCES synths(id) ON DELETE CASCADE,
    FOREIGN KEY(parent_synth_id) REFERENCES synths(id) ON DELETE CASCADE
);
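One detail worth keeping in mind with this schema: SQLite only enforces foreign keys, and therefore ON DELETE CASCADE, when `PRAGMA foreign_keys = ON` has been set on the connection. Whether db.py enables it is not stated here. The sketch below uses the standard-library `sqlite3` module and a trimmed copy of the two tables (an assumption made to keep it short) to show lineage rows disappearing when a parent synth is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is off by default in most SQLite builds.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE synths (
        id TEXT PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE synth_lineage (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        synth_id TEXT NOT NULL,
        parent_synth_id TEXT NOT NULL,
        contribution_weight REAL DEFAULT 0.5,
        FOREIGN KEY(synth_id) REFERENCES synths(id) ON DELETE CASCADE,
        FOREIGN KEY(parent_synth_id) REFERENCES synths(id) ON DELETE CASCADE
    );
    """
)
conn.executemany(
    "INSERT INTO synths (id, name) VALUES (?, ?)",
    [("synth_original", "original"), ("synth_new_variation", "variation")],
)
conn.execute(
    "INSERT INTO synth_lineage (synth_id, parent_synth_id, contribution_weight)"
    " VALUES (?, ?, ?)",
    ("synth_new_variation", "synth_original", 0.8),
)
conn.commit()

# Deleting the parent removes the lineage row that points at it.
conn.execute("DELETE FROM synths WHERE id = ?", ("synth_original",))
conn.commit()
remaining = conn.execute("SELECT COUNT(*) FROM synth_lineage").fetchone()[0]
print(remaining)
```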
Test Coverage¶
Parser Tests (tests/unit/test_synthdef_parser.py)¶
18 tests covering:
- Parsing simple_sine.scd fixture
- Parsing tb303.scd fixture (complex acid bass)
- Error handling (nonexistent file, invalid extension)
- Source code preservation
- Hash consistency
- Regex fallback parser
- Categorization logic (bass, lead, pad, drum, fx)
- Data structure creation
Storage Tests (tests/unit/test_synth_store.py)¶
28 tests covering:
- Add operations with validation
- Duplicate detection (name and hash)
- Retrieval by ID, name, path, hash
- Updates with timestamp tracking
- Deletion
- Search and filtering
- Pagination
- Count operations
- Lineage tracking (single and multiple parents)
All 46 tests pass.
Usage Examples¶
Parsing a SynthDef¶
from pathlib import Path
from audiomancer.analyzers import parse_synthdef
# Parse SynthDef file
info = parse_synthdef(Path("synths/tb303.scd"))
print(f"Name: {info.name}")
print(f"Category: {info.category}")
print(f"Controls: {[c.name for c in info.controls]}")
print(f"UGens: {info.ugens_used}")
print(f"Has gate: {info.has_gate}")
Storing and Retrieving Synths¶
from audiomancer.storage import SynthStore
# Initialize store
store = SynthStore("~/.audiomancer/samples.db")
# Prepare synth metadata
synth = {
    "id": f"synth_{info.file_hash[:8]}",
    "name": info.name,
    "file_path": str(info.file_path),
    "file_hash": info.file_hash,
    "source_code": info.source_code,
    "controls": [
        {"name": c.name, "default": c.default_value}
        for c in info.controls
    ],
    "characteristics": {
        "num_channels": info.num_channels,
        "has_gate": info.has_gate,
        "has_envelope": info.has_envelope,
    },
    "categorization": {
        "category": info.category,
        "tags": info.tags,
    },
}
# Add to database
synth_id = store.add(synth)
# Retrieve
retrieved = store.get_by_name("tb303")
print(retrieved["characteristics"])
# Search
bass_synths = store.search(category="bass", limit=10)
for synth in bass_synths:
    print(f"{synth['name']}: {synth['controls']}")
Tracking Synth Evolution¶
# Track when one synth is derived from another
store.add_lineage(
    synth_id="synth_new_variation",
    parent_synth_id="synth_original",
    contribution_weight=0.8,  # 80% based on parent
)
# Get lineage
parents = store.get_lineage("synth_new_variation")
for parent in parents:
    print(f"Parent: {parent['parent_synth_id']}")
    print(f"Contribution: {parent['contribution_weight']}")
Implementation Notes¶
Security Considerations¶
- No shell injection: Never uses shell=True in subprocess calls
- Timeout protection: All subprocess calls have timeout limits
- Input validation: File extensions and paths validated before processing
- SQL injection prevention: Uses SQLAlchemy ORM with parameterized queries
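The first two points can be illustrated with a short sketch. The helper name `run_tool` and the use of the built-in `TimeoutError` are assumptions; the Python interpreter stands in for sclang so that the example runs without SuperCollider installed.

```python
import subprocess
import sys


def run_tool(argv: list[str], timeout_s: float = 10.0) -> str:
    """Run a command without a shell; each argv item is passed verbatim."""
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it runs longer
            check=True,
        )
    except subprocess.TimeoutExpired as exc:
        raise TimeoutError(f"command exceeded {timeout_s}s") from exc
    return result.stdout


# A file name containing shell metacharacters is just data when no shell is involved.
hostile_name = "tb303.scd; rm -rf ~"
echoed = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name]
)
print(echoed.strip())
```

Passing the command as a list means the file name reaches the child process as a single argument and is never interpreted by a shell.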
Error Handling¶
All operations use structured exceptions from audiomancer.errors:
- SynthDefError: Parsing and validation errors
- SubprocessTimeoutError: sclang timeout
- StorageError: Database operation errors
All errors include details dict for debugging.
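A structured exception carrying a details dict might be shaped like the sketch below. The three subclass names come from the list above, but the base class `AudiomancerError`, the constructor signature, and the `check_extension` helper are hypothetical; the real definitions in audiomancer.errors may look different.

```python
from typing import Any, Optional


class AudiomancerError(Exception):
    """Hypothetical base class; the real hierarchy may differ."""

    def __init__(self, message: str, details: Optional[dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.message = message
        self.details = details or {}


class SynthDefError(AudiomancerError):
    """Parsing and validation errors."""


class SubprocessTimeoutError(AudiomancerError):
    """sclang timeout."""


class StorageError(AudiomancerError):
    """Database operation errors."""


def check_extension(path: str) -> None:
    """Hypothetical validation step that raises with debugging context."""
    if not path.endswith(".scd"):
        raise SynthDefError(
            "unsupported file extension",
            details={"path": path, "expected": ".scd"},
        )


try:
    check_extension("patch.wav")
except SynthDefError as err:
    caught = err
    print(err.message, err.details)
```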
Performance¶
- Regex parsing: Fast, no external dependencies
- Database indexing: Indexes on name, file_path, file_hash
- Batch operations: Not implemented (synths are typically added one at a time)
- JSON serialization: Minimal overhead for complex fields
Future Enhancements¶
- Complete sclang parsing: Implement full SuperCollider subprocess integration for more accurate metadata extraction
- Batch operations: Add add_batch() for importing multiple synths
- Vector embeddings: Generate and store embeddings for similarity search
- Parameter range analysis: Extract min/max ranges from ControlSpecs
- UGen graph extraction: Parse and visualize signal flow
- Audio rendering: Render synth examples for preview
Test Fixtures¶
The implementation includes two test SynthDefs:
- simple_sine.scd: Basic sine wave with gate and envelope
- tb303.scd: Complex acid bass with filters, envelopes, and multiple controls
Both are used to verify parser accuracy and categorization logic.
Integration¶
The new components are fully integrated:
- Exported from audiomancer.analyzers module
- Exported from audiomancer.storage module
- Compatible with existing database schema
- Follows existing code patterns and conventions