# Performance Tuning

Optimize SwissArmyHammer for speed, memory usage, and scalability in different environments.
## Performance Overview

SwissArmyHammer is designed for performance across several dimensions:

- **Startup Time**: Fast initialization for CLI commands
- **Memory Usage**: Efficient memory management for large codebases
- **I/O Performance**: Optimized file system operations
- **Search Speed**: Fast semantic search with vector databases
- **Template Rendering**: Efficient Liquid template processing
- **Concurrent Operations**: Parallel execution where beneficial
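These dimensions are worth baselining on your own machine before tuning anything. As a rough sketch (assuming `sah --help` exits without doing real work), startup cost can be approximated by timing a trivial invocation:

```bash
# Rough startup baseline: a trivial invocation isolates initialization cost
time sah --help >/dev/null
```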
## Benchmarking

### Built-in Benchmarks

SwissArmyHammer includes comprehensive benchmarks:

```bash
# Run all benchmarks
cargo bench

# Run specific benchmark suites
cargo bench search
cargo bench templates
cargo bench workflows

# Compare with a baseline
cargo bench -- --save-baseline main
git checkout feature-branch
cargo bench -- --baseline main
```
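The `--save-baseline`/`--baseline` flags above are Criterion conventions; if that is the harness in use, the third-party `critcmp` tool prints a readable side-by-side diff of two saved baselines:

```bash
# Install the Criterion baseline comparison tool
cargo install critcmp

# Save one baseline per branch, then diff them
cargo bench -- --save-baseline feature
critcmp main feature
```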
### Profiling Tools

#### CPU Profiling

```bash
# Install profiling tools
cargo install cargo-flamegraph

# Profile a specific command
cargo flamegraph --bin sah -- search query "error handling"

# Profile with perf (Linux)
perf record --call-graph=dwarf cargo run --bin sah -- search index "**/*.rs"
perf report
```
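Flamegraphs are only as useful as the symbols in the binary, and release builds omit debug info by default. A generic Cargo tweak (not SwissArmyHammer-specific) keeps symbols in release builds; add it to `Cargo.toml` if it is not already set:

```bash
# Keep debug symbols in release builds so flamegraph stacks resolve to source
cat >> Cargo.toml <<'EOF'

[profile.release]
debug = true
EOF
```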
#### Memory Profiling

```bash
# Install memory profilers (e.g. on Debian/Ubuntu)
sudo apt install valgrind heaptrack

# Profile memory usage with Valgrind's massif
valgrind --tool=massif cargo run --bin sah -- search index "**/*.rs"
ms_print massif.out.12345

# Use heaptrack (Linux)
heaptrack cargo run --bin sah -- search index "**/*.rs"
heaptrack_gui heaptrack.sah.12345.gz
```
## Configuration Tuning

### General Performance Settings

```toml
# ~/.swissarmyhammer/sah.toml

[general]
# Disable auto-reload for better performance
auto_reload = false
# Increase timeout for large operations
default_timeout_ms = 60000

[template]
# Increase cache size for frequently used templates
cache_size = 2000
# Disable template recompilation in production
recompile_templates = false

[workflow]
# Increase parallel action limit for powerful machines
max_parallel_actions = 8
# Enable workflow caching
enable_caching = true
cache_dir = "/tmp/sah-workflow-cache"

[search]
# Use a faster but larger embedding model
embedding_model = "nomic-embed-code"
# Increase memory limits for large indexes
max_memory_mb = 2048
# Optimize the index for read performance
index_compression = false
```
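The same keys can be set from the command line with `sah config set` (used elsewhere on this page), which is handier for scripting than editing `sah.toml` directly:

```bash
# Apply individual settings without editing sah.toml by hand
sah config set general.auto_reload false
sah config set workflow.max_parallel_actions 8
```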
### Memory Optimization

```toml
[security]
# Reduce memory limits for resource-constrained environments
max_memory_mb = 256
max_disk_usage_mb = 1024

[search]
# Limit file size for indexing
max_file_size = 524288  # 512KB
# Reduce embedding dimensions for a smaller memory footprint
embedding_dimensions = 384  # vs. 768 default

[template]
# Smaller template cache
cache_size = 100
# Aggressive cache eviction
cache_ttl_ms = 300000  # 5 minutes
```
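To confirm the limits actually lower peak memory, compare resident set size before and after. On Linux, GNU time reports it directly (macOS users can use `/usr/bin/time -l` instead):

```bash
# Report peak resident set size for an indexing run (GNU time on Linux)
/usr/bin/time -v sah search index "**/*.rs" 2>&1 | grep "Maximum resident set size"
```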
### I/O Optimization

```toml
[general]
# Use faster file watching (when available)
file_watcher = "polling"  # or "native"
# Batch file operations
batch_size = 100

[search]
# Use a faster storage backend
storage_backend = "memory"  # for small indexes
# storage_backend = "disk"  # for large indexes
# Enable compression for large indexes
enable_compression = true
# Use a faster hash function
hash_algorithm = "xxhash"
```
## Search Performance

### Indexing Optimization

#### Selective Indexing

```bash
# Index only important directories
sah search index "src/**/*.{rs,py,js}" --exclude "**/target/**"

# Avoid large generated files
sah search index "**/*.rs" \
  --exclude "**/target/**" \
  --exclude "**/node_modules/**" \
  --exclude "**/*.generated.*"

# Set file size limits
sah search index "**/*.rs" --max-size 1048576  # 1MB limit
```
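Exclude patterns are easy to get wrong. Before committing to a long indexing run, a quick sanity check with `find` (approximating the globs above) shows how many files would be touched:

```bash
# Approximate the globs with find to see how many files would be indexed
find . -name '*.rs' -not -path '*/target/*' -not -path '*/node_modules/*' | wc -l
```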
#### Parallel Indexing

```toml
[search]
# Enable parallel file processing
parallel_indexing = true
indexing_threads = 4
# Batch processing for better throughput
batch_size = 50
# Use memory mapping for large files
use_mmap = true
```
#### Incremental Indexing

```bash
# Only index changed files (much faster)
sah search index "**/*.rs"  # skips unchanged files automatically

# Force a full reindex only when needed
sah search index "**/*.rs" --force
```
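The difference is easy to measure: force a full rebuild, then rerun incrementally over the same tree and compare wall-clock times:

```bash
time sah search index "**/*.rs" --force   # full rebuild
time sah search index "**/*.rs"           # incremental run over the same tree
```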
### Query Optimization

#### Efficient Queries

```bash
# Use specific, focused queries
sah search query "async function error handling" --limit 5

# Adjust the similarity threshold for faster results
sah search query "database connection" --threshold 0.7

# Use exact matches when possible
sah search query "fn main()" --threshold 0.9
```
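The right threshold is workload-dependent. A small sweep over candidate values makes the latency/precision trade-off visible on your own index:

```bash
# Sweep thresholds to see the latency/precision trade-off
for t in 0.5 0.7 0.9; do
  echo "threshold=$t"
  time sah search query "database connection" --threshold "$t" --limit 5 >/dev/null
done
```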
#### Query Caching

```toml
[search]
# Enable query result caching
cache_results = true
result_cache_size = 1000
result_cache_ttl_ms = 300000  # 5 minutes

# Cache embeddings for repeated queries
cache_embeddings = true
embedding_cache_size = 10000
```
## Template Performance

### Template Optimization

#### Efficient Template Design

```liquid
{% comment %}Good: Filter once, use multiple times{% endcomment %}
{% assign active_users = users | where: "active", true %}
Active users: {{active_users | size}}
Names: {{active_users | map: "name" | join: ", "}}

{% comment %}Avoid: Repeated filtering{% endcomment %}
Active users: {{users | where: "active", true | size}}
Names: {{users | where: "active", true | map: "name" | join: ", "}}
```
#### Loop Optimization

```liquid
{% comment %}Good: Early termination{% endcomment %}
{% for item in items limit:10 %}
  {% if item.important %}
    {{item.name}}
    {% break %}
  {% endif %}
{% endfor %}

{% comment %}Good: Batch operations{% endcomment %}
{% assign important_items = items | where: "important", true %}
{% for item in important_items limit:10 %}
  {{item.name}}
{% endfor %}
```
### Template Caching

```toml
[template]
# Aggressive caching for production
cache_size = 5000
cache_compiled_templates = true

# Pre-compile frequently used templates
precompile_templates = [
  "code-review",
  "documentation",
  "test-generator"
]
```
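If your deployment keeps a long-running SwissArmyHammer process, it may also pay to warm the cache at startup rather than on the first real request. A hypothetical warm-up loop (assuming `sah prompt test`, shown later on this page, renders a template end to end):

```bash
# Hypothetical warm-up: render each precompiled template once at deploy time
for tpl in code-review documentation test-generator; do
  sah prompt test "$tpl" >/dev/null 2>&1 || true
done
```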
### Variable Management

```liquid
{% comment %}Cache expensive computations{% endcomment %}
{% assign file_count = files | size %}
{% if file_count > 0 %}
Processing {{file_count}} files...
{% for file in files %}
  File: {{file.name}} ({{forloop.index}}/{{file_count}})
{% endfor %}
{% endif %}
```
## Workflow Performance

### Parallel Execution

```toml
[workflow]
# Optimize for CPU cores
max_parallel_actions = 8
# Enable fork-join optimization
optimize_forks = true
# Use async execution where possible
prefer_async = true
```
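Rather than hard-coding `max_parallel_actions`, it can be derived from the machine at provision time; `nproc` reports the core count on Linux (`sysctl -n hw.ncpu` on macOS):

```bash
# Match the parallel action limit to the core count
sah config set workflow.max_parallel_actions "$(nproc)"
```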
### Action Optimization

#### Shell Actions

```markdown
**Actions:**

# Good: Combine related commands
- shell: `cargo build && cargo test --lib` (timeout: 300s)

# Avoid: Separate slow commands
- shell: `cargo build` (timeout: 120s)
- shell: `cargo test --lib` (timeout: 180s)
```
#### Prompt Actions

```markdown
**Actions:**

# Good: Batch similar prompts
- prompt: multi-analyzer files="$(find . -name '*.rs' | head -10)" analysis_type="comprehensive"

# Avoid: Individual file analysis
- prompt: code-reviewer file="src/main.rs"
- prompt: code-reviewer file="src/lib.rs"
```
### State Machine Optimization

```markdown
# Good: Minimize state transitions
### build-and-test

**Actions:**
- shell: `cargo build --release`
- shell: `cargo test --release`

**Transitions:**
- On success → deploy
- On failure → failed

# Avoid: Too many small states
### build

**Actions:**
- shell: `cargo build --release`

**Transitions:**
- Always → test

### test

**Actions:**
- shell: `cargo test --release`

**Transitions:**
- On success → deploy
```
## System-Level Optimization

### File System Performance

#### SSD Optimization

```bash
# Use an SSD for the search database
mkdir -p /mnt/ssd/sah-cache
sah config set search.index_path "/mnt/ssd/sah-cache/search.db"

# Use tmpfs for temporary operations
mkdir -p /tmp/sah-temp
sah config set workflow.temp_dir "/tmp/sah-temp"
```
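If you are unsure whether a mount is actually backed by an SSD, Linux exposes the rotational flag per device:

```bash
# ROTA = 0 means the device is non-rotational (SSD/NVMe)
lsblk -d -o NAME,ROTA
```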
#### Network File Systems

```toml
[general]
# Reduce file watching on network filesystems
auto_reload = false
# Use a local cache
local_cache_dir = "/tmp/sah-cache"

[search]
# Cache the index locally
local_index_cache = true
cache_dir = "/tmp/sah-search-cache"
```
### Memory Management

#### Large Scale Operations

```bash
# For large codebases, use streaming operations
export SAH_STREAMING_MODE=true
export SAH_MAX_MEMORY=4G

# Process in batches
sah search index "**/*.rs" --batch-size 100

# Use disk-based sorting for large datasets
export SAH_USE_DISK_SORT=true
```
#### Memory-Constrained Environments

```toml
[search]
# Use a smaller embedding model
embedding_model = "all-MiniLM-L6-v2"  # 384 dimensions vs. 768
# Reduce cache sizes
embedding_cache_size = 1000
result_cache_size = 100
# Enable aggressive garbage collection
gc_threshold = 1000
```
### CPU Optimization

#### Multi-core Systems

```toml
[general]
# Use all available cores
worker_threads = 0  # auto-detect

[search]
# Parallel indexing
indexing_threads = 8
search_threads = 4

[workflow]
# Parallel action execution
max_parallel_actions = 16
```
#### Single-core Systems

```toml
[general]
# Minimize threading overhead
worker_threads = 1

[search]
# Sequential processing
indexing_threads = 1
search_threads = 1

[workflow]
# Sequential execution
max_parallel_actions = 1
```
## Monitoring and Profiling

### Runtime Metrics

```bash
# Enable detailed timing
export SAH_ENABLE_TIMING=true
export SAH_LOG_LEVEL=debug

# Monitor with built-in metrics
sah doctor --check performance

# Profile specific operations
time sah search query "error handling"
time sah prompt test code-reviewer --var file=src/main.rs
```
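Single `time` runs are noisy. The third-party `hyperfine` tool adds warm-up runs and repetitions, reporting mean and variance, which makes small regressions much easier to spot:

```bash
cargo install hyperfine

# Benchmark with warm-up runs and statistical output
hyperfine --warmup 3 'sah search query "error handling"'
```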
### Performance Monitoring

```toml
[logging]
# Enable performance logging
enable_timing = true
log_slow_operations = true
slow_operation_threshold_ms = 1000

[metrics]
# Export metrics for monitoring
enable_metrics = true
metrics_port = 9090
metrics_endpoint = "/metrics"
```
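With the metrics endpoint enabled as above, any Prometheus-compatible scraper can collect the data; a quick manual spot check:

```bash
# Spot-check the exported metrics endpoint configured above
curl -s http://localhost:9090/metrics | head -n 20
```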
### Continuous Performance Testing

Add performance tests to CI; for example:

```bash
#!/bin/bash
# performance-test.sh

# Index performance
time_start=$(date +%s%N)
sah search index "**/*.rs" --force >/dev/null 2>&1
time_end=$(date +%s%N)
index_time=$(( (time_end - time_start) / 1000000 ))
echo "Index time: ${index_time}ms"

# Query performance
time_start=$(date +%s%N)
sah search query "async function" >/dev/null 2>&1
time_end=$(date +%s%N)
query_time=$(( (time_end - time_start) / 1000000 ))
echo "Query time: ${query_time}ms"

# Fail on performance regression
if [ "$index_time" -gt 30000 ]; then
    echo "Index performance regression!"
    exit 1
fi

if [ "$query_time" -gt 1000 ]; then
    echo "Query performance regression!"
    exit 1
fi
```
## Performance Troubleshooting

### Common Issues

#### Slow Startup

```bash
# Check file system performance
time ls -la ~/.swissarmyhammer/

# Disable auto-reload
sah config set general.auto_reload false

# Clear caches
rm -rf ~/.swissarmyhammer/cache/
```
#### High Memory Usage

```bash
# Monitor memory usage
ps aux | grep sah
pmap $(pidof sah)

# Reduce cache sizes
sah config set template.cache_size 100
sah config set search.embedding_cache_size 1000

# Enable streaming mode
export SAH_STREAMING_MODE=true
```
#### Slow Search Performance

```bash
# Check the index size
ls -lh ~/.swissarmyhammer/search.db

# Rebuild the index with optimizations
sah search index "**/*.rs" --force --optimize

# Use a smaller embedding model
sah config set search.embedding_model "all-MiniLM-L6-v2"
```
Applied together, these techniques let SwissArmyHammer be tuned for a wide range of environments and use cases, from resource-constrained development machines to high-performance CI/CD servers.