# Performance Tuning

Optimize SwissArmyHammer for speed, memory usage, and scalability in different environments.
## Performance Overview

SwissArmyHammer is designed for performance across several dimensions:

- **Startup Time**: Fast initialization for CLI commands
- **Memory Usage**: Efficient memory management for large codebases
- **I/O Performance**: Optimized file system operations
- **Search Speed**: Fast semantic search with vector databases
- **Template Rendering**: Efficient Liquid template processing
- **Concurrent Operations**: Parallel execution where beneficial
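These dimensions are worth baselining on your own machine before tuning anything. As a rough sketch (assuming `sah --help` exits without doing real work), startup cost can be approximated by timing a trivial invocation:

```bash
# Rough startup baseline: a trivial invocation isolates initialization cost
time sah --help >/dev/null
```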
## Benchmarking

### Built-in Benchmarks

SwissArmyHammer includes comprehensive benchmarks:

```bash
# Run all benchmarks
cargo bench

# Run specific benchmark suites
cargo bench search
cargo bench templates
cargo bench workflows

# Compare with a baseline
cargo bench -- --save-baseline main
git checkout feature-branch
cargo bench -- --baseline main
```
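The `--save-baseline`/`--baseline` flags above are Criterion conventions; if that is the harness in use, the third-party `critcmp` tool prints a readable side-by-side diff of two saved baselines:

```bash
# Install the Criterion baseline comparison tool
cargo install critcmp

# Save one baseline per branch, then diff them
cargo bench -- --save-baseline feature
critcmp main feature
```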
### Profiling Tools

#### CPU Profiling

```bash
# Install profiling tools
cargo install cargo-flamegraph

# Profile a specific command
cargo flamegraph --bin sah -- search query "error handling"

# Profile with perf (Linux)
perf record --call-graph=dwarf cargo run --bin sah -- search index "**/*.rs"
perf report
```
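Flamegraphs are only as useful as the symbols in the binary, and release builds omit debug info by default. A generic Cargo tweak (not SwissArmyHammer-specific) keeps symbols in release builds; add it to `Cargo.toml` if it is not already set:

```bash
# Keep debug symbols in release builds so flamegraph stacks resolve to source
cat >> Cargo.toml <<'EOF'

[profile.release]
debug = true
EOF
```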
#### Memory Profiling

```bash
# Install memory profilers (e.g. on Debian/Ubuntu)
sudo apt install valgrind heaptrack

# Profile memory usage with Valgrind's massif
valgrind --tool=massif cargo run --bin sah -- search index "**/*.rs"
ms_print massif.out.12345

# Use heaptrack (Linux)
heaptrack cargo run --bin sah -- search index "**/*.rs"
heaptrack_gui heaptrack.sah.12345.gz
```
## Configuration Tuning

### General Performance Settings

```toml
# ~/.swissarmyhammer/sah.toml

[general]
# Disable auto-reload for better performance
auto_reload = false
# Increase timeout for large operations
default_timeout_ms = 60000

[template]
# Increase cache size for frequently used templates
cache_size = 2000
# Disable template recompilation in production
recompile_templates = false

[workflow]
# Increase parallel action limit for powerful machines
max_parallel_actions = 8
# Enable workflow caching
enable_caching = true
cache_dir = "/tmp/sah-workflow-cache"

[search]
# Use a faster but larger embedding model
embedding_model = "nomic-embed-code"
# Increase memory limits for large indexes
max_memory_mb = 2048
# Optimize the index for read performance
index_compression = false
```
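The same keys can be set from the command line with `sah config set` (used elsewhere on this page), which is handier for scripting than editing `sah.toml` directly:

```bash
# Apply individual settings without editing sah.toml by hand
sah config set general.auto_reload false
sah config set workflow.max_parallel_actions 8
```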
### Memory Optimization

```toml
[security]
# Reduce memory limits for resource-constrained environments
max_memory_mb = 256
max_disk_usage_mb = 1024

[search]
# Limit file size for indexing
max_file_size = 524288  # 512KB
# Reduce embedding dimensions for a smaller memory footprint
embedding_dimensions = 384  # vs. 768 default

[template]
# Smaller template cache
cache_size = 100
# Aggressive cache eviction
cache_ttl_ms = 300000  # 5 minutes
```
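To confirm the limits actually lower peak memory, compare resident set size before and after. On Linux, GNU time reports it directly (macOS users can use `/usr/bin/time -l` instead):

```bash
# Report peak resident set size for an indexing run (GNU time on Linux)
/usr/bin/time -v sah search index "**/*.rs" 2>&1 | grep "Maximum resident set size"
```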
### I/O Optimization

```toml
[general]
# Use faster file watching (when available)
file_watcher = "polling"  # or "native"
# Batch file operations
batch_size = 100

[search]
# Use a faster storage backend
storage_backend = "memory"  # for small indexes
# storage_backend = "disk"  # for large indexes
# Enable compression for large indexes
enable_compression = true
# Use a faster hash function
hash_algorithm = "xxhash"
```
## Search Performance

### Indexing Optimization

#### Selective Indexing

```bash
# Index only important directories
sah search index "src/**/*.{rs,py,js}" --exclude "**/target/**"

# Avoid large generated files
sah search index "**/*.rs" \
  --exclude "**/target/**" \
  --exclude "**/node_modules/**" \
  --exclude "**/*.generated.*"

# Set file size limits
sah search index "**/*.rs" --max-size 1048576  # 1MB limit
```
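Exclude patterns are easy to get wrong. Before committing to a long indexing run, a quick sanity check with `find` (approximating the globs above) shows how many files would be touched:

```bash
# Approximate the globs with find to see how many files would be indexed
find . -name '*.rs' -not -path '*/target/*' -not -path '*/node_modules/*' | wc -l
```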
#### Parallel Indexing

```toml
[search]
# Enable parallel file processing
parallel_indexing = true
indexing_threads = 4
# Batch processing for better throughput
batch_size = 50
# Use memory mapping for large files
use_mmap = true
```
#### Incremental Indexing

```bash
# Only index changed files (much faster)
sah search index "**/*.rs"  # skips unchanged files automatically

# Force a full reindex only when needed
sah search index "**/*.rs" --force
```
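The difference is easy to measure: force a full rebuild, then rerun incrementally over the same tree and compare wall-clock times:

```bash
time sah search index "**/*.rs" --force   # full rebuild
time sah search index "**/*.rs"           # incremental run over the same tree
```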
### Query Optimization

#### Efficient Queries

```bash
# Use specific, focused queries
sah search query "async function error handling" --limit 5

# Adjust the similarity threshold for faster results
sah search query "database connection" --threshold 0.7

# Use exact matches when possible
sah search query "fn main()" --threshold 0.9
```
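The right threshold is workload-dependent. A small sweep over candidate values makes the latency/precision trade-off visible on your own index:

```bash
# Sweep thresholds to see the latency/precision trade-off
for t in 0.5 0.7 0.9; do
  echo "threshold=$t"
  time sah search query "database connection" --threshold "$t" --limit 5 >/dev/null
done
```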
#### Query Caching

```toml
[search]
# Enable query result caching
cache_results = true
result_cache_size = 1000
result_cache_ttl_ms = 300000  # 5 minutes

# Cache embeddings for repeated queries
cache_embeddings = true
embedding_cache_size = 10000
```
## Template Performance

### Template Optimization

#### Efficient Template Design

```liquid
{% comment %}Good: Filter once, use multiple times{% endcomment %}
{% assign active_users = users | where: "active", true %}
Active users: {{active_users | size}}
Names: {{active_users | map: "name" | join: ", "}}

{% comment %}Avoid: Repeated filtering{% endcomment %}
Active users: {{users | where: "active", true | size}}
Names: {{users | where: "active", true | map: "name" | join: ", "}}
```
#### Loop Optimization

```liquid
{% comment %}Good: Early termination{% endcomment %}
{% for item in items limit:10 %}
  {% if item.important %}
    {{item.name}}
    {% break %}
  {% endif %}
{% endfor %}

{% comment %}Good: Batch operations{% endcomment %}
{% assign important_items = items | where: "important", true %}
{% for item in important_items limit:10 %}
  {{item.name}}
{% endfor %}
```
### Template Caching

```toml
[template]
# Aggressive caching for production
cache_size = 5000
cache_compiled_templates = true

# Pre-compile frequently used templates
precompile_templates = [
  "code-review",
  "documentation",
  "test-generator"
]
```
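If your deployment keeps a long-running SwissArmyHammer process, it may also pay to warm the cache at startup rather than on the first real request. A hypothetical warm-up loop (assuming `sah prompt test`, shown later on this page, renders a template end to end):

```bash
# Hypothetical warm-up: render each precompiled template once at deploy time
for tpl in code-review documentation test-generator; do
  sah prompt test "$tpl" >/dev/null 2>&1 || true
done
```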
### Variable Management

```liquid
{% comment %}Cache expensive computations{% endcomment %}
{% assign file_count = files | size %}
{% if file_count > 0 %}
Processing {{file_count}} files...
{% for file in files %}
  File: {{file.name}} ({{forloop.index}}/{{file_count}})
{% endfor %}
{% endif %}
```
## Workflow Performance

### Parallel Execution

```toml
[workflow]
# Optimize for CPU cores
max_parallel_actions = 8
# Enable fork-join optimization
optimize_forks = true
# Use async execution where possible
prefer_async = true
```
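Rather than hard-coding `max_parallel_actions`, it can be derived from the machine at provision time; `nproc` reports the core count on Linux (`sysctl -n hw.ncpu` on macOS):

```bash
# Match the parallel action limit to the core count
sah config set workflow.max_parallel_actions "$(nproc)"
```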
### Action Optimization

#### Shell Actions

```markdown
**Actions:**

# Good: Combine related commands
- shell: `cargo build && cargo test --lib` (timeout: 300s)

# Avoid: Separate slow commands
- shell: `cargo build` (timeout: 120s)
- shell: `cargo test --lib` (timeout: 180s)
```
#### Prompt Actions

```markdown
**Actions:**

# Good: Batch similar prompts
- prompt: multi-analyzer files="$(find . -name '*.rs' | head -10)" analysis_type="comprehensive"

# Avoid: Individual file analysis
- prompt: code-reviewer file="src/main.rs"
- prompt: code-reviewer file="src/lib.rs"
```
### State Machine Optimization

```markdown
# Good: Minimize state transitions
### build-and-test

**Actions:**
- shell: `cargo build --release`
- shell: `cargo test --release`

**Transitions:**
- On success → deploy
- On failure → failed

# Avoid: Too many small states
### build

**Actions:**
- shell: `cargo build --release`

**Transitions:**
- Always → test

### test

**Actions:**
- shell: `cargo test --release`

**Transitions:**
- On success → deploy
```
## System-Level Optimization

### File System Performance

#### SSD Optimization

```bash
# Use an SSD for the search database
mkdir -p /mnt/ssd/sah-cache
sah config set search.index_path "/mnt/ssd/sah-cache/search.db"

# Use tmpfs for temporary operations
mkdir -p /tmp/sah-temp
sah config set workflow.temp_dir "/tmp/sah-temp"
```
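If you are unsure whether a mount is actually backed by an SSD, Linux exposes the rotational flag per device:

```bash
# ROTA = 0 means the device is non-rotational (SSD/NVMe)
lsblk -d -o NAME,ROTA
```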
#### Network File Systems

```toml
[general]
# Reduce file watching on network filesystems
auto_reload = false
# Use a local cache
local_cache_dir = "/tmp/sah-cache"

[search]
# Cache the index locally
local_index_cache = true
cache_dir = "/tmp/sah-search-cache"
```
### Memory Management

#### Large Scale Operations

```bash
# For large codebases, use streaming operations
export SAH_STREAMING_MODE=true
export SAH_MAX_MEMORY=4G

# Process in batches
sah search index "**/*.rs" --batch-size 100

# Use disk-based sorting for large datasets
export SAH_USE_DISK_SORT=true
```
#### Memory-Constrained Environments

```toml
[search]
# Use a smaller embedding model
embedding_model = "all-MiniLM-L6-v2"  # 384 dimensions vs. 768
# Reduce cache sizes
embedding_cache_size = 1000
result_cache_size = 100
# Enable aggressive garbage collection
gc_threshold = 1000
```
### CPU Optimization

#### Multi-core Systems

```toml
[general]
# Use all available cores
worker_threads = 0  # auto-detect

[search]
# Parallel indexing
indexing_threads = 8
search_threads = 4

[workflow]
# Parallel action execution
max_parallel_actions = 16
```
#### Single-core Systems

```toml
[general]
# Minimize threading overhead
worker_threads = 1

[search]
# Sequential processing
indexing_threads = 1
search_threads = 1

[workflow]
# Sequential execution
max_parallel_actions = 1
```
## Monitoring and Profiling

### Runtime Metrics

```bash
# Enable detailed timing
export SAH_ENABLE_TIMING=true
export SAH_LOG_LEVEL=debug

# Monitor with built-in metrics
sah doctor --check performance

# Profile specific operations
time sah search query "error handling"
time sah prompt test code-reviewer --var file=src/main.rs
```
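Single `time` runs are noisy. The third-party `hyperfine` tool adds warm-up runs and repetitions, reporting mean and variance, which makes small regressions much easier to spot:

```bash
cargo install hyperfine

# Benchmark with warm-up runs and statistical output
hyperfine --warmup 3 'sah search query "error handling"'
```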
### Performance Monitoring

```toml
[logging]
# Enable performance logging
enable_timing = true
log_slow_operations = true
slow_operation_threshold_ms = 1000

[metrics]
# Export metrics for monitoring
enable_metrics = true
metrics_port = 9090
metrics_endpoint = "/metrics"
```
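With the metrics endpoint enabled as above, any Prometheus-compatible scraper can collect the data; a quick manual spot check:

```bash
# Spot-check the exported metrics endpoint configured above
curl -s http://localhost:9090/metrics | head -n 20
```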
### Continuous Performance Testing

Add performance tests to CI; for example:

```bash
#!/bin/bash
# performance-test.sh

# Index performance
time_start=$(date +%s%N)
sah search index "**/*.rs" --force >/dev/null 2>&1
time_end=$(date +%s%N)
index_time=$(( (time_end - time_start) / 1000000 ))
echo "Index time: ${index_time}ms"

# Query performance
time_start=$(date +%s%N)
sah search query "async function" >/dev/null 2>&1
time_end=$(date +%s%N)
query_time=$(( (time_end - time_start) / 1000000 ))
echo "Query time: ${query_time}ms"

# Fail on performance regression
if [ "$index_time" -gt 30000 ]; then
    echo "Index performance regression!"
    exit 1
fi

if [ "$query_time" -gt 1000 ]; then
    echo "Query performance regression!"
    exit 1
fi
```
## Performance Troubleshooting

### Common Issues

#### Slow Startup

```bash
# Check file system performance
time ls -la ~/.swissarmyhammer/

# Disable auto-reload
sah config set general.auto_reload false

# Clear caches
rm -rf ~/.swissarmyhammer/cache/
```
#### High Memory Usage

```bash
# Monitor memory usage
ps aux | grep sah
pmap $(pidof sah)

# Reduce cache sizes
sah config set template.cache_size 100
sah config set search.embedding_cache_size 1000

# Enable streaming mode
export SAH_STREAMING_MODE=true
```
#### Slow Search Performance

```bash
# Check the index size
ls -lh ~/.swissarmyhammer/search.db

# Rebuild the index with optimizations
sah search index "**/*.rs" --force --optimize

# Use a smaller embedding model
sah config set search.embedding_model "all-MiniLM-L6-v2"
```
Applied together, these techniques let SwissArmyHammer be tuned for a wide range of environments and use cases, from resource-constrained development machines to high-performance CI/CD servers.