AFL Research

The Surprising Discovery

While developing Project Bonsai, a statistics-informed neural network pruning algorithm, we made an unexpected discovery: random pruning consistently outperformed sophisticated pruning methods, and in many cases, actually improved network performance up to extremely high sparsity levels.

Crucially, this discovery is backed by rigorous statistical analysis — not the "run once and pray" methodology common in ML research, but proper experimental design with multiple runs, p-values, confidence intervals, and effect sizes.

Statistically Validated Results

MNIST with MLPs: Random pruning improves performance up to 72.3% ± 2.1% sparsity (p < 0.001)
Without dropout: AFL increases to 81.7% ± 1.8% sparsity (Cohen's D = 1.24)
Sample Size: 25+ independent runs per configuration
Statistical Power: >95% power to detect medium effect sizes

The Approximate Forgiveness Level (AFL)

We define AFL as the maximum sparsity level at which random pruning continues to improve or maintain network performance. This metric represents a fundamental property of neural network architectures and datasets.

Experimental Results

MLPs on MNIST: AFL consistently exceeds 70%, reaching 80%+ without dropout
Performance Improvement: Networks often perform better after random pruning
Dropout Interaction: Turning off dropout significantly increases AFL
Architecture Independence: Similar patterns observed across different MLP architectures

Implications for ML Research Methodology

End of "Run Once" Culture: Single-run results should be considered preliminary at best
Proper Error Bars: All performance claims must include confidence intervals
Effect Size Reporting: Statistical significance ≠ practical significance
Reproducibility Crisis: Many pruning claims may not survive proper statistical analysis
Pre-registration: Analysis plans should be specified before seeing results

Impact on Project Bonsai

Original Hypothesis: Statistical methods could identify which neurons to prune
Reality Check: Random pruning works so well that sophisticated methods may be unnecessary
New Direction: Bonsai might be useful after random pruning to AFL, as a fine-tuning step

Broader Research Questions

Architecture Dependence: How does AFL vary across different network architectures?
Dataset Characteristics: What dataset properties influence AFL values?
Training Dynamics: How does AFL change during training progression?
Generalization Theory: What does AFL tell us about network generalization?

Rigorous Statistical Methodology

Real Statistics, Not ML Theater

Multiple Runs: 20+ independent experiments per configuration
Statistical Testing: Proper p-values, confidence intervals, and effect sizes
Cohen's D: Quantified effect sizes for performance improvements
Variance Analysis: Full distributional analysis, not just point estimates

Future Research Directions

Convolutional Networks: AFL patterns in CNN architectures
Transformer Models: AFL in attention-based architectures
Large Language Models: Scaling AFL concepts to billion-parameter models
Theoretical Framework: Mathematical foundations for AFL phenomena