Research & Publications

Advancing statistical methodology through human-AI collaboration

Research Philosophy

Our research is built on the principle that transparency about human-AI collaboration strengthens rather than weakens scientific credibility. We openly acknowledge AI assistance in our research process while maintaining rigorous statistical standards.

Core Principles

Transparency: Open about methodologies and AI collaboration
Rigor: Maintaining statistical standards while embracing innovation
Reproducibility: All methods designed for validation and extension
Impact: Focus on problems that matter to real research

Current Research Areas

Missing Data Mechanism Detection

Our flagship research addresses the critical gap in distinguishing between MAR (Missing at Random) and MNAR (Missing Not at Random) mechanisms. Traditional statistical practice often glosses over this distinction, leading to potentially biased analyses.

  • Neural network architectures for missingness pattern recognition
  • Transformer-based attention mechanisms for missing data analysis
  • Quantified uncertainty in missingness mechanism assessment
  • Applications in pharmaceutical, financial, and insurance domains
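To make the MAR/MNAR distinction concrete, here is a minimal simulation sketch (not the Project Lacuna methodology itself; all variable names are illustrative). Under MAR, the probability that an outcome is missing depends only on an observed covariate; under MNAR, it depends on the unobserved value itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)           # fully observed covariate
y = 2 * x + rng.normal(size=n)   # outcome subject to missingness (true mean 0)

# MAR: missingness probability depends only on the observed covariate x
p_mar = 1 / (1 + np.exp(-x))
y_mar = np.where(rng.random(n) < p_mar, np.nan, y)

# MNAR: missingness probability depends on the unobserved value y itself
p_mnar = 1 / (1 + np.exp(-y))
y_mnar = np.where(rng.random(n) < p_mnar, np.nan, y)

# Both complete-case means are biased downward, but under MAR the bias is
# explainable by x, while under MNAR it cannot be recovered from observed
# data alone -- which is why detecting the mechanism matters.
print(np.nanmean(y_mar), np.nanmean(y_mnar))
```

Both observed-data means come out well below the true mean of zero; the practical difference is that only the MAR bias can be corrected using the observed covariate.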

Optimizer Performance Analysis

Systematic evaluation of machine learning optimizers against known global minima, using toy problems whose exact solutions are known.

  • Statistical rigor in optimizer comparison (50+ runs, confidence intervals)
  • Performance gaps between claimed and actual optimization results
  • Loss landscape characteristics and optimizer suitability
  • Evidence-based optimizer selection guidelines
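The evaluation protocol above can be sketched in miniature: run an optimizer many times from random starting points on a test function with a known global minimum, then report the mean optimality gap with a confidence interval. This is a hedged illustration using plain gradient descent (with gradient clipping for stability) on the Rosenbrock function, not the project's actual benchmark suite:

```python
import numpy as np

def rosenbrock(p):
    x, y = p
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(p):
    x, y = p
    return np.array([
        -2 * (1 - x) - 400 * x * (y - x ** 2),
        200 * (y - x ** 2),
    ])

def gradient_descent(p0, lr=1e-3, steps=5000, max_norm=10.0):
    """Plain GD with gradient-norm clipping to avoid divergence far from the valley."""
    p = p0.copy()
    for _ in range(steps):
        g = rosenbrock_grad(p)
        g = g * min(1.0, max_norm / np.linalg.norm(g))
        p -= lr * g
    return p

rng = np.random.default_rng(42)
runs = 50                                 # 50+ runs, as the methodology prescribes
gaps = []
for _ in range(runs):
    p0 = rng.uniform(-2, 2, size=2)
    p = gradient_descent(p0)
    gaps.append(rosenbrock(p))            # global minimum value is exactly 0 at (1, 1)

gaps = np.array(gaps)
mean = gaps.mean()
ci = 1.96 * gaps.std(ddof=1) / np.sqrt(runs)
print(f"mean gap to global optimum: {mean:.4f} +/- {ci:.4f} (95% CI, n={runs})")
```

Because the global minimum is known exactly, the reported gap is the optimizer's true shortfall rather than a relative comparison between methods.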

Methodological Contributions

Computational Statistics

Leveraging modern GPU architectures and cloud computing for statistical methodology:

  • GPU-native statistical libraries (PyMVNMLE, PyRegression)
  • Scalable implementations of classical statistical methods
  • Modern interfaces for traditional statistical workflows

Human-AI Collaboration Frameworks

Developing best practices for transparent AI-assisted research:

  • Methodological transparency in AI-assisted research
  • Validation frameworks for AI-generated hypotheses
  • Quality control in human-AI collaborative workflows

Upcoming Publications

"Approximate Forgiveness Level: Random Pruning Outperforms Sophisticated Methods"

Target Venue: International Conference on Learning Representations (ICLR) or NeurIPS

Status: Manuscript in preparation

Summary: Presents the surprising finding that random pruning matches or outperforms sophisticated pruning methods at up to 70-80% sparsity, challenging fundamental assumptions in the pruning literature. Demonstrates proper statistical methodology (25+ runs per configuration, confidence intervals, and effect sizes), setting a new standard for ML experimental rigor.
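For readers unfamiliar with the technique, unstructured random pruning is straightforward: zero out a random fraction of a layer's weights. The sketch below is a framework-agnostic NumPy illustration under assumed names (`random_prune` is not from the paper):

```python
import numpy as np

def random_prune(weights, sparsity, rng):
    """Zero out a random `sparsity` fraction of weights (unstructured pruning)."""
    mask = rng.random(weights.shape) >= sparsity   # True = keep the weight
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))   # stand-in for one layer's weight matrix
pruned, mask = random_prune(w, sparsity=0.7, rng=rng)

achieved = 1 - mask.mean()
print(f"target sparsity 0.70, achieved {achieved:.3f}")
```

The paper's contribution is not the mechanism, which is deliberately trivial, but the statistically rigorous comparison of this baseline against more elaborate pruning criteria.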

"Neural Networks for Missing Data Mechanism Detection"

Target Journal: Journal of the American Statistical Association (JASA)

Status: Manuscript in preparation

Summary: Presents the Project Lacuna methodology, validation results, and applications across multiple domains. Demonstrates significant improvements over the traditional Little's MCAR test and related approaches.

"Quantifying Optimizer Performance: A Blacklight Analysis"

Target Conference: International Conference on Machine Learning (ICML)

Status: Research phase

Summary: Systematic evaluation of popular optimizers against known global minima, revealing true performance gaps and providing evidence-based selection guidelines.

Open Research

We believe in advancing the field through open methodologies and reproducible research. Our work is designed to complement and enhance traditional statistical practice:

  • Open Source Libraries: PyMVNMLE, PyRegression, and related tools
  • Reproducible Methods: All research includes complete implementation details
  • Community Impact: Tools designed for immediate practical application
  • Educational Resources: Documentation and tutorials for new methodologies