Biotech & pharmaceutical statistics built on PyStatistics
Domain-specific methods for the drug development pipeline. Free forever.
PyStatsBio provides domain-specific statistical methods for biotech and pharmaceutical research. It's built on top of PyStatistics, inheriting its validation rigor and GPU acceleration capabilities.
The library covers four key areas of the drug development pipeline: clinical trial planning (power analysis), preclinical pharmacology (dose-response), biomarker evaluation (diagnostic accuracy), and pharmacokinetics (NCA).
PyStatistics is the general-purpose statistical engine: regression, ANOVA, mixed models, and hypothesis testing. PyStatsBio adds the domain-specific methods that biostatisticians and pharmacologists need. Install PyStatsBio and you get both.
Coming soon to PyPI. For now, install from source:
pip install git+https://github.com/sgcx-org/pystatsbio.git
This automatically installs PyStatistics as a dependency. For GPU support:
pip install "pystatsbio[gpu] @ git+https://github.com/sgcx-org/pystatsbio.git"
from pystatsbio.power import power_t_test, power_logrank
# How many subjects for a two-sample t-test?
result = power_t_test(d=0.5, alpha=0.05, power=0.8)
print(f"Required n per group: {result.n}")
# Sample size for a survival study (log-rank test)
result = power_logrank(hr=0.7, alpha=0.05, power=0.8, p_event=0.6)
print(f"Required total N: {result.n}")
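For intuition about what these two calls compute, the sketch below uses the standard textbook normal approximations: n = 2((z_{1-alpha/2} + z_{power}) / d)^2 per group for a two-sided two-sample comparison, and Schoenfeld's formula for the number of events in a log-rank comparison with 1:1 allocation. The helpers `approx_n_per_group` and `approx_logrank_total_n` are hypothetical names used only for illustration. They are not part of PyStatsBio, and the text above does not say which method the library uses internally, so its results may differ slightly (an exact noncentral-t calculation typically asks for about one more subject per group).

```python
import math
from statistics import NormalDist


def approx_n_per_group(d, alpha=0.05, power=0.8):
    """Normal-approximation n per group for a two-sided two-sample t-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def approx_logrank_total_n(hr, alpha=0.05, power=0.8, p_event=0.6):
    """Schoenfeld approximation: required events, scaled up by the event probability."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Required number of events for equal allocation between arms
    events = 4 * (z_alpha + z_beta) ** 2 / math.log(hr) ** 2
    # Only a fraction p_event of subjects is expected to have an event
    return math.ceil(events / p_event)


print(approx_n_per_group(d=0.5))
print(approx_logrank_total_n(hr=0.7))
```

With the inputs from the examples above, these approximations give 63 subjects per group and a total N of 412. They are a rough sanity check, not a replacement for the library's calculation.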
from pystatsbio.doseresponse import fit_drm, ec50
import numpy as np
dose = np.array([0.01, 0.1, 1, 10, 100, 1000])
response = np.array([5, 10, 30, 70, 90, 95])
# Fit a 4-parameter logistic curve (self-starting, no manual guesses)
result = fit_drm(dose, response, model='LL.4')
print(result.summary())
# Extract EC50 with confidence interval
ec = ec50(result, conf_level=0.95)
print(f"EC50: {ec.estimate} ({ec.ci_lower}, {ec.ci_upper})")
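As a reminder of what the four-parameter log-logistic model describes, here is a minimal sketch of the curve in the parameterization the R package drc uses for `LL.4`: slope `b`, lower limit `c`, upper limit `d`, and `e` as the EC50. The function `ll4` is a hypothetical helper for illustration, not part of PyStatsBio, and the sketch assumes the library follows the drc convention, which the text above does not confirm.

```python
import math


def ll4(dose, b, c, d, e):
    """Four-parameter log-logistic curve: c + (d - c) / (1 + exp(b * (ln(dose) - ln(e))))."""
    return c + (d - c) / (1.0 + math.exp(b * (math.log(dose) - math.log(e))))


# Illustrative parameters only (not fitted values). With b < 0 the curve
# rises from the lower limit c to the upper limit d as dose increases.
b, c, d, e = -1.0, 5.0, 95.0, 3.0
for dose in (0.01, 0.1, 1, 10, 100, 1000):
    print(dose, round(ll4(dose, b, c, d, e), 1))
```

At `dose == e` the response sits exactly halfway between `c` and `d`, which is why `e` is reported as the EC50.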
from pystatsbio.diagnostic import roc, diagnostic_accuracy
import numpy as np
response = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])
# Simulated marker values, seeded so the example is reproducible
predictor = np.random.default_rng(0).standard_normal(10)
# ROC curve with DeLong confidence intervals
result = roc(response, predictor)
print(f"AUC: {result.auc} ({result.auc_ci_lower}, {result.auc_ci_upper})")
# Sensitivity, specificity, PPV, NPV at a cutoff
dx = diagnostic_accuracy(response, predictor, cutoff=0.5)
print(dx.sensitivity, dx.specificity)
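To make the quantities concrete, the sketch below computes the AUC as a rank statistic (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one) and the sensitivity and specificity at a cutoff. The helpers `auc_rank` and `sens_spec` are hypothetical and not part of PyStatsBio. The sketch assumes that higher predictor values indicate positive cases and that values at or above the cutoff are called positive; the library's own conventions are not stated above. The marker values are made up.

```python
def auc_rank(response, predictor):
    """AUC as P(positive scores above negative), counting ties as one half."""
    pos = [p for r, p in zip(response, predictor) if r == 1]
    neg = [p for r, p in zip(response, predictor) if r == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(response, predictor, cutoff):
    """Sensitivity and specificity when predictor >= cutoff is called positive."""
    pairs = list(zip(response, predictor))
    tp = sum(r == 1 and p >= cutoff for r, p in pairs)
    fn = sum(r == 1 and p < cutoff for r, p in pairs)
    tn = sum(r == 0 and p < cutoff for r, p in pairs)
    fp = sum(r == 0 and p >= cutoff for r, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up marker values for illustration
response = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
predictor = [2.1, 1.4, 0.3, -0.5, 0.6, -1.2, 0.9, 0.1, 1.8, -0.2]

print(auc_rank(response, predictor))
print(sens_spec(response, predictor, cutoff=0.5))
```

Here 24 of the 25 positive-negative pairs are ordered correctly, so the AUC is 0.96, and the cutoff of 0.5 classifies four of five cases correctly in each group.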
Each module targets a specific phase of biotech research and validates against established R packages.
Sample size & power calculations. t-tests, proportions, log-rank, ANOVA, non-inferiority, equivalence, crossover, cluster trials.
Validates against: pwr, TrialSize, gsDesign, PowerTOST
4PL/5PL curve fitting, EC50/IC50, relative potency, benchmark dose. GPU-accelerated batch fitting for HTS.
Validates against: drc, nplr, BMDS
ROC analysis with DeLong CIs, sensitivity/specificity, optimal cutoffs, batch AUC for biomarker panels.
Validates against: pROC, OptimalCutpoints, epiR
Non-compartmental pharmacokinetic analysis. AUC (linear-up/log-down), Cmax, half-life, clearance.
Validates against: PKNCA, NonCompart
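The linear-up/log-down rule mentioned above can be sketched as follows: use a linear trapezoid where the concentration rises or stays flat, and a logarithmic trapezoid where it falls. The helper `auc_lin_up_log_down` is a hypothetical illustration, not PyStatsBio's implementation. How the library handles zero or below-limit concentrations and extrapolation to infinity is not described here, so this sketch simply falls back to the linear rule when the later concentration is zero. The concentration-time profile is made up.

```python
import math


def auc_lin_up_log_down(times, conc):
    """AUC from first to last sample using the linear-up/log-down trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        c1, c2 = conc[i - 1], conc[i]
        if 0 < c2 < c1:
            # Declining segment: assumes mono-exponential decay between samples
            total += (c1 - c2) * dt / math.log(c1 / c2)
        else:
            # Rising, flat, or zero-concentration segment: linear trapezoid
            total += 0.5 * (c1 + c2) * dt
    return total


# Made-up profile for illustration: time in hours, concentration in ng/mL
times = [0, 1, 2, 4, 8]
conc = [0.0, 10.0, 8.0, 4.0, 1.0]
print(round(auc_lin_up_log_down(times, conc), 3))
```

The log trapezoid gives a smaller area than the linear one on declining segments, which avoids overestimating exposure during the elimination phase.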