Diagnostic Accuracy¶
ROC analysis with DeLong confidence intervals, sensitivity/specificity with exact binomial CIs, optimal cutoff selection, and GPU-accelerated batch AUC for biomarker panels.
Validates against R packages: pROC, OptimalCutpoints, epiR.
- class pystatsbio.diagnostic.ROCResult(thresholds, tpr, fpr, auc, auc_se, auc_ci_lower, auc_ci_upper, conf_level, n_positive, n_negative, direction)[source]¶
Bases: `object`
Result of ROC analysis.
- Parameters:
- thresholds¶
Thresholds at which TPR/FPR are evaluated. Includes `-inf` and `+inf` so the curve always passes through (0, 0) and (1, 1).
- Type:
array
- tpr¶
True positive rate (sensitivity) at each threshold.
- Type:
array
- fpr¶
False positive rate (1 − specificity) at each threshold.
- Type:
array
- auc_ci_lower, auc_ci_upper¶
Confidence interval for AUC (logit-transformed DeLong).
- Type:
float
- n_positive, n_negative¶
Number of positive (case) and negative (control) observations.
- Type:
int
- class pystatsbio.diagnostic.DiagnosticResult(cutoff, sensitivity, sensitivity_ci, specificity, specificity_ci, ppv, npv, lr_positive, lr_negative, dor, dor_ci, prevalence, conf_level, method)[source]¶
Bases: `object`
Result of diagnostic accuracy evaluation at a fixed cutoff.
All CIs use the method specified in `method` (e.g. `'clopper-pearson'` for exact binomial CIs).
- Parameters:
- class pystatsbio.diagnostic.ROCTestResult(statistic, p_value, auc1, auc2, auc_diff, method)[source]¶
Bases: `object`
Result of comparing two correlated ROC curves (DeLong test).
- Parameters:
- class pystatsbio.diagnostic.CutoffResult(cutoff, sensitivity, specificity, method, criterion_value)[source]¶
Bases: `object`
Result of optimal cutoff selection.
- Parameters:
- class pystatsbio.diagnostic.BatchAUCResult(auc, se, n_markers)[source]¶
Bases: `object`
Result of batch AUC computation across multiple biomarkers.
- Parameters:
- pystatsbio.diagnostic.roc(response, predictor, *, direction='auto', conf_level=0.95)[source]¶
Compute empirical ROC curve with DeLong AUC confidence interval.
- Parameters:
response (array of int) – Binary outcome (0/1).
predictor (array of float) – Continuous predictor (biomarker value).
direction (str) – `'<'` (controls < cases, higher predictor → positive), `'>'` (controls > cases, lower predictor → positive), or `'auto'` (choose the direction giving AUC ≥ 0.5).
conf_level (float) – Confidence level for AUC CI.
- Returns:
ROCResult
- Return type:
ROCResult
Validates against: R `pROC::roc()`, `pROC::ci.auc()`.
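The empirical AUC and the DeLong standard error that roc() reports can be sketched in plain NumPy. This is an illustrative implementation of the technique, not pystatsbio's own code, and the name `empirical_auc_delong` is hypothetical:

```python
import numpy as np

def empirical_auc_delong(response, predictor):
    """Empirical AUC with DeLong standard error (illustrative sketch)."""
    response = np.asarray(response)
    x = np.asarray(predictor, dtype=float)
    cases = x[response == 1]     # positives
    controls = x[response == 0]  # negatives
    m, n = len(cases), len(controls)
    # Pairwise kernel: 1 if case > control, 0.5 on ties, 0 otherwise.
    psi = (cases[:, None] > controls[None, :]).astype(float)
    psi += 0.5 * (cases[:, None] == controls[None, :])
    auc = psi.mean()
    # DeLong structural components: per-case and per-control means.
    v10 = psi.mean(axis=1)  # one value per case
    v01 = psi.mean(axis=0)  # one value per control
    var = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    return auc, np.sqrt(var)
```

pystatsbio additionally reports a logit-transformed CI (see auc_ci_lower / auc_ci_upper), which keeps the interval inside [0, 1].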
- pystatsbio.diagnostic.roc_test(roc1, roc2, *, predictor1=None, predictor2=None, response=None, method='delong')[source]¶
Compare two correlated ROC curves using DeLong’s test.
The two ROC curves must be computed on the same subjects (same response vector). The original predictor values and shared response are required to compute the paired DeLong covariance.
- Parameters:
roc1 (ROCResult) – First ROC curve, computed on the same subjects as roc2.
roc2 (ROCResult) – Second ROC curve, computed on the same subjects as roc1.
predictor1 (array of float) – Original predictor values for the first marker.
predictor2 (array of float) – Original predictor values for the second marker.
response (array of int) – Shared binary outcome.
method (str) – `'delong'` (the only supported method).
- Returns:
ROCTestResult
- Return type:
ROCTestResult
Validates against: R `pROC::roc.test()`.
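The paired DeLong z-test behind roc_test() can be sketched as follows (an illustrative NumPy/SciPy version, not the library's implementation; `delong_paired_test` is a hypothetical name). The covariance between the two markers' per-case and per-control components is what accounts for the correlation between the curves:

```python
import numpy as np
from scipy.stats import norm

def delong_paired_test(response, pred1, pred2):
    """Paired DeLong z-test for two correlated AUCs (illustrative sketch)."""
    response = np.asarray(response)

    def components(x):
        cases, controls = x[response == 1], x[response == 0]
        psi = (cases[:, None] > controls[None, :]).astype(float)
        psi += 0.5 * (cases[:, None] == controls[None, :])
        return psi.mean(), psi.mean(axis=1), psi.mean(axis=0)

    auc1, v10_1, v01_1 = components(np.asarray(pred1, dtype=float))
    auc2, v10_2, v01_2 = components(np.asarray(pred2, dtype=float))
    m, n = len(v10_1), len(v01_1)
    # 2x2 covariance matrices of the structural components.
    s10 = np.cov(v10_1, v10_2)  # over cases
    s01 = np.cov(v01_1, v01_2)  # over controls
    var = ((s10[0, 0] + s10[1, 1] - 2 * s10[0, 1]) / m
           + (s01[0, 0] + s01[1, 1] - 2 * s01[0, 1]) / n)
    z = (auc1 - auc2) / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))  # two-sided p-value
```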
- pystatsbio.diagnostic.diagnostic_accuracy(response, predictor, *, cutoff, direction='<', prevalence=None, conf_level=0.95, ci_method='clopper-pearson')[source]¶
Compute diagnostic accuracy metrics at a fixed cutoff.
- Parameters:
response (array of int) – Binary outcome (0/1).
predictor (array of float) – Continuous predictor.
cutoff (float) – Classification threshold.
direction (str) – `'<'` means predictor ≥ cutoff is classified positive (controls < cases, higher values = disease); `'>'` means predictor ≤ cutoff is classified positive (controls > cases, lower values = disease).
prevalence (float or None) – Disease prevalence for PPV/NPV adjustment via Bayes’ theorem. If None, uses sample prevalence.
conf_level (float) – Confidence level.
ci_method (str) – `'clopper-pearson'` (exact) or `'wilson'`.
- Returns:
DiagnosticResult
- Return type:
DiagnosticResult
Validates against: R `epiR::epi.tests()`.
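A minimal sketch of the fixed-cutoff metrics with exact Clopper-Pearson intervals, assuming direction='<' (predictor ≥ cutoff classified positive); the helper names are hypothetical, not the library's API:

```python
import numpy as np
from scipy.stats import beta

def clopper_pearson(k, n, conf_level=0.95):
    """Exact binomial CI for a proportion k/n."""
    alpha = 1 - conf_level
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

def accuracy_at_cutoff(response, predictor, cutoff):
    """Sensitivity/specificity at a fixed cutoff (direction '<')."""
    response = np.asarray(response)
    pos = np.asarray(predictor, dtype=float) >= cutoff
    tp = np.sum(pos & (response == 1)); fn = np.sum(~pos & (response == 1))
    tn = np.sum(~pos & (response == 0)); fp = np.sum(pos & (response == 0))
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    return sens, clopper_pearson(tp, tp + fn), spec, clopper_pearson(tn, tn + fp)
```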
- pystatsbio.diagnostic.optimal_cutoff(roc_result, *, method='youden', cost_fp=1.0, cost_fn=1.0, prevalence=None)[source]¶
Find optimal classification cutoff from an ROC curve.
- Parameters:
roc_result (ROCResult) – A computed ROC curve.
method (str) – `'youden'` — maximize sensitivity + specificity − 1; `'closest_topleft'` — minimize distance to (FPR=0, TPR=1); `'cost'` — minimize weighted misclassification cost.
cost_fp (float) – Cost of a false positive (for method='cost').
cost_fn (float) – Cost of a false negative (for method='cost').
prevalence (float or None) – Disease prevalence (for method='cost'). Uses sample prevalence n_positive / (n_positive + n_negative) if None.
- Returns:
CutoffResult
- Return type:
CutoffResult
Validates against: R `OptimalCutpoints::optimal.cutpoints()`.
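The Youden criterion can be sketched directly from the data (illustrative only; `youden_cutoff` is a hypothetical name, and the library's optimal_cutoff operates on a precomputed ROCResult instead):

```python
import numpy as np

def youden_cutoff(response, predictor):
    """Cutoff maximizing Youden's J = sens + spec - 1 (sketch, direction '<')."""
    response = np.asarray(response)
    x = np.asarray(predictor, dtype=float)
    best = (None, -np.inf)
    for t in np.unique(x):          # candidate cutoffs: observed values
        pos = x >= t                # predictor >= cutoff is positive
        sens = np.mean(pos[response == 1])
        spec = np.mean(~pos[response == 0])
        j = sens + spec - 1
        if j > best[1]:
            best = (t, j)
    return best                     # (cutoff, criterion value)
```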
- pystatsbio.diagnostic.batch_auc(response, predictors, *, backend='auto')[source]¶
Compute AUC for many biomarker candidates simultaneously.
- Parameters:
response (array of int, shape
(n_samples,)) – Shared binary outcome (0/1).predictors (array of float, shape
(n_samples, n_markers)) – Matrix of biomarker values (one column per candidate marker).backend (str) –
'cpu','gpu', or'auto'.
- Return type:
Notes
The GPU backend is beneficial when n_markers > 100. Uses a rank-based AUC computation that is embarrassingly parallel across markers. DeLong standard errors are computed for each marker.
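The rank-based formula mentioned in the notes can be sketched with SciPy midranks, which handle ties the same way the pairwise definition does (illustrative, not the library's CPU/GPU code; `batch_auc_ranks` is a hypothetical name):

```python
import numpy as np
from scipy.stats import rankdata

def batch_auc_ranks(response, predictors):
    """Rank-based AUC for each column of `predictors` (illustrative sketch)."""
    response = np.asarray(response)
    X = np.asarray(predictors, dtype=float)    # (n_samples, n_markers)
    m = int(np.sum(response == 1))             # cases
    n = int(np.sum(response == 0))             # controls
    ranks = rankdata(X, axis=0)                # midranks, per marker column
    r_pos = ranks[response == 1].sum(axis=0)   # rank sum of cases per marker
    # Mann-Whitney U statistic normalized by the number of case/control pairs.
    return (r_pos - m * (m + 1) / 2) / (m * n)
```

Because each column is ranked independently, the computation parallelizes trivially across markers, which is what makes the GPU backend pay off for large panels.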