Diagnostic Accuracy

Diagnostic accuracy analysis for biomarker evaluation: ROC analysis with DeLong confidence intervals, sensitivity/specificity with exact binomial CIs, predictive values, likelihood ratios, optimal cutoff selection, and GPU-accelerated batch AUC computation for biomarker panel screening.

Validates against the R packages pROC, OptimalCutpoints, and epiR.

class pystatsbio.diagnostic.ROCResult(thresholds, tpr, fpr, auc, auc_se, auc_ci_lower, auc_ci_upper, conf_level, n_positive, n_negative, direction)[source]

Bases: object

Result of ROC analysis.

Parameters:
  • thresholds (array) – Thresholds at which TPR/FPR are evaluated. Includes -inf and +inf so the curve always passes through (0, 0) and (1, 1).

  • tpr (array) – True positive rate (sensitivity) at each threshold.

  • fpr (array) – False positive rate (1 − specificity) at each threshold.

  • auc (float) – Area under the ROC curve, equal to the Mann-Whitney U statistic divided by n1 * n0.

  • auc_se (float) – DeLong standard error of the AUC.

  • auc_ci_lower, auc_ci_upper (float) – Confidence interval for the AUC (logit-transformed DeLong).

  • conf_level (float) – Confidence level used for the CI.

  • n_positive, n_negative (int) – Number of positive (case) and negative (control) observations.

  • direction (str) – '<' (controls < cases) or '>' (controls > cases).

thresholds: ndarray[tuple[Any, ...], dtype[floating]]
tpr: ndarray[tuple[Any, ...], dtype[floating]]
fpr: ndarray[tuple[Any, ...], dtype[floating]]
auc: float
auc_se: float
auc_ci_lower: float
auc_ci_upper: float
conf_level: float
n_positive: int
n_negative: int
direction: str
summary()[source]

Human-readable summary.

Return type:

str
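
The Mann-Whitney identity behind auc can be checked independently of the library. A minimal sketch using only numpy and scipy (data and seed are illustrative):

import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
cases = rng.normal(1.0, 1.0, size=60)     # predictor values for cases
controls = rng.normal(0.0, 1.0, size=90)  # predictor values for controls

# U statistic of cases vs. controls; empirical AUC = U / (n1 * n0)
u = mannwhitneyu(cases, controls).statistic
print(f"AUC = {u / (len(cases) * len(controls)):.3f}")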

class pystatsbio.diagnostic.DiagnosticResult(cutoff, sensitivity, sensitivity_ci, specificity, specificity_ci, ppv, npv, lr_positive, lr_negative, dor, dor_ci, prevalence, conf_level, method)[source]

Bases: object

Result of diagnostic accuracy evaluation at a fixed cutoff.

All CIs use the method specified by the method attribute (e.g. 'clopper-pearson' for exact binomial CIs).

Parameters:
cutoff: float
sensitivity: float
sensitivity_ci: tuple[float, float]
specificity: float
specificity_ci: tuple[float, float]
ppv: float
npv: float
lr_positive: float
lr_negative: float
dor: float
dor_ci: tuple[float, float]
prevalence: float
conf_level: float
method: str
summary()[source]

Human-readable summary.

Return type:

str
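
For reference, the exact ('clopper-pearson') CIs reported for sensitivity and specificity can be reproduced from beta quantiles. A minimal sketch assuming scipy; the helper name clopper_pearson is illustrative, not part of the package:

from scipy.stats import beta

def clopper_pearson(k, n, conf_level=0.95):
    # Exact binomial CI for k successes out of n trials.
    alpha = 1.0 - conf_level
    lower = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

# e.g. sensitivity CI for 42 true positives among 50 cases
print(clopper_pearson(42, 50))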

class pystatsbio.diagnostic.ROCTestResult(statistic, p_value, auc1, auc2, auc_diff, method)[source]

Bases: object

Result of comparing two correlated ROC curves (DeLong test).

Parameters:
statistic: float
p_value: float
auc1: float
auc2: float
auc_diff: float
method: str
summary()[source]
Return type:

str

class pystatsbio.diagnostic.CutoffResult(cutoff, sensitivity, specificity, method, criterion_value)[source]

Bases: object

Result of optimal cutoff selection.

Parameters:
cutoff: float
sensitivity: float
specificity: float
method: str
criterion_value: float

class pystatsbio.diagnostic.BatchAUCResult(auc, se, n_markers)[source]

Bases: object

Result of batch AUC computation across multiple biomarkers.

Parameters:
auc: ndarray[tuple[Any, ...], dtype[floating]]
se: ndarray[tuple[Any, ...], dtype[floating]]
n_markers: int
pystatsbio.diagnostic.roc(response, predictor, *, direction='auto', conf_level=0.95)[source]

Compute empirical ROC curve with DeLong AUC confidence interval.

Parameters:
  • response (array of int) – Binary outcome (0/1).

  • predictor (array of float) – Continuous predictor (biomarker value).

  • direction (str) – '<' (controls < cases, higher predictor → positive), '>' (controls > cases, lower predictor → positive), or 'auto' (choose direction giving AUC ≥ 0.5).

  • conf_level (float) – Confidence level for AUC CI.

Returns:

ROCResult. Validates against R pROC::roc() and pROC::ci.auc().

Return type:

ROCResult
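
A minimal usage sketch with synthetic data (effect size and seed are illustrative):

import numpy as np
from pystatsbio.diagnostic import roc

rng = np.random.default_rng(1)
response = rng.integers(0, 2, size=200)                      # binary outcome (0/1)
predictor = rng.normal(0.0, 1.0, size=200) + 0.8 * response  # higher in cases

result = roc(response, predictor, direction='auto', conf_level=0.95)
print(result.summary())
print(result.auc, (result.auc_ci_lower, result.auc_ci_upper))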

pystatsbio.diagnostic.roc_test(roc1, roc2, *, predictor1=None, predictor2=None, response=None, method='delong')[source]

Compare two correlated ROC curves using DeLong’s test.

The two ROC curves must be computed on the same subjects (same response vector). The original predictor values and shared response are required to compute the paired DeLong covariance.

Parameters:
  • roc1 (ROCResult) – First ROC curve, computed on the same subjects as roc2.

  • roc2 (ROCResult) – Second ROC curve, computed on the same subjects as roc1.

  • predictor1 (array of float) – Original predictor values for the first marker.

  • predictor2 (array of float) – Original predictor values for the second marker.

  • response (array of int) – Shared binary outcome.

  • method (str) – 'delong' (only supported method).

Returns:

ROCTestResult. Validates against R pROC::roc.test().

Return type:

ROCTestResult
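
A usage sketch comparing two markers measured on the same subjects (synthetic data; marker names are illustrative):

import numpy as np
from pystatsbio.diagnostic import roc, roc_test

rng = np.random.default_rng(2)
response = rng.integers(0, 2, size=150)
marker_a = rng.normal(0.0, 1.0, size=150) + 1.0 * response  # stronger marker
marker_b = rng.normal(0.0, 1.0, size=150) + 0.4 * response  # weaker marker

roc_a = roc(response, marker_a)
roc_b = roc(response, marker_b)
test = roc_test(roc_a, roc_b,
                predictor1=marker_a, predictor2=marker_b,
                response=response)
print(test.summary())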

pystatsbio.diagnostic.diagnostic_accuracy(response, predictor, *, cutoff, direction='<', prevalence=None, conf_level=0.95, ci_method='clopper-pearson')[source]

Compute diagnostic accuracy metrics at a fixed cutoff.

Parameters:
  • response (array of int) – Binary outcome (0/1).

  • predictor (array of float) – Continuous predictor.

  • cutoff (float) – Classification threshold.

  • direction (str) – '<' means predictor ≥ cutoff is classified positive (controls < cases, higher values = disease). '>' means predictor ≤ cutoff is classified positive (controls > cases, lower values = disease).

  • prevalence (float or None) – Disease prevalence for PPV/NPV adjustment via Bayes’ theorem. If None, uses sample prevalence.

  • conf_level (float) – Confidence level.

  • ci_method (str) – 'clopper-pearson' (exact) or 'wilson'.

Returns:

DiagnosticResult. Validates against R epiR::epi.tests().

Return type:

DiagnosticResult
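
A usage sketch with synthetic data; the comments spell out the standard Bayes'-theorem adjustment that the prevalence argument refers to:

import numpy as np
from pystatsbio.diagnostic import diagnostic_accuracy

rng = np.random.default_rng(3)
response = rng.integers(0, 2, size=120)
predictor = rng.normal(0.0, 1.0, size=120) + response  # higher in cases

# direction='<': predictor >= cutoff is classified positive.
res = diagnostic_accuracy(response, predictor, cutoff=0.5,
                          direction='<', prevalence=0.10,
                          ci_method='clopper-pearson')
print(res.summary())
# With prevalence p, Bayes' theorem gives:
#   PPV = p*sens / (p*sens + (1-p)*(1-spec))
#   NPV = (1-p)*spec / ((1-p)*spec + p*(1-sens))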

pystatsbio.diagnostic.optimal_cutoff(roc_result, *, method='youden', cost_fp=1.0, cost_fn=1.0, prevalence=None)[source]

Find optimal classification cutoff from an ROC curve.

Parameters:
  • roc_result (ROCResult) – A computed ROC curve.

  • method (str) – 'youden': maximize sensitivity + specificity − 1. 'closest_topleft': minimize distance to the top-left corner (FPR=0, TPR=1). 'cost': minimize weighted misclassification cost.

  • cost_fp (float) – Cost of a false positive (used when method='cost').

  • cost_fn (float) – Cost of a false negative (used when method='cost').

  • prevalence (float or None) – Disease prevalence (for method='cost'). Uses sample prevalence n_positive / (n_positive + n_negative) if None.

Returns:

CutoffResult. Validates against R OptimalCutpoints::optimal.cutpoints().

Return type:

CutoffResult
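
A usage sketch with synthetic data (it assumes criterion_value holds the maximized Youden index when method='youden'):

import numpy as np
from pystatsbio.diagnostic import roc, optimal_cutoff

rng = np.random.default_rng(4)
response = rng.integers(0, 2, size=200)
predictor = rng.normal(0.0, 1.0, size=200) + response

roc_result = roc(response, predictor)
cut = optimal_cutoff(roc_result, method='youden')
# Youden's J = sensitivity + specificity - 1, maximized over thresholds
print(cut.cutoff, cut.sensitivity, cut.specificity, cut.criterion_value)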

pystatsbio.diagnostic.batch_auc(response, predictors, *, backend='auto')[source]

Compute AUC for many biomarker candidates simultaneously.

Parameters:
  • response (array of int, shape (n_samples,)) – Shared binary outcome (0/1).

  • predictors (array of float, shape (n_samples, n_markers)) – Matrix of biomarker values (one column per candidate marker).

  • backend (str) – 'cpu', 'gpu', or 'auto'.

Return type:

BatchAUCResult

Notes

GPU backend is beneficial when n_markers > 100. Uses rank-based AUC computation which is embarrassingly parallel across markers. DeLong standard errors are computed for each marker.
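
A usage sketch for panel screening with synthetic data (the planted informative marker is illustrative):

import numpy as np
from pystatsbio.diagnostic import batch_auc

rng = np.random.default_rng(5)
n_samples, n_markers = 500, 1000
response = rng.integers(0, 2, size=n_samples)
predictors = rng.normal(size=(n_samples, n_markers))
predictors[:, 0] += 1.5 * response   # plant one informative marker

res = batch_auc(response, predictors, backend='auto')
best = int(np.argmax(res.auc))       # rank candidate markers by AUC
print(best, res.auc[best], res.se[best])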