Epidemiology

Epidemiological measures: 2×2 table analysis, rate standardization, and Mantel-Haenszel stratified analysis.

Computes risk ratios, odds ratios, risk differences, attributable fractions, age-standardized rates, and Mantel-Haenszel pooled estimates from 2x2 tables.

Validates against: R epiR, epitools, stats::mantelhaen.test()

class pystatsbio.epi.EpiMeasure(name, estimate, ci_lower, ci_upper, conf_level, method)

Bases: object

A single epidemiological measure with confidence interval.

Parameters:
name

Human-readable measure name (e.g. “Risk Ratio”, “Odds Ratio”).

Type:

str

estimate

Point estimate.

Type:

float

ci_lower

Lower bound of the confidence interval.

Type:

float

ci_upper

Upper bound of the confidence interval.

Type:

float

conf_level

Confidence level used (e.g. 0.95).

Type:

float

method

Name of the CI method used.

Type:

str

name: str
estimate: float
ci_lower: float
ci_upper: float
conf_level: float
method: str
summary()

Human-readable one-line summary.

Return type:

str

class pystatsbio.epi.Epi2x2Solution(result)

Bases: SolutionReprMixin

Public result of a 2x2 analysis — wraps Result[Epi2x2Params].

Exposes every measure as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters:

result (Result[Epi2x2Params])

property backend_name: str
property timing: dict[str, float] | None
property warnings: tuple[str, ...]
property info: dict
property risk_ratio: EpiMeasure
property odds_ratio: EpiMeasure
property risk_difference: EpiMeasure
property attributable_risk_exposed: EpiMeasure
property population_attributable_fraction: EpiMeasure
property nnt: EpiMeasure
property table: ndarray[tuple[Any, ...], dtype[_ScalarT]]
summary()

Human-readable summary of all measures.

Return type:

str

class pystatsbio.epi.Epi2x2Params(risk_ratio, odds_ratio, risk_difference, attributable_risk_exposed, population_attributable_fraction, nnt, table)

Bases: object

Computed payload from a 2x2 contingency table analysis.

Parameters:
risk_ratio

Risk ratio (relative risk) with CI.

Type:

EpiMeasure

odds_ratio

Odds ratio with CI (Woolf method).

Type:

EpiMeasure

risk_difference

Risk difference (absolute risk reduction) with CI.

Type:

EpiMeasure

attributable_risk_exposed

Attributable fraction in exposed: (RR - 1) / RR.

Type:

EpiMeasure

population_attributable_fraction

Population attributable fraction (Levin formula).

Type:

EpiMeasure

nnt

Number needed to treat: 1 / |RD|.

Type:

EpiMeasure

table

The original (possibly corrected) 2x2 table.

Type:

NDArray

risk_ratio: EpiMeasure
odds_ratio: EpiMeasure
risk_difference: EpiMeasure
attributable_risk_exposed: EpiMeasure
population_attributable_fraction: EpiMeasure
nnt: EpiMeasure
table: ndarray[tuple[Any, ...], dtype[_ScalarT]]
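
The derived measures listed above can be written out directly from the cell counts. The following is a minimal sketch, not the library's implementation: the function name `attributable_measures` is hypothetical, and it assumes that the exposure prevalence in Levin's formula is estimated from the table itself as (a + b) / N, which is only appropriate for cohort-style data.

```python
def attributable_measures(a, b, c, d):
    """Sketch of AFe, PAF (Levin) and NNT from the cells of a 2x2 table.

    Layout: a = exposed+disease, b = exposed+no disease,
            c = unexposed+disease, d = unexposed+no disease.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    rd = risk_exposed - risk_unexposed

    # Attributable fraction in the exposed: (RR - 1) / RR
    afe = (rr - 1) / rr

    # Levin's formula. Assumption: exposure prevalence taken from the table.
    p_exposed = (a + b) / (a + b + c + d)
    paf = p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))

    # Number needed to treat: 1 / |RD| (undefined when RD is zero)
    nnt = 1 / abs(rd)
    return {"afe": afe, "paf": paf, "nnt": nnt}


measures = attributable_measures(30, 70, 10, 90)
```

With risks of 0.30 in the exposed and 0.10 in the unexposed, the sketch gives an AFe of about 0.67, a PAF of about 0.50, and an NNT of about 5.
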
class pystatsbio.epi.StandardizedRateSolution(result)

Bases: SolutionReprMixin

Public result of rate standardization — wraps Result[StandardizedRateParams].

Exposes every output as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters:

result (Result[StandardizedRateParams])

property backend_name: str
property timing: dict[str, float] | None
property warnings: tuple[str, ...]
property info: dict
property crude_rate: float
property adjusted_rate: float
property adjusted_rate_ci: tuple[float, float]
property conf_level: float
property method: str
property sir: float | None
property sir_ci: tuple[float, float] | None
summary()

Human-readable summary.

Return type:

str

class pystatsbio.epi.StandardizedRateParams(crude_rate, adjusted_rate, adjusted_rate_ci, conf_level, method, sir, sir_ci)

Bases: object

Computed payload from rate standardization.

Parameters:
crude_rate

Unstandardized (crude) rate.

Type:

float

adjusted_rate

Standardized (adjusted) rate.

Type:

float

adjusted_rate_ci

Confidence interval for the adjusted rate.

Type:

tuple of float

conf_level

Confidence level used.

Type:

float

method

'direct' or 'indirect'.

Type:

str

sir

Standardized incidence/mortality ratio (indirect only).

Type:

float or None

sir_ci

CI for SIR (indirect only).

Type:

tuple of float or None

crude_rate: float
adjusted_rate: float
adjusted_rate_ci: tuple[float, float]
conf_level: float
method: str
sir: float | None
sir_ci: tuple[float, float] | None
class pystatsbio.epi.MantelHaenszelSolution(result)

Bases: SolutionReprMixin

Public result of an MH analysis — wraps Result[MantelHaenszelParams].

Exposes every output as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters:

result (Result[MantelHaenszelParams])

property backend_name: str
property timing: dict[str, float] | None
property warnings: tuple[str, ...]
property info: dict
property pooled_estimate: EpiMeasure
property cmh_statistic: float
property cmh_p_value: float
property breslow_day_statistic: float | None
property breslow_day_p_value: float | None
property n_strata: int
property measure: str
summary()

Human-readable summary.

Return type:

str

class pystatsbio.epi.MantelHaenszelParams(pooled_estimate, cmh_statistic, cmh_p_value, breslow_day_statistic, breslow_day_p_value, n_strata, measure)

Bases: object

Computed payload from Mantel-Haenszel stratified analysis.

Parameters:
pooled_estimate

MH pooled odds ratio or risk ratio with CI.

Type:

EpiMeasure

cmh_statistic

Cochran-Mantel-Haenszel chi-squared test statistic.

Type:

float

cmh_p_value

p-value for the CMH test (chi-squared df=1).

Type:

float

breslow_day_statistic

Breslow-Day homogeneity test statistic (None if < 2 strata).

Type:

float or None

breslow_day_p_value

p-value for Breslow-Day test (None if < 2 strata).

Type:

float or None

n_strata

Number of strata.

Type:

int

measure

'odds-ratio' or 'risk-ratio'.

Type:

str

pooled_estimate: EpiMeasure
cmh_statistic: float
cmh_p_value: float
breslow_day_statistic: float | None
breslow_day_p_value: float | None
n_strata: int
measure: str
pystatsbio.epi.epi_2by2(table, *, conf_level=0.95)

Compute epidemiological measures from a 2x2 contingency table.

Table layout (matching R’s epiR::epi.2by2):
    [[a, b],    a = exposed+disease,    b = exposed+no disease
     [c, d]]    c = unexposed+disease,  d = unexposed+no disease

Computes:
  • Risk ratio (RR) with log-transformed CI

  • Odds ratio (OR) with Woolf CI

  • Risk difference (RD) with Wald CI

  • Attributable fraction in exposed (AFe)

  • Population attributable fraction (PAF) via Levin’s formula

  • Number needed to treat (NNT)

Parameters:
  • table (array-like, shape (2, 2)) – 2x2 contingency table with non-negative values. If any cell is zero, a 0.5 continuity correction is applied.

  • conf_level (float) – Confidence level for all intervals. Must be in (0, 1).

Return type:

Epi2x2Solution

Raises:
  • ValidationError – If table is not 2x2, contains negative values, or conf_level is out of range.

Validates against: R epiR::epi.2by2()
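
The three interval methods named for epi_2by2 (log-transformed CI for RR, Woolf CI for OR, Wald CI for RD) can be sketched with the standard library alone. This is an illustration of the textbook formulas, not the library's code: the function name `two_by_two` is hypothetical, and adding 0.5 to every cell when any cell is zero is an assumption about how the continuity correction is applied.

```python
import math
from statistics import NormalDist


def two_by_two(table, conf_level=0.95):
    """Sketch of RR, OR and RD with their CIs; returns (estimate, lower, upper)."""
    (a, b), (c, d) = table
    # Assumption: the 0.5 correction is added to all four cells.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n1, n0 = a + b, c + d          # exposed and unexposed totals
    p1, p0 = a / n1, c / n0        # risk in exposed and unexposed

    # Risk ratio: CI built on the log scale, then exponentiated
    rr = p1 / p0
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))

    # Odds ratio: Woolf standard error on the log scale
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    # Risk difference: Wald interval on the natural scale
    rd = p1 - p0
    se_rd = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    return {
        "risk_ratio": (rr, *rr_ci),
        "odds_ratio": (odds_ratio, *or_ci),
        "risk_difference": (rd, *rd_ci),
    }


result = two_by_two([[30, 70], [10, 90]])
```

For this table the sketch yields RR = 3, OR of about 3.86, and RD = 0.2; the RR interval excludes 1.
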

pystatsbio.epi.rate_standardize(counts, person_time, standard_pop, *, method='direct', conf_level=0.95)

Age-standardize rates using direct or indirect method.

Direct standardization:

    adjusted_rate = sum(rate_i * weight_i) / sum(weight_i)

    where rate_i = counts_i / person_time_i and weight_i = standard_pop_i.

    CI via normal approximation:

    SE = sqrt(sum(weight_i^2 * counts_i / person_time_i^2)) / sum(weight_i)
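
The direct method translates almost line for line into code. The sketch below follows the two formulas given here; the function name `direct_standardize` and its return shape are hypothetical and are not part of pystatsbio.

```python
import math
from statistics import NormalDist


def direct_standardize(counts, person_time, standard_pop, conf_level=0.95):
    """Sketch of direct standardization with a normal-approximation CI."""
    total_weight = sum(standard_pop)

    # Stratum-specific rates, weighted by the standard population
    rates = [c / t for c, t in zip(counts, person_time)]
    adjusted = sum(r * w for r, w in zip(rates, standard_pop)) / total_weight

    # SE = sqrt(sum(w_i^2 * counts_i / person_time_i^2)) / sum(w_i)
    se = math.sqrt(
        sum(w ** 2 * c / t ** 2
            for w, c, t in zip(standard_pop, counts, person_time))
    ) / total_weight

    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    crude = sum(counts) / sum(person_time)
    return crude, adjusted, (adjusted - z * se, adjusted + z * se)


crude_rate, adjusted_rate, adjusted_ci = direct_standardize(
    counts=[10, 20, 30],
    person_time=[1000, 1000, 1000],
    standard_pop=[3000, 2000, 1000],
)
```

Here the crude rate is 0.02, while weighting toward the younger, lower-rate strata pulls the adjusted rate down to about 0.0167.
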

Indirect standardization:

    expected = sum(standard_rate_i * person_time_i)

    SIR = observed / expected

    CI: exact Poisson CI on observed, divided by expected

For the indirect method, standard_pop must contain standard rates, not population counts.
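
The indirect method can be sketched the same way. The exact Poisson interval is found here by bisection on the Poisson CDF so that the example needs only the standard library; this is an illustrative choice, and the library may compute the same limits differently (for example from chi-squared quantiles). The names `poisson_cdf`, `exact_poisson_ci`, and `indirect_standardize` are hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), by direct summation.

    Fine for modest counts; exp(-mu) underflows once mu reaches several hundred.
    """
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def exact_poisson_ci(observed, conf_level=0.95):
    """Exact (Garwood) CI for a Poisson mean, given an integer count."""
    alpha = 1 - conf_level
    upper_bracket = observed + 10 * math.sqrt(observed + 1) + 10

    def solve(g):
        # g is decreasing in mu and changes sign on [0, upper_bracket]
        lo, hi = 0.0, upper_bracket
        for _ in range(100):
            mid = (lo + hi) / 2
            if g(mid) > 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= observed | mu) = alpha / 2
    lower = 0.0 if observed == 0 else solve(
        lambda mu: alpha / 2 - (1 - poisson_cdf(observed - 1, mu)))
    # Upper limit: P(X <= observed | mu) = alpha / 2
    upper = solve(lambda mu: poisson_cdf(observed, mu) - alpha / 2)
    return lower, upper


def indirect_standardize(counts, person_time, standard_rates, conf_level=0.95):
    """Sketch of the SIR and its CI; standard_rates are rates, not populations."""
    observed = sum(counts)
    expected = sum(r * t for r, t in zip(standard_rates, person_time))
    lo, hi = exact_poisson_ci(observed, conf_level)
    return observed / expected, (lo / expected, hi / expected)


sir, sir_ci = indirect_standardize(
    counts=[2, 3, 5],
    person_time=[1000, 1000, 1000],
    standard_rates=[0.001, 0.002, 0.005],
)
```

With 10 observed and 8 expected events, the SIR is 1.25 and the interval spans 1, so this small sample gives no clear evidence of excess incidence.
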

Parameters:
  • counts (array-like) – Observed event counts per stratum (e.g., age group).

  • person_time (array-like) – Person-time at risk per stratum.

  • standard_pop (array-like) – Standard population weights (direct) or standard rates (indirect).

  • method (str) – 'direct' or 'indirect'.

  • conf_level (float) – Confidence level for intervals. Must be in (0, 1).

Return type:

StandardizedRateSolution

Raises:
  • ValidationError – If inputs are invalid or method is not recognized.

Validates against: R epitools::ageadjust.direct(), epitools::ageadjust.indirect()

pystatsbio.epi.mantel_haenszel(tables, *, measure='odds-ratio', conf_level=0.95)

Mantel-Haenszel stratified analysis for pooled OR or RR.

Pools effect measures across K strata of 2x2 tables, tests for conditional independence (CMH test), and tests for homogeneity of effect across strata (Breslow-Day test, OR only).
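
As an illustration of the pooling step, the standard Mantel-Haenszel point estimators can be written as below. This is a sketch under the table layout used throughout this page; the function name `mh_pooled` is hypothetical, and the confidence interval for the pooled estimate is not covered.

```python
def mh_pooled(tables, measure="odds-ratio"):
    """Sketch of the Mantel-Haenszel pooled OR or RR over K 2x2 strata."""
    numerator = 0.0
    denominator = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if measure == "odds-ratio":
            # OR_MH = sum(a*d/n) / sum(b*c/n)
            numerator += a * d / n
            denominator += b * c / n
        elif measure == "risk-ratio":
            # RR_MH = sum(a*(c+d)/n) / sum(c*(a+b)/n)
            numerator += a * (c + d) / n
            denominator += c * (a + b) / n
        else:
            raise ValueError("measure must be 'odds-ratio' or 'risk-ratio'")
    return numerator / denominator


strata = [
    [[10, 20], [5, 25]],
    [[15, 15], [10, 20]],
]
pooled_or = mh_pooled(strata, measure="odds-ratio")
pooled_rr = mh_pooled(strata, measure="risk-ratio")
```

For these two strata the pooled OR is approximately 2.2 and the pooled RR approximately 1.67. Each stratum contributes in proportion to its size, which keeps sparse strata from dominating the estimate.
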

Parameters:
  • tables (array-like, shape (K, 2, 2)) – K strata of 2x2 tables. Each stratum follows the layout: [[a, b], [c, d]] where a = exposed+disease, b = exposed+no disease, c = unexposed+disease, d = unexposed+no disease.

  • measure (str) – 'odds-ratio' or 'risk-ratio'.

  • conf_level (float) – Confidence level for intervals. Must be in (0, 1).

Return type:

MantelHaenszelSolution

Raises:
  • ValidationError – If tables shape is wrong, measure is invalid, or computations are degenerate.

Validates against: R stats::mantelhaen.test(), DescTools::BreslowDayTest()
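
The CMH test of conditional independence can be sketched as follows. The function name `cmh_test` and its `correct` flag are hypothetical. R's stats::mantelhaen.test() applies a continuity correction by default (correct = TRUE); whether mantel_haenszel does the same is not stated on this page, so the flag is an assumption made for illustration.

```python
import math


def cmh_test(tables, correct=True):
    """Sketch of the Cochran-Mantel-Haenszel chi-squared test (df = 1)."""
    delta = 0.0      # sum over strata of observed minus expected for cell a
    variance = 0.0   # sum over strata of the hypergeometric variance of a
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        row1, row0 = a + b, c + d    # exposed, unexposed totals
        col1, col0 = a + c, b + d    # diseased, non-diseased totals
        delta += a - row1 * col1 / n
        variance += row1 * row0 * col1 * col0 / (n ** 2 * (n - 1))

    deviation = abs(delta)
    # Continuity correction of 0.5, skipped when |delta| is below 0.5
    if correct and deviation >= 0.5:
        deviation -= 0.5

    statistic = deviation ** 2 / variance
    # Upper tail of chi-squared with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


cmh_statistic, cmh_p_value = cmh_test([
    [[10, 20], [5, 25]],
    [[15, 15], [10, 20]],
])
```

For these strata the corrected statistic is about 3.08 with a p-value near 0.08. Passing `correct=False` gives a larger statistic, about 3.81.
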