Epidemiology¶

Epidemiological analysis: measures of association from 2x2 tables, rate standardization, and Mantel-Haenszel stratified analysis.

Computes risk ratios, odds ratios, risk differences, attributable fractions, and the number needed to treat from a single 2x2 table; age-standardized rates from stratified counts and person-time; and Mantel-Haenszel pooled estimates from a stack of 2x2 tables.

Validates against: R epiR, epitools, stats::mantelhaen.test()

class pystatsbio.epi.EpiMeasure(name, estimate, ci_lower, ci_upper, conf_level, method)[source]¶

Bases: object

A single epidemiological measure with confidence interval.

Parameters:

name (str)
estimate (float)
ci_lower (float)
ci_upper (float)
conf_level (float)
method (str)

name¶

Human-readable measure name (e.g. “Risk Ratio”, “Odds Ratio”).

Type: str

estimate¶

Point estimate.

Type: float

ci_lower¶

Lower bound of the confidence interval.

Type: float

ci_upper¶

Upper bound of the confidence interval.

Type: float

conf_level¶

Confidence level used (e.g. 0.95).

Type: float

method¶

Name of the CI method used.

Type: str

summary()[source]¶

Human-readable one-line summary.

Return type: str

class pystatsbio.epi.Epi2x2Solution(result)[source]¶

Bases: SolutionReprMixin

Public result of a 2x2 analysis — wraps Result[Epi2x2Params].

Exposes every measure as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters: result (Result[Epi2x2Params])

property backend_name: str¶

property timing: dict[str, float] | None¶

property warnings: tuple[str, ...]¶

property info: dict¶

property risk_ratio: EpiMeasure¶

property odds_ratio: EpiMeasure¶

property risk_difference: EpiMeasure¶

property attributable_risk_exposed: EpiMeasure¶

property population_attributable_fraction: EpiMeasure¶

property nnt: EpiMeasure¶

property table: ndarray[tuple[Any, ...], dtype[_ScalarT]]¶

summary()[source]¶

Human-readable summary of all measures.

Return type: str

class pystatsbio.epi.Epi2x2Params(risk_ratio, odds_ratio, risk_difference, attributable_risk_exposed, population_attributable_fraction, nnt, table)[source]¶

Bases: object

Computed payload from a 2x2 contingency table analysis.

Parameters:

risk_ratio (EpiMeasure)
odds_ratio (EpiMeasure)
risk_difference (EpiMeasure)
attributable_risk_exposed (EpiMeasure)
population_attributable_fraction (EpiMeasure)
nnt (EpiMeasure)
table (ndarray[tuple[Any, ...], dtype[_ScalarT]])

risk_ratio¶

Risk ratio (relative risk) with CI.

Type: EpiMeasure

odds_ratio¶

Odds ratio with CI (Woolf method).

Type: EpiMeasure

risk_difference¶

Risk difference (absolute risk reduction) with CI.

Type: EpiMeasure

attributable_risk_exposed¶

Attributable fraction in exposed: (RR - 1) / RR.

Type: EpiMeasure

population_attributable_fraction¶

Population attributable fraction (Levin formula).

Type: EpiMeasure

nnt¶

Number needed to treat: 1 / |RD|.

Type: EpiMeasure

table¶

The 2x2 table used for the computations: the original table, or the continuity-corrected one if any cell was zero.

Type: NDArray

class pystatsbio.epi.StandardizedRateSolution(result)[source]¶

Bases: SolutionReprMixin

Public result of rate standardization — wraps Result[StandardizedRateParams].

Exposes every output as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters: result (Result[StandardizedRateParams])

property backend_name: str¶

property timing: dict[str, float] | None¶

property warnings: tuple[str, ...]¶

property info: dict¶

property crude_rate: float¶

property adjusted_rate: float¶

property adjusted_rate_ci: tuple[float, float]¶

property conf_level: float¶

property method: str¶

property sir: float | None¶

property sir_ci: tuple[float, float] | None¶

summary()[source]¶

Human-readable summary.

Return type: str

class pystatsbio.epi.StandardizedRateParams(crude_rate, adjusted_rate, adjusted_rate_ci, conf_level, method, sir, sir_ci)[source]¶

Bases: object

Computed payload from rate standardization.

Parameters:

crude_rate (float)
adjusted_rate (float)
adjusted_rate_ci (tuple[float, float])
conf_level (float)
method (str)
sir (float | None)
sir_ci (tuple[float, float] | None)

crude_rate¶

Unstandardized (crude) rate.

Type: float

adjusted_rate¶

Standardized (adjusted) rate.

Type: float

adjusted_rate_ci¶

Confidence interval for the adjusted rate.

Type: tuple of float

conf_level¶

Confidence level used.

Type: float

method¶

‘direct’ or ‘indirect’.

Type: str

sir¶

Standardized incidence/mortality ratio (indirect only).

Type: float or None

sir_ci¶

CI for SIR (indirect only).

Type: tuple of float or None

class pystatsbio.epi.MantelHaenszelSolution(result)[source]¶

Bases: SolutionReprMixin

Public result of an MH analysis — wraps Result[MantelHaenszelParams].

Exposes every output as a read-only property plus the uniform .backend_name / .timing / .warnings / .info metadata and a Jupyter _repr_html_ (via SolutionReprMixin).

Parameters: result (Result[MantelHaenszelParams])

property backend_name: str¶

property timing: dict[str, float] | None¶

property warnings: tuple[str, ...]¶

property info: dict¶

property pooled_estimate: EpiMeasure¶

property cmh_statistic: float¶

property cmh_p_value: float¶

property breslow_day_statistic: float | None¶

property breslow_day_p_value: float | None¶

property n_strata: int¶

property measure: str¶

summary()[source]¶

Human-readable summary.

Return type: str

class pystatsbio.epi.MantelHaenszelParams(pooled_estimate, cmh_statistic, cmh_p_value, breslow_day_statistic, breslow_day_p_value, n_strata, measure)[source]¶

Bases: object

Computed payload from Mantel-Haenszel stratified analysis.

Parameters:

pooled_estimate (EpiMeasure)
cmh_statistic (float)
cmh_p_value (float)
breslow_day_statistic (float | None)
breslow_day_p_value (float | None)
n_strata (int)
measure (str)

pooled_estimate¶

MH pooled odds ratio or risk ratio with CI.

Type: EpiMeasure

cmh_statistic¶

Cochran-Mantel-Haenszel chi-squared test statistic.

Type: float

cmh_p_value¶

p-value for the CMH test (chi-squared, df = 1).

Type: float

breslow_day_statistic¶

Breslow-Day homogeneity test statistic (None if there are fewer than 2 strata).

Type: float or None

breslow_day_p_value¶

p-value for the Breslow-Day test (None if there are fewer than 2 strata).

Type: float or None

n_strata¶

Number of strata.

Type: int

measure¶

‘odds-ratio’ or ‘risk-ratio’.

Type: str

pystatsbio.epi.epi_2by2(table, *, conf_level=0.95)[source]¶

Compute epidemiological measures from a 2x2 contingency table.

Table layout (matching R’s epiR::epi.2by2):

[[a, b],   a = exposed+disease,   b = exposed+no disease
 [c, d]]   c = unexposed+disease, d = unexposed+no disease

Computes:

- Risk ratio (RR) with log-transformed CI
- Odds ratio (OR) with Woolf CI
- Risk difference (RD) with Wald CI
- Attributable fraction in exposed (AFe)
- Population attributable fraction (PAF) via Levin’s formula
- Number needed to treat (NNT)

Parameters:

table (array-like, shape (2, 2)) – 2x2 contingency table with non-negative values. If any cell is zero, a 0.5 continuity correction is applied.
conf_level (float) – Confidence level for all intervals. Must be in (0, 1).

Return type:

Epi2x2Solution

Raises:

ValidationError – If table is not 2x2, contains negative values, or conf_level is out of range.

Validates against: R epiR::epi.2by2()
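
The measures listed for epi_2by2 can be reproduced by hand, which is useful when checking results. The block below is a minimal standard-library sketch, not the pystatsbio implementation: the function name two_by_two_measures is hypothetical, it assumes every cell is positive (so no continuity correction is needed) and that RD is nonzero, and it assumes Levin's formula uses the exposure prevalence observed in the table itself.

```python
import math
from statistics import NormalDist


def two_by_two_measures(a, b, c, d, conf_level=0.95):
    """Hand-computed 2x2 measures (hypothetical helper, not pystatsbio).

    Layout: a = exposed+disease, b = exposed+no disease,
            c = unexposed+disease, d = unexposed+no disease.
    Assumes all four cells are > 0.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n1, n0 = a + b, c + d        # exposed and unexposed totals
    r1, r0 = a / n1, c / n0      # risk in exposed and in unexposed

    # Risk ratio: interval built on the log scale, then exponentiated.
    rr = r1 / r0
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))

    # Odds ratio: Woolf interval on the log-odds scale.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    # Risk difference: Wald interval.
    rd = r1 - r0
    se_rd = math.sqrt(r1 * (1 - r1) / n1 + r0 * (1 - r0) / n0)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    # Attributable fraction in exposed, Levin's PAF, and NNT (point estimates).
    afe = (rr - 1) / rr
    p_exposed = n1 / (n1 + n0)   # assumption: prevalence taken from the table
    paf = p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))
    nnt = 1 / abs(rd)            # assumes rd != 0

    return {
        "risk_ratio": (rr, rr_ci),
        "odds_ratio": (odds_ratio, or_ci),
        "risk_difference": (rd, rd_ci),
        "afe": afe,
        "paf": paf,
        "nnt": nnt,
    }


measures = two_by_two_measures(30, 70, 10, 90)
print(measures["risk_ratio"], measures["odds_ratio"])
```

For the table [[30, 70], [10, 90]] the point estimates work out to RR = 3, OR = 27/7, RD = 0.2, PAF = 0.5, and NNT = 5, up to floating-point rounding.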

pystatsbio.epi.rate_standardize(counts, person_time, standard_pop, *, method='direct', conf_level=0.95)[source]¶

Age-standardize rates using direct or indirect method.

Direct standardization:

adjusted_rate = sum(rate_i * weight_i) / sum(weight_i)

where rate_i = counts_i / person_time_i and weight_i = standard_pop_i.

CI via normal approximation:

SE = sqrt(sum(weight_i^2 * counts_i / person_time_i^2)) / sum(weight_i)
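
The direct-standardization formulas above can be sketched in a few lines of standard-library Python. This is an illustration of the stated formulas only, not the library's code; the name direct_standardize and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def direct_standardize(counts, person_time, standard_pop, conf_level=0.95):
    """Directly standardized rate with a normal-approximation CI.

    Hypothetical helper that follows the formulas in the text:
    rate_i = counts_i / person_time_i, weight_i = standard_pop_i.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    total_weight = sum(standard_pop)

    crude = sum(counts) / sum(person_time)

    # Weighted average of the stratum-specific rates.
    adjusted = sum(
        w * d / t for d, t, w in zip(counts, person_time, standard_pop)
    ) / total_weight

    # SE = sqrt(sum(w_i^2 * d_i / t_i^2)) / sum(w_i)
    se = math.sqrt(
        sum(w ** 2 * d / t ** 2
            for d, t, w in zip(counts, person_time, standard_pop))
    ) / total_weight

    return crude, adjusted, (adjusted - z * se, adjusted + z * se)


# Three age groups with made-up counts, person-time, and standard weights.
crude, adjusted, ci = direct_standardize(
    counts=[10, 20, 30],
    person_time=[1000, 1000, 1000],
    standard_pop=[500, 300, 200],
)
print(crude, adjusted, ci)
```

In this example the standard population is weighted toward the low-rate strata, so the adjusted rate (about 0.017) falls below the crude rate (0.02).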

Indirect standardization:

expected = sum(standard_rate_i * person_time_i)

SIR = observed / expected

CI: exact Poisson CI on observed, divided by expected

For method='indirect', standard_pop must hold the stratum-specific standard rates, not population counts.
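
The indirect calculation can be sketched as follows. The text only says "exact Poisson CI"; this sketch assumes the usual Garwood interval and finds its limits by bisection on the Poisson CDF so that it needs nothing beyond the standard library. The helper names (poisson_cdf, exact_poisson_ci, indirect_sir) are hypothetical, and the term-by-term CDF is only suitable for modest observed counts.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _root_of_decreasing(f, lo, hi, iterations=200):
    """Bisection for a function that is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(k, conf_level=0.95):
    """Garwood (exact) interval for a Poisson mean given an observed count k."""
    alpha = 1 - conf_level
    bracket = 2.0 * k + 50.0  # generous upper search bound for typical levels
    # Lower limit solves P(X >= k | mu) = alpha / 2; it is 0 when k == 0.
    lower = 0.0 if k == 0 else _root_of_decreasing(
        lambda mu: poisson_cdf(k - 1, mu) - (1 - alpha / 2), 0.0, bracket)
    # Upper limit solves P(X <= k | mu) = alpha / 2.
    upper = _root_of_decreasing(
        lambda mu: poisson_cdf(k, mu) - alpha / 2, 0.0, bracket)
    return lower, upper


def indirect_sir(counts, person_time, standard_rates, conf_level=0.95):
    """SIR = observed / expected, with the Poisson CI scaled by expected."""
    observed = sum(counts)
    expected = sum(r * t for r, t in zip(standard_rates, person_time))
    lower, upper = exact_poisson_ci(observed, conf_level)
    return observed / expected, (lower / expected, upper / expected)


# Made-up data: 10 observed events against 5 expected.
sir, sir_ci = indirect_sir([4, 6], [1000, 2000], [0.001, 0.002])
print(sir, sir_ci)
```

Here 10 events are observed where the standard rates predict 5, giving an SIR of 2 with a 95% interval of roughly 0.96 to 3.68.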

Parameters:

counts (array-like) – Observed event counts per stratum (e.g., age group).
person_time (array-like) – Person-time at risk per stratum.
standard_pop (array-like) – Standard population weights (direct) or standard rates (indirect).
method (str) – ‘direct’ or ‘indirect’.
conf_level (float) – Confidence level for intervals. Must be in (0, 1).

Return type:

StandardizedRateSolution

Raises:

ValidationError – If inputs are invalid or method is not recognized.

Validates against: R epitools::ageadjust.direct(), epitools::ageadjust.indirect()

pystatsbio.epi.mantel_haenszel(tables, *, measure='odds-ratio', conf_level=0.95)[source]¶

Mantel-Haenszel stratified analysis for pooled OR or RR.

Pools effect measures across K strata of 2x2 tables, tests for conditional independence (CMH test), and tests for homogeneity of effect across strata (Breslow-Day test, OR only).

Parameters:

tables (array-like, shape (K, 2, 2)) – K strata of 2x2 tables. Each stratum follows the layout: [[a, b], [c, d]] where a = exposed+disease, b = exposed+no disease, c = unexposed+disease, d = unexposed+no disease.
measure (str) – ‘odds-ratio’ or ‘risk-ratio’.
conf_level (float) – Confidence level for intervals. Must be in (0, 1).

Return type:

MantelHaenszelSolution

Raises:

ValidationError – If tables shape is wrong, measure is invalid, or computations are degenerate.

Validates against: R stats::mantelhaen.test(), DescTools::BreslowDayTest()
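
To make the pooling step concrete, here is a minimal standard-library sketch of the Mantel-Haenszel pooled odds ratio and the CMH test. It is not the pystatsbio implementation. Two choices are assumptions, since the text above does not state them: the confidence interval uses the Robins-Greenland-Breslow variance of log(OR_MH), and the CMH statistic is computed without a continuity correction. The function name mantel_haenszel_or is hypothetical, and the Breslow-Day test is not covered.

```python
import math
from statistics import NormalDist


def mantel_haenszel_or(tables, conf_level=0.95):
    """Pooled OR and CMH test for strata given as ((a, b), (c, d)).

    Hypothetical helper. Assumes the Robins-Greenland-Breslow variance
    and an uncorrected CMH statistic.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    sum_r = sum_s = 0.0                  # MH numerator and denominator
    sum_pr = sum_ps_qr = sum_qs = 0.0    # pieces of the RGB variance
    sum_a = sum_expected = sum_var = 0.0  # pieces of the CMH statistic

    for (a, b), (c, d) in tables:
        n = a + b + c + d
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s

        # Hypergeometric mean and variance of cell a given the margins.
        n1, n0 = a + b, c + d
        m1, m0 = a + c, b + d
        sum_a += a
        sum_expected += n1 * m1 / n
        sum_var += n1 * n0 * m1 * m0 / (n * n * (n - 1))

    or_mh = sum_r / sum_s
    var_log_or = (sum_pr / (2 * sum_r ** 2)
                  + sum_ps_qr / (2 * sum_r * sum_s)
                  + sum_qs / (2 * sum_s ** 2))
    se = math.sqrt(var_log_or)
    ci = (or_mh * math.exp(-z * se), or_mh * math.exp(z * se))

    cmh = (sum_a - sum_expected) ** 2 / sum_var
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(cmh / 2))
    return or_mh, ci, cmh, p_value


# Two made-up strata that share the same stratum-specific OR of 2.5.
strata = [((10, 20), (5, 25)), ((20, 40), (10, 50))]
or_mh, ci, cmh, p_value = mantel_haenszel_or(strata)
print(or_mh, ci, cmh, p_value)
```

Because both strata have an odds ratio of 2.5, the pooled estimate is also 2.5; with heterogeneous strata the pooled value is a weighted compromise, which is the situation the Breslow-Day test is meant to flag.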