Core
Core infrastructure for PyStatistics: the DataSource container, Result wrapper, device selection, precision management, and exception classes.
- class pystatistics.core.DataSource(_data, _capabilities, _metadata=<factory>)
Bases: object
Universal data container. Domain-agnostic.
Construct via factory classmethods, not directly.
The lumber yard analogy: DataSource holds the data (logs). It doesn’t know or care what you’re building: furniture (regression), paper (MVN MLE), or two-by-fours (survival analysis).
- keys()
Return the names of all available arrays.
Example
>>> ds = DataSource.from_arrays(X=X, y=y)
>>> ds.keys()
frozenset({'X', 'y'})
- supports(capability)
Check if this DataSource supports a capability.
- Parameters:
capability (str) – Use constants from pystatistics.core.capabilities
- Returns:
True if supported, False otherwise
- Return type:
bool
Note
Unknown capabilities return False; they never raise.
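The never-raise contract can be illustrated with a small stand-in. This is a sketch, not the PyStatistics implementation: `MiniSource` and the capability strings below are hypothetical names, chosen only to show a membership-style check that evaluates to False for any unrecognized name.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MiniSource:
    """Hypothetical stand-in for a capability-aware container."""

    # Capability names are illustrative, not the library's constants.
    _capabilities: frozenset = field(default_factory=frozenset)

    def supports(self, capability: str) -> bool:
        # Set membership on a string never raises for an unknown name;
        # it simply evaluates to False.
        return capability in self._capabilities


src = MiniSource(frozenset({"materialized", "repeatable"}))
print(src.supports("materialized"))   # a capability the stand-in declares
print(src.supports("no_such_thing"))  # unknown name, no exception
```

With a check like this, callers can branch on the boolean directly instead of wrapping the call in try/except.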
- classmethod from_arrays(*, X=None, y=None, data=None, columns=None, **named_arrays)
Construct from NumPy arrays.
- classmethod from_file(path, *, columns=None)
Construct from a file (CSV or NPY).
- Parameters:
- Return type:
DataSource
- classmethod from_dataframe(df, *, source_path=None)
Construct from a pandas DataFrame.
- Parameters:
df (pd.DataFrame)
source_path (str | None)
- Return type:
DataSource
- classmethod from_tensors(*, X=None, y=None, **named_tensors)
Construct from PyTorch tensors (already on GPU).
- Parameters:
X (torch.Tensor | None)
y (torch.Tensor | None)
named_tensors (torch.Tensor)
- Return type:
DataSource
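The "construct via factory classmethods, not directly" convention can be sketched with a toy container. Everything here is an assumption made for illustration: `MiniContainer` is hypothetical, it accepts plain lists rather than NumPy arrays, it omits the documented `data` and `columns` arguments, and skipping `None` entries is a guess about reasonable behavior rather than a statement about the library.

```python
class MiniContainer:
    """Hypothetical stand-in illustrating the factory-classmethod pattern."""

    def __init__(self, _data: dict):
        # The leading underscore signals that callers should use a factory.
        self._data = dict(_data)

    @classmethod
    def from_arrays(cls, *, X=None, y=None, **named_arrays):
        # Keyword-only arguments mirror the shape of the documented signature.
        # Assumption: entries left as None are simply not stored.
        supplied = {"X": X, "y": y, **named_arrays}
        return cls({k: v for k, v in supplied.items() if v is not None})

    def keys(self) -> frozenset:
        # An immutable set of the stored array names.
        return frozenset(self._data)


ds = MiniContainer.from_arrays(X=[[1.0, 2.0], [3.0, 4.0]], y=[0.5, 1.5])
print(sorted(ds.keys()))
```

Routing construction through named factories keeps per-source validation in one place and leaves the constructor free to change.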
- class pystatistics.core.Result(params, info, timing, backend_name, warnings=<factory>, provenance=<factory>)
Bases: Generic[P]
Immutable result envelope for statistical computations.
- Type Parameters:
P: The domain-specific parameter payload type
- Parameters:
- params
Domain-specific parameters (coefficients, estimates, etc.)
- Type:
pystatistics.core.result.P
The class is declared with frozen=True, so results are immutable after creation. This supports reproducibility and prevents accidental modification.
Examples
>>> # Direct method (no convergence notion)
>>> Result(
...     params=LinearParams(coefficients=beta),
...     info={'method': 'qr', 'rank': 5},
...     timing={'total_seconds': 0.01},
...     backend_name='cpu_qr'
... )
>>> # Iterative method
>>> Result(
...     params=MVNParams(mu=mu, sigma=sigma),
...     info={'method': 'em', 'converged': True, 'iterations': 23},
...     timing={'total_seconds': 0.5, 'e_step': 0.3, 'm_step': 0.2},
...     backend_name='cpu_em'
... )
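To make the immutability guarantee concrete, here is a minimal sketch of a frozen, generic dataclass with the same field names as the documented signature. `MiniResult` is a hypothetical stand-in; the field types and the empty-tuple default for `warnings` are assumptions, not the library's definitions.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Generic, TypeVar

P = TypeVar("P")


@dataclass(frozen=True)
class MiniResult(Generic[P]):
    """Hypothetical stand-in mirroring the documented field names."""

    params: P
    info: dict
    timing: dict
    backend_name: str
    # Assumed container types; the real defaults are only shown as <factory>.
    warnings: tuple = field(default_factory=tuple)
    provenance: dict = field(default_factory=dict)


res = MiniResult(
    params={"coefficients": [1.0, 2.0]},
    info={"method": "qr", "rank": 2},
    timing={"total_seconds": 0.01},
    backend_name="cpu_qr",
)

try:
    res.backend_name = "gpu"  # rebinding a field on a frozen dataclass
except FrozenInstanceError:
    rebinding_blocked = True
else:
    rebinding_blocked = False

print(rebinding_blocked)
```

Note that frozen=True blocks attribute rebinding only. Mutable values stored in the fields, such as the `info` dict in this sketch, can still be changed in place.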
- exception pystatistics.core.PyStatisticsError
Bases: Exception
Base exception for all PyStatistics errors.
- exception pystatistics.core.ValidationError
Bases: PyStatisticsError
Input validation failed.
Raised when user-provided inputs fail validation checks.
- exception pystatistics.core.DimensionError
Bases: ValidationError
Array dimensions are incorrect or inconsistent.
Raised when array shapes don’t match expected dimensions or when multiple arrays have inconsistent shapes.
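Because DimensionError derives from ValidationError, which derives from PyStatisticsError, a handler for either ancestor also catches it. The sketch below redefines the three classes locally as stand-ins so that it runs without the library; `check_same_length` is a hypothetical validator, not a PyStatistics function.

```python
class PyStatisticsError(Exception):
    """Stand-in for the documented base exception."""


class ValidationError(PyStatisticsError):
    """Stand-in: input validation failed."""


class DimensionError(ValidationError):
    """Stand-in: array dimensions are incorrect or inconsistent."""


def check_same_length(X, y):
    # Hypothetical validator: the number of rows in X must match len(y).
    if len(X) != len(y):
        raise DimensionError(f"X has {len(X)} rows but y has {len(y)} entries")


caught_as = None
try:
    check_same_length([[1, 2], [3, 4], [5, 6]], [0, 1])
except ValidationError as exc:  # the parent handler also catches DimensionError
    caught_as = type(exc).__name__

print(caught_as)
```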
- exception pystatistics.core.NumericalError
Bases: PyStatisticsError
Numerical computation failed.
Base class for errors arising from numerical issues during computation.
- exception pystatistics.core.SingularMatrixError(message, matrix_name=None, condition_number=None, rank=None, expected_rank=None)
Bases: NumericalError
Matrix is singular or nearly singular.
Raised when a matrix operation requires invertibility but the matrix is singular or numerically rank-deficient.
- Parameters:
- matrix_name
Name or description of the problematic matrix
- condition_number
Estimated condition number, if available
- rank
Numerical rank, if computed
- expected_rank
Expected rank (typically min(n, p))
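The diagnostic attributes let a caller report why an inversion failed. The following is a sketch under stated assumptions: the exception class is a local stand-in that derives from Exception rather than NumericalError, and `invert_2x2` is a hypothetical pure-Python helper, not a library routine.

```python
class SingularMatrixError(Exception):
    """Stand-in carrying the documented diagnostic attributes."""

    def __init__(self, message, matrix_name=None, condition_number=None,
                 rank=None, expected_rank=None):
        super().__init__(message)
        self.matrix_name = matrix_name
        self.condition_number = condition_number
        self.rank = rank
        self.expected_rank = expected_rank


def invert_2x2(m, name="matrix", tol=1e-12):
    # Hypothetical helper: closed-form inverse of a 2x2 matrix.
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) <= tol:
        # Under this tolerance a nonzero matrix is treated as rank 1.
        rank = 1 if any(v != 0 for v in (a, b, c, d)) else 0
        raise SingularMatrixError(
            f"{name} is singular (det={det:.3g})",
            matrix_name=name, rank=rank, expected_rank=2,
        )
    return [[d / det, -b / det], [-c / det, a / det]]


diagnostics = None
try:
    # The second row is twice the first, so the determinant is zero.
    invert_2x2([[1.0, 2.0], [2.0, 4.0]], name="X'X")
except SingularMatrixError as exc:
    diagnostics = (exc.matrix_name, exc.rank, exc.expected_rank)

print(diagnostics)
```

Here `condition_number` stays None because the helper never estimates it, which matches the "if available" wording of the attribute description.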
- exception pystatistics.core.NotPositiveDefiniteError(message, matrix_name=None, min_eigenvalue=None)
Bases: NumericalError
Matrix is not positive definite.
Raised when an operation requires a positive definite matrix (e.g., Cholesky decomposition) but the matrix fails this requirement.
- matrix_name
Name or description of the problematic matrix
- min_eigenvalue
Minimum eigenvalue, if computed
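A Cholesky factorization is the typical place where this failure surfaces: a non-positive pivot means the matrix is not positive definite. The sketch below uses a textbook pure-Python factorization and a local stand-in for the exception (derived from Exception rather than NumericalError). The `cholesky` function is hypothetical and says nothing about how PyStatistics detects the condition.

```python
import math


class NotPositiveDefiniteError(Exception):
    """Stand-in carrying the documented diagnostic attributes."""

    def __init__(self, message, matrix_name=None, min_eigenvalue=None):
        super().__init__(message)
        self.matrix_name = matrix_name
        self.min_eigenvalue = min_eigenvalue


def cholesky(a, name="matrix"):
    # Textbook row-by-row factorization: returns lower-triangular L with a = L L^T.
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - s
                if pivot <= 0.0:
                    # min_eigenvalue stays None: this routine never computes it.
                    raise NotPositiveDefiniteError(
                        f"{name} is not positive definite (pivot {i} = {pivot:.3g})",
                        matrix_name=name,
                    )
                L[i][j] = math.sqrt(pivot)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


L = cholesky([[4.0, 2.0], [2.0, 3.0]], name="sigma")

failed_name = None
try:
    cholesky([[1.0, 2.0], [2.0, 1.0]], name="sigma")  # indefinite matrix
except NotPositiveDefiniteError as exc:
    failed_name = exc.matrix_name

print(L[0], failed_name)
```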
- exception pystatistics.core.ConvergenceError(message, iterations, final_change=None, reason=None, threshold=None)
Bases: PyStatisticsError
Iterative algorithm failed to converge.
Raised when an iterative optimization method (EM, Newton-Raphson, IRLS) fails to meet convergence criteria within the maximum number of iterations.
- Parameters:
- iterations
Number of iterations completed
- final_change
Final parameter or objective change
- reason
Why convergence failed (e.g., 'max_iterations', 'diverging')
- threshold
The convergence threshold that was not met
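The attributes above map naturally onto the state of an iteration loop at the moment it gives up. As a sketch only: the exception is a local stand-in deriving from Exception, and `fixed_point` is a hypothetical driver, not one of the library's solvers.

```python
class ConvergenceError(Exception):
    """Stand-in carrying the documented diagnostic attributes."""

    def __init__(self, message, iterations, final_change=None,
                 reason=None, threshold=None):
        super().__init__(message)
        self.iterations = iterations
        self.final_change = final_change
        self.reason = reason
        self.threshold = threshold


def fixed_point(g, x0, tol=1e-10, max_iter=50):
    # Hypothetical driver: iterate x <- g(x) until successive values agree.
    x = x0
    change = None
    for i in range(1, max_iter + 1):
        x_new = g(x)
        change = abs(x_new - x)
        if change < tol:
            return x_new, i
        x = x_new
    raise ConvergenceError(
        f"no convergence after {max_iter} iterations",
        iterations=max_iter, final_change=change,
        reason="max_iterations", threshold=tol,
    )


def newton_sqrt2(x):
    # Newton-Raphson update for the root of x**2 - 2.
    return 0.5 * (x + 2.0 / x)


root, n_iter = fixed_point(newton_sqrt2, 1.0)

err = None
try:
    fixed_point(newton_sqrt2, 1.0, max_iter=3)  # too few iterations for this tolerance
except ConvergenceError as exc:
    err = exc

print(root, err.iterations, err.reason)
```

A caller can inspect `final_change` against `threshold` to decide whether to retry with a larger iteration budget or to treat the run as diverging.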