Multivariate Normal MLE
Maximum likelihood estimation for multivariate normal distributions with missing data. Direct BFGS and EM algorithms. Little’s MCAR test. Missing data pattern analysis.
Multivariate normal maximum likelihood estimation with missing data.
- Public API:
mlest(data, …) -> MVNSolution
- pystatistics.mvnmle.mlest(data_or_design, *, algorithm='direct', backend='auto', method=None, tol=None, max_iter=None, verbose=False)
Maximum likelihood estimation for multivariate normal with missing data.
- Accepts EITHER:
An MVNDesign object
Raw data array or DataFrame (convenience)
- Parameters:
data_or_design (array-like or MVNDesign) – Data matrix with NaN for missing values, or MVNDesign object.
algorithm (str) – Estimation algorithm:
‘direct’ (default): BFGS optimization on the log-likelihood, using R-exact inverse Cholesky parameterization.
‘em’: Expectation-Maximization algorithm. Typically slower to converge, but guarantees a monotone likelihood increase.
backend (str) – Backend selection: ‘auto’, ‘cpu’, ‘gpu’.
method (str or None) – Optimization method for direct algorithm. If None, auto-selected by backend. Ignored for EM.
tol (float or None) – Convergence tolerance. If None, uses algorithm-appropriate default: direct = 1e-5 (gradient tolerance), em = 1e-4 (parameter change).
max_iter (int or None) – Maximum iterations. If None, uses algorithm-appropriate default: direct = 100, em = 1000.
verbose (bool) – Print progress information.
- Return type:
MVNSolution
Examples
>>> from pystatistics.mvnmle import mlest, datasets
>>> result = mlest(datasets.apple)
>>> result_em = mlest(datasets.apple, algorithm='em')
>>> print(result.muhat)
>>> print(result.loglik)
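To make the ‘em’ option above more concrete, here is a minimal NumPy sketch of EM for a multivariate normal with NaN-coded missing values. It is not the library's implementation: the helper names `observed_loglik` and `em_step`, the simulated data, and the starting values are all assumptions chosen for illustration. The sketch also records the observed-data log-likelihood after every iteration, so the monotone increase can be checked directly.

```python
import numpy as np

def observed_loglik(X, mu, sigma):
    """Sum, over rows, the normal log-density of each row's observed entries."""
    total = 0.0
    for row in X:
        obs = ~np.isnan(row)
        diff = row[obs] - mu[obs]
        s_oo = sigma[np.ix_(obs, obs)]
        _, logdet = np.linalg.slogdet(s_oo)
        quad = diff @ np.linalg.solve(s_oo, diff)
        total += -0.5 * (obs.sum() * np.log(2 * np.pi) + logdet + quad)
    return total

def em_step(X, mu, sigma):
    """One EM update of (mu, sigma); assumes every row has an observed entry."""
    n, p = X.shape
    sum_x = np.zeros(p)
    sum_xx = np.zeros((p, p))
    for row in X:
        obs = ~np.isnan(row)
        mis = ~obs
        x_hat = row.copy()
        extra = np.zeros((p, p))  # conditional covariance of the missing block
        if mis.any():
            s_oo = sigma[np.ix_(obs, obs)]
            s_mo = sigma[np.ix_(mis, obs)]
            coef = np.linalg.solve(s_oo, s_mo.T).T  # s_mo @ inv(s_oo)
            # E-step: conditional mean and covariance of missing given observed
            x_hat[mis] = mu[mis] + coef @ (row[obs] - mu[obs])
            extra[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - coef @ s_mo.T
        sum_x += x_hat
        sum_xx += np.outer(x_hat, x_hat) + extra
    # M-step: ML estimates from the expected sufficient statistics
    mu_new = sum_x / n
    sigma_new = sum_xx / n - np.outer(mu_new, mu_new)
    return mu_new, sigma_new

# Simulated data (hypothetical): 3 variables, roughly 20% of cells missing.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.5, 0.2], [0.5, 2.0, 0.3], [0.2, 0.3, 1.5]])
X = rng.multivariate_normal([0.0, 1.0, 2.0], true_cov, size=200)
X[rng.random(X.shape) < 0.2] = np.nan
X = X[~np.isnan(X).all(axis=1)]  # drop rows with nothing observed

# Crude starting values: available-case means and a diagonal covariance.
mu = np.nanmean(X, axis=0)
sigma = np.diag(np.nanvar(X, axis=0))
logliks = [observed_loglik(X, mu, sigma)]
for _ in range(50):
    mu, sigma = em_step(X, mu, sigma)
    logliks.append(observed_loglik(X, mu, sigma))
```

The fixed count of 50 iterations stands in for a real stopping rule; a parameter-change tolerance, as described for `tol`, would normally end the loop earlier.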
- class pystatistics.mvnmle.MVNDesign(_data, _n, _p)
Bases: object
Design for multivariate normal MLE with missing data.
Wraps a data matrix (n observations x p variables) that may contain NaN values representing missing data. Immutable after construction.
- Construction:
MVNDesign.from_array(data)
MVNDesign.from_datasource(ds, columns=['a', 'b', 'c'])
- classmethod from_array(data)
Build MVNDesign from array-like data.
- Parameters:
data (array-like) – 2D data matrix. Can be numpy array, pandas DataFrame, or any array-like with .values attribute.
- Return type:
MVNDesign
- classmethod from_datasource(source, *, columns=None)
Build MVNDesign from a DataSource.
- Parameters:
source (DataSource) – Data source providing columns
columns (list of str, optional) – Column names to include. If None, uses all columns.
- Return type:
MVNDesign
- class pystatistics.mvnmle.MVNSolution(_result, _design)
Bases: object
User-facing MVN MLE results.
Wraps the backend Result and provides convenient accessors for all MVN estimation outputs.
- property correlation_matrix: ndarray[tuple[Any, ...], dtype[floating[Any]]]
Correlation matrix derived from estimated covariance.
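As an illustration of what deriving a correlation matrix from a covariance estimate involves, the sketch below divides each covariance entry by the product of the two corresponding standard deviations. The function name `cov_to_corr` and the example matrix are assumptions for illustration; the property itself may be implemented differently.

```python
import numpy as np

def cov_to_corr(sigma):
    """Scale a covariance matrix so that its diagonal becomes all ones."""
    sd = np.sqrt(np.diag(sigma))      # per-variable standard deviations
    return sigma / np.outer(sd, sd)   # corr[i, j] = sigma[i, j] / (sd[i] * sd[j])

# Hypothetical covariance estimate: variances 4 and 9, covariance 2.
sigma = np.array([[4.0, 2.0],
                  [2.0, 9.0]])
corr = cov_to_corr(sigma)  # off-diagonal entry is 2 / (2 * 3)
```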
- class pystatistics.mvnmle.MVNParams(muhat, sigmahat, loglik, n_iter, converged, gradient_norm=None)
Bases: object
Parameter payload for MVN MLE.
Immutable data computed by backends.
- Parameters:
- pystatistics.mvnmle.analyze_patterns(data)
Analyze missingness patterns in the data.
- Parameters:
data (array-like) – Data matrix with missing values as np.nan. Can be NumPy array or pandas DataFrame.
- Returns:
Patterns sorted by frequency (most common first).
- Return type:
List[PatternInfo]
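The idea of grouping rows by missingness pattern and ordering the groups by frequency can be sketched with NumPy and the standard library alone. This is an assumption-laden illustration rather than the library's code: `missingness_patterns` is a hypothetical helper, and it returns plain `(pattern, count)` tuples instead of `PatternInfo` objects.

```python
from collections import Counter

import numpy as np

def missingness_patterns(data):
    """Return (pattern, n_cases) pairs, most common first.

    A pattern is a tuple of booleans, True where the variable is observed.
    """
    observed = ~np.isnan(np.asarray(data, dtype=float))
    counts = Counter(tuple(row.tolist()) for row in observed)
    return counts.most_common()

# Hypothetical 6 x 3 data matrix with NaN marking missing values.
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
    [np.nan, np.nan, 1.0],
    [2.0, np.nan, 3.0],
    [5.0, 6.0, 7.0],
])
patterns = missingness_patterns(data)
```

Here the fully observed pattern occurs three times, the pattern with only the second variable missing occurs twice, and the pattern with the first two variables missing occurs once.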
- pystatistics.mvnmle.pattern_summary(patterns, data_shape=None)
Generate summary statistics for missingness patterns.
- Parameters:
patterns (List[PatternInfo]) – Pattern information from analyze_patterns()
data_shape (Optional[Tuple[int, int]]) – Original data shape (n_obs, n_vars)
- Return type:
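The kind of summary statistics involved can be sketched directly from the NaN mask of a data matrix. The helper `summarize_missingness` below is hypothetical; its dictionary keys borrow field names from `PatternSummary`, but the exact definitions used by the library (for example, whether the complete-case percentage is on a 0 to 100 scale) are assumptions.

```python
import numpy as np

def summarize_missingness(data):
    """Compute simple missingness statistics from a NaN-coded data matrix."""
    missing = np.isnan(np.asarray(data, dtype=float))
    n_obs, _ = missing.shape
    complete = int((~missing.any(axis=1)).sum())  # rows with no NaN at all
    return {
        "total_cases": n_obs,
        "overall_missing_rate": float(missing.mean()),   # share of missing cells
        "complete_cases": complete,
        "complete_cases_percent": 100.0 * complete / n_obs,
        "variable_missing_rates": missing.mean(axis=0).tolist(),
    }

# Hypothetical 6 x 3 data matrix: 4 of 18 cells missing, 3 complete rows.
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
    [np.nan, np.nan, 1.0],
    [2.0, np.nan, 3.0],
    [5.0, 6.0, 7.0],
])
summary = summarize_missingness(data)
```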
- class pystatistics.mvnmle.PatternInfo(pattern_id, observed_indices, missing_indices, n_cases, data, pattern_vector)
Bases: object
Information about a single missingness pattern.
- Parameters:
- class pystatistics.mvnmle.PatternSummary(n_patterns, total_cases, overall_missing_rate, most_common_pattern, complete_cases, complete_cases_percent, variable_missing_rates)
Bases: object
Summary statistics for all missingness patterns in a dataset.
- Parameters:
- most_common_pattern: PatternInfo
- pystatistics.mvnmle.little_mcar_test(data, alpha=0.05, verbose=False)
Little’s test for Missing Completely at Random (MCAR).
- Parameters:
data (array-like, shape (n_observations, n_variables)) – Data matrix with missing values as np.nan.
alpha (float, default=0.05) – Significance level
verbose (bool, default=False) – Print detailed progress
- Return type:
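For orientation, the statistic behind Little's test can be sketched as follows: for each missingness pattern, the mean of the observed variables is compared with the corresponding sub-vector of the overall mean estimate, weighted by the pattern's case count and the inverse of the matching covariance sub-block. The degrees of freedom are the total number of observed variables across patterns minus the number of variables. The function `little_statistic` is a hypothetical helper, not the library's routine, and the demo uses crude stand-in estimates where the test proper calls for ML estimates of the mean and covariance.

```python
import numpy as np

def little_statistic(X, mu, sigma):
    """Return (d2, df) for Little's MCAR statistic given estimates mu, sigma."""
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    groups = {}  # pattern tuple -> list of row indices
    for i, row in enumerate(observed):
        groups.setdefault(tuple(row.tolist()), []).append(i)
    d2, df = 0.0, -X.shape[1]
    for pattern, rows in groups.items():
        obs = np.array(pattern)
        if not obs.any():
            continue  # a row with nothing observed carries no information
        ybar = X[rows][:, obs].mean(axis=0)   # pattern mean of observed vars
        diff = ybar - mu[obs]
        s_oo = sigma[np.ix_(obs, obs)]
        d2 += len(rows) * float(diff @ np.linalg.solve(s_oo, diff))
        df += int(obs.sum())
    return d2, df

# Simulated data (hypothetical), missing completely at random.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[rng.random(X.shape) < 0.15] = np.nan
X = X[~np.isnan(X).all(axis=1)]

# Stand-in estimates (assumption): available-case means and the
# complete-case covariance, used here only to keep the sketch self-contained.
mu = np.nanmean(X, axis=0)
sigma = np.cov(X[~np.isnan(X).any(axis=1)], rowvar=False)
d2, df = little_statistic(X, mu, sigma)
```

Under MCAR the statistic is referred to a chi-square distribution with `df` degrees of freedom; the p-value would then be compared against `alpha`.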