API Reference
This page contains the API reference for the public classes in pyhanami.
Last update: Apr 19, 2026
- class pyhanami.DataDiagnostics(datasets=None)[source]
Bases: object
Perform diagnostic comparisons between climate simulation ensembles.
This class provides functionality for computing and visualizing differences in climate variables between simulation ensembles. It includes methods for computing annual time series, absolute differences, effect sizes, and significant differences at the grid-point level.
- Parameters:
datasets (SimulationData or Iterable[SimulationData], optional) – Ensemble or list of ensembles containing simulation data and metadata.
- datasets
List of ensembles containing simulation data and metadata.
- Type:
list[SimulationData]
- variables
Configuration dictionary mapping variable names to display metadata.
- Type:
dict
- max_workers_grid
Number of parallel workers used for grid-level computations.
- Type:
int
- add_datasets(datasets)[source]
Add new datasets to the DataDiagnostics object.
- Parameters:
datasets (SimulationData or Iterable[SimulationData]) – Ensemble or list of ensembles containing simulation data and metadata to add.
- time_series_plot(var_name, data_names=None, output_path=None, obs=False, obs_paths=None, obs_names=None, time_freq='annual', start_year=None, end_year=None, plot_ens=False)[source]
Generate time series plot for the given datasets and variable for the selected period and time frequency.
- Parameters:
var_name (str) – Climate variable name.
data_names (str or list[str], optional) – Name or list of names of simulation ensembles to plot. If None, all datasets in the DataDiagnostics object are used.
output_path (str, optional) – Path to save the time series plot.
obs (bool) – If True, also plot observational data if available (default: False).
obs_paths (str or list[str], optional) – Path(s) to the observations database(s).
obs_names (str or list[str], optional) – Name(s) of the observational dataset(s).
time_freq (str) – Resampling frequency (default: ‘annual’).
start_year (int) – Start year to plot.
end_year (int) – End year to plot.
plot_ens (bool) – Whether to plot the trajectories of individual ensemble members (default: False).
- abs_diff_plot(var_name, data_names=None, output_path=None, start_year=None, end_year=None, clon=0)[source]
Generate absolute difference plot for the given datasets and variable.
- Parameters:
var_name (str) – Climate variable name.
data_names (list[str], optional) – List of names of two simulation ensembles to compare. If None, the first two datasets in the diagnostics object are used.
output_path (str, optional) – Path to save the spatial plots.
start_year (int) – Start year to plot.
end_year (int) – End year to plot.
clon (int) – Central longitude for the spatial map (default: 0).
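The `clon` argument sets the longitude at the centre of the map. This reference does not describe how pyhanami applies it internally, so the sketch below only illustrates the usual convention: longitudes are wrapped into the 360° window centred on `clon`. The helper `recenter_longitude` is hypothetical and not part of pyhanami.

```python
def recenter_longitude(lon, clon=0):
    """Wrap a longitude (degrees east) into [clon - 180, clon + 180).

    Hypothetical helper for illustration; not part of pyhanami.
    """
    return (lon - clon + 180) % 360 + clon - 180


# With clon=0 the map spans [-180, 180); with clon=180 it spans [0, 360).
print(recenter_longitude(350, clon=0))    # -10
print(recenter_longitude(-10, clon=180))  # 350
```

A value of `clon=180` is the common choice when the region of interest straddles the date line, for example the Pacific basin.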
- eff_size_plot(var_name, data_names=None, output_path=None, start_year=None, end_year=None, clon=0, alpha=0.05, stat=scipy.stats.ttest_ind)[source]
Generate effect size plot for the given datasets and variable marking grid points with statistically significant differences.
- Parameters:
var_name (str) – Climate variable name.
data_names (list[str], optional) – List of names of two simulation ensembles to compare. If None, the first two datasets in the diagnostics object are used.
output_path (str, optional) – Path to save the spatial plots.
start_year (int) – Start year to plot.
end_year (int) – End year to plot.
clon (int) – Central longitude for the spatial map (default: 0).
alpha (float) – Significance level for the statistical test (default: 0.05).
stat (Callable) – Statistical test function to use for significance testing (default: ttest_ind).
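This reference does not state which effect size statistic `eff_size_plot` computes. As an illustration only, the sketch below assumes Cohen's d with a pooled standard deviation, evaluated at a single grid point from made-up values. The function `cohens_d` is a hypothetical stand-in, not pyhanami code.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled standard deviation.

    Assumed effect size definition; the library may use a different one.
    """
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Made-up values at one grid point for two four-member ensembles.
ens_a = [15.0, 16.0, 17.0, 18.0]
ens_b = [14.0, 15.0, 16.0, 17.0]
d = cohens_d(ens_a, ens_b)
print(round(d, 2))  # roughly 0.77
```

The effect size measures how large the difference is relative to the spread. Whether a grid point is marked as significant is decided separately, by the `stat` test and the `alpha` level.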
- bias_plot(var_name, data_name=None, output_path=None, obs_path=None, obs_name=None, start_year=None, end_year=None, clon=0)[source]
Generate bias plot for the given dataset and variable comparing with observations.
- Parameters:
var_name (str) – Climate variable name.
data_name (str, optional) – Name of the simulation ensemble to plot. If None, the first dataset in the DataDiagnostics object is used.
output_path (str, optional) – Path to save the spatial plot.
obs_path (str) – Path to the observations database.
obs_name (str) – Name of the observational dataset.
start_year (int) – Start year to plot.
end_year (int) – End year to plot.
clon (int) – Central longitude for the spatial map (default: 0).
- class pyhanami.ObservationData(data_path, sim, name='obs', realization=0, regrid_method='bilinear')[source]
Bases: object
Retrieves and processes observational datasets for evaluation of simulations.
This class interfaces with an external observational data source to retrieve datasets that match the variables and time period of a given simulation dataset. Retrieved data are then regridded to match the spatial resolution of the input simulation data.
- Parameters:
data_path (str) – Path to an observations database.
sim (xr.Dataset) – Input simulation dataset.
name (str) – Name of the observations instance (default: obs).
realization (int) – Realization number to select from the observations dataset if more than one member is present (default: 0).
regrid_method (str) – Regridding method (default: bilinear).
- data_path
Path to the observations database.
- Type:
Path
- data
Processed observational data, regridded to match the input simulation.
- Type:
xr.Dataset
- name
Name of the observations instance (default: obs).
- Type:
str
- realization
Realization number to select from the observations dataset if more than one member is present (default: 0).
- Type:
int
- regrid_method
Regridding method (default: bilinear).
- Type:
str
- load_and_process(sim)[source]
Retrieve and regrid observational data for the variables and period available in the given simulation ensemble.
- Parameters:
sim (xr.Dataset) – Input simulation dataset.
- Returns:
data_new_grid – Regridded observational dataset matching the input simulation.
- Return type:
xr.Dataset
- class pyhanami.ReplicabilityTest(datasets=None, obs_path=None, alpha=0.05)[source]
Bases: object
Perform replicability test between two climate simulation ensembles.
This class compares two climate simulation ensembles using a variety of metrics and statistical tests to assess whether the two climates differ significantly. The test is conducted over multiple variables, regions, seasons, and ensemble members. It also supports plotting results and generating summary reports.
- Parameters:
datasets (Iterable[SimulationData], optional) – Ensemble or list of ensembles containing simulation data and metadata.
obs_path (str) – Path to the observations database.
alpha (float) – Significance level for the statistical tests (default: 0.05).
- datasets
List of ensembles containing simulation data and metadata.
- Type:
list[SimulationData]
- obs_path
Path to the observations database.
- Type:
str
- obs
Instance containing observational data for comparison.
- Type:
- variables
Configuration dictionary mapping variable names to display metadata.
- Type:
dict
- alpha
Significance level for the statistical tests.
- Type:
float
- max_workers_grid
Number of parallel workers used for variable-wise computations.
- Type:
int
- metrics
List of metrics with names and corresponding functions to compute scores.
- Type:
list of dict
- tests
Dictionary of statistical tests for comparing score distributions.
- Type:
dict
- seasons
List of seasons to compute scores over.
- Type:
list of str
- regions
Dictionary mapping region names to latitude bounds.
- Type:
dict
- eff_sizes
Dictionary to store effect sizes between the replicability test scores for each pair of datasets.
- Type:
dict
- test_results
Dictionary to store results of the replicability test for each pair of datasets.
- Type:
dict
- add_datasets(datasets)[source]
Add new datasets to the ReplicabilityTest object.
- Parameters:
datasets (SimulationData or Iterable[SimulationData]) – Ensemble or list of ensembles containing simulation data and metadata to add.
- perform_rep_test(data_names=None)[source]
Perform replicability test comparing the given simulation ensembles.
- Parameters:
data_names (list[str], optional) – List of names of two simulation ensembles to compare. If None, the first two datasets in the ReplicabilityTest object are used.
- get_eff_sizes(data_names)[source]
Return precomputed effect sizes between the replicability test scores for the given simulation ensembles.
- Parameters:
data_names (list[str]) – List of names of two simulation ensembles to compare.
- Returns:
eff_sizes – Effect sizes for all variables, seasons, regions and metrics.
- Return type:
xr.DataArray
- get_test_results(data_names)[source]
Return replicability test results for the given simulation ensembles.
- Parameters:
data_names (list[str]) – List of names of two simulation ensembles to compare.
- Returns:
test_results – Results of the replicability test for all variables, seasons, regions and tests.
- Return type:
xr.DataArray
- save_data(data_names, output_path)[source]
Save the computed effect sizes between the replicability test scores, together with the test results, to NetCDF files.
- Parameters:
data_names (list[str]) – List of names of two simulation ensembles to compare.
output_path (str) – Path to save the data files.
- matrix_plot(data_names, output_path=None)[source]
Generate matrix plot with effect sizes and replicability test results.
- Parameters:
data_names (list[str]) – List of names of two simulation ensembles to compare.
output_path (str, optional) – Path to save the matrix plot.
- report(output_path, time_series=False, spatial=False)[source]
Generate a summary report with the results of the replicability test and the selected plots.
- Parameters:
output_path (str) – Path to save the report.
time_series (bool) – Whether to include time series plots in the report (default: False).
spatial (bool) – Whether to include spatial plots in the report (default: False).
- class pyhanami.ScientificEvaluation(datasets=None)[source]
Bases: object
Compute and plot scores for scientific model skill evaluation.
This class provides functionality for computing and visualizing metrics that evaluate how well a model reproduces several phenomena. Currently, it includes methods for bimodal ISO indices.
- Parameters:
datasets (SimulationData or Iterable[SimulationData], optional) – Ensemble or list of ensembles containing simulation data and metadata.
- datasets
List of ensembles containing simulation data and metadata.
- Type:
list[SimulationData]
- variables
Configuration dictionary mapping variable names to display metadata.
- Type:
dict
- add_datasets(datasets)[source]
Add new datasets to the ScientificEvaluation object.
- Parameters:
datasets (SimulationData or Iterable[SimulationData]) – Ensemble or list of ensembles containing simulation data and metadata to add.
- compute_general_scores(var_names=None, data_name=None, obs_name=None, obs_path=None, start_year=None, end_year=None)[source]
Initialize and compute general model skill evaluation scores for a selected dataset.
- Parameters:
var_names (str or list[str], optional) – Climate variable(s) name(s). If None, all variables in the simulated dataset will be used.
data_name (str, optional) – Name of simulation ensemble to use. If None, the first dataset in the ScientificEvaluation object is used.
obs_name (str) – Name of the observational dataset to compare to (default: config_params.GEN_OBS_NAME).
obs_path (str) – Path to the observations database (default: config_params.GEN_OBS_PATH).
start_year (int) – First year of the period over which the general scores are computed.
end_year (int) – Last year of the period over which the general scores are computed.
- Returns:
general_analysis – GeneralEvaluation object containing the computed general scientific skill scalar scores.
- Return type:
GeneralEvaluation
- compute_iso_scores(data_name=None, start_year_eeof=None, end_year_eeof=None, start_year_pc=None, end_year_pc=None, obs=False, obs_path=None, correct_pc=False, iso_config=None)[source]
Initialize and compute bimodal ISO indices (following K. Kikuchi, 2020) and derive scalar scores (following M. Nakano et al., 2019) for a selected dataset.
- Parameters:
data_name (str, optional) – Name of simulation ensemble to use. If None, the first dataset in the ScientificEvaluation object is used.
start_year_eeof (int) – First year of the period used for the Extended Empirical Orthogonal Function (EEOF) analysis (not needed if obs=True).
end_year_eeof (int) – Last year of the period used for the EEOF analysis (not needed if obs=True).
start_year_pc (int) – First year of the period for which the Principal Components (PCs) are computed.
end_year_pc (int) – Last year of the period for which the PCs are computed.
obs (bool) – If True, use EEOFs from observational data (default: False).
obs_path (str) – Path to the observational NOAA data file. As of now, only necessary if the resolution of the NOAA data (2.5°x2.5°) is higher than that of the simulation data.
correct_pc (bool) – Whether to adjust simulated PCs by dividing by alpha (default: False).
iso_config (ISOConfig) – Configuration dataclass with parameters necessary for the ISO evaluation. If None, default values from the configuration file pyhanami.config.scientific_evaluation_parameters.yaml will be used.
- Returns:
iso_analysis – ISOEvaluation object containing the computed bimodal ISO indices and related scalar scores.
- Return type:
ISOEvaluation
- compute_mjo_scores(data_name=None, obs_path=None, start_year_mjo=None, end_year_mjo=None, start_year_ref=None, end_year_ref=None, threshold_active_days=None, mjo_config=None, mjo_vars=['ua850', 'ua200', 'rlut'])[source]
Initialize and compute Real-Time Multivariate MJO (RMM) indices following (M.C. Wheeler & H.H. Hendon, 2004) and MJO wavenumber-frequency power spectra following (M.C. Wheeler & G.N. Kiladis, 1999) and derived scalar scores following (M.-S. Ahn et al., 2017) for a selected dataset.
- Parameters:
data_name (str, optional) – Name of simulation ensemble to use. If None, the first dataset in the ScientificEvaluation object is used.
obs_path (str) – Path to the observational data file with the necessary variables for the MJO analysis.
start_year_mjo (int) – First year of the period over which the MJO analysis is performed.
end_year_mjo (int) – Last year of the period over which the MJO analysis is performed.
start_year_ref (int) – First year of the period used to compute the reference seasonal cycle. If None, the first year of the whole MJO analysis is used.
end_year_ref (int) – Last year of the period used to compute the reference seasonal cycle. If None, the last year of the whole MJO analysis is used.
threshold_active_days (float) – Threshold for the amplitude of the first two PCs to consider the MJO active at a given day. If None, the mean MJO amplitude over the entire period is used as a threshold.
mjo_config (MJOConfig) – Configuration dataclass with parameters necessary for the MJO evaluation. If None, default values from the configuration file pyhanami.config.scientific_evaluation_parameters.yaml will be used.
mjo_vars (list[str]) – Variables to be used for the MJO analysis (default: [‘ua850’, ‘ua200’, ‘rlut’]).
- Returns:
mjo_analysis – MJOEvaluation object containing the computed RMM MJO indices, power spectra and scalar scores.
- Return type:
MJOEvaluation
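The `threshold_active_days` rule can be sketched as follows. The amplitude of the first two PCs is taken as sqrt(PC1² + PC2²), and when no threshold is given the mean amplitude over the whole period is used, as described above. The function name `active_days`, the strict `>` comparison, and the input values are assumptions made for illustration; this is not pyhanami code.

```python
from math import hypot
from statistics import mean


def active_days(pc1, pc2, threshold=None):
    """Return the daily amplitude and a per-day 'MJO active' flag.

    Hypothetical helper. When threshold is None, the mean amplitude over
    the whole series is used. The strict '>' comparison is an assumption.
    """
    amplitude = [hypot(x, y) for x, y in zip(pc1, pc2)]
    if threshold is None:
        threshold = mean(amplitude)
    return amplitude, [a > threshold for a in amplitude]


# Made-up PC values for four days.
pc1 = [3.0, 0.0, 0.6, 0.0]
pc2 = [4.0, 1.0, 0.8, 0.0]
amplitude, active = active_days(pc1, pc2)
print(active)  # [True, False, False, False]
```

With the default threshold, only days whose amplitude exceeds the period mean count as active, so the number of active days depends on the analysis period chosen with `start_year_mjo` and `end_year_mjo`.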
- compute_tc_scores(data_name=None, start_year_tc=None, end_year_tc=None, obs=True, wind_factor=1.0, min_wind=10, basin=-1, bin_size=2.5, tc_config=None)[source]
Compute Tropical Cyclones (TCs) metrics and derive scalar scores following (C.M. Zarzycki et al., 2021) and plot results.
- Parameters:
data_name (str, optional) – Name of simulation ensemble to use. If None, the first dataset in the ScientificEvaluation object is used.
start_year_tc (int, optional) – First year of the period for which the TC metrics are computed.
end_year_tc (int, optional) – Last year of the period for which the TC metrics are computed.
obs (bool) – If True, include observational data if available (default: True).
wind_factor (float) – Wind speed correction factor (to normalize the provided wind to 10 m wind) for simulations (default: 1.0).
min_wind (float) – Minimum 10 m wind speed in m/s for TCs detection (default: 10.0).
basin (int) – Basin/hemisphere to consider for the analysis (default: -1). Codes are:
<0 → GLOB (Global domain)
1 → NATL (North Atlantic)
2 → EPAC (Eastern Pacific)
3 → CPAC (Central Pacific)
4 → WPAC (Western Pacific)
5 → NIO (North Indian Ocean)
6 → SIO (South Indian Ocean)
7 → SPAC (South Pacific)
8 → SATL (South Atlantic)
9 → FLA (Florida)
20 → NHEMI (Northern Hemisphere)
21 → SHEMI (Southern Hemisphere)
otherwise → NONE (unrecognized)
bin_size (float) – Size of the bins in degrees for computing the TCs metrics with CyMeP (default: 2.5).
tc_config (TCConfig) – Configuration dataclass with parameters necessary for the TC evaluation. If None, default values from the configuration file pyhanami.config.scientific_evaluation_parameters.yaml will be used.
- Returns:
tc_analysis – TCEvaluation object containing the computed TCs metrics and scalar scores.
- Return type:
TCEvaluation
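The `basin` codes listed above can be restated as a small lookup, which is convenient when labelling results. `BASIN_CODES` and `basin_name` are hypothetical names used only for this sketch; they are not part of pyhanami.

```python
# Codes copied from the table in the compute_tc_scores parameter list.
BASIN_CODES = {
    1: "NATL",    # North Atlantic
    2: "EPAC",    # Eastern Pacific
    3: "CPAC",    # Central Pacific
    4: "WPAC",    # Western Pacific
    5: "NIO",     # North Indian Ocean
    6: "SIO",     # South Indian Ocean
    7: "SPAC",    # South Pacific
    8: "SATL",    # South Atlantic
    9: "FLA",     # Florida
    20: "NHEMI",  # Northern Hemisphere
    21: "SHEMI",  # Southern Hemisphere
}


def basin_name(code):
    """Map a basin code to its short name, following the documented table."""
    if code < 0:
        return "GLOB"  # any negative code selects the global domain
    return BASIN_CODES.get(code, "NONE")  # unrecognized codes


print(basin_name(-1))  # GLOB
print(basin_name(4))   # WPAC
print(basin_name(0))   # NONE
```

Note that, read literally, the table maps a code of 0 to NONE: only negative values select the global domain.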
- class pyhanami.SimulationData(data_source, name='sim')[source]
Bases: object
Loads and processes climate simulation data from a NetCDF file or a catalogue interface.
This class provides functionality to read input data, perform validation, and store metadata such as the simulation name and file path.
- Parameters:
data_source (str or Path or xr.Dataset) – Path to a dataset file or catalogue interface, or an already loaded xarray.Dataset object.
name (str) – Name of the simulation instance (default: ‘sim’).
- data_path
Path to the dataset file or catalogue interface if provided; None if dataset was passed directly.
- Type:
Path or None
- name
Name of the simulation instance.
- Type:
str
- data
Loaded dataset object with climate variables.
- Type:
xr.Dataset
- class pyhanami.ISOConfig(lat_range: tuple = (-30, 30), window_size: int = 141, low_freq: float = 0.011111111111111112, high_freq: float = 0.04, lag: int = 5, n_lags: int = 3, n_modes: int = 2)
Bases: object
- high_freq: float = 0.04
- lag: int = 5
- lat_range: tuple = (-30, 30)
- low_freq: float = 0.011111111111111112
- n_lags: int = 3
- n_modes: int = 2
- window_size: int = 141
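The default `low_freq` and `high_freq` are easier to read as periods: 0.0111… is 1/90 and 0.04 is 1/25. If the frequencies are in cycles per day (the unit is not stated in this reference, so this is an assumption), the defaults describe a 25 to 90 day band. The sketch below uses a local stand-in dataclass, `ISOConfigSketch`, that copies the documented defaults so that it runs without pyhanami installed.

```python
from dataclasses import dataclass, replace


@dataclass
class ISOConfigSketch:
    """Local stand-in mirroring the documented ISOConfig defaults."""
    lat_range: tuple = (-30, 30)
    window_size: int = 141
    low_freq: float = 1 / 90
    high_freq: float = 0.04
    lag: int = 5
    n_lags: int = 3
    n_modes: int = 2


cfg = ISOConfigSketch()

# Convert the band limits from frequency to period.
longest_period = 1 / cfg.low_freq    # about 90
shortest_period = 1 / cfg.high_freq  # about 25

# Override a single field while keeping the remaining defaults.
narrow = replace(cfg, lat_range=(-20, 20))
print(narrow.lat_range, narrow.window_size)
```

Since `iso_config` is described as a configuration dataclass, building one with only the fields that differ from the defaults and passing it to `compute_iso_scores` is presumably the intended way to customise the ISO evaluation.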
- class pyhanami.MJOConfig(lat_range: tuple = (-15, 15), rolling_window_size: int = 120, n_harmonics: int = 3, normalize_std: bool = False, n_modes: int = 2, seg_size: int = 96, n_overlap: int = 60, mjo_freq_bounds: tuple = (0.0125, 0.03333333333333333), mjo_wavenum_bounds: tuple = (0, 4))
Bases: object
- lat_range: tuple = (-15, 15)
- mjo_freq_bounds: tuple = (0.0125, 0.03333333333333333)
- mjo_wavenum_bounds: tuple = (0, 4)
- n_harmonics: int = 3
- n_modes: int = 2
- n_overlap: int = 60
- normalize_std: bool = False
- rolling_window_size: int = 120
- seg_size: int = 96
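Two of the defaults above are linked by simple arithmetic. With `seg_size=96` and `n_overlap=60`, consecutive segments start 36 samples apart, and the default `mjo_freq_bounds` of (1/80, 1/30) corresponds to periods of 30 to 80 samples. The sketch assumes daily data and a standard overlapping-segment scheme; the helper `segment_starts` is hypothetical, and this reference does not confirm how pyhanami splits the series.

```python
def segment_starts(n_samples, seg_size=96, n_overlap=60):
    """Start indices of overlapping segments that fit fully in the series.

    Hypothetical helper for illustration; not part of pyhanami.
    """
    step = seg_size - n_overlap
    return list(range(0, n_samples - seg_size + 1, step))


starts = segment_starts(365)
print(starts)  # [0, 36, 72, 108, 144, 180, 216, 252]

# Periods corresponding to the default mjo_freq_bounds.
freq_lo, freq_hi = 0.0125, 0.03333333333333333
periods = (1 / freq_hi, 1 / freq_lo)  # about (30, 80)
```

Under these assumptions, one year of daily data yields eight 96-day segments. A series shorter than `seg_size` yields none.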
- class pyhanami.TCConfig(psl_delta: float = 200.0, psl_dist: float = 5.5, z_delta: float = -6.0, z_dist: float = 6.5, z_offset: float = 1.0, merge_dist: float = 6.0, traj_range: float = 8.0, traj_min_length: int = 10, traj_max_gap: int = 3, min_len: int = 10, max_lat: float = 50.0, max_topo: float = 1500.0, truncate_years: bool = True, do_defineMIbypres: bool = False, do_fill_missing_pw: bool = True, do_special_filter_obs: bool = False, threshold_ace_wind: float = -1.0, threshold_pace_pres: float = -100.0)
Bases: object
- do_defineMIbypres: bool = False
- do_fill_missing_pw: bool = True
- do_special_filter_obs: bool = False
- max_lat: float = 50.0
- max_topo: float = 1500.0
- merge_dist: float = 6.0
- min_len: int = 10
- psl_delta: float = 200.0
- psl_dist: float = 5.5
- threshold_ace_wind: float = -1.0
- threshold_pace_pres: float = -100.0
- traj_max_gap: int = 3
- traj_min_length: int = 10
- traj_range: float = 8.0
- truncate_years: bool = True
- z_delta: float = -6.0
- z_dist: float = 6.5
- z_offset: float = 1.0