ise.data
Data loading, processing, and feature engineering for ice sheet emulation.
This package covers the full data pipeline:
ForcingFile / GridFile: load and sector-average climate forcing and grid NetCDF files.
ISEFlowAISInputs / ISEFlowGrISInputs: validated input dataclasses for running pretrained ISEFlow emulators.
FeatureEngineer: train/val/test splitting, scaling, lag variables, outlier handling, and ISM characteristic merging.
ProjectionProcessor / DatasetMerger: IVAF calculation from raw ISMIP6 NetCDF outputs and sector-level forcing/projection merging.
EmulatorDataset / PyTorchDataset / TSDataset / ScenarioDataset: PyTorch Dataset subclasses for LSTM and normalizing-flow training.
StandardScaler / RobustScaler / LogScaler: GPU-compatible nn.Module scalers for use in training loops.
Submodules
ise.data.anomaly
Climatology-based anomaly conversion for ISEFlow inputs.
The ISEFlow models are trained on forcing anomalies (departures from a
historical baseline), not raw absolute values. This module provides
AnomalyConverter — a lightweight class that looks up the pre-extracted
ISMIP6 climatological baselines (stored in data_files/) and subtracts
them from user-supplied raw time-series arrays to produce the anomaly arrays
expected by ISEFlowAISInputs and ISEFlowGrISInputs.
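As a rough illustration of the anomaly definition above, the sketch below subtracts a scalar baseline from a raw annual series with NumPy. It is not the AnomalyConverter implementation; the helper name to_anomaly and the sample values are hypothetical.

```python
import numpy as np


def to_anomaly(raw, baseline):
    """Subtract a scalar climatological baseline from a raw annual series."""
    raw = np.asarray(raw, dtype=float)
    return raw - baseline


# Hypothetical raw surface temperature series (K), one value per year 2015-2100.
ts_raw = np.linspace(253.0, 257.0, 86)
ts_baseline = 253.7  # illustrative baseline mean for one sector, in K

ts_anomaly = to_anomaly(ts_raw, ts_baseline)  # same units as the input (K)
```

The output keeps the length and units of the input, which is the behaviour described for the AIS variables.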
Supported ice sheets
- AIS:
  Atmospheric variables pr, evspsbl, smb, ts → anomalies. All inputs are in kg m⁻² s⁻¹ (pr / evspsbl / smb / mrro) or K (ts), matching the ISMIP6 atmospheric forcing file convention. The baseline is the 1995-2014 spatial mean over each AIS sector (AIS_atmos_climatologies.csv). Anomaly outputs retain the same units as the inputs.
- GrIS:
  Atmospheric variables smb, st → anomalies. Raw inputs are expected in mm w.e. yr⁻¹ (smb) and °C (st), matching the MAR 3.9 Reference file convention (1960-1989 long-term mean, GrIS_atmos_climatologies.csv). The output aSMB anomaly is automatically converted to kg m⁻² s⁻¹, the units used in the ISMIP6 aSMB forcing files and in the ISEFlow training data. aST is returned in °C.
- Variables that are not anomalies (passed through unchanged):
  - AIS: ocean_thermal_forcing (°C), ocean_salinity (PSU), ocean_temperature (°C)
  - GrIS: ocean_thermal_forcing (°C), basin_runoff (m yr⁻¹)
Usage — AIS
With a bundled ISMIP6 climatology:
converter = AnomalyConverter("AIS")
anomalies = converter.compute_ais(
aogcm="noresm1-m_rcp85",
sector=10,
pr=pr_array, # kg m⁻² s⁻¹
evspsbl=evspsbl_array, # kg m⁻² s⁻¹
smb=smb_array, # kg m⁻² s⁻¹
ts=ts_array, # K
)
# anomalies = {"pr_anomaly": ..., # kg m⁻² s⁻¹
# "evspsbl_anomaly": ..., # kg m⁻² s⁻¹
# "smb_anomaly": ..., # kg m⁻² s⁻¹
# "ts_anomaly": ...} # K
With a user-supplied climatology (e.g. a new CMIP model not in ISMIP6):
converter = AnomalyConverter("AIS")
anomalies = converter.compute_ais(
sector=10,
pr=pr_array,
evspsbl=evspsbl_array,
smb=smb_array,
ts=ts_array,
custom_climatology={ # 1995-2014 absolute means, same units as inputs
"pr": 1.3e-5, # kg m⁻² s⁻¹
"evspsbl": 4e-6, # kg m⁻² s⁻¹
"smb": 9e-6, # kg m⁻² s⁻¹
"ts": 253.7, # K
},
)
Usage — GrIS
With a bundled ISMIP6 climatology:
converter = AnomalyConverter("GrIS")
anomalies = converter.compute_gris(
aogcm="hadgem2-es_rcp85",
sector=1,
smb=smb_array, # absolute SMB in mm w.e. yr⁻¹ (MAR Reference units)
st=st_array, # absolute surface temperature in °C (MAR Reference units)
)
# anomalies = {"aSMB": ..., # SMB anomaly in kg m⁻² s⁻¹ (model training units)
# "aST": ...} # surface temperature anomaly in °C
With a user-supplied climatology:
converter = AnomalyConverter("GrIS")
anomalies = converter.compute_gris(
sector=1,
smb=smb_array,
st=st_array,
custom_climatology={ # 1960-1989 MAR absolute baseline means
"smb": -241.2, # mm w.e. yr⁻¹
"st": -22.8, # °C
},
)
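The GrIS unit conversion can be sketched as follows. Since 1 mm of water equivalent corresponds to 1 kg m⁻², converting mm w.e. yr⁻¹ to kg m⁻² s⁻¹ is a division by the number of seconds in a year. The 365-day year used here is an assumption, as is the helper name smb_anomaly_si; the constant used inside AnomalyConverter may differ.

```python
import numpy as np

# Assumption: a 365-day year. The library's own constant is not documented here.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def smb_anomaly_si(smb_mm_we_yr, baseline_mm_we_yr, seconds_per_year=SECONDS_PER_YEAR):
    """Return the SMB anomaly in kg m^-2 s^-1 from inputs in mm w.e. yr^-1."""
    anomaly_mm_yr = np.asarray(smb_mm_we_yr, dtype=float) - baseline_mm_we_yr
    # 1 mm w.e. == 1 kg m^-2, so only the time unit needs converting.
    return anomaly_mm_yr / seconds_per_year


# Hypothetical series sitting 31.536 mm w.e. yr^-1 below the baseline.
smb_raw = np.full(86, -272.736)
aSMB = smb_anomaly_si(smb_raw, baseline_mm_we_yr=-241.2)
```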
- class ise.data.anomaly.AnomalyConverter(ice_sheet: str)[source]
Bases: object
Convert raw absolute forcing arrays to anomalies using ISMIP6 climatologies.
- Parameters:
ice_sheet (str) – 'AIS' or 'GrIS'.
- ice_sheet
- Type:
str
- climatology
The loaded climatology table for the selected ice sheet.
- Type:
pd.DataFrame
- property climatology: DataFrame
Return the climatology DataFrame, loading it on first access.
- compute_ais(sector: int, pr: ndarray, evspsbl: ndarray, smb: ndarray, ts: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, mrro: ndarray | None = None) dict[source]
Compute AIS atmospheric anomalies from raw annual time-series arrays.
Subtracts the 1995-2014 ISMIP6 climatological baseline for the given AOGCM and sector from each raw input array. All anomaly outputs retain the same units as the corresponding inputs.
Exactly one of aogcm (use bundled ISMIP6 climatology) or custom_climatology (user-supplied baseline scalars) must be provided.
- Parameters:
sector (int) – AIS drainage sector number (1-18).
pr (np.ndarray) – Raw precipitation time series (86 values, kg m⁻² s⁻¹).
evspsbl (np.ndarray) – Raw evaporation/sublimation time series (86 values, kg m⁻² s⁻¹).
smb (np.ndarray) – Raw surface mass balance time series (86 values, kg m⁻² s⁻¹).
ts (np.ndarray) – Raw surface temperature time series (86 values, K).
aogcm (str, optional) – AOGCM name to look up in the bundled climatology. Common alternate spellings are normalised automatically (e.g. 'NorESM1-M_rcp8.5' → 'noresm1-m_rcp85').
custom_climatology (dict, optional) – User-supplied 1995-2014 absolute baseline means for a CMIP model not in ISMIP6. Must contain keys 'pr' (kg m⁻² s⁻¹), 'evspsbl' (kg m⁻² s⁻¹), 'smb' (kg m⁻² s⁻¹), 'ts' (K), and optionally 'mrro' (kg m⁻² s⁻¹) if mrro is provided.
mrro (np.ndarray, optional) – Raw runoff time series (86 values, kg m⁻² s⁻¹). Required only for ISEFlow v1.0.0; not used by v1.1.0.
- Returns:
Keys 'pr_anomaly', 'evspsbl_anomaly', 'smb_anomaly', 'ts_anomaly' as 86-element numpy arrays. Units match the inputs: kg m⁻² s⁻¹ for pr / evspsbl / smb, K for ts. 'mrro_anomaly' (kg m⁻² s⁻¹) is included when mrro is provided and a baseline is available for the requested AOGCM.
- Return type:
dict
- Raises:
ValueError – If neither or both of aogcm / custom_climatology are given, or if array lengths are not 86.
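The two documented ValueError conditions can be pictured with a small stand-alone check. This is a sketch of the rule only, not the library's code; check_anomaly_args and N_YEARS are hypothetical names.

```python
import numpy as np

N_YEARS = 86  # 2015-2100 inclusive


def check_anomaly_args(arrays, aogcm=None, custom_climatology=None):
    """Raise ValueError unless exactly one baseline source and 86-value arrays are given."""
    # True when both are None or both are set, i.e. not "exactly one".
    if (aogcm is None) == (custom_climatology is None):
        raise ValueError("Provide exactly one of 'aogcm' or 'custom_climatology'.")
    for name, values in arrays.items():
        n = np.atleast_1d(values).shape[0]
        if n != N_YEARS:
            raise ValueError(f"'{name}' has {n} values, expected {N_YEARS}.")


series = {"pr": np.zeros(86), "ts": np.zeros(86)}
check_anomaly_args(series, aogcm="noresm1-m_rcp85")  # passes silently
```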
- compute_gris(sector: int, smb: ndarray, st: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None) dict[source]
Compute GrIS atmospheric anomalies from raw annual time-series arrays.
Subtracts the 1960-1989 MAR long-term mean for the given AOGCM and sector from each raw input array, then converts the SMB anomaly from mm w.e. yr⁻¹ to kg m⁻² s⁻¹ to match the units used in the ISMIP6 aSMB forcing files and in the ISEFlow training data.
Exactly one of aogcm (use bundled ISMIP6 climatology) or custom_climatology (user-supplied baseline scalars) must be provided.
- Parameters:
sector (int) – GrIS drainage basin number (1-6).
smb (np.ndarray) – Raw (absolute) surface mass balance time series (86 values, mm w.e. yr⁻¹, matching the MAR 3.9 Reference file convention). Typical range: −2000 to +200 mm w.e. yr⁻¹ depending on sector. The output aSMB is automatically converted to kg m⁻² s⁻¹.
st (np.ndarray) – Raw (absolute) surface temperature time series (86 values, °C, matching the MAR 3.9 Reference file convention).
aogcm (str, optional) – AOGCM name to look up in the bundled climatology. Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – User-supplied 1960-1989 MAR absolute baseline means for a CMIP model not in ISMIP6. Must contain keys 'smb' (mm w.e. yr⁻¹) and 'st' (°C).
- Returns:
{'aSMB': ..., 'aST': ...} as 86-element numpy arrays. aSMB: SMB anomaly in kg m⁻² s⁻¹, matching the units of the ISMIP6 aSMB forcing files and the ISEFlow training data. aST: surface temperature anomaly in °C.
Variable names match ISEFlowGrISInputs field names.
- Return type:
dict
- Raises:
ValueError – If neither or both of aogcm / custom_climatology are given, or if array lengths are not 86.
- get_climatology(aogcm: str, sector: int) dict[source]
Return the climatological mean values for a given AOGCM and sector.
- Parameters:
aogcm (str) – Canonical AOGCM name (see list_aogcms()). Common alternate spellings are normalised automatically.
sector (int) – Sector / drainage basin number.
- Returns:
Variable name → scalar climatological mean for the baseline period. AIS units: kg m⁻² s⁻¹ (pr / evspsbl / smb / mrro), K (ts). GrIS units: mm w.e. yr⁻¹ (smb), °C (st).
- Return type:
dict
- Raises:
KeyError – If aogcm is not found in the bundled climatology.
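The exact normalisation rules are not listed in this reference. One plausible reading of the documented example ('NorESM1-M_rcp8.5' → 'noresm1-m_rcp85') is lower-casing plus removal of the decimal point in the RCP label. The function below is a hypothetical sketch of that reading, not the library's own routine.

```python
import re


def normalise_aogcm(name):
    """Lower-case an AOGCM name and turn 'rcp8.5' style labels into 'rcp85'."""
    cleaned = name.strip().lower()
    # Drop the decimal point between the two digits of an RCP label.
    return re.sub(r"rcp(\d)\.(\d)", r"rcp\1\2", cleaned)


canonical = normalise_aogcm("NorESM1-M_rcp8.5")
```

Names that are already canonical pass through unchanged under this rule.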
ise.data.forcings
NetCDF climate forcing file loading and sector aggregation.
ForcingFile wraps a single ISMIP6 atmospheric or oceanic forcing NetCDF
and provides a chainable API for loading, cleaning, depth-aggregating, sector
assigning, and spatially averaging the data into the per-sector time series
required by the ISEFlow training pipeline.
Supported ice sheets
- AIS:
  Atmospheric variables pr, evspsbl, smb, ts and oceanic variables thermal_forcing, salinity, temperature.
- GrIS:
  Atmospheric variables aSMB, aST and oceanic variables thermal_forcing, basin_runoff.
Typical workflow
from ise.data.grids import GridFile
from ise.data.forcings import ForcingFile
gridfile = GridFile("AIS", "AIS_sectors_8km.nc")
gridfile.format_grids()
forcing = ForcingFile("AIS", realm="atmos", filepath="pr_AIS_noresm1-m_rcp85.nc")
forcing.load(decode_times=False)
forcing.format_timestamps()
forcing.drop_vars(["lat", "lon", "mapping"])
forcing.assign_sectors(gridfile)
sector_df = forcing.average_over_sector(sector_number=10).to_dataframe()
Ocean realm requires depth aggregation before sector assignment:
ocean = ForcingFile("AIS", realm="ocean", filepath="thermal_forcing.nc",
varname="thermal_forcing")
ocean.load(decode_times=False)
ocean.format_timestamps()
ocean.aggregate_depth(method="mean")
ocean.assign_sectors(gridfile)
tf_df = ocean.average_over_sector(sector_number=10).to_dataframe()
These steps are orchestrated automatically by process_AIS_atmospheric_sectors(),
process_AIS_oceanic_sectors(), and their GrIS counterparts in
ise.data.process.
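To make the sector-averaging step concrete, here is a NumPy-only sketch of averaging a (time, x, y) field over the grid cells of one sector. It assumes an unweighted mean that ignores NaN cells; whether ForcingFile.average_over_sector() applies any weighting is not stated in this reference, and sector_mean is a hypothetical helper.

```python
import numpy as np


def sector_mean(field, sectors, sector_number):
    """Average field (time, x, y) over the cells where sectors (x, y) == sector_number."""
    mask = sectors == sector_number
    if not mask.any():
        raise ValueError(f"No grid cells found for sector {sector_number}.")
    # Boolean indexing on the two spatial axes gives shape (time, n_cells).
    return np.nanmean(field[:, mask], axis=1)


# Tiny illustrative grid: two timesteps on a 2 x 2 grid with two sectors.
field = np.arange(8, dtype=float).reshape(2, 2, 2)
sectors = np.array([[10, 10], [3, 3]])
series_10 = sector_mean(field, sectors, 10)
```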
- class ise.data.forcings.ForcingFile(ice_sheet: str, realm: str, filepath: str, varname: str | None = None)[source]
Bases: object
Wrapper for loading and processing climate forcing NetCDF files.
Supports atmospheric and oceanic realms, sector assignment, depth aggregation (ocean), and sector-averaged time series.
- Parameters:
ice_sheet (str) – Ice sheet identifier (‘AIS’ or ‘GrIS’).
realm (str) – Forcing realm (‘atmos’ or ‘ocean’).
filepath (str) – Path to the NetCDF forcing file.
varname (str, optional) – Name of the data variable. Defaults to None (first data var).
- ice_sheet
Ice sheet identifier.
- Type:
str
- realm
Forcing realm.
- Type:
str
- filepath
Path to the file.
- Type:
str
- data
Loaded dataset after load().
- Type:
xarray.Dataset or None
- sector_averages
Sector-averaged data after average_over_sector().
- Type:
xarray.Dataset or None
- sectors
Sector IDs after assign_sectors().
- Type:
numpy.ndarray or None
- varname
Data variable name.
- Type:
str or None
- aggregate_depth(method='mean')[source]
Aggregate over the depth dimension (ocean realm only).
- Parameters:
method (str) – ‘mean’ or ‘sum’. Defaults to ‘mean’.
- Returns:
The dataset with depth aggregated.
- Return type:
xarray.Dataset
- Raises:
ValueError – If realm is not ‘ocean’, data not loaded, or no ‘z’ dimension.
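The depth aggregation amounts to reducing one axis with a mean or a sum. A minimal NumPy sketch follows, under the assumption that NaN levels are skipped; the helper name reduce_depth and the axis argument are illustrative and not part of ForcingFile.

```python
import numpy as np

_REDUCERS = {"mean": np.nanmean, "sum": np.nansum}


def reduce_depth(values, z_axis, method="mean"):
    """Collapse the depth axis of an array with a NaN-skipping mean or sum."""
    if method not in _REDUCERS:
        raise ValueError("method must be 'mean' or 'sum'")
    return _REDUCERS[method](values, axis=z_axis)


# Two timesteps, two depth levels; the second timestep has one missing level.
thermal_forcing = np.array([[1.0, 3.0], [2.0, np.nan]])
tf_mean = reduce_depth(thermal_forcing, z_axis=1, method="mean")
tf_sum = reduce_depth(thermal_forcing, z_axis=1, method="sum")
```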
- assign_sectors(sectors: ndarray | GridFile) Dataset[source]
Assign sector IDs to the dataset (e.g. from a GridFile).
- Parameters:
sectors (numpy.ndarray or GridFile) – Sector IDs or GridFile to get sectors from.
- Returns:
The dataset with sector coordinate.
- Return type:
xarray.Dataset
- Raises:
ValueError – If data is not loaded.
- average_over_sector(sector_number: int | None = None) Dataset[source]
Average data over grid cells within a sector (or all sectors).
- Parameters:
sector_number (int, optional) – Sector ID to average over. Defaults to None; averaging all sectors at once is not implemented, so passing None raises NotImplementedError.
- Returns:
Sector-averaged data.
- Return type:
xarray.Dataset
- Raises:
ValueError – If data not loaded or sectors not assigned.
NotImplementedError – If sector_number is None (averaging all sectors at once).
- drop_vars(vars: list[str]) Dataset[source]
Drop dimensions or variables from the loaded dataset.
- Parameters:
vars (List[str]) – Names of dimensions or variables to drop.
- Returns:
The dataset (modified in place).
- Return type:
xarray.Dataset
- format_timestamps() Dataset[source]
Convert and subset time coordinate to 2015-2100 (86 years).
- Returns:
The dataset with formatted time.
- Return type:
xarray.Dataset
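The 2015-2100 subsetting can be pictured as a simple year mask. The sketch below works on plain arrays of integer years rather than on an xarray time coordinate, and subset_projection_years is a hypothetical helper.

```python
import numpy as np


def subset_projection_years(years, values, start=2015, end=2100):
    """Keep only the entries whose year lies in [start, end]."""
    years = np.asarray(years)
    keep = (years >= start) & (years <= end)
    return years[keep], np.asarray(values)[keep]


# Hypothetical series covering 1995-2100 (106 annual values).
all_years = np.arange(1995, 2101)
all_values = np.arange(all_years.size)
proj_years, proj_values = subset_projection_years(all_years, all_values)
```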
- load(filepath: str | None = None, validate=True, **kwargs) Dataset[source]
Load the forcing dataset from the NetCDF file.
- Parameters:
filepath (str, optional) – Override path. Defaults to self.filepath.
validate (bool, optional) – Whether to validate (non-NaN data). Defaults to True.
**kwargs – Passed to xarray.open_dataset.
- Returns:
The loaded dataset.
- Return type:
xarray.Dataset
ise.data.grids
NetCDF sector-definition grid file loading and formatting.
GridFile wraps the ice-sheet sector boundary grids used to assign each
spatial grid cell to a drainage sector (AIS: 18 sectors; GrIS: 6 drainage
basins). The sector array it exposes is consumed by ForcingFile.assign_sectors()
during the data processing pipeline.
Grid files expected
- AIS:
  AIS_sectors_8km.nc, sector variable named 'sectors'.
- GrIS:
  GrIS_Basins_Rignot_sectors_5km.nc, sector variable named 'ID'.
Typical workflow
Sector grids need a time dimension that matches the forcing data (86 years)
before they can be broadcast alongside a forcing xarray.Dataset. The
format_grids() convenience method handles the three required steps:
from ise.data.grids import GridFile
gridfile = GridFile("AIS", filepath="AIS_sectors_8km.nc")
gridfile.format_grids() # load → expand time to 86 → align dims
sectors = gridfile.get_sectors() # xr.DataArray of shape (time, x, y)
To perform steps individually (e.g. for a custom time length):
gridfile = GridFile("GrIS", filepath="GrIS_Basins_Rignot_sectors_5km.nc")
gridfile.load()
gridfile.expand_dims(dim="time", size=86)
gridfile.align_dims(dims=["time", "x", "y"])
sectors = gridfile.get_sectors()
In both cases the formatted GridFile can be passed directly to
ForcingFile.assign_sectors(gridfile), and the sector array returned by
get_sectors() can be used as a mask in the sector-level aggregation functions
in ise.data.process.
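Conceptually, expanding a static sector grid along time repeats the same 2-D array 86 times. A NumPy sketch of that idea is shown below; it is independent of GridFile, and the array names are illustrative.

```python
import numpy as np

N_YEARS = 86

# Hypothetical 2 x 2 sector grid with dimensions (x, y).
sector_grid = np.array([[1, 1], [2, 3]])

# Add a leading time axis of length 86 without copying the data.
# np.broadcast_to returns a read-only view; call .copy() if it must be modified.
sectors_t = np.broadcast_to(sector_grid, (N_YEARS, *sector_grid.shape))
```

Every time slice is identical to the original grid, which is what allows it to be aligned with a (time, x, y) forcing array.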
- class ise.data.grids.GridFile(ice_sheet: str, filepath: str)[source]
Bases: object
Wrapper for loading and formatting sector grid NetCDF files.
Used to load sector IDs and optionally expand/align dimensions for compatibility with forcing data (e.g. time dimension of length 86).
- Parameters:
ice_sheet (str) – Ice sheet identifier (‘AIS’ or ‘GrIS’).
filepath (str) – Path to the grid NetCDF file.
- ice_sheet
Ice sheet identifier.
- Type:
str
- filepath
Path to the file.
- Type:
str
- data
Loaded dataset after load().
- Type:
xarray.Dataset or None
- sector_variable_name
Name of the sector variable (‘sectors’ for AIS, ‘ID’ for GrIS).
- Type:
str
- align_dims(dims: list | None = None) Dataset[source]
Transpose dimensions to a standard order.
- Parameters:
dims (list, optional) – Dimension order. If None, uses (‘time’, ‘x’, ‘y’, …).
- Returns:
The dataset with reordered dimensions.
- Return type:
xarray.Dataset
- expand_dims(dim: str = 'time', size: int | None = None) Dataset[source]
Expand dimensions (e.g. add time dimension of given size).
- Parameters:
dim (str, optional) – Dimension name. Defaults to ‘time’.
size (int, optional) – Size of the new dimension. Defaults to None.
- Returns:
The dataset with expanded dimension.
- Return type:
xarray.Dataset
ise.data.inputs
Input dataclasses for ISEFlow-AIS and ISEFlow-GrIS predictions.
This module defines ISEFlowAISInputs and ISEFlowGrISInputs, which
validate, encode, and package the climate forcing arrays and ice sheet model
(ISM) configuration required by the pretrained ISEFlow emulators.
Both dataclasses perform the following on construction:
Validation: all parameter values are checked against the enumerated sets of allowed options (numerics, stress balance, resolution, etc.).
Encoding: human-readable strings (e.g. 'fd', 'hybrid') are mapped to the internal categorical encodings expected by the model weights (e.g. 'FD', 'Hybrid').
Array coercion: all forcing arrays are cast to numpy.ndarray.
Year encoding: calendar years 2015-2100 are converted to the model-internal 1-86 encoding.
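The year encoding is a fixed offset, so that 2015 maps to 1 and 2100 maps to 86. A minimal sketch follows; encode_years is a hypothetical helper, and the range check is an assumption about how out-of-range years would be handled.

```python
import numpy as np


def encode_years(years, first_year=2015, n_years=86):
    """Map calendar years 2015-2100 onto the 1-86 encoding."""
    encoded = np.asarray(years) - first_year + 1
    if encoded.min() < 1 or encoded.max() > n_years:
        raise ValueError("years must lie between 2015 and 2100")
    return encoded


encoded = encode_years(np.arange(2015, 2101))
```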
Alternative constructor — raw absolute forcings
If you have raw (non-anomaly) atmospheric forcing values, use
from_absolute_forcings(). It calls AnomalyConverter internally to subtract
the ISMIP6 climatological baseline before building the dataclass:
from ise.data.inputs import ISEFlowAISInputs
import numpy as np
inputs = ISEFlowAISInputs.from_absolute_forcings(
year=np.arange(2015, 2101),
sector=10,
pr=pr_array, # kg m⁻² s⁻¹, raw absolute values
evspsbl=evspsbl_array,
smb=smb_array,
ts=ts_array, # K
ocean_thermal_forcing=otf_array,
ocean_salinity=sal_array,
ocean_temperature=temp_array,
aogcm="noresm1-m_rcp85", # or custom_climatology={...} for new CMIP models
# ISM configuration:
numerics="fd",
stress_balance="hybrid",
resolution="8",
init_method="eq",
initial_year=2005,
melt_in_floating_cells="sub-grid",
icefront_migration="str",
ocean_forcing_type="open",
ocean_sensitivity="medium",
ice_shelf_fracture=False,
open_melt_type="quad",
standard_melt_type=None,
)
If the ISM configuration matches one of the bundled ISMIP6 models, you can
pass model_configs="BISICLES_UBC" (or whichever model key appears in
ismip6_model_configs.json) instead of specifying all parameters
individually.
Output
Call inputs.to_df() to obtain a pandas.DataFrame (86 rows × features)
that can be passed directly to ISEFlow_AIS.process() or
ISEFlow_GrIS.process(). The pretrained wrappers call process()
internally when you invoke model.predict(inputs).
See also: ise.data.anomaly.AnomalyConverter
- class ise.data.inputs.ISEFlowAISInputs(year: ndarray, sector: ndarray | int, pr_anomaly: ndarray, evspsbl_anomaly: ndarray, smb_anomaly: ndarray, ts_anomaly: ndarray, ocean_thermal_forcing: ndarray, ocean_salinity: ndarray, ocean_temperature: ndarray, ice_shelf_fracture: bool, ocean_sensitivity: str, mrro_anomaly: ndarray | None = None, initial_year: int | None = None, numerics: str | None = None, stress_balance: str | None = None, resolution: str | None = None, init_method: str | None = None, melt_in_floating_cells: str | None = None, icefront_migration: str | None = None, ocean_forcing_type: str | None = None, open_melt_type: str | None = None, standard_melt_type: str | None = None, model_configs: str | None = None, version: str = 'v1.1.0', override_params: dict | None = None)[source]
Bases: object
Inputs for an ISEFlow-AIS prediction.
Expects pre-computed anomaly arrays (pr_anomaly, evspsbl_anomaly, smb_anomaly, ts_anomaly). If you have raw absolute forcing values instead, use the alternative constructor:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=..., sector=...,
    pr=..., evspsbl=..., smb=..., ts=...,
    ocean_thermal_forcing=..., ocean_salinity=..., ocean_temperature=...,
    aogcm="noresm1-m_rcp85",  # or custom_climatology={...}
    **ism_config_kwargs,
)
from_absolute_forcings() subtracts the ISMIP6 1995-2014 climatological baseline automatically. Pass aogcm for a bundled ISMIP6 model or custom_climatology (dict with keys 'pr', 'evspsbl', 'smb', 'ts') for a CMIP model not in the bundled climatology.
- evspsbl_anomaly: ndarray
- classmethod from_absolute_forcings(year: ndarray, sector: int, pr: ndarray, evspsbl: ndarray, smb: ndarray, ts: ndarray, ocean_thermal_forcing: ndarray, ocean_salinity: ndarray, ocean_temperature: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, mrro: ndarray | None = None, **kwargs) ISEFlowAISInputs[source]
Construct ISEFlowAISInputs from raw (non-anomaly) atmospheric forcings.
Subtracts the ISMIP6 1995-2014 climatological baseline from each atmospheric variable to produce the anomaly arrays required by the model. Ocean variables (ocean_thermal_forcing, ocean_salinity, ocean_temperature) are absolute values and are passed through unchanged.
Exactly one of aogcm or custom_climatology must be provided.
- Parameters:
year (np.ndarray) – Years corresponding to the time series (86 values, 2015-2100).
sector (int) – AIS drainage sector (1-18).
pr (np.ndarray) – Raw precipitation (86 values, kg m⁻² s⁻¹).
evspsbl (np.ndarray) – Raw evaporation / sublimation (86 values, kg m⁻² s⁻¹).
smb (np.ndarray) – Raw surface mass balance (86 values, kg m⁻² s⁻¹).
ts (np.ndarray) – Raw surface temperature (86 values, K).
ocean_thermal_forcing (np.ndarray) – Ocean thermal forcing (86 values, °C). Passed through unchanged.
ocean_salinity (np.ndarray) – Ocean salinity (86 values, PSU). Passed through unchanged.
ocean_temperature (np.ndarray) – Ocean temperature (86 values, °C). Passed through unchanged.
aogcm (str, optional) – AOGCM name to look up in the bundled ISMIP6 climatology (e.g. 'noresm1-m_rcp85'). Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – Baseline means for a CMIP model not in the bundled climatology. Must contain keys 'pr', 'evspsbl', 'smb', 'ts' (and 'mrro' if mrro is also provided). Values should be in the same units as the raw input arrays.
mrro (np.ndarray, optional) – Raw runoff (86 values). Only needed for ISEFlow v1.0.0.
**kwargs – All remaining keyword arguments are forwarded to ISEFlowAISInputs.__init__ (e.g. ISM config fields such as numerics, stress_balance, model_configs, etc.).
- Returns:
Fully validated inputs object ready for model.predict().
- Return type:
ISEFlowAISInputs
Examples
Using a bundled ISMIP6 climatology:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=10,
    pr=pr_array,
    evspsbl=evspsbl_array,
    smb=smb_array,
    ts=ts_array,
    ocean_thermal_forcing=otf_array,
    ocean_salinity=sal_array,
    ocean_temperature=temp_array,
    aogcm="noresm1-m_rcp85",
    numerics="fd",
    stress_balance="hybrid",
    resolution="8",
    init_method="eq",
    initial_year=2005,
    melt_in_floating_cells="sub-grid",
    icefront_migration="str",
    ocean_forcing_type="open",
    ocean_sensitivity="medium",
    ice_shelf_fracture=False,
    open_melt_type="quad",
    standard_melt_type="nonlocal",
)
Using a custom climatology for a new CMIP model:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=10,
    pr=pr_array,
    evspsbl=evspsbl_array,
    smb=smb_array,
    ts=ts_array,
    ocean_thermal_forcing=otf_array,
    ocean_salinity=sal_array,
    ocean_temperature=temp_array,
    custom_climatology={
        "pr": 1.3e-5,
        "evspsbl": 3.8e-6,
        "smb": 9.0e-6,
        "ts": 253.7,
    },
    numerics="fd",
    ...
)
- classmethod from_raw_values(*args, **kwargs)[source]
Deprecated: use from_absolute_forcings instead.
- ice_shelf_fracture: bool
- icefront_migration: str | None = None
- init_method: str | None = None
- initial_year: int | None = None
- melt_in_floating_cells: str | None = None
- model_configs: str | None = None
- mrro_anomaly: ndarray | None = None
- numerics: str | None = None
- ocean_forcing_type: str | None = None
- ocean_salinity: ndarray
- ocean_sensitivity: str
- ocean_temperature: ndarray
- ocean_thermal_forcing: ndarray
- open_melt_type: str | None = None
- override_params: dict | None = None
- pr_anomaly: ndarray
- resolution: str | None = None
- sector: ndarray | int
- smb_anomaly: ndarray
- standard_melt_type: str | None = None
- stress_balance: str | None = None
- to_df()[source]
Convert the dataclass fields to a pandas DataFrame.
- Returns:
One row per timestep (86 rows) with all forcing and configuration columns needed by ISEFlow_AIS.process().
- Return type:
pandas.DataFrame
- ts_anomaly: ndarray
- version: str = 'v1.1.0'
- year: ndarray
- class ise.data.inputs.ISEFlowGrISInputs(year: ndarray, sector: ndarray | int, aST: ndarray, aSMB: ndarray, ocean_thermal_forcing: ndarray, basin_runoff: ndarray, ice_shelf_fracture: bool, ocean_sensitivity: str, standard_ocean_forcing: bool, initial_year: int | None = None, numerics: str | None = None, ice_flow_model: str | None = None, initialization: str | None = None, initial_smb: str | None = None, velocity: str | None = None, bedrock_topography: str | None = None, surface_thickness: str | None = None, geothermal_heat_flux: str | None = None, res_min: str | None = None, res_max: str | None = None, model_configs: str | None = None, version: str = 'v1.1.0')[source]
Bases: object
Inputs for an ISEFlow-GrIS prediction.
Expects pre-computed anomaly arrays (aSMB, aST). If you have raw absolute forcing values instead, use the alternative constructor:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=..., sector=...,
    smb=..., st=...,
    ocean_thermal_forcing=..., basin_runoff=...,
    aogcm="hadgem2-es_rcp85",  # or custom_climatology={...}
    **ism_config_kwargs,
)
from_absolute_forcings() subtracts the ISMIP6 1960-1989 MAR climatological baseline automatically. Pass aogcm for a bundled ISMIP6 model or custom_climatology (dict with keys 'smb', 'st') for a CMIP model not in the bundled climatology.
- aSMB: ndarray
- aST: ndarray
- basin_runoff: ndarray
- bedrock_topography: str | None = None
- classmethod from_absolute_forcings(year: ndarray, sector: int, smb: ndarray, st: ndarray, ocean_thermal_forcing: ndarray, basin_runoff: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, **kwargs) ISEFlowGrISInputs[source]
Construct ISEFlowGrISInputs from raw (non-anomaly) atmospheric forcings.
Subtracts the ISMIP6 1960-1989 MAR climatological baseline from each atmospheric variable to produce the anomaly arrays (aSMB, aST) required by the model. Ocean variables (ocean_thermal_forcing, basin_runoff) are absolute values and are passed through unchanged.
Exactly one of aogcm or custom_climatology must be provided.
- Parameters:
year (np.ndarray) – Years (86 values, 2015-2100).
sector (int) – GrIS drainage basin number (1-6).
smb (np.ndarray) – Raw surface mass balance (86 values, mm w.e. yr⁻¹, matching the MAR Reference file units used in the bundled climatology CSV). The anomaly conversion automatically converts to kg m⁻² s⁻¹.
st (np.ndarray) – Raw surface temperature (86 values, °C, matching the MAR Reference file units used in the bundled climatology CSV).
ocean_thermal_forcing (np.ndarray) – Ocean thermal forcing (86 values). Passed through unchanged.
basin_runoff (np.ndarray) – Basin-integrated runoff (86 values). Passed through unchanged.
aogcm (str, optional) – AOGCM name to look up in the bundled ISMIP6 climatology (e.g. 'hadgem2-es_rcp85'). Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – Baseline means for a CMIP model not in the bundled climatology. Must contain keys 'smb' and 'st' in MAR units.
**kwargs – All remaining keyword arguments are forwarded to ISEFlowGrISInputs.__init__ (e.g. ISM config fields such as numerics, ice_flow_model, model_configs, etc.).
- Returns:
Fully validated inputs object ready for model.predict().
- Return type:
ISEFlowGrISInputs
Examples
Using a bundled ISMIP6 climatology:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=1,
    smb=smb_array,
    st=st_array,
    ocean_thermal_forcing=otf_array,
    basin_runoff=runoff_array,
    aogcm="hadgem2-es_rcp85",
    initial_year=1990,
    numerics="fe",
    ice_flow_model="ho",
    initialization="dav",
    initial_smb="ra3",
    velocity="joughin",
    bedrock_topography="morlighem",
    surface_thickness="None",
    geothermal_heat_flux="g",
    res_min=1.0,
    res_max=7.5,
    standard_ocean_forcing=True,
    ocean_sensitivity="medium",
    ice_shelf_fracture=False,
)
Using a custom climatology for a new CMIP model:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=1,
    smb=smb_array,
    st=st_array,
    ocean_thermal_forcing=otf_array,
    basin_runoff=runoff_array,
    custom_climatology={"smb": -241.2, "st": -22.8},
    initial_year=1990,
    ...
)
- classmethod from_raw_values(*args, **kwargs)[source]
Deprecated: use from_absolute_forcings instead.
- geothermal_heat_flux: str | None = None
- ice_flow_model: str | None = None
- ice_shelf_fracture: bool
- initial_smb: str | None = None
- initial_year: int | None = None
- initialization: str | None = None
- model_configs: str | None = None
- numerics: str | None = None
- ocean_sensitivity: str
- ocean_thermal_forcing: ndarray
- res_max: str | None = None
- res_min: str | None = None
- sector: ndarray | int
- standard_ocean_forcing: bool
- surface_thickness: str | None = None
- to_df()[source]
Convert the dataclass fields to a pandas DataFrame.
- Returns:
One row per timestep (86 rows) with all forcing and configuration columns needed by ISEFlow_GrIS.process().
- Return type:
pandas.DataFrame
- velocity: str | None = None
- version: str = 'v1.1.0'
- year: ndarray
ise.data.feature_engineer
Feature engineering for ISMIP6 emulator training datasets.
This module transforms the raw merged dataset (output of ise.data.process)
into the scaled, lagged, train/val/test-split arrays consumed by
ISEFlow.fit(). The primary interface is the FeatureEngineer class,
backed by a set of standalone functions that can also be called independently.
Pipeline stages
The typical preprocessing sequence is:
from ise.data.feature_engineer import FeatureEngineer
fe = FeatureEngineer("AIS", data=df)
fe.add_model_characteristics() # merge ISM config one-hot columns
fe.drop_outliers( # remove SLE < -26.3 mm (physics bound)
method="explicit",
column="sle",
expression=[("sle", "<", -26.3)],
)
fe.backfill_outliers() # replace extreme spikes with prev value
fe.add_lag_variables(lag=5) # add t-1 … t-5 copies of forcing vars
fe.split_data(output_directory="splits/") # 70/15/15 by simulation id
X_scaled, y_scaled = fe.scale_data(method="standard", save_dir="splits/")
Key design choices
Split granularity: the train/val/test split is done by simulation id, not by individual rows, so timesteps from a single simulation never end up in more than one split. The default split is 70/15/15 with random_state=1.
Outlier threshold: drop_outliers with expression=[("sle", "<", -26.3)] removes physically implausible projections (an sle value below -26.3 mm is treated as outside the physical bound for an individual sector).
Lag variables: add_lag_variables(lag=5) adds t-1 through t-5 copies of each atmospheric and oceanic forcing column within each 86-year segment, respecting projection boundaries so lag values do not cross between runs.
Model characteristics: add_model_characteristics() merges the ISM configuration CSV (e.g. AIS_model_characteristics.csv) and one-hot encodes categorical columns such as numerics, stress balance, etc.
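Splitting by simulation id rather than by row can be sketched as follows. This is an illustration of the idea only: split_ids is a hypothetical helper, it uses NumPy's random generator, and it will not reproduce the exact partitions that FeatureEngineer.split_data() produces with random_state=1.

```python
import numpy as np


def split_ids(sim_ids, train_size=0.7, val_size=0.15, random_state=1):
    """Partition the unique simulation ids into train/val/test sets."""
    unique_ids = np.unique(sim_ids)
    rng = np.random.default_rng(random_state)
    shuffled = rng.permutation(unique_ids)
    n_train = int(round(train_size * len(shuffled)))
    n_val = int(round(val_size * len(shuffled)))
    train = set(shuffled[:n_train].tolist())
    val = set(shuffled[n_train:n_train + n_val].tolist())
    test = set(shuffled[n_train + n_val:].tolist())  # remainder goes to test
    return train, val, test


# 20 hypothetical simulations, each contributing 86 rows.
row_ids = np.repeat(np.arange(20), 86)
train_ids, val_ids, test_ids = split_ids(row_ids)

# Rows are then selected by membership, so a simulation stays in one split.
train_rows = np.isin(row_ids, list(train_ids))
```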
Standalone functions (also usable without FeatureEngineer)
split_training_data — train/val/test split by simulation id.
add_lag_variables — add t-k lag columns within each 86-step segment.
backfill_outliers — replace extreme y-values with previous-row value.
drop_outliers — remove entire runs containing outlier timesteps.
add_model_characteristics — merge and encode ISM config metadata.
scale_data — apply a pre-fitted sklearn scaler from disk.
fill_mrro_nans — impute missing mrro_anomaly values.
- class ise.data.feature_engineer.FeatureEngineer(ice_sheet, data: DataFrame, fill_mrro_nans: bool = False, split_dataset: bool = False, train_size: float = 0.7, val_size: float = 0.15, test_size: float = 0.15, output_directory: str | None = None)[source]
Bases: object
A class for performing feature engineering on a given dataset, including preprocessing, scaling, dataset splitting, and outlier handling.
- Parameters:
ice_sheet (str) – The name of the ice sheet being analyzed.
data (pd.DataFrame) – The input dataset.
fill_mrro_nans (bool, optional) – Whether to fill missing values in the ‘mrro’ column. Defaults to False.
split_dataset (bool, optional) – Whether to split the dataset into training, validation, and test sets. Defaults to False.
train_size (float, optional) – Proportion of data to use for training. Defaults to 0.7.
val_size (float, optional) – Proportion of data to use for validation. Defaults to 0.15.
test_size (float, optional) – Proportion of data to use for testing. Defaults to 0.15.
output_directory (str, optional) – Directory to save the split datasets. Defaults to None.
- data
The input dataset.
- Type:
pd.DataFrame
- train_size
Proportion of training data.
- Type:
float
- val_size
Proportion of validation data.
- Type:
float
- test_size
Proportion of testing data.
- Type:
float
- output_directory
Directory to save datasets.
- Type:
str
- scaler_X_path
Path to the saved input feature scaler.
- Type:
str
- scaler_y_path
Path to the saved target variable scaler.
- Type:
str
- scaler_X
Scaler for input features.
- Type:
scaler object
- scaler_y
Scaler for target variables.
- Type:
scaler object
- train
Training dataset.
- Type:
pd.DataFrame
- val
Validation dataset.
- Type:
pd.DataFrame
- test
Test dataset.
- Type:
pd.DataFrame
- _including_model_characteristics
Whether model characteristics have been included.
- Type:
bool
- add_lag_variables(lag, data=None)[source]
Adds lagged versions of predictor variables to the dataset.
- Parameters:
lag (int) – Number of time steps to lag the variables.
data (pd.DataFrame, optional) – The dataset. If not provided, the class attribute ‘data’ is used.
- Returns:
The modified instance with lag variables added.
- Return type:
FeatureEngineer
- add_model_characteristics(data=None, model_char_path=None, encode=True, ids_path=None)[source]
Merges model characteristic data with the dataset.
- Parameters:
data (pd.DataFrame, optional) – The dataset. If not provided, the class attribute ‘data’ is used.
model_char_path (str, optional) – Path to the model characteristics file. Defaults to the internal path.
encode (bool, optional) – Whether to one-hot encode categorical characteristics. Defaults to True.
ids_path (str, optional) – Path to an additional ID mapping file. Defaults to None.
- Returns:
The modified instance with model characteristics added.
- Return type:
FeatureEngineer
- backfill_outliers(percentile=99.999, data=None)[source]
Replaces extreme values in target variables with the next row’s value (backfill).
- Parameters:
percentile (float, optional) – Percentile threshold for identifying outliers. Defaults to 99.999.
data (pd.DataFrame, optional) – The dataset. If not provided, the class attribute ‘data’ is used.
- Returns:
The modified instance with outliers handled.
- Return type:
FeatureEngineer
- drop_outliers(method, column, expression=None, quantiles=[0.01, 0.99], data=None)[source]
Drops simulations that are outliers based on the provided method.
- Parameters:
method (str) – Method of outlier deletion (‘quantile’ or ‘explicit’).
column (str) – Column used for detecting outliers.
expression (list[tuple], optional) – List of filtering expressions in the form [(column, operator, value)]. Defaults to None.
quantiles (list[float], optional) – Quantiles for ‘quantile’ method. Defaults to [0.01, 0.99].
data (pd.DataFrame, optional) – The dataset. If not provided, the class attribute ‘data’ is used.
- Returns:
The modified instance with outliers removed.
- Return type:
FeatureEngineer
- exclude_fetish_models(data=None, exclude='both')[source]
Excludes specific models from the dataset.
- Parameters:
data (pd.DataFrame, optional) – The dataset. If not provided, the class attribute ‘data’ is used.
- Returns:
The modified instance with specific models excluded.
- Return type:
FeatureEngineer
- fill_mrro_nans(method, data=None)[source]
Fills missing values in the ‘mrro’ column.
- Parameters:
method (str) – The method used to fill missing values.
data (pd.DataFrame, optional) – The dataset. Defaults to None.
- Returns:
The dataset with missing values filled.
- Return type:
pd.DataFrame
- scale_data(X=None, y=None, method='standard', save_dir=None)[source]
Scales input (X) and target (y) variables using a specified scaling method.
- Parameters:
X (pd.DataFrame or np.ndarray, optional) – Input data. Defaults to None.
y (pd.DataFrame or np.ndarray, optional) – Target data. Defaults to None.
method (str, optional) – Scaling method (‘standard’, ‘minmax’, ‘robust’). Defaults to ‘standard’.
save_dir (str, optional) – Directory to save scalers. Defaults to None.
- Returns:
Scaled X and y values.
- Return type:
tuple
- split_data(data=None, train_size=None, val_size=None, test_size=None, output_directory=None, random_state=1)[source]
Splits the dataset into training, validation, and test sets.
- Parameters:
data (pd.DataFrame, optional) – The input dataset. Defaults to None.
train_size (float, optional) – Proportion of training data. Defaults to None.
val_size (float, optional) – Proportion of validation data. Defaults to None.
test_size (float, optional) – Proportion of testing data. Defaults to None.
output_directory (str, optional) – Directory to save split datasets. Defaults to None.
random_state (int, optional) – Random seed for reproducibility. Defaults to 1.
- Returns:
Training, validation, and test datasets as pandas DataFrames.
- Return type:
tuple
- unscale_data(X=None, y=None, scaler_X_path=None, scaler_y_path=None)[source]
Reverses the scaling transformation for input (X) and target (y) variables.
- Parameters:
X (pd.DataFrame or np.ndarray, optional) – The input data to be unscaled. Defaults to None.
y (pd.DataFrame, np.ndarray, or torch.Tensor, optional) – The target data to be unscaled. Defaults to None.
scaler_X_path (str, optional) – Path to the stored input scaler. Defaults to None.
scaler_y_path (str, optional) – Path to the stored target scaler. Defaults to None.
- Returns:
Unscaled X and y data.
- Return type:
tuple
- ise.data.feature_engineer.add_lag_variables(data: DataFrame, lag: int, verbose=True) DataFrame[source]
Adds lagged variables to the input dataset, creating time-shifted versions of the predictor variables.
- Parameters:
data (pd.DataFrame) – The dataset containing time series data.
lag (int) – The number of time steps to lag the variables.
verbose (bool, optional) – Whether to display a progress bar. Defaults to True.
- Returns:
The dataset with lagged variables added.
- Return type:
pd.DataFrame
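The boundary-respecting behaviour described for add_lag_variables can be illustrated with a minimal pandas sketch. This is not the library's implementation: the helper name `add_lags_within_segments`, the `.lagK` column suffix, and leaving the first rows of each segment as NaN are all assumptions made for illustration.

```python
import pandas as pd

def add_lags_within_segments(df, columns, lag, segment_length=86):
    """Add t-1..t-lag copies of `columns` without shifting across segment boundaries."""
    out = df.reset_index(drop=True).copy()
    # Integer label for each consecutive block of `segment_length` rows.
    segment = (out.index // segment_length).to_numpy()
    for col in columns:
        grouped = out.groupby(segment)[col]
        for k in range(1, lag + 1):
            # Shifting inside each group leaves the first k rows of every segment as NaN
            # (assumption: the real function may fill these differently).
            out[f"{col}.lag{k}"] = grouped.shift(k)
    return out

# Two toy "projections" of 3 steps each instead of 86.
toy = pd.DataFrame({"ts_anomaly": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]})
lagged = add_lags_within_segments(toy, ["ts_anomaly"], lag=1, segment_length=3)
```

In the toy frame, row 3 starts the second segment, so its lag-1 value is missing rather than inheriting 3.0 from the previous run.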
- ise.data.feature_engineer.add_model_characteristics(data, model_char_path=None, encode=True, ids_path=None) DataFrame[source]
Adds model characteristics to the dataset.
- Parameters:
data (pd.DataFrame) – The input dataset.
model_char_path (str, optional) – Path to the model characteristics file. Defaults to internal path.
encode (bool, optional) – Whether to one-hot encode categorical characteristics. Defaults to True.
ids_path (str, optional) – Path to an additional ID mapping file. Defaults to None.
- Returns:
The dataset with model characteristics added.
- Return type:
pd.DataFrame
- ise.data.feature_engineer.backfill_outliers(data, percentile=99.999)[source]
Replaces extreme values in y-values (above the specified percentile and below the 1-percentile across all y-values) with the value from the next row (bfill). Trailing outliers at the end of the series will remain as NaN.
- Parameters:
data (pd.DataFrame) – The dataset containing y-values.
percentile (float, optional) – The percentile threshold to define upper extreme values. Defaults to 99.999.
- Returns:
The dataset with extreme values replaced using backfill.
- Return type:
pd.DataFrame
- ise.data.feature_engineer.drop_outliers(data: DataFrame, column: str, method: str, expression: list[tuple] | None = None, quantiles: list[float] = [0.01, 0.99])[source]
Removes outliers from the dataset based on a specified method.
- Parameters:
data (pd.DataFrame) – The dataset containing the column with potential outliers.
column (str) – The column to assess for outliers.
method (str) – The method of outlier detection (‘quantile’ or ‘explicit’).
expression (list of tuples, optional) – A list of conditions in the format [(column, operator, value)] for explicit filtering. Defaults to None.
quantiles (list of float, optional) – Quantiles for filtering when using the ‘quantile’ method. Defaults to [0.01, 0.99].
- Returns:
The dataset with outliers removed.
- Return type:
pd.DataFrame
- Raises:
AttributeError – If the method is ‘quantile’ but no quantiles are provided.
AttributeError – If the method is ‘explicit’ but no expression is provided.
ValueError – If the operator in the expression is not recognized.
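As a rough illustration of the 'explicit' method, the sketch below applies a list of (column, operator, value) tuples and removes every run that contains at least one matching row. The helper name `drop_runs_matching`, the set of accepted operator strings, the `id` column default, and the choice to combine expressions with OR are assumptions, not confirmed behaviour of drop_outliers.

```python
import operator
import pandas as pd

# Assumed operator vocabulary; the real function may accept a different set.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}

def drop_runs_matching(df, expression, id_column="id"):
    """Drop whole runs (grouped by `id_column`) that contain any flagged row."""
    flagged = pd.Series(False, index=df.index)
    for column, op, value in expression:
        if op not in OPS:
            raise ValueError(f"Unrecognized operator: {op!r}")
        # Assumption: a row is an outlier if it matches any of the expressions.
        flagged |= OPS[op](df[column], value)
    bad_ids = df.loc[flagged, id_column].unique()
    return df[~df[id_column].isin(bad_ids)]

runs = pd.DataFrame({"id": ["a", "a", "b", "b"], "sle": [0.1, 0.2, 0.1, 9.0]})
kept = drop_runs_matching(runs, [("sle", ">", 5.0)])
```

Run "b" is removed entirely, including its non-extreme first row, which matches the "remove entire runs" description of drop_outliers.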
- ise.data.feature_engineer.exclude_fetish_models(data: DataFrame, exclude: str = 'both') DataFrame[source]
Excludes specific models from the dataset.
- Parameters:
data (pd.DataFrame) – The input DataFrame.
- Returns:
The filtered DataFrame.
- Return type:
pd.DataFrame
- ise.data.feature_engineer.fill_mrro_nans(data: DataFrame, method) DataFrame[source]
Fills the NaN values in the specified columns with the given method.
- Parameters:
data (pd.DataFrame) – The input DataFrame.
method (str or int) – The method to fill NaN values. Must be one of ‘zero’, ‘mean’, ‘median’, or ‘drop’.
- Returns:
The DataFrame with NaN values filled according to the specified method.
- Return type:
pd.DataFrame
- Raises:
ValueError – If the method is not one of ‘zero’, ‘mean’, ‘median’, or ‘drop’.
- ise.data.feature_engineer.scale_data(data, scaler_path)[source]
Scales the provided dataset using a pre-trained scaler.
- Parameters:
data (pd.DataFrame) – The dataset to be scaled.
scaler_path (str) – Path to the saved scaler.
- Returns:
The scaled dataset.
- Return type:
pd.DataFrame
- ise.data.feature_engineer.split_training_data(data, train_size, val_size, test_size=None, output_directory=None, random_state=1)[source]
Splits the dataset into training, validation, and test sets.
- Parameters:
data (str or pd.DataFrame) – The dataset or path to the dataset to be split.
train_size (float) – Proportion of data to use for training.
val_size (float) – Proportion of data to use for validation.
test_size (float, optional) – Proportion of data to use for testing. Defaults to the remainder.
output_directory (str, optional) – Directory to save the split datasets as CSV files. Defaults to None.
random_state (int, optional) – Seed for reproducibility. Defaults to 1.
- Returns:
Training, validation, and test datasets as pandas DataFrames.
- Return type:
tuple
- Raises:
ValueError – If the dataset length is not divisible by 86, indicating incomplete projections.
ValueError – If the dataset does not contain an ‘id’ column.
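Splitting by simulation id rather than by row keeps every 86-step projection intact within a single subset. The sketch below shows that idea with the standard library only; `split_ids` is a hypothetical helper, and the rounding and shuffling details are assumptions that may differ from split_training_data.

```python
import random

def split_ids(ids, train_size, val_size, random_state=1):
    """Partition unique run ids into train/val/test sets; the remainder goes to test."""
    unique_ids = sorted(set(ids))
    rng = random.Random(random_state)
    rng.shuffle(unique_ids)
    n_train = round(len(unique_ids) * train_size)
    n_val = round(len(unique_ids) * val_size)
    train = set(unique_ids[:n_train])
    val = set(unique_ids[n_train:n_train + n_val])
    test = set(unique_ids[n_train + n_val:])
    return train, val, test

# 20 toy runs with 86 rows each, identified by an 'id' value per row.
row_ids = [f"run{i}" for i in range(20) for _ in range(86)]
train_ids, val_ids, test_ids = split_ids(row_ids, train_size=0.7, val_size=0.15)
```

Rows are then assigned to a subset by checking which id set their `id` belongs to, so no projection is split across subsets.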
ise.data.process
End-to-end ISMIP6 data processing pipeline for ISEFlow training data.
This module converts raw ISMIP6 forcing and projection files into the
sector-level, analysis-ready dataset.csv consumed by FeatureEngineer
and ultimately by ISEFlow.fit().
Public entry points
process_sectors (main entry point):
End-to-end pipeline. Reads raw ISMIP6 forcing NetCDFs from the GHub directory layout and pre-computed IVAF scalar projection files from Zenodo, aggregates both to sector-level annual time series (86 years, 2015-2100), joins them on (aogcm, year, sector), and returns a single pandas.DataFrame:
from ise.data.process import process_sectors
dataset = process_sectors(
    ice_sheet="AIS",
    forcing_directory="/path/to/GHub/AIS/",
    grid_file="/path/to/AIS_sectors_8km.nc",
    zenodo_directory="/path/to/zenodo_download/",
    export_directory="outputs/",
)
Intermediate CSVs (AIS_atmospheric.csv, AIS_oceanic.csv, forcings.csv, projections.csv, dataset.csv) are written to export_directory so individual stages are skipped on re-runs (controlled by overwrite=False).
ProjectionProcessor:
Only needed when starting from raw 3-D ISMIP6 NetCDF output files rather than the pre-computed Zenodo scalar files. Computes Ice Volume Above Flotation (IVAF) from bed topography, ice thickness, and ice/grounded fraction at each grid cell, subtracts the matched control-run IVAF, and writes ivaf_<ice_sheet>_<group>_<model>_<exp>.nc files:
from ise.data.process import ProjectionProcessor
processor = ProjectionProcessor(
    ice_sheet="AIS",
    forcings_directory="/path/to/forcing/",
    projections_directory="/path/to/projections/",
    scalefac_path="af2_scalefac.nc",
    densities_path="AIS_densities.csv",
)
processor.process()
DatasetMerger:
Lower-level alternative to process_sectors() for when intermediate per-run CSV files already exist on disk. Performs only the join step (forcing ↔ projection matched by CMIP model and pathway).
Supporting functions
process_AIS_atmospheric_sectors / process_GrIS_atmospheric_sectors: Aggregate atmospheric forcing NetCDFs to sector-level annual means.
process_AIS_oceanic_sectors / process_GrIS_oceanic_sectors: Aggregate oceanic forcing NetCDFs to sector-level annual means.
process_AIS_outputs / process_GrIS_outputs: Load pre-computed IVAF scalar projections from Zenodo and convert to SLE.
merge_datasets: Join sector-level forcings and projections DataFrames on (aogcm, year, sector).
get_model_densities: Extract ice/water density values (rhoi, rhow) from raw ISMIP6 NetCDFs.
combine_gris_forcings: Concatenate annual GrIS atmospheric NetCDF files into per-AOGCM combined files.
- class ise.data.process.DatasetMerger(ice_sheet, forcings, projections, experiment_file, output_dir)[source]
Bases: object
Merges pre-processed CSV forcing and projection files into a single dataset.
This is a lower-level alternative to process_sectors(). Use it when the intermediate per-run CSV files already exist on disk and you only need the join step (forcing ↔ projection matched by CMIP model and pathway).
- Parameters:
ice_sheet (str) – The ice sheet name (‘AIS’ or ‘GrIS’).
forcings (str) – Directory containing forcing CSV files.
projections (str) – Directory containing projection CSV files.
experiment_file (str) – Path to the experiment metadata file (CSV or JSON).
output_dir (str) – Directory to save the merged dataset.csv.
- experiments
Experiment metadata loaded from experiment_file.
- Type:
pd.DataFrame
- forcing_paths
File paths for all forcing CSVs found under forcings.
- Type:
list
- projection_paths
File paths for all projection CSVs found under projections.
- Type:
list
- forcing_metadata
Extracted CMIP model and pathway for each forcing file.
- Type:
pd.DataFrame
- class ise.data.process.ProjectionProcessor(ice_sheet, forcings_directory, projections_directory, scalefac_path=None, densities_path=None)[source]
Bases: object
A class for processing ISMIP6 projections (outputs) for ice sheet models, specifically for calculating Ice Volume Above Flotation (IVAF), handling control projections, and processing experimental projections.
- Parameters:
ice_sheet (str) – The ice sheet being analyzed (‘AIS’ or ‘GIS’).
forcings_directory (str) – Path to the directory containing forcing datasets.
projections_directory (str) – Path to the directory containing projection datasets.
scalefac_path (str, optional) – Path to the NetCDF file containing scaling factors for each grid cell. Defaults to None.
densities_path (str, optional) – Path to the CSV file containing density data for models. Defaults to None.
- forcings_directory
Path to forcing data.
- Type:
str
- projections_directory
Path to projection data.
- Type:
str
- densities_path
Path to density dataset.
- Type:
str
- scalefac_path
Path to scaling factor dataset.
- Type:
str
- ice_sheet
Ice sheet identifier (‘AIS’ or ‘GIS’).
- Type:
str
- resolution
Resolution of the dataset (5 for GIS, 8 for AIS).
- Type:
int
- process()[source]
Processes ISMIP6 projections by calculating IVAF and subtracting control projections.
Note
This class is only needed when starting from raw 3-D ISMIP6 NetCDF output files. If you are using the pre-computed scalar files from Zenodo (ComputedScalarsPaper/ for AIS, v7_CMIP5_pub/ for GrIS), call process_sectors() directly instead.
- process()[source]
Process ISMIP6 projections by calculating IVAF for control and experiment projections, subtracting out control IVAF from experiments, and exporting IVAF files.
- For each model run the method:
Loads bed topography, ice thickness, ice fraction, and grounded fraction.
Computes IVAF at every grid cell and time step.
Subtracts the matched control-run IVAF to isolate the forced signal.
Writes ivaf_<ice_sheet>_<group>_<model>_<exp>.nc next to the input files.
- Returns:
1 if processing is successful.
- Return type:
int
- Raises:
ValueError – If projections_directory is not specified.
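The per-cell IVAF step can be sketched with NumPy using the standard height-above-flotation relation. This is an illustrative approximation only: the function name `ivaf_from_fields`, the default densities, the clipping of negative heights to zero, and the way the masks and scale factor are multiplied together are assumptions and may not match what ProjectionProcessor.process() does internally.

```python
import numpy as np

def ivaf_from_fields(thickness, bed, grounded_fraction, ice_fraction,
                     cell_area, rhoi=910.0, rhow=1028.0, scalefac=1.0):
    """Ice volume above flotation (m^3) for one time step on a 2-D grid.

    thickness, bed: metres; cell_area: m^2 per cell; the fractions lie in [0, 1].
    """
    # Where the bed is below sea level, part of the ice column only displaces
    # ocean water and does not count towards volume above flotation.
    height_above_flotation = thickness + (rhow / rhoi) * np.minimum(bed, 0.0)
    # Assumption: negative heights (floating ice) contribute nothing.
    height_above_flotation = np.maximum(height_above_flotation, 0.0)
    weights = grounded_fraction * ice_fraction * cell_area * scalefac
    return float(np.sum(height_above_flotation * weights))

thickness = np.array([[100.0, 1028.0]])
bed = np.array([[50.0, -910.0]])      # second cell sits exactly at flotation
ones = np.ones_like(thickness)
ivaf = ivaf_from_fields(thickness, bed, ones, ones, cell_area=2.0)

# The forced signal is the experiment IVAF minus the matched control-run IVAF.
ivaf_control = ivaf_from_fields(thickness - 10.0, bed, ones, ones, cell_area=2.0)
ivaf_anomaly = ivaf - ivaf_control
```

Only the first cell contributes here, because the second cell has just enough ice to float on a bed 910 m below sea level.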
- ise.data.process.combine_gris_forcings(forcing_dir)[source]
Combines GrIS forcings from multiple CMIP model directories into consolidated NetCDF files.
- Parameters:
forcing_dir (str) – Directory containing the GrIS forcing files.
- Returns:
0 upon successful processing.
- Return type:
int
- ise.data.process.get_model_densities(zenodo_directory: str, output_path: str | None = None)[source]
Extracts density values (rhoi and rhow) from NetCDF files in the specified directory and returns them in a pandas DataFrame.
- Parameters:
zenodo_directory (str) – Path to the directory containing the NetCDF files.
output_path (str, optional) – Path to save the extracted density values as a CSV file. Defaults to None.
- Returns:
A DataFrame containing the group, model, rhoi, and rhow values for each model run.
- Return type:
pandas.DataFrame
- ise.data.process.get_xarray_data(dataset_fp, var_name=None, ice_sheet='AIS', convert_and_subset=False)[source]
Retrieves and processes data from an xarray dataset.
- Parameters:
dataset_fp (str) – The file path to the xarray dataset.
var_name (str, optional) – The name of the variable to retrieve from the dataset. Defaults to None.
ice_sheet (str, optional) – The ice sheet type (‘AIS’ or ‘GrIS’). Defaults to ‘AIS’.
convert_and_subset (bool, optional) – If True, converts and subsets the dataset for the target time range. Defaults to False.
- Returns:
The extracted variable as a NumPy array or the entire processed dataset.
- Return type:
np.ndarray or xarray.Dataset
- ise.data.process.interpolate_values(data)[source]
Interpolates missing values in the x and y dimensions of the input dataset using linear interpolation. Ensures that first and last values are properly adjusted to maintain consistency.
- Parameters:
data (xarray.Dataset) – A dataset containing x and y dimensions with potential missing values.
- Returns:
A tuple containing the interpolated x and y arrays.
- Return type:
tuple
- ise.data.process.merge_datasets(forcings, projections, experiments_file, ice_sheet='AIS')[source]
Join sector-level forcings and projections into a single analysis-ready DataFrame.
Uses the experiment metadata to add the AOGCM name to the projections table, normalises AOGCM name formatting so the two tables join cleanly on (aogcm, year, sector), then performs an inner merge.
- Parameters:
forcings (pd.DataFrame) – Sector-level forcing DataFrame as produced by process_AIS/GrIS_atmospheric_sectors() + process_AIS/GrIS_oceanic_sectors() (or read from forcings.csv).
projections (pd.DataFrame) – Sector-level projection DataFrame as produced by process_AIS/GrIS_outputs() (or read from projections.csv).
experiments_file (str or pd.DataFrame) – Path to the experiment-metadata CSV (maps experiment IDs → AOGCM names) or a pre-loaded DataFrame.
ice_sheet (str, optional) – 'AIS' or 'GrIS'. Defaults to 'AIS'.
- Returns:
Merged dataset with one row per (model, experiment, sector, year), containing all forcing columns and the target SLE projection.
- Return type:
pandas.DataFrame
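The join itself amounts to an inner merge on the three key columns once the AOGCM names agree. The pandas sketch below uses toy data; the `normalise_aogcm` rule (lower-casing and replacing hyphens) and the toy column names `ts_anomaly` and `sle` are hypothetical stand-ins, not the normalisation or schema merge_datasets actually uses.

```python
import pandas as pd

def normalise_aogcm(name):
    # Hypothetical normalisation: trim, lower-case, and unify separators.
    return name.strip().lower().replace("-", "_")

forcings = pd.DataFrame({
    "aogcm": ["NorESM1-M", "NorESM1-M", "CCSM4"],
    "year": [2015, 2016, 2015],
    "sector": [1, 1, 1],
    "ts_anomaly": [0.1, 0.2, 0.3],
})
projections = pd.DataFrame({
    "aogcm": ["noresm1_m", "noresm1_m"],
    "year": [2015, 2016],
    "sector": [1, 1],
    "sle": [0.01, 0.02],
})

# Bring both tables to the same AOGCM spelling before joining.
for table in (forcings, projections):
    table["aogcm"] = table["aogcm"].map(normalise_aogcm)

# The inner merge keeps only (aogcm, year, sector) combinations present in both tables.
merged = projections.merge(forcings, on=["aogcm", "year", "sector"], how="inner")
```

Because the merge is inner, the CCSM4 forcing row with no matching projection is dropped rather than kept with missing targets.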
- ise.data.process.process_AIS_atmospheric_sectors(forcing_directory, grid_file)[source]
Aggregate AIS atmospheric forcing to sector-level annual means.
Searches Atmosphere_Forcing/ for 8 km, 1995-2100 NetCDF files, loads each via ForcingFile, and averages spatially over each of the 18 AIS sectors defined by the grid file.
- Parameters:
forcing_directory (str) – Root forcing directory (GHub layout expected). The function navigates to the Atmosphere_Forcing/ sub-directory automatically.
grid_file (str) – Path to the AIS sector-definition NetCDF (e.g. AIS_sectors_8km.nc).
- Returns:
Rows indexed by (aogcm, sector, year) with one column per atmospheric forcing variable, plus aogcm, year, and sector.
- Return type:
pandas.DataFrame
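Sector averaging reduces each gridded field to one value per sector and time step. The NumPy sketch below shows the masking idea on a toy grid; `sector_means` is a hypothetical helper, and treating label 0 as "outside every sector" and ignoring NaN cells are assumptions about the grid file rather than documented behaviour.

```python
import numpy as np

def sector_means(field, sector_ids):
    """Average a (time, y, x) field over each labelled sector of a (y, x) grid."""
    means = {}
    for sector in np.unique(sector_ids):
        if sector == 0:
            continue  # assumption: 0 marks cells outside every sector
        mask = sector_ids == sector
        # Boolean indexing flattens the two spatial axes into one cell axis.
        means[int(sector)] = np.nanmean(field[:, mask], axis=1)
    return means

field = np.array([[[1.0, 2.0], [3.0, 4.0]],
                  [[10.0, 20.0], [30.0, 40.0]]])   # 2 time steps on a 2x2 grid
sector_ids = np.array([[1, 1], [2, 0]])
means = sector_means(field, sector_ids)
```

Each entry of `means` is a time series for one sector, which is the shape that later gets stacked into (aogcm, sector, year) rows.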
- ise.data.process.process_AIS_oceanic_sectors(forcing_directory, grid_file)[source]
Aggregate AIS oceanic forcing to sector-level annual means.
Loads thermal forcing, salinity, and ocean temperature NetCDFs from Ocean_Forcing/ (8 km, 1995-2100 files), depth-averages each variable, and then spatially averages over each of the 18 AIS sectors.
- Parameters:
forcing_directory (str) – Root forcing directory (GHub layout expected). The function navigates to the Ocean_Forcing/ sub-directory automatically.
grid_file (str) – Path to the AIS sector-definition NetCDF (e.g. AIS_sectors_8km.nc).
- Returns:
Rows indexed by (aogcm, sector, year) with columns thermal_forcing, salinity, temperature, aogcm, year, and sector.
- Return type:
pandas.DataFrame
- ise.data.process.process_AIS_outputs(zenodo_directory, with_ctrl=False)[source]
Load AIS IVAF scalar projections from Zenodo and convert to sea-level equivalent.
Reads per-experiment NetCDF files from the ComputedScalarsPaper/ sub-directory. Each file contains sector-level IVAF time series (ivaf_sector_1 … ivaf_sector_18). Files with only 85 time steps have their first year duplicated to reach the required 86.
SLE is computed as:
sle = -ivaf / 362.5 * 910 / (1e9 * 1000)
following the sign convention and ice density (910 kg m⁻³) used in Seroussi et al. (2020) ISMIP6 scripts.
- Parameters:
zenodo_directory (str) – Path to the Zenodo download directory. The function looks inside ComputedScalarsPaper/ automatically.
with_ctrl (bool, optional) – If True, includes files that contain control projections (ivaf_AIS_* files, excluding hist/ctrl filenames). Defaults to False, which selects only ivaf_minus_ctrl_proj files.
- Returns:
One row per (model, experiment, sector, year) with columns ivaf, sle, sector, year, id, exp, and model.
- Return type:
pandas.DataFrame
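The AIS conversion formula can be read as a chain of unit conversions. The sketch below spells that reading out; interpreting IVAF as m³, SLE as mm, and 362.5 as gigatonnes per millimetre of global mean sea level is an assumption based on the constants, since the documentation states only the formula. The function name `ais_ivaf_to_sle` is hypothetical.

```python
def ais_ivaf_to_sle(ivaf_m3):
    """Apply the documented AIS formula: sle = -ivaf / 362.5 * 910 / (1e9 * 1000)."""
    ice_density = 910.0       # kg m^-3, as stated in the documentation
    gt_per_mm_sle = 362.5     # assumed meaning: Gt of water per mm of sea level
    kg_per_gt = 1e9 * 1000    # 1e12 kg in one gigatonne
    # Negative sign: a loss of ice above flotation is a positive sea-level contribution.
    return -ivaf_m3 / gt_per_mm_sle * ice_density / kg_per_gt

# A loss of 362.5 Gt of ice, expressed as a (negative) volume change in m^3.
ivaf_change = -362.5e12 / 910.0
sle_mm = ais_ivaf_to_sle(ivaf_change)
```

Under this reading, losing 362.5 Gt of ice above flotation corresponds to roughly 1 mm of sea-level rise.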
- ise.data.process.process_GrIS_atmospheric_sectors(forcing_directory, grid_file)[source]
Aggregate GrIS atmospheric forcing (aSMB and aST) to sector-level annual means.
Reads annual NetCDF files from Atmosphere_Forcing/aSMB_observed/v1/ and combines them per AOGCM via combine_gris_forcings() if combined files do not yet exist. Then averages Surface Mass Balance anomaly (aSMB) and surface temperature anomaly (aST) spatially over each of the 6 GrIS drainage basins.
- Parameters:
forcing_directory (str) – Root forcing directory (GHub layout expected). The function navigates to the Atmosphere_Forcing/aSMB_observed/v1/ sub-directory automatically.
grid_file (str or xarray.Dataset) – Path to (or loaded) sector-definition NetCDF defining the 6 GrIS drainage-basin sectors.
- Returns:
Rows indexed by (aogcm, sector, year) with columns aSMB, aST, aogcm, year, and sector.
- Return type:
pandas.DataFrame
- ise.data.process.process_GrIS_oceanic_sectors(forcing_directory, grid_file)[source]
Aggregate GrIS oceanic forcing to sector-level annual means.
Reads thermal forcing and basin runoff NetCDFs from Ocean_Forcing/Melt_Implementation/v4/ and spatially averages each over the 6 GrIS drainage-basin sectors.
- Parameters:
forcing_directory (str) – Root forcing directory (GHub layout expected). The function navigates to the Ocean_Forcing/Melt_Implementation/v4/ sub-directory automatically.
grid_file (str or xarray.Dataset) – Path to (or loaded) sector-definition NetCDF defining the 6 GrIS drainage-basin sectors.
- Returns:
Rows indexed by (aogcm, sector, year) with columns thermal_forcing, basin_runoff, aogcm, year, and sector.
- Return type:
pandas.DataFrame
- ise.data.process.process_GrIS_outputs(zenodo_directory)[source]
Load GrIS IVAF scalar projections from Zenodo and convert to sea-level equivalent.
Reads per-experiment NetCDF files from the v7_CMIP5_pub/ sub-directory. Each file contains basin-level IVAF time series for the 6 GrIS drainage basins (ivaf_no, ivaf_ne, ivaf_se, ivaf_sw, ivaf_cw, ivaf_nw). Files with only 85 time steps have their first year duplicated to reach the required 86.
SLE is computed as:
sle = ivaf / 362.5 / 1e9
- Parameters:
zenodo_directory (str) – Path to the Zenodo download directory. The function looks inside v7_CMIP5_pub/ automatically.
- Returns:
One row per (model, experiment, sector, year) with columns ivaf, sle, sector, year, id, exp, and model.
- Return type:
pandas.DataFrame
- ise.data.process.process_sectors(ice_sheet, forcing_directory, grid_file, zenodo_directory, experiments_file='/home/docs/checkouts/readthedocs.org/user_builds/ise/checkouts/latest/ise/data/data_files/ismip6_experiments_updated.csv', export_directory=None, overwrite=False, with_ctrl=False)[source]
End-to-end pipeline that builds the sector-level training dataset from raw ISMIP6 files.
This is the main entry point for data preparation. It reads raw climate forcing NetCDFs and pre-computed IVAF scalar projection files, aggregates both to sector-level annual time series (86 years, 2015-2100), joins them on (aogcm, year, sector), and returns a single analysis-ready DataFrame.
Intermediate files are written to export_directory so individual stages can be skipped on re-runs (controlled by overwrite):
<ice_sheet>_atmospheric.csv - sector-averaged atmospheric forcings
<ice_sheet>_oceanic.csv - sector-averaged oceanic forcings
forcings.csv - atmospheric + oceanic merged
projections.csv - IVAF projections by sector
dataset.csv - final merged dataset (also returned)
- Parameters:
ice_sheet (str) – Ice sheet to process: 'AIS' (18 sectors) or 'GrIS' (6 sectors).
forcing_directory (str) – Root directory of the ISMIP6 forcing data. Expected sub-structure mirrors the GHub layout (Atmosphere_Forcing/, Ocean_Forcing/, etc.).
grid_file (str) – Path to the sector-definition NetCDF (e.g. AIS_sectors_8km.nc or GrIS_Basins_Rignot_sectors_5km.nc).
zenodo_directory (str) – Directory containing the pre-computed IVAF scalar files from Zenodo (ComputedScalarsPaper/ for AIS, v7_CMIP5_pub/ for GrIS).
experiments_file (str) – Path to the experiment-metadata CSV that maps experiment IDs to AOGCM names. Defaults to the bundled ismip6_experiments_updated.csv.
export_directory (str, optional) – Directory to write intermediate and final CSVs. If None, nothing is saved to disk.
overwrite (bool, optional) – If True, re-process and overwrite any existing intermediate files. Defaults to False.
with_ctrl (bool, optional) – AIS only. If True, includes control projections in the output. Defaults to False.
- Returns:
Merged dataset with one row per (model, experiment, sector, year), containing both forcing variables and the target SLE projection.
- Return type:
pandas.DataFrame
ise.data.dataclasses
PyTorch Dataset classes for ISEFlow training and inference.
This module provides four torch.utils.data.Dataset subclasses for loading
ice-sheet emulator data. The default for ISEFlow is EmulatorDataset,
which handles the 86-timestep projection structure and the sequence padding
needed by the LSTM members of DeepEnsemble.
Dataset classes
- EmulatorDataset (default for ISEFlow):
Wraps a flat (N_projections * 86, features) or batched (N_projections, 86, features) feature matrix. __getitem__ returns a zero-padded sliding window of sequence_length timesteps so that the LSTM always receives a fixed-length context window even at the start of a projection. Used by both LSTM.fit() and NormalizingFlow.fit():
from ise.data.dataclasses import EmulatorDataset
from torch.utils.data import DataLoader
ds = EmulatorDataset(X, y, sequence_length=5, projection_length=86)
loader = DataLoader(ds, batch_size=64, shuffle=True)
- PyTorchDataset:
Minimal (X[i], y[i]) pair dataset with no sequence logic. Used when data is already structured as individual feature vectors (e.g. for the normalizing flow, which uses sequence_length=1).
- TSDataset:
Similar to EmulatorDataset but expects pre-batched 3-D tensors (N, T, F). Kept for backward compatibility.
- ScenarioDataset:
Simple (features[idx], labels[idx]) pair dataset used in the experimental scenario-classification models.
Padding convention
All sequence-aware datasets pad at the beginning of each projection with
the zero vector so that the most recent timestep is always at index -1
of the returned sequence. This means the LSTM sees a causal context that
grows from zero padding at t=1 to a full sequence_length window by
t=sequence_length.
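The padding convention can be illustrated for a single projection with NumPy. This is a sketch of the windowing idea only: `padded_window` is a hypothetical helper operating on 0-based timestep indices and is not the actual __getitem__ implementation of the dataset classes.

```python
import numpy as np

def padded_window(projection, t, sequence_length):
    """Return the `sequence_length` rows ending at index `t`, zero-padded at the front."""
    start = max(0, t - sequence_length + 1)
    window = projection[start:t + 1]
    pad_rows = sequence_length - window.shape[0]
    padding = np.zeros((pad_rows, projection.shape[1]), dtype=projection.dtype)
    # Padding goes first so the most recent timestep is always the last row.
    return np.concatenate([padding, window], axis=0)

projection = np.arange(1, 7, dtype=float).reshape(6, 1)     # 6 timesteps, 1 feature
first = padded_window(projection, t=0, sequence_length=3)   # two zero rows, then step 0
full = padded_window(projection, t=4, sequence_length=3)    # steps 2, 3, 4 with no padding
```

Every window has the same shape, and the current timestep always sits at index -1, which is what lets a fixed-length LSTM input be built even at the start of a projection.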
- class ise.data.dataclasses.EmulatorDataset(X, y, sequence_length=5, projection_length=86)[source]
Bases: Dataset
A PyTorch dataset for loading emulator data, designed to handle sequence-based inputs and projections.
- Parameters:
X (pandas.DataFrame, numpy.ndarray, or torch.Tensor) – The input data.
y (pandas.DataFrame, numpy.ndarray, or torch.Tensor) – The target data.
sequence_length (int, optional) – The length of the input sequence. Default is 5.
projection_length (int or tuple, optional) – The length of the projection period. Default is 86.
- X
The input data converted to a PyTorch tensor.
- Type:
torch.Tensor
- y
The target data converted to a PyTorch tensor.
- Type:
torch.Tensor
- sequence_length
The length of the input sequence.
- Type:
int
- xdim
The number of dimensions in X.
- Type:
int
- num_projections
The number of projections in the dataset.
- Type:
int
- num_timesteps
The number of timesteps per projection.
- Type:
int
- num_features
The number of features in the dataset.
- Type:
int
- class ise.data.dataclasses.PyTorchDataset(X, y)[source]
Bases: Dataset
A PyTorch dataset for general-purpose data loading.
- Parameters:
X (torch.Tensor) – The input data.
y (torch.Tensor) – The target data.
- class ise.data.dataclasses.ScenarioDataset(features, labels)[source]
Bases: Dataset
A PyTorch dataset designed for scenario-based data loading.
- Parameters:
features (torch.Tensor) – The input features.
labels (torch.Tensor) – The target labels.
- features
The input features.
- Type:
torch.Tensor
- labels
The target labels.
- Type:
torch.Tensor
- class ise.data.dataclasses.TSDataset(X, y, sequence_length=5)[source]
Bases: Dataset
A PyTorch dataset for handling time series data with sequence-based input.
- Parameters:
X (torch.Tensor) – The input data.
y (torch.Tensor) – The target data.
sequence_length (int, optional) – The length of the input sequence. Default is 5.
- X
The input data.
- Type:
torch.Tensor
- y
The target data.
- Type:
torch.Tensor
- sequence_length
The sequence length.
- Type:
int
ise.data.scaler
GPU-compatible PyTorch scalers for ISEFlow inputs and outputs.
This module provides StandardScaler, RobustScaler, and LogScaler
as torch.nn.Module subclasses. They mirror the scikit-learn scaler API
(fit / transform / inverse_transform / save / load) but
operate on torch.Tensor objects and can be kept on GPU throughout the
forward pass.
Why not use sklearn?
Scikit-learn scalers require a CPU round-trip and cannot participate in the autograd graph. These subclasses keep scaling arithmetic on whichever device the model is running on (CUDA or CPU), avoiding expensive device transfers during inference.
Scalers in the ISEFlow pipeline
The pretrained ISEFlow models ship a scaler_X.pkl (sklearn) for input
features and a scaler_y.pkl (sklearn) for the SLE output target. These
are sklearn scalers used inside ise.data.feature_engineer.scale_data
and ISEFlow.predict().
The PyTorch scalers in this module are used during model training when GPU-resident tensors must be transformed inside the training loop without leaving the GPU:
from ise.data.scaler import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train_tensor) # computes mean/std on GPU
X_scaled = scaler.transform(X_train_tensor)
X_orig = scaler.inverse_transform(X_scaled)
scaler.save("scaler.pt")
scaler_loaded = StandardScaler.load("scaler.pt")
Scaler summary
- StandardScaler:
(x - mean) / std. The standard deviation of zero-variance columns is replaced with a small epsilon to prevent division by zero.
- RobustScaler:
(x - median) / IQR. More resistant to outliers than StandardScaler.
- LogScaler:
log(x - min + epsilon). Useful for strictly positive, right-skewed targets. A shift is computed from the training-set minimum so that all values remain positive before taking the log.
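The LogScaler formula and its inverse can be checked with plain Python arithmetic. The sketch below mirrors only the documented expression log(x - min + epsilon); the function names `log_scale` and `log_unscale` are hypothetical, and the inverse exp(y) + min - epsilon is derived from that expression rather than taken from the library source.

```python
import math

def log_scale(values, epsilon=1e-8):
    """Shift by the minimum so every argument of the log is at least epsilon."""
    min_value = min(values)
    scaled = [math.log(v - min_value + epsilon) for v in values]
    return scaled, min_value

def log_unscale(scaled, min_value, epsilon=1e-8):
    """Invert log(x - min + epsilon): x = exp(y) + min - epsilon."""
    return [math.exp(s) + min_value - epsilon for s in scaled]

raw = [-2.0, 0.0, 5.0]
scaled, fitted_min = log_scale(raw)
recovered = log_unscale(scaled, fitted_min)
```

The minimum found while fitting has to be stored alongside epsilon, since the inverse cannot be computed without it.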
- class ise.data.scaler.LogScaler(epsilon=1e-08)[source]
Bases: Module
A class for scaling input data using a logarithmic transformation, ensuring all values are positive by applying a shift.
- Parameters:
epsilon (float, optional) – A small constant to avoid log(0) errors. Defaults to 1e-8.
- epsilon
A small constant to avoid log(0) errors.
- Type:
float
- min_value
The minimum value in the dataset used for shifting.
- Type:
float
- device
The device (CPU or GPU) on which calculations are performed.
- Type:
torch.device
- fit(X)[source]
Computes the minimum value in the dataset to ensure all values remain positive during transformation.
- Parameters:
X (torch.Tensor) – The input data to be scaled.
- inverse_transform(X)[source]
Reverses the log transformation to recover the original scale of the data.
- Parameters:
X (torch.Tensor) – The log-transformed input data.
- Returns:
The transformed input data in its original scale.
- Return type:
torch.Tensor
- static load(path)[source]
Load a LogScaler from disk.
- Parameters:
path (str) – Path to a checkpoint produced by LogScaler.save().
- Returns:
A scaler with epsilon and min_value restored.
- Return type:
- class ise.data.scaler.RobustScaler[source]
Bases:
Module
A class for scaling input data using the median and interquartile range (IQR), making it robust to outliers.
- Parameters:
nn.Module – The base class for all neural network modules in PyTorch.
- median_
The median values of the input data.
- Type:
torch.Tensor
- iqr_
The interquartile range (IQR) values of the input data.
- Type:
torch.Tensor
- device
The device (CPU or GPU) on which the calculations are performed.
- Type:
torch.device
- fit(X)[source]
Computes the median and interquartile range (IQR) of the input data.
- Parameters:
X (torch.Tensor) – The input data to be scaled.
- inverse_transform(X)[source]
Reverses the scaling operation on the input data.
- Parameters:
X (torch.Tensor) – The scaled input data to be transformed back.
- Returns:
The transformed input data.
- Return type:
torch.Tensor
- Raises:
RuntimeError – If the RobustScaler instance is not fitted yet.
- static load(path)[source]
Load a RobustScaler from disk.
- Parameters:
path (str) – Path to a checkpoint produced by RobustScaler.save().
- Returns:
A scaler with median_ and iqr_ restored.
- Return type:
- class ise.data.scaler.StandardScaler[source]
Bases:
Module
A class for scaling input data using mean and standard deviation.
- Parameters:
nn.Module – The base class for all neural network modules in PyTorch.
- mean_
The mean values of the input data.
- Type:
torch.Tensor
- scale_
The standard deviation values of the input data.
- Type:
torch.Tensor
- device
The device (CPU or GPU) on which the calculations are performed.
- Type:
torch.device
- fit(X)[source]
Computes the mean and standard deviation of the input data.
- Parameters:
X (torch.Tensor) – The input data to be scaled.
- inverse_transform(X)[source]
Reverses the scaling operation on the input data.
- Parameters:
X (torch.Tensor) – The scaled input data to be transformed back.
- Returns:
The transformed input data.
- Return type:
torch.Tensor
- Raises:
RuntimeError – If the Scaler instance is not fitted yet.
- static load(path)[source]
Loads the mean and standard deviation from a file.
- Parameters:
path (str) – The path to load the file from.
- Returns:
A Scaler instance with the loaded mean and standard deviation.
- Return type:
Scaler
ise.data.utils
Time coordinate normalisation for ISMIP6 xarray datasets.
ISMIP6 models encode the time dimension in a wide variety of formats:
cftime.DatetimeNoLeap, cftime.Datetime360Day, “days since” numeric
offsets, plain numpy.datetime64, or integer year labels. Before any
spatial or sector-level processing can be performed, all datasets must share
a uniform numpy.datetime64 time axis covering 2015-2100 (86 years).
This module exposes a single function, convert_and_subset_times, that
handles all known ISMIP6 time encodings and edge cases encountered in the
GHub dataset collection, including:
- cftime calendar types (NoLeap, 360-day) → pandas.DatetimeIndex
- Numeric “days since X” offsets → numpy.datetime64
- VUW PISM “seconds since 0001-01-01” offsets
- UAF every-5-years datasets (assume 2015-2100)
- Datasets with duplicate time stamps (de-duplicated by unique index)
- Datasets shorter than 86 years (padded with forward-fill)
- Datasets longer than 86 years (trimmed to the last 86 steps)
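Two of the cases listed above, numeric "days since" offsets and the pad/trim to 86 steps, can be illustrated with NumPy alone. This is a simplified sketch and not the logic inside convert_and_subset_times: both helper names are hypothetical, and the helpers work on plain arrays instead of xarray datasets.

```python
import numpy as np

N_YEARS = 86  # 2015-2100 inclusive

def days_since_to_datetime64(offsets, reference="2015-01-01"):
    """Convert numeric 'days since <reference>' offsets to numpy.datetime64[D]."""
    days = np.asarray(offsets).astype("int64").astype("timedelta64[D]")
    return np.datetime64(reference, "D") + days

def fit_to_86_steps(values):
    """Forward-fill short series; keep the last 86 steps of long ones."""
    values = np.asarray(values)
    if len(values) < N_YEARS:
        pad = np.repeat(values[-1], N_YEARS - len(values))
        return np.concatenate([values, pad])
    return values[-N_YEARS:]

times = days_since_to_datetime64([0, 365, 731])
short_series = fit_to_86_steps(np.arange(80))    # padded with the last value
long_series = fit_to_86_steps(np.arange(100))    # trimmed to the final 86 steps
```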
Usage
import xarray as xr
from ise.data.utils import convert_and_subset_times
ds = xr.open_dataset("lithk_AIS_NCAR_CISM_exp01.nc", decode_times=False)
ds = convert_and_subset_times(ds)
# ds.time is now numpy.datetime64 with 86 annual steps from 2015 to 2100
This function is called internally by ForcingFile.format_timestamps(),
ProjectionProcessor._calculate_ivaf_single_file(), and the sector
aggregation functions in ise.data.process.
- ise.data.utils.convert_and_subset_times(dataset)[source]
Converts time variables in an xarray dataset to a uniform format and subsets time to the range 2015-2100.
- Parameters:
dataset (xarray.Dataset) – The dataset with time values to be converted and subset.
- Returns:
The dataset with standardized time format and subset to the correct time range.
- Return type:
xarray.Dataset
- Raises:
ValueError – If time values are not in a recognizable format.
Module contents
Data loading, processing, and utilities for ice sheet emulation.
This package provides:
- ForcingFile: load and process climate forcing NetCDF data.
- GridFile: load and format sector grid definitions.
- ISEFlowAISInputs, ISEFlowGrISInputs: input dataclasses for ISEFlow predictions.
- AnomalyConverter: convert raw absolute forcing arrays to anomalies using bundled ISMIP6 climatologies; used internally by from_absolute_forcings() on the input dataclasses.
- feature_engineer: FeatureEngineer and helpers for scaling, splitting, and lag variables.
- dataclasses: EmulatorDataset, PyTorchDataset, TSDataset, ScenarioDataset.
- process: ProjectionProcessor and sector-level forcing/projection processing.
- scaler: PyTorch-based StandardScaler, RobustScaler, LogScaler.
- utils: time conversion and subsetting for xarray datasets.
- class ise.data.AnomalyConverter(ice_sheet: str)[source]
Bases:
object
Convert raw absolute forcing arrays to anomalies using ISMIP6 climatologies.
- Parameters:
ice_sheet (str) – 'AIS' or 'GrIS'.
- ice_sheet
- Type:
str
- climatology
The loaded climatology table for the selected ice sheet.
- Type:
pd.DataFrame
- property climatology: DataFrame
Return the climatology DataFrame, loading it on first access.
- compute_ais(sector: int, pr: ndarray, evspsbl: ndarray, smb: ndarray, ts: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, mrro: ndarray | None = None) dict[source]
Compute AIS atmospheric anomalies from raw annual time-series arrays.
Subtracts the 1995-2014 ISMIP6 climatological baseline for the given AOGCM and sector from each raw input array. All anomaly outputs retain the same units as the corresponding inputs.
Exactly one of aogcm (use bundled ISMIP6 climatology) or custom_climatology (user-supplied baseline scalars) must be provided.
- Parameters:
sector (int) – AIS drainage sector number (1-18).
pr (np.ndarray) – Raw precipitation time series (86 values, kg m⁻² s⁻¹).
evspsbl (np.ndarray) – Raw evaporation/sublimation time series (86 values, kg m⁻² s⁻¹).
smb (np.ndarray) – Raw surface mass balance time series (86 values, kg m⁻² s⁻¹).
ts (np.ndarray) – Raw surface temperature time series (86 values, K).
aogcm (str, optional) – AOGCM name to look up in the bundled climatology. Common alternate spellings are normalised automatically (e.g. 'NorESM1-M_rcp8.5' → 'noresm1-m_rcp85').
custom_climatology (dict, optional) – User-supplied 1995-2014 absolute baseline means for a CMIP model not in ISMIP6. Must contain keys 'pr' (kg m⁻² s⁻¹), 'evspsbl' (kg m⁻² s⁻¹), 'smb' (kg m⁻² s⁻¹), 'ts' (K), and optionally 'mrro' (kg m⁻² s⁻¹) if mrro is provided.
mrro (np.ndarray, optional) – Raw runoff time series (86 values, kg m⁻² s⁻¹). Required only for ISEFlow v1.0.0; not used by v1.1.0.
- Returns:
Keys 'pr_anomaly', 'evspsbl_anomaly', 'smb_anomaly', 'ts_anomaly' as 86-element numpy arrays. Units match the inputs: kg m⁻² s⁻¹ for pr / evspsbl / smb, K for ts. 'mrro_anomaly' (kg m⁻² s⁻¹) is included when mrro is provided and a baseline is available for the requested AOGCM.
- Return type:
dict
- Raises:
ValueError – If neither or both of aogcm / custom_climatology are given, or if array lengths are not 86.
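The behaviour documented for compute_ais (exactly one baseline source, 86-value arrays, anomaly = raw series minus scalar baseline) can be sketched as follows. The function, the `bundled` lookup table and the raw forcing values are hypothetical illustrations; the baseline scalars reuse the custom-climatology example given for ISEFlowAISInputs.from_absolute_forcings.

```python
import numpy as np

def ais_anomalies_sketch(raw, aogcm=None, custom_climatology=None, bundled=None):
    """Subtract scalar baseline means from raw 86-year series (illustrative only)."""
    # Exactly one of the two baseline sources must be given.
    if (aogcm is None) == (custom_climatology is None):
        raise ValueError("provide exactly one of aogcm or custom_climatology")
    baseline = custom_climatology if aogcm is None else (bundled or {})[aogcm]
    out = {}
    for name, series in raw.items():
        series = np.asarray(series, dtype=float)
        if series.shape != (86,):
            raise ValueError(f"{name} must have 86 values, got shape {series.shape}")
        out[f"{name}_anomaly"] = series - baseline[name]  # units unchanged
    return out

raw = {  # made-up raw forcings, for illustration only
    "pr": np.full(86, 1.5e-5),            # kg m-2 s-1
    "evspsbl": np.full(86, 4.0e-6),       # kg m-2 s-1
    "smb": np.full(86, 1.0e-5),           # kg m-2 s-1
    "ts": np.linspace(254.0, 258.0, 86),  # K
}
baseline = {"pr": 1.3e-5, "evspsbl": 3.8e-6, "smb": 9.0e-6, "ts": 253.7}
anoms = ais_anomalies_sketch(raw, custom_climatology=baseline)
```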
- compute_gris(sector: int, smb: ndarray, st: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None) dict[source]
Compute GrIS atmospheric anomalies from raw annual time-series arrays.
Subtracts the 1960-1989 MAR long-term mean for the given AOGCM and sector from each raw input array, then converts the SMB anomaly from mm w.e. yr⁻¹ to kg m⁻² s⁻¹ to match the units used in the ISMIP6 aSMB forcing files and in the ISEFlow training data.
Exactly one of aogcm (use bundled ISMIP6 climatology) or custom_climatology (user-supplied baseline scalars) must be provided.
- Parameters:
sector (int) – GrIS drainage basin number (1-6).
smb (np.ndarray) – Raw (absolute) surface mass balance time series (86 values, mm w.e. yr⁻¹, matching the MAR 3.9 Reference file convention). Typical range: −2000 to +200 mm w.e. yr⁻¹ depending on sector. The output aSMB is automatically converted to kg m⁻² s⁻¹.
st (np.ndarray) – Raw (absolute) surface temperature time series (86 values, °C, matching the MAR 3.9 Reference file convention).
aogcm (str, optional) – AOGCM name to look up in the bundled climatology. Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – User-supplied 1960-1989 MAR absolute baseline means for a CMIP model not in ISMIP6. Must contain keys 'smb' (mm w.e. yr⁻¹) and 'st' (°C).
- Returns:
{'aSMB': ..., 'aST': ...} as 86-element numpy arrays. aSMB: SMB anomaly in kg m⁻² s⁻¹, matching the units of the ISMIP6 aSMB forcing files and the ISEFlow training data. aST: surface temperature anomaly in °C.
Variable names match ISEFlowGrISInputs field names.
- Return type:
dict
- Raises:
ValueError – If neither or both of aogcm / custom_climatology are given, or if array lengths are not 86.
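The unit conversion described for compute_gris can be worked through in a few lines. One millimetre of water equivalent over one square metre has a mass of 1 kg, so only the time unit needs converting. The sketch below assumes a 365-day year for that step; the constant the library actually uses is not stated here and may differ. The function name and the raw input values are hypothetical, and the baseline scalars reuse the custom-climatology example from ISEFlowGrISInputs.from_absolute_forcings.

```python
import numpy as np

# Assumption: 365-day year. The library's own constant may differ.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def gris_anomalies_sketch(smb, st, baseline):
    """Return aSMB (kg m-2 s-1) and aST (degC) from raw MAR-style inputs."""
    smb = np.asarray(smb, dtype=float)
    st = np.asarray(st, dtype=float)
    asmb_mm_per_yr = smb - baseline["smb"]       # anomaly in mm w.e. per year
    asmb = asmb_mm_per_yr / SECONDS_PER_YEAR     # 1 mm w.e. = 1 kg m-2
    ast = st - baseline["st"]                    # stays in degC
    return {"aSMB": asmb, "aST": ast}

baseline = {"smb": -241.2, "st": -22.8}
out = gris_anomalies_sketch(np.full(86, -341.2), np.full(86, -20.8), baseline)
```

With these made-up inputs the SMB anomaly is -100 mm w.e. yr⁻¹, which becomes roughly -3.17e-6 kg m⁻² s⁻¹ under the 365-day assumption.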
- get_climatology(aogcm: str, sector: int) dict[source]
Return the climatological mean values for a given AOGCM and sector.
- Parameters:
aogcm (str) – Canonical AOGCM name (see list_aogcms()). Common alternate spellings are normalised automatically.
sector (int) – Sector / drainage basin number.
- Returns:
Variable name → scalar climatological mean for the baseline period. AIS units: kg m⁻² s⁻¹ (pr / evspsbl / smb / mrro), K (ts). GrIS units: mm w.e. yr⁻¹ (smb), °C (st).
- Return type:
dict
- Raises:
KeyError – If aogcm is not found in the bundled climatology.
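The name normalisation mentioned for the aogcm argument is only documented through one example ('NorESM1-M_rcp8.5' → 'noresm1-m_rcp85'). A minimal sketch that reproduces that single example is shown below. The rules (lower-casing and dropping the dot in the RCP label), both function names and the one-entry table are assumptions for illustration; the library may normalise more spellings or do so differently.

```python
import re

def normalise_aogcm_sketch(name: str) -> str:
    """Lower-case the name and drop the dot in 'rcpX.Y' (hypothetical rules)."""
    name = name.strip().lower()
    return re.sub(r"rcp(\d)\.(\d)", r"rcp\1\2", name)

# Made-up one-entry table keyed by (canonical name, sector).
climatology = {("noresm1-m_rcp85", 10): {"ts": 253.7}}

def get_climatology_sketch(aogcm: str, sector: int) -> dict:
    """Look up baseline means, raising KeyError for unknown entries."""
    key = (normalise_aogcm_sketch(aogcm), sector)
    if key not in climatology:
        raise KeyError(f"{aogcm!r} (sector {sector}) not in climatology")
    return climatology[key]

baseline = get_climatology_sketch("NorESM1-M_rcp8.5", 10)
```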
- class ise.data.ForcingFile(ice_sheet: str, realm: str, filepath: str, varname: str | None = None)[source]
Bases:
object
Wrapper for loading and processing climate forcing NetCDF files.
Supports atmospheric and oceanic realms, sector assignment, depth aggregation (ocean), and sector-averaged time series.
- Parameters:
ice_sheet (str) – Ice sheet identifier (‘AIS’ or ‘GrIS’).
realm (str) – Forcing realm (‘atmos’ or ‘ocean’).
filepath (str) – Path to the NetCDF forcing file.
varname (str, optional) – Name of the data variable. Defaults to None (first data var).
- ice_sheet
Ice sheet identifier.
- Type:
str
- realm
Forcing realm.
- Type:
str
- filepath
Path to the file.
- Type:
str
- data
Loaded dataset after load().
- Type:
xarray.Dataset or None
- sector_averages
Sector-averaged data after average_over_sector().
- Type:
xarray.Dataset or None
- sectors
Sector IDs after assign_sectors().
- Type:
numpy.ndarray or None
- varname
Data variable name.
- Type:
str or None
- aggregate_depth(method='mean')[source]
Aggregate over the depth dimension (ocean realm only).
- Parameters:
method (str) – ‘mean’ or ‘sum’. Defaults to ‘mean’.
- Returns:
The dataset with depth aggregated.
- Return type:
xarray.Dataset
- Raises:
ValueError – If realm is not ‘ocean’, data not loaded, or no ‘z’ dimension.
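The reduction that aggregate_depth describes can be sketched on a plain NumPy array. This is not the method's implementation, which operates on an xarray.Dataset: the function name and the explicit `dims` tuple are hypothetical, and NaN handling is left out.

```python
import numpy as np

def aggregate_depth_sketch(values, dims, method="mean"):
    """Reduce the 'z' axis of an array whose axes are named by `dims`."""
    if "z" not in dims:
        raise ValueError("no 'z' dimension to aggregate over")
    if method not in ("mean", "sum"):
        raise ValueError("method must be 'mean' or 'sum'")
    axis = list(dims).index("z")
    reducer = np.mean if method == "mean" else np.sum
    return reducer(values, axis=axis)

ocean = np.arange(24, dtype=float).reshape(2, 3, 2, 2)  # (time, z, x, y)
dims = ("time", "z", "x", "y")
depth_mean = aggregate_depth_sketch(ocean, dims)
depth_sum = aggregate_depth_sketch(ocean, dims, method="sum")
```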
- assign_sectors(sectors: ndarray | GridFile) Dataset[source]
Assign sector IDs to the dataset (e.g. from a GridFile).
- Parameters:
sectors (numpy.ndarray or GridFile) – Sector IDs or GridFile to get sectors from.
- Returns:
The dataset with sector coordinate.
- Return type:
xarray.Dataset
- Raises:
ValueError – If data is not loaded.
- average_over_sector(sector_number: int | None = None) Dataset[source]
Average data over grid cells within a sector (or all sectors).
- Parameters:
sector_number (int, optional) – Sector ID. If None, must be pre-averaged. Defaults to None.
- Returns:
Sector-averaged data.
- Return type:
xarray.Dataset
- Raises:
ValueError – If data not loaded or sectors not assigned.
NotImplementedError – If sector_number is None (averaging all sectors at once).
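Sector averaging as described for average_over_sector amounts to a masked mean over grid cells at each time step. The sketch below shows that idea with NumPy only; the function name and the tiny 2×3 grid are made up, and the real method works on an xarray.Dataset with an assigned sector coordinate.

```python
import numpy as np

def sector_mean_sketch(field, sectors, sector_number):
    """Mean over cells where sectors == sector_number, one value per time step.

    field: (time, x, y) array; sectors: (x, y) array of sector IDs.
    """
    mask = sectors == sector_number
    if not mask.any():
        raise ValueError(f"sector {sector_number} not present in grid")
    # Boolean indexing flattens the two spatial axes into one axis of selected cells.
    return field[:, mask].mean(axis=1)

field = np.arange(12, dtype=float).reshape(2, 2, 3)   # 2 time steps on a 2x3 grid
sectors = np.array([[1, 1, 2], [2, 2, 2]])
series = sector_mean_sketch(field, sectors, 1)
```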
- drop_vars(vars: list[str]) Dataset[source]
Drop dimensions or variables from the loaded dataset.
- Parameters:
vars (List[str]) – Names of dimensions or variables to drop.
- Returns:
The dataset (modified in place).
- Return type:
xarray.Dataset
- format_timestamps() Dataset[source]
Convert and subset time coordinate to 2015-2100 (86 years).
- Returns:
The dataset with formatted time.
- Return type:
xarray.Dataset
- load(filepath: str | None = None, validate=True, **kwargs) Dataset[source]
Load the forcing dataset from the NetCDF file.
- Parameters:
filepath (str, optional) – Override path. Defaults to self.filepath.
validate (bool, optional) – Whether to validate (non-NaN data). Defaults to True.
**kwargs – Passed to xarray.open_dataset.
- Returns:
The loaded dataset.
- Return type:
xarray.Dataset
- class ise.data.GridFile(ice_sheet: str, filepath: str)[source]
Bases:
object
Wrapper for loading and formatting sector grid NetCDF files.
Used to load sector IDs and optionally expand/align dimensions for compatibility with forcing data (e.g. time dimension of length 86).
- Parameters:
ice_sheet (str) – Ice sheet identifier (‘AIS’ or ‘GrIS’).
filepath (str) – Path to the grid NetCDF file.
- ice_sheet
Ice sheet identifier.
- Type:
str
- filepath
Path to the file.
- Type:
str
- data
Loaded dataset after load().
- Type:
xarray.Dataset or None
- sector_variable_name
Name of the sector variable (‘sectors’ for AIS, ‘ID’ for GrIS).
- Type:
str
- align_dims(dims: list | None = None) Dataset[source]
Transpose dimensions to a standard order.
- Parameters:
dims (list, optional) – Dimension order. If None, uses (‘time’, ‘x’, ‘y’, …).
- Returns:
The dataset with reordered dimensions.
- Return type:
xarray.Dataset
- expand_dims(dim: str = 'time', size: int | None = None) Dataset[source]
Expand dimensions (e.g. add time dimension of given size).
- Parameters:
dim (str, optional) – Dimension name. Defaults to ‘time’.
size (int, optional) – Size of the new dimension. Defaults to None.
- Returns:
The dataset with expanded dimension.
- Return type:
xarray.Dataset
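The two GridFile operations, reordering dimensions and adding a time dimension of length 86, have direct NumPy analogues. The sketch below is an illustration on bare arrays and not the class's xarray-based code; `align_dims_sketch` and the sample shapes are hypothetical.

```python
import numpy as np

def align_dims_sketch(values, dims, order=("time", "x", "y")):
    """Transpose `values` so that its axes follow `order`."""
    axes = [list(dims).index(name) for name in order]
    return np.transpose(values, axes)

sectors_2d = np.array([[1, 1, 2], [2, 3, 3]])            # (x, y) sector IDs
sectors_3d = np.broadcast_to(sectors_2d, (86, 2, 3))     # add a time axis of length 86
forcing_yxt = np.zeros((3, 2, 86))                       # stored as (y, x, time)
forcing_txy = align_dims_sketch(forcing_yxt, ("y", "x", "time"))
```

After both steps the sector grid and the forcing array share the shape (86, 2, 3), so they can be compared cell by cell.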
- class ise.data.ISEFlowAISInputs(year: ndarray, sector: ndarray | int, pr_anomaly: ndarray, evspsbl_anomaly: ndarray, smb_anomaly: ndarray, ts_anomaly: ndarray, ocean_thermal_forcing: ndarray, ocean_salinity: ndarray, ocean_temperature: ndarray, ice_shelf_fracture: bool, ocean_sensitivity: str, mrro_anomaly: ndarray | None = None, initial_year: int | None = None, numerics: str | None = None, stress_balance: str | None = None, resolution: str | None = None, init_method: str | None = None, melt_in_floating_cells: str | None = None, icefront_migration: str | None = None, ocean_forcing_type: str | None = None, open_melt_type: str | None = None, standard_melt_type: str | None = None, model_configs: str | None = None, version: str = 'v1.1.0', override_params: dict | None = None)[source]
Bases:
object
Inputs for an ISEFlow-AIS prediction.
Expects pre-computed anomaly arrays (pr_anomaly, evspsbl_anomaly, smb_anomaly, ts_anomaly). If you have raw absolute forcing values instead, use the alternative constructor:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=..., sector=...,
    pr=..., evspsbl=..., smb=..., ts=...,
    ocean_thermal_forcing=..., ocean_salinity=..., ocean_temperature=...,
    aogcm="noresm1-m_rcp85",  # or custom_climatology={...}
    **ism_config_kwargs,
)
from_absolute_forcings() subtracts the ISMIP6 1995-2014 climatological baseline automatically. Pass aogcm for a bundled ISMIP6 model or custom_climatology (dict with keys 'pr', 'evspsbl', 'smb', 'ts') for a CMIP model not in the bundled climatology.
- evspsbl_anomaly: ndarray
- classmethod from_absolute_forcings(year: ndarray, sector: int, pr: ndarray, evspsbl: ndarray, smb: ndarray, ts: ndarray, ocean_thermal_forcing: ndarray, ocean_salinity: ndarray, ocean_temperature: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, mrro: ndarray | None = None, **kwargs) ISEFlowAISInputs[source]
Construct ISEFlowAISInputs from raw (non-anomaly) atmospheric forcings.
Subtracts the ISMIP6 1995-2014 climatological baseline from each atmospheric variable to produce the anomaly arrays required by the model. Ocean variables (ocean_thermal_forcing, ocean_salinity, ocean_temperature) are absolute values and are passed through unchanged.
Exactly one of aogcm or custom_climatology must be provided.
- Parameters:
year (np.ndarray) – Years corresponding to the time series (86 values, 2015-2100).
sector (int) – AIS drainage sector (1-18).
pr (np.ndarray) – Raw precipitation (86 values, kg m⁻² s⁻¹).
evspsbl (np.ndarray) – Raw evaporation / sublimation (86 values, kg m⁻² s⁻¹).
smb (np.ndarray) – Raw surface mass balance (86 values, kg m⁻² s⁻¹).
ts (np.ndarray) – Raw surface temperature (86 values, K).
ocean_thermal_forcing (np.ndarray) – Ocean thermal forcing (86 values, °C). Passed through unchanged.
ocean_salinity (np.ndarray) – Ocean salinity (86 values, PSU). Passed through unchanged.
ocean_temperature (np.ndarray) – Ocean temperature (86 values, °C). Passed through unchanged.
aogcm (str, optional) – AOGCM name to look up in the bundled ISMIP6 climatology (e.g. 'noresm1-m_rcp85'). Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – Baseline means for a CMIP model not in the bundled climatology. Must contain keys 'pr', 'evspsbl', 'smb', 'ts' (and 'mrro' if mrro is also provided). Values should be in the same units as the raw input arrays.
mrro (np.ndarray, optional) – Raw runoff (86 values). Only needed for ISEFlow v1.0.0.
**kwargs – All remaining keyword arguments are forwarded to ISEFlowAISInputs.__init__ (e.g. ISM config fields such as numerics, stress_balance, model_configs, etc.).
- Returns:
Fully validated inputs object ready for model.predict().
- Return type:
Examples
Using a bundled ISMIP6 climatology:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=10,
    pr=pr_array,
    evspsbl=evspsbl_array,
    smb=smb_array,
    ts=ts_array,
    ocean_thermal_forcing=otf_array,
    ocean_salinity=sal_array,
    ocean_temperature=temp_array,
    aogcm="noresm1-m_rcp85",
    numerics="fd",
    stress_balance="hybrid",
    resolution="8",
    init_method="eq",
    initial_year=2005,
    melt_in_floating_cells="sub-grid",
    icefront_migration="str",
    ocean_forcing_type="open",
    ocean_sensitivity="medium",
    ice_shelf_fracture=False,
    open_melt_type="quad",
    standard_melt_type="nonlocal",
)
Using a custom climatology for a new CMIP model:
inputs = ISEFlowAISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=10,
    pr=pr_array,
    evspsbl=evspsbl_array,
    smb=smb_array,
    ts=ts_array,
    ocean_thermal_forcing=otf_array,
    ocean_salinity=sal_array,
    ocean_temperature=temp_array,
    custom_climatology={
        "pr": 1.3e-5,
        "evspsbl": 3.8e-6,
        "smb": 9.0e-6,
        "ts": 253.7,
    },
    numerics="fd",
    ...
)
- classmethod from_raw_values(*args, **kwargs)[source]
Deprecated: use from_absolute_forcings instead.
- ice_shelf_fracture: bool
- icefront_migration: str | None = None
- init_method: str | None = None
- initial_year: int | None = None
- melt_in_floating_cells: str | None = None
- model_configs: str | None = None
- mrro_anomaly: ndarray | None = None
- numerics: str | None = None
- ocean_forcing_type: str | None = None
- ocean_salinity: ndarray
- ocean_sensitivity: str
- ocean_temperature: ndarray
- ocean_thermal_forcing: ndarray
- open_melt_type: str | None = None
- override_params: dict | None = None
- pr_anomaly: ndarray
- resolution: str | None = None
- sector: ndarray | int
- smb_anomaly: ndarray
- standard_melt_type: str | None = None
- stress_balance: str | None = None
- to_df()[source]
Convert the dataclass fields to a pandas DataFrame.
- Returns:
One row per timestep (86 rows) with all forcing and configuration columns needed by ISEFlow_AIS.process().
- Return type:
pandas.DataFrame
- ts_anomaly: ndarray
- version: str = 'v1.1.0'
- year: ndarray
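The layout that to_df() describes, one row per timestep with scalar configuration values repeated on every row, can be pictured with a small pandas example. The column selection below is an assumption that borrows a few field names from the class signature; the real method may emit different or additional columns, and the forcing values are placeholders.

```python
import numpy as np
import pandas as pd

year = np.arange(2015, 2101)            # 86 annual steps
df = pd.DataFrame(
    {
        "year": year,
        "sector": 10,                   # scalar, repeated on every row
        "ts_anomaly": np.zeros(86),     # placeholder forcing values
        "numerics": "fd",               # scalar config, repeated on every row
        "ice_shelf_fracture": False,
    }
)
```

pandas repeats the scalar entries to match the length of the array columns, which yields the 86-row table.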
- class ise.data.ISEFlowGrISInputs(year: ndarray, sector: ndarray | int, aST: ndarray, aSMB: ndarray, ocean_thermal_forcing: ndarray, basin_runoff: ndarray, ice_shelf_fracture: bool, ocean_sensitivity: str, standard_ocean_forcing: bool, initial_year: int | None = None, numerics: str | None = None, ice_flow_model: str | None = None, initialization: str | None = None, initial_smb: str | None = None, velocity: str | None = None, bedrock_topography: str | None = None, surface_thickness: str | None = None, geothermal_heat_flux: str | None = None, res_min: str | None = None, res_max: str | None = None, model_configs: str | None = None, version: str = 'v1.1.0')[source]
Bases:
object
Inputs for an ISEFlow-GrIS prediction.
Expects pre-computed anomaly arrays (aSMB, aST). If you have raw absolute forcing values instead, use the alternative constructor:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=..., sector=...,
    smb=..., st=...,
    ocean_thermal_forcing=..., basin_runoff=...,
    aogcm="hadgem2-es_rcp85",  # or custom_climatology={...}
    **ism_config_kwargs,
)
from_absolute_forcings() subtracts the ISMIP6 1960-1989 MAR climatological baseline automatically. Pass aogcm for a bundled ISMIP6 model or custom_climatology (dict with keys 'smb', 'st') for a CMIP model not in the bundled climatology.
- aSMB: ndarray
- aST: ndarray
- basin_runoff: ndarray
- bedrock_topography: str | None = None
- classmethod from_absolute_forcings(year: ndarray, sector: int, smb: ndarray, st: ndarray, ocean_thermal_forcing: ndarray, basin_runoff: ndarray, aogcm: str | None = None, custom_climatology: dict | None = None, **kwargs) ISEFlowGrISInputs[source]
Construct ISEFlowGrISInputs from raw (non-anomaly) atmospheric forcings.
Subtracts the ISMIP6 1960-1989 MAR climatological baseline from each atmospheric variable to produce the anomaly arrays (aSMB, aST) required by the model. Ocean variables (ocean_thermal_forcing, basin_runoff) are absolute values and are passed through unchanged.
Exactly one of aogcm or custom_climatology must be provided.
- Parameters:
year (np.ndarray) – Years (86 values, 2015-2100).
sector (int) – GrIS drainage basin number (1-6).
smb (np.ndarray) – Raw surface mass balance (86 values, mm w.e. yr⁻¹, matching the MAR Reference file units used in the bundled climatology CSV). The anomaly conversion automatically converts to kg m⁻² s⁻¹.
st (np.ndarray) – Raw surface temperature (86 values, °C, matching the MAR Reference file units used in the bundled climatology CSV).
ocean_thermal_forcing (np.ndarray) – Ocean thermal forcing (86 values). Passed through unchanged.
basin_runoff (np.ndarray) – Basin-integrated runoff (86 values). Passed through unchanged.
aogcm (str, optional) – AOGCM name to look up in the bundled ISMIP6 climatology (e.g. 'hadgem2-es_rcp85'). Common alternate spellings are normalised automatically.
custom_climatology (dict, optional) – Baseline means for a CMIP model not in the bundled climatology. Must contain keys 'smb' and 'st' in MAR units.
**kwargs – All remaining keyword arguments are forwarded to ISEFlowGrISInputs.__init__ (e.g. ISM config fields such as numerics, ice_flow_model, model_configs, etc.).
- Returns:
Fully validated inputs object ready for model.predict().
- Return type:
Examples
Using a bundled ISMIP6 climatology:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=1,
    smb=smb_array,
    st=st_array,
    ocean_thermal_forcing=otf_array,
    basin_runoff=runoff_array,
    aogcm="hadgem2-es_rcp85",
    initial_year=1990,
    numerics="fe",
    ice_flow_model="ho",
    initialization="dav",
    initial_smb="ra3",
    velocity="joughin",
    bedrock_topography="morlighem",
    surface_thickness="None",
    geothermal_heat_flux="g",
    res_min=1.0,
    res_max=7.5,
    standard_ocean_forcing=True,
    ocean_sensitivity="medium",
    ice_shelf_fracture=False,
)
Using a custom climatology for a new CMIP model:
inputs = ISEFlowGrISInputs.from_absolute_forcings(
    year=np.arange(2015, 2101),
    sector=1,
    smb=smb_array,
    st=st_array,
    ocean_thermal_forcing=otf_array,
    basin_runoff=runoff_array,
    custom_climatology={"smb": -241.2, "st": -22.8},
    initial_year=1990,
    ...
)
- classmethod from_raw_values(*args, **kwargs)[source]
Deprecated: use from_absolute_forcings instead.
- geothermal_heat_flux: str | None = None
- ice_flow_model: str | None = None
- ice_shelf_fracture: bool
- initial_smb: str | None = None
- initial_year: int | None = None
- initialization: str | None = None
- model_configs: str | None = None
- numerics: str | None = None
- ocean_sensitivity: str
- ocean_thermal_forcing: ndarray
- res_max: str | None = None
- res_min: str | None = None
- sector: ndarray | int
- standard_ocean_forcing: bool
- surface_thickness: str | None = None
- to_df()[source]
Convert the dataclass fields to a pandas DataFrame.
- Returns:
One row per timestep (86 rows) with all forcing and configuration columns needed by ISEFlow_GrIS.process().
- Return type:
pandas.DataFrame
- velocity: str | None = None
- version: str = 'v1.1.0'
- year: ndarray