output

Functions for finalising and writing output datasets.

gridstats.output

Output writers and dataset finalisation for gridstats.

finalise(dsout: xr.Dataset, source_ds: xr.Dataset, chunks: dict[str, int] = {}, metadata: dict[str, Any] = {}, global_attrs: dict[str, Any] = {}) -> xr.Dataset

Sort, chunk, transpose, and annotate the output dataset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dsout | Dataset | Dataset to finalise. | required |
| source_ds | Dataset | Original source dataset for global attribute extraction. | required |
| chunks | dict[str, int] | Output chunking specification. | {} |
| metadata | dict[str, Any] | Extra metadata to merge into variable attributes. | {} |
| global_attrs | dict[str, Any] | Extra or override global dataset attributes. | {} |

Returns:

| Type | Description |
| --- | --- |
| Dataset | Finalised dataset ready for writing. |

write(dsout: xr.Dataset, path: str, append: bool = False, consolidate: bool = False, **kwargs) -> None

Dispatch to write_netcdf or write_zarr based on the file extension.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dsout | Dataset | Dataset to write. | required |
| path | str | Output path. Must end in '.nc' or '.zarr'. | required |
| append | bool | Passed to write_zarr (ignored for NetCDF). | False |
| consolidate | bool | Passed to write_zarr (ignored for NetCDF). | False |
| **kwargs | | Forwarded to the chosen writer. | {} |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If the extension is not '.nc' or '.zarr'. |
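
The dispatch rule described above can be sketched in plain Python. This is a minimal illustration, not the gridstats implementation: `pick_writer` and `writer_kwargs` are hypothetical helpers, and matching on the end of the path string is an assumption about how the extension is checked.

```python
def pick_writer(path: str) -> str:
    """Return the name of the writer that matches the extension of ``path``."""
    if path.endswith(".nc"):
        return "write_netcdf"
    if path.endswith(".zarr"):
        return "write_zarr"
    raise ValueError(f"Unsupported extension for {path!r}: expected '.nc' or '.zarr'")


def writer_kwargs(path: str, append: bool = False, consolidate: bool = False, **kwargs) -> dict:
    """Build the keyword arguments the chosen writer would receive."""
    if pick_writer(path) == "write_zarr":
        # Only the Zarr writer understands append and consolidate.
        return {"append": append, "consolidate": consolidate, **kwargs}
    # NetCDF: append and consolidate are dropped, other kwargs pass through.
    return dict(kwargs)
```

Under these assumptions, `writer_kwargs("stats.nc", append=True)` yields an empty dict, which mirrors the documented behaviour that `append` and `consolidate` are ignored for NetCDF output.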

write_netcdf(dsout: xr.Dataset, path: str, fill_value: int = _FILLVALUE_NC, format: str = 'NETCDF4') -> None

Write the dataset to a NetCDF file with zlib compression.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dsout | Dataset | Dataset to write. | required |
| path | str | Output file path. | required |
| fill_value | int | Fill value for all data variables. | _FILLVALUE_NC |
| format | str | NetCDF format string. | 'NETCDF4' |
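
With xarray, zlib compression and fill values are normally requested through a per-variable `encoding` mapping passed to `Dataset.to_netcdf`. The sketch below builds such a mapping with the standard library only. The helper `build_netcdf_encoding`, the fill value, the compression level, and the variable names are all hypothetical; the actual value of `_FILLVALUE_NC` and the encoding keys gridstats sets are not stated on this page.

```python
from typing import Any

# Hypothetical stand-in: the real value of _FILLVALUE_NC is not given here.
EXAMPLE_FILL_VALUE = -32767


def build_netcdf_encoding(
    var_names: list[str],
    fill_value: int = EXAMPLE_FILL_VALUE,
    complevel: int = 4,
) -> dict[str, dict[str, Any]]:
    """Return a per-variable encoding dict enabling zlib and a shared fill value."""
    return {
        name: {"zlib": True, "complevel": complevel, "_FillValue": fill_value}
        for name in var_names
    }


# Hypothetical data variable names, for illustration only.
encoding = build_netcdf_encoding(["hs_mean", "hs_max"])

# With xarray installed, the mapping would be passed along these lines:
#   dsout.to_netcdf(path, format="NETCDF4", encoding=encoding)
```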

write_zarr(dsout: xr.Dataset, path: str, fill_value: int = int(_FILLVALUE_ZARR), append: bool = False, consolidate: bool = False, **kwargs) -> None

Write the dataset to a Zarr store.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dsout | Dataset | Dataset to write. | required |
| path | str | Output Zarr store path or URL. | required |
| fill_value | int | Fill value applied to all data variables. | int(_FILLVALUE_ZARR) |
| append | bool | If True, add variables to an existing store (creating it if needed) rather than overwriting. Variables that already exist are deleted and rewritten. Coordinates shared across parallel tasks are handled safely via zarr's require_dataset. Consolidated metadata is intentionally skipped so that parallel writers do not race on .zmetadata; use consolidate=True in a final dependent task to produce it. | False |
| consolidate | bool | If True, call zarr.consolidate_metadata after writing. Use this on the final task that depends on all parallel writers. | False |
| **kwargs | | Forwarded to Dataset.to_zarr. | {} |

set_variable_attributes(dsout: xr.Dataset, extra_metadata: dict[str, Any] = {}) -> xr.Dataset

Apply CF-convention attributes to all variables and coordinates.

Attribute definitions are loaded from attributes.yml. The extra_metadata dict can override or extend entries under the 'coords', 'data_vars', and 'stats' keys.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dsout | Dataset | Dataset to annotate. | required |
| extra_metadata | dict[str, Any] | Optional overrides keyed by 'coords', 'data_vars', 'stats'. | {} |

Returns:

| Type | Description |
| --- | --- |
| Dataset | The same dataset with attributes set in-place. |
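
How extra_metadata might combine with the definitions loaded from attributes.yml can be sketched as a two-level merge over the three documented keys. This is an assumption-laden illustration: `merge_metadata` is a hypothetical helper, the layout (section, then name, then attribute dict) is a guess at the structure of attributes.yml, and merging attribute by attribute rather than replacing whole entries is also an assumption.

```python
from copy import deepcopy
from typing import Any

SECTIONS = ("coords", "data_vars", "stats")


def merge_metadata(defaults: dict[str, Any], extra: dict[str, Any]) -> dict[str, Any]:
    """Overlay ``extra`` on ``defaults`` without mutating either input."""
    merged = deepcopy(defaults)
    for section in SECTIONS:
        target = merged.setdefault(section, {})
        for name, attrs in extra.get(section, {}).items():
            # Existing entries are updated key by key; new names are added.
            target.setdefault(name, {}).update(attrs)
    return merged


# Hypothetical defaults, standing in for the content of attributes.yml.
defaults = {
    "coords": {"latitude": {"standard_name": "latitude", "units": "degrees_north"}},
    "data_vars": {"hs": {"units": "m"}},
    "stats": {"mean": {"cell_methods": "time: mean"}},
}
extra = {
    "data_vars": {"hs": {"long_name": "significant wave height"}},
    "stats": {"max": {"cell_methods": "time: maximum"}},
}
merged = merge_metadata(defaults, extra)
```

Here the `hs` entry gains a `long_name` while keeping its default `units`, and a new `max` entry is added under `stats`.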

set_global_attributes(source_ds: xr.Dataset, dsout: xr.Dataset, extra_attrs: dict[str, Any] = {}) -> xr.Dataset

Set global CF attributes on the output dataset.

Defaults are computed from the source dataset; extra_attrs is merged on top so that user-supplied values override or extend the defaults.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source_ds | Dataset | Original source dataset (used to extract time coverage). | required |
| dsout | Dataset | Output dataset to annotate. | required |
| extra_attrs | dict[str, Any] | Additional or override global attributes from config. | {} |

Returns:

| Type | Description |
| --- | --- |
| Dataset | The output dataset with global attributes set. |
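
The "defaults first, extra_attrs on top" rule amounts to a dictionary merge in which later keys win. The sketch below uses only the standard library. `merge_global_attrs` is a hypothetical helper, and the `time_coverage_start` and `time_coverage_end` attribute names are assumed for illustration; the attributes gridstats actually derives from the source dataset are not listed on this page.

```python
from datetime import datetime
from typing import Any, Optional


def merge_global_attrs(
    times: list[datetime],
    extra_attrs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Compute default attributes from the time axis, then overlay user values."""
    defaults = {
        "time_coverage_start": min(times).isoformat(),
        "time_coverage_end": max(times).isoformat(),
    }
    # Later keys win, so user-supplied values override or extend the defaults.
    return {**defaults, **(extra_attrs or {})}


times = [datetime(2020, 1, 1), datetime(2020, 12, 31, 23)]
attrs = merge_global_attrs(
    times,
    {"title": "Example gridded statistics", "time_coverage_end": "2021-01-01T00:00:00"},
)
```

In this example `time_coverage_start` keeps its computed default, `time_coverage_end` is overridden, and `title` is added.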