Output
Controls where and how results are written.
Fields
| Field | Type | Default | Description |
|---|---|---|---|
| outfile | string | — | Output file path. Extension determines format: .nc for NetCDF4, .zarr for Zarr. Supports remote paths (gs://, s3://). Required. |
| global_attrs | dict | {} | Global dataset attributes to add to or override the defaults. See Global attributes below. |
| append | bool | false | Add variables to an existing Zarr store rather than overwriting it. See Parallel Zarr writes below. |
| consolidate | bool | false | Run zarr.consolidate_metadata() after writing. See Parallel Zarr writes below. |
| mask | dict | null | Optional spatial mask applied to all output variables before writing. See Masking below. |
| updir | string | null | Deprecated. Write directly to a remote path via outfile instead. |
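The extension rule for outfile can be illustrated with a short sketch. The helper name `infer_format` and the returned labels are hypothetical and not part of gridstats; only the two extensions listed in the table are covered, and matching is assumed to be case-sensitive.

```python
import posixpath

# Hypothetical labels for the two formats named in the Fields table.
FORMATS = {".nc": "netcdf4", ".zarr": "zarr"}


def infer_format(outfile: str) -> str:
    """Return the output format implied by the extension of ``outfile``."""
    # Drop a trailing slash so "gs://bucket/stats.zarr/" still resolves.
    # splitext only inspects the last path component, so URL schemes are harmless.
    ext = posixpath.splitext(outfile.rstrip("/"))[1]
    if ext not in FORMATS:
        raise ValueError(f"cannot infer output format from {outfile!r}")
    return FORMATS[ext]


local_format = infer_format("out.nc")
remote_format = infer_format("gs://my-bucket/stats.zarr")
```

Raising on an unknown extension is a design choice of this sketch; the behaviour of gridstats for other extensions is not documented here.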
Masking
An optional spatial mask can be applied to all output variables before writing. The mask is derived from a variable in the source dataset and broadcast automatically across any extra dimensions (e.g. time, quantile, direction) in the output.
Two mask types are supported, selected via the type field.
notnull: mask where a variable is null
Keeps output values where the chosen source variable is non-null; sets everything else to NaN. The most common use case is deriving a land/ice mask from a single timestamp of a wave or depth variable.
```yaml
output:
  outfile: out.zarr
  mask:
    type: notnull
    var: hs          # source variable to test
    isel:            # optional: reduce to 2-D before testing
      time: 0
```
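To make the semantics concrete, here is a minimal plain-Python sketch of a notnull mask. It uses nested lists and the standard library only, so it is an illustration of the behaviour described above rather than how gridstats implements it; the function names `notnull_mask` and `apply_mask` and the sample values are hypothetical.

```python
import math


def notnull_mask(field2d):
    """True where the 2-D source field holds a value, False where it is NaN."""
    return [[not math.isnan(v) for v in row] for row in field2d]


def apply_mask(layer2d, mask):
    """Keep values where the mask is True and set the rest to NaN."""
    return [
        [v if keep else math.nan for v, keep in zip(row, mask_row)]
        for row, mask_row in zip(layer2d, mask)
    ]


# Source variable hs with dimensions (time, latitude, longitude); NaN marks land.
hs = [
    [[1.2, math.nan], [0.8, 1.1]],  # time index 0
    [[1.4, math.nan], [0.9, 1.3]],  # time index 1
]

# isel: {time: 0} reduces hs to a 2-D (latitude, longitude) slice.
mask = notnull_mask(hs[0])

# Output statistic with an extra leading dimension, e.g. (quantile, latitude, longitude).
stat = [
    [[0.5, 0.4], [0.3, 0.6]],
    [[2.5, 2.4], [2.3, 2.6]],
]

# Broadcasting: the same 2-D mask is applied to every slice along the extra dimension.
masked = [apply_mask(layer, mask) for layer in stat]
```

The grid cell that is NaN in the source slice ends up NaN in every slice of the output, while all other cells keep their values.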
threshold: mask by a numerical condition
Keeps output values where var <operator> value is true.
| operator | Condition |
|---|---|
| gt | var > value |
| lt | var < value |
| ge | var >= value |
| le | var <= value |
```yaml
output:
  outfile: out.zarr
  mask:
    type: threshold
    var: depth
    isel:
      time: 0
    operator: gt
    value: 0.0       # keep ocean points (depth > 0)
```
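The four operator names map directly onto comparison functions. The sketch below shows one way to evaluate a threshold condition over a 2-D grid with the standard library; `OPERATORS` and `threshold_mask` are hypothetical names, and the depth values are made up for illustration.

```python
import operator

# Operator names from the table above mapped onto Python's comparison functions.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "ge": operator.ge,
    "le": operator.le,
}


def threshold_mask(field2d, op, value):
    """True where ``field <op> value`` holds, False elsewhere."""
    try:
        compare = OPERATORS[op]
    except KeyError:
        raise ValueError(f"unsupported operator: {op!r}") from None
    return [[compare(v, value) for v in row] for row in field2d]


# depth on a (latitude, longitude) grid; non-positive values stand in for land.
depth = [[12.0, 0.0], [-3.0, 45.5]]

# Equivalent of operator: gt, value: 0.0 in the YAML example.
ocean = threshold_mask(depth, "gt", 0.0)
```

In this sketch a NaN in the source field compares as false under every operator, so such cells would be masked out; whether gridstats treats NaN the same way is an assumption.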
isel
Both mask types accept an optional isel dict that reduces the source variable to a lower-dimensional slice before the mask is computed. This is the standard way to produce a 2-D (latitude, longitude) mask from a variable that also has a time dimension.
Global attributes
By default, gridstats writes the following global attributes to every output file:
| Attribute | Default value |
|---|---|
| title | "Data stats" |
| institution | "Oceanum" |
| source | "gridstats" |
| date_created | Today's date (UTC) |
| time_coverage_start | First timestamp in the source dataset |
| time_coverage_end | Last timestamp in the source dataset |
| time_coverage_duration | ISO 8601 duration |
| time_coverage_resolution | ISO 8601 timestep |
Any key in global_attrs is merged on top of these defaults, overriding the matching default or adding a new attribute:
```yaml
output:
  outfile: ./stats.zarr
  global_attrs:
    title: "New Zealand Wave Climatology 1980–2020"
    institution: "NIWA"
    project: "NZ-Waves-2025"
    references: "https://doi.org/10.xxxx/xxxxx"
```
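The merge behaves like a dictionary update in which user-supplied keys win. A minimal sketch, assuming the static defaults from the table; the function name `build_global_attrs` is hypothetical, the ISO date format for date_created is an assumption, and the time_coverage_* attributes are left out because they depend on the source dataset.

```python
from datetime import datetime, timezone


def build_global_attrs(overrides=None):
    """Merge user-supplied attributes on top of the documented defaults."""
    defaults = {
        "title": "Data stats",
        "institution": "Oceanum",
        "source": "gridstats",
        # Assumed format: ISO 8601 calendar date in UTC.
        "date_created": datetime.now(timezone.utc).date().isoformat(),
    }
    # Later keys win, so an override replaces the matching default
    # and an unknown key is added as a new attribute.
    return {**defaults, **(overrides or {})}


attrs = build_global_attrs(
    {
        "title": "New Zealand Wave Climatology 1980–2020",
        "institution": "NIWA",
        "project": "NZ-Waves-2025",
    }
)
```

Here title and institution replace their defaults, source keeps the default value, and project is added as a new attribute.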
Parallel Zarr writes
When different statistics are computed in separate parallel tasks and all written to the same Zarr store, set append: true on each task. Each task writes only its own variables; all other variables in the store are left untouched.
```yaml
# Task A: computes hs stats
output:
  outfile: gs://my-bucket/stats.zarr
  append: true

# Task B: computes tp stats (runs in parallel with Task A)
output:
  outfile: gs://my-bucket/stats.zarr
  append: true

# Task C: consolidates metadata (runs after A and B complete)
output:
  outfile: gs://my-bucket/stats.zarr
  append: true
  consolidate: true
```
Parallel write safety
Different tasks must write different variables. If two tasks attempt to write the same variable simultaneously, the result is undefined. Rerunning a single task (e.g. after a failure) is safe: the existing variable is deleted and rewritten.
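Because overlapping writes are undefined, it can be worth checking a job plan before launching it. The following is a sketch of such a pre-flight check, not a gridstats feature; `find_conflicts`, the task names, and the variable names are all hypothetical.

```python
from itertools import combinations


def find_conflicts(task_variables):
    """Return (task_a, task_b, shared_variables) for every overlapping pair of tasks."""
    conflicts = []
    for (name_a, vars_a), (name_b, vars_b) in combinations(task_variables.items(), 2):
        shared = set(vars_a) & set(vars_b)
        if shared:
            conflicts.append((name_a, name_b, sorted(shared)))
    return conflicts


# Hypothetical output variables for two parallel tasks writing to one store.
safe_plan = {"task_a": ["hs_mean", "hs_max"], "task_b": ["tp_mean"]}
unsafe_plan = {"task_a": ["hs_mean"], "task_b": ["hs_mean", "tp_mean"]}

safe_conflicts = find_conflicts(safe_plan)
unsafe_conflicts = find_conflicts(unsafe_plan)
```

An empty result means every variable is owned by exactly one task, which is the condition the note above requires.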