Quickstart Guide
This guide will help you get started with zarrio quickly.
Basic Usage
The simplest way to use zarrio is through its functional API:
from zarrio import convert_to_zarr
# Convert a single NetCDF file to Zarr
convert_to_zarr("input.nc", "output.zarr")
This converts the file using sensible defaults, with no further configuration required.
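To convert several files with the same defaults, the call above can be wrapped in a small loop. The sketch below is illustrative only: `zarr_path_for` and `convert_all` are hypothetical helpers, not part of zarrio, and the converter is passed in as a callable so that the only zarrio call assumed is `convert_to_zarr(input, output)` as shown above.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def zarr_path_for(nc_path: str) -> str:
    """Map 'name.nc' to 'name.zarr' (hypothetical naming scheme)."""
    return str(Path(nc_path).with_suffix(".zarr"))


def convert_all(nc_paths: Iterable[str], convert: Callable[[str, str], object]) -> List[str]:
    """Call convert(input, output) for each input file and return the output paths."""
    outputs = []
    for nc_path in nc_paths:
        out_path = zarr_path_for(nc_path)
        convert(nc_path, out_path)  # e.g. zarrio.convert_to_zarr
        outputs.append(out_path)
    return outputs
```

With zarrio installed, `convert_all(["jan.nc", "feb.nc"], convert_to_zarr)` would write `jan.zarr` and `feb.zarr` next to the inputs.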
Advanced Usage
For more control, you can use the class-based API:
from zarrio import ZarrConverter
# Create converter with custom settings
converter = ZarrConverter(
    chunking={"time": 100, "lat": 50, "lon": 100},
    compression="blosc:zstd:3",
    packing=True,
    packing_bits=16,
    target_chunk_size_mb=100,  # Configure target chunk size for your environment
)
# Convert data
converter.convert("input.nc", "output.zarr")
Environment-specific chunking:
- Local development: 10-25 MB chunks
- Production servers: 50-100 MB chunks
- Cloud environments: 100-200 MB chunks
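To judge whether a chunk shape falls in one of these ranges, it can help to estimate the uncompressed size of a single chunk. The following is a rough sketch, not zarrio functionality: `chunk_size_mb` is a hypothetical helper, and it assumes 1 MB = 10**6 bytes and a fixed number of bytes per value.

```python
from math import prod


def chunk_size_mb(chunking: dict, bytes_per_value: int) -> float:
    """Uncompressed size of one chunk in MB (1 MB = 10**6 bytes)."""
    n_values = prod(chunking.values())
    return n_values * bytes_per_value / 1e6


chunking = {"time": 100, "lat": 50, "lon": 100}
print(chunk_size_mb(chunking, 4))  # float32 values -> 2.0
print(chunk_size_mb(chunking, 2))  # 16-bit packed values -> 1.0
```

The chunk shape from the example above holds 500,000 values, about 2 MB as float32, so reaching the ranges listed here would require larger chunk dimensions.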
Command Line Interface
zarrio also provides a powerful command-line interface:
# Convert NetCDF to Zarr
zarrio convert input.nc output.zarr
# Convert with chunking
zarrio convert input.nc output.zarr --chunking "time:100,lat:50,lon:100"
# Convert with compression
zarrio convert input.nc output.zarr --compression "blosc:zstd:3"
# Convert with data packing
zarrio convert input.nc output.zarr --packing --packing-bits 16
# Convert with manual packing ranges
zarrio convert input.nc output.zarr --packing \
    --packing-manual-ranges '{"temperature": {"min": -50, "max": 50}}'
# Convert with automatic range calculation
zarrio convert input.nc output.zarr --packing \
    --packing-auto-buffer-factor 0.05
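The packing options trade precision for size by storing values as small integers over a known range. The sketch below shows a common scale-and-offset scheme of the kind used in CF-style packed data; it is an assumption that zarrio's packing follows the same idea, and the interpretation of the buffer factor (widening the range by a fraction of its span on each side) is likewise assumed. All function names are hypothetical.

```python
def packing_params(vmin: float, vmax: float, bits: int = 16, buffer_factor: float = 0.0):
    """Return (scale_factor, add_offset) mapping [vmin, vmax] onto unsigned ints."""
    span = vmax - vmin
    # Assumed meaning of the buffer: widen the range by a fraction of its span on each side.
    lo = vmin - buffer_factor * span
    hi = vmax + buffer_factor * span
    scale_factor = (hi - lo) / (2**bits - 1)
    return scale_factor, lo


def pack(value: float, scale_factor: float, add_offset: float) -> int:
    """Quantize a value to the nearest integer step."""
    return round((value - add_offset) / scale_factor)


def unpack(packed: int, scale_factor: float, add_offset: float) -> float:
    """Recover an approximate value from its packed integer."""
    return packed * scale_factor + add_offset


# Manual range from the CLI example: temperature between -50 and 50, 16 bits
scale, offset = packing_params(-50, 50, bits=16)
stored = pack(12.34, scale, offset)
restored = unpack(stored, scale, offset)
print(stored, restored, scale)
```

Under this scheme the round-trip error is at most half a step (`scale / 2`), which for a 100-unit range and 16 bits is under 0.001 units.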
Parallel Writing
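zarrio supports parallel writing: first create a template store that covers the full time range, then let independent processes each write their own region of it: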
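import xarray as xr
from zarrio import ZarrConverter
# Dataset that defines the variables and grid of the archive
# (assumed here to be an xarray Dataset opened from one of the inputs)
template_ds = xr.open_dataset("data_2020.nc")
# 1. Create template covering full time range
converter = ZarrConverter()
converter.create_template(
    template_dataset=template_ds,
    output_path="archive.zarr",
    global_start="2020-01-01",
    global_end="2023-12-31",
    compute=False,  # Metadata only
)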
# 2. Write regions in parallel processes
converter.write_region("data_2020.nc", "archive.zarr") # Process 1
converter.write_region("data_2021.nc", "archive.zarr") # Process 2
converter.write_region("data_2022.nc", "archive.zarr") # Process 3
converter.write_region("data_2023.nc", "archive.zarr") # Process 4
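The four `write_region` calls above are meant to run in separate processes. One way to launch them from a single script is a process pool, sketched below. This is not a zarrio feature: `yearly_inputs`, `write_one`, and `write_all` are hypothetical helpers, and the sketch assumes that a default `ZarrConverter()` created in each worker matches the settings used to create the template.

```python
from concurrent.futures import ProcessPoolExecutor


def yearly_inputs(first_year: int, last_year: int) -> list:
    """File names following the data_<year>.nc pattern used above."""
    return [f"data_{year}.nc" for year in range(first_year, last_year + 1)]


def write_one(job):
    """Worker: write a single input file into its region of the shared store."""
    nc_path, store = job
    # Imported inside the worker so each process builds its own converter.
    from zarrio import ZarrConverter

    ZarrConverter().write_region(nc_path, store)
    return nc_path


def write_all(nc_paths, store, max_workers=4):
    """Run write_one for every input in a pool of worker processes."""
    jobs = [(path, store) for path in nc_paths]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(write_one, jobs))
```

After the template exists, `write_all(yearly_inputs(2020, 2023), "archive.zarr")` would dispatch one job per year. Call it under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform.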
Configuration Files
You can also use configuration files (YAML or JSON):
# config.yaml
chunking:
  time: 150
  lat: 60
  lon: 120
compression:
  method: blosc:zstd:2
  clevel: 2
packing:
  enabled: true
  bits: 16
variables:
  include:
    - temperature
    - pressure
  exclude:
    - humidity
attrs:
  title: YAML Config Demo
  version: 1.0
Then use it with the CLI:
zarrio convert input.nc output.zarr --config config.yaml
Or programmatically:
from zarrio import ZarrConverter
converter = ZarrConverter.from_config_file("config.yaml")
converter.convert("input.nc", "output.zarr")
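Since configuration files may also be JSON, the same settings can be generated from a Python dictionary. The sketch below assumes the JSON layout mirrors the YAML keys one-to-one; that mapping is an assumption and should be checked against the configuration reference.

```python
import json

# Same settings as config.yaml, expressed as a plain dictionary
config = {
    "chunking": {"time": 150, "lat": 60, "lon": 120},
    "compression": {"method": "blosc:zstd:2", "clevel": 2},
    "packing": {"enabled": True, "bits": 16},
    "variables": {"include": ["temperature", "pressure"], "exclude": ["humidity"]},
    "attrs": {"title": "YAML Config Demo", "version": 1.0},
}

# Serialize with indentation so the file stays readable and diff-friendly
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving the printed text as `config.json` would give a file to pass to `--config` or `ZarrConverter.from_config_file` in place of the YAML version.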
Next Steps
Explore the API Reference for detailed documentation
Learn about Command-Line Interface options
Understand Configuration Management
Discover Parallel Processing capabilities
Review Usage Examples for more use cases