Changelog#
0.1.0 (2025-01-01)#
Features#
Complete Rewrite: Fresh, clean codebase without legacy constraints
Modern Architecture: Modular design with clear separation of concerns
Full Type Safety: Comprehensive type hints throughout the codebase
Pydantic Configuration: Type-safe configuration management with validation
Intelligent Chunking: Automatic chunking analysis and recommendations
Parallel Writing: Template creation and region writing for parallel processing
Command-Line Interface: Comprehensive CLI with intuitive options
Configuration Files: YAML and JSON configuration file support
Data Packing: Fixed-scale offset encoding for compression
Compression Support: Blosc and other compression algorithms
Time Series Handling: Efficient handling of time-series data
Data Appending: Append new data to existing Zarr archives
Retry Logic: Automatic retries for handling missing data
Error Handling: Custom exceptions with clear error messages
Logging: Comprehensive logging for debugging and monitoring
API Changes#
Simplified Interface: Cleaner, more intuitive APIs
Functional API:
convert_to_zarr
,append_to_zarr
functionsClass-Based API:
ZarrConverter
class for advanced usageCLI Interface:
zarrio
command with comprehensive optionsConfiguration API: Pydantic models for type-safe configuration
Breaking Changes#
Complete Rewrite: No backward compatibility with onzarr 1.x
New Package Name:
zarrio
instead ofonzarr
New Module Structure: Clean, modular organization
New API: Simplified, more intuitive interfaces
New Configuration: Pydantic-based configuration management
Improvements#
Maintainability: Clean, focused modules that are easy to understand and modify
Reliability: Proper error handling with custom exceptions
Performance: Optimized for modern xarray and zarr capabilities
Documentation: Comprehensive documentation with examples
Testing: Comprehensive test suite with good coverage
Developer Experience: Better IDE support with type hints
User Experience: Intuitive APIs and CLI with helpful documentation
Bug Fixes#
Circular Imports: Fixed all import issues from the original design
Complex Inheritance: Eliminated complex inheritance hierarchies
Redundant Code: Removed redundant custom implementations now available in xarray
Parameter Handling: Simplified parameter handling with clear validation
Dependencies#
Python: 3.8+
xarray: 0.18.0+
zarr: 2.10.0+
numpy: 1.20.0+
pandas: 1.3.0+
click: 8.0.0+
pyyaml: 5.4.0+
dask: 2021.0.0+
netCDF4: 1.5.0+
Development#
Continuous Integration: GitHub Actions for testing and linting
Pre-commit Hooks: Automated code formatting and linting
Documentation: Sphinx documentation with ReadTheDocs theme
Testing: Pytest with comprehensive coverage
Type Checking: MyPy for static type checking
Code Formatting: Black for consistent code style
Linting: Flake8 for code quality
Examples#
Basic Usage: Simple conversion examples
Advanced Usage: Complex configuration examples
CLI Usage: Command-line interface examples
Configuration Files: YAML and JSON configuration examples
Parallel Processing: Template creation and region writing examples
Intelligent Chunking: Automatic chunking analysis examples
Data Packing: Compression examples with packing
Error Handling: Exception handling examples
Documentation#
Installation Guide: Complete installation instructions
Quickstart Guide: Getting started quickly
API Reference: Comprehensive API documentation
CLI Documentation: Command-line interface reference
Configuration Guide: Pydantic configuration management
Parallel Processing: Template creation and region writing guide
Chunking Guide: Intelligent chunking analysis and recommendations
Examples: Practical usage examples
Contributing Guide: Developer contribution guidelines
Changelog: Version history and changes
Initial Release Notes#
zarrio represents a complete rewrite of the original onzarr library with a focus on:
Clean Architecture: Fresh start without legacy constraints
Modern Design: Full type hints, proper error handling, clean APIs
Enhanced Functionality: Parallel writing, Pydantic configuration, intelligent chunking
Better Maintainability: Easy to understand and modify
Improved Usability: Intuitive APIs and comprehensive documentation
Robust Implementation: Proper error handling and testing
Performance Optimization: Leveraging modern xarray and zarr capabilities
Key Features in Initial Release:
NetCDF to Zarr conversion with data packing
Flexible chunking strategies with intelligent analysis
Compression support with Blosc and other algorithms
Time series handling with duplicate removal
Data appending to existing archives
Parallel writing with template creation and region writing
Pydantic configuration management and validation
Intelligent chunking analysis and recommendations
Retry logic for handling missing data
Comprehensive CLI with configuration file support
Full test suite with excellent coverage
Detailed documentation with practical examples
Migration from onzarr 1.x:
Breaking Changes: Complete rewrite with no backward compatibility
New Package:
zarrio
instead ofonzarr
New API: Simplified, more intuitive interfaces
New Configuration: Pydantic-based configuration management
Enhanced Features: Parallel writing, intelligent chunking, retry logic
Future Roadmap:
Performance optimization for large datasets
Additional compression algorithms
Cloud storage integration
Advanced parallel processing features
Machine learning-based chunking recommendations
Benchmarking and optimization tools
Extended documentation and tutorials
Community feedback and feature requests