Zarr

Overview:
- Create N-dimensional arrays with any NumPy dtype
- Chunk arrays along any dimension
- Compress and/or filter chunks using any NumCodecs codec
- Flexible storage of arrays
- Read or write an array concurrently from multiple threads or processes

https://zarr.readthedocs.io/en/stable/

Chunks: All chunks of an N-dimensional array must have the same shape

For example, the HRRR data is chunked as follows (time,x,y):

- Analyses: (1,150,150) Ex. (0.4.6)
- Forecasts: (36,150,150) or (18,150,150)

Chunks are indexed by their location in the domain, starting with the upper left corner

For a 2-dimensional array, the chunk structure and indexing would be as follows:
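As a minimal sketch of this indexing, the grid of chunk indices can be printed row by row, starting from the upper left. The 450 x 600 array shape is hypothetical, the helper names are illustrative only, and the dot-separated keys assume the Zarr v2 storage format (where a key such as 0.4.6 names the chunk at index (0, 4, 6)).

```python
import math

def chunk_grid(array_shape, chunk_shape):
    """Number of chunks along each dimension (a partial edge chunk counts as one)."""
    return tuple(math.ceil(n / c) for n, c in zip(array_shape, chunk_shape))

def chunk_index(element, chunk_shape):
    """Index of the chunk that contains the given array element."""
    return tuple(i // c for i, c in zip(element, chunk_shape))

def chunk_key(index):
    """Dot-separated chunk key, as used by the Zarr v2 format (assumption)."""
    return ".".join(str(i) for i in index)

array_shape = (450, 600)   # hypothetical 2-dimensional domain
chunk_shape = (150, 150)   # uniform chunk shape

grid = chunk_grid(array_shape, chunk_shape)

# Print the chunk keys as a grid, upper left corner first.
for row in range(grid[0]):
    print("  ".join(chunk_key((row, col)) for col in range(grid[1])))

# Which chunk holds element (149, 150)?
print(chunk_key(chunk_index((149, 150), chunk_shape)))
```

The same integer division extends to any number of dimensions, so a (time,x,y) array only adds a leading index.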

Compressors: Zarr data is compressed chunk by chunk, so compression follows the chunk layout

Data compression is a trade-off between random access and compressibility. The codec and chunk size we choose determine both the speed of access and the compression ratio.
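The trade-off can be illustrated without Zarr itself. The sketch below is only an illustration under stated assumptions: it uses Python's standard-library zlib on synthetic data, and the helper names compress_in_chunks and read_value are hypothetical. Larger chunks tend to compress better, but reading one value means decompressing the whole chunk that holds it.

```python
import struct
import zlib

# Synthetic, slowly varying data: 4096 little-endian 32-bit integers.
values = [i // 8 for i in range(4096)]
raw = struct.pack(f"<{len(values)}i", *values)

def compress_in_chunks(buffer, chunk_bytes):
    """Compress each fixed-size slice of the buffer independently."""
    return [
        zlib.compress(buffer[start:start + chunk_bytes])
        for start in range(0, len(buffer), chunk_bytes)
    ]

def read_value(chunks, chunk_bytes, index):
    """Decompress only the chunk holding element `index` and return that element."""
    chunk_no, within = divmod(index * 4, chunk_bytes)
    decoded = zlib.decompress(chunks[chunk_no])
    return struct.unpack_from("<i", decoded, within)[0]

stored_bytes = {}
for chunk_bytes in (len(raw), 1024, 64):
    chunks = compress_in_chunks(raw, chunk_bytes)
    stored_bytes[chunk_bytes] = sum(len(c) for c in chunks)
    value = read_value(chunks, chunk_bytes, 3000)
    print(f"chunk={chunk_bytes:>6} B  stored={stored_bytes[chunk_bytes]:>6} B  value[3000]={value}")
```

With one large chunk, every read decompresses the full buffer; with many small chunks, each read touches little data but the total stored size grows.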

Numcodecs is a Python package providing buffer compression and transformation codecs for use in data storage and communication applications. These include:
- Compression codecs, e.g., Zlib, BZ2, LZMA and Blosc
- Pre-compression filters, e.g., Delta, Quantize, FixedScaleOffset, PackBits, Categorize
- Integrity checks, e.g., CRC32, Adler32

https://numcodecs.readthedocs.io/en/stable/

Output: Zarr data can be written to a variety of storage backends:
- Memory storage
- Disk (NFS)
- Zip storage
- Cloud storage (Google, AWS)

Initialize the Zarr store, then fill it with arrays; the chunked variables are compressed as they are written