HRRR Zarr Example Using XArray

Download an analysis variable, whole grid, 24 hours

This example shows how to get a day's worth of analysis files for a single variable and combine them using high-level APIs.

We use xarray's open_mfdataset to load the data. There's a couple things missing from the metadata, so we use a metpy extension to add projection info and latitude/longitude. We also promote the "time" attribute to a coordinate so that combining the datasets for each hour will work later on.

The following function demonstrates how to format the urls to load the data, as well as how to combine the hours using xarray.concat. Note that because there's an extra level of nesting for the main data variable (level and variable name), we have to get both the zarr group url and the url for the nested subgroup. That's why we have to use open_mfdataset ("mf" means "multifile")––other zarr datasets likely won't have this quirk.

Just for demonstration purposes, we load up the data and calculate the standard deviation so we have something to plot across the geospatial domain. Note that this whole thing takes a little over 1 min on my laptop, mostly spent on downloading the data. You may need some performance optimizations or parallelization if you're doing a large analysis.

The data is loaded lazily, meaning only metadata saying what data exists (such as the grid, dataset attributes, and size and encoding of data) is downloaded just by "opening" the dataset. In fact, the actual data isn't downloaded even when we ask for the std deviation calculation to be done––the code just makes a note to calculate the std once the values are requested. So the time for std_dev.values to completely is almost entirely just the time for downloading the TMP data.

If we didn't print "values" here, the data wouldn't be downloaded until actually needed to construct the plot in the next cell. Note that also means that if you were to slice the data before displaying it––say I only wanted to show gridpoints in my county––a lot less data would be downloaded, at a faster pace.