[ncl-talk] Order of dimensions of an array in netcdf

Dave Allured - NOAA Affiliate dave.allured at noaa.gov
Wed Oct 21 20:25:21 MDT 2020


NCL uses zero-based subscripting for both memory variables and file
variables.  Therefore the first element in your example is
*temp [0] [0] [0] [0]*.

There are some problems with the Netcdf users guide, which you cited.  You
found the Netcdf-4 data model, which is rather complicated.  Here is a
simplified summary of the *data storage model* for the three most common
types of Netcdf storage.  For small, simple applications, the user should
not be at all concerned with the internal file format.  When performance
becomes an issue, it is good to have a general understanding of format
issues, so as to help optimize read and write strategies.

*Netcdf-3 fixed size variable.*  The entire N-dimensional array is stored
contiguously in the file, with the NCL rightmost (last) dimension varying
the fastest.  The physical layout is generally the same as you would have
for a simple array in program memory.  This storage is not expandable.
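
As a rough illustration of that ordering, here is a small Python sketch
(Python rather than NCL, only so that it is easy to run anywhere).  It models
the order of elements only, not the real file header or byte format.  The
function name is made up for this example; the shape is the one from your
question.

```python
def linear_offset(index, shape):
    """Position of `index` in a contiguous array of `shape`,
    with the rightmost (last) dimension varying fastest."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"subscript {i} is out of range for size {n}")
        offset = offset * n + i
    return offset


shape = (3, 1, 5, 10)  # TIMES, LEVELS, LATS, LONS

print(linear_offset((0, 0, 0, 0), shape))  # first element stored
print(linear_offset((0, 0, 0, 1), shape))  # next element: LONS moves first
print(linear_offset((0, 0, 1, 0), shape))  # one step in LATS skips a row of LONS
print(linear_offset((2, 0, 4, 9), shape))  # last element stored
```

Note that with LEVELS = 1 the only valid level subscript is 0, so a
subscript such as [0] [1] [0] [0] is out of range in this model.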

*Netcdf-3 record variable.*  The array is stored in chunks across the
leftmost (first, slowest varying) dimension, with all the other dimensions
stored contiguously in each chunk.  The first dimension is also called the
*record* or *unlimited* dimension.  This storage can be *expanded* by
appending on the leftmost (slowest) dimension only.  A typical application
is a gridded data set where new grids are added over time, as new data
become available.
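
A hedged Python sketch of the record layout follows.  It assumes the
Netcdf-3 classic arrangement, where each record holds one slab of every
record variable in the file, so the distance between successive records
(called recsize here) is the sum of those slab sizes.  The starting offset
and the function name are hypothetical values chosen for illustration, and
header details and padding are ignored.

```python
from math import prod


def record_element_offset(begin, recsize, index, inner_shape, itemsize):
    """Byte offset of one element of a record variable (illustrative model).

    begin       -- byte offset of this variable's slab in record 0 (hypothetical)
    recsize     -- bytes from the start of one record to the start of the next
    index       -- full subscript; index[0] is the record (leftmost) subscript
    inner_shape -- sizes of the remaining dimensions, stored contiguously
    """
    rec, inner_index = index[0], index[1:]
    inner = 0
    for i, n in zip(inner_index, inner_shape):
        inner = inner * n + i  # rightmost dimension varies fastest
    return begin + rec * recsize + inner * itemsize


inner_shape = (1, 5, 10)                    # LEVELS, LATS, LONS
itemsize = 4                                # e.g. 4-byte floats
slab_bytes = prod(inner_shape) * itemsize   # one time step of temp
begin = 1024                                # made-up offset, illustration only

# Only one record variable in the file: records of temp are back to back.
print(record_element_offset(begin, slab_bytes, (1, 0, 0, 0), inner_shape, itemsize))

# A second record variable of the same size: temp's records are now spaced apart.
print(record_element_offset(begin, 2 * slab_bytes, (1, 0, 0, 0), inner_shape, itemsize))
```

Appending a new time step only adds another record at the end of the file,
which is why this storage can grow along the leftmost dimension alone.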

*Netcdf-4 chunked variable.*  The array is subdivided and stored in
equal-sized chunks across all dimensions.  The individual chunk dimension
sizes may be arbitrarily chosen when a new file variable is first created.
Inside each chunk, storage is contiguous across dimensions, just like in
the other two storage models.  There is hidden infrastructure to keep track
of all the chunks.  This strategy has some advantages for arbitrary
unlimited (expandable) dimensions, and automatic internal data compression.
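
To make the chunk idea concrete, here is a minimal Python sketch that maps a
subscript to the chunk that holds it and to the position inside that chunk.
It is a model of the indexing only; the real bookkeeping is done by the
hidden infrastructure mentioned above.  The chunk shapes and the function
name are assumptions picked for this example.

```python
def locate_in_chunks(index, chunk_shape):
    """Return (chunk coordinate, element position within that chunk).

    Inside a chunk, elements are contiguous with the rightmost
    dimension varying fastest, as in the other storage models.
    """
    chunk_coord = tuple(i // c for i, c in zip(index, chunk_shape))
    within = tuple(i % c for i, c in zip(index, chunk_shape))
    offset = 0
    for i, c in zip(within, chunk_shape):
        offset = offset * c + i
    return chunk_coord, offset


# One whole lat/lon grid per chunk: every grid is a single contiguous chunk.
print(locate_in_chunks((2, 0, 4, 9), (1, 1, 5, 10)))

# Grids split in half along LONS: one grid now spans two chunks.
print(locate_in_chunks((0, 0, 1, 7), (1, 1, 5, 5)))
```

With the first chunk shape, reading one grid touches exactly one chunk; with
the second, the same read has to visit two.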

There are two simple approaches to optimizing file access.  Either read and
write the *entire variable* in a single statement, or else read and write
*contiguous file chunks* or *contiguous subsets of file chunks*.  A common
example is to set up a gridded file variable so that each 2-D grid is stored
contiguously, then read and write a whole grid each time.  This is why you
see many examples where *lat* and *lon* are the rightmost dimensions; that
generally results in contiguous grids.
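
One way to compare access patterns is to count how many separate contiguous
runs a request breaks into.  The Python sketch below does this for the
simple contiguous layout (rightmost dimension fastest), under the
assumption that fewer, longer runs are cheaper to read.  The function name
is invented for this example, and it describes the layout only, not any
particular library's I/O behavior.

```python
def contiguous_runs(shape, count):
    """Return (number of runs, elements per run) for a rectangular subset.

    shape -- dimension sizes of the stored array
    count -- how many elements are requested along each dimension
    """
    run = 1
    i = len(shape) - 1
    # Trailing dimensions that are requested in full merge into one run.
    while i >= 0 and count[i] == shape[i]:
        run *= shape[i]
        i -= 1
    # The first partially requested dimension still extends the run.
    if i >= 0:
        run *= count[i]
        i -= 1
    # Every remaining (slower) dimension multiplies the number of runs.
    n_runs = 1
    while i >= 0:
        n_runs *= count[i]
        i -= 1
    return n_runs, run


shape = (3, 1, 5, 10)  # TIMES, LEVELS, LATS, LONS

print(contiguous_runs(shape, (1, 1, 5, 10)))  # one whole lat/lon grid
print(contiguous_runs(shape, (3, 1, 5, 10)))  # the entire variable
print(contiguous_runs(shape, (3, 1, 1, 1)))   # a time series at one grid point
```

The whole grid and the whole variable each come out as a single run, while
the time series at one point is split into one tiny run per time step.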


On Wed, Oct 21, 2020 at 6:10 PM Gurer, Kemal at ARB via ncl-talk <
ncl-talk at mailman.ucar.edu> wrote:

> Hello ncl’ers,
>
> I have been searching a detailed description for the dimensional order of
> the data points of a multi-dimensional variable to be written to a netcdf
> file. I found some locations, but am also equally confused with different
> explanations or my lack of understanding. One example for such an
> explanation for the variable:
>
> temp [TIMES] [LEVELS] [LATS] [LONS];
>
> with
>
> TIMES = 3,
> LEVELS = 1,
> LATS = 5,
> LONS = 10
>
> being written into the netcdf file in the order of dimensions is given at:
>
> https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_data_set_components.html
>
> and mentions “The order in which the data will be *returned* is with the
> last dimension, LONS, varying fastest:” at the time of retrieval of the
> data.
>
> What I am wondering is when I write the variable “temp” above into the
> netcdf file, is the *first data element of temp*:
>
> Temp [2] [1] [4] [9]
>
> or,
>
> temp [0] [1] [0] [0]
>
> Thank you for your help.
>
> Kemal.
>