[ncl-talk] Order of dimensions of an array in netcdf

Wed Oct 21 21:48:59 MDT 2020

Additional information to Dave's netCDF overview:

NCL is row major, like C. A language like Fortran is column major.

On 'paper' the following look opposite; however, they are arranged *exactly
the same in memory*:

NCL:     TEMP(ntim,klev,nlat,mlon)     <== fastest varying dimension is the
                                           *rightmost; row major*
Fortran: TEMP(mlon,nlat,klev,ntim)     <== fastest varying dimension is the
                                           *leftmost; column major*
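
To make the equivalence concrete, here is a minimal sketch in Python (used
only for illustration; it is neither NCL nor Fortran). The helper names
row_major_offset and col_major_offset are hypothetical, and the dimension
sizes are the ones from the original question (3 x 1 x 5 x 10). It computes
the linear memory offset of one element both ways and shows that reversing
the subscript order lands on the same offset:

```python
def row_major_offset(idx, shape):
    """Linear offset when the LAST subscript varies fastest (NCL, C)."""
    offset = 0
    for i, n in zip(idx, shape):
        offset = offset * n + i
    return offset


def col_major_offset(idx, shape):
    """Linear offset when the FIRST subscript varies fastest (Fortran).

    Subscripts are zero-based here to keep the comparison simple;
    Fortran itself defaults to one-based subscripts.
    """
    offset = 0
    for i, n in zip(reversed(idx), reversed(shape)):
        offset = offset * n + i
    return offset


# Dimension sizes taken from the question in this thread.
ntim, klev, nlat, mlon = 3, 1, 5, 10
ncl_shape = (ntim, klev, nlat, mlon)   # TEMP(ntim,klev,nlat,mlon)
ftn_shape = (mlon, nlat, klev, ntim)   # TEMP(mlon,nlat,klev,ntim)

# Last element of the array, addressed in both conventions.
t, k, j, i = 2, 0, 4, 9
a = row_major_offset((t, k, j, i), ncl_shape)
b = col_major_offset((i, j, k, t), ftn_shape)
print(a, b)  # 149 149
```

The first element, (0,0,0,0) in either ordering, has offset 0, and stepping
the fastest varying subscript by one moves the offset by one.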

*===========*
NCAR's Climate Analysis Section provides the
*Climate Data Guide* <https://climatedataguide.ucar.edu/>.
You should explore it. There is a lot of information.

Specifically:
https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/common-climate-data-formats-overview
https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/netcdf-overview

On Wed, Oct 21, 2020 at 8:25 PM Dave Allured - NOAA Affiliate via ncl-talk <
ncl-talk at mailman.ucar.edu> wrote:

> NCL uses zero-based subscripting for both memory variables and file
> variables.  Therefore the first element in your example is *temp [0] [0]
> [0] [0]*.
>
> There are some problems with the Netcdf users guide, which you cited.  You
> found the Netcdf-4 data model, which is rather complicated.  Here is a
> simplified summary of the *data storage model* for the three most common
> types of Netcdf storage.  For small, simple applications, the user should
> not be at all concerned with the internal file format.  When performance
> becomes an issue, it is good to have a general understanding of format
> issues, so as to help optimize read and write strategies.
>
> *Netcdf-3 fixed size variable.*  The entire N-dimensional array is stored
> contiguously in the file, with the NCL rightmost (last) dimension varying
> the fastest.  The physical layout is generally the same as you would have
> for a simple array in program memory.  This storage is not expandable.
>
> *Netcdf-3 record variable.*  The array is stored in chunks across the
> leftmost (first, slowest varying) dimension, with all the other dimensions
> stored contiguously in each chunk.  The first dimension is also called the
> *record* or *unlimited* dimension.  This storage can be *expanded* by
> appending on the leftmost (slowest) dimension only.  A typical application
> is a gridded data set where new grids are added over time, as new data
> become available.
>
> *Netcdf-4 chunked variable.*  The array is subdivided and stored in
> equal-sized chunks across all dimensions.  The individual chunk dimension
> sizes may be arbitrarily chosen when a new file variable is first created.
> Inside each chunk, storage is contiguous across dimensions, just like in
> the other two storage models.  There is hidden infrastructure to keep track
> of all the chunks.  This strategy has some advantages for arbitrary
> unlimited (expandable) dimensions, and automatic internal data compression.
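
As a rough illustration of the chunked layout described above, the sketch
below maps an element's subscripts to the chunk that holds it and to the
element's position inside that chunk. It is written in Python with a
hypothetical helper name (locate_in_chunks) and hypothetical chunk shapes.
It says nothing about how the library actually finds a chunk on disk; that
is the hidden infrastructure mentioned above.

```python
def locate_in_chunks(idx, chunk_shape):
    """Return (chunk coordinates, row-major offset inside that chunk)."""
    # Which chunk along each dimension holds this element.
    chunk_coord = tuple(i // c for i, c in zip(idx, chunk_shape))
    # Position of the element relative to the start of its chunk.
    within = [i % c for i, c in zip(idx, chunk_shape)]
    # Inside a chunk, storage is contiguous with the last dimension fastest.
    offset = 0
    for w, c in zip(within, chunk_shape):
        offset = offset * c + w
    return chunk_coord, offset


# Hypothetical chunking for temp[3][1][5][10]: one whole lat-lon grid per chunk.
print(locate_in_chunks((2, 0, 4, 9), (1, 1, 5, 10)))  # ((2, 0, 0, 0), 49)

# Hypothetical smaller chunks: each grid is split into several pieces.
print(locate_in_chunks((2, 0, 4, 9), (1, 1, 2, 5)))   # ((2, 0, 2, 1), 4)
```

With the first chunk shape, every 2-D grid sits in exactly one chunk, so
reading one grid touches one chunk. With the second, the same grid is spread
over several chunks.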
>
> There are two simple approaches to optimizing file access.  Either read
> and write the *entire variable* in a single statement, or else read and
> write *contiguous file chunks* or *contiguous subsets of file chunks*.  A
> common example is to set up a gridded file variable so that each 2-D grid
> is stored contiguously, then read and write a whole grid each time.  This
> is why you see many examples where *lat* and *lon* are the rightmost
> dimensions; that generally results in contiguous grids.
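
The contiguity argument above can be checked with a small Python sketch. It
assumes the simple contiguous layout (last dimension fastest) and the
3 x 1 x 5 x 10 array from the question; the helper names are hypothetical.
It lists the storage offsets touched by reading one whole lat-lon grid, and
by reading a time series at a single grid point:

```python
from itertools import product


def row_major_offset(idx, shape):
    """Linear offset when the last subscript varies fastest."""
    offset = 0
    for i, n in zip(idx, shape):
        offset = offset * n + i
    return offset


def offsets_of_slice(ranges, shape):
    """Offsets of every element in a hyperslab, in storage order."""
    return [row_major_offset(idx, shape) for idx in product(*ranges)]


def is_contiguous(offsets):
    """True if the offsets form one unbroken run."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))


shape = (3, 1, 5, 10)  # (time, lev, lat, lon)

# One whole 2-D grid: temp(1, 0, :, :)
grid = offsets_of_slice([range(1, 2), range(1), range(5), range(10)], shape)

# A time series at one grid point: temp(:, 0, 2, 4)
series = offsets_of_slice([range(3), range(1), range(2, 3), range(4, 5)], shape)

print(is_contiguous(grid), is_contiguous(series))  # True False
```

The grid occupies offsets 50 through 99 in one run, while the time series
touches offsets 24, 74 and 124, one element per grid.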
>
>
> On Wed, Oct 21, 2020 at 6:10 PM Gurer, Kemal at ARB via ncl-talk <
> ncl-talk at mailman.ucar.edu> wrote:
>
>> Hello ncl’ers,
>>
>> I have been searching for a detailed description of the dimensional order
>> of the data points of a multi-dimensional variable to be written to a
>> netcdf file. I found some locations, but am equally confused by the
>> different explanations or by my lack of understanding. One example of such
>> an explanation, for the variable:
>>
>> temp [TIMES] [LEVELS] [LATS] [LONS];
>>
>> with
>>
>> TIMES = 3,
>> LEVELS = 1,
>> LATS = 5,
>> LONS = 10
>>
>> being written into the netcdf file in the order of dimensions, is given at:
>>
>> https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_data_set_components.html
>>
>> and mentions “The order in which the data will be *returned* is with the
>> last dimension, LONS, varying fastest:” at the time of retrieval of the
>> data.
>>
>> What I am wondering is, when I write the variable “temp” above into the
>> netcdf file, is the *first data element of temp*:
>>
>> temp [2] [1] [4] [9]
>>
>> or,
>>
>> temp [0] [1] [0] [0]
>>
>> Thank you for your help.
>>
>> Kemal.
>>