[mpas-developers] MPAS I/O requirements and design doc

Michael Duda duda at ucar.edu
Tue Feb 28 17:37:15 MST 2012


Hi, Phil.

Thanks for the comments; I've added my responses below.

Cheers,
Michael


On Tue, Feb 28, 2012 at 04:51:44PM +0000, Jones, Philip W wrote:
> Michael,
>
> Finally read your I/O doc and have several comments/suggestions.
>
> For requirements, I would add:
>    - requirement for exact restart (any simulation interrupted by a
>      restart must be bitwise identical to a simulation without restart)


Would this requirement be meant to address floating-point issues during
the reading and writing of fields? Such issues seem beyond our control
when working with the netCDF library, for example, and I'm a bit
hesitant to implement our own binary format.

Floating-point issues aside, I think this is beyond the scope of the I/O
layer, since each core needs to decide which fields are necessary and
sufficient for a bit-identical restart. Provided those fields (and
attributes) are added to the restart stream, I don't think we'd need
anything more from the I/O layer than the ability to write and read
fields correctly in order to get bit-identical restarts.

We can certainly add this requirement, in any case.


>    - for CESM, netCDF CF conventions are required for output files
>      (mostly certain required metadata - don't think we need to be bound
>      by CF grid conventions yet since they are a bit onerous, especially
>      for unstructured grids)


CF compliance (in terms of metadata) is something that we could indeed
tackle in the high-level I/O interface. I'll add this as a requirement.


>    - requirement to be able to specify the number of I/O tasks; this
>      will be important to optimize the I/O layer with underlying
>      architecture


Good point -- this capability was included in the design, but it should
be explicitly stated in the requirements, too.
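
Since we expect to implement the layer using PIO, the task count would
most naturally be plumbed through to PIO's initialization. A rough
sketch follows; the PIO_init argument list here is from memory, and the
config_pio_* options are hypothetical namelist settings:

    use pio                               ! underlying parallel I/O library
    use mpi

    type (iosystem_desc_t) :: pio_subsystem
    integer :: my_rank, ierr

    call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierr)

    ! Dedicate config_pio_num_iotasks tasks to I/O, spaced
    ! config_pio_stride ranks apart, with no aggregators and the
    ! box rearranger
    call PIO_init(my_rank, MPI_COMM_WORLD, config_pio_num_iotasks, 0, &
                  config_pio_stride, PIO_rearr_box, pio_subsystem)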


>    - requirement to write many different types of fields, including
>      scalars and various multi-dimensional arrays (eg meridional
>      diagnostics, transports, surface fields), some of which we are not
>      likely to want in the registry (temporary derived diagnostic fields).
>      Note that even the scalars count as fields since they may have
>      associated metadata (eg transport diagnostics with the
>      name/units/possible location info for the transport).


I guess I had taken for granted that we'd write the same kinds of fields
as we do now (scalars, and arrays with various combinations of
dimensions). Whether or not these fields are defined in the registry
shouldn't matter to the I/O layer, but it would be good to state this as
a requirement, along with a requirement that the set of dimensions used
by fields can be arbitrary. Do you think the high-level interface's use
of MPAS field types is at odds with this requirement?


> For global attributes, you will need a more complete stream attribute
> layer with the stream having an arbitrary number of attributes.  Some of
> these may be standard (eg conventions, history).  You will need not only
> to read/write attributes but also to add/remove them on a stream.


There are routines, MPAS_readStreamAtt() and MPAS_writeStreamAtt(),
through which an arbitrary number of attributes can be read from or
written to a stream, but I'm not sure I follow the distinction between
reading/writing attributes and adding/removing them; could you explain?
Would we need to add an attribute but not write it, for example?
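
To make the intent concrete, here is roughly how I had pictured these
routines being used; the argument lists below are a sketch rather than
final signatures, and MPAS_Stream_type stands in for whatever the
stream handle type ends up being called:

    type (MPAS_Stream_type) :: stream     ! handle for an open stream
    character (len=StrKIND) :: history
    integer :: ierr

    ! Repeated calls accumulate an arbitrary number of global
    ! attributes on an output stream.
    call MPAS_writeStreamAtt(stream, 'Conventions', 'CF-1.0', ierr)
    call MPAS_writeStreamAtt(stream, 'model_name', 'MPAS', ierr)

    ! Read a global attribute back from an input stream.
    call MPAS_readStreamAtt(stream, 'history', history, ierr)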


> Similarly, you will need a more robust attribute layer for fields - you
> can either do this within the Field data type or define a new IOField
> data type that includes arrays of attributes together with the field.
> Basically, you will need to allow the user to define an arbitrary number
> of attributes of any kind (string, int, real, double, logical) for a
> given field, including standard attributes like short name, long name,
> units, valid range, undefined, etc.  Having this as part of the field
> layer can be beneficial for defining many attributes only once near
> start-up or reusing the same field in many different streams.


As described in the last paragraph of Section 3.3 (just before the
interface definitions), the high-level interface provides routines for
reading and writing global attributes, which are very likely to change
on a per-core and per-stream basis, but not for variable attributes,
which are (or will be) determined by the contents of the io_info
component of an MPAS field type. The reasoning here is that, in more
than 95% of cases (in my admittedly limited experience), the set of
variable attributes is fixed as those required for, e.g., CF
compliance -- attributes like short name, long name, units, etc. In the
remaining cases where we might like to attach additional attributes to
a field, that can always be accomplished through the low-level
interface routines MPAS_io_put_att() and MPAS_io_get_att(). That said,
I do think this would be a good opportunity to extend the field/io_info
types to handle arbitrary sets of attributes, which the I/O
implementation could then handle generically. As I mentioned, my
experience here is limited, so I'd be interested to hear about use
cases where we would need variable attributes beyond those required for
compliance with the various metadata standards.
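
For those remaining cases, I'd expect use of the low-level routines to
look something like the following; the handle type name and the
optional fieldname/ierr arguments are assumptions on my part:

    type (MPAS_IO_Handle_type) :: handle  ! low-level file handle (assumed name)
    character (len=StrKIND) :: units
    integer :: ierr

    ! Attach a nonstandard attribute to a particular variable...
    call MPAS_io_put_att(handle, 'integration_method', 'trapezoidal', &
                         fieldname='meridional_transport', ierr=ierr)

    ! ...and retrieve a variable attribute when reading a file.
    call MPAS_io_get_att(handle, 'units', units, &
                         fieldname='meridional_transport', ierr=ierr)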


> You might also want to separate the dimension layer (you already sort of
> have this in the io_info within Field) so you can separately define
> dimensions and attach them to fields.  CESM will eventually want an
> unlimited (time) dimension too, but we can worry about that later.


The proposed I/O layer does already handle unlimited dimensions, but I
agree that we could extend the general treatment of dimensions in the
io_info types contained within fields.


> For netCDF (or other self-defining formats), you will need multiple
> phases, especially for writing:
>   - creating the fields and adding all attributes
>   - "defining" the fields and dimensions (netCDF has a separate define
>     phase that writes all attributes and prepares for the binary portion
>     and it's very inefficient to jump in and out of the define phase)
>   - writing the field (generally writing the actual data since most
>     metadata is written during the define phase)


I'm quite familiar with netCDF and its "define" and "data" modes, and I
think the proposed design can accommodate these naturally by encouraging
(if not outright requiring) users to make all calls to
MPAS_addStream(s,field) before any calls to MPAS_writeStream(s); we can
always re-enter define mode, but at a cost. In each call to
MPAS_addStream, we define any yet-undefined dimensions as well as the
new field, and upon the first call to MPAS_writeStream, we simply leave
define mode and enter data mode. So there's no reason why the high-level
interface must explicitly identify a data mode and a define mode, so
long as the appropriate bookkeeping is performed within the
implementation.
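
Concretely, a typical write sequence might look like the following,
where the field names are placeholders and the error arguments are
assumed:

    ! Each MPAS_addStream() call defines the field, plus any of its
    ! dimensions not yet defined, while the underlying netCDF file is
    ! still in define mode.
    call MPAS_addStream(s, theta, ierr)
    call MPAS_addStream(s, rho, ierr)
    call MPAS_addStream(s, u, ierr)

    ! The first write leaves define mode and enters data mode behind
    ! the scenes; any MPAS_addStream() call made after this point would
    ! force a costly re-entry into define mode.
    call MPAS_writeStream(s, ierr)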


> For performance/memory reasons, you don't want to have to gather/copy
> data during the first two phases since you'll need to define all fields
> up front.  So the field should probably use a pointer with which you can
> point to the actual binary data just before reading/writing.


I certainly hadn't planned on gathering or copying data during the
define phases, but it's a point worth emphasizing in any case. The field
types do already contain a pointer to the arrays holding the actual
data.
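
For reference, the arrangement is roughly the following (simplified; a
2-d real field is shown as one representative kind):

    type field2DReal
       type (io_info), pointer :: ioinfo                   ! names, dimensions,
                                                           ! and I/O metadata
       real (kind=RKIND), dimension(:,:), pointer :: array ! points at the data
                                                           ! itself; no copy
    end type field2DReal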


> Are we going to support straight binary format?  If so, we'll need a
> design for how we store metadata in those files.  Note that I don't
> think this is a strong need anymore (netCDF cannot always guarantee
> exact restart if an architecture isn't using a form of IEEE binary
> format, but that is rarely the case these days).


The proposed design doesn't preclude the addition of new formats in the
future, so we could certainly add a binary format if needed.


> Basically, the current design is not complete enough, esp. wrt
> attributes, and should probably be fleshed out some more.  We can always
> prioritize aspects of the implementation so we can at least get
> multi-block I/O for registry variables, etc. up quickly.  But we'll need
> all of this before too long.


Besides attributes, could you list the specific deficiencies in the
proposed design so I can be sure to address them completely? I'm really
not being deliberately obtuse here.


> Sorry to add more work, but it's worth thinking about this stuff now.


I certainly agree!


>
> Phil
>
>
> On 2/24/12 2:08 PM, "Michael Duda" <duda at ucar.edu> wrote:
>
> Hi, Folks.
>
> I've been slowly working on a requirements and design document for
> a new I/O layer in MPAS that will provide parallel I/O (almost
> certainly to be implemented using PIO) and I/O for multiple blocks
> per MPI task. The Implementation and Testing chapters are still
> blank, as I first wanted to get some feedback on the requirements
> and proposed design to see whether I'm headed in the right
> direction.
>
> Attached is the document and its source; if anyone has questions,
> comments, or other suggestions, I'd be glad to hear them.
>
> Thanks!
> Michael
>
>
> ---
> Correspondence/TSPA/DUSA EARTH
> ------------------------------------------------------------
> Philip Jones                                pwjones at lanl.gov
> Climate, Ocean and Sea Ice Modeling
> Los Alamos National Laboratory
> T-3 MS B216                                 Ph: 505-500-2699
> PO Box 1663                                Fax: 505-665-5926
> Los Alamos, NM 87545-1663

