[mpas-developers] MPAS I/O requirements and design doc

Jones, Philip W pwjones at lanl.gov
Thu Mar 1 11:34:09 MST 2012


Michael,

Responses to your responses...

On 2/28/12 5:37 PM, "Michael Duda" <duda at ucar.edu> wrote:

> Would this requirement be to address floating-point issues during
> reading and writing fields? It seems like these might be beyond our
> control when working with the netCDF library, for example; and I'm a bit
> hesitant to implement our own binary format.

I was not asking for the I/O layer to handle binary formats internally.  On most machines, netCDF’s use of IEEE format is sufficient to satisfy the exact restart requirement, and I’m ok with that in the near term.  Exact restart is a hard requirement, however, so if we run into an architecture that isn’t using IEEE format, we would need to support a native binary option as a backup (using Fortran/C binary I/O functions); we can worry about that later as the need arises.  We’ve been able to write these in such a way that they use the same interface as other I/O options, so I don’t think it affects anything at the interface level, and it can be added as needed.
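
To make "exact restart" concrete, here is a minimal sketch of a bit-for-bit round trip through a raw binary file. It is in Python only as a stand-in for the Fortran/C binary I/O functions mentioned above; the function names (write_native, read_native) and the sample values are hypothetical and not part of any MPAS interface.

```python
import os
import struct
import tempfile


def write_native(path, values):
    """Write floats as raw 64-bit doubles in native byte order."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"={len(values)}d", *values))


def read_native(path):
    """Read back every 64-bit double stored in the file."""
    with open(path, "rb") as f:
        raw = f.read()
    return list(struct.unpack(f"={len(raw) // 8}d", raw))


def bits(x):
    """Return the 8-byte representation of a double, for bitwise comparison."""
    return struct.pack("=d", x)


# Hypothetical model state, including values that do not survive rounding.
state = [0.1, 1.0 / 3.0, 2.0 ** -1074, 1.0e308]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "restart.bin")
    write_native(path, state)
    restored = read_native(path)

# Exact restart: every restored value is bit-for-bit identical.
exact = all(bits(a) == bits(b) for a, b in zip(state, restored))

# For contrast, a formatted (text) round trip with limited digits is lossy.
lossy = [float(f"{x:.6e}") for x in state]
```

The point of the comparison is that the restart test is on the bits, not on approximate equality: a raw dump preserves them, while a formatted write with too few digits does not.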

> CF compliance (in terms of metadata) is something that we could indeed
> tackle in the high-level I/O interface. I'll add this as a requirement.

As long as you have attributes at both the field and file levels, CF compliance is easy, whether it’s enforced by the interface (i.e., by having required attributes as well as additional attributes) or just left to the component developer to ensure the necessary metadata is included.


> ... Do you think the use of MPAS field types by
> the high-level interface is at odds with this requirement?

I think it’s fine.  Just worrying a bit about the case where you might want to dump a derived field that you weren’t treating as a field (a naked array as ESMF used to call them).  But I suspect we’re going to need to define a field in these cases anyway to support the proper metadata.  I don’t have a strong need/requirement for I/O on bare nekkid arrays.

> There are routines MPAS_readStreamAtt() and MPAS_writeStreamAtt()
> through which an arbitrary number of attributes can be read or written
> to a stream, but I'm not sure I follow the distinction between
> reading/writing attributes and adding/removing attributes; could you
> explain? Would we need to add an attribute but not write it, for
> example?

Regarding the attributes at the stream/field levels, you are correct; I was thinking too narrowly about previous implementations.  You can indeed use MPAS_writeStreamAtt() as long as you’re tracking define mode.  I would prefer returning an error when trying to write attributes while not in define mode, with hard enforcement of writing all attributes at the proper time, because re-entering define mode, especially for high-res simulations, is a job-killer.  So we wouldn’t need an attribute layer for streams or any of the add/remove interfaces.
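
A minimal sketch of the enforcement I have in mind, assuming a stream object that tracks define mode itself. The Stream class, its methods, and DefineModeError are hypothetical illustrations in Python, not the MPAS interface; the real routines would be Fortran.

```python
class DefineModeError(RuntimeError):
    """Raised when an attribute is written after define mode has ended."""


class Stream:
    """Hypothetical stream that only accepts attributes in define mode."""

    def __init__(self, name):
        self.name = name
        self.attributes = {}
        self.records = []
        self.in_define_mode = True

    def write_att(self, key, value):
        # Refuse late attributes instead of silently re-entering define mode.
        if not self.in_define_mode:
            raise DefineModeError(
                f"cannot write attribute '{key}' to stream '{self.name}': "
                "define mode has already ended"
            )
        self.attributes[key] = value

    def write_record(self, data):
        # The first data write ends define mode; there is no way back.
        self.in_define_mode = False
        self.records.append(data)


stream = Stream("output")
stream.write_att("title", "hypothetical run")
stream.write_record([1.0, 2.0, 3.0])

try:
    stream.write_att("history", "added too late")
    late_write_rejected = False
except DefineModeError:
    late_write_rejected = True
```

The design choice here is that a late attribute is a programming error reported to the caller, rather than something the I/O layer quietly accommodates at the cost of rewriting the file.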

And I was certainly in agreement with keeping the field attributes in the field layer (rather than the I/O layer).  But the current field types in mpas_grid_types do not seem to have any metadata other than fieldName, so I didn’t see any place for field attributes.

You are correct that for most fields the attributes are pretty standard, but there are cases (e.g. normalization, time averaging, some BGC fields) where additional metadata can be useful, and users should have the flexibility to add field metadata when it’s needed.  So it would be beneficial to have a more generic attribute layer in the field with add/remove functions (and still keep the “standard” attributes in the interfaces).  User-defined attributes are not a high priority, however, and can be added later, but we will need the standard CF attributes.  Also, as we start to add these features, we may want a separate field module; that is a decision for a future date.
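
As a rough illustration of that split between "standard" and user-defined attributes, here is a minimal Python sketch. The Field class and its add_att/remove_att/all_atts methods are hypothetical and do not reflect the current mpas_grid_types; units, long_name, and cell_methods are real CF attribute names, used here only as examples.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical field type: required standard attributes plus extras."""

    field_name: str
    units: str          # standard attribute, required by the interface
    long_name: str      # standard attribute, required by the interface
    user_atts: dict = field(default_factory=dict)

    def add_att(self, name, value):
        """Add a user-defined attribute; standard ones cannot be shadowed."""
        if name in ("units", "long_name"):
            raise ValueError(f"'{name}' is a standard attribute; set it directly")
        self.user_atts[name] = value

    def remove_att(self, name):
        """Remove a user-defined attribute if it is present."""
        self.user_atts.pop(name, None)

    def all_atts(self):
        """Return standard and user-defined attributes as one dictionary."""
        atts = {"units": self.units, "long_name": self.long_name}
        atts.update(self.user_atts)
        return atts


temperature = Field("temperature", units="degC", long_name="potential temperature")
temperature.add_att("cell_methods", "time: mean")   # e.g. a time-averaged field
temperature.add_att("scratch_note", "temporary")
temperature.remove_att("scratch_note")
```

With this arrangement the I/O layer would only need to ask a field for all_atts() while in define mode; it never has to know which attributes were standard and which were added by the user.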

> > Basically, the current design is not complete enough, esp. wrt
> > attributes, and should probably be fleshed out some more.  We can always
> > prioritize aspects of the implementation so we can at least get
> > multi-block I/O for registry variables, etc. up quickly.  But we'll need
> > all of this before too long.
>
> Besides attributes, could you list the specific deficiencies in the
> proposed design so I can be sure to address them completely? I'm really
> not being deliberately obtuse here.

Sorry, that came across badly – didn’t mean to sound harsh.  The attributes were really the primary missing piece and I’m ok with the stream interfaces after thinking about them some more.  Guess the field data type might need some further additions to include the field attributes.  Might be nice to have a write use case in addition to your read example that also covers field definition.

Thanks,

Phil

---
Correspondence/TSPA/DUSA EARTH
------------------------------------------------------------
Philip Jones                                pwjones at lanl.gov
Climate, Ocean and Sea Ice Modeling
Los Alamos National Laboratory
T-3 MS B216                                 Ph: 505-500-2699
PO Box 1663                                Fax: 505-665-5926
Los Alamos, NM 87545-1663


