[pyngl-talk] xray et al (PyNIO, pandas, R)

Tom Roche Tom_Roche at pobox.com
Mon Jun 15 11:34:49 MDT 2015

For those who haven't heard already, xray[1] appears to be a "pythonic rewrite" (my characterization) of netCDF, covering both the API and the data structures. I'm not sure how it relates to (or competes with) the PyNIO/netCDF stack, but I have asked in the comments on this interesting blog post[2] about xray+dask[3]. I'd like to know:

1. Suppose a team is starting work on netCDF-based earth-science data and plans to develop as much as possible in Python. Why should they base their code on PyNIO rather than xray? What does PyNIO do better than xray?

2. xray seems to be targeting pandas[4] integration, which is compelling to me as someone seeking to migrate from a {bash, NCL, Python, R} stack to one that is mostly or entirely Python. So (apologies for the corporate-speak) I'd like to know "what is NCAR's story" WRT PyNIO/pandas integration?

3. IIUC (but ICBW, please correct where wrong) a major reason to choose R over Python/pandas for work on netCDF-based data is that pandas operations on dataframes (at least currently) don't conserve netCDF metadata the way R operations on dataframes do. (If I'm missing something, please lemme know.) If so, I'd appreciate knowing more about any plans "the netCDF folks" have for improving/extending pandas to solve this problem.

Apologies if

* I've lumped too much in one thread

* I've asked questions here that would be better asked elsewhere. If so, please point me to the more appropriate channel.

TIA, Tom Roche <Tom_Roche at pobox.com>

[1]: http://xray.readthedocs.org/en/stable/
[2]: http://eng.climate.com/2015/06/11/xray-dask-out-of-core-labeled-arrays-in-python/
[3]: http://dask.pydata.org/en/latest/
[4]: http://pandas.pydata.org/
