[pyngl-talk] PyNIO 1.5.0 Beta vs 1.4.1 - NetCDF Variable Access and Numpy

David Brown dbrown at ucar.edu
Thu Dec 31 11:48:06 MST 2015


I am not totally sure what is going on here, but I can tell you that
for PyNIO, as for its predecessor, Konrad Hinsen's Scientific NetCDF
package, the design was that you needed to "dereference" the
NioVariable object using indexing syntax to get the NumPy array values.
But the NioVariable object always supported the Python Sequence
protocol, and I believe that at some point, support for the Sequence
protocol in numpy was enhanced in a way that allowed NumPy arrays to
be derived from NioVariable objects. This was without any explicit
changes to support this feature in PyNIO.
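
A minimal sketch of that dereferencing pattern, for illustration only.
FakeVariable is a hypothetical stand-in for a NioVariable so that the
sketch runs without PyNIO or a data file; the one assumption is that
indexing the variable returns a NumPy array, as m[:,:] does in the
sessions quoted below. The helper name masked_equal_from_variable is
also made up for this example.

```python
import numpy


class FakeVariable(object):
    """Hypothetical stand-in for a NioVariable: indexing returns an ndarray."""

    def __init__(self, data):
        self._data = numpy.asarray(data)
        self.shape = self._data.shape

    def __getitem__(self, index):
        return self._data[index]


def masked_equal_from_variable(var, value):
    # Dereference once with indexing syntax, so NumPy receives a real ndarray
    # instead of having to convert the variable object itself.
    values = var[:]
    return numpy.ma.masked_equal(values, value)


m = FakeVariable([[0.0, 1.0], [0.0, 0.0]])
result = masked_equal_from_variable(m, 1.0)
```

With a real file, the same idea is simply to pass m[:] (or m[:,:]) to
the NumPy function rather than m.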

However, in my experience, trying to use this in practice has
extremely bad performance, because NumPy has no real knowledge of the
data that the NioVariable object refers to, and consequently it asks
for array elements one at a time. For anything but very small datasets
this is extremely inefficient. The chances are that your example with
1.5.0-beta did not actually fail. It was just taking an extremely long
time.
That said, I was not aware of a difference in performance between
1.4.1 and 1.5.0-beta. We can investigate, but as I said, this was
never an intended feature of PyNIO; it arose from later developments
in NumPy.
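
The repeated element access described above can be illustrated with a
small sketch. CountingSequence is a hypothetical class, not part of
PyNIO; it only supports the Sequence protocol (__len__ and __getitem__)
and counts how often NumPy indexes it. The exact number of calls
depends on the NumPy version, so the sketch compares counts rather than
stating them.

```python
import numpy


class CountingSequence(object):
    """Hypothetical sequence-only wrapper that counts __getitem__ calls."""

    def __init__(self, data):
        self._data = numpy.asarray(data)
        self.calls = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # Each call stands in for a separate read from the file.
        self.calls += 1
        return self._data[index]


seq = CountingSequence(numpy.zeros((24, 81)))

# Implicit conversion: NumPy walks the object through the Sequence protocol.
implicit = numpy.array(seq)
implicit_calls = seq.calls

# Explicit dereference: a single indexing call returns the whole array.
seq.calls = 0
explicit = seq[:, :]
explicit_calls = seq.calls

print("implicit calls: %d, explicit calls: %d" % (implicit_calls, explicit_calls))
```

In this sketch each indexing call is cheap because the data is already
in memory; with a file-backed variable every one of those calls is a
separate read, which is where the time goes.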
 -dave

On Tue, Dec 29, 2015 at 11:54 AM, Jason Greenlaw - NOAA Affiliate
<jason.greenlaw at noaa.gov> wrote:
> Hi Heather,
>
> Yes, it is an NioVariable object.  Seems that at 1.4.1, NioVariable provided
> some numpy functionality from the object itself rather than requiring you to
> extract the numpy array first.
>
> I am not an expert with PyNIO/numpy and this code was written by someone
> else, but I was under the impression NioVariable provided access to the
> arrays via an iterator (i.e. lazy loading), which would be preferable in
> some cases to loading the entire arrays into memory using numpy indexing.
> But I could be totally off base there.
>
> Output is below.
>
> Thanks,
> Jason
>
>
> $ python
> Python 2.7.2 (default, May  7 2012, 16:54:01)
> [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import Nio
>>>> import numpy
>>>> Nio.__version__
> '1.4.1'
>>>> numpy.__version__
> '1.6.1'
>>>> f = Nio.open_file("glofs.leofs.fields.nowcast.20151229.t12z.nc", "r")
>>>> m = f.variables["mask"]
>>>> m
> <Nio.NioVariable object at 0x2290050>
>>>> print type(m)
> <class 'Nio.NioVariable'>
>>>> m.shape
> (24, 81)
>>>> m_contents = m[:,:]
>>>> print type(m_contents)
> <type 'numpy.ndarray'>
>>>> numpy.ma.masked_equal(m, 1.0)
> masked_array(data =
>  [[0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  ...,
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]],
>              mask =
>  [[False False False ..., False False False]
>  [False False False ..., False False False]
>  [False False False ..., False False False]
>  ...,
>  [False False False ..., False False False]
>  [False False False ..., False False False]
>  [False False False ..., False False False]],
>        fill_value = 1.0)
>
>>>> numpy.ma.masked_equal(m_contents, 1.0)
> masked_array(data =
>  [[0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  ...,
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]],
>              mask =
>  [[False False False ..., False False False]
>  [False False False ..., False False False]
>  [False False False ..., False False False]
>  ...,
>  [False False False ..., False False False]
>  [False False False ..., False False False]
>  [False False False ..., False False False]],
>        fill_value = 1.0)
>
>
> --
> Jason Greenlaw
> Software Developer, ERT, Inc.
> NOAA/NOS/OCS/CSDL
> http://nowcoast.noaa.gov
> Jason.Greenlaw at noaa.gov
>
>
> On Tue, Dec 29, 2015 at 1:19 PM, Cronk,Heather <Heather.Cronk at colostate.edu>
> wrote:
>>
>> Hi Jason,
>>
>> I am a PyNIO user, not a developer, so I can't speak to any intentionality,
>> but I am more surprised that your code works with PyNIO version
>> 1.4.1 than that it doesn't work with the beta. I don't have the old version
>> anymore, but I am curious what the output of type(m) is with your original
>> code. I was under the impression that the call f.variables["mask"] had always
>> produced a Nio object and not a numpy array. Using the beta version I see
>> this:
>>
>>
>> m_obj = f.variables["mask"]
>>
>> print type(m_obj)
>>
>> >>  <class 'Nio.NioVariable'>
>>
>> m_contents = f.variables["mask"][:]
>>
>> print type(m_contents)
>>
>> >>  <type 'numpy.ndarray'>
>>
>>
>> What does the corresponding code produce with the 1.4.1?
>>
>>
>> Thanks!
>>
>> Heather
>>
>>
>> From: <pyngl-talk-bounces at ucar.edu> on behalf of Jason Greenlaw - NOAA
>> Affiliate <jason.greenlaw at noaa.gov>
>> Date: Tuesday, December 29, 2015 at 10:36 AM
>> To: "pyngl-talk at ucar.edu" <pyngl-talk at ucar.edu>
>> Subject: [pyngl-talk] PyNIO 1.5.0 Beta vs 1.4.1 - NetCDF Variable Access
>> and Numpy
>>
>> Hello,
>>
>> I recently installed the 1.5.0 beta versions of PyNGL and PyNIO (using
>> 64-bit binaries for CentOS6) and attempted to run some existing code, but
>> encountered an issue when numpy functions (e.g. numpy.ma.masked_equal()) are
>> called with NioVariable object arguments.
>>
>> At PyNIO v1.4.1/numpy1.6.1 I was able to do the following:
>>
>> >>> import Nio
>> >>> Nio.__version__
>> '1.4.1'
>> >>> import numpy
>> >>> numpy.__version__
>> '1.6.1'
>> >>> f = Nio.open_file("glofs.leofs.fields.nowcast.20151229.t12z.nc", "r")
>> >>> m = f.variables["mask"]
>> >>> numpy.ma.masked_equal(m, 1.0)
>> masked_array(data =
>>  [[0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  ...,
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]],
>>              mask =
>>  [[False False False ..., False False False]
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]
>>  ...,
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]],
>>        fill_value = 1.0)
>>
>>
>>
>> However at PyNIO 1.5.0 beta/numpy 1.9.2, the numpy function call hangs,
>> and the process begins consuming memory at an exponential rate until the
>> call is interrupted.
>>
>> >>> import Nio
>> >>> Nio.__version__
>> '1.5.0-beta'
>> >>> import numpy
>> >>> numpy.__version__
>> '1.9.2'
>> >>> f = Nio.open_file("glofs.leofs.fields.nowcast.20151229.t12z.nc", "r")
>> >>> m = f.variables["mask"]
>> >>> numpy.ma.masked_equal(m, 1.0)
>> ^C^CTraceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/opt/pyngl/python/lib/python2.7/site-packages/numpy/ma/core.py",
>> line 1982, in masked_equal
>>     output = masked_where(equal(x, value), x, copy=copy)
>>   File "/opt/pyngl/python/lib/python2.7/site-packages/numpy/ma/core.py",
>> line 928, in __call__
>>     (da, db) = (getdata(a, subok=False), getdata(b, subok=False))
>>   File "/opt/pyngl/python/lib/python2.7/site-packages/numpy/ma/core.py",
>> line 667, in getdata
>>     data = np.array(a, copy=False, subok=subok)
>>   File "/opt/pyngl/python/lib/python2.7/site-packages/PyNIO/Nio.py", line
>> 325, in __getitem__
>>     ret = get_variable(self.file, self.varname, xsel)
>>   File "/opt/pyngl/python/lib/python2.7/site-packages/PyNIO/coordsel.py",
>> line 60, in get_variable
>>     ret = file.file.variables[varname][xsel]
>> KeyboardInterrupt
>>
>>
>> But if I change the call to use numpy indexing, it works:
>>
>> >>> m.shape
>> (24, 81)
>> >>> numpy.ma.masked_equal(m[:,:], 1.0)
>> masked_array(data =
>>  [[0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  ...,
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]
>>  [0.0 0.0 0.0 ..., 0.0 0.0 0.0]],
>>              mask =
>>  [[False False False ..., False False False]
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]
>>  ...,
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]
>>  [False False False ..., False False False]],
>>        fill_value = 1.0)
>>
>>
>> Was this change in functionality intentional?
>>
>> The NetCDF files I used are available at:
>>
>>     ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nos/prod/glofs.20151229/
>>
>> (directory will change based on date)
>>
>> Thanks,
>> Jason
>> --
>> Jason Greenlaw
>> Software Developer, ERT, Inc.
>> NOAA/NOS/OCS/CSDL
>> http://nowcoast.noaa.gov
>> Jason.Greenlaw at noaa.gov
>>
>
>
> _______________________________________________
> pyngl-talk mailing list
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/pyngl-talk
>

