[ncl-talk] binary file read
Debasish Hazra
debasish.hazra5 at gmail.com
Thu Mar 30 20:18:41 MDT 2017
Many thanks, David and Mary, for all your input. It looks all right now.
Debasish
On Thu, Mar 30, 2017 at 5:55 PM, David Brown <dbrown at ucar.edu> wrote:
> My guess is that this file is a 2D 720 x 1440 array of doubles in
> little endian format. There is no specific IDL formatting.
> The file size is 8294400 which is exactly equal to 720 x 1440 x 8. It
> contains many NaNs (not a number) and it also has what is presumably a
> _FillValue with the value -999.9000244140625.
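One small check that may support the fill-value reading: -999.9000244140625 is exactly what -999.9 becomes when it is stored as a 32-bit float and then widened to a double. That would be consistent with a sentinel that was originally written as single-precision -999.9. A minimal sketch in Python (standard library only, used here as a neutral cross-check rather than NCL):

```python
import struct

# Round-trip -999.9 through a 32-bit IEEE float; struct returns the result
# as a Python float, which is a 64-bit double.
as_float32 = struct.unpack("<f", struct.pack("<f", -999.9))[0]

# Compare against the value found in the file.
matches_file_value = (as_float32 == -999.9000244140625)

print(repr(as_float32))
print(matches_file_value)
```

This does not prove the value is a fill value; it only shows the odd-looking trailing digits have a simple explanation.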
>
> Here's how I would read it:
>
> setfileoption("bin","ReadByteOrder","LittleEndian")
> d1 = cbinread("viirs_meandbdi_gridded_statis2015002.dat",-1,"double")
>
> Make an array of all the non-nan values (otherwise printMinMax will
> return NaN for both min and max)
> d1ind = ind(.not. isnan_ieee(d1))
> d1x = d1(d1ind)
> printMinMax(d1x,0)
> output: (0) min=-999.9000244140625 max=5.164013057067244
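The same read-then-filter step can be sketched outside NCL to cross-check the interpretation. The Python below (standard library only) decodes a byte string as little-endian doubles and drops the NaNs before taking min and max. The sample buffer is synthetic, built from values quoted in this thread, and the helper name read_le_doubles is made up for illustration; for the real file the bytes would come from open(path, "rb").read().

```python
import math
import struct

def read_le_doubles(raw):
    """Decode a bytes object as consecutive little-endian 64-bit floats."""
    count = len(raw) // 8
    return list(struct.unpack("<%dd" % count, raw[:count * 8]))

# Synthetic stand-in for the file contents (assumption: not the real data).
sample = [float("nan"), -999.9000244140625, 0.2350846065940295,
          float("nan"), 5.164013057067244]
raw = struct.pack("<%dd" % len(sample), *sample)

values = read_le_doubles(raw)

# Equivalent in spirit to d1(ind(.not. isnan_ieee(d1))) in NCL.
finite = [v for v in values if not math.isnan(v)]

print(min(finite), max(finite))
```

Filtering first matters in Python as well, since min() and max() give order-dependent results when NaNs are present.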
>
> If you scroll through the variable d1x values you will see that the
> min value is clearly an outlier and therefore is most likely a fill
> value.
> So set the _FillValue
> d1x@_FillValue = -999.9000244140625
> Now
> printMinMax(d1x,0)
> output: (0) min=0.2350846065940295 max=5.164013057067244
>
> Hopefully these are reasonable values.
>
> Now set the _FillValue for the original data and turn the NaNs into
> _FillValue
> d1@_FillValue = d1x@_FillValue
> d1 = where(isnan_ieee(d1),d1@_FillValue, d1)
> ncl 73> printMinMax(d1,0)
> (0) min=0.2350846065940295 max=5.164013057067244
>
> But note that out of the whole array there are not very many valid values:
>
> ncl 74> printVarSummary(d1)
> Variable: d1
> Type: double
> Total Size: 8294400 bytes
> 1036800 values
> Number of Dimensions: 1
> Dimensions and sizes: [1036800]
> Coordinates:
> Number Of Attributes: 1
> _FillValue : -999.9000244140625
>
> ncl 75> print(num(.not. ismissing(d1)))
> (0) 1820
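For comparison, 1820 valid values out of 1036800 is roughly 0.18% of the grid. The NaN-to-fill replacement and the count of valid values can be sketched in Python (standard library only) as follows. The sample list is synthetic and the helper names nan_to_fill and valid_only are made up for illustration; only the fill value is taken from the thread.

```python
import math

FILL = -999.9000244140625  # fill value reported in the thread

def nan_to_fill(values, fill=FILL):
    """Return a copy with every NaN replaced by the fill value."""
    return [fill if math.isnan(v) else v for v in values]

def valid_only(values, fill=FILL):
    """Return the values that are neither NaN nor equal to the fill value."""
    return [v for v in values if not math.isnan(v) and v != fill]

# Synthetic sample (assumption: illustrative values, not the real grid).
sample = [float("nan"), FILL, 0.2350846065940295,
          float("nan"), 5.164013057067244]

cleaned = nan_to_fill(sample)   # like where(isnan_ieee(d1), fill, d1)
valid = valid_only(cleaned)     # like the points where .not. ismissing(d1)

print(len(valid), min(valid), max(valid))
```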
>
> Nevertheless I believe this is the correct interpretation of this dataset.
> -dave
>
>
>
>
> On Thu, Mar 30, 2017 at 2:29 PM, Debasish Hazra
> <debasish.hazra5 at gmail.com> wrote:
> > Thanks Gus. Mary and I both tried the "endian" options, and I am
> > presently trying
> >
> > setfileoption("bin","readbyteorder","bigendian")
> >
> > which seems to produce a reasonable minimum and maximum for the data
> > values. However, as Mary mentioned, a large number of values are
> > constant, which is a bit strange.
> >
> > You mentioned "double", and I think the input is double-precision
> > floating-point data, 8 bytes per value.
> >
> > Thanks.
> > Debasish
> >
> > On Thu, Mar 30, 2017 at 4:06 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
> >>
> >> Hi Mary, Debasish
> >>
> >> Could it be a little-endian vs. big-endian issue?
> >> I don't know IDL (I should! My boss uses it! :) )
> >> but their "read_binary" default endianness is "native" (like NCL).
> >> I.e., the endianness of the data in the file depends on the machine
> >> on which it was created (and data_type=5 is indeed double precision).
> >>
> >> Maybe using setfileoption("bin","ReadByteOrder","BigEndian"),
> >> and trying also "LittleEndian" if not lucky with "Big"
> >> (who knows where the file was written ....),
> >> then cbinread/fbindirread with datatype "double" would help?
> >> Just a guess, and you probably tried the endianness thing already ...
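To illustrate why byte order matters so much for doubles, here is a minimal Python sketch (standard library only). It writes one value in little-endian order and then decodes the same eight bytes both ways. The value is one quoted in this thread; nothing here reads the actual file.

```python
import struct

value = 5.164013057067244        # a data value quoted in the thread
raw = struct.pack("<d", value)   # the 8 bytes a little-endian writer would produce

as_little = struct.unpack("<d", raw)[0]  # decoded with the matching byte order
as_big = struct.unpack(">d", raw)[0]     # same bytes, opposite byte order

print(as_little)
print(as_big)
```

With the wrong order the exponent bits are taken from what were the low mantissa bytes, so the result is typically absurdly large, absurdly small, or NaN, rather than merely slightly off.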
> >>
> >> Best,
> >> Gus Correa
> >>
> >> On 03/30/2017 02:54 PM, Mary Haley wrote:
> >> > Hi Debasish,
> >> >
> >> > Dennis guessed that maybe the "read_binary" function in IDL was meant to
> >> > read files created by "write_binary", but I didn't see a function with
> >> > that name. However, is it possible that this is some kind of special
> >> > IDL file and not a flat C binary file?
> >> >
> >> > In your IDL script, you have:
> >> >
> >> >
> >> > fdata=read_binary('viirs_meandbdi_gridded_statis2013'+day+'.dat',data_type=5,data_dims=[1440,720])
> >> >
> >> > If you read the documentation for "read_binary", it states that
> >> > "data_type=5" is double.
> >> >
> >> > In your NCL script, you are reading the data as an unsigned integer.
> >> >
> >> > I tried reading your data as a double, but I get what looks like
> >> > nonsensical values:
> >> >
> >> > min=-1.642556686681977e+308 max=6.633924105807938e+307
> >> >
> >> > You are right that the unsigned integer values look reasonable, but
> >> > only after you multiply them by 1e-9.
> >> >
> >> > When I look at your unsigned values, I see that 517,484 of your
> >> > values are equal to the same number: 6.3615e-05, while only 1,831
> >> > values are equal to something else.
> >> > This seems a bit suspicious to me, and is likely the source of the
> >> > problem.
> >> >
> >> > I modified your script to plot red markers where the values are all
> >> > equal to 6.3615e-05, and black markers everywhere else. Does this look
> >> > correct?
> >> >
> >> > I have a feeling that there's something more to the "read_binary"
> >> > function that we need to know in order to read the file correctly.
> >> > As I think I mentioned before: perhaps each byte of data represents
> >> > something different, and you need to use something like dim_gbits
> >> > to pick off values.
> >> >
> >> > In your IDL script, is there anything additional you have to do to
> >> > the data before you plot it? Can you check in IDL whether you also
> >> > get a lot of values equal to the same constant value that NCL gives?
> >> >
> >> > --Mary
> >> >
> >> >
> >> >
> >> > On Thu, Mar 30, 2017 at 8:36 AM, Debasish Hazra
> >> > <debasish.hazra5 at gmail.com> wrote:
> >> >
> >> > Mary,
> >> >
> >> > Thanks. Taking your suggestion, reading the file as 2 * 720 * 1440 and
> >> > assuming the input is a C binary file, I am getting min=1.4e-08
> >> > max=4.29371, which is reasonable. Attached is the new script. Any
> >> > suggestions?
> >> >
> >> > Debasish
> >> >
> >> > On Wed, Mar 29, 2017 at 5:28 PM, Mary Haley <haley at ucar.edu> wrote:
> >> >
> >> > Hi Debasish,
> >> >
> >> > Kevin and I took a look at this. For starters, there *is* an
> >> > error message coming out of your script:
> >> >
> >> > warning:cbinread: The size implied by the dimension arrays is
> >> > greater that the size of the file.
> >> > The default _FillValue for the specified type will be filled
> >> > in.
> >> > Note dimensions and values may not be aligned properly
> >> >
> >> > If you look at the size of the file, it doesn't match with the
> >> > dimensions you're requesting:
> >> >
> >> > Size of file = 8294400 bytes
> >> >
> >> > Size of dimensions = 5 * 720 * 1440 * 4 (for a uint) = 20736000
> >> >
> >> > If this is truly a C binary file, it looks like it only has
> >> > 2 * 720 * 1440 * 4 bytes.
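The size comparison above is easy to script as a sanity check before attempting any read. A minimal Python sketch (standard library only, Python 3.8+ for math.prod); the helper name implied_bytes is made up for illustration, and the file size is the one reported in this thread:

```python
import math

FILE_SIZE = 8294400  # bytes, as reported in the thread

def implied_bytes(dims, item_size):
    """Bytes needed to hold an array with the given dimensions and item size."""
    return math.prod(dims) * item_size

requested = implied_bytes((5, 720, 1440), 4)    # 4-byte uint, five layers
two_layers = implied_bytes((2, 720, 1440), 4)   # 4-byte values, two layers

print(requested, requested > FILE_SIZE)
print(two_layers, two_layers == FILE_SIZE)
```

Note that a size match alone does not settle the type: a single 720 x 1440 layer of 8-byte values gives the same total, so the byte count has to be weighed together with what the decoded values look like.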
> >> >
> >> > This doesn't really change the results, however, because you
> >> > still get two strange looking plots.
> >> >
> >> > We tried several different things:
> >> >
> >> > 1) reading the data as ubyte, int, and ushort
> >> > 2) reversing the array to 1440 x 720 x 2
> >> > 3) reading the data as little endian
> >> > 4) plotting the data as a simple contour plot to take out the
> >> > map component.
> >> >
> >> > Nothing we did produced more information about the file, or
> >> > produced better plots.
> >> >
> >> > Is there some documentation on this file to understand how it
> >> > was written? For example, are you sure the "uint" type is
> >> > correct? Are you sure the dimension sizes are correct? Why are
> >> > the values so large? Is it possible that this is "packed" data,
> >> > and that you need to use a function like dim_gbits to pick off
> >> > individual bits of information?
> >> >
> >> > If you can find a C or Fortran code that was used to create this
> >> > file, then it should be fairly straightforward to figure out how
> >> > to read it.
> >> >
> >> > --Mary
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Mar 29, 2017 at 2:18 PM, Debasish Hazra
> >> > <debasish.hazra5 at gmail.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > I am trying to read a binary file with the attached code,
> >> > but I am getting all empty fields in the figure with no apparent
> >> > error message. I uploaded the data file to the ftp server as
> >> > "viirs_meandbdi_gridded_statis2015048.dat". Any help with
> >> > this is appreciated.
> >> >
> >> > Thanks.
> >> > Debasish
> >> >
> >> > On Wed, Mar 22, 2017 at 10:33 AM, Debasish Hazra
> >> > <debasish.hazra5 at gmail.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > I am trying to read a binary file with the attached
> >> > code, but I am getting all empty fields in the figure with
> >> > no apparent error message. I uploaded the data file to
> >> > the ftp server as
> >> > "viirs_meandbdi_gridded_statis2015002.dat". Any help
> >> > with this is appreciated.
> >> >
> >> > Thanks.
> >> > Debasish.
> >> >
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > ncl-talk mailing list
> >> > ncl-talk at ucar.edu
> >> > List instructions, subscriber options, unsubscribe:
> >> > http://mailman.ucar.edu/mailman/listinfo/ncl-talk
> >> >
> >>
> >
> >
>