[ncl-talk] binary file read

David Brown dbrown at ucar.edu
Thu Mar 30 15:55:35 MDT 2017


My guess is that this file is a 2D 720 x 1440 array of doubles in
little-endian format, with no IDL-specific formatting.
The file size is 8294400 bytes, which is exactly 720 x 1440 x 8. It
contains many NaNs (not a number), and it also has what is presumably a
_FillValue of -999.9000244140625.
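
The size arithmetic can be cross-checked outside NCL. The following is a minimal Python sketch, not part of the original analysis; the helper names expected_bytes and as_float32 are made up for illustration. It also shows that the byte count alone cannot distinguish one grid of 8-byte doubles from two grids of 4-byte values (the reading tried earlier in this thread), and that -999.9000244140625 is exactly what -999.9 becomes when rounded to single precision. That the fill value was originally a single-precision constant is an assumption, not something the thread states.

```python
import struct

def expected_bytes(nrows, ncols, itemsize):
    """Bytes needed for a flat nrows x ncols array with no header or record markers."""
    return nrows * ncols * itemsize

def as_float32(x):
    """Round a Python float (a double) to single precision and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

FILE_SIZE = 8294400  # size in bytes reported in this thread

# One 720 x 1440 grid of 8-byte doubles matches the file size ...
print(expected_bytes(720, 1440, 8) == FILE_SIZE)
# ... but so do two 720 x 1440 grids of 4-byte values.
print(expected_bytes(720, 1440, 4) * 2 == FILE_SIZE)
# -999.9 stored as a 4-byte float and widened back to a double.
print(repr(as_float32(-999.9)))
```

Because both layouts fit the size, the value range and the NaN pattern are what actually support the "double" reading.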

Here's how I would read it:

setfileoption("bin","ReadByteOrder","LittleEndian")
d1 = cbinread("viirs_meandbdi_gridded_statis2015002.dat",-1,"double")
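
To see why the byte order matters, here is a small Python sketch (an illustration only, using synthetic bytes rather than the real file; decode_doubles is a hypothetical helper). It decodes the same bytes once as little endian and once as big endian.

```python
import struct

def decode_doubles(raw, byteorder="little"):
    """Interpret raw bytes as a flat sequence of 8-byte IEEE doubles."""
    if len(raw) % 8 != 0:
        raise ValueError("byte count is not a multiple of 8")
    prefix = "<" if byteorder == "little" else ">"
    return list(struct.unpack(f"{prefix}{len(raw) // 8}d", raw))

# Synthetic stand-in for the file contents, written as little-endian doubles.
written = [0.2350846065940295, 5.164013057067244, -999.9000244140625]
raw = struct.pack("<3d", *written)

print(decode_doubles(raw, "little"))  # recovers the values that were written
print(decode_doubles(raw, "big"))     # same bytes, wrong byte order: unrelated numbers
```

For a real file, the bytes would come from open(path, "rb").read(); the path is whatever the data file is called locally.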

Make an array of all the non-NaN values (otherwise printMinMax will
return NaN for both min and max):
d1ind = ind(.not. isnan_ieee(d1))
d1x = d1(d1ind)
printMinMax(d1x,0)
  output: (0)     min=-999.9000244140625   max=5.164013057067244
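
The same NaN filtering step, sketched in Python for comparison (the function name nan_free_min_max and the sample values are illustrative; the sample reuses numbers quoted in this message, not the full dataset):

```python
import math

def nan_free_min_max(values):
    """Return (min, max) over the entries that are not NaN."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        raise ValueError("every value is NaN")
    return min(kept), max(kept)

sample = [float("nan"), 5.164013057067244, -999.9000244140625, 0.2350846065940295]
print(nan_free_min_max(sample))
```

As in the NCL session, the minimum at this stage is still the fill value, because only the NaNs have been excluded so far.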

If you scroll through the variable d1x values you will see that the
min value is clearly an outlier and therefore is most likely a fill
value.
So set the _FillValue:
d1x@_FillValue = -999.9000244140625
Now
printMinMax(d1x,0)
   output: (0)     min=0.2350846065940295   max=5.164013057067244

Hopefully these are reasonable values.

Now set the _FillValue for the original data and turn the NaNs into _FillValue:
d1@_FillValue = d1x@_FillValue
d1 = where(isnan_ieee(d1),d1@_FillValue, d1)
ncl 73> printMinMax(d1,0)
(0)     min=0.2350846065940295   max=5.164013057067244
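
For readers who want to reproduce the masking logic outside NCL, a rough Python equivalent follows. It is a sketch under the assumption that the fill value is the one identified above; nan_to_fill and masked_min_max are hypothetical names, and the sample is synthetic.

```python
import math

FILL = -999.9000244140625  # fill value identified in this thread

def nan_to_fill(values, fill=FILL):
    """Return a copy of values with every NaN replaced by the fill value."""
    return [fill if math.isnan(v) else v for v in values]

def masked_min_max(values, fill=FILL):
    """Min and max over entries that differ from fill; expects NaNs already replaced."""
    kept = [v for v in values if v != fill]
    if not kept:
        raise ValueError("every value is missing")
    return min(kept), max(kept)

sample = [float("nan"), 0.2350846065940295, FILL, 5.164013057067244, float("nan")]
cleaned = nan_to_fill(sample)
print(masked_min_max(cleaned))
print(sum(v != FILL for v in cleaned))  # number of valid entries
```

Folding the NaNs into the fill value first means a single comparison against the fill value is enough to identify every missing point afterwards.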

But note that, out of the whole array, there are not very many valid values:

ncl 74> printVarSummary(d1)
Variable: d1
Type: double
Total Size: 8294400 bytes
            1036800 values
Number of Dimensions: 1
Dimensions and sizes: [1036800]
Coordinates:
Number Of Attributes: 1
  _FillValue : -999.9000244140625

ncl 75> print(num(.not. ismissing(d1)))
(0)     1820
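
To put that count in proportion, and to recover the 2D shape from the 1D array that the -1 dimension argument returns, here is a short Python sketch. It assumes the 1440-point dimension varies fastest in the file, which is how IDL's data_dims=[1440,720] would lay the data out; to_rows and valid_fraction are illustrative names.

```python
def valid_fraction(n_valid, n_total):
    """Share of grid points that hold real data."""
    return n_valid / n_total

def to_rows(flat, nrows=720, ncols=1440):
    """Split a flat sequence into nrows rows of ncols values (last index fastest)."""
    if len(flat) != nrows * ncols:
        raise ValueError("length does not match nrows * ncols")
    return [flat[r * ncols:(r + 1) * ncols] for r in range(nrows)]

# 1820 valid values out of 720 * 1440 = 1036800 grid points.
print(f"{valid_fraction(1820, 720 * 1440):.4%}")

# Tiny stand-in grid to show the row layout; the real call would use the defaults.
print(to_rows([0, 1, 2, 3, 4, 5], nrows=2, ncols=3))
```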

Nevertheless I believe this is the correct interpretation of this dataset.
 -dave




On Thu, Mar 30, 2017 at 2:29 PM, Debasish Hazra
<debasish.hazra5 at gmail.com> wrote:
> Thanks Gus. Mary and I both tried the "endian" options, and I am presently
> trying
>
> setfileoption("bin","readbyteorder","bigendian"), which seems to
> produce a reasonable minimum and maximum of the data values. However, as Mary
> mentioned, a large number of values are constant, which is a bit strange.
>
> You mentioned "double", and I think the input is double-precision
> floating-point data, which is 8 bytes per value.
>
> Thanks.
> Debasish
>
> On Thu, Mar 30, 2017 at 4:06 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
>>
>> Hi Mary, Debasish
>>
>> Could it be a little-endian vs. big-endian issue?
>> I don't know IDL (I should! My boss uses it! :) )
>> but their "read_binary" default endianness is "native" (like NCL).
>> I.e., the endianness of the data in the file depends on the
>> machine on which it was created (and data_type=5 is indeed double precision).
>>
>> Maybe using setfileoption("bin","ReadByteOrder","BigEndian"),
>> and trying also "LittleEndian" if not lucky with "Big"
>> (who knows where the file was written ....),
>> then cbinread/fbindirread with datatype "double" would help?
>> Just a guess, and you probably tried the endianness thing already ...
>>
>> Best,
>> Gus Correa
>>
>> On 03/30/2017 02:54 PM, Mary Haley wrote:
>> > Hi Debasish,
>> >
>> > Dennis guessed that maybe the "read_binary" function in IDL was meant to
>> > read files created by "write_binary" but I didn't see a function with
>> > that name. However, is it possible that this is some kind of special IDL
>> > file and not a flat C binary file?
>> >
>> > In your IDL script, you have:
>> >
>> >
>> > fdata=read_binary('viirs_meandbdi_gridded_statis2013'+day+'.dat',data_type=5,data_dims=[1440,720])
>> >
>> > If you read the documentation for "read_binary", it states that
>> > "data_type=5" is double.
>> >
>> > In your NCL script, you are reading the data as an unsigned integer.
>> >
>> > I tried reading your data as a double, but I get what looks like
>> > nonsensical values:
>> >
>> >  min=-1.642556686681977e+308   max=6.633924105807938e+307
>> >
>> > You are right that the unsigned integer values look reasonable, but only
>> > after you multiply them by 1e-9.
>> >
>> > When I look at your unsigned values, I see that 517,484 of your values
>> > are equal to the same number, 6.3615e-05, while only 1,831 values are
>> > equal to something else.
>> > This seems a bit suspicious to me, and is likely the source of the
>> > problem.
>> >
>> > I modified your script to plot red markers where the values are all
>> > equal to 6.3615e-05, and black markers everywhere else. Does this look
>> > correct?
>> >
>> > I have a feeling that there's something more to the "read_binary"
>> > function that we need to know in order to read the file correctly.  As I
>> > think I mentioned before: perhaps each byte of data represents something
>> > different, and you need to use something like dim_gbits to pick off
>> > values.
>> >
>> > In your IDL script, is there anything you have to do additionally to the
>> > data before you plot it?  Can you check the IDL script to see if you are
>> > getting a lot of values equal to the same constant value that NCL is?
>> >
>> > --Mary
>> >
>> >
>> >
>> > On Thu, Mar 30, 2017 at 8:36 AM, Debasish Hazra
>> > <debasish.hazra5 at gmail.com> wrote:
>> >
>> >     Mary,
>> >
>> >     Thanks. Taking your suggestion and reading that as 2 * 720 * 1440 and
>> >     assuming the input is a C binary file, I am getting min=1.4e-08
>> >     max=4.29371, which is reasonable. Attached is the new script. Any
>> >     suggestions?
>> >
>> >     Debasish
>> >
>> >     On Wed, Mar 29, 2017 at 5:28 PM, Mary Haley <haley at ucar.edu>
>> >     wrote:
>> >
>> >         Hi Debasish,
>> >
>> >         Kevin and I took a look at this. For starters, there *is* an
>> >         error message coming out of your script:
>> >
>> >         warning:cbinread: The size implied by the dimension arrays is
>> >         greater that the size of the file.
>> >          The default _FillValue for the specified type will be filled in.
>> >          Note dimensions and values may not be aligned properly
>> >
>> >         If you look at the size of the file, it doesn't match with the
>> >         dimensions you're requesting:
>> >
>> >         Size of file = 8294400 bytes
>> >
>> >         Size of dimensions = 5 * 720 * 1440 * 4 (for a uint) = 20736000
>> >
>> >         If this is truly a C binary file, it looks like it only has 2 *
>> >         720 * 1440 * 4 bytes.
>> >
>> >         This doesn't really change the results, however, because you
>> >         still get two strange looking plots.
>> >
>> >         We tried several different things:
>> >
>> >         1) reading the data as ubyte, int, and ushort
>> >         2) reversing the array to 1440 x 720 x 2
>> >         3) reading the data as little endian
>> >         4) plotting the data as a simple contour plot to take out the
>> >         map component.
>> >
>> >         Nothing we did produced more information about the file, or
>> >         produced better plots.
>> >
>> >         Is there some documentation on this file to understand how it
>> >         was written? For example, are you sure the "uint" type is
>> >         correct? Are you sure the dimension sizes are correct? Why are
>> >         the values so large? Is it possible that this is "packed" data,
>> >         and that you need to use a function like dim_gbits to pick off
>> >         individual bits of information?
>> >
>> >         If you can find a C or Fortran code that was used to create this
>> >         file, then it should be fairly straightforward to figure out how
>> >         to read it.
>> >
>> >         --Mary
>> >
>> >
>> >
>> >
>> >
>> >
>> >         On Wed, Mar 29, 2017 at 2:18 PM, Debasish Hazra
>> >         <debasish.hazra5 at gmail.com> wrote:
>> >
>> >             Hi,
>> >
>> >             I am trying to read a binary file with the attached code,
>> >             but I am getting all empty fields in the figure with no apparent
>> >             error message. I uploaded the data file to the ftp server as
>> >             "viirs_meandbdi_gridded_statis2015048.dat". Any help with
>> >             this is appreciated.
>> >
>> >             Thanks.
>> >             Debasish
>> >
>> >             On Wed, Mar 22, 2017 at 10:33 AM, Debasish Hazra
>> >             <debasish.hazra5 at gmail.com> wrote:
>> >
>> >                 Hi,
>> >
>> >                 I am trying to read a binary file with the attached
>> >                 code, but I am getting all empty fields in the figure with
>> >                 no apparent error message. I uploaded the data file to
>> >                 the ftp server as
>> >                 "viirs_meandbdi_gridded_statis2015002.dat". Any help
>> >                 with this is appreciated.
>> >
>> >                 Thanks.
>> >                 Debasish.
>> >
>> >
>> >
>> >
>> >             _______________________________________________
>> >             ncl-talk mailing list
>> >             ncl-talk at ucar.edu
>> >             List instructions, subscriber options, unsubscribe:
>> >             http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
>
>
>

