[Met_help] [rt.rap.ucar.edu #84094] History for Re: segmentation fault when running PB2NC on some prepbufr files from the RAP

John Halley Gotway via RT met_help at ucar.edu
Wed Jul 10 16:59:46 MDT 2019


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi Chunhua and Ming,
    I'm resending with the correct e-mail for MET Help.
Jonathan



On 02/15/2018 02:10 PM, Jonathan Vigh wrote:
>
> Dear Met Help,
>
> I'm working with Chunhua Zhou to do some verification for the DTC DA 
> task and I'm getting a segmentation fault when running PB2NC on 
> Cheyenne to process some prepbufr files from the RAP. It runs for 
> about 20 minutes and then segfaults. Before it segfaults, it produced 
> the following output:
>
> jvigh at r2i7n33:~/DTC/DA/verification/verify_source/MET_scripts> 
> ./da_pb2nc.ksh
> FCST_TIME=00
> valid time for  00 h forecast =  2016090300
> CALLING: /glade/u/home/jvigh/DTC/DA/verification/MET_exec/pb2nc 
> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903 
> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/pb2nc/rap.2016090300.prepbufr.tm00.nc 
> /glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task 
> -v 2 -index
> DEBUG 1: Default Config File: 
> /glade/p/ral/jnt/HRRR/code/met-6.1/share/met/config/PB2NCConfig_default
> DEBUG 1: User Config File: 
> /glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
> DEBUG 1:
> DEBUG 1: Pre-processing Bufr File for metadata (BUFR variable names) 
> from 
> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
> DEBUG 1:
> 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 
> 95% 100% DEBUG 1:
> DEBUG 1:    Header variables:
> DEBUG 1:         SID: STATION IDENTIFICATION
> DEBUG 1:         XOB: LONGITUDE
> DEBUG 1:         YOB: LATITUDE
> DEBUG 1:         DHR: OBSERVATION TIME MINUS CYCLE TIME
> DEBUG 1:         ELV: STATION ELEVATION
> DEBUG 1:         TYP: PREPBUFR REPORT TYPE
> DEBUG 1:         T29: DATA DUMP REPORT TYPE
> DEBUG 1:         ITP: INSTRUMENT TYPE
> DEBUG 1:
> DEBUG 1:    Observation variables:
> DEBUG 1:         QOB: SPECIFIC HUMIDITY 
> OBSERVATION                             types: ADPUPA  X¡ÿÿÿ AIRCAR  
> X¡ÿÿÿ AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND X¡ÿÿÿ ADPSFC  
> X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         TOB: TEMPERATURE 
> OBSERVATION                                   types: ADPUPA X¡ÿÿÿ 
> AIRCAR  X¡ÿÿÿ AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ 
> ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET X¡ÿÿÿ
> DEBUG 1:         ZOB: HEIGHT 
> OBSERVATION                                        types: ADPUPA  
> X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET X¡ÿÿÿ
> DEBUG 1:         UOB: U-COMPONENT WIND 
> OBSERVATION                              types: ADPUPA  X¡ÿÿÿ AIRCAR  
> X¡ÿÿÿ AIRCFT  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ MSONET X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         VOB: V-COMPONENT WIND 
> OBSERVATION                              types: ADPUPA  X¡ÿÿÿ AIRCAR  
> X¡ÿÿÿ AIRCFT  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ MSONET X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:       HBLCS: HEIGHT ABOVE SURFACE OF BASE OF LOWEST CLOUD 
> SEEN         types: ADPUPA  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP X¡ÿÿÿ
> DEBUG 1:        CLTP: CLOUD 
> TYPE                                                types: AIRCFT  
> X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ
> DEBUG 1:        CLAM: CLOUD 
> AMOUNT                                              types: ADPUPA  
> X¡ÿÿÿ AIRCFT  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND X¡ÿÿÿ
> DEBUG 1:         QFC: FORECAST (BACKGROUND) SPECIFIC HUMIDITY 
> VALUE             types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT X¡ÿÿÿ 
> SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ 
> GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         QRC: SPECIFIC HUMIDITY EVENT REASON 
> CODE                       types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT  
> X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC X¡ÿÿÿ SFCSHP  
> X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         QPC: SPECIFIC HUMIDITY EVENT PROGRAM 
> CODE                      types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT  
> X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC X¡ÿÿÿ SFCSHP  
> X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         QQM: SPECIFIC HUMIDITY (QUALITY) 
> MARKER                        types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ 
> AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC X¡ÿÿÿ 
> SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         TFC: FORECAST (BACKGROUND) TEMPERATURE 
> VALUE                   types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT  
> X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP X¡ÿÿÿ GOESND  
> X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         TRC: TEMPERATURE EVENT REASON 
> CODE                             types: ADPUPA  X¡ÿÿÿ AIRCAR X¡ÿÿÿ 
> AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ 
> SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         TPC: TEMPERATURE EVENT PROGRAM 
> CODE                            types: ADPUPA  X¡ÿÿÿ AIRCAR X¡ÿÿÿ 
> AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ 
> SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         TQM: TEMPERATURE (QUALITY) 
> MARKER                              types: ADPUPA  X¡ÿÿÿ AIRCAR X¡ÿÿÿ 
> AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ 
> SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         TDO: DEWPOINT TEMPERATURE OBSERVATION (NOT 
> ASSIMILATED)                types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT  
> X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC X¡ÿÿÿ SFCSHP  
> X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA  X¡ÿÿÿ
> DEBUG 1:         TVO: NON-Q. CONTROLLED VIRTUAL TEMP OBS (NOT 
> ASSIMILATED)              types: ADPUPA  X¡ÿÿÿ AIRCAR  X¡ÿÿÿ AIRCFT  
> X¡ÿÿÿ SATWND  X¡ÿÿÿ PROFLR  X¡ÿÿÿ VADWND  X¡ÿÿÿ ADPSFC X¡ÿÿÿ SFCSHP  
> X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         ZFC: FORECAST (BACKGROUND) HEIGHT 
> VALUE                        types: ADPUPA  X¡ÿÿÿ SATWND  X¡ÿÿÿ 
> ADPSFC  X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ RASSDA X¡ÿÿÿ
> DEBUG 1:         ZRC: HEIGHT EVENT REASON 
> CODE                                  types: ADPUPA  X¡ÿÿÿ AIRCAR  
> X¡ÿÿÿ AIRCFT  X¡ÿÿÿ SATWND  X¡ÿÿÿ ADPSFC  X¡ÿÿÿ SFCSHP X¡ÿÿÿ GOESND  
> X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         ZPC: HEIGHT EVENT PROGRAM 
> CODE                                 types: ADPUPA  X¡ÿÿÿ ADPSFC  
> X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         ZQM: HEIGHT (QUALITY) 
> MARKER                                   types: ADPUPA  X¡ÿÿÿ ADPSFC  
> X¡ÿÿÿ SFCSHP  X¡ÿÿÿ GOESND  X¡ÿÿÿ MSONET  X¡ÿÿÿ
> DEBUG 1:         CAT: PREPBUFR DATA LEVEL CATEGORY
> Segmentation fault
> da_pb2nc.ksh completed at Thu Feb 15 14:02:14 MST 2018
> This run took 1050 seconds.
>
>
>     I think the weird ÿ symbols in the output above are just a minor
>     issue with my Cheyenne environment (my run environment still has a
>     few buggy things about how text is displayed in xterm). The PB2NC run
>     didn't produce any netcdf output file.
>
>     I've attached the config file I'm using for pb2nc
>     (PB2NCConfig_HRRR.DA_task_problematic), as well as the original
>     2016 HRRR-E config file. I modified this to expand the time
>     window, add more message types to be processed, and to update the
>     file to more closely match the entries in the default PB2NC config
>     file for MET v6.1. I've also attached the original config file
>     from the HRRR-E verification activity. I tried running PB2NC with
>     the original HRRR-E configuration file and it still produces a
>     segfault.
>
>     Do you see anything obviously wrong with the problematic config
>     file? Do you have any suggestions on what to do to debug this issue?
>
>     Thanks,
>     Jonathan
>
>
>
>
>
>
>
>
>
>     On 02/14/2018 03:02 PM, Chunhua Zhou wrote:
>>     For the time window, I would suggest +/-1 hour for pb2nc, to be
>>     safe. For point-stat, we can specify a smaller window if needed.
>>     And since these prepbufr files are hourly, some observations
>>     might not be available. Can you add more obs types to include as
>>     many as possible? It might be good to try at different times, for
>>     example, 00z, 01z, 03z, ..., as there might be different obs for
>>     different times.
>>     Somehow I can not use ncview to see observations from the file
>>     you pointed me ...
>>     Thanks!
>>
>>
>>     On Wed, Feb 14, 2018 at 2:48 PM, Jonathan Vigh <jvigh at ucar.edu>
>>     wrote:
>>
>>         Yes, I just copied over those MET scripts and am modifying
>>         them for this task.
>>
>>         Okay, I just did a test run on the first one. It only took
>>         about a minute to run and turned a 49 MB prepbufr file into an
>>         11 MB netcdf file. See:
>>         /glade/scratch/jvigh/DTC/HRRR/RAP_DATA/prefbufr/pb2nc/rap.2016090300.prepbufr.tm00.nc
>>
>>         The config file specifies a +/- 15 minute observation window.
>>         Is that too small?
>>
>>         I'll figure out how to add those additional surface variables.
>>
>>         Jonathan
>>
>>
>>
>>         On 02/14/2018 02:43 PM, Chunhua Zhou wrote:
>>>         Is this config file the one used in HRRR RE? I mean the
>>>         sample scripts I pointed you to?
>>>         Looking at the obs variables, I am thinking maybe we can add
>>>         some surface variables too, like 2mT, 10m wind?
>>>         Thanks!
>>>
>>>         On Wed, Feb 14, 2018 at 2:39 PM, Jonathan Vigh
>>>         <jvigh at ucar.edu> wrote:
>>>
>>>             Hi Chunhua,
>>>
>>>             Okay, thanks. I gathered that MOAD_DATAROOT might just
>>>             be the output directory, but wasn't sure.
>>>
>>>             With regard to the pb2nc config file, does the attached
>>>             file look good? I have not changed anything from the
>>>             config file in the MET examples that you pointed me to.
>>>             I don't know if any changes are needed in the list of
>>>             parameters to be processed, if the time window is
>>>             correct, etc.
>>>
>>>             Jonathan
>>>
>>>
>>>
>>>
>>>             On 02/14/2018 01:48 PM, Chunhua Zhou wrote:
>>>>             For the ndate.exe, you can use the one
>>>>             at /glade/p/ral/jnt/DAtask/chunhua/EnVar/code/Cheyenne/rapid-refresh/HRRR.Guoqing/exec/UPP
>>>>             (I compiled for Cheyenne)
>>>>             Regarding the MOAD_DATAROOT path, what I see from the
>>>>             script is that it points to your working directory, where
>>>>             the output goes -
>>>>
>>>>             # Go to prepbufr dir
>>>>             pb2nc=${MOAD_DATAROOT}/pb2nc
>>>>             ${MKDIR} -p ${pb2nc}
>>>>
>>>>
>>>>             On Wed, Feb 14, 2018 at 1:40 PM, Jonathan Vigh
>>>>             <jvigh at ucar.edu> wrote:
>>>>
>>>>                 Hi Chunhua,
>>>>
>>>>                 I'm working on modifying the MET_pb2nc.ksh example
>>>>                 script that you gave me. In it, there are some
>>>>                 shell script variables that need to be set for
>>>>                 various directories, including UNIPOST_EXEC and
>>>>                 MOAD_DATAROOT.
>>>>                 The UNIPOST_EXEC path is needed to point to the
>>>>                 ndate.exe utility so that it can compute the
>>>>                 forecast hour. I went browsing around in the DAtask
>>>>                 space on glade and found the following UPP
>>>>                 executables:
>>>>
>>>>                 /code/UPPV2.1/src/ndate/ndate.exe
>>>>                 ./code/UPPV2.1/bin/ndate.exe
>>>>                 ./code/UPPV2.0/src/ndate/ndate.exe
>>>>                 ./code/UPPV2.0/bin/ndate.exe
>>>>                 ./code/UPPV2.2/src/ndate/ndate.exe
>>>>                 ./code/UPPV2.2/bin/ndate.exe
>>>>                 ./code/UPPV2.2_nofix/src/ndate/ndate.exe
>>>>                 ./code/UPPV2.2_nofix/bin/ndate.exe
>>>>                 ./chunhua/EnVar/code/Cheyenne/UPPV3.1/bin/ndate.exe
>>>>                 ./chunhua/EnVar/code/Cheyenne/UPPV3.1/src/ndate/ndate.exe
>>>>                 ./chunhua/EnVar/code/Cheyenne/rapid-refresh/HRRR.Guoqing/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/code/Cheyenne/rapid-refresh/HRRR.orig/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small.4denvar/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR/rapid-refresh/UPP_2015/comupp/bin/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR/rapid-refresh/UPP_2015/comupp/src/ndate/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR/narre-HRRRE/yellowstone/retro/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small.3denvar.reg.ens/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small.4denvar.cycl/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small.3denvar.cycl/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/201609_newobs/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.hrrre/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR.orig/rapid-refresh/UPP_2015/comupp/bin/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR.orig/rapid-refresh/UPP_2015/comupp/src/ndate/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/HRRR.orig/narre-HRRRE/yellowstone/retro/exec/UPP/ndate.exe
>>>>                 ./chunhua/EnVar/HRRR/retro.3km.small.3denvar/exec/UPP/ndate.exe
>>>>
>>>>                 Which, if any, of these should I use for this path?
>>>>
>>>>                 I'm not sure I understand the MOAD_DATAROOT path
>>>>                 well enough to know where to point this. The
>>>>                 description in the script is the "Top level of the
>>>>                 WRF output". Does such a directory exist for this
>>>>                 HRRR run, or should I just modify the script to not
>>>>                 need this? My guess is that this is simply where
>>>>                 the output from the pb2nc run will go. If we're not
>>>>                 sticking this back into the original WRF directory
>>>>                 where the data were run, then I'll just put it in a
>>>>                 separate directory on scratch. Does this sound good?
>>>>
>>>>                 Thanks,
>>>>                    Jonathan
>>>>
>>>>
>>>>
>>>>
>>>>                 On 02/14/2018 10:38 AM, Chunhua Zhou wrote:
>>>>>                 I would suggest
>>>>>                 using 20162470000.rap.t00z.prepbufr.tm00.20160903
>>>>>                 unless it is not available.
>>>>>                 Thanks!
>>>>>
>>>>>                 On Wed, Feb 14, 2018 at 10:36 AM, Jonathan Vigh
>>>>>                 <jvigh at ucar.edu> wrote:
>>>>>
>>>>>                     Hi Chunhua,
>>>>>
>>>>>                     It looks like there are three different types
>>>>>                     of files in that directory (or from different
>>>>>                     model runs?). Should I process all of these?
>>>>>                     Or just some of them?
>>>>>
>>>>>                     20162470000.rap_e.t00z.prepbufr.tm00.20160903
>>>>>                     20162470000.rap.t00z.prepbufr.tm00.20160903
>>>>>                     20162470300.rap_p.t03z.prepbufr.tm00.20160903
>>>>>
>>>>>                     Thanks,
>>>>>                       Jonathan
>>>>>
>>>>>
>>>>>                     On 02/13/2018 09:28 PM, Chunhua Zhou wrote:
>>>>>>                     The observation data we will be using is
>>>>>>                     hourly real-time prepbufr data from RAP
>>>>>>                     (/glade2/scratch2/chunhua/HRRR/RAP_data/prepbufr).
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: Howard Soh
Time: Thu Feb 15 16:12:08 2018

This is a comment.

There is no NetCDF output with "-index", and it ignores many
configuration settings. The output for "-index" is the list of
variable names printed to the screen. Finding the variables that
contain any valid data is a very time-consuming process. Usually the
PrepBUFR file does not contain all of the variables in the BUFR
table, so pb2nc ends up checking every message.

-index is expected to be run once, to identify the list of variables
to put into the configuration file.

I cannot reproduce the segfault on dakota. There are more variables
after the CAT variable.
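
A rough sketch of the two-step use of "-index" described above, using
only the pb2nc options that already appear in this ticket (-v, -index,
-log). The file names and the RUN switch are placeholders made up for
illustration and are not part of MET; by default the script only prints
the commands it would run.

```shell
#!/bin/sh
# Hypothetical placeholders: point these at your own PrepBUFR input,
# NetCDF output, and PB2NC config file.
PB2NC=${PB2NC:-pb2nc}
PB_FILE=${PB_FILE:-input.prepbufr}
NC_FILE=${NC_FILE:-output.nc}
CONFIG=${CONFIG:-PB2NCConfig_example}

# Print the command; execute it only when RUN=1 and the executable exists.
run_or_print() {
  echo "CALLING: $*"
  if [ "${RUN:-0}" = "1" ] && command -v "$1" >/dev/null 2>&1; then
    "$@"
  fi
}

# Step 1 (run once): list the BUFR variable names on the screen.
# No NetCDF file is written in this mode.
run_or_print "$PB2NC" "$PB_FILE" "$NC_FILE" "$CONFIG" -v 2 -index -log pb2nc_index.log

# Step 2: after adding the variables of interest to the config file,
# rerun without -index to produce the NetCDF output.
run_or_print "$PB2NC" "$PB_FILE" "$NC_FILE" "$CONFIG" -v 2 -log pb2nc.log
```

Setting RUN=1 in the environment runs both steps for real. Keeping a
separate log file per step makes it easier to see which step a failure
came from.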







------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Tue Feb 20 17:06:40 2018

Jonathan,

I see that Howard did take a look at this last week, but he added an
internal comment rather than sending a response to the requester...
i.e. you.  Here's the comment he added:

=============================================================================

There is no NetCDF output with "-index", and it ignores many
configuration settings. The output for "-index" is the list of
variable names printed to the screen. Finding the variables that have
any valid data is a very time-consuming process. Usually the PrepBufr
file does not contain all of the variables in the BUFR table, so it
ends up checking every message.

-index is expected to be run once to identify the list of variables to
put into the configuration file.

I cannot reproduce the segfault on dakota. There are more variables
after the CAT variable.

=============================================================================

I just logged on to cheyenne and ran the command you sent to me:

=============================================================================

time /glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin/pb2nc \
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903 \
rap.2016090300.prepbufr.tm00.nc \
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task \
-v 2 -index -log pb2nc.log

=============================================================================

After running for 17 minutes, I was able to replicate the behavior
you're seeing, including all the odd characters in the whitespace
between the message type names.  When I rerun without the -index
option, pb2nc aborts with the following error message:

terminate called after throwing an instance of 'netCDF::exceptions::NcHdfErr'
  what():  NetCDF: HDF error
file: ncCheck.cpp  line:92
Abort (core dumped)

When I run this test on my local machine using
/usr/local/met-6.1/bin/pb2nc, it runs fine without any errors.

Howard and I will need to investigate this more on cheyenne.

Thanks,
John


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #84094] Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: jvigh
Time: Tue Feb 20 17:25:58 2018

Hi John,

Thanks. If Howard had sent his comment along to me, that could have
saved me a lot of time last week (depending on which day he wrote it).

I'm glad to be back up and running now with MET. I'll press onward.
Good luck in finding the bug.

Thanks,
   Jonathan





------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Tue Feb 20 17:31:01 2018

So does that script actually run successfully with the "-index" option
removed?

What happens if you cut-and-paste these commands on the command line:

module purge
module use /glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles
module load met/6.1
pb2nc \
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903 \
rap.2016090300.prepbufr.tm00.nc \
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task \
-v 2 -log pb2nc.log

After 2 or 3 minutes, I still get the following core dump message:
terminate called after throwing an instance of 'netCDF::exceptions::NcHdfErr'
  what():  NetCDF: HDF error
file: ncCheck.cpp  line:92
Abort (core dumped)

Thanks,
John

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #84094] Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: jvigh
Time: Tue Feb 20 17:46:10 2018

Hi John,

I don't get a 'terminated' message, but I do get an error:

jvigh at r8i4n1:/glade/u/home/jvigh> module purge
jvigh at r8i4n1:/glade/u/home/jvigh> module use
/glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles
jvigh at r8i4n1:/glade/u/home/jvigh> module load met/6.1
jvigh at r8i4n1:/glade/u/home/jvigh> pb2nc
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
rap.2016090300.prepbufr.tm00.nc
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
-v 2 -log pb2nc.log
DEBUG 1: Default Config File:
/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/share/met/config/PB2NCConfig_default
DEBUG 1: User Config File:
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
DEBUG 1: Creating NetCDF File:   rap.2016090300.prepbufr.tm00.nc
DEBUG 1:
DEBUG 1: Pre-processing Bufr File for metadata (BUFR variable names) from
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
DEBUG 1:
DEBUG 1: Processing Bufr File:
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
DEBUG 1: Blocking Bufr file to: /tmp/tmp_pb2nc_blk_43159_0
DEBUG 2: PrepBufr Time Center:   20160903_000000
DEBUG 2: Searching Time Window:  20160902_234500 to 20160903_001500
DEBUG 2: Processing 606306 PrepBufr messages...
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90%
95% 100%
DEBUG 2: Total PrepBufr Messages processed = 606306
DEBUG 2: Rejected based on message type  = 583017
DEBUG 2: Rejected based on station id  = 0
DEBUG 2: Rejected based on valid time  = 8463
DEBUG 2: Rejected based on masking grid  = 0
DEBUG 2: Rejected based on masking polygon = 0
DEBUG 2: Rejected based on elevation   = 0
DEBUG 2: Rejected based on pb report type  = 0
DEBUG 2: Rejected based on input report type = 0
DEBUG 2: Rejected based on instrument type = 0
DEBUG 2: Rejected based on zero observations = 691
DEBUG 2: Total PrepBufr Messages retained  = 14135
DEBUG 2: Total observations retained or derived = 73646
ERROR  :
ERROR  : remove_temp_file() -> can't delete temporary file:
"/tmp/tmp_pb2nc_blk_43159_0" (Unknown error -1)
ERROR  :


The output file contents do not look complete:
netcdf rap.2016090300.prepbufr.tm00 {
dimensions:
   mxstr = 16 ;
   mxstr2 = 40 ;
   hdr_arr_len = 3 ;
   obs_arr_len = 5 ;
   nobs = UNLIMITED ; // (73646 currently)
variables:
   char obs_qty(nobs, mxstr) ;
    obs_qty:long_name = "quality flag" ;
   float obs_arr(nobs, obs_arr_len) ;
    obs_arr:long_name = "array of observation values" ;
    obs_arr:missing_value = -9999.f ;
    obs_arr:_FillValue = -9999.f ;
    obs_arr:hdr_id_long_name = "index of matching header data" ;
    obs_arr:columns = "hdr_id var_id lvl hgt ob" ;
    obs_arr:var_id_long_name = "index of BUFR variable corresponding to the observation type" ;
    obs_arr:lvl_long_name = "pressure level (hPa) or accumulation interval (sec)" ;
    obs_arr:hgt_long_name = "height in meters above sea level (msl)" ;
    obs_arr:ob_long_name = "observation value" ;

// global attributes:
    :use_var_id = "true" ;
    :FileOrigins = "File rap.2016090300.prepbufr.tm00.nc generated 20180221_004145 UTC on host r8i4n1 by the MET pb2nc tool" ;
    :MET_version = "V6.1" ;
    :MET_tool = "pb2nc" ;
}

So should I continue to use the pb2nc found in
/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin rather than the
one provided by the module?

Thanks,
Jonathan
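
A minimal sketch of how the "does not look complete" judgment above could be made repeatable: it reads an "ncdump -h" header on stdin and lists any expected variables that are not declared. The function name missing_vars is hypothetical, and the variable names in the usage comment (hdr_typ, hdr_sid, hdr_vld, hdr_arr) are assumptions about the MET 6.1 point-observation format, not something stated in this thread; adjust them to the MET version in use.

```shell
#!/bin/sh
# Hypothetical check: read an "ncdump -h" header on stdin and report which of
# the variable names given as arguments are not declared in it.
missing_vars() {
    header=$(cat)
    status=0
    for var in "$@"; do
        # A CDL declaration looks like "float obs_arr(nobs, obs_arr_len) ;".
        if ! printf '%s\n' "$header" | grep -q -E "^[[:space:]]*[a-z0-9]+ ${var}\("; then
            echo "missing: $var"
            status=1
        fi
    done
    return $status
}

# Usage (the variable names are assumptions, see the note above):
#   ncdump -h rap.2016090300.prepbufr.tm00.nc | \
#       missing_vars hdr_typ hdr_sid hdr_vld hdr_arr obs_qty obs_arr
```

For the header pasted above, only obs_qty and obs_arr would be found, so any header variables passed in would be reported as missing.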




------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Tue Feb 20 18:07:12 2018

Jonathan,

Well now that's the confusing part.  That path
"/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin" *IS* the one
that's loaded by the module file!

Here's the module file that updates the PATH and LD_LIBRARY_PATH
variables:
   cat /glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles/met/6.1

So when you run that executable directly *WITHOUT* loading the MET
module, it runs.

But when I run with the MET module loaded I get a runtime error.

Julie Prestopnik recompiled MET on Feb 16th to apply a recent set of
patches.  Perhaps when she recompiled, there was a problem in her
environment?

I'm testing that theory out now.

John
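
One way to compare the two cases described above (the executable run directly versus with the module loaded) is to print which pb2nc the shell resolves and which NetCDF/HDF5 shared libraries it picks up. The following is only a sketch: the helper name show_tool is hypothetical, and it assumes a Linux system where ldd is available.

```shell
#!/bin/sh
# Hypothetical helper: print the path the shell would run for command $1,
# followed by any NetCDF/HDF5 shared libraries it resolves to.
show_tool() {
    tool=$1
    path=$(command -v "$tool") || { echo "$tool: not found on PATH" >&2; return 1; }
    echo "executable: $path"
    # ldd is Linux-specific; skip the library listing when it is unavailable.
    if command -v ldd >/dev/null 2>&1; then
        ldd "$path" 2>/dev/null | grep -i -E 'netcdf|hdf5' \
            || echo "no NetCDF/HDF5 libraries listed"
    fi
}

# Default to pb2nc; pass another command name as the first argument if needed.
show_tool "${1:-pb2nc}" || true
```

Running it once after "module purge" and once after "module load met/6.1" would show whether PATH or LD_LIBRARY_PATH changes what actually gets executed and linked.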



------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Tue Feb 20 18:36:05 2018

Same behavior when I recompiled met-6.1.  I'm now recompiling MET and
all the dependent libraries with intel/17.0.1 which is (now) the
default intel compiler on cheyenne.

John


------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Tue Feb 20 20:05:50 2018

Jonathan,

OK, I recompiled everything using intel/17.0.1.

But since then, I realized that all this nonsense with pb2nc aborting
was just because I was running my test from my home directory, which
likely had limited space!

When I write to a project directory with plenty of space, it runs
fine! Ugh.

Please let me know if you still have issues or run into new ones.

Thanks,
John
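
Since the abort came down to limited space in the output directory, a pre-flight check along these lines could be run before pb2nc. This is a sketch under assumptions: the helper name has_free_space, the OUT_DIR and NEED_KB variables, and the 1 GiB threshold are all hypothetical. On systems that enforce per-user quotas, df may not reflect the quota, so a passing check is not a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the filesystem holding directory $1
# has at least $2 kilobytes available.
has_free_space() {
    dir=$1
    min_kb=$2
    # df -P gives POSIX single-line records; column 4 is available 1K blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

out_dir=${OUT_DIR:-.}          # assumed output directory
need_kb=${NEED_KB:-1048576}    # assumed threshold: 1 GiB

if has_free_space "$out_dir" "$need_kb"; then
    echo "OK: enough space in $out_dir"
else
    echo "WARNING: less than ${need_kb} KB free in $out_dir" >&2
fi
```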


------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: Chunhua Zhou
Time: Tue Feb 20 22:11:33 2018

Just want to mention that this email chain has been going to a
different Ming (should be ming.hu at noaa.gov instead of
mingge at ucar.edu) ... She must be really confused ...
Thanks!

On Tue, Feb 20, 2018 at 8:05 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Jonathan,
>
> OK, I recompiled everything using intel/17.0.1.
>
> But since then, I realized that all this nonsense with pb2nc
aborting was
> just because I was running my test from my home directory, which
likely had
> limited space!
>
> When I write to a project directory with plenty of space, it runs
fine!
> Ugh.
>
> Please let me know if you still have issues or run into new ones.
>
> Thanks,
> John
>
> On Tue, Feb 20, 2018 at 6:35 PM, John Halley Gotway
<johnhg at ucar.edu>
> wrote:
>
> > Same behavior when I recompiled met-6.1.  I'm now recompiling MET
and all
> > the dependent libraries with intel/17.0.1 which is (now) the
default
> intel
> > compiler on cheyenne.
> >
> > John
> >
> > On Tue, Feb 20, 2018 at 6:06 PM, John Halley Gotway
<johnhg at ucar.edu>
> > wrote:
> >
> >> Jonathan,
> >>
> >> Well now that's the confusing part.  That path
> >> "/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin" *IS* the
one
> >> that's loaded by the module file!
> >>
> >> Here's the module file that updates the PATH and LD_LIBRARY_PATH
> >> variables:
> >>    cat
/glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles/met/6.1
> >>
> >> So when you run that executable directly *WITHOUT* loading the
MET
> >> module, it runs.
> >>
> >> But when I run with the MET module loaded I get a runtime error.
> >>
> >> Julie Prestopnik recompiled MET on Feb 16th to apply a recent set
of
> >> patches.  Perhaps when she recompiled, there was a problem in her
> >> environment?
> >>
> >> I'm testing that theory out now.
> >>
> >> John
> >>
> >>
> >> On Tue, Feb 20, 2018 at 5:46 PM, jvigh via RT <met_help at ucar.edu>
> wrote:
> >>
> >>>
> >>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84094 >
> >>>
> >>> Hi John,
> >>>
> >>> I don't get a 'terminated' message, but I do get an error:
> >>>
> >>> jvigh at r8i4n1:/glade/u/home/jvigh> module purge
> >>> met/6.1jvigh at r8i4n1:/glade/u/home/jvigh> module use
> >>> /glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles
> >>> jvigh at r8i4n1:/glade/u/home/jvigh> module load met/6.1
> >>> jvigh at r8i4n1:/glade/u/home/jvigh> pb2nc
> >>> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
> >>> rap.2016090300.prepbufr.tm00.nc
> >>> /glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
> >>> -v 2 -log pb2nc.log
> >>> DEBUG 1: Default Config File:
> >>> /glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/share/met/config/PB2NCConfig_default
> >>> DEBUG 1: User Config File:
> >>> /glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
> >>> DEBUG 1: Creating NetCDF File:   rap.2016090300.prepbufr.tm00.nc
> >>> DEBUG 1:
> >>> DEBUG 1: Pre-processing Bufr File for metadata (BUFR variable names) from
> >>> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
> >>> DEBUG 1:
> >>> DEBUG 1: Processing Bufr File:
> >>> /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
> >>> DEBUG 1: Blocking Bufr file to: /tmp/tmp_pb2nc_blk_43159_0
> >>> DEBUG 2: PrepBufr Time Center:   20160903_000000
> >>> DEBUG 2: Searching Time Window:  20160902_234500 to 20160903_001500
> >>> DEBUG 2: Processing 606306 PrepBufr messages...
> >>> 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% 100%
> >>> DEBUG 2: Total PrepBufr Messages processed = 606306
> >>> DEBUG 2: Rejected based on message type  = 583017
> >>> DEBUG 2: Rejected based on station id  = 0
> >>> DEBUG 2: Rejected based on valid time  = 8463
> >>> DEBUG 2: Rejected based on masking grid  = 0
> >>> DEBUG 2: Rejected based on masking polygon = 0
> >>> DEBUG 2: Rejected based on elevation   = 0
> >>> DEBUG 2: Rejected based on pb report type  = 0
> >>> DEBUG 2: Rejected based on input report type = 0
> >>> DEBUG 2: Rejected based on instrument type = 0
> >>> DEBUG 2: Rejected based on zero observations = 691
> >>> DEBUG 2: Total PrepBufr Messages retained  = 14135
> >>> DEBUG 2: Total observations retained or derived = 73646
> >>> ERROR  :
> >>> ERROR  : remove_temp_file() -> can't delete temporary file: "/tmp/tmp_pb2nc_blk_43159_0" (Unknown error -1)
> >>> ERROR  :
> >>>
> >>>
> >>> The output file contents do not look complete:
> >>> netcdf rap.2016090300.prepbufr.tm00 {
> >>> dimensions:
> >>>    mxstr = 16 ;
> >>>    mxstr2 = 40 ;
> >>>    hdr_arr_len = 3 ;
> >>>    obs_arr_len = 5 ;
> >>>    nobs = UNLIMITED ; // (73646 currently)
> >>> variables:
> >>>    char obs_qty(nobs, mxstr) ;
> >>>     obs_qty:long_name = "quality flag" ;
> >>>    float obs_arr(nobs, obs_arr_len) ;
> >>>     obs_arr:long_name = "array of observation values" ;
> >>>     obs_arr:missing_value = -9999.f ;
> >>>     obs_arr:_FillValue = -9999.f ;
> >>>     obs_arr:hdr_id_long_name = "index of matching header data" ;
> >>>     obs_arr:columns = "hdr_id var_id lvl hgt ob" ;
> >>>     obs_arr:var_id_long_name = "index of BUFR variable corresponding to the observation type" ;
> >>>     obs_arr:lvl_long_name = "pressure level (hPa) or accumulation interval (sec)" ;
> >>>     obs_arr:hgt_long_name = "height in meters above sea level (msl)" ;
> >>>     obs_arr:ob_long_name = "observation value" ;
> >>>
> >>> // global attributes:
> >>>     :use_var_id = "true" ;
> >>>     :FileOrigins = "File rap.2016090300.prepbufr.tm00.nc generated 20180221_004145 UTC on host r8i4n1 by the MET pb2nc tool" ;
> >>>     :MET_version = "V6.1" ;
> >>>     :MET_tool = "pb2nc" ;
> >>> }
> >>>
> >>> So should I continue to use the pb2nc found in
> >>> /glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin rather than the
> >>> one provided by the module?
> >>>
> >>> Thanks,
> >>> Jonathan
> >>>
> >>>
> >>> On 02/20/2018 05:31 PM, John Halley Gotway via RT wrote:
> >>> > So does that script actually run successfully with the "-index" option
> >>> > removed?
> >>> >
> >>> > What happens if you cut-and-paste these commands on the command line:
> >>> >
> >>> > module purge
> >>> > module use /glade/p/ral/jnt/MET/MET_releases/cheyenne/modulefiles
> >>> > module load met/6.1
> >>> > pb2nc
> >>> > /glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
> >>> > rap.2016090300.prepbufr.tm00.nc
> >>> > /glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
> >>> > -v 2 -log pb2nc.log
> >>> >
> >>> > After 2 or 3 minutes, I still get the following core dump message:
> >>> > terminate called after throwing an instance of
> >>> > 'netCDF::exceptions::NcHdfErr'
> >>> >    what():  NetCDF: HDF error
> >>> > file: ncCheck.cpp  line:92
> >>> > Abort (core dumped)
> >>> >
> >>> > Thanks,
> >>> > John
> >>> >
> >>>
> >>>
> >>>
> >>
> >
>
>

------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: jvigh
Time: Tue Feb 20 23:12:07 2018

Hi Ming,
   Sorry for all the extraneous e-mails! These were meant for the other
Ming!
Best,
  Jonathan

Jonathan Vigh
Project Scientist I, Joint Numerical Testbed
Research Applications Laboratory (RAL)
National Center for Atmospheric Research (NCAR)
P.O. Box 3000                    tel: +1 (303) 497-8205
Boulder, CO 80307-3000           fax: +1 (303) 497-8171
http://www.ral.ucar.edu/staff/jvigh/
http://www.ral.ucar.edu/hurricanes/


On Tue, Feb 20, 2018 at 10:11 PM, Chunhua Zhou via RT <met_help at ucar.edu> wrote:

> Just want to mention that this email chain has been going to a different
> Ming (should be ming.hu at noaa.gov instead of mingge at ucar.edu) ... She must
> be really confused ...
> Thanks!
>
> On Tue, Feb 20, 2018 at 8:05 PM, John Halley Gotway via RT <met_help at ucar.edu> wrote:
>
> > Jonathan,
> >
> > OK, I recompiled everything using intel/17.0.1.
> >
> > But since then, I realized that all this nonsense with pb2nc aborting was
> > just because I was running my test from my home directory, which likely had
> > limited space!
> >
> > When I write to a project directory with plenty of space, it runs fine!
> > Ugh.
> >
> > Please let me know if you still have issues or run into new ones.
> >
> > Thanks,
> > John

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #84094] Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: jvigh
Time: Wed Feb 21 17:54:27 2018

Hi John,

Here's an update on my further testing this afternoon.

Recompiling MET for intel/17.0.1 and then fixing the associated module
definition file seems to have been very helpful.

bash vs. Korn does not seem to make a difference, apart from Korn
complaining about not being able to load .profile when it runs.

Everything seems to work beautifully now when running on an interactive
login, EVEN the original met_pb2nc.ksh script from Michelle/Jamie (with
modifications for the path, data, etc.). Notably, that script was even
calling a different MET executable.

So in summary, MET is now working pretty well in my default environment
(bash shell).
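
Since different scripts were calling different MET executables, it can help to print which pb2nc a given environment actually resolves to before each run. A minimal sketch, assuming a POSIX shell; the function name report_exe is made up for illustration and is not part of MET:

```shell
# Report which executable a command name resolves to in the current
# environment. report_exe is a hypothetical helper, not a MET tool.
report_exe() {
    exe_name=$1
    exe_path=$(command -v "$exe_name") || {
        echo "$exe_name: not found in PATH" >&2
        return 1
    }
    echo "$exe_name -> $exe_path"
}

# Example use (run once before and once after "module load met/6.1"):
#   report_exe pb2nc
```

Comparing the output on the login node and on the batch node would show whether both are picking up the same build.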

HOWEVER, I can reproduce some of the issues when running from a batch node:

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  jvigh at r8i4n0:~/DTC/DA/verification/verify_source/test> ./test10.bash


/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin/pb2nc


FCST_TIME=00
valid time for  00 h forecast =  2016090300
CALLING: /glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/bin/pb2nc
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/pb2nc/rap.2016090300.prepbufr.tm00.nc
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
-v 2
DEBUG 1: Default Config File:
/glade/p/ral/jnt/MET/MET_releases/cheyenne/met-6.1/share/met/config/PB2NCConfig_default
DEBUG 1: User Config File:
/glade/u/home/jvigh/DTC/DA/verification/verify_source/MET_config/PB2NCConfig_HRRR.DA_task
DEBUG 1: Creating NetCDF File:
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/pb2nc/rap.2016090300.prepbufr.tm00.nc
DEBUG 1:
DEBUG 1: Pre-processing Bufr File for metadata (BUFR variable names) from
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
DEBUG 1:
DEBUG 1: Processing Bufr File:
/glade/scratch/jvigh/DTC/DA/HRRR/RAP_DATA/prepbufr/20162470000.rap.t00z.prepbufr.tm00.20160903
DEBUG 1: Blocking Bufr file to: /tmp/tmp_pb2nc_blk_51927_0
DEBUG 2: PrepBufr Time Center:          20160903_000000
DEBUG 2: Searching Time Window:         20160902_234500 to 20160903_001500
DEBUG 2: Processing 606306 PrepBufr messages...
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% 100%
DEBUG 2: Total PrepBufr Messages processed      = 606306
DEBUG 2: Rejected based on message type         = 583017
DEBUG 2: Rejected based on station id           = 0
DEBUG 2: Rejected based on valid time           = 8463
DEBUG 2: Rejected based on masking grid         = 0
DEBUG 2: Rejected based on masking polygon      = 0
DEBUG 2: Rejected based on elevation            = 0
DEBUG 2: Rejected based on pb report type       = 0
DEBUG 2: Rejected based on input report type    = 0
DEBUG 2: Rejected based on instrument type      = 0
DEBUG 2: Rejected based on zero observations    = 691
DEBUG 2: Total PrepBufr Messages retained       = 14135
DEBUG 2: Total observations retained or derived = 73646
ERROR  :
ERROR  : remove_temp_file() -> can't delete temporary file: "/tmp/tmp_pb2nc_blk_51927_0" (Unknown error -1)
ERROR  :
da_pb2nc.ksh completed at Wed Feb 21 17:46:52 MST 2018
This run took 78 seconds.
 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

So apparently, the executable cannot remove the temporary file when it
is run in a batch node environment. This error happens whether I use the
-l inception=login environment or not when I invoke the batch node.

Any thoughts on what to try? Now that I can reliably reproduce this, you
should be able to as well.
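
One way to narrow this down is to test, on the batch node itself, whether an ordinary file can be created and then deleted in /tmp. The following is only a sketch under that assumption; check_tmp_dir and the tmp_probe_ file prefix are hypothetical names, and this does not reproduce what pb2nc does internally:

```shell
# Try to create and then delete a scratch file in a directory, roughly
# mimicking a tool that writes and later removes a temporary file.
# check_tmp_dir is a hypothetical helper; the default directory is /tmp.
check_tmp_dir() {
    dir=${1:-/tmp}
    if [ ! -d "$dir" ]; then
        echo "FAIL: $dir is not a directory" >&2
        return 1
    fi
    # Show free space too, since a full filesystem can also break temp files.
    df -P "$dir" >&2
    probe=$(mktemp "$dir/tmp_probe_XXXXXX") || {
        echo "FAIL: cannot create a file in $dir" >&2
        return 1
    }
    if rm -f "$probe" && [ ! -e "$probe" ]; then
        echo "OK: created and removed $probe"
    else
        echo "FAIL: cannot remove $probe" >&2
        return 1
    fi
}

# Example use, once on a login node and once inside the qsub -I session:
#   check_tmp_dir /tmp
```

If the check passes on the login node but fails on the batch node, that would point at the node's /tmp rather than at the MET build.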

Here's how I'm invoking the batch node (without the login inception):

jvigh at r8i4n0:~> cat rs_cheyenne_60min_share.bash
#!/bin/bash
# -l inception=login                    # This directive uses the login environment on the exclusive use batch nodes
# -l select=1:ncpus=1:mpiprocs=1        # This directive specifies the number of nodes, cpus, and processes on each node
# -l walltime=01:00:00                  # This directive specifies the wall clock limit in hh:mm:ss
# -q share                              # This directive specifies the queue to submit the job to
# -A $PROJ                              # This directive specifies which project account to charge to
qsub -I -l select=1:ncpus=1:mpiprocs=1 -l walltime=00:60:00 -q share -A $PROJ $SHELL


With login inception:

#!/bin/bash
# -l inception=login                    # This directive uses the login environment on the exclusive use batch nodes
# -l select=1:ncpus=1:mpiprocs=1        # This directive specifies the number of nodes, cpus, and processes on each node
# -l walltime=01:00:00                  # This directive specifies the wall clock limit in hh:mm:ss
# -q share                              # This directive specifies the queue to submit the job to
# -A $PROJ                              # This directive specifies which project account to charge to
qsub -I -l inception=login -l select=1:ncpus=1:mpiprocs=1 -l walltime=00:60:00 -q share -A $PROJ $SHELL


Jonathan





------------------------------------------------
Subject: Re: segmentation fault when running PB2NC on some prepbufr files from the RAP
From: John Halley Gotway
Time: Wed Feb 21 21:48:00 2018

Jonathan,

In the PB2NC config file, you could try setting tmp_dir to some directory
other than /tmp.  See if the behavior is any different.

Perhaps /tmp works differently in some way on the batch nodes than on the
login nodes?
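
A rough sketch of how that change could be scripted, assuming the config uses the usual 'name = value;' syntax for the tmp_dir entry. The function name set_tmp_dir and the example scratch path are made up for illustration; editing the config file by hand works just as well:

```shell
# Point the tmp_dir entry of a MET-style config file at a new directory.
# set_tmp_dir is a hypothetical helper. It assumes entries look like
#   tmp_dir = "/tmp";
# and appends such an entry if the file does not already contain one.
set_tmp_dir() {
    config_file=$1
    new_tmp_dir=$2
    mkdir -p "$new_tmp_dir" || return 1
    if grep -q '^[[:space:]]*tmp_dir[[:space:]]*=' "$config_file"; then
        # Rewrite the existing entry in a copy, then move it into place.
        sed "s|^[[:space:]]*tmp_dir[[:space:]]*=.*|tmp_dir = \"$new_tmp_dir\";|" \
            "$config_file" > "$config_file.new" &&
            mv "$config_file.new" "$config_file"
    else
        printf 'tmp_dir = "%s";\n' "$new_tmp_dir" >> "$config_file"
    fi
}

# Example use (the scratch path below is an assumption, adjust as needed):
#   set_tmp_dir PB2NCConfig_HRRR.DA_task "/glade/scratch/$USER/met_tmp"
```

Working on a copy of the config file first keeps the original available for comparison if the behavior does not change.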

John


------------------------------------------------

