[ncl-talk] Issues with ESMF_regrid

Mary Haley haley at ucar.edu
Thu Oct 26 10:05:55 MDT 2017


Laura,

I'm honestly not sure why we are seeing the issue with the Intel version of
ESMF.  I am waiting for them to release a version 7.0.1 of ESMF, as they
said this would have some fixes that might help with the Intel compiler.

My guess is that the Intel compiler may be trying to optimize things under
the hood, and this can cause problems. Sometimes we compile things with
optimization turned off to guard against these problems.

I'll get in touch with the ESMF group to see if they have some ideas about
what's going on.

--Mary


On Wed, Oct 25, 2017 at 3:05 PM, Laura Fowler <laura at ucar.edu> wrote:

> Hi Mary:
>
> Thanks for replying to my e-mail so quickly. I tried gnu and, as you
> wrote, I did not have any issues running ESMF_regrid with my "old" and
> "new" meshes. I would not have thought of switching compilers on my
> own.
>
> Do you know if I will need to use gnu when using
> ESMF_regrid_with_weights? I guess that I can try intel and switch to
> gnu if it does not work?
>
> Thanks again to you and Rick. You have been very helpful.
>
> Laura
>
>
> On Wed, Oct 25, 2017 at 12:33 PM, Mary Haley <haley at ucar.edu> wrote:
> > Hi Laura (and Rick),
> >
> > I think there are issues with the Intel version of ESMF_RegridWeightGen,
> > which generates the weights file. We've had other issues trying to build
> > this application using the Intel compiler, and we've been in touch with
> > the ESMF group about it.
> >
> > Meanwhile, things seem to work if I use the GNU version of NCL on
> > cheyenne:
> >
> > module load gnu
> > module load ncl
> > which ncl
> > /glade/u/apps/ch/opt/ncl/6.4.0/gnu/6.2.0/bin/ncl
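When switching between builds, it can help to confirm which compiler the `ncl` on the PATH was built with. The sketch below is a hypothetical helper (not part of NCL or of the cheyenne module system). It assumes the install layout seen in the path above, `.../ncl/<NCL version>/<compiler>/<compiler version>/bin/ncl`; that layout is inferred from this single path and may not hold on other systems.

```python
import shutil
from pathlib import PurePosixPath


def describe_ncl_build(ncl_path):
    """Split an ncl path into (ncl_version, compiler, compiler_version).

    Assumes the layout .../ncl/<ncl version>/<compiler>/<compiler version>/bin/ncl,
    inferred from the single path shown above. Returns None if it does not match.
    """
    parts = PurePosixPath(ncl_path).parts
    # Find an "ncl" directory followed by exactly three levels and then "bin/ncl".
    for i, part in enumerate(parts):
        if (part == "ncl"
                and len(parts) == i + 6
                and parts[i + 4:i + 6] == ("bin", "ncl")):
            return parts[i + 1], parts[i + 2], parts[i + 3]
    return None


if __name__ == "__main__":
    found = shutil.which("ncl")  # same lookup as `which ncl`
    if found is None:
        print("ncl is not on PATH")
    else:
        print(found, "->", describe_ncl_build(found))
```

For the path shown above this yields the NCL version 6.4.0 and the GNU 6.2.0 compiler, which is a quick way to tell the GNU and Intel builds apart before running a regridding script.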
> >
> > I just ran your script, and it completed with the following output:
> >
> > (0) ESMF_regrid_with_weights: retrieving interpolation weights ...
> > (0) ESMF_regrid_with_weights: calling sparse_matrix_mult to apply weights...
> > (0) ESMF_regrid_with_weights: dstData
> > (0)                          Dimensions: 1798 3598
> > (0)                          minSrcData: 1.5258789063e-05
> > (0)                          maxSrcData: 0.999069673989844
> > (0)                          minDstData: 1.5258789063e-05
> > (0)                          maxDstData: 0.9990587636447192
> >
> > Variable: mpas_regrid
> > Type: double
> > Total Size: 51753632 bytes
> > . . .
> > Number of Dimensions: 2
> > Dimensions and sizes: [lat | 1798] x [lon | 3598]
> > Coordinates:
> >             lat: [-89.875..89.875]
> >             lon: [0.125..359.875]
> >
> > Number Of Attributes: 3
> >  missing_value : 9.969209968386869e+36
> >  remap : remapped via ESMF_regrid_with_weights: Bilinear
> >  _FillValue : 9.969209968386869e+36
> >
> >
> > --Mary
> >
> > On Tue, Oct 24, 2017 at 5:14 PM, Laura Fowler <laura at ucar.edu> wrote:
> >>
> >> Hi Rick:
> >>
> >> Thank you for looking into this. The newer mesh I am trying to regrid
> >> has 6488066 cells while the mesh I regridded a couple of years ago had
> >> more cells (6848514). So I did not think that it was the size of the
> >> mesh that was the culprit.
> >>
> >> Here is what I found:
> >>
> >> 1. On yellowstone, using the largest mesh (6848514 cells), I tested
> >> the regrid script I used a couple of years ago with ncl/6.3.0 and
> >> ncl/6.4.0. I did not have any issues. You can look at the source,
> >> destination, and weight files in the directories
> >> /glade2/scratch2/laura/ncl/yellowstone.ncl-6.3.0 and
> >> /glade2/scratch2/laura/ncl/yellowstone.ncl-6.4.0. The ncl script is
> >> regrid.to_CMORPHdata.ncl.
> >>
> >>
> >> 2. Then, I tested the same script but on cheyenne and I got the same
> >> SIGSEGV using ncl/6.4.0 or ncl/6.3.0. So it seems that it may have to
> >> do with ncl on cheyenne only.
> >>
> >>
> >> 3. Unfortunately, I cannot use yellowstone to regrid my newest mesh
> >> since it uses the cdf5 format, which I cannot read on yellowstone.
> >>
> >>
> >> Hope that this helps to resolve this issue.
> >> Laura
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Oct 24, 2017 at 4:19 PM, Rick Brownrigg <brownrig at ucar.edu>
> >> wrote:
> >> > Just to follow up on this, the message regarding max-value-size is
> >> > related to the debugger, so fixed limits within the ESMF software are
> >> > likely not the issue.
> >> >
> >> > Nonetheless, the SEGV is occurring in the ESMF software and appears to
> >> > be happening in the NetCDF library, function
> >> > netcdf_expanded.f90::nf90_get_var_2d_fourbyteint(). For anyone else
> >> > looking into this, the line number is 1960, and a link to the current
> >> > source is:
> >> >
> >> > https://github.com/Unidata/netcdf-fortran/blob/master/fortran/netcdf_expanded.f90
> >> >
> >> > (I don't know what version of NetCDF ESMF may be linked against, but
> >> > that line number is in the right function).
> >> >
> >> >
> >> >
> >> > On Tue, Oct 24, 2017 at 3:48 PM, Rick Brownrigg <brownrig at ucar.edu>
> >> > wrote:
> >> >>
> >> >> Hi Laura,
> >> >>
> >> >> I don't really know much about the regridding process, but here is what
> >> >> I have been able to surmise from running the script:
> >> >>
> >> >> i) NCL reads the MPAS file and creates source*.nc and destination*.nc
> >> >> files. These appear to reflect the geometry of the src/dest grids.
> >> >>
> >> >> ii) The actual regridding is done by ESMF software, with a command:
> >> >>
> >> >>   ESMF_RegridWeightGen --source source_grid_file.nc --destination
> >> >> destination_grid_file.nc --weight
> >> >> weights_onCells.15-3Mesh_to_0.15rectangular.nc --src_type ESMF -i
> >> >>
> >> >> This program SEGVs almost immediately, with a message:
> >> >>
> >> >> "values=<error reading variable: value requires 155713536 bytes, which
> >> >> is more than max-value-size>...."
> >> >>
> >> >> That value is exactly the size of one of the variables in the
> >> >> source*.nc file. So it looks like some internal limit is being exceeded
> >> >> in the ESMF software.
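As a rough cross-check of the byte count in that message, the arithmetic below relates it to the mesh size. It uses the 6488066 cells reported elsewhere in this thread for the newer mesh and the 4-byte integer type implied by the function name nf90_get_var_2d_fourbyteint. The interpretation (a connectivity array for a triangulation of the cell centers, with 2N - 4 triangles of 3 nodes each) is an assumption that happens to fit the numbers; nothing in the thread confirms which variable it is.

```python
# Numbers taken from the thread.
REPORTED_BYTES = 155713536   # from the "value requires ... bytes" message
N_CELLS = 6488066            # cells in the newer MPAS mesh
BYTES_PER_INT = 4            # nf90_get_var_2d_fourbyteint reads 4-byte integers

# Number of 4-byte integers that fit in the reported size.
n_ints = REPORTED_BYTES // BYTES_PER_INT

# Assumption (not confirmed in the thread): the variable holds the node indices
# of a triangulation of the N cell centers on the sphere, which has 2N - 4
# triangles with 3 nodes each.
n_triangles = 2 * N_CELLS - 4
expected_bytes = n_triangles * 3 * BYTES_PER_INT

print(n_ints)                              # 38928384
print(n_triangles)                         # 12976128
print(expected_bytes == REPORTED_BYTES)    # True
```

If that reading is right, the variable is a 2-D integer array of roughly 156 MB, which is consistent with it being "exactly the size of one of the variables" in the source file.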
> >> >>
> >> >> Is this one of the larger MPAS files you've attempted to regrid? I
> >> >> wonder if anyone else can comment on this? Those on the glade file
> >> >> system can see all the relevant files under /glade/scratch/brownrig
> >> >>
> >> >> I'm not sure what to tell you as a work-around. Without a debug version
> >> >> of the code, it's nearly impossible for me to tell much more or to
> >> >> determine what the limits might be. Wish I had a better answer.
> >> >>
> >> >> Rick
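One way to narrow down this kind of crash is to run the weight-generation step by itself, outside NCL, and record how it exits. The sketch below assumes a POSIX system and that ESMF_RegridWeightGen is on the PATH. The command line is copied from the message above, while the helper names (`describe_exit`, `run_weight_gen`) are made up for illustration.

```python
import signal
import subprocess

# Command line as quoted in the message above.
WEIGHT_GEN_CMD = [
    "ESMF_RegridWeightGen",
    "--source", "source_grid_file.nc",
    "--destination", "destination_grid_file.nc",
    "--weight", "weights_onCells.15-3Mesh_to_0.15rectangular.nc",
    "--src_type", "ESMF",
    "-i",
]


def describe_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exit status %d" % returncode


def run_weight_gen(cmd=WEIGHT_GEN_CMD):
    """Run the command and return (exit description, captured stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return describe_exit(result.returncode), result.stderr


if __name__ == "__main__":
    try:
        status, stderr = run_weight_gen()
    except FileNotFoundError:
        print("ESMF_RegridWeightGen not found on PATH")
    else:
        print(status)
        print(stderr)
```

A "killed by SIGSEGV" result from a standalone run would suggest the problem lies in the ESMF application or the libraries it links against rather than in the NCL script itself.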
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Oct 24, 2017 at 10:13 AM, Laura Fowler <laura at ucar.edu>
> >> >> wrote:
> >> >>>
> >> >>> Hi:
> >> >>>
> >> >>> I am trying to regrid an MPAS unstructured mesh to a rectangular mesh
> >> >>> on cheyenne using ncl/6.4.0. My script crashes with a SIGSEGV and I do
> >> >>> not understand where this comes from. I have done this successfully in
> >> >>> the past, though I recall that it was with an earlier version of ncl,
> >> >>> and I do not see what I am doing wrong now. I am attaching the
> >> >>> output of my script in regrid.to_rectMesh.out.
> >> >>>
> >> >>> The script itself can be found in
> >> >>>
> >> >>> /glade2/scratch2/laura/MPAS.PacificOcean/initialization.centeredPacificOceanMesh.15-3km/regrid.to_rectMesh.ncl
> >> >>>
> >> >>> I also tried to regrid the same MPAS mesh to another unstructured
> >> >>> mesh and got a similar SIGSEGV (see regrid.to_2621442Mesh.ncl), so I
> >> >>> assume that the errors are the same.
> >> >>>
> >> >>> Hope you can help me figure this one out.
> >> >>> Thanks,
> >> >>> Laura
> >> >>>
> >> >>>
> >> >>> --
> >> >>>
> >> >>>
> >> >>> !-------------------------------------------------------------------------------------------------------------
> >> >>> Laura D. Fowler
> >> >>> Mesoscale and Microscale Meteorology Division (MMM)
> >> >>> National Center for Atmospheric Research
> >> >>> P.O. Box 3000, Boulder CO 80307-3000
> >> >>>
> >> >>> e-mail: laura at ucar.edu
> >> >>> phone: 303-497-1628
> >> >>>
> >> >>>
> >> >>>
> >> >>> !-------------------------------------------------------------------------------------------------------------
> >> >>>
> >> >>> _______________________________________________
> >> >>> ncl-talk mailing list
> >> >>> ncl-talk at ucar.edu
> >> >>> List instructions, subscriber options, unsubscribe:
> >> >>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
> >> >>>
> >> >>
> >> >
> >>
> >>
> >>
> >
> >
>
>
>
>