[ncl-install] Segmentation fault when compiling NCL from source on Amazon Linux 2 (ARM64)

yehwalashet fulas yehwalaan25 at gmail.com
Mon Jan 24 02:24:46 MST 2022


Hello Dear,

I want to learn NCL on my own. I worked with it a little about 5 years
ago, and now I want to learn it and work with it again. Could anyone please
help me with how to download it to my desktop? I cannot download it, and I
don't understand the errors I get.

With regards
yehwala,

On Fri, Sep 10, 2021 at 8:30 PM Dave Allured - NOAA Affiliate via
ncl-install <ncl-install at mailman.ucar.edu> wrote:

> Michael, thank you for testing those various strategies.  Your reports and
> build logs are helping me understand some of the current problems in
> NCL/NCARG.
>
> NCL internally uses a lot of fortran.  Runtime errors "index above upper
> bound" are detected in fortran when *-fcheck=all* is used.  Some of these
> are caused by deliberate, old style array methods that are now considered
> unsafe, and have better alternatives.  Bounds checking is very useful for
> locating seg faults due to fortran array mismanagement.  However, it seems
> that bounds checking has never been seriously applied to NCL/NCARG.  There
> may be numerous cases to work through, before arriving at your original
> problem in that *create* block within *wrf_contour*.  Even then it may
> not help, because that particular fault might be in C code, rather than
> fortran.
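
As a minimal sketch of what this bounds checking reports, the shell snippet below builds a tiny fixed-form program with the debug flags suggested elsewhere in this thread. The program, the FILL routine, and the file names are made up for illustration and are not taken from the NCL sources. It assumes gfortran is on the PATH and skips the demo otherwise.

```shell
# Sketch: reproduce the class of error discussed above with a tiny program.
# Everything here (file names, the FILL routine) is hypothetical.
demo_bounds_check() {
  if ! command -v gfortran >/dev/null 2>&1; then
    echo "gfortran not found; skipping demo"
    return 0
  fi
  workdir=$(mktemp -d)
  cat > "$workdir/demo.f" <<'EOF'
      PROGRAM DEMO
      INTEGER BUF(4)
      CALL FILL(BUF, 4)
      END
      SUBROUTINE FILL(ID, N)
C     Old-style declaration: ID(1) stands in for "any length".
      INTEGER ID(1), N, I
      DO 10 I = 1,N
        ID(I) = 0
   10 CONTINUE
      END
EOF
  gfortran -g -O0 -fbacktrace -fcheck=all -o "$workdir/demo" "$workdir/demo.f"
  # With bounds checking on, the run is expected to stop once I reaches 2.
  "$workdir/demo" || echo "demo exited with status $?"
  rm -rf "$workdir"
}
demo_bounds_check
```

The run is expected to stop with a message of the same form as the ones in the build logs in this thread (index above upper bound), because the dummy array is declared with length 1 but indexed up to N. Without -fcheck=all the same program would normally run silently.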
>
> The undefined reference errors are usually secondary errors that cascade
> after some primary compile error.  As such they are not of much concern,
> until the primary errors are solved.
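
One rough way to find the primary error in a long build log is to look for the first line that mentions an error and is not a linker "undefined reference" message. The helper below is only a heuristic sketch; the function name and the log file name are assumptions, and the pattern may need adjusting for a particular log.

```shell
# Print the first line (with its line number) that mentions an error,
# skipping linker "undefined reference" lines, which are usually secondary.
first_primary_error() {
  grep -n -i 'error' "$1" | grep -v 'undefined reference' | head -n 1
}
```

For example, `first_primary_error make-output.txt` would print the earliest candidate line, which is a reasonable place to start reading.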
>
> For what it's worth, here is a fix for the bounds error in *binput.f*.
> This should fix the *graphc* executable, and thereby fix the
> graphcap section of the full build process.  This worked for me with
> gfortran 10 and 11 on Mac X86.  Original code near the end of *binput.f*:
>
>       DO 1111 II = 1,DUMSIZ
>         DUMSPC(II) = 0
>  1111 CONTINUE
>
>  Change the first line, and insert another one at the end:
>
>       DO 1111 II = 1,DUMSM1
>         DUMSPC(II) = 0
>  1111 CONTINUE
>       ENDDSP = 0
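
For anyone who prefers to script the edit, here is a hedged sketch that applies the same two-line change with awk. It assumes the loop appears in the file with exactly the spacing quoted above, which may not match the real source, so check the result with diff before rebuilding. The function name is hypothetical.

```shell
# Sketch only: apply the two-line change to the given file, keeping a backup.
# Assumes the loop lines look exactly like the ones quoted in this thread.
patch_binput() {
  src=$1
  cp "$src" "$src.orig"    # keep a copy of the unmodified file
  awk '
    # 1) Shorten the loop bound from DUMSIZ to DUMSM1.
    /^ *DO 1111 II = 1,DUMSIZ *$/ { sub(/DUMSIZ/, "DUMSM1"); patched = 1 }
    { print }
    # 2) After the matching CONTINUE, add the extra assignment once.
    patched && /^ *1111 CONTINUE *$/ { print "      ENDDSP = 0"; patched = 0 }
  ' "$src.orig" > "$src"
}
```

It would be run as `patch_binput binput.f` from the graphcap source directory, followed by `diff binput.f.orig binput.f` to confirm that only the intended lines changed.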
>
> I don't have much time to spend on this, but I will try to look at some of
> the other problems when possible.
>
>
> On Fri, Sep 10, 2021 at 6:09 AM <michael.graf at meteoprime.ch> wrote:
>
>> I have now run the last experiments, without success. Compiling NCL on
>> ARM64 (AWS Graviton2) seems to be very tricky, so for now I will use a
>> workaround in which WRF and NCL run on two different virtual machines with
>> different architectures (ARM64 and X86_64). It would be more convenient to
>> have both on the same virtual machine, but at the moment the effort needed
>> seems too large. Nevertheless, I have attached the make-output files
>> (compiled with GCC 11.2) as a debugging aid for others.
>>
>>
>>
>> Compiling the develop branch produced the same errors as the main branch.
>> Afterwards I tried the newest GCC version, 11.2. Now many *type mismatch*
>> and *rank mismatch* errors occur (see attached make-output files). I
>> turned them off with the option *-fallow-argument-mismatch*, which was
>> followed by numerous errors of the type “Error: BOZ literal constant at (1)
>> is neither a data-stmt-constant nor an actual argument to INT, REAL, DBLE,
>> or CMPLX intrinsic function”. I tried to suppress these with the
>> *-fallow-invalid-boz* option, but without success. Many “undefined
>> reference” errors remain, and some “Error: Operands of binary numeric
>> operator ‘/’ at (1) are INTEGER(4)/BOZ” errors also appear.
>>
>>
>>
>> *From:* ncl-install <ncl-install-bounces at mailman.ucar.edu> *On Behalf
>> Of* Michael Graf via ncl-install
>> *Sent:* Thursday, 9 September 2021 08:50
>> *To:* 'Dave Allured - NOAA Affiliate' <dave.allured at noaa.gov>
>> *Cc:* ncl-install at mailman.ucar.edu
>> *Subject:* Re: [ncl-install] Segmentation fault when compiling NCL from
>> source on Amazon Linux 2 (ARM64)
>>
>>
>>
>> As suggested, I compiled NCL with compiler-based debugging features
>> enabled. Now several errors occur (see below for some examples; the zipped
>> make-output file is attached). For example, the error ‘*At line 4397 of
>> file Iftran.f - Fortran runtime error: Index '2' of dimension 1 of array
>> 'id' above upper bound of 1*’ occurs several times, and many *undefined
>> reference to `xxx'* messages now appear. The ncl binary can no longer be
>> created. It’s not clear to me what is causing the errors now, because they
>> aren’t present when the debugging features are disabled. Maybe you have a
>> suggestion.
>>
>>
>>
>> Next steps will be to use the develop branch of ncl for compilation and
>> try out the newest version of GCC (11.2).
>>
>>
>>
>> *****
>>
>> Processing graphcap adm5
>> *At line 316 of file binput.f*
>> *Fortran runtime error: Index '327' of dimension 1 of array 'dumspc' above upper bound of 326*
>>
>> Error termination. Backtrace:
>> #0  0x40001b72195b in ???
>> #1  0x40001b722893 in ???
>> #2  0x40001b722ccb in ???
>> #3  0x401983 in binput_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/graphcap/binput.f:316
>> #4  0x40131f in capchg
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/graphcap/capchg.f:811
>> #5  0x401477 in main
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/graphcap/capchg.f:839
>> make[4]: *** [adm5] Error 2
>> make[4]: Leaving directory `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/graphcap'
>>
>>
>>
>> *****
>>
>> gcc -g -O0 -ansi -fPIC -fopenmp -std=c99  -O    -o Fsplit Fsplit.o
>> -L../../../.././common/src/libncarg_c -lncarg_c -L/usr/local/ncarg/lib
>> -L/usr/local/lib
>> make[5]: Leaving directory `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran'
>> Making ./ncarg2d/src/libncarg/areas
>> make[5]: Entering directory `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/areas'
>> *At line 4397 of file Iftran.f*
>> *Fortran runtime error: Index '2' of dimension 1 of array 'id' above upper bound of 1*
>>
>> Error termination. Backtrace:
>> #0  0x4000104a895b in ???
>> #1  0x4000104a9893 in ???
>> #2  0x4000104a9ccb in ???
>> #3  0x400f1f in xmit_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:4397
>> #4  0x4016cb in iftrio_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:2060
>> #5  0x403403 in rdcrd_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:3036
>> #6  0x40ca6f in getsta_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:1701
>> #7  0x4116d7 in iftrax_
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:225
>> #8  0x413093 in iftran
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:89
>> #9  0x413213 in main
>>         at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/Iftran/Iftran.f:99
>> make[5]: *** [IftranRun] Error 2
>> make[5]: Leaving directory `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/libncarg/areas'
>>
>>
>>
>> *****
>>
>> make[5]: Entering directory `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo'
>> gfortran -g -O0 -fbacktrace -fcheck=all -ffpe-trap=invalid,zero,overflow
>> -fPIC -fno-second-underscore -fno-range-check -fopenmp  -O    -o ezmapdemo
>> EzmapDemo.o -L../../../.././ncarg2d/src/libncarg -lncarg
>> -L../../../.././ncarg2d/src/libncarg_gks -lncarg_gks
>> -L../../../.././common/src/libncarg_c -lncarg_c -lcairo -lXrender
>> -lfontconfig -lpixman-1 -lfreetype -lexpat -lpng -lz -lbz2 -lpng -lz
>> -L/usr/local/ncarg/lib -L/usr/local/lib  -lX11 -lXext
>> EzmapDemo.o: In function `colora_':
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo/EzmapDemo.f:2968: undefined reference to `mapaci_'
>> EzmapDemo.o: In function `drawla_':
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo/EzmapDemo.f:2984: undefined reference to `mapaci_'
>> EzmapDemo.o: In function `coninv_':
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo/EzmapDemo.f:4360: undefined reference to `cpsetr_'
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo/EzmapDemo.f:4361: undefined reference to `cpsetr_'
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/ncarg2d/src/bin/ezmapdemo/EzmapDemo.f:4362: undefined reference to `cpsetr_'
>>
>
> <snip>
>
>
>
>> *From:* ncl-install <ncl-install-bounces at mailman.ucar.edu> *On Behalf
>> Of* Michael Graf via ncl-install
>> *Sent:* Tuesday, 7 September 2021 11:43
>> *To:* 'Dave Allured - NOAA Affiliate' <dave.allured at noaa.gov>
>> *Cc:* ncl-install at mailman.ucar.edu
>> *Subject:* Re: [ncl-install] Segmentation fault when compiling NCL from
>> source on Amazon Linux 2 (ARM64)
>>
>>
>>
>> Thanks for the helpful suggestions. I started by isolating the
>> segmentation fault and found that it occurs when the function
>> gsn_contour() is called. I then checked where in this function the
>> segmentation fault is triggered and found that it is in the block shown
>> below, between the two print statements. I don’t know exactly what this
>> part does, but it seems to be related to the contour plotting routine. I
>> also used ncl -x but didn’t find any additional information about the
>> error. The next step will be compilation with compiler-based debugging
>> features enabled.
>>
>>
>>
>> *****
>>
>> if (is_lb_mode) then
>>   if(res2.and.isatt(res2,"trGridType")) then
>>     plot_object = create wksname + "_contour" contourPlotClass wks
>>       "cnScalarFieldData" : data_object
>>       "pmLabelBarDisplayMode" : lb_mode
>>       "trXTensionF": xtension
>>       "trYTensionF": ytension
>>       "trGridType": res2@trGridType
>>     end create
>>     delete(res2@trGridType)
>>   else
>>     *print("START CREATE")*
>>     plot_object = create wksname + "_contour" contourPlotClass wks
>>       "cnScalarFieldData" : data_object
>>       "pmLabelBarDisplayMode" : lb_mode
>>       "trXTensionF": xtension
>>       "trYTensionF": ytension
>>     end create <-- *Segmentation fault*
>>     *print("END CREATE")*
>>   end if
>>
>>
>>
>> *****
>>
>> opts2 = opts
>> delete_attrs(opts2)                ; Clean up.
>>   print("RUN gsn_contour()")
>>   ;print(wks)
>>   ;print(data)
>>   ;print(opts2)
>> cn = gsn_contour(wks,data,opts2)   ; Create the plot. <-- *Segmentation fault*
>> print("FINISH gsn_contour()")
>> _SetMainTitle(nc_file,wks,cn,opts) ; Set some titles
>>
>>
>>
>> *From:* Dave Allured - NOAA Affiliate <dave.allured at noaa.gov>
>> *Sent:* Monday, 6 September 2021 21:51
>> *To:* michael.graf at meteoprime.ch
>> *Cc:* ncl-install at mailman.ucar.edu
>> *Subject:* Re: [ncl-install] Segmentation fault when compiling NCL from
>> source on Amazon Linux 2 (ARM64)
>>
>>
>>
>> Here are a few more suggestions in between trial and error, and deeper
>> debugging.  I don't have anything better than these general suggestions,
>> sorry.
>>
>>
>>
>> * Use the debug mode *ncl -x* to further isolate the lower level NCL
>> statement that triggers the error.
>>
>>
>>
>> * wrf_contour is actually NCL code, inside
>> $NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRFUserARW.ncl.   Make your own clone,
>> and isolate the lower level NCL statement that triggers the error.  You may
>> be able to bypass the problem with alternative coding, or simply eliminate
>> a non-essential section, such as logging.
>>
>>
>>
>> * Rebuild NCAR/NCL with compiler-based debugging features enabled, such
>> as *-g -O0 -fbacktrace -fcheck=all -ffpe-trap=invalid,zero,overflow*.
>>
>>
>> * Try the latest NCARG/NCL development version from
>> https://github.com/NCAR/ncl.  Take the "develop" branch.  There have
>> been several bug fixes and build improvements since the 6.6.2 release.
>>
>>
>>
>> * Upgrade your GCC/gfortran version.  There have been improvements in ARM
>> support.  Check to see what is available in the Extras package for Amazon
>> Linux.  Consider building your own GCC/gfortran to the latest version,
>> currently 11.2.  If you switch GCC/gfortran versions, you may also need to
>> rebuild some of your dependencies.
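
A small sketch of the first steps for the last two suggestions: check which compilers are currently on the PATH, then fetch only the "develop" branch from the repository named above. The function names and the target directory are arbitrary choices for illustration; the clone function is defined but not run here because it needs network access.

```shell
# Report which compilers are on PATH before rebuilding.
report_toolchain() {
  for tool in gcc gfortran; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: %s\n' "$tool" "$("$tool" --version | head -n 1)"
    else
      printf '%s: not found\n' "$tool"
    fi
  done
}

# Fetch only the "develop" branch; the directory name is an arbitrary default.
fetch_ncl_develop() {
  git clone --branch develop --single-branch https://github.com/NCAR/ncl.git "${1:-ncl-develop}"
}

report_toolchain
```

Running `fetch_ncl_develop` afterwards creates an `ncl-develop` directory by default. If gcc and gfortran report different versions, that mismatch is worth resolving before rebuilding the dependencies.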
>>
>>
>>
>>
>>
>> On Mon, Sep 6, 2021 at 5:12 AM <michael.graf at meteoprime.ch> wrote:
>>
>> Thanks for the hint. Now, after copying the font files from another
>> distribution, I can compile NCL on Amazon Linux 2 without error messages.
>> However, when I run a script with NCL I still get a ‘Segmentation fault’
>> message. I also compiled with the -g option, but that gives no additional
>> hints (see below).  I found out that the segmentation fault occurs when the
>> function wrf_contour() is called. Other things seem to work well: I can
>> read NetCDF-4 files without problems and calculate CAPE and other
>> diagnostics. I also managed to install NCL version 6.6.2 from EPEL8 on
>> RHEL8 (ARM64, AWS *Graviton2*) without any problems. However, exactly the
>> same issue (segmentation fault) occurred when calling wrf_contour(). It
>> seems to me that parts other than the fontcap compilation have similar
>> problems. Maybe the problem is related to the AWS *Graviton2* CPU, *but it
>> is also little endian.*
>>
>>
>>
>> *****
>>
>> > ncl wrf_mucape_cin.ncl
>>
>>  Copyright (C) 1995-2019 - All Rights Reserved
>>  University Corporation for Atmospheric Research
>>  NCAR Command Language Version 6.6.2
>>  The use of this software is governed by a License Agreement.
>>  See http://www.ncl.ucar.edu/ for more details.
>> (0)Working on time: 2021-08-31_00:00:00
>> *Segmentation fault*
>>
>>
>>
>> *****
>>
>> > lscpu
>>
>> Architecture: aarch64
>> *Byte Order: Little Endian*
>> CPU(s): 2
>> On-line CPU(s) list: 0,1
>> Thread(s) per core: 1
>> Core(s) per socket: 2
>> Socket(s): 1
>> NUMA node(s): 1
>> Model: 1
>> BogoMIPS: 243.75
>> L1d cache: 64K
>> L1i cache: 64K
>> L2 cache: 1024K
>> L3 cache: 32768K
>> NUMA node0 CPU(s): 0,1
>> Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
>> cpuid asimdrdm lrcpc dcpop asimddp ssbs
>>
>>
>>
>>
>>
>> On Fri, Sep 3, 2021 at 4:39 PM Dave Allured - NOAA Affiliate <
>> dave.allured at noaa.gov> wrote:
>>
>> I think fontc is a standalone program that is used only during the NCL
>> build process.  You may be able to sidestep the program issue completely,
>> by simply copying over the compiled fontcap files from a different build.
>> Look at one of the X86 binary distributions, or a working install on any
>> X86 system.  I suspect that the only compatibility issue is endianness of
>> 16- and 32-bit integers.  ARM64 and X86 should both be little endian; not
>> sure because I lack ARM experience.
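
A quick way to confirm the byte order on both machines before copying the compiled fontcap files, without relying on lscpu, is sketched below. The function name is made up for illustration; it uses only printf, od, and tr.

```shell
# Print "little" or "big" for the current host.
# The bytes 0x01 0x00 are read back as one native 16-bit unsigned integer:
# a little-endian host sees 1, a big-endian host sees 256.
byte_order() {
  value=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')
  if [ "$value" = "1" ]; then
    echo "little"
  else
    echo "big"
  fi
}
byte_order
```

If both the build host and the source of the fontcap files print the same answer, byte order is unlikely to be the obstacle to copying those files.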
>>
>>
>>
>>
>>
>> On Wed, Sep 1, 2021 at 11:37 AM Michael Graf via ncl-install <
>> ncl-install at mailman.ucar.edu> wrote:
>>
>> Dear all,
>>
>> Thanks for adding me to the NCL mailing list.
>>
>> I am trying to compile the latest NCL version, 6.6.2, from scratch on Amazon
>> Linux 2 (ARM64 architecture). Everything works fine except that a
>> segmentation fault occurs when the fontcaps are compiled, or more precisely
>> when the fontc binary is processing the fontcaps (see output below). No
>> other error occurs. The ncl binary is built and starts without problems,
>> but when I run a plotting script a segmentation fault occurs that is
>> probably related to the compilation error in fontcap.
>>
>> I also compiled a minimal version with as few dependencies as possible (no
>> GDAL, HDF5, NETCDF-4 and so on) to rule them out as the cause, but this had
>> no effect. I have also tried various compiler options for the compilation
>> in the fontcap folder, but the error always remains the same. I suspect
>> that the compiler is causing the problem, but so far there is no
>> alternative on Amazon Linux 2. I'm using gfortran (version 7.3.1) and gcc
>> (version 7.3.1), but only Fortran 77 code seems to be compiled here.
>>
>> It would be great if somebody had a hint on how to overcome this problem.
>> Maybe there is another option, so that I don't have to build it from
>> scratch. The installation with conda does not work on ARM64.
>>
>> Best, Michael
>>
>> ************************************************************************
>> Making ./common/src/fontcap
>> make[4]: Entering directory
>> `/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap'
>>
>>
> <snip>
>
> gfortran -g -fbacktrace -Wall -fcheck=all      -o fontc cfaamn.o  cfrdln.o
>> cfwrit.o  ffgttk.o  ffinfo.o  ffphol.o  ffppkt.o  ffprcf.o ffprsa.o
>> fftbkd.o  fftkin.o  sffndc.o
>>   sfgtin.o  sfgtkw.o  sfprcf.o  sfskbk.o sftbkd.o
>> -L../../.././common/src/libncarg_c -lncarg_c -L/usr/local/ncarg/lib
>> -L/usr/local/lib
>> Processing fontcap font1
>>
>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>> reference.
>>
>> Backtrace for this error:
>> #0  0x40001a29595b in ???
>> #1  0x40001a29488f in ???
>> #2  0x40001a26c667 in ???
>> #3  0x40723c in ???
>> #4  0x4072e7 in ???
>> #5  0x405c97 in sfgtwk_
>> at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/sfgtkw.f:95
>> #6  0x4061cb in sfprcf_
>> at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/sfprcf.f:108
>> #7  0x40119b in cfaamn
>> at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/cfaamn.f:304
>> #8  0x401633 in main
>> at /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/cfaamn.f:358
>> make: *** [font1] Segmentation fault
>>
>> ************************************************************************
>> gfortran -g -fsanitize=address,undefined      -o fontc cfaamn.o  cfrdln.o
>> cfwrit.o  ffgttk.o  ffinfo.o  ffphol.o  ffppkt.o  ffprcf.o ffprsa.o
>> fftbkd.o  fftkin.o  sffndc.o
>>  sfgtin.o  sfgtkw.o  sfprcf.o  sfskbk.o sftbkd.o
>> -L../../.././common/src/libncarg_c -lncarg_c -L/usr/local/ncarg/lib
>> -L/usr/local/lib
>> Processing fontcap font1
>> ASAN:DEADLYSIGNAL
>> =================================================================
>> ==2477==ERROR: AddressSanitizer: SEGV on unknown address 0x100005104df40
>> (pc
>> 0x00000040ffd0 bp 0xffffd104daf0 sp 0xffffd104daf0 T0)
>> ==2477==The signal is caused by a READ memory access.
>>     #0 0x40ffcf in gbyte_
>> (/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/fontc+0x40ffcf)
>>     #1 0x410057 in gbytes_
>> (/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/fontc+0x410057)
>>     #2 0x40deab in sfprcf_
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/sfprcf.f:117
>>     #3 0x40213f in cfaamn
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/cfaamn.f:304
>>     #4 0x402d7b in main
>> /home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/cfaamn.f:358
>>     #5 0x40002cbc7ce3 in __libc_start_main (/lib64/libc.so.6+0x1fce3)
>>     #6 0x4018a7
>> (/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/fontc+0x4018a7)
>>
>> AddressSanitizer can not provide additional info.
>> SUMMARY: AddressSanitizer: SEGV
>> (/home/ec2-user/wrf/NCL/ncl_ncarg-6.6.2/common/src/fontcap/fontc+0x40ffcf)
>> in gbyte_
>> ==2477==ABORTING
>> make: *** [font1] Error 1
>>
>> _______________________________________________
>> ncl-install mailing list
>> List instructions, subscriber options, unsubscribe:
>> https://mailman.ucar.edu/mailman/listinfo/ncl-install