[Wrf-users] wrf.exe on a RedHat 5 compatible cluster

Surya Ramaswamy Surya.Ramaswamy at erm.com
Fri Apr 23 10:52:56 MDT 2010


Hi Lampros,

Try reducing the time step. These errors usually occur when your time step is too high.
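
For reference, a common WRF rule of thumb is that time_step (in seconds) in namelist.input should be no larger than about 6 times the grid spacing dx in kilometres. The sketch below only does that arithmetic; the 12 km grid spacing and the factor of 4 for the reduced value are illustrative assumptions, not values from this thread.

```shell
#!/bin/sh
# Hypothetical grid spacing in metres (dx in namelist.input); not taken from the thread.
dx_m=12000

# Rule of thumb: time_step (s) should not exceed about 6 * dx (km).
max_time_step=$(( 6 * dx_m / 1000 ))

# A more conservative value to try when a run is unstable.
# The factor 4 is an arbitrary illustrative choice.
reduced_time_step=$(( 4 * dx_m / 1000 ))

echo "upper bound: ${max_time_step} s, reduced: ${reduced_time_step} s"
```

The resulting number would go into the time_step entry of the &domains section of namelist.input.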

Regards

Surya
-----Original Message-----
From: wrf-users-bounces at ucar.edu [mailto:wrf-users-bounces at ucar.edu] On Behalf Of Lampros Mountrakis
Sent: Friday, April 23, 2010 2:52 AM
To: wrf-users at ucar.edu
Subject: [Wrf-users] wrf.exe on a RedHat 5 compatible cluster

I am trying to run wrf.exe 3.1.1 on a RedHat 5 compatible cluster, and all I get is errors. I tried several compilation options, such as dmpar/dmsm and static/dynamic, and all of them fail. The common setup is the em_real case, MPICH1, and the Intel compiler.

The very same case produces reasonable output on a RedHat 4 based cluster.

"ulimit -s unlimited" is set in the run script before the mpirun call, along with the following parameters, which I found in several threads about similar problems:

    export MPICH_UNEX_BUFFER_SIZE=1024M
    export P4_GLOBMEMSIZE=536870912
    export MP_STACK_SIZE=64000000
    export KMP_STACKSIZE=2048M
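
The ordering described above can be sketched as a launch script. This is a minimal illustration, not the poster's actual script: the mpirun line is left as a comment, the task count of 16 comes from the log further down, and the machinefile name is hypothetical.

```shell
#!/bin/sh
# Sketch of a launch script; the mpirun line at the end is illustrative only.

# Raise the stack limit first so child processes started from this shell inherit it.
# This can fail if the hard limit set by the administrator is lower.
ulimit -s unlimited 2>/dev/null || echo "warning: could not raise stack limit" >&2

# Settings quoted in the message.
export MPICH_UNEX_BUFFER_SIZE=1024M
export P4_GLOBMEMSIZE=536870912
export MP_STACK_SIZE=64000000
export KMP_STACKSIZE=2048M

# Print the effective values so they end up in the job log.
echo "stack limit: $(ulimit -s)"
echo "P4_GLOBMEMSIZE: ${P4_GLOBMEMSIZE}"

# Launch step (commented out; the machinefile name is a hypothetical placeholder):
# mpirun -np 16 -machinefile ./machines ./wrf.exe
```

One caveat worth checking: when a launcher starts remote ranks over rsh or ssh, limits and variables set in the launching shell may not reach those ranks, so printing them on the compute nodes as well can help confirm what wrf.exe actually sees.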

The most common errors are the following:

    std error
    rm_l_4_18119: (1065.371648) net_send: could not write to fd=5, errno = 32
    rm_l_15_12081: (1052.342272) net_send: could not write to fd=5, errno = 32
    rm_l_6_27731: (1064.724480) net_send: could not write to fd=5, errno = 32
    rm_l_10_12047: (1063.745536) net_send: could not write to fd=5, errno = 32
    rm_l_14_12071: (1052.841984) net_send: could not write to fd=5, errno = 32
    rm_l_13_12065: (1053.071360) net_send: could not write to fd=5, errno = 32
    rm_l_7_12029: (1064.433664) net_send: could not write to fd=5, errno = 32
    rm_l_9_12041: (1063.974912) net_send: could not write to fd=5, errno = 32
    rm_l_11_12053: (1058.514944) net_send: could not write to fd=5, errno = 32
    rm_l_12_12059: (1053.300736) net_send: could not write to fd=5, errno = 32
    rm_l_2_21220: (1071.020032) net_send: could not write to fd=5, errno = 32





    ==> rsl.error.0000 <==
    -------------- FATAL CALLED ---------------
    FATAL CALLED FROM FILE: <stdin> LINE: 71
    program wrf: error opening wrfinput_d01 for reading ierr= -1021
    -------------------------------------------
    [0] MPI Abort by user Aborting program !
    [0] Aborting program!



    ==> rsl.out.0000 <==
    -------------- FATAL CALLED ---------------
    FATAL CALLED FROM FILE: <stdin> LINE: 71
    program wrf: error opening wrfinput_d01 for reading ierr= -1021
    -------------------------------------------
    taskid: 0 hostname: wn024.grid.auth.gr
    p0_30948: p4_error: : 1
    p0_30948: (33.830912) net_send: could not write to fd=5, errno = 32


    ==> rsl.out.0001 <==
    alloc_space_field: domain 1, 58257184 bytes allocated
    -------------- FATAL CALLED ---------------
    FATAL CALLED FROM FILE: <stdin> LINE: 71
    program wrf: error opening wrfinput_d01 for reading ierr= -1021
    -------------------------------------------
    taskid: 1 hostname: wn024.grid.auth.gr
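
Every rank in the excerpts above reports a failure to open wrfinput_d01, so one way to narrow things down is to confirm that the file is present, readable, and non-empty in the run directory as seen from the compute node. The sketch below uses a hypothetical helper, check_input, which is not part of WRF; it does not check whether the file is in a format the executable can read.

```shell
#!/bin/sh
# Check that an input file exists, is readable, and is non-empty.
# check_input is a hypothetical helper, not part of WRF.
check_input() {
    file="$1"
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    elif [ ! -r "$file" ]; then
        echo "not readable: $file"
        return 2
    elif [ ! -s "$file" ]; then
        echo "empty: $file"
        return 3
    fi
    echo "ok: $file"
}

# Run from the directory where wrf.exe is started.
check_input wrfinput_d01 || echo "wrf.exe would fail to open this file" >&2
```

Running the same check inside the batch job, rather than on the login node, shows what the compute node (wn024.grid.auth.gr in the log) actually has access to.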




From time to time I get

    starting wrf task 7 of 16
    starting wrf task 9 of 16
    starting wrf task 13 of 16
    starting wrf task 0 of 16
    starting wrf task 1 of 16
    starting wrf task 2 of 16
    starting wrf task 3 of 16
    starting wrf task 4 of 16
    starting wrf task 5 of 16
    starting wrf task 6 of 16
    starting wrf task 8 of 16
    starting wrf task 15 of 16
    starting wrf task 10 of 16
    starting wrf task 11 of 16
    starting wrf task 12 of 16
    starting wrf task 14 of 16

    Killed by signal 2.
    Killed by signal 2.
    Killed by signal 2.
    Killed by signal 2.
    Killed by signal 2.
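
The numbers in these messages can be decoded with standard tools. Signal 2 is SIGINT, and errno 32 in the net_send lines earlier is EPIPE (broken pipe) on Linux, meaning the other end of the connection had already gone away. A plausible reading, not confirmed anywhere in this thread, is that the remaining ranks were interrupted while the job was being torn down after the first abort. A minimal sketch for the signal lookup:

```shell
#!/bin/sh
# Translate the signal number from "Killed by signal 2." into its name.
sig_name=$(kill -l 2)
echo "signal 2 is SIG${sig_name}"
```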



If you have something to suggest, or some kind of solution, I would be grateful.
Thank you for your time.

__
Lampros
_______________________________________________
Wrf-users mailing list
Wrf-users at ucar.edu
http://mailman.ucar.edu/mailman/listinfo/wrf-users
