[mpas-developers] 1/10 degree problems
Xylar Asay-Davis
xylar at lanl.gov
Fri Apr 16 15:22:59 MDT 2010
It sounds like there isn't an output.nc file to look at in the first place.
-Xylar
On 4/16/10 3:13 PM, Michael Duda wrote:
> Hi, Mat.
>
> I'd agree with Xylar's assessment that the F is probably not an
> indication of anything wrong -- rather, just indicating that none
> of the timers were active at the point where their times were
> printed.
>
> I wonder whether there could be some problem with the mesh
> decomposition file, graph.info.part.64, that is causing cells to
> not be assigned to any MPI task? Have you checked whether the
> fields in the output.nc file are garbage or not -- or perhaps
> whether all time periods look identical to the initial state?
>
> Michael
>
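The decomposition-file check suggested above can be sketched roughly as follows. This is only a minimal sketch: it assumes graph.info.part.64 uses the usual METIS partition layout (one integer task ID per line, one line per cell), and the function names summarize_partition and check_partition_file are hypothetical helpers, not part of MPAS.

```python
from collections import Counter


def summarize_partition(lines, n_tasks):
    """Count cells per MPI task from the lines of a partition file.

    Assumption: one integer task ID per line, one line per cell.
    """
    counts = Counter()
    bad_lines = []   # line numbers with a non-integer or out-of-range ID
    n_cells = 0
    for lineno, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        n_cells += 1
        try:
            task = int(text)
        except ValueError:
            bad_lines.append(lineno)
            continue
        if 0 <= task < n_tasks:
            counts[task] += 1
        else:
            bad_lines.append(lineno)
    # Tasks that never appear in the file own no cells at all.
    empty_tasks = [t for t in range(n_tasks) if counts[t] == 0]
    return {
        "n_cells": n_cells,
        "counts": dict(counts),
        "empty_tasks": empty_tasks,
        "bad_lines": bad_lines,
    }


def check_partition_file(path, n_tasks):
    """Read a partition file from disk and summarize it."""
    with open(path) as f:
        return summarize_partition(f, n_tasks)
```

For the 64-core run in this thread, the call would be check_partition_file("graph.info.part.64", 64). A non-empty "empty_tasks" or "bad_lines" entry, or an "n_cells" value that differs from the number of cells in grid.nc, would point at the decomposition file.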
>
> On Fri, Apr 16, 2010 at 02:27:01PM -0600, Xylar Asay-Davis wrote:
>
>> Mat,
>>
>> Could you send the namelist.input file you're using, too? Who knows,
>> maybe something useful there?
>>
>> I don't think the F is an indication of the problem. If I'm reading the
>> code correctly, it just indicates that the timer (not the code) is no
>> longer running. If you did a call to the code that prints the timing
>> information before calling timer_stop(), then this flag would be T instead.
>>
>>
>> -Xylar
>>
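The meaning of the T/F flag described above can be illustrated with a small sketch. This is not the MPAS timer code; the Timer class and its methods are hypothetical, and it only assumes that the flag reports whether the timer is running at the moment the timings are printed.

```python
import time


class Timer:
    """Toy wall-clock timer with a 'running' flag (hypothetical, not MPAS)."""

    def __init__(self, name):
        self.name = name
        self.running = False
        self.total_wall = 0.0
        self._start = None

    def start(self):
        self.running = True
        self._start = time.perf_counter()

    def stop(self):
        if self.running:
            # Accumulate elapsed wall time and mark the timer as stopped.
            self.total_wall += time.perf_counter() - self._start
            self.running = False

    def report(self):
        # 'T' if the timer is still running when the report is printed.
        flag = "T" if self.running else "F"
        return f"{self.name} {flag} {self.total_wall:.5f}"
```

Calling report() between start() and stop() gives a T flag; calling it after stop() gives F, which is the normal state at the end of a run.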
>> On 4/16/10 2:16 PM, Mathew Maltrud wrote:
>>
>>> Hi Michael and Todd--
>>>
>>> i've been trying to run the 1/10 dipole POP grid in the sw
>>> configuration and am getting something i haven't seen before. all
>>> appears normal--all mpi processes are going, etc. the *.err files say
>>> it is looping over timesteps, though clearly nothing is being done
>>> (happening too fast). there's no output.nc file. here are examples
>>> of the log.0000.* files (running on 64 cores):
>>>
>>> mm@cy-2.lanl.gov {10}% tail log.0000.err
>>> Doing timestep 11
>>> Doing timestep 12
>>> Doing timestep 13
>>> Doing timestep 14
>>> Doing timestep 15
>>> Doing timestep 16
>>> Doing timestep 17
>>> Doing timestep 18
>>> Doing timestep 19
>>> Doing timestep 20
>>> mm@cy-2.lanl.gov {11}% tail log.0000.out
>>>
>>> TIMINGS (process:event,running,cpu,wall,100*(wall/total wall))
>>>   0 : total time        F   0.00000   196.55210
>>>
>>>   0 : initialize        F   0.00000    67.82460   34.51
>>>   0 : time integration  F   0.00000    11.05870    5.63
>>>
>>> so the 'F' is a clue, but i don't know what it means. note that the
>>> grid.nc file looks ok, and i successfully ran the 4/10 version of this
>>> grid earlier this week.
>>>
>>> any hints? maybe not enough memory? there are about 6 million cells...
>>>
>>> thanks...
>>> -mat
>>> _______________________________________________
>>> mpas-developers mailing list
>>> mpas-developers at mailman.ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/mpas-developers
>>>
>>>
>>
>> --
>>
>> ***********************
>> Xylar S. Asay-Davis
>> E-mail: xylar at lanl.gov
>> Phone: (505) 606-0025
>> Fax: (505) 665-2659
>> CNLS, MS B258
>> Los Alamos National Laboratory
>> Los Alamos, NM 87545
>> ***********************
>>
>>