<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"Book Antiqua";
        panose-1:2 4 6 2 5 3 5 3 3 4;}
@font-face
        {font-family:"Bodoni MT";
        panose-1:2 7 6 3 8 6 6 2 2 3;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-family:"Book Antiqua","serif";color:#1F497D">Hi Brian, the optimum number of processors (here, OpenMP threads) for a parallel run varies with the processor, the application and the task. In your case, the optimum number
of processors for this task is 4. Going beyond 4 adds more parallelization overhead than it saves in compute time, so performance drops.
<o:p></o:p></span></p>
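<p class="MsoNormal">One way to check this is to turn the "real" times quoted below into speedup and parallel efficiency relative to the single-thread run. This is only a rough sketch: it assumes each quoted time comes from a single representative run, and the names (REAL_SECONDS, scaling_table, best_thread_count) are illustrative and not part of WRF.</p>
<pre>
```python
# Wall-clock ("real") times in seconds, copied from the timings quoted below.
# Keys are the OMP_NUM_THREADS values used for each run.
REAL_SECONDS = {
    1: 2 * 60 + 51.743,
    2: 1 * 60 + 49.172,
    4: 1 * 60 + 27.357,
    8: 1 * 60 + 35.480,
    16: 1 * 60 + 52.862,
    64: 5 * 60 + 54.857,
}


def scaling_table(real_seconds):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run."""
    baseline = real_seconds[1]
    rows = []
    for threads in sorted(real_seconds):
        speedup = baseline / real_seconds[threads]
        efficiency = speedup / threads  # 1.0 would be perfect scaling
        rows.append((threads, speedup, efficiency))
    return rows


def best_thread_count(real_seconds):
    """Return the thread count with the smallest wall-clock time."""
    return min(real_seconds, key=real_seconds.get)


if __name__ == "__main__":
    for threads, speedup, efficiency in scaling_table(REAL_SECONDS):
        print(f"{threads:3d} threads: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
    print("fastest:", best_thread_count(REAL_SECONDS))
```
</pre>
<p class="MsoNormal">With these numbers the speedup at 4 threads is just under 2x, and the 64-thread run is slower than the single-thread run, which is consistent with overhead dominating for a case this small.</p>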
<p class="MsoNormal"><span style="font-family:"Book Antiqua","serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Book Antiqua","serif";color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Book Antiqua","serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Book Antiqua","serif";color:#1F497D">Surya Ramaswamy<o:p></o:p></span></b></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Book Antiqua","serif";color:black">ERM<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Bodoni MT","serif";color:black">75 Valley Stream Parkway Suite 200<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Bodoni MT","serif";color:black">Malvern, PA 19355<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Bodoni MT","serif";color:black">484-913-0300 (main)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Bodoni MT","serif";color:black">484-913-0301 (fax)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> wrf-users-bounces@ucar.edu [mailto:wrf-users-bounces@ucar.edu]
<b>On Behalf Of </b>Andrus, Brian Contractor<br>
<b>Sent:</b> Friday, July 24, 2015 2:52 PM<br>
<b>To:</b> wrf-users@ucar.edu<br>
<b>Subject:</b> [Wrf-users] Optimizing OMP_NUM_THREADS<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I am a little confused about running wrf. I have built it and am following the testing for the Jan 2000 data set per:
<a href="http://www2.mmm.ucar.edu/wrf/OnLineTutorial/CASES/JAN00/wrf.htm">http://www2.mmm.ucar.edu/wrf/OnLineTutorial/CASES/JAN00/wrf.htm</a><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Everything works and I get proper output; however, I have noticed unusual timings that I cannot explain.<o:p></o:p></p>
<p class="MsoNormal">I am running on a system with 64 cores and 256GB RAM.<o:p></o:p></p>
<p class="MsoNormal">I compiled with <o:p></o:p></p>
<p class="MsoNormal"> 1) compile/pgi/15.7 2) mpi/openmpi/1.8.5 3) app/netcdf/4.3.3.1<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Options were for dm+sm (option 55 pgf90/pgcc) and basic nesting (option 1)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Now I am running wrf and producing wrfout_d01_2000-01-24_12:00:00<o:p></o:p></p>
<p class="MsoNormal">What is odd is the extreme variation in run time across different OMP_NUM_THREADS settings.<o:p></o:p></p>
<p class="MsoNormal">It seems best at 4; with more or fewer threads, the run time increases.<o:p></o:p></p>
<p class="MsoNormal">Setting it to 8 takes close to the same time as setting it to 2.<o:p></o:p></p>
<p class="MsoNormal">Setting it to 64 takes about 4 times as long as setting it to 4, which I did not expect.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Here are some timings:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=1<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 2m51.743s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 2m39.087s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 0m12.277s<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=2<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 1m49.172s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 3m15.582s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 0m19.015s<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=4<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 1m27.357s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 4m42.111s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 0m35.187s<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=8<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 1m35.480s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 8m20.966s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 1m13.376s<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=16<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 1m52.862s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 15m43.787s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 2m4.978s<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ export OMP_NUM_THREADS=64<o:p></o:p></i></p>
<p class="MsoNormal"><i>[bdandrus@compute-7-3 em_real]$ time ./wrf.exe<o:p></o:p></i></p>
<p class="MsoNormal"><i>starting wrf task 0 of 1<o:p></o:p></i></p>
<p class="MsoNormal"><i><o:p> </o:p></i></p>
<p class="MsoNormal"><i>real 5m54.857s<o:p></o:p></i></p>
<p class="MsoNormal"><i>user 197m37.807s<o:p></o:p></i></p>
<p class="MsoNormal"><i>sys 7m57.993s<o:p></o:p></i></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Any ideas as to what would cause that?<o:p></o:p></p>
<p class="MsoNormal">I have all but given up on using mpirun, as it seems to make the run take HOURS no matter how many procs/threads I set. When I do that, I see 100% CPU on each core it is assigned, but it rarely writes anything.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Brian Andrus<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">ITACS/Research Computing<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Naval Postgraduate School<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Monterey, California<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">voice: 831-656-6238<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>