<div dir="ltr"><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Hi Dave,</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Thanks a lot for the insightful explanation. Yes, there are many missing values within the dataset. </div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">As a test, I averaged manually with three nested loops (one per dimension), summing the non-missing values while also counting the non-missing instances. The final result matched the single-step dim_avg_n result (I included coordinate arrays this time). This also confirmed that there is no built-in area weighting in the dim_avg_n function. </div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">So I then multiplied the data by a corresponding 2D array of weighting fractions that I had generated separately (these were just normalized areas per gridcell) before performing the single-step dim_avg_n operation. 
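<br><br>For other users, a minimal NCL sketch of an area-weighted mean that stays consistent when values are missing. It is an illustration under assumptions rather than the exact code used here: "area" is a hypothetical name for the 2D [41] x [41] gridcell-area array (assumed to contain no missing values), and every month is assumed to have at least one valid cell. The weighted sum is divided by the summed weights of the non-missing cells only, so the weights do not need to be normalized beforehand.
<pre>
```ncl
; Assumption: "area" is a hypothetical 2D [41] x [41] array of gridcell areas
; with no missing values; monthlypm25NCP is the [12] x [41] x [41] field.

; 1 where the data are valid, 0 where they are missing
valid  = where(ismissing(monthlypm25NCP), 0.0, 1.0)

; Broadcast the 2D areas to 3D and zero the weight of every missing cell
area3d = conform(monthlypm25NCP, area, (/1,2/)) * valid

; sum(w*x) over lat/lon; dim_sum_n skips the missing products
wsum   = dim_sum_n(monthlypm25NCP * area3d, (/1,2/))

; sum(w) over the valid cells only
wtot   = dim_sum_n(area3d, (/1,2/))

; 12 area-weighted means (assumes wtot is nonzero for every month)
monthlypm25NCPwavg = wsum / wtot
print(monthlypm25NCPwavg+"")
```
</pre>
NCL also provides wgt_areaave2, which accepts 2D weights and may serve as a one-call alternative; its documentation describes how it treats missing values.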
</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Thanks again, and hope that this discussion is useful for other users in the future.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">best regards,</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Tabish </div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><font face="trebuchet ms, sans-serif" color="#666666">--------------------------------------------------------------------------------------<br><span style="background-color:rgb(255,255,255)">Dr Tabish Ansari</span></font></div><div><font face="trebuchet ms, sans-serif" color="#666666">Research Associate </font></div><div><font face="trebuchet ms, sans-serif" color="#666666">Air Quality Modelling Group</font></div><div><span style="background-color:rgb(255,255,255)"><span style="font-weight:normal"><font face="trebuchet ms, sans-serif" color="#666666">Research Institute for Sustainability (RIFS) - Helmholtz Centre Potsdam </font></span></span></div><div><font face="trebuchet ms, sans-serif" color="#666666">Potsdam, Germany</font></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 21 Aug 2024 at 17:13, Dave Allured - NOAA Affiliate <<a href="mailto:dave.allured@noaa.gov">dave.allured@noaa.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Tabish, the differing results may be caused by missing values in your array, resulting in unequal weighting of the remaining values in 2-step averaging. This is basic math, not a particular NCL behavior. 
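<br><br>A small NCL sketch of that arithmetic, using a made-up 2 x 3 array (the values and the -999 fill value are illustrative assumptions, not taken from the dataset). The row with missing cells contributes a full "vote" in the 2-step average even though it holds only one valid value:
<pre>
```ncl
; Toy 2 x 3 array: row 0 is complete, row 1 has two missing cells
x = (/ (/1.0, 2.0, 3.0/), (/10.0, -999.0, -999.0/) /)
x@_FillValue = -999.0

rowavg  = dim_avg_n(x, 1)          ; per-row means: 2 and 10
twostep = dim_avg_n(rowavg, 0)     ; (2 + 10) / 2 = 6
onestep = dim_avg_n(x, (/0,1/))    ; (1 + 2 + 3 + 10) / 4 = 4

print(twostep + "  vs  " + onestep)
```
</pre>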
Offhand I would say that the single step method is the only correct method here.<br><br>No, dim_avg_n never performs a weighted average, whether or not coordinate arrays are included. As the documentation says, missing values are properly excluded from each individual averaging calculation.<div><br></div><div>Perhaps someone else can comment on the best way to perform a spatial weighted average.<br><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 21, 2024 at 7:31 AM Tabish Ansari via ncl-talk <<a href="mailto:ncl-talk@mailman.ucar.edu" target="_blank">ncl-talk@mailman.ucar.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="font-family:'trebuchet ms',sans-serif">Hello,</div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">I've got a 3D variable called "monthlypm25NCP". It has 12 timesteps and 41 lat x 41 lon values. </div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Here's the variable summary:</div><div style="font-family:'trebuchet ms',sans-serif"><font color="#0b5394">Variable: monthlypm25NCP<br>Type: float<br>Total Size: 80688 bytes<br> 20172 values<br>Number of Dimensions: 3<br>Dimensions and sizes: [12] x [41] x [41]<br>Coordinates:<br>Number Of Attributes: 1<br> _FillValue : 9.96921e+36<br></font></div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Since this is a derived variable from another 3-hourly variable, the coordinate arrays are not retained. 
(I could have copied them over but I didn't in this instance.)</div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Now, I want to average this 3D variable over the lat-lon grid to reduce it to a 1D variable containing only 12 values (one for each month). </div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">I tried using dim_avg_n in two different ways to achieve this:</div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif"><b>1. In two steps: </b></div><div style="font-family:'trebuchet ms',sans-serif">monthlypm25NCPsum1= dim_avg_n(monthlypm25NCP, 2)<br>monthlypm25NCPsum = dim_avg_n(monthlypm25NCPsum1, 1)<br></div><div style="font-family:'trebuchet ms',sans-serif">print(monthlypm25NCPsum+"")<br></div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif"><font color="#0b5394">Result:</font></div><div style="font-family:'trebuchet ms',sans-serif"><font color="#0b5394">(0) 167.83<br>(1) 150.403<br>(2) 124.87<br>(3) 102.86<br>(4) 90.6969<br>(5) 80.4786<br>(6) 75.9811<br>(7) 71.1969<br>(8) 93.9213<br>(9) 117.72<br>(10) 136.6<br>(11) 139.528<br></font></div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif"><b>2. 
In a single step:</b></div><div style="font-family:'trebuchet ms',sans-serif">monthlypm25NCPsum = dim_avg_n(monthlypm25NCP, (/1,2/)) <br>print(monthlypm25NCPsum+"")<br></div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif"><font color="#0b5394">Result:</font></div><div style="font-family:'trebuchet ms',sans-serif"><font color="#0b5394">(0) 155.3<br>(1) 140.645<br>(2) 116.423<br>(3) 96.4202<br>(4) 84.4638<br>(5) 76.3392<br>(6) 72.5972<br>(7) 67.5716<br>(8) 88.2773<br>(9) 110.789<br>(10) 129.426<br>(11) 131.247</font><br></div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">I was expecting the results to be identical but strangely they're not, as you can see above. </div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Can you please explain what's causing the difference here? </div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Is it possible that in the second case, the dim_avg_n function is recognizing the lat-lon grid and using a weighted averaging based on actual grid area? But how can it recognize that when I have not included the coordinate arrays?</div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Ultimately, I do want to perform a weighted averaging over the lat-lon grid and have obtained a separate matrix that contains gridcell area (I used the cdo tool to obtain it). 
Should I do a sparse matrix multiplication with the gridcell area before performing the grid averaging in NCL or does the dim_avg_n function take care of the grid area itself?</div><div style="font-family:'trebuchet ms',sans-serif"><br></div><div style="font-family:'trebuchet ms',sans-serif">Thanks</div><div style="font-family:'trebuchet ms',sans-serif">Tabish</div></div>
</blockquote></div></div></div>
</blockquote></div>