[ncl-talk] Inconsistency in dimension averaging using dim_avg_n
Tabish Ansari
tabishumaransari at gmail.com
Wed Aug 21 09:58:48 MDT 2024
Hi Dave,
Thanks a lot for the insightful explanation. Yes, there are many missing
values within the dataset.
As a test, I averaged manually using three nested loops (one per
dimension), summing the non-missing values and counting the non-missing
instances. The final result matched the single-step dim_avg_n result (I
included coordinate arrays this time). This also confirmed that there is no
built-in area weighting in the dim_avg_n function.
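For readers following along, here is a minimal sketch of that check, written in plain Python rather than NCL so that the arithmetic is easy to run anywhere. The 2 x 3 grid and its values are made up for illustration and are not taken from the PM2.5 dataset; None stands in for _FillValue. It compares a nested-loop sum-and-count average (the single-step equivalent) with an average of per-row averages (the two-step equivalent).

```python
# Illustrative data only: a tiny 2 x 3 "lat x lon" slice; None marks a missing value.
grid = [
    [10.0, 20.0, 30.0],
    [40.0, None, None],
]


def mean_skip_missing(values):
    """Average the non-missing entries; return None if every entry is missing."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


# Single-step equivalent: nested loops that sum and count the non-missing cells.
total = 0.0
count = 0
for row in grid:
    for value in row:
        if value is not None:
            total += value
            count += 1
one_step = total / count

# Two-step equivalent: average along lon for each lat row, then average the row means.
row_means = [mean_skip_missing(row) for row in grid]
two_step = mean_skip_missing(row_means)

print(one_step)  # 25.0
print(two_step)  # 30.0
```

The second row has only one valid cell, yet in the two-step version its mean counts as much as the mean of the fully populated first row. That unequal weighting is why the two results differ whenever missing values are unevenly distributed.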
So, I then multiplied the data by a corresponding 2D array of weighting
fractions that I had generated separately (these were just normalized areas
per gridcell) before performing the single-step dim_avg_n operation.
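As a hedged illustration of area weighting when some cells are missing, the sketch below (again plain Python, not NCL) divides the weighted sum by the sum of the weights over the non-missing cells only. The function name weighted_mean_skip_missing, the grid, and the area values are all hypothetical. This is one common way to form such an average, not necessarily the exact procedure used above.

```python
def weighted_mean_skip_missing(values, weights):
    """Weighted mean over cells whose value is present.

    The weights are renormalised over the valid cells only, so missing cells
    contribute neither to the numerator nor to the denominator.
    Returns None if no cell is valid.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value_row, weight_row in zip(values, weights):
        for value, weight in zip(value_row, weight_row):
            if value is not None:
                weighted_sum += weight * value
                weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


# Illustrative data only: None marks a missing value.
grid = [
    [10.0, 20.0, 30.0],
    [40.0, None, None],
]
# Hypothetical gridcell areas: the second row's cells are twice as large.
area = [
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0],
]

result = weighted_mean_skip_missing(grid, area)
print(result)  # 28.0
```

Note that an unweighted average of weight * value divides by the number of valid cells rather than by the sum of their weights, so depending on how the fractions were normalized it may need an extra rescaling. Dividing by the sum of the valid weights avoids that question.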
Thanks again, and I hope that this discussion will be useful for other users
in the future.
best regards,
Tabish
--------------------------------------------------------------------------------------
Dr Tabish Ansari
Research Associate
Air Quality Modelling Group
Research Institute for Sustainability (RIFS) - Helmholtz Centre Potsdam
Potsdam, Germany
On Wed, 21 Aug 2024 at 17:13, Dave Allured - NOAA Affiliate <
dave.allured at noaa.gov> wrote:
> Tabish, the differing results may be caused by missing values in your
> array, resulting in unequal weighting of the remaining values in 2-step
> averaging. This is basic math, not a particular NCL behavior. Offhand I
> would say that the single step method is the only correct method here.
>
> No, dim_avg_n never performs a weighted average, whether or not coordinate
> arrays are included. As the documentation says, missing values are
> properly excluded from each individual averaging calculation.
>
> Perhaps someone else can comment on the best way to perform a spatial
> weighted average.
>
>
> On Wed, Aug 21, 2024 at 7:31 AM Tabish Ansari via ncl-talk <
> ncl-talk at mailman.ucar.edu> wrote:
>
>> Hello,
>>
>> I've got a 3D variable called "monthlypm25NCP". It has 12 timesteps and
>> 41 lat x 41 lon values.
>>
>> Here's the variable summary:
>> Variable: monthlypm25NCP
>> Type: float
>> Total Size: 80688 bytes
>> 20172 values
>> Number of Dimensions: 3
>> Dimensions and sizes: [12] x [41] x [41]
>> Coordinates:
>> Number Of Attributes: 1
>> _FillValue : 9.96921e+36
>>
>> Since this is a derived variable from another 3-hourly variable, the
>> coordinate arrays are not retained. (I could have copied them over but I
>> didn't in this instance.)
>>
>> Now, I want to average this 3D variable over the lat-lon grid to reduce
>> it to a 1D variable containing only 12 values (one for each month).
>>
>> I tried using the dim_avg_n in two different ways to achieve this:
>>
>> *1. In two steps:*
>> monthlypm25NCPsum1= dim_avg_n(monthlypm25NCP, 2)
>> monthlypm25NCPsum = dim_avg_n(monthlypm25NCPsum1, 1)
>> print(monthlypm25NCPsum+"")
>>
>> Result:
>> (0) 167.83
>> (1) 150.403
>> (2) 124.87
>> (3) 102.86
>> (4) 90.6969
>> (5) 80.4786
>> (6) 75.9811
>> (7) 71.1969
>> (8) 93.9213
>> (9) 117.72
>> (10) 136.6
>> (11) 139.528
>>
>> *2. In a single step:*
>> monthlypm25NCPsum = dim_avg_n(monthlypm25NCP, (/1,2/))
>> print(monthlypm25NCPsum+"")
>>
>> Result:
>> (0) 155.3
>> (1) 140.645
>> (2) 116.423
>> (3) 96.4202
>> (4) 84.4638
>> (5) 76.3392
>> (6) 72.5972
>> (7) 67.5716
>> (8) 88.2773
>> (9) 110.789
>> (10) 129.426
>> (11) 131.247
>>
>> I was expecting the results to be identical but strangely they're not, as
>> you can see above.
>>
>> Can you please explain what's causing the difference here?
>>
>> Is it possible that in the second case, the dim_avg_n function is
>> recognizing the lat-lon grid and using a weighted averaging based on actual
>> grid area? But how can it recognize that when I have not included the
>> coordinate arrays?
>>
>> Ultimately, I do want to perform a weighted averaging over the lat-lon
>> grid and have obtained a separate matrix that contains gridcell area (I
>> used the cdo tool to obtain it). Should I do a sparse matrix multiplication
>> with the gridcell area before performing the grid averaging in NCL or does
>> the dim_avg_n function take care of the grid area itself?
>>
>> Thanks
>> Tabish