[ncl-talk] Calculate percentile

Giorgio Graffino g.graffino at tim.it
Wed Mar 16 07:16:15 MDT 2022


Thanks Jonathan and Dave for your help.

I think I've found a solution to my problem. The function below computes 
the percentile given by the user. Nothing fancy, but it serves the 
purpose.

undef("computePercentile")
function computePercentile (x:numeric, perc:numeric)
; Compute custom percentile (%) from 2-D field (adapted from the 
function stat_dispersion)
;
local statx, work, nwork, numwork
begin
if (.not.all(ismissing(x))) then
  work = ndtooned(x)
  nwork = dimsizes(work)
  qsort(work) ; reordered from lowest to highest
  numwork = dim_num(.not.ismissing(work))
  statx = work( toint((perc/100.)*numwork-1) )
else
  statx = x at _FillValue
end if
return(statx)
delete(work)
delete(nwork)
delete(numwork)
end
Cheers,

Giorgio


    ------ Messaggio Originale ------
    Da: dave.allured at noaa.gov
    A: g.graffino at tim.it
Cc: ncl-talk at ucar.edu
    Inviato: martedì 15 marzo 2022 16:50
    Oggetto: Re: [ncl-talk] Calculate percentile


Georgio, you said you want to compute the 70th percentile month by 
month.  I recommend a single outer loop over time, and do all the 
percentile calculations independently inside the time loop.  Sort the 
data for only one month at a time, excluding missing values as needed; 
get the 70th percent index within that month; and that gives you the 
70th percentile for the current month.


As Jonathan said, you need to use ndtooned() to reshape the input data 
to 1-D for input to NCL's sort routines.  Do this independently for each 
month, inside the time loop.


A long time ago, NCL's dim_pqsort function was faster than the qsort 
function, due to internal coding.  If your data set is large, you might 
want to check on which function is currently faster.



On Tue, Mar 15, 2022 at 7:02 AM Buzan, Jonathan via ncl-talk 
<ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu> > wrote:

  Watchout, there’s a typo. The indexes need to match the lat and lon.


mastervar(j,k) = work( min((/nwork-1, nwork*70/100-1 /)) ) ; 70%





Change to:

mastervar(k,j) = work( min((/nwork-1, nwork*70/100-1 /)) ) ; 70%






On Mar 15, 2022, at 13:55, Buzan, Jonathan via ncl-talk 
<ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu> > wrote:



  Hi Giorgio,


The code for the stat_dispersion is in 
$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl

(Search for stat_dispersion for the explicit function)



Here are some snippets to help. Ultimately, the stat_dispersion operates 
on 1d fields, and you have to put your data into that format. Highly 
recommend that you have more than 100 data points in your time dimension 
or the function will not work.



mastervar = ; new array that is the stat dispersion by lat by lon

do k = 0, dimsizes(lat)-1

do j = 0, dimsizes(lon)-1

z = data(:,k,j)
  work = ndtooned(z)
nwork = dimsizes(work)

qsort(work) ; reordered from lowest to highest

mastervar(j,k) = work( min((/nwork-1, nwork*70/100-1 /)) ) ; 70%

delete(z)

delete(work)

delete(nwork)


end do


end do




Cheers,

-Jonathan





On Mar 15, 2022, at 13:25, Giorgio Graffino via ncl-talk 
<ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu> > wrote:


Dear NCLers,
I have a time,lat,lon monthly-mean SST field and I want to compute the 
70th percentile month by month. I'm trying to adapt Dennis's suggestion 
given here 
(https://www.ncl.ucar.edu/Support/talk_archives/2013/0954.html 
<https://www.ncl.ucar.edu/Support/talk_archives/2013/0954.html> ),  but 
I'm ending up with a single 2-D percentile field, calculated across the 
entire time period. The problem is that, since it's historical SST, most 
of the spatial domain is above the threshold at the end of the time 
period, because of global warming.

  I'd like to have something similar to stat_dispersion 
(https://www.ncl.ucar.edu/Document/Functions/Contributed/stat_dispersion.shtml 
<https://www.ncl.ucar.edu/Document/Functions/Contributed/stat_dispersion.shtml> 
),  but for a custom percentile and without detrending the data. Is 
there a way to do that? Here is a snippet of my code:
temp_sort = dim_pqsort_n(temp,2,0) ; (time, lat, lon) -> (time, lat, 
lon)

temp_num = dim_num_n(.not.ismissing(temp_sort),0) ; (time, lat, lon) -> 
(lat, lon)
do n = 0, ntim-1
temp_warmest30(n,:,:) = 
where(temp_sort(n,:,:).gt.floor(0.7*temp_num),temp(n,:,:),temp at _FillValue) 
; (time, lat, lon) -> (time, lat, lon)
end do
Thanks,
Giorgio _______________________________________________
  ncl-talk mailing list
  ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu>
  List instructions, subscriber options, unsubscribe:
  https://mailman.ucar.edu/mailman/listinfo/ncl-talk 
<https://mailman.ucar.edu/mailman/listinfo/ncl-talk>









-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20220316/746f077c/attachment.html>


More information about the ncl-talk mailing list