[ncl-talk] Proper percentile function

Mon Mar 23 13:03:16 MDT 2020

Hello,

I am uncertain if this is proper mail thread to submit contributions. If
not, please direct me to where I could send the following contribution.

There is no proper computation of percentiles in NCL presently. Some users
have created functions that loosly estimate percentiles based on index,
however these are not accurate for smaller data arrays. Additionally, there
is no support for missing values when computing percentiles in NCL.

Included below is a function that accurately determines percentiles for
small and large data arrays, and also supports missing values. This
function has been tested against other percentile computing software, and
shows identical results between them. I would like to recommend the
inclusion of this function in future releases of NCL.

; Percentile function
; For 1D X variables and -9999. missing values only
undef("Percentile")
function Percentile(x:numeric,P:float)
begin
; Convert incoming percentile value to percent
if(P.gt.0) then
 P = P/100.
end if
; Set and count the missing values, if any
x at _FillValue = -9999.
nm = num(ismissing(x))
; Sort values
ib = dim_pqsort(x,2)
; Remove missing value indices from sorted array
y = x(nm:dimsizes(x)-1)
; For determining percentiles, we need to know if the resultant
; index will be an integer or float. With the multiplication of P,
; the array index is automatically set to be float, and we lose all
; information for what the "true" index would be. So, we introduce
; two index values and a difference to determine whether the "true"
; index would be a float or integer value. A conditional check,
; determines if the difference is zero, i.e., an integer, or if the
; difference is not zero, i.e., a float. Float index values are the
; percentile value, however integer index values must be averaged
; over the n and n+1 indexes to determine the percentile.
; *Note: index arrays are subtracted by 1, as the index begins at
;        0, and not at 1.
; Index integer for value
index = toint(ceil(P*dimsizes(y)))-1
; Second index float for value
index2 = P*dimsizes(y)-1
; Index difference
i_diff = index-index2
if (i_diff.ne.0) then
  return(y(index))
else
  y_a = (y(index)+y(index+1))/2
  return(y_a)
end if
end

Cheers,
-------------------------------------------------------------
Brandon J Fisel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20200323/8f3dfe6c/attachment.html>