[ncl-talk] Query about probability density functions

Fri Sep 13 09:25:20 MDT 2024

I've found a way to normalize the PDF and make it like Python and IDL. 
Here is the modified function for reference. I removed the call to the 
Fortran subroutine, so now the function it's a bit slower than the 
original version, and made all values float instead of double.
undef("pdfx_norm")
function pdfx_norm(x:numeric, nbin[1]:integer, opt:logical)
local nGood, nbins, xMin, xMax, mnmxint, xSpace \
    ,bin, pdf, nTot, XMIN, XMAX, nLoOut, nHiOut
begin
 if (nbin.le.2) then
     nbins = 25             ; default 
 else
     nbins = nbin
 end if
 nGood = num(.not.ismissing(x))
 if (nGood.lt.3) then
     print("pdfx_norm: nGood="+nGood+" : Need more 
non-missing points")
     pdf = new( nbins, "double", getFillValue(x))
     return( pdf )
 end if
 XMIN = min(x)*1.
 XMAX = max(x)*1.
 xMin = 0.
 xMax = 0.
 if (opt .and. isatt(opt,"bin_min")) then  
     xMin = opt at bin_min                    ; user set
 else
     xMin = XMIN                           ; calculate
 end if
 if (opt .and. isatt(opt,"bin_max")) then  
     xMax = opt at bin_max                    ; user set
 else
     xMax = XMAX                           ; calculate
 end if
 if (opt .and. isatt(opt,"bin_nice")) then ; nice xMin, xMax
     outside = False
     if (isatt(opt,"bin_nice_outside")) then
         outside = opt at bin_nice_outside
     end if
     mnmxint = nice_mnmxintvl( XMIN, XMAX, nbins, outside)
     xMin    = mnmxint(0)
     xMax    = mnmxint(1)
     xSpace  = mnmxint(2)
   ;;nbins   = round( (xMax-xMin)/xSpace , 3) 
 end if
;;dbin        = (xMax-xMin)/(nbins-1)          ; 5.2.0   
 dbin        = (xMax-xMin)/nbins   
 binBound    = xMin + ispan(0,nbins,1)*dbin
 binBound(nbins) = xMax                       ; avoid roundoff           

 binCenter   = (binBound(0:nbins-1) + binBound(1:nbins))*0.5
 binBoundMin = binBound(0)           
 binBoundMax = binBound(nbins)      
 pdf         = new( nbins,typeof(x), getFillValue(x))
 pdf         = 0.

do nb=0,nbins-1      
  pdf(nb) = num( x.ge.binBound(nb) .and. x.lt.binBound(nb+1) )
  if (nb.eq.(nbins-1)) then                 ; last bin 
      pdf(nb) = pdf(nb) + num( x.eq.binBound(nb+1) )   ; include last 
bound
  end if
end do
;;----
 pdf!0    = "
 pdf&x    = binCenter
                                      ; max possible in data     
 nMax     = num(x.ge.XMIN .and. x.le.XMAX)
                                      ; actual number used
 nUse     = num(x.ge.binBoundMin .and. x.le.binBoundMax)
 nLoOut   = num(x.lt.binBoundMin)     ; number outliers
 nHiOut   = num(x.gt.binBoundMax)

;  pdf      = pdf/nMax*1.e2                      ; original 
percent-frequency PDF
 pdf      = pdf/sum(pdf)/dbin                 ; modified normalised PDF

 pdf at bin_center     = binCenter
 pdf at bin_bounds     = binBound
 pdf at bin_bound_min  = binBoundMin
 pdf at bin_bound_max  = binBoundMax
 pdf at bin_spacing    = dbin            ; binBound(2)-binBound(1)
 pdf at nbins          = nbins
 pdf at nMax           = nMax
 pdf at nUse           = nUse
 if (nLoOut.gt.0 .or. nHiOut.gt.0) then
     pdf at nLoOut     = nLoOut       
     pdf at nHiOut     = nHiOut
 end if
 pdf at long_name      = "
 if (isatt(x,"long_name")) then
     pdf at long_name  = "
 end if

 return( pdf )
end

------ Messaggio Originale ------
Da: ncl-talk at mailman.ucar.edu
A: dave.allured at noaa.gov Cc: ncl-talk at ucar.edu
Inviato: mercoledì, settembre 11º 2024, 05:35 PM
Oggetto: Re: [ncl-talk] Query about probability density functions

Thanks Dave,

The Fortran code looks straightforward. The last bit is for converting 
the pdf into frequency ("if flag is set", according to the 
comment), but the same code is also present in the contributed.ncl code. 
I'm assuming the flag isn't set, otherwise the conversion would be done 
twice. 

Unfortunately I'm not getting any closer to solve this problem. Simply 
dividing the NCL pdfs by 100 make them much smaller than Python or IDL 
pdfs. I don't know if anyone ever encountered this issue before.

Cheers,
Giorgio

------ Messaggio Originale ------
Da: dave.allured at noaa.gov
A: g.graffino at tim.it Cc: ncl-talk at ucar.edu
Inviato: mercoledì, settembre 11º 2024, 04:58 PM
Oggetto: Re: [ncl-talk] Query about probability density functions

Giorgio, the fortran routine is at ni/src/lib/nfpfort/xy1pdf77.f in the 
NCL v6.6.2 source code.

On Wed, Sep 11, 2024 at 8:01 AM Giorgio Graffino via ncl-talk 
<ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu> > wrote:

Hi Dennis,
Thank you for pointing out that detail.
I've found the code of the pdfx function in here 
(https://github.com/NCAR/ncl/blob/develop/ni/src/examples/gsun/contributed.ncl 
<https://github.com/NCAR/ncl/blob/develop/ni/src/examples/gsun/contributed.ncl> 
) and the frequency is computed as pdf = 100d0*pdf/nMax. I tried to 
normalize the pdf by multiplying the frequency pdf by the number of data 
and divide by 100, but the results is very different depending on the 
number of data to compute the pdf. Also, the results are different from 
similar computations using Python, especially with large number of data. 
Without knowing what the Fortran routine does (and I can't find it), I 
don't know how to normalize NCL to match Python.
Any suggestions?
Cheers,
Giorgio

------ Messaggio Originale ------
Da: shea at ucar.edu <mailto:shea at ucar.edu>
A: g.graffino at tim.it <mailto:g.graffino at tim.it>  Cc: ncl-talk at ucar.edu 
<mailto:ncl-talk at ucar.edu>
Inviato: lunedì, settembre 9º 2024, 08:15 PM
Oggetto: Re: [ncl-talk] Query about probability density functions

 From the pdfx 
<https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml> 
documentation:  "The PDF units are percent [%]."

Divide by 100.

On Mon, Sep 9, 2024 at 10:26 AM Giorgio Graffino via ncl-talk 
<ncl-talk at mailman.ucar.edu <mailto:ncl-talk at mailman.ucar.edu> > wrote:

Hello NCL experts,
I'm using pdfx to compute probability density functions 
(https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml 
<https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml> ). 
I noticed that the cumulative density functions sum up to 100 in NCL, 
instead of 1 as per definition. Why is that? How can I normalize them 
back to 1?

Thanks for your help.
Cheers,
Giorgio

_______________________________________________
ncl-talk mailing list
ncl-talk at mailman.ucar.edu
List instructions, subscriber options, unsubscribe:
https://mailman.ucar.edu/mailman/listinfo/ncl-talk 
<https://mailman.ucar.edu/mailman/listinfo/ncl-talk>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20240913/e6755e08/attachment.htm>