<p><span style="font-family:Arial;font-size:12pt;">Hi Dave,</span></p><p> </p><p><span style="font-family:Arial;font-size:12pt;">Unfortunately I have no idea. I think Python and IDL functions do a kind of normalisation similar to pdf = pdf/sum(pdf)/dbin, but I don't know whether they use a different formula than NCL.</span><br><br> </p><blockquote><p>------ Messaggio Originale ------<br>Da: dave.allured@noaa.gov<br>A: g.graffino@tim.it Cc: ncl-talk@ucar.edu<br>Inviato: venerdì, settembre 13º 2024, 05:42 PM<br>Oggetto: Re: [ncl-talk] Query about probability density functions<br> </p><div dir="ltr"><div>Giorgio, can you briefly explain what are the basic numerical differences between the NCL and Python/IDL versions?  Please explain mathematically, not with code.</div><div dir="ltr"> </div><p><br> </p><div class="gmail_quote"><div class="gmail_attr" dir="ltr">On Fri, Sep 13, 2024 at 9:25 AM Giorgio Graffino <<a href="mailto:g.graffino@tim.it"><span class="wt_Email">g.graffino@tim.it</span></a>> wrote:<br> </div><blockquote class="gmail_quote" style="border-left-color:rgb(204,204,204);border-left-style:solid;border-left-width:1.0px;margin:0.0px 0.0px 0.0px 0.8ex;padding-left:1.0ex;"><p><span style="font-family:Arial;font-size:12.0pt;">I've found a way to normalize the PDF and make it like Python and IDL. </span><br><br><span style="font-family:Arial;font-size:12.0pt;">Here is the modified function for reference. I removed the call to the Fortran subroutine, so now the function it's a bit slower than the original version, and made all values float instead of double.</span><br><br>undef("pdfx_norm")<br>function pdfx_norm(x:numeric, nbin[1]:integer, opt:logical)<br>local nGood, nbins, xMin, xMax, mnmxint, xSpace \<br>    ,bin, pdf, nTot, XMIN, XMAX, nLoOut, nHiOut<br>begin</p><p> if (nbin.le.2) then<br>     nbins = 25             ; default <br> else<br>     nbins = nbin<br> end if</p><p> nGood = num(.not.ismissing(x))<br> if (nGood.lt.3) then<br>     print("pdfx_norm: nGood="+nGood+" : Need more non-missing points")<br>     pdf = new( nbins, "double", getFillValue(x))<br>     return( pdf )<br> end if</p><p> XMIN = min(x)*1.<br> XMAX = max(x)*1.</p><p> xMin = 0.<br> xMax = 0.</p><p> if (opt .and. isatt(opt,"bin_min")) then  <br>     xMin = opt@bin_min                    ; user set<br> else<br>     xMin = XMIN                           ; calculate<br> end if</p><p> if (opt .and. isatt(opt,"bin_max")) then  <br>     xMax = opt@bin_max                    ; user set<br> else<br>     xMax = XMAX                           ; calculate<br> end if</p><p> if (opt .and. isatt(opt,"bin_nice")) then ; nice xMin, xMax<br>     outside = False<br>     if (isatt(opt,"bin_nice_outside")) then<br>         outside = opt@bin_nice_outside<br>     end if<br>     mnmxint = nice_mnmxintvl( XMIN, XMAX, nbins, outside)<br>     xMin    = mnmxint(0)<br>     xMax    = mnmxint(1)<br>     xSpace  = mnmxint(2)<br>   ;;nbins   = round( (xMax-xMin)/xSpace , 3) <br> end if</p><p>;;dbin        = (xMax-xMin)/(nbins-1)          ; 5.2.0   <br> dbin        = (xMax-xMin)/nbins   <br> binBound    = xMin + ispan(0,nbins,1)*dbin<br> binBound(nbins) = xMax                       ; avoid roundoff              <br> binCenter   = (binBound(0:nbins-1) + binBound(1:nbins))*0.5</p><p> binBoundMin = binBound(0)           <br> binBoundMax = binBound(nbins)      </p><p> pdf         = new( nbins,typeof(x), getFillValue(x))<br> pdf         = 0.<br> <br>do nb=0,nbins-1      <br>  pdf(nb) = num( x.ge.binBound(nb) .and. x.lt.binBound(nb+1) )<br>  if (nb.eq.(nbins-1)) then                 ; last bin <br>      pdf(nb) = pdf(nb) + num( x.eq.binBound(nb+1) )   ; include last bound<br>  end if<br>end do<br>;;----</p><p> pdf!0    = "x"<br> pdf&x    = binCenter<br>                                      ; max possible in data     <br> nMax     = num(x.ge.XMIN .and. x.le.XMAX)<br>                                      ; actual number used<br> nUse     = num(x.ge.binBoundMin .and. x.le.binBoundMax)</p><p> nLoOut   = num(x.lt.binBoundMin)     ; number outliers<br> nHiOut   = num(x.gt.binBoundMax)<br> <br>;  pdf      = pdf/nMax*1.e2                      ; original percent-frequency PDF<br> pdf      = pdf/sum(pdf)/dbin                 ; modified normalised PDF<br> <br> pdf@bin_center     = binCenter<br> pdf@bin_bounds     = binBound<br> pdf@bin_bound_min  = binBoundMin<br> pdf@bin_bound_max  = binBoundMax<br> pdf@bin_spacing    = dbin            ; binBound(2)-binBound(1)<br> pdf@nbins          = nbins<br> pdf@nMax           = nMax<br> pdf@nUse           = nUse<br> if (nLoOut.gt.0 .or. nHiOut.gt.0) then<br>     pdf@nLoOut     = nLoOut       <br>     pdf@nHiOut     = nHiOut<br> end if</p><p> pdf@long_name      = "PDF"<br> if (isatt(x,"long_name")) then<br>     pdf@long_name  = "PDF: "+x@long_name<br> end if<br> <br> return( pdf )<br>end<br> </p><blockquote><p>------ Messaggio Originale ------<br>Da: <a href="mailto:ncl-talk@mailman.ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@mailman.ucar.edu</span></a><br>A: <a href="mailto:dave.allured@noaa.gov" target="_blank"><span class="wt_Email">dave.allured@noaa.gov</span></a> Cc: <a href="mailto:ncl-talk@ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@ucar.edu</span></a><br>Inviato: mercoledì, settembre 11º 2024, 05:35 PM<br>Oggetto: Re: [ncl-talk] Query about probability density functions<br> </p><p><span style="font-family:Arial;font-size:12.0pt;">Thanks Dave,</span></p><p>The Fortran code looks straightforward. The last bit is for converting the pdf into frequency ("if flag is set", according to the comment), but the same code is also present in the contributed.ncl code. I'm assuming the flag isn't set, otherwise the conversion would be done twice. </p><p>Unfortunately I'm not getting any closer to solve this problem. Simply dividing the NCL pdfs by 100 make them much smaller than Python or IDL pdfs. I don't know if anyone ever encountered this issue before.</p><p>Cheers,</p><p>Giorgio<br> </p><blockquote><p>------ Messaggio Originale ------<br>Da: <a href="mailto:dave.allured@noaa.gov" target="_blank"><span class="wt_Email">dave.allured@noaa.gov</span></a><br>A: <a href="mailto:g.graffino@tim.it" target="_blank"><span class="wt_Email">g.graffino@tim.it</span></a> Cc: <a href="mailto:ncl-talk@ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@ucar.edu</span></a><br>Inviato: mercoledì, settembre 11º 2024, 04:58 PM<br>Oggetto: Re: [ncl-talk] Query about probability density functions<br> </p><div dir="ltr"><div>Giorgio, the fortran routine is at <strong>ni/src/lib/nfpfort/xy1pdf77.f</strong> in the NCL v6.6.2 source code.</div><div><br> </div><div class="gmail_quote"><div class="gmail_attr" dir="ltr">On Wed, Sep 11, 2024 at 8:01 AM Giorgio Graffino via ncl-talk <<a href="mailto:ncl-talk@mailman.ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@mailman.ucar.edu</span></a>> wrote:<br> </div><blockquote class="gmail_quote" style="border-left-color:rgb(204,204,204);border-left-style:solid;border-left-width:1.0px;margin:0.0px 0.0px 0.0px 0.8ex;padding-left:1.0ex;"><p><span style="font-family:Arial;font-size:12.0pt;">Hi Dennis,</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Thank you for pointing out that detail.</span></p><p><span style="font-family:Arial;font-size:12.0pt;">I've found the code of the pdfx function in here (</span><a target="_blank" rel="noopener noreferrer" href="https://github.com/NCAR/ncl/blob/develop/ni/src/examples/gsun/contributed.ncl"><span style="font-family:Arial;font-size:12.0pt;">https://github.com/NCAR/ncl/blob/develop/ni/src/examples/gsun/contributed.ncl</span></a><span style="font-family:Arial;font-size:12.0pt;">) and the frequency is computed as pdf = 100d0*pdf/nMax. I tried to normalize the pdf by multiplying the frequency pdf by the number of data and divide by 100, but the results is very different depending on the number of data to compute the pdf. Also, the results are different from similar computations using Python, especially with large number of data. Without knowing what the Fortran routine does (and I can't find it), I don't know how to normalize NCL to match Python.</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Any suggestions?</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Cheers,</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Giorgio</span><br><br> </p><blockquote><p>------ Messaggio Originale ------<br>Da: <a href="mailto:shea@ucar.edu" target="_blank"><span class="wt_Email">shea@ucar.edu</span></a><br>A: <a href="mailto:g.graffino@tim.it" target="_blank"><span class="wt_Email">g.graffino@tim.it</span></a> Cc: <a href="mailto:ncl-talk@ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@ucar.edu</span></a><br>Inviato: lunedì, settembre 9º 2024, 08:15 PM<br>Oggetto: Re: [ncl-talk] Query about probability density functions<br> </p><div dir="ltr"><div>From the <a target="_blank" rel="noopener noreferrer" href="https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml"><strong>pdfx</strong></a> documentation:  "The PDF units are percent [%]."</div><div> </div><div>Divide by 100.<br> </div></div><p> </p><div class="gmail_quote"><div class="gmail_attr" dir="ltr">On Mon, Sep 9, 2024 at 10:26 AM Giorgio Graffino via ncl-talk <<a href="mailto:ncl-talk@mailman.ucar.edu" target="_blank"><span class="wt_Email">ncl-talk@mailman.ucar.edu</span></a>> wrote:<br> </div><blockquote class="gmail_quote" style="border-left-color:rgb(204,204,204);border-left-style:solid;border-left-width:1.0px;margin:0.0px 0.0px 0.0px 0.8ex;padding-left:1.0ex;"><p><span style="font-family:Arial;font-size:12.0pt;">Hello NCL experts,</span></p><p><span style="font-family:Arial;font-size:12.0pt;">I'm using pdfx to compute probability density functions (</span><a target="_blank" rel="noopener noreferrer" href="https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml"><span style="font-family:Arial;font-size:12.0pt;">https://www.ncl.ucar.edu/Document/Functions/Contributed/pdfx.shtml</span></a><span style="font-family:Arial;font-size:12.0pt;">). I noticed that the cumulative density functions sum up to 100 in NCL, instead of 1 as per definition. Why is that? How can I normalize them back to 1?</span><br> </p><p><span style="font-family:Arial;font-size:12.0pt;">Thanks for your help.</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Cheers,</span></p><p><span style="font-family:Arial;font-size:12.0pt;">Giorgio</span></p></blockquote></div></blockquote></blockquote></div></div></blockquote></blockquote></blockquote></div></div></blockquote>