[ncl-talk] PDF plot changes with bin_nice and without.

Sri nandini bax8609 at uni-hamburg.de
Thu Jun 11 07:57:59 MDT 2020


Hello dear ncl-users,

I have yet another question regarding my pdf plotting.

Does the PDF function ignore missing values or does one have to remove 
it? I could not find this information on the function documentation?

my aim is to show what the detrended global sea surface height PDF looks 
like; on 100 model ensembles, as well as the statistics for skewness and 
kurtosis.However, the PDF looks very strange indeed, even though i have 
tried different methods, and specifiying different bins as well as 
removing missing values.

I also used another method is to first remove all missing values, find 
gridboxes where at least 50% of data over entire period and use for 
masking, detrended and then calculating PDFs, but i have the similar PDF 
plot: attached here. The PDF without detrending looks very normal- that 
is also attached here.

Would you know of another method or why the detrended PDF shows such 
high kurtosis and a supergaussian behaviour? I have no errors and 
everything seems to work, i simply dont understand why i end up with 
this strange PDF shape when i used detrend, and why i get a different 
PDF shape on normal data.

I would be very grateful for any advice. I can also send my data file (s).

Here is my script:


begin
    f     = addfile ("anom_hist_zo_slr.nc", "r")
    hist_anom    = f->hist_anom({1192152:1366728},:,{-88:88},:) 
;(:,:,{-90:90},{-180:180})
    printVarSummary(hist_anom)
    dimx = dimsizes(hist_anom)
    ntim = dimx(0)          ; 240
    nens = dimx(1)          ; 100
    nlat = dimx(2)          ; 45
    mlon = dimx(3)          ; 90
    hist_anom at _FillValue = -9.96921e+36

    f1     = addfile ("anom_rcp45_zo_slr.nc", "r")
    rcp45_anom   = f1->rcp45_anom({648672:823248},:,{-88:88},:)
    printVarSummary(rcp45_anom)
    rcp45_anom at _FillValue = -9.96921e+36
    dimx1 = dimsizes(rcp45_anom)
    ntim1 = dimx1(0)          ; 240
    nens1 = dimx1(1)          ; 100
    nlat1 = dimx1(2)          ; 45
    mlon1 = dimx1(3)          ; 90
;   hist_anom at _FillValue = -9.96921e+36

    f2     = addfile ("anom_rcp85_zo_slr.nc", "r")
    rcp85_anom   = f2->rcp85_anom({648672:823248},:,{-88:88},:)
    printVarSummary(rcp85_anom)
    rcp85_anom at _FillValue = -9.96921e+36
          dimx2 = dimsizes(rcp85_anom)
    ntim2 = dimx2(0)          ; 240
    nens2 = dimx2(1)          ; 100
    nlat2 = dimx2(2)          ; 45
    mlon2 = dimx2(3)          ; 90
  ;  hist_anom at _FillValue = -9.96921e+36


;Use dtrend_msg or dtrend_msg_n if missing values are present.
; =============================================================
; optional: find gridboxes where at least 50% of data over entire period 
and use for masking when calculating PDFs

; Consider x[*] with _FillValue attribute: (a) count the number of 
_FillValue (missing values); (b) extract only the non-missing values.
; (1) nmsg = num(ismissing(x))   ; count number of missing
; (2) igood = ind(.not.ismissing(x)), xgood = x(igood), print(xgood);
; =============================================================
  nomiss=dim_num_n(.not.ismissing(hist_anom({1192152:1366728},:,{-88:88},:) ),0)  ; count missing values at each location
    printVarSummary(nomiss)
    data50_hist=0.5*dimsizes(dimx)              ; threshold #of data 
equiv 50%
    printVarSummary(data50_hist)

    hist_anom  = mask(hist_anom, (nomiss.ge.data50_hist), True)
    printVarSummary(hist_anom)

    ;ihist = ind(.not.ismissing(hist_anom({1192152:1366728},:,{-88:88},:)))
    ;xhist = hist_anom(ihist)
    ;print(xhist);

    hist_anom1= dtrend_n(hist_anom,False,0)
    printVarSummary(hist_anom1)
    ;print(hist_anom1)
nomiss1=dim_num_n(.not.ismissing(rcp45_anom({648672:823248},:,{-88:88},:) 
),0)  ; count missing values at each location
     printVarSummary(nomiss1)
     data50_rcp45=0.5*dimsizes(dimx1)              ; threshold #of data 
equiv 50%
     printVarSummary(data50_rcp45)

     rcp45_anom  = mask(rcp45_anom, (nomiss1.ge.data50_rcp45), True)
     printVarSummary(rcp45_anom)
     rcp45_anom1= dtrend_n(rcp45_anom,False,0)
     printVarSummary(rcp45_anom1)

nomiss2=dim_num_n(.not.ismissing(rcp85_anom({648672:823248},:,{-88:88},:) 
),0)  ; count missing values at each location
      printVarSummary(nomiss2)
      data50_rcp85=0.5*dimsizes(dimx2)              ; threshold #of data 
equiv 50%
      printVarSummary(data50_rcp85)

      rcp85_anom  = mask(hist_anom, (nomiss2.ge.data50_hist), True)
      printVarSummary(rcp85_anom)

      rcp85_anom1= dtrend_n(rcp85_anom,False,0)
      printVarSummary(rcp85_anom1)
print("masking active, only using grid boxes with >50% data over entire 
period")

;=================================================================

   anom1_1d = ndtooned(hist_anom1)                     ;to carry out the 
dim_stat4_n on one dimension.
   printVarSummary(anom1_1d)

   anom2_1d = ndtooned(rcp45_anom1)
   printVarSummary(anom2_1d)
;print(anom2_1d)
   anom3_1d = ndtooned(rcp85_anom1)
   printVarSummary(anom3_1d)
;print(anom1_1d)

;  anom1_1d(ind(ismissing(anom1_1d))) = -9999               ;;(too many 
missing values - remove)
;  anom2_1d(ind(ismissing(anom2_1d))) = -9999
;=================================================================
   TTstat1 = dim_stat4(anom1_1d)
   printVarSummary(TTstat1)

   TTstat2 = dim_stat4(anom2_1d)
   printVarSummary(TTstat2)
       print("success")
   TTstat3 = dim_stat4(anom3_1d)
   printVarSummary(TTstat3)
     print("success")

  opt=True
   statdis1 = stat_dispersion(anom1_1d,opt)
   printVarSummary(statdis1)
   statdis2 = stat_dispersion(anom2_1d,opt)
   printVarSummary(statdis2)
   statdis3 = stat_dispersion(anom3_1d,opt)
   printVarSummary(statdis3)

;==================================================================
  ; percentiles (requires sorting the array)

      print("started percentiles")
;================================================================================
; calculate empirical PDF
; Using a larger number of bins can make your histogram more detailed, 
but that will also decrease the height of a histogram (note the y-axis 
values): here using bin minmax=25 creates smooth PDF shape.
             opt          = True
   opt at bin_min = -25.
   opt at bin_max =  25.
   opt at bin_nice=True
   pdf1 = pdfx(hist_anom1, 100, opt)
   printVarSummary(pdf1)
   pdf2 = pdfx(rcp45_anom1, 100, opt)
   printVarSummary(pdf2)
   pdf3 = pdfx(rcp85_anom1,100, opt)
   printVarSummary(pdf3)

;================================================================================
; calculate y-value for percentiles
;================================================================================
     x5=statdis1(24)
     x95=statdis1(25)
     print (x95)
     print (x5)

    p05pos=ind(pdf1 at bin_center.ge.x5)
    y05_1 = (x5-pdf1 at bin_center(p05pos(0)-1)) / 
(pdf1 at bin_center(p05pos(1)-1)-pdf1 at bin_center(p05pos(0)-1)) * 
(pdf1(p05pos(1)-1)-pdf1(p05pos(0)-1)) + pdf1(p05pos(0)-1)
    p95pos=ind(pdf1 at bin_center.ge.x95)
    y95_1 = (x95-pdf1 at bin_center(p95pos(0)-1)) / 
(pdf1 at bin_center(p95pos(1)-1)-pdf1 at bin_center(p95pos(0)-1)) * 
(pdf1(p95pos(1)-1)-pdf1(p95pos(0)-1)) + pdf1(p95pos(0)-1)
    delete(p05pos)
    delete(p95pos)

     x5_2=statdis2(24)
     x95_2=statdis2(25)
     print (x95_2)
     print (x5_2)
    p05pos=ind(pdf2 at bin_center.ge.x5_2)
    y05_2 = (x5_2-pdf2 at bin_center(p05pos(0)-1)) / 
(pdf2 at bin_center(p05pos(1)-1)-pdf2 at bin_center(p05pos(0)-1)) * 
(pdf2(p05pos(1)-1)-pdf2(p05pos(0)-1)) + pdf2(p05pos(0)-1)
    p95pos=ind(pdf2 at bin_center.ge.x95_2)
    y95_2 = (x95_2-pdf2 at bin_center(p95pos(0)-1)) / 
(pdf2 at bin_center(p95pos(1)-1)-pdf2 at bin_center(p95pos(0)-1)) * 
(pdf2(p95pos(1)-1)-pdf2(p95pos(0)-1)) + pdf2(p95pos(0)-1)
    delete(p05pos)
    delete(p95pos)
             x5_3=statdis3(24)
     x95_3=statdis3(25)
     print (x95_3)
     print (x5_3)
    p05pos=ind(pdf3 at bin_center.ge.x5_3)
    y05_3 = (x5_3-pdf3 at bin_center(p05pos(0)-1)) / 
(pdf3 at bin_center(p05pos(1)-1)-pdf3 at bin_center(p05pos(0)-1)) * 
(pdf3(p05pos(1)-1)-pdf3(p05pos(0)-1)) + pdf3(p05pos(0)-1)
    p95pos=ind(pdf3 at bin_center.ge.x95_3)
    y95_3 = (x95_3-pdf3 at bin_center(p95pos(0)-1)) / 
(pdf3 at bin_center(p95pos(1)-1)-pdf3 at bin_center(p95pos(0)-1)) * 
(pdf3(p95pos(1)-1)-pdf3(p95pos(0)-1)) + pdf3(p95pos(0)-1)
    delete(p05pos)
    delete(p95pos)

;================================================================================
; prepare data for plotting
;================================================================================

   nVar    = 3
   nBin    = pdf1 at nbins          ; retrieve the number of bins

   xx      = new ( (/nVar, nBin/), typeof(pdf1))

   xx(0,:) = pdf1 at bin_center     ; assign appropriate "x" axis values
   xx(1,:) = pdf2 at bin_center
   xx(2,:) = pdf3 at bin_center

   yy      = new ( (/nVar, nBin/), typeof(pdf1))
   yy(0,:) = (/ pdf1 /) * 0.01 ;; convert % to absolut
   yy(1,:) = (/ pdf2 /) * 0.01
   yy(2,:) = (/ pdf3 /) * 0.01



-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hist_RCPs_detrednded_global_PDF_nicebins.pdf
Type: application/pdf
Size: 84214 bytes
Desc: not available
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20200611/14b5d14f/attachment-0004.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hist_RCPs_detrended_global_PDF_50binminmax.pdf
Type: application/pdf
Size: 82066 bytes
Desc: not available
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20200611/14b5d14f/attachment-0005.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hist_RCPs_global_PDF_bin25minmax.pdf
Type: application/pdf
Size: 85661 bytes
Desc: not available
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20200611/14b5d14f/attachment-0006.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hist_RCPs_detrended_global_PDF_maskmethod_binnice.pdf
Type: application/pdf
Size: 83126 bytes
Desc: not available
URL: <https://mailman.ucar.edu/pipermail/ncl-talk/attachments/20200611/14b5d14f/attachment-0007.pdf>


More information about the ncl-talk mailing list