[ncl-talk] Sub-sampling

Dennis Shea shea at ucar.edu
Wed Oct 28 15:16:11 MDT 2015


https://www.ncl.ucar.edu/Document/Functions/Contributed/generate_sample_indices.shtml

-----
General Comments:


A good programming practice is to not 'hard-code' constants.

model = (/.../)
print(model)

M            = dimsizes(model)   ; M=36 ... not 39
nmodels = ispan(1,M,1)          ; (/1,2,3,...,34,35,36/)

samplesize = (/5,10,15,20,25,30,35/)
N            = dimsizes(samplesize)

Smean = new(N,"float")
            :
dPDJF = asciiread("pr_ens_percent_change_boxplot_DJF.txt",M,"float")
             :
do ni=0,N-1

------
???

Should not
     nmodels = ispan(1,M,1)
be
    nmodels = ispan(0,M-1,1)     ; (/0,1,...,34/)
----

Also, after you upgrade to 6.3.0 or 6.3.1,  rather than

   sdPDJF = dPDJF(generate_sample_indices(samplesize(ni),0))
                 :
   delete(sdPDJF)

You can use the := syntax which was introduced in 6.1.1

   http://www.ncl.ucar.edu/prev_releases.shtml#6.1.1

   sdPDJF := dPDJF(generate_sample_indices(samplesize(ni),0))


------

I am not sure what you mean by:

"I just have a question about how I can do 50 iterations of this to
obtain different models for each time I sub-sample."

Your sample size is 36 ... Not sure how you can sub sample this 50
times and get "different models".

---

If you specify a sample size of (say) 5 and you want each sub sample
of size 5 to have unique indices, I would suggest:

  M  = dimsizes(model)       ; # models
  N  = 5   ; sample size
  mu = generate_unique_indices(M)

  NU = M/N   ; 7

   nuStrt = 0
   nuLast= N-1
   do nu=0,NU-1
     im = mu(nuStrt:nuLast)    ; N unique subscripts
          :
   end do

In the bove the last 'mu' [ ie:  mu(M-1) ] would not be included.

Perhaps, I misunderstand?

What do you have against of sampling with replacement?

On Wed, Oct 28, 2015 at 6:41 AM, Melissa Lazenby <M.Lazenby at sussex.ac.uk> wrote:
> Hi
>
> I have managed to sub-sample 39 models using the generate_sample_indices
> function.
>
> I just have a question about how I can do 50 iterations of this to obtain
> different models for each time I sub-sample.
>
> I think I need to use a do loop but I am not sure as to what the syntax
> should be in my code as I have a loop already to loop through the different
> number of models I sub-sample.
>
> Below is my code currently that is working for just 1 iteration.
>
> I have commented out the attempt at adding the addition iterations loop.
>
> Many thanks!
>
> Kind Regards
> Melissa
>
> ; ==============================================================
> ; sub-sampling_1.ncl
> ; ==============================================================
>
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
> load "./generate_sample_indices.ncl"
>
> begin
>
> model = (/"ACCESS1-0", "ACCESS1-3", "bcc-csm1-1", "CanESM2", "CCSM4",
> "CESM1-BGC", "CESM1-CAM5", "CMCC-CM", "CMCC-CMS", "CNRM-CM5",
> "CSIRO-Mk3-6-0", "EC-EARTH", "FGOALS-g2", "FIO-ESM","GFDL-CM3",
> "GFDL-ESM2G", "GFDL-ESM2M", "GISS-E2-H_p1", "GISS-E2-H_p2", "GISS-E2-H_p3",
> "GISS-E2-R_p1", "GISS-E2-R_p2", "GISS-E2-R_p3", "HadGEM2-AO", "HadGEM2-CC",
> "HadGEM2-ES", "inmcm4", "IPSL-CM5A-LR", "IPSL-CM5A-MR",
> "IPSL-CM5B-LR","MIROC5", "MPI-ESM-LR", "MPI-ESM-MR", "MRI-CGCM3",
> "NorESM1-ME", "NorESM1-M"/)
>
> nmodels =
> (/1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39/)
>
> samplesize = (/5,10,15,20,25,30,35/)
> ntimes = 50
>
>
> printVarSummary(samplesize)
>
> Smean = new(7,"float")
> Stdand = new(7,"float")
> Srange = new(7,"float")
> S25 = new(7,"float")
> S75 = new(7,"float")
> Smedian = new(7,"float")
> ;stat = new(7,"float")
>
> dPDJF = asciiread("pr_ens_percent_change_boxplot_DJF.txt",39,"float")
>
>  printVarSummary(dPDJF)
>  print(dPDJF)     ; Print the values
>
>         ;do it=0, ntimes-1
>          do ni=0,6
>
>             sdPDJF = dPDJF(generate_sample_indices(samplesize(ni),0))
>
>               dimt = dimsizes(sdPDJF)   ; should be 5,10,15,20,25,30,35
>                x25  = round(.25*dimt,3)-1     ; -1 to account for NCL
> indexing starting
>                x75  = round(.75*dimt,3)-1     ; at 0
>
>          printVarSummary(sdPDJF)
>
>       ;opt = True
>       ;opt at PrintStat = True
>
>          ;stat(ni) = stat_dispersion(sdPDJF,opt)
>
>          Smean(ni) = avg(sdPDJF)
>          Stdand(ni) = stddev(sdPDJF)
>          Srange(ni) = max(sdPDJF)-min(sdPDJF)
>          S25(ni) = sdPDJF(x25)
>          S75(ni) = sdPDJF(x75)
>          Smedian(ni) = dim_median(sdPDJF)
>
>
>         print(Smean)
>         print(Stdand)
>         print(Srange)
>         print(S25)
>         print(S75)
>         print(Smedian)
>
>          delete(sdPDJF)
>
>          end do
>         ;end do
> end
>
> _______________________________________________
> ncl-talk mailing list
> ncl-talk at ucar.edu
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>


More information about the ncl-talk mailing list