[ncl-talk] Using Shapefiles to create subset from global forecast datasets

Rick Brownrigg brownrig at ucar.edu
Thu May 25 14:15:49 MDT 2017


Hi Toni,

Sorry for the delay...this took a bit to figure out.

This *may* be a bug in gc_inout, or perhaps the documentation is
misleading. The issue is that although the shapefile says it contains 1
feature, that feature is comprised of 6 "segments":  1 for the large 4
state area and 5 for the islands off the coast of Texas (see the attached
png).  This apparently is confusing the in/out tests.

I don't know of a good way of programmatically working around this.  As a
hack (an extreme hack!), I know from looking at the shapefile that the
large 4-state area's coordinates begin at index 56, and extend to the end
of the x/y arrays. So if I change your script from:

  shp = addfile("./4_states_merged.nc","r")
  shp_lat = shp->y
  shp_lon = shp->x+360          ; adding 360 to get shapefile longitudes in
same value range as data longitudes

to

  shp = addfile("./4_states_merged.nc","r")
  shp_lat = shp->y(56:)
  shp_lon = shp->x(56:)+360          ; adding 360 to get shapefile
longitudes in same value range as data longitudes

Then it works, as seen in the second png.

The docs for gc_inout() suggest one should be able to pass in multiple
polygons to test against, but as practical matter, I don't see how they
would be delineated in the lat/lon arguments, so I suspect the docs are
wrong, or maybe the implementation incomplete.

If you have access to something like ArcGIS that can edit shapefiles, the
better solution would be to create a simpler mask excluding the island
polygons.

Wish I had a better answer, but I hope that helps.
Rick




On Wed, May 24, 2017 at 3:19 PM, Klemm, Toni <toni at ou.edu> wrote:

> Hi Rick,
>
> Thanks for getting back to me!
>
> I got it to work in the meantime using masking example 9 (
> https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl), which looks
> similar to example 4 on the page you linked (http://ncl.ucar.edu/
> Applications/Scripts/shapefiles_4.ncl). The masking worked, but for some
> reason there is large wedge with missing data right in the middle of my
> plot now. I imagine it has somewhere to do with the “*if (MASK_INSIDE)
> then*” loop in my script below, but I can’t figure it out. Do you have
> any suggestions? I also attached my original data file, the resulting
> datafile (_shp.nc) and plot, and the .nc file of my study area. This
> shapefile was created in ArcMap 10.3 from the states.* files in the NCL
> shapefile set <https://www.ncl.ucar.edu/Applications/Data/#shp> and
> converted to .nc using ncl_convert2nc.
>
> Again, thank you very much!
>
> Best,
> Toni
>
> Script:
>
> *; ******************************************************************
> *; Create map of GFDL-OBS bias averages by FORECAST MONTH 1980-2014*
> *; *
> *; Use SHAPEFILE OF STUDY REGION to define subset and plot*
> *; ******************************************************************
>
> *load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"*
> *load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"*
> *load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"*
>
> *; *************************************************************
>
> *begin*
>
> *do target_month = 1,12  ; do-loop through the 12 calendar months*
>
> *  if (target_month .lt. 10) then*
> *    target_month_zero = "0" + target_month ; turns "1"..."9" into
> "01"..."09" to match filename structure (months have 2 digits"*
> *  end if*
>
> * if (target_month .gt. 9) then*
> *    target_month_zero = target_month ; leaves "10..."12" unchanged*
> * end if*
>
> * print("****  START: target_month " + target_month + "  ****")*
>
> *  ; ******************************************************
> *  ; READ IN FILE INFORMATION BY TARGET MONTH*
> *  ; ******************************************************
>
> *; ******
> *print("1 - Read in file information")*
>
> *  bias_f =
> addfile("~/GFDL/TAS/DATAFILES/monthly/5b-4alt_GFDL_TAS_bias/avg_by_target_month/stack/GFDL_TAS_bias_1980-2014_target_"
> + target_month_zero + ".nc","r")*
> *  print("Finished: 1/8 Read in multiple files")*
>
>
> *  ; ************************************************
> *  ; READ IN THE VARIABLE (precipitation)*
> *  ; ************************************************
>
> *  print("2 - Read in variable and dimensions and average all years")*
> *  lat = (/bias_f->lat/)*
> *  lon = (/bias_f->lon/)*
>
> *  printMinMax(lat, False)*
> *  printMinMax(lon, False)*
>
> *  nlat = dimsizes(lat)*
> *  lat!0 = "lat"*
> *  lat&lat = lat*
> *  lat at long_name = "latitude"*
> *  lat at units = "degrees_north"*
>
> *  mlon = dimsizes(lon)*
> *  lon!0 = "lon"*
> *  lon&lon = lon*
> *  lon at long_name = "longitude"*
> *  lon at units = "degrees_east"*
>
> *  GFDL_bias_3D = rm_single_dims(bias_f->GFDL_OBS_bias) ; extract
> precipitation into new variable, output is mm/day*
> *  GFDL_bias_2D = dim_avg_n(GFDL_bias_3D,0)   ; averaging all bias for
> GFDL and LTM made in that month for 1985-2010*
>
> *  min_bias = decimalPlaces(min(GFDL_bias_2D),2,True)*
> *  max_bias = decimalPlaces(max(GFDL_bias_2D),2,True)*
> *  avg_bias = decimalPlaces(avg(GFDL_bias_2D),2,True)*
>
> *;  printVarSummary(GFDL_bias_2D)*
>
> *;  GFDL_bias_2D at _FillValue = 1.0E20*
>
> *;  printVarSummary(GFDL_bias_2D)*
>
> *;  printVarSummary(GFDL_bias_3D)*
> *;  printVarSummary(GFDL_bias_2D)*
>
>
> *; ******************************************
> *; ***** Read in shapefile information ***** (source:
> https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl
> <https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl>)*
>
> *  MASK_INSIDE = True    ; Whether to mask data inside or outside the
> given geographical area.*
>
> *  shp = addfile("~/States_shapefile_data/NCL/4_states_merged.shp","r")*
> *  shp_lat = shp->y*
> *  shp_lon = shp->x+360          ; adding 360 to get shapefile longitudes
> in same value range as data longitudes*
> *;  printVarSummary(shp_lat)*
> *;  printVarSummary(shp_lon)*
> *  printMinMax(shp_lat, False)*
> *  printMinMax(shp_lon, False)*
>
> *  nshp = dimsizes(shp_lon)*
>
> *  min_shp_lat = min(shp_lat)*
> *  max_shp_lat = max(shp_lat)*
> *  min_shp_lon = min(shp_lon)*
> *  max_shp_lon = max(shp_lon)*
>
>
> *; ***** Create new subset based on shapefile ***** (source:
> https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl
> <https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl>)*
>
> *  if(MASK_INSIDE) then*
> *;---Start with data filled in.*
> *    data_mask = GFDL_bias_2D*
> *  else*
> *;---Start with data all missing*
> *  data_mask =
> new(dimsizes(GFDL_bias_2D),typeof(GFDL_bias_2D),GFDL_bias_2D at _FillValue)*
> *  copy_VarCoords(GFDL_bias_2D,data_mask)*
> *  end if*
>
> *; ***** Determine the boundary lat/lons of the subset region within the
> lat/lon box *
> *  ilt_mn = ind(min_shp_lat.gt.lat)*
> *  ilt_mx = ind(max_shp_lat.lt.lat)*
> *  iln_mn = ind(min_shp_lon.gt.lon)*
> *  iln_mx = ind(max_shp_lon.lt.lon)*
> *  ilt1   = ilt_mn(dimsizes(ilt_mn)-1)    ; Start of lat box*
> *  iln1   = iln_mn(dimsizes(iln_mn)-1)    ; Start of lon box*
> *  ilt2   = ilt_mx(0)                     ; End of lat box*
> *  iln2   = iln_mx(0)                     ; End of lon box*
>
>
> *  if (MASK_INSIDE) then*
> *; ***** Put missing values in the areas that we want masked. (source:
> https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl
> <https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl>)*
> *    do ilt=ilt1,ilt2*
> *      do iln=iln1,iln2*
> *        if(gc_inout(lat(ilt),lon(iln),shp_lat,shp_lon)) then*
> *          data_mask(ilt,iln) = data_mask at _FillValue                 ;
> replace data inside and outside study region with missing values*
> *        end if*
> *      end do*
> *    end do*
> *  else*
> *; ***** Put data back in the areas that we don't want masked. (source:
> https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl
> <https://www.ncl.ucar.edu/Applications/Scripts/mask_9.ncl>)*
> *    do ilt=ilt1,ilt2*
> *      do iln=iln1,iln2*
> *        if(gc_inout(lat(ilt),lon(iln),shp_lat,shp_lon)) then*
> *          data_mask(ilt,iln) = GFDL_bias_2D(ilt,iln)                ;
> replace missing values *inside* study region with data again*
> *        end if*
> *      end do*
> *    end do*
> *  end if*
>
>
> *  ; ************************************************
> *  ; WRITE THE AVERAGES INTO NEW netCDF FILE*
> *  ; ************************************************
>
> *  print("3 - Write the averages into new netCDF file")*
>
> *    netCDF1 = False ; Output format is NetCDF*
>
> *  if (netCDF1) then*
> *    diro =
> "~/GFDL/TAS/DATAFILES/monthly/5b-4alt_GFDL_TAS_bias/avg_by_target_month/nc_shp/"
> ; output in subdirectory "subset" within input directory*
>
> *    filename = "GFDL_TAS_bias_1980-2014_target_month_" +
> target_month_zero + "_shp" ; output filename for subset using
> initialization year and month to differentiate*
> *    filo = filename +".nc"*
>
> *  end if*
>
> *  setfileoption("nc","Format","LargeFile")*
>
> *  if (netCDF1) then*
> *    system("/bin/rm -f "+ diro + filo) ; removes old files with the same
> name if they are present*
> *    ncdf = addfile(diro + filo,"c")*
>
> *    setfileoption(ncdf,"DefineMode",True)*
>
> *    ; create attributes*
> *    fAtt = True*
> *    fAtt at title = "Data: GFDL FLOR-B01 for the Contiguous U.S., Forecast
> Month " + target_month_zero*
> *    fAtt at source = "Vecchi et al. 2014"*
> *    fAtt at Conventions = "None"*
> *    fAtt at creation_date = systemfunc("date")*
>
> *    fileattdef(ncdf,fAtt)*
>
> *    dimNames = (/"target_month","lat","lon"/)*
> *    dimSizes = (/dimsizes(target_month),dimsizes(lat),dimsizes(lon)/)*
> *    dimUnlim = (/True,False,False/)*
>
> *    filedimdef(ncdf,dimNames,dimSizes,dimUnlim)*
>
> *
> filevardef(ncdf,"target_month",typeof(target_month),(/"target_month"/))*
> *    filevardef(ncdf,"lat",typeof(lat),(/"lat"/))*
> *    filevardef(ncdf,"lon",typeof(lon),(/"lon"/))*
> *    filevardef(ncdf,"data_mask",typeof(data_mask),(/"lat","lon"/))*
>
> *    ; now write all the variables to the file*
>
> *    ncdf->target_month = (/target_month/)*
> *    ncdf->lat = (/lat/)*
> *    ncdf->lon = (/lon/)*
> *    ncdf->data_mask = (/data_mask/)*
>
> *  end if*
>
>
> *  ; ************************************************
> *  ; CREATE MAP*
> *  ; ************************************************
>
> *  ; Define boundaries for map*
> *  data_mask!0 = "lat" ; latitude information come from variable
> lat_subdomain*
> *  data_mask&lat = lat*
> *  data_mask!1 = "lon" ; longitude information come from variable
> lon_subdomain*
> *  data_mask&lon = lon*
> *  data_mask at long_name = "Range: " + min_bias + " - " + max_bias + "
> mm/day"    ; map subtitle, printed on the top left*
> *  data_mask at units = "Avg.: " + avg_bias + " mm/day" ; map units, printed
> on the top right*
>
> *  diro =
> "~/GFDL/TAS/DATAFILES/monthly/5b-4alt_GFDL_TAS_bias/avg_by_target_month/png_shp/"
> ; output directory*
> *  wks = gsn_open_wks("png", diro +
> "GFDL_TAS_bias_1980-2014_target_month_" + target_month_zero + "_shp") ; map
> name and file type*
>
> *  cmap = read_colormap_file("BlueGreen14")*
>
> *  res                      = True ; plot mods desired*
> *  res at tiMainString         = "Avg. GFDL bias, Forecast Month " +
> target_month_zero + ", 1980-2014, no bias correction" ; main title*
> *  res at cnFillOn             = True ; turn on color fill, works with
> gsn_csm_map() function 3 lines down*
> *  res at gsnAddCyclic         = False*
> *  res at cnLinesOn            = False ; contour lines of the map fill on or
> off  *
> *  res at mpProjection         = "Mercator" ; Mercator map projection*
> *  res at cnFillPalette        = cmap ; reverse color map*
>
> *  res at mpLimitMode    = "LatLon" ; define plotted area by lats and lons,
> namely those of the selection (20-50N, 230-300E)*
> *  res at mpMaxLatF      = 41.5*
> *  res at mpMinLatF      = 25*
> *  res at mpMaxLonF      = 267*
> *  res at mpMinLonF      = 250*
>
> *  res at mpOutlineBoundarySets = "GeophysicalAndUSStates" ; lines and
> boundaries of USA and US States*
> *  res at mpGeophysicalLineThicknessF = 2             ; thickness of the USA
> lines*
> *  res at mpUSStateLineThicknessF     = 2               ; thickness of the
> state lines*
>
> *  res at mpSpecifiedFillColors  = (/0,100/)             ; fill with
> background color, works with gsn_map() 2 lines down*
>
> *;  res at cnLevelSelectionMode = "ManualLevels"              ; manually set
> the contour levels with the following 3 resources*
> *;  res at cnMinLevelValF  = -2                             ; set the minimum
> contour level*
> *;  res at cnMaxLevelValF  = 4                           ; set the maximum
> contour level*
> *;  res at cnLevelSpacingF = 0.25                             ; set the
> interval between contours*
>
>
> *  ; ************************************************
> *  ; PLOT MAP*
> *  ; ************************************************
>
> *  print ("5 - Plot map")*
>
> *  plot = gsn_csm_contour_map(wks,data_mask, res) ; universal map
> projection, define in res at mpProjection*
>
> *  print("**** ALL DONE: target_month " + target_month + " ****")*
>
> *  delete(bias_f)    ; delete before the next loop or program will fail*
> *  delete(GFDL_bias_3D) ; delete before the next loop or program will fail*
> *  delete(GFDL_bias_2D) ; delete before the next loop or program will fail*
> *  delete(data_mask)   ; delete before the next loop or program will fail*
>
> *end do ; end the loop through the 12 calendar months (1-12)*
>
> *end*
>
> *Toni Klemm*
> Ph.D. Candidate
> The University of Oklahoma
> South Central Climate Science Center
> phone: 405 325 1272 <(405)%20325-1272>
> www.toni-klemm.de
> Early Career Climate Forum <https://www.eccforum.org>
> Twitter: @toniklemm <https://twitter.com/ToniKlemm>
>
>
>
>
>
> On May 24, 2017, at 12:35 PM, Rick Brownrigg <brownrig at ucar.edu> wrote:
>
> Hi Toni,
>
> Sorry for the delay in response. There are several examples of using
> shapefiles to mask out data, if that's what are wanting to do. In
> particular, look at examples 4, 5, and 21 at:
>
> http://ncl.ucar.edu/Applications/shapefiles.shtml
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__ncl.ucar.edu_Applications_shapefiles.shtml&d=DwMFaQ&c=qKdtBuuu6dQK9MsRUVJ2DPXW6oayO8fu4TfEHS8sGNk&r=trcSnQNnUT2ddBXcpd358A&m=_2unS1RvDS7egGpkWknJW0V9KMHhv2GWDckbCG7ZOcQ&s=VqnvWQCKomt1QM_SAC2MNal5QUSSlnZmBwXdI3_KjO0&e=>
>
> II think there's a utility script they make use that you'll need to
> download.  Hopefully this gets you at least started -- please post back the
> list if you have other questions.
>
> Rick
>
> On Mon, May 22, 2017 at 10:10 PM, Klemm, Toni <toni at ou.edu> wrote:
>
> Good evening,
>
> I am analyzing seasonal climate forecasts skill for the U.S. Great Plains,
> specifically Texas, Oklahoma, Kansas, and Colorado. So far I’ve been using
> lat/lon to define my study region. However, this rectangular box includes
> parts of Mexico and the Gulf of Mexico that are outside my study region and
> which have extreme outliers that distort my results.
>
> To get more accurate results, I’d like to use a shapefile of my 4 states (
> TX_OK_KS_CO.shp, created in QGIS 2.18 from a U.S. states shapefile) to
> define my study region. However, I’m not sure how to interpret and use the
> results I’m getting. I used   “ncl_convert2nc TX_OK_KS_CO.shp -nc2c“  to
> create   “TX_OK_KS_CO.nc“ .
>
> “printVarSummary(TX_OK_KS_CO.nc)” and ”PrintMinMax“ of the x-coordinate
> (longitude) as well as y-coordinate (latitude) look like this:
>
> Variable: shp
> Type: file
> File path: /home/States_shapefile_data/GADM/TX_OK_KS_CO.nc
> Number of global attributes: 11
> Number of dimensions: 5
> Number of variables: 4
> (0) min=0   max=0
>
> ncdump -h TX_OK_KS_CO.nc returns this:
>
> netcdf TX_OK_KS_CO {
> dimensions:
> geometry = 2 ;
> segments = 2 ;
> num_features = 4 ;
> num_segments = 262 ;
> num_points = 102749 ;
> variables:
> int geometry(num_features, geometry) ;
> int segments(num_segments, segments) ;
> double x(num_points) ;
> double y(num_points) ;
>
> // global attributes:
> :segs_numPnts = 1 ;
> :segs_xyzIndex = 0 ;
> :geom_numSegs = 1 ;
> :geom_segIndex = 0 ;
> :geometry_type = "polygon" ;
> :layer_name = "TX_OK_KS_CO" ;
> :creation_date = "Mon May 22 17:20:46 EDT 2017" ;
> :NCL_Version = "6.4.0" ;
> :system = "Linux ou.edu x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64
> x86_64 x86_64 GNU/Linux" ;
> :Conventions = "None" ;
> :title = "NCL: convert-OGR-to-netCDF" ;
> }
>
> I am not sure how to use this information, or if this is even the
> information I should be getting, to define the study region when extracting
> data from my global forecast dataset. I am fairly sure the shapefile is
> correct. Opening it again in QGIS showed the four states with their
> outlines and filling, and nothing else.
>
> I appreciate any help!
>
> Thank you very much,
> Toni
>
>
> *Toni Klemm*
> Ph.D. Candidate
> The University of Oklahoma
> South Central Climate Science Center
> phone: 405 325 1272 <(405)%20325-1272>
> www.toni-klemm.de
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.toni-2Dklemm.de&d=DwMFaQ&c=qKdtBuuu6dQK9MsRUVJ2DPXW6oayO8fu4TfEHS8sGNk&r=trcSnQNnUT2ddBXcpd358A&m=_2unS1RvDS7egGpkWknJW0V9KMHhv2GWDckbCG7ZOcQ&s=QAW2_xJ9gp71I8wBvMBw6JI7Kb7j7d8KF0tlyNuhEYE&e=>
> Early Career Climate Forum
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__www.eccforum.org&d=DwMFaQ&c=qKdtBuuu6dQK9MsRUVJ2DPXW6oayO8fu4TfEHS8sGNk&r=trcSnQNnUT2ddBXcpd358A&m=_2unS1RvDS7egGpkWknJW0V9KMHhv2GWDckbCG7ZOcQ&s=BsFxGtcxiOomgvIRjvEfgPA7YppuvrO0n-TW0SLSiok&e=>
> Twitter: @toniklemm
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__twitter.com_ToniKlemm&d=DwMFaQ&c=qKdtBuuu6dQK9MsRUVJ2DPXW6oayO8fu4TfEHS8sGNk&r=trcSnQNnUT2ddBXcpd358A&m=_2unS1RvDS7egGpkWknJW0V9KMHhv2GWDckbCG7ZOcQ&s=B5YLwKe0WRmXO4D2nzE_HgiPLcFnIFvPX6Z-kf3KSrw&e=>
>
>
>
>
>
>
>
> _______________________________________________
> ncl-talk mailing list
> ncl-talk at ucar.edu
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__mailman.ucar.edu_mailman_listinfo_ncl-2Dtalk&d=DwMFaQ&c=qKdtBuuu6dQK9MsRUVJ2DPXW6oayO8fu4TfEHS8sGNk&r=trcSnQNnUT2ddBXcpd358A&m=_2unS1RvDS7egGpkWknJW0V9KMHhv2GWDckbCG7ZOcQ&s=g_J7lY2OPhdO6e1uZkyCAbj2G1XmDbXoRk01EanWTdI&e=>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170525/141686c4/attachment.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GFDL_TAS_bias_1980-2014_target_month_01_shp.png
Type: image/png
Size: 103416 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170525/141686c4/attachment.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: texasCoast.png
Type: image/png
Size: 58482 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170525/141686c4/attachment-0001.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GFDL.png.png
Type: image/png
Size: 56990 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170525/141686c4/attachment-0002.png 


More information about the ncl-talk mailing list