[Wrf-users] Parallel NETCDF4?

Dominikus Heinzeller climbfuji at ymail.com
Sun Mar 6 05:14:49 MST 2016


Hi Wei-keng and Chris,

Compression combined with parallel I/O is not (yet) supported in netCDF4, and therefore not through PIO either:

https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg13353.html
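
Until that changes, the fallback is the one Chris already mentioned: write uncompressed in parallel and deflate the files afterwards. A minimal sketch of that step, assuming nccopy from a netCDF-C build with HDF5 support is on the PATH. The function name, the ".nc4" suffix and the DRY_RUN switch are made up for illustration:

```shell
#!/bin/sh
# Deflate one uncompressed netCDF file after the model run has finished.
# Hypothetical helper: the name, the ".nc4" suffix and DRY_RUN are illustrative only.
compress_output() {
    src="$1"
    level="${2:-1}"        # deflate level 1-9; defaults to 1
    dst="${src}.nc4"       # compressed copy is written next to the original
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "nccopy -d $level -s $src $dst"
    else
        # -d sets the deflate level, -s enables the shuffle filter.
        nccopy -d "$level" -s "$src" "$dst"
    fi
}
```

Running it with DRY_RUN=1 first shows the exact nccopy command per file, so the loop over output files can be checked before it touches any data.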

Cheers

Dom

> On 6/03/2016, at 5:09 AM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> 
> Hi, Chris,
> 
> I learned that WRF 3.7 can use the PIO library, and PIO has an option to
> do parallel I/O through netCDF4. https://github.com/NCAR/ParallelIO
> (However, I have never tried it.) I believe someone on this list
> can provide you with the configuration instructions.
> 
> Could you check the compression ratios in your netCDF-4 files?
> The command to show them is "h5dump -p -H filename |grep COMPRESSION"
> I am interested in knowing the compression ratios.
> 
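
For a file with many variables, the grep output gets long. A rough sketch for condensing it, assuming h5dump reports each compressed dataset on a line of the form "SIZE 5016 (7.974:1 COMPRESSION)". The function name is made up, and the mean is a plain unweighted average over variables, not the ratio for the whole file:

```shell
#!/bin/sh
# Summarise per-variable compression ratios from `h5dump -p -H` output on stdin.
# Assumption: compressed datasets appear as "SIZE <bytes> (<ratio>:1 COMPRESSION)".
summarise_ratios() {
    awk '/COMPRESSION/ {
             # Extract the "<ratio>:1" token; filter lines without one are skipped.
             if (match($0, /[0-9.]+:1/)) {
                 r = substr($0, RSTART, RLENGTH - 2) + 0
                 sum += r; n++
                 if (n == 1 || r < min) min = r
                 if (r > max) max = r
             }
         }
         END {
             if (n > 0)
                 printf "variables=%d mean=%.3f min=%.3f max=%.3f\n", n, sum / n, min, max
             else
                 print "no compressed datasets found"
         }'
}

# Usage (wrfout_file is a placeholder for one netCDF-4 output file):
#   h5dump -p -H wrfout_file | summarise_ratios
```
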
> Since you are using Lustre, PnetCDF enables an internal feature that
> aligns the starting file offsets of all fixed-size variables to the
> file system striping boundaries, which can add gaps between any
> two consecutive variables in the file. If your file contains many
> variables, then these alignment gaps may be the cause of the large file
> size in your case. One way to disable it is to set the following
> environment variable before the run.
>    export PNETCDF_HINTS="nc_var_align_size=1"
> I am a PnetCDF developer. Your case can be important for me to tune
> the future PnetCDF design. Thanks.
> 
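
To judge whether alignment alone could account for the size difference, a back-of-the-envelope bound may help. If each fixed-size variable is padded to the next multiple of the alignment, no gap can exceed the alignment minus one byte. The helper below is only that arithmetic; its name and arguments are made up, and the real padding depends on the actual variable sizes and the stripe size in use:

```shell
#!/bin/sh
# Upper bound on padding when every fixed-size variable starts on an alignment boundary.
# Args: number of fixed-size variables, alignment in bytes (e.g. the stripe size).
# Hypothetical helper for illustration; it does not inspect any file.
max_alignment_padding() {
    nvars="$1"
    align="$2"
    # Each variable can be preceded by at most (align - 1) bytes of padding.
    echo $(( nvars * (align - 1) ))
}

# Usage: 100 fixed-size variables with a 1 MiB alignment
#   max_alignment_padding 100 1048576
```

Comparing that bound with the observed difference between the PnetCDF and netCDF-4 files gives a quick indication of whether the gaps or the missing compression dominate.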
> Wei-keng
> 
> On Mar 5, 2016, at 8:04 PM, Christopher Thomas wrote:
> 
>> Thanks for your replies Dom and Wei-keng. 
>> 
>> Yes, I gather that the pnetCDF library cannot produce NETCDF4 files, but I have also read that the NETCDF4 library can do parallel input and output: http://www.unidata.ucar.edu/software/netcdf/docs/parallel_io.html. So I guess what I was asking is: does anyone know how to get WRF to write in parallel to a NETCDF4 file? I am guessing that the answer is no, or someone would have mentioned it, but if it were possible I think that implementing it would be a very worthwhile project. 
>> 
>> Yes, I can post-process the data to do the compression later, but the data volumes are very high (~1.2 TB per hour of simulation time), so this sort of post-processing is a significant overhead. 
>> 
>> Thanks for the suggestion of I/O quilting. I'll give it a try. 
>> 
>> To answer your questions, Wei-keng: I am using a Lustre file system and two versions of WRF: 3.7 compiled with pnetCDF, and 3.7.1 compiled with serial I/O. 
>> 
>> Regards
>> 
>> Chris 
>> 
>> On Sat, Mar 5, 2016 at 11:16 PM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>> Hi, Chris
>> 
>> The file size difference may be due to data compression when WRF uses the HDF5
>> library internally to write NetCDF4 files. PnetCDF does not do compression.
>> You can use "ncgen" to convert a netCDF classic file to a netCDF-conformant
>> HDF5 file. See the netCDF user guide for the command-line options of ncgen.
>> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf_utilities_guide.html#guide_ncgen
>> 
>> As for the files produced by PnetCDF, would you mind answering
>> two questions? Were you writing the output to a Lustre file system,
>> and if not, which parallel file system did you use?
>> What version of WRF are you using?
>> 
>> Wei-keng
>> 
>> On Mar 4, 2016, at 3:54 AM, Christopher Thomas wrote:
>> 
>>> Hi there,
>>> 
>>> Does anyone know how to compile WRF to use parallel NETCDF but still output NETCDF4 files?
>>> 
>>> I am using 512 cores and outputting data at high temporal and spatial resolutions; parallel NETCDF reduces my wall times by about 1/3 but more than doubles output file sizes over NETCDF4 serial output.
>>> 
>>> Chris
>>> _______________________________________________
>>> Wrf-users mailing list
>>> Wrf-users at ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>> 
>> 
> 


