[Wrf-users] Parallel NETCDF4?
Dominikus Heinzeller
climbfuji at ymail.com
Sun Mar 6 05:14:49 MST 2016
Hi Wei-keng and Chris,
combining compression with parallel I/O is not (yet) supported in netCDF4, and therefore not through PIO either:
https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg13353.html
Cheers
Dom
> On 6/03/2016, at 5:09 AM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>
> Hi, Chris,
>
> I learned that WRF 3.7 can use the PIO library, and PIO has an option to
> do parallel I/O through netCDF4. https://github.com/NCAR/ParallelIO
> (However, I have never tried it.) I believe someone on this list
> will be able to provide you with the configure instructions.
>
> Could you check the compression ratios in your netCDF-4 files?
> The command to show them is "h5dump -p -H filename |grep COMPRESSION"
> I am interested in knowing the compression ratios.
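A minimal sketch of how the ratios reported by that command could be collected and summarized. It assumes h5dump prints storage lines of the form "SIZE 4863 (8.225:1 COMPRESSION)"; the sample text and the helper names compression_ratios and summarize are hypothetical, not part of any library.

```python
import re

# Assumed h5dump -p storage line, e.g. "SIZE 4863 (8.225:1 COMPRESSION)".
_RATIO = re.compile(r"\(\s*([0-9]*\.?[0-9]+):1 COMPRESSION\)")


def compression_ratios(h5dump_text):
    """Return every 'N:1 COMPRESSION' ratio found in the text, as floats."""
    return [float(m.group(1)) for m in _RATIO.finditer(h5dump_text)]


def summarize(ratios):
    """Return (min, mean, max) of the ratios, or None if the list is empty."""
    if not ratios:
        return None
    return min(ratios), sum(ratios) / len(ratios), max(ratios)


# Hypothetical sample; real text would come from
#   h5dump -p -H filename | grep COMPRESSION
sample = """\
      SIZE 4863 (8.225:1 COMPRESSION)
      SIZE 120000 (2.000:1 COMPRESSION)
"""
ratios = compression_ratios(sample)
summary = summarize(ratios)
```

An empty result would suggest that no dataset in the file carries a compression filter, or that the output format differs from the one assumed here.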
>
> Since you are using Lustre, PnetCDF enables an internal feature that
> aligns the starting file offsets of all fixed-size variables to the
> file system striping boundaries, which can add gaps between any
> two consecutive variables in the file. If your file contains many
> variables, then these alignment gaps may be the cause of the large file
> size in your case. One way to disable the alignment is to set the following
> environment variable before the run.
> export PNETCDF_HINTS="nc_var_align_size=1"
> I am a PnetCDF developer. Your case can be important for me to tune
> the future PnetCDF design. Thanks.
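To illustrate how such alignment gaps can add up, here is a toy model, not PnetCDF's actual layout code. It rounds each variable's starting offset up to a multiple of an alignment value and reports the resulting file size and padding. The function name aligned_layout, the variable sizes, and the 1 MiB stripe size are all hypothetical.

```python
def aligned_layout(var_sizes, align, header_size=0):
    """Lay variables out in order, rounding each start offset up to a
    multiple of `align`. Returns (file_size, padding_bytes)."""
    offset = header_size
    padding = 0
    for size in var_sizes:
        start = -(-offset // align) * align  # round offset up to the boundary
        padding += start - offset
        offset = start + size
    return offset, padding


# Hypothetical case: three 1000-byte variables, 1 MiB stripe boundary.
sizes = [1000, 1000, 1000]
with_alignment = aligned_layout(sizes, 1024 * 1024)
# An alignment of 1 corresponds to the nc_var_align_size=1 hint above.
without_alignment = aligned_layout(sizes, 1)
```

In this toy case almost the entire aligned file is padding, which shows why a file with many small fixed-size variables could grow noticeably; real WRF variables are usually much larger, so the relative overhead would be smaller.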
>
> Wei-keng
>
> On Mar 5, 2016, at 8:04 PM, Christopher Thomas wrote:
>
>> Thanks for your replies Dom and Wei-keng.
>>
>> Yes, I gather that the PnetCDF library cannot produce NETCDF4 files, but I have also read that the NETCDF4 library can do parallel input and output (http://www.unidata.ucar.edu/software/netcdf/docs/parallel_io.html). So I guess what I was asking is: does anyone know how to get WRF to write in parallel to a NETCDF4 file? I am guessing that the answer is no, or someone would have mentioned it, but if it were possible I think that implementing it would be a very worthwhile project.
>>
>> Yes, I can post-process the data to do the compression later, but the data volumes are very high (~1.2 TB per hour of simulation time), so this sort of post-processing is a significant overhead.
>>
>> Thanks for the suggestion of I/O quilting. I'll give it a try.
>>
>> To answer your questions, Wei-keng: I am using a Lustre file system, and two versions of WRF: 3.7 compiled with PnetCDF, and 3.7.1 compiled with serial I/O.
>>
>> Regards
>>
>> Chris
>>
>> On Sat, Mar 5, 2016 at 11:16 PM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>> Hi, Chris
>>
>> The file size difference may be due to data compression when WRF uses the HDF5
>> library internally to write NetCDF4 files. PnetCDF does not do compression.
>> You can use "ncgen" to convert a netCDF classic file to a netCDF-conformant
>> HDF5 file. See the netCDF user guide for the command-line options of ncgen.
>> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf_utilities_guide.html#guide_ncgen
>>
>> As for the files produced by PnetCDF, would you mind answering
>> two questions? Were you writing the outputs to a Lustre file system,
>> and if not, which parallel file system did you use?
>> What version of WRF are you using?
>>
>> Wei-keng
>>
>> On Mar 4, 2016, at 3:54 AM, Christopher Thomas wrote:
>>
>>> Hi there,
>>>
>>> Does anyone know how to compile WRF to use parallel NETCDF but still output NETCDF4 files?
>>>
>>> I am using 512 cores and outputting data at high temporal and spatial resolutions; parallel NETCDF reduces my wall times by about 1/3 but more than doubles output file sizes over NETCDF4 serial output.
>>>
>>> Chris
>>> _______________________________________________
>>> Wrf-users mailing list
>>> Wrf-users at ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>>
>>
>