While updating the input (nf_fread*d) routines, I noticed that we were processing the ghost points during reading in distributed-memory applications by setting the local variable Nghost:
Code:
IF (model.eq.iADM) THEN
  Nghost=0
ELSE
  Nghost=GHOST_POINTS
END IF
Warning: With parallel I/O there are many combinations of NetCDF and HDF5 libraries that may need to be built, and the build scripts build.sh and build.bash have changed accordingly. You may build the NetCDF-3 and NetCDF-4 libraries in serial. If you are creating files in the new NetCDF-4 format (NETCDF4 C-preprocessing option), you may need serial versions of the NetCDF-4/HDF5 libraries; this is the case for serial and shared-memory ROMS applications. If you want parallel I/O, you need to compile NetCDF-4/HDF5 with the MPI library as explained below.
Parallel I/O in ROMS:
- To activate parallel I/O in ROMS you need to turn on both PARALLEL_IO and NETCDF4 C-preprocessing options. You also need to compile ROMS with the MPI library. That is, the macro USE_MPI must be activated in the makefile, build.sh or build.bash scripts.
- Parallel I/O is only possible with the NetCDF-4 and HDF5 libraries.
- The NetCDF-4 and HDF5 libraries must be built with the same compiler and compiler options.
- The HDF5 library (version 1.8.1) must be built with the --enable-parallel flag. The NetCDF configure script will detect the parallel capability of HDF5 and build the NetCDF parallel I/O features automatically.
- Parallel I/O is only possible in distributed-memory applications and requires an implementation of MPI-2. Use, for example, the MPICH2 implementation. It does not work yet with the OpenMPI library because the variable MPI_COMM_WORLD is always zero in calls to mpi_init, and we need a non-zero value for the HDF5 library to work. We reported this problem to the OpenMPI developers.
- Parallel I/O requires the MPI-IO layer which is part of MPI-2. You need to be sure that the MPI-IO layer is activated when building the implementation of MPI-2.
- In MPI parallel I/O applications, the processing can be Collective or Independent. In the NetCDF-4/HDF5 library, the parallel I/O access is set by calling the function nf90_var_par_access(ncid, varid, access) for each I/O variable. Calling this function affects only the open file; this information is not written to the data file. The default is to treat all variables as Collective operations. The parallel access setting lasts as long as the file is open or until it is changed, and this function can be called as often as desired. Independent I/O access means that the processing is not dependent on or affected by other parallel processes (nodes); this is the case for ROMS non-tiled variables. Conversely, Collective I/O access implies that all parallel processes participate during the processing; this is the case for ROMS tiled variables: each node in the group reads/writes its own tile data when PARALLEL_IO is activated. A sketch is shown after this list.
- File compression (DEFLATE C-preprocessing option) is not possible in parallel I/O for writing data. This is because the compression makes it impossible for the HDF5 library to exactly map the data to a disk location. However, deflated data can be read with parallel I/O.
- Parallel I/O performance gains can be seen on multi-core computer architectures. However, parallel I/O can also be exercised on a Linux workstation, where multiple processors can be simulated.
- The MPICH2 library generates a lot of messages to standard output. Therefore, use the following command when running ROMS to get rid of all those annoying messages:
Code:
mpirun -np 4 oceanM ocean.in >& log < /dev/null &
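For reference, here is a minimal, self-contained Fortran 90 sketch (not ROMS code) of the two pieces described in the list above: creating a NetCDF-4 file for MPI parallel I/O (nf90_create with the comm and info arguments) and then selecting Collective or Independent access per variable with nf90_var_par_access. The file, dimension, and variable names are made up for illustration, and error checking is kept to a single status test.
Code:
PROGRAM par_access_sketch
  USE mpi
  USE netcdf
  IMPLICIT NONE
  INTEGER :: ierr, ncid, dimid, varid, status

  CALL mpi_init (ierr)

! Create a NetCDF-4/HDF5 file with MPI-IO access.  This requires a
! NetCDF-4 library built against a parallel HDF5 (--enable-parallel).
  status=nf90_create('example_par.nc',                                 &
 &                   IOR(nf90_netcdf4, nf90_mpiio), ncid,              &
 &                   comm=MPI_COMM_WORLD, info=MPI_INFO_NULL)
  IF (status.ne.nf90_noerr) PRINT *, TRIM(nf90_strerror(status))

! Define a hypothetical tiled variable.
  status=nf90_def_dim(ncid, 'xi_rho', 100, dimid)
  status=nf90_def_var(ncid, 'zeta', nf90_double, (/ dimid /), varid)
  status=nf90_enddef(ncid)

! Collective access: all parallel processes participate (ROMS tiled
! variables).  Use nf90_independent instead for non-tiled variables.
! The setting affects only the open file; it is not written to disk.
  status=nf90_var_par_access(ncid, varid, nf90_collective)

  status=nf90_close(ncid)
  CALL mpi_finalize (ierr)
END PROGRAM par_access_sketch
Each MPI process would then read/write its own tile of the variable through the usual nf90_put_var/nf90_get_var calls with the appropriate start and count arguments.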
HDF5 and NetCDF-4 compiling notes:
- As mentioned above, we have been unable to get the Fortran 90 interface of HDF5/NetCDF4 to work with OpenMPI. We have even tried the latest OpenMPI 1.3, which was released on January 19, 2009. Unfortunately, all the parallel tests that come with NetCDF4 are C codes and thus do not catch this error.
- At this time we suggest you use MPICH2 to compile HDF5, NetCDF4 and ROMS. If you discover other parallel compilers that work, please let us know.
- You will need separate HDF5 and NetCDF4 libraries for each compiler/MPI combination you want to use. For example, if you want to be able to run in both serial and parallel, you will need a separate serial version of HDF5 and NetCDF4. If you have two different MPI implementations with the same compiler (e.g., MPICH2 and MVAPICH2 for your PGI compiler), you will need to compile a separate HDF5 and NetCDF4 library for each MPI implementation.
- When configuring HDF5, parallel libraries will be built automatically if your CC environment variable is set to mpicc. If your parallel compiler has a non-standard name, you will probably need to use the --enable-parallel flag when configuring. NetCDF4's configure script will recognize that HDF5 was built for parallel I/O and turn on NetCDF4's parallel I/O features.
- When configuring NetCDF4 you MUST include the --enable-netcdf4 option to build the version 4 interface (including parallel I/O) for NetCDF.
- We had a lot of problems compiling the NetCDF 4.0 and NetCDF 4.0.1-beta2 releases and ended up using the daily snapshot instead. We are using the snapshot from January 12, 2009.
- If building with gfortran, g95, pgi, or ifort (and possibly others), it is important to set the CPPFLAGS option -DpgiFortran for HDF5 and NetCDF4, or you will get name mismatches within the resulting libraries.
There is an overhead at the beginning when the output NetCDF files are created. This is due to the several scalars and application parameters that are defined by def_info.F and written by wrt_info.F. I am not too worried about this right now because it only happens when the file is created. I am investigating writing these into a structure or group; notice that structures and groups are part of HDF5. Processing structures is easy in C but more complicated in Fortran-90. My problem with structures is that we need to know their size in bytes in advance. The groups look promising; check the NetCDF-4 new groups, compound, and user-derived types for more details. By the way, the NetCDF-4 file format is actually an HDF5 file with the NetCDF self-describing metadata design.
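To illustrate why the groups look attractive, here is a minimal Fortran 90 sketch (again, not ROMS code) using the NetCDF-4 group interface, nf90_def_grp, to collect a scalar parameter into its own group. The group and variable names are hypothetical, and error checking is omitted for brevity.
Code:
PROGRAM group_sketch
  USE netcdf
  IMPLICIT NONE
  INTEGER :: status, ncid, grpid, varid

  status=nf90_create('example_grp.nc', nf90_netcdf4, ncid)

! Define a group to hold scalar application parameters, then define
! and write variables using the group ID in place of the file ID.
  status=nf90_def_grp(ncid, 'parameters', grpid)
  status=nf90_def_var(grpid, 'theta_s', nf90_double, varid)
  status=nf90_enddef(ncid)

  status=nf90_put_var(grpid, varid, 7.0d0)
  status=nf90_close(ncid)
END PROGRAM group_sketch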