Regarding Parallel IO with Open MPI
Hi,
I have been working on a Bay of Bengal region at 2.3 km resolution. I tried running the model with parallel I/O support using OpenMPI, but the code blows up.
To check whether something was wrong with the NetCDF files I had converted from classic to NetCDF-4, I tried the same approach of running parallel I/O with OpenMPI on the DAMEE (scoord22) test case, converting the classic "*_a.nc" files to NetCDF-4 format with the nccopy command. Strangely, that runs fine.
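For reference, a conversion of this kind can be done along the following lines (the file names here are placeholders for this example, not the actual input files):
Code:
# Convert a classic (NetCDF-3) file to NetCDF-4/HDF5 format with nccopy.
# File names are placeholders.
nccopy -k netCDF-4 bob_grd_classic.nc bob_grd_nc4.nc
# Confirm the resulting file kind; this should print "netCDF-4".
ncdump -k bob_grd_nc4.nc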
I just wanted to know whether the issue reported when parallel I/O was released, in the post "Parallel I/O via the NetCDF-4/HDF5 libraries released", has been fixed for OpenMPI.
Akash
Attachments:
- Bob2_3_66432_Parallel_IO_Worked.log (43.06 KiB)
- build_roms.sh (16.12 KiB)
Re: Regarding Parallel IO with Open MPI
If you are referring to this post from 2009, then, yes, Open MPI (or possibly HDF5) fixed that many years ago.
Logs and build scripts from a model that is running fine and is a provided test case are not helpful. Please post the information from the model that is blowing up.
Re: Regarding Parallel IO with Open MPI
Hi,
I am sharing the log file and the build script for two cases. Serial I/O works perfectly fine in both. But when I use parallel I/O after converting the classic *.nc files to NetCDF-4 format with nccopy, the code blows up in the first timestep. Please look at the log files and suggest any changes. Thanks.
- Akash
/*****************************************************************************************************************************/
1. openmpi
For testing this, these are the nc-config options available on the OpenMPI system:
[akashbansal@ioe input_netcdf4]$ nc-config
Usage: nc-config [OPTION]
Available values for OPTION include:
--help display this help message and exit
--all display all options
--cc C compiler
--cflags pre-processor and compiler flags
--has-c++ whether C++ API is installed
--has-c++4 whether netCDF-4 C++ API is installed
--has-fortran whether Fortran API is installed
--has-dap2 whether OPeNDAP (DAP2) is enabled in this build
--has-dap4 whether DAP4 is enabled in this build
--has-dap same as --has-dap2 (Deprecated)
--has-nc2 whether NetCDF-2 API is enabled
--has-nc4 whether NetCDF-4/HDF-5 is enabled in this build
--has-hdf5 whether HDF5 is used in build (always the same as --has-nc4)
--has-hdf4 whether HDF4 was used in build
--has-logging whether logging is enabled with --enable-logging.
--has-pnetcdf whether PnetCDF was used in build
--has-szlib whether szlib is included in build
--has-cdf5 whether cdf5 support is included in build
--has-parallel4 whether has parallel IO support via HDF5
--has-parallel whether has parallel IO support via HDF5 or PnetCDF
--libs library linking information for netcdf
--static library linking information for statically-compiled netcdf
--prefix Install prefix
--includedir Include directory
--libdir Library directory
--version Library version
--fc Fortran compiler
--fflags flags needed to compile a Fortran program
--flibs libraries needed to link a Fortran program
--has-f90 whether Fortran 90 API is installed
--has-f03 whether Fortran 03 API is installed (implies F90).
/*****************************************************************************************************************************/
2. intelmpi
[cdsaka@login10 Parallel_IO_Attempts]$ nc-config
Usage: nc-config [OPTION]
Available values for OPTION include:
--help display this help message and exit
--all display all options
--cc C compiler
--cflags pre-processor and compiler flags
--has-dap whether OPeNDAP is enabled in this build
--has-nc2 whether NetCDF-2 API is enabled
--has-nc4 whether NetCDF-4/HDF-5 is enabled in this build
--has-hdf5 whether HDF5 is used in build (always the same as --has-nc4)
--has-hdf4 whether HDF4 was used in build
--has-pnetcdf whether parallel-netcdf (a.k.a. pnetcdf) was used in build
--libs library linking information for netcdf
--prefix Install prefix
--includedir Include directory
--version Library version
--fc Fortran compiler
--fflags flags needed to compile a Fortran program
--flibs libraries needed to link a Fortran program
--has-f90 whether Fortran 90 API is installed
/*****************************************************************************************************************************/
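A quick way to confirm whether each of these NetCDF builds actually enables parallel I/O is to query the relevant flags directly (note that the second installation does not even list the --has-parallel options, which suggests an older library):
Code:
# Query the parallel I/O capabilities of a NetCDF build.
nc-config --has-nc4        # NetCDF-4/HDF5 support
nc-config --has-parallel4  # parallel I/O via HDF5
nc-config --has-parallel   # parallel I/O via HDF5 or PnetCDF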
Attachments:
- intel_mpi.zip (25.71 KiB)
- openmpi.zip (24.36 KiB)
Re: Regarding Parallel IO with Open MPI
We have observed some sensitivity to the compiler version when using PIO.
In particular, we find PIO_METHOD = 2 works best. This is NetCDF-3 output, which consumes more storage but runs much faster. If necessary, we compress to NetCDF-4 outside of ROMS (see the nccopy example after the code block below).
On our cluster, ifort 19.1.1 works with these ROMS PIO options:
Code:
INP_LIB = 1
OUT_LIB = 2
! PIO library methods for reading/writing NetCDF files:
...
! [2] serial read and write of NetCDF3 (64-bit offset)
...
PIO_METHOD = 2
! PIO library MPI processes set-up:
PIO_IOTASKS = 3 ! number of I/O tasks to define
PIO_STRIDE = 32 ! stride in the MPI-rank between I/O tasks
PIO_BASE = 1 ! offset for the first I/O task
PIO_AGGREG = 1 ! number of MPI-aggregators to use
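Since that method writes NetCDF-3 and we compress afterwards, here is a minimal sketch of the compression step (file names are placeholders):
Code:
# Compress a NetCDF-3 history file to NetCDF-4 with deflate level 4,
# outside of ROMS; file names are placeholders.
nccopy -k netCDF-4 -d 4 ocean_his.nc ocean_his_nc4.nc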
John Wilkin: DMCS Rutgers University
71 Dudley Rd, New Brunswick, NJ 08901-8521, USA. ph: 609-630-0559 jwilkin@rutgers.edu
Re: Regarding Parallel IO with Open MPI
Okay, a couple things:
First, you are not really comparing apples to apples here. Your two non-working examples are running ROMS 3.9. The exact release is unclear because there is no SVN version info included, but it has to be between 3 and 4 years old. Meanwhile, the testcase you ran successfully is using the latest release from 3 weeks ago. A lot of changes and improvements have been made to ROMS parallel I/O capabilities in the last few years, such as...
Second, the built-in parallel NetCDF-4 never offered a significant speed improvement in our experience, so ROMS development moved on to the NCAR-developed Parallel IO library (PIO). That is the library and configuration that John Wilkin is describing above.
I would suggest that you get the PIO libraries installed and working and use the latest version of ROMS to run your model.
Re: Regarding Parallel IO with Open MPI
Okay, thanks a lot @robertson @wilkin.
I will use the latest ROMS version with the Bay of Bengal model and get back to you about the parallel I/O option.
One of my research goals is to investigate I/O bottlenecks, for which I am using the DARSHAN (developed by Argonne National Laboratory) and TAU (developed by the University of Oregon) profilers.
I have also been using manual timers and have implemented certain optimizations, such as non-blocking I/O in the serial I/O case.
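As a point of reference, one way to instrument an MPI run with Darshan without relinking is to preload its runtime library; the paths, process count, and executable name below are placeholders, not my actual setup:
Code:
# Preload the Darshan runtime so MPI-IO and POSIX calls are recorded.
# Library path, log path, and executable name are placeholders; with
# OpenMPI, LD_PRELOAD may need to be passed via "mpirun -x LD_PRELOAD=..."
# so it reaches remote nodes.
export LD_PRELOAD=/path/to/darshan/lib/libdarshan.so
mpirun -np 64 ./romsM bob_ocean.in > bob_run.log
# Summarize the resulting Darshan log afterwards:
darshan-parser /path/to/darshan-logs/<logfile>.darshan | less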
So I am interested in exploring all the I/O options available in ROMS, and I would like to get the parallel I/O version of the Bay of Bengal model working too.
Please suggest any compiler combination I should set up to get it working.
Also, for setting up the ROMS PIO options and installing the PIO libraries, please point me to more documentation and related test cases for ROMS.
Thanks. Akash
Re: Regarding Parallel IO with Open MPI
I tried using the latest ROMS version with the Bay of Bengal model at 2.3 km resolution, but the code still blows up in the first timestep.
What might I be doing wrong?
Please suggest any changes that might help get the parallel I/O option working. Thanks.
I am also attaching the log file for the serial I/O case, which works fine.
Akash
Attachments:
- Bob2_3_66798_trunk_latest_working_serial_IO.log (71.85 KiB)
- PARALLEL_IO_Latest_trunk_IoE.zip (22.78 KiB)
Re: Regarding Parallel IO with Open MPI
You are still using the PARALLEL_IO CPP option, which uses the built-in parallel capability of NetCDF. As I mentioned above, we never saw a performance improvement in ROMS, so development of that method has seen little if any progress. From what I have read, even the NetCDF developers suggest using PIO, and the two main developers of PIO are also lead NetCDF developers.
If you want to test parallel I/O in your model, we highly recommend you install and use the PIO library. Once you have PIO installed, you need to make a few changes to your build script (a consolidated sketch follows the list below):
- In the MY_CPP_FLAGS section, around line 150, uncomment this line:
Code:
#export MY_CPP_FLAGS="${MY_CPP_FLAGS} -D"
and modify it to add the PIO_LIB flag:
Code:
export MY_CPP_FLAGS="${MY_CPP_FLAGS} -DPIO_LIB"
- Comment out "export USE_PARALLEL_IO=on".
- Uncomment "#export USE_PIO=on".
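Putting those edits together, the relevant portion of build_roms.sh would look roughly like this (the line placement is approximate and may differ slightly between ROMS versions):
Code:
# In the MY_CPP_FLAGS section (around line 150), request the PIO library:
export MY_CPP_FLAGS="${MY_CPP_FLAGS} -DPIO_LIB"

# Disable the built-in NetCDF-4 parallel option and enable PIO instead:
#export USE_PARALLEL_IO=on
export USE_PIO=on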
Re: Regarding Parallel IO with Open MPI
Which PIO version should I be using with the ROMS setup?
On WikiROMS at https://www.myroms.org/wiki/External_Libraries, PIO version 2.5.4 is mentioned.
But since a new pull request by Arango, "Parallel I/O (PIO) Revisited #53", has been merged and closed, should I stick with version 2.5.4 of PIO or use the latest version?
Best Regards,
Akash
Re: Regarding Parallel IO with Open MPI
I would suggest installing the latest version. We currently run 2.6.2, but I don't see anything in the release notes that would prevent you from running the latest version.
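For what it's worth, a minimal sketch of configuring the NCAR ParallelIO library with CMake is shown below; the variable names follow a typical ParallelIO build and the paths are placeholders, so check the PIO documentation for the exact options your version expects:
Code:
# Configure, build, and install the NCAR ParallelIO (PIO) library using
# the MPI compiler wrappers; all paths here are placeholders.
cd ParallelIO
mkdir build && cd build
CC=mpicc FC=mpifort cmake \
    -DNetCDF_C_PATH=/path/to/netcdf-c \
    -DNetCDF_Fortran_PATH=/path/to/netcdf-fortran \
    -DCMAKE_INSTALL_PREFIX=$HOME/opt/pio \
    ..
make
make install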