Regarding Parallel IO with Open MPI

Report or discuss software problems and other woes

Moderators: arango, robertson

akash96
Posts: 5
Joined: Tue Nov 26, 2024 1:55 pm
Location: Indian Institute of Science (IISc)

Regarding Parallel IO with Open MPI

#1 Unread post by akash96 »

Hi,

I have been working on a Bay of Bengal regional model at 2.3 km resolution. I was trying to run the model with parallel I/O support using Open MPI, but I was facing a blowup in the code.

To check whether something was wrong with the NetCDF files I had converted from classic to NetCDF-4 format, I tried the same approach of running parallel I/O with Open MPI on the DAMEE (scoord22) test case, converting the classic "*_a.nc" files into NetCDF-4 format with the nccopy command. But strangely, that runs fine.
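For reference, the conversion was done along these lines; the filenames here are illustrative, not the actual input files:

Code: Select all

# Rewrite a classic-format NetCDF file as NetCDF-4/HDF5;
# "-k netCDF-4" selects the output format.
nccopy -k netCDF-4 damee_a.nc damee_a_nc4.nc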

I just wanted to know whether the Open MPI issue reported when parallel I/O was released, in the post "Parallel I/O via the NetCDF-4/HDF5 libraries released", has been fixed or not.

Akash
Attachments
Bob2_3_66432_Parallel_IO_Worked.log
(43.06 KiB) Downloaded 226 times
build_roms.sh
(16.12 KiB) Downloaded 212 times

robertson
Site Admin
Posts: 235
Joined: Wed Feb 26, 2003 3:12 pm
Location: IMCS, Rutgers University

Re: Regarding Parallel IO with Open MPI

#2 Unread post by robertson »

If you are referring to this post from 2009, then, yes, Open MPI (or possibly HDF5) fixed that many years ago.

Logs and build scripts from a model that runs fine and is a provided test case are not helpful. Please post the information from the model that is blowing up.

akash96
Posts: 5
Joined: Tue Nov 26, 2024 1:55 pm
Location: Indian Institute of Science (IISc)

Re: Regarding Parallel IO with Open MPI

#3 Unread post by akash96 »

Hi,

I am sharing the log file and the build script for two cases. Serial I/O works perfectly in both. But when I use parallel I/O after converting the classic *.nc files to NetCDF-4 format with nccopy, the code blows up in the first timestep. Please look at the log files and suggest any changes. Thanks.
- Akash


/*****************************************************************************************************************************/
1. openmpi

For testing, these are the available nc-config options:
[akashbansal@ioe input_netcdf4]$ nc-config
Usage: nc-config [OPTION]

Available values for OPTION include:

  --help            display this help message and exit
  --all             display all options
  --cc              C compiler
  --cflags          pre-processor and compiler flags
  --has-c++         whether C++ API is installed
  --has-c++4        whether netCDF-4 C++ API is installed
  --has-fortran     whether Fortran API is installed
  --has-dap2        whether OPeNDAP (DAP2) is enabled in this build
  --has-dap4        whether DAP4 is enabled in this build
  --has-dap         same as --has-dap2 (Deprecated)
  --has-nc2         whether NetCDF-2 API is enabled
  --has-nc4         whether NetCDF-4/HDF-5 is enabled in this build
  --has-hdf5        whether HDF5 is used in build (always the same as --has-nc4)
  --has-hdf4        whether HDF4 was used in build
  --has-logging     whether logging is enabled with --enable-logging.
  --has-pnetcdf     whether PnetCDF was used in build
  --has-szlib       whether szlib is included in build
  --has-cdf5        whether cdf5 support is included in build
  --has-parallel4   whether has parallel IO support via HDF5
  --has-parallel    whether has parallel IO support via HDF5 or PnetCDF
  --libs            library linking information for netcdf
  --static          library linking information for statically-compiled netcdf
  --prefix          Install prefix
  --includedir      Include directory
  --libdir          Library directory
  --version         Library version

  --fc              Fortran compiler
  --fflags          flags needed to compile a Fortran program
  --flibs           libraries needed to link a Fortran program
  --has-f90         whether Fortran 90 API is installed
  --has-f03         whether Fortran 03 API is installed (implies F90).

/*****************************************************************************************************************************/


2. intelmpi

[cdsaka@login10 Parallel_IO_Attempts]$ nc-config
Usage: nc-config [OPTION]

Available values for OPTION include:

  --help          display this help message and exit
  --all           display all options
  --cc            C compiler
  --cflags        pre-processor and compiler flags
  --has-dap       whether OPeNDAP is enabled in this build
  --has-nc2       whether NetCDF-2 API is enabled
  --has-nc4       whether NetCDF-4/HDF-5 is enabled in this build
  --has-hdf5      whether HDF5 is used in build (always the same as --has-nc4)
  --has-hdf4      whether HDF4 was used in build
  --has-pnetcdf   whether parallel-netcdf (a.k.a. pnetcdf) was used in build
  --libs          library linking information for netcdf
  --prefix        Install prefix
  --includedir    Include directory
  --version       Library version

  --fc            Fortran compiler
  --fflags        flags needed to compile a Fortran program
  --flibs         libraries needed to link a Fortran program
  --has-f90       whether Fortran 90 API is installed

/*****************************************************************************************************************************/
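The options relevant to parallel I/O can be queried directly; a quick check along these lines (using only flags shown in the listings above) tells you whether a given netCDF build supports it. Note that the Intel MPI build's nc-config does not even list the --has-parallel options, which suggests an older netCDF:

Code: Select all

# Check whether this netCDF build supports parallel I/O.
# Older nc-config versions (like the Intel MPI build above)
# may not recognize the --has-parallel* options at all.
nc-config --has-nc4        # NetCDF-4/HDF5 enabled?
nc-config --has-parallel4  # parallel I/O via HDF5?
nc-config --has-parallel   # parallel I/O via HDF5 or PnetCDF?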
Attachments
intel_mpi.zip
(25.71 KiB) Downloaded 218 times
openmpi.zip
(24.36 KiB) Downloaded 211 times

wilkin
Posts: 931
Joined: Mon Apr 28, 2003 5:44 pm
Location: Rutgers University

Re: Regarding Parallel IO with Open MPI

#4 Unread post by wilkin »

We have observed some sensitivity to the compiler version when using PIO.
On our cluster, ifort 19.1.1 works with these ROMS PIO options ...

Code: Select all

     INP_LIB =  1
     OUT_LIB =  2
     
! PIO library methods for reading/writing NetCDF files:
...
!   [2] serial   read and write of NetCDF3 (64-bit offset)
...
  PIO_METHOD =  2

! PIO library MPI processes set-up:
 PIO_IOTASKS =  3                 ! number of I/O tasks to define
  PIO_STRIDE =  32                ! stride in the MPI-rank between I/O tasks
    PIO_BASE =  1                 ! offset for the first I/O task
  PIO_AGGREG =  1                 ! number of MPI-aggregators to use     
In particular, we find PIO_METHOD = 2 works best. This is netcdf3 output, which consumes more storage but runs much faster. If necessary, we compress to nc4 outside of ROMS.
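For example, a NetCDF-3 history file can be deflated to NetCDF-4 offline with nccopy; the filenames and deflate level here are illustrative:

Code: Select all

# Compress a NetCDF-3 history file to deflated NetCDF-4 after the run;
# "-d 4" sets the deflate level (1 = fastest, 9 = smallest).
nccopy -k netCDF-4 -d 4 ocean_his.nc ocean_his_nc4.nc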
John Wilkin: DMCS Rutgers University
71 Dudley Rd, New Brunswick, NJ 08901-8521, USA. ph: 609-630-0559 jwilkin@rutgers.edu

robertson
Site Admin
Posts: 235
Joined: Wed Feb 26, 2003 3:12 pm
Location: IMCS, Rutgers University

Re: Regarding Parallel IO with Open MPI

#5 Unread post by robertson »

Okay, a couple of things:

First, you are not really comparing apples to apples here. Your two non-working examples are running ROMS 3.9. The exact release is unclear because there is no SVN version info included, but it has to be between 3 and 4 years old. Meanwhile, the testcase you ran successfully is using the latest release from 3 weeks ago. A lot of changes and improvements have been made to ROMS parallel I/O capabilities in the last few years, such as...

Second, the built-in parallel NetCDF-4 never offered a significant speed improvement in our experience, so ROMS development moved on to using the NCAR-developed Parallel IO libraries (PIO). That is the library and configuration that John Wilkin is describing above.

I would suggest that you get the PIO libraries installed and working and use the latest version of ROMS to run your model.
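In case it helps, a minimal sketch of a PIO build with CMake follows; the paths, and possibly some variable names, are assumptions, so check the PIO documentation for your release and its dependencies (MPI compilers, NetCDF, and optionally PnetCDF):

Code: Select all

# Sketch of building the NCAR PIO library (paths are assumptions;
# consult the PIO documentation for your release).
git clone https://github.com/NCAR/ParallelIO.git
cd ParallelIO && mkdir build && cd build
CC=mpicc FC=mpifort cmake \
    -DCMAKE_INSTALL_PREFIX=$HOME/opt/pio \
    -DNetCDF_C_PATH=/path/to/netcdf-c \
    -DNetCDF_Fortran_PATH=/path/to/netcdf-fortran \
    ..
make
make install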

akash96
Posts: 5
Joined: Tue Nov 26, 2024 1:55 pm
Location: Indian Institute of Science (IISc)

Re: Regarding Parallel IO with Open MPI

#6 Unread post by akash96 »

Okay, thanks a lot @robertson @wilkin.

I will use the latest ROMS version with the Bay of Bengal model and get back to you about the parallel I/O option.

One of my research goals is to investigate I/O bottlenecks, for which I am using the DARSHAN (developed by Argonne National Laboratory) and TAU (developed by the University of Oregon) profilers.
I have also been using manual timers, and have implemented certain optimizations, such as non-blocking I/O in the serial I/O case.

So I am interested in exploring all the I/O options available in the ROMS model, and I would like to get the parallel I/O version of the Bay of Bengal model working too.
Please suggest any compiler combination I should set up to get it working.

Also, for setting up the ROMS PIO options and installing the PIO libraries, please point me to more documentation and related ROMS test cases.

Thanks. Akash

akash96
Posts: 5
Joined: Tue Nov 26, 2024 1:55 pm
Location: Indian Institute of Science, (IISc)

Re: Regarding Parallel IO with Open MPI

#7 Unread post by akash96 »

I tried using the latest ROMS version with the Bay of Bengal model at 2.3 km resolution, but the code still blew up in the first timestep.
What might I be doing wrong?
Please suggest any changes that might help get the parallel I/O option working. Thanks.

I am also attaching the log file for the serial I/O case, which works fine.

Akash
Attachments
Bob2_3_66798_trunk_latest_working_serial_IO.log
(71.85 KiB) Downloaded 224 times
PARALLEL_IO_Latest_trunk_IoE.zip
(22.78 KiB) Downloaded 212 times

robertson
Site Admin
Posts: 235
Joined: Wed Feb 26, 2003 3:12 pm
Location: IMCS, Rutgers University

Re: Regarding Parallel IO with Open MPI

#8 Unread post by robertson »

You are still using the PARALLEL_IO CPP option, which uses the built-in parallel capability of NetCDF. As I mentioned above, we never saw a performance improvement in ROMS with that method, so its development has seen little if any progress. From what I have read, even the NetCDF developers suggest using PIO, and the two main PIO developers are also lead NetCDF developers.

If you want to test parallel I/O in your model, we highly recommend you install and use the PIO library. Once you have PIO installed, you need to make a few changes to your build script:
  1. In the MY_CPP_FLAGS section, around line 150, uncomment this line:

    Code: Select all

    #export      MY_CPP_FLAGS="${MY_CPP_FLAGS} -D"
    and modify that line to add the PIO_LIB flag:

    Code: Select all

     export      MY_CPP_FLAGS="${MY_CPP_FLAGS} -DPIO_LIB"
  2. Comment out "export USE_PARALLEL_IO=on".
  3. Uncomment "#export USE_PIO=on".
Then execute your build script again to rebuild ROMS with PIO. The combined changes are sketched below.
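Putting the three changes together, the relevant build_roms.sh lines should end up looking like this (line positions vary between ROMS releases):

Code: Select all

# 1. Add the PIO_LIB flag in the MY_CPP_FLAGS section:
 export      MY_CPP_FLAGS="${MY_CPP_FLAGS} -DPIO_LIB"

# 2. Built-in NetCDF-4 parallel I/O disabled (commented out):
#export      USE_PARALLEL_IO=on

# 3. PIO enabled (uncommented):
 export      USE_PIO=on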

akash96
Posts: 5
Joined: Tue Nov 26, 2024 1:55 pm
Location: Indian Institute of Science (IISc)

Re: Regarding Parallel IO with Open MPI

#9 Unread post by akash96 »

Which PIO version should I use with the ROMS setup?

On WikiROMS at https://www.myroms.org/wiki/External_Libraries, PIO version 2.5.4 is mentioned.
But since a new pull request by Arango, "Parallel I/O (PIO) Revisited #53", has been successfully merged and closed, should I stick with PIO version 2.5.4 or use the latest version?

Best Regards,
Akash

robertson
Site Admin
Posts: 235
Joined: Wed Feb 26, 2003 3:12 pm
Location: IMCS, Rutgers University

Re: Regarding Parallel IO with Open MPI

#10 Unread post by robertson »

I would suggest installing the latest version. We currently run 2.6.2, but I don't see anything in the release notes that should prevent the newest release from working.
