To set up new, physically realistic ROMS applications in a numerically stable and reliable way, and to help with debugging, I tried to add a new routine to ROMS that does the following:
(1) Loop over all of the wet points in 3D and calculate the full 3D CFL number (|u|dt/dx + |v|dt/dy + |w|dt/dz),
(2) Find its maximum and the corresponding U-, V- and W-components,
(3) From (2), find the (I,J,K) grid location where this maximum CFL occurs, and
(4) Write the max. CFL, its U-, V- and W-components and the (I,J,K) location to the screen at each time step (after the time, KE, PE, total energy, total volume, thread, etc.).
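Steps (1)-(3) can be sketched serially as below. This is only an illustration in Python, not ROMS Fortran: the function name `max_cfl`, the nested-list layout `[k][j][i]`, the `wet` mask and the sample numbers are all assumptions made for the example, and the velocities are treated as if they sat at cell centres rather than on the staggered ROMS grid.

```python
def max_cfl(u, v, w, dx, dy, dz, wet, dt):
    """Return (cfl, cu, cv, cw, (i, j, k)) at the wet point with the largest 3D CFL.

    Arrays are nested lists indexed [k][j][i] (2D fields are [j][i]).
    Returns None if there are no wet points.
    """
    best = None
    for k in range(len(u)):
        for j in range(len(u[k])):
            for i in range(len(u[k][j])):
                if not wet[j][i]:
                    continue  # skip land points
                cu = abs(u[k][j][i]) * dt / dx[j][i]
                cv = abs(v[k][j][i]) * dt / dy[j][i]
                cw = abs(w[k][j][i]) * dt / dz[k][j][i]
                cfl = cu + cv + cw
                # Keep the running maximum together with its components and
                # location, so no sorting is needed afterwards.
                if best is None or cfl > best[0]:
                    best = (cfl, cu, cv, cw, (i, j, k))
    return best


# Tiny made-up example: 1 level, 2 x 2 horizontal points, one land point.
u = [[[0.5, 1.0], [0.2, 2.0]]]
v = [[[0.0, 0.5], [0.1, 2.0]]]
w = [[[0.0, 0.01], [0.0, 0.0]]]
dx = [[1000.0, 1000.0], [1000.0, 1000.0]]
dy = [[1000.0, 1000.0], [1000.0, 1000.0]]
dz = [[[10.0, 10.0], [10.0, 10.0]]]
wet = [[True, True], [True, False]]  # the fastest point (j=1, i=1) is land
result = max_cfl(u, v, w, dx, dy, dz, wet, dt=100.0)
print(result)
```

Carrying the components and indices along with the running maximum is what removes the need for a separate sort in steps (2) and (3).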
Strictly speaking, there are some additional contributions to the stability condition, namely Kh*dt/dx^2, Kh*dt/dy^2 and (Kh+Kt)*dt/dz^2, where Kh is any added numerical viscosity in the horizontal and Kt is the vertical eddy viscosity from the eddy-viscosity model employed (MY2.5, KPP, GLS family, etc.). For the moment, I will ignore these contributions. As I generally use upstream-biased advection schemes, I do not use any horizontal viscosity or diffusivity, and hence Kh=0.
I wrote a routine to perform (1)-(4) above in serial code and it works correctly. However, even though I placed it just before the CALL to 'output(ng)' in main3d.F, which is a serial routine, it does not work properly under MPI and gives erroneous answers. I do not know how to code (1)-(4) for MPI because steps (2) and (3) need some sorting to find the corresponding U-, V- and W-components of the CFL number and the (I,J,K) location.
Could someone please let me know whether:
(a) you have written a CFL calculation routine for ROMS which I can have (and which works with MPI)? or,
(b) ROMS has any MPI sorting routines embedded in it (e.g. like the routines in Utility/distribute.F that perform arithmetic operations such as finding the max, min or sum - mp_reduce(..))? or,
(c) you have an MPI sorting routine (with a bit of documentation) which I can easily embed into ROMS to do (2) and (3)? or,
(d) you could show me some code segments to code up (1)-(4) in MPI?
Thanks very much,
Lyon.
Sorting routine in MPI & adding new CFL routine to ROMS
lanerolle wrote:
(2) Find its maximum and then its corresponding U-, V- and W- components,
(3) From (2) find the (I,J,K) grid locations as to where this maximum CFL occurs and,
[snip]
I do not know how to code up (1)-(4) in MPI code because of steps (2) and (3) where we need to do some sorting to find the corresponding U-, V-, W- components of the CFL number and the (I,J,K) locations.
[snip]
(d) you could show me some code segments to code up (1)-(4) in MPI?

I would first do a local sort and store the value of the local CFL maximum in a local scalar. Then I would call MPI_Allreduce() with the reduction operation MPI_MAXLOC, so that the global reduction gives not only the global maximum but also the index at which it occurs. If each processor supplies its own rank as that index, the returned index is the processor ID of the processor "owning" the maximum location in the domain decomposition. Call that processor "A". Using MPI_Allreduce() instead of MPI_Reduce() means that all processors know the answer, so processor 0 can post an MPI_Recv() for the full information that the local sort produced on "A", and "A" can pack up that answer and send it to processor 0 with an MPI_Send().
For an example use of MPI_MAXLOC look at:
http://www.mpi-forum.org/docs/mpi-11-html/node79.html
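The pattern can be tried out without an MPI installation by treating each processor as an entry in a list. What follows is a serial Python emulation, not MPI code: `allreduce_maxloc` is a hypothetical stand-in that mimics what MPI_Allreduce() with MPI_MAXLOC returns when every processor contributes a (value, rank) pair, and the per-rank records are made-up numbers.

```python
def allreduce_maxloc(pairs):
    """Emulate MPI_MAXLOC over (value, rank) pairs, one pair per rank.

    Returns the largest value and, on ties, the lowest rank, which is the
    tie-breaking rule MPI_MAXLOC uses.
    """
    best_val, best_rank = pairs[0]
    for val, rank in pairs[1:]:
        if val > best_val or (val == best_val and rank < best_rank):
            best_val, best_rank = val, rank
    return best_val, best_rank


# Hypothetical result of the local search on each of 4 ranks:
# (cfl, cu, cv, cw, (i, j, k)), with (i, j, k) already in global indices.
local_records = [
    (0.12, 0.05, 0.05, 0.02, (3, 7, 10)),    # rank 0
    (0.31, 0.20, 0.10, 0.01, (40, 5, 16)),   # rank 1
    (0.27, 0.10, 0.15, 0.02, (12, 60, 1)),   # rank 2
    (0.31, 0.11, 0.19, 0.01, (55, 61, 16)),  # rank 3 (ties with rank 1)
]

# Step 1: every rank contributes only (local maximum, own rank).
pairs = [(rec[0], rank) for rank, rec in enumerate(local_records)]

# Step 2: after the all-reduce, every rank knows the global maximum and owner "A".
global_max, owner = allreduce_maxloc(pairs)

# Step 3: in MPI, rank "A" would MPI_Send its full record and rank 0 would
# MPI_Recv it (no message is needed if "A" is rank 0). Here it is a list lookup.
record_on_root = local_records[owner]

print("Max CFL =", global_max, "owned by rank", owner)
print("components (U, V, W) =", record_on_root[1:4], "at (I,J,K) =", record_on_root[4])
```

Only one scalar pair per processor goes through the reduction; the components and grid indices travel in a single point-to-point message afterwards, so no distributed sort is required.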
Constantinos
The subroutine ROMS/Utility/metrics.F shows you how to do these global calculations. This is the routine that reports the grid metrics

 Minimum barotropic Courant Number =
 Maximum barotropic Courant Number =
 Maximum Coriolis Courant Number =

to stdout immediately before the NLM: GET_STATE information on initial conditions. If you are unfamiliar with how the parallel tiling is implemented, be very careful in how you implement your new Courant number calculation.
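To illustrate the kind of care the tiling needs, here is a generic 1D sketch in Python. It is not how ROMS lays out its tiles: `tile_bounds`, `local_max_with_global_index`, the one-point halo and the sample values are assumptions made for the example. The point is that each tile searches only the points it owns, skips its halo (ghost) points, and reports the location as a global index.

```python
def tile_bounds(n, ntiles, tile):
    """Return the half-open global range [start, end) owned by `tile` (hypothetical helper)."""
    base, extra = divmod(n, ntiles)
    start = tile * base + min(tile, extra)
    end = start + base + (1 if tile < extra else 0)
    return start, end


def local_max_with_global_index(field, start, end, halo=1):
    """Search only the points a tile owns; return (max value, global index).

    The tile's local array includes `halo` ghost points on each side, copied
    from the neighbours; they are skipped so that no point is examined twice.
    """
    lo = max(start - halo, 0)
    hi = min(end + halo, len(field))
    local = field[lo:hi]              # what this tile holds, halo included
    best_val, best_idx = None, None
    for g in range(start, end):       # owned (interior) points only
        val = local[g - lo]           # global index -> local index
        if best_val is None or val > best_val:
            best_val, best_idx = val, g
    return best_val, best_idx


cfl = [0.10, 0.40, 0.20, 0.90, 0.30, 0.50, 0.05, 0.60, 0.70, 0.15]
ntiles = 3
per_tile = []
for tile in range(ntiles):
    start, end = tile_bounds(len(cfl), ntiles, tile)
    per_tile.append(local_max_with_global_index(cfl, start, end))

# Combining the per-tile results reproduces the serial answer.
tiled_max = max(per_tile)
serial_max = max((val, idx) for idx, val in enumerate(cfl))
print(per_tile, tiled_max, serial_max)
```

If a tile searched its halo points as well, or reported a tile-local index, the maximum could be attributed to the wrong processor or the wrong grid point.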
The vertical mixing algorithm is implicit so is there any point in including (Kh+Kt)*dt/dz^2 in your "stability" condition?
John Wilkin: DMCS Rutgers University
71 Dudley Rd, New Brunswick, NJ 08901-8521, USA. ph: 609-630-0559 jwilkin@rutgers.edu