Issue with scaling of Nested grid applications

Patilc103
Posts: 7
Joined: Wed May 03, 2023 12:22 pm
Location: Banglore

Issue with scaling of Nested grid applications

#1 Unread post by Patilc103 »

I'm facing an issue while scaling any nested grid application beyond 144 processors.

The error displayed is:


INP_PAR - domain decomposition error in input script file for grid: 01

The domain partition parameter, NtileJ = 15
is incompatible with grid size, Mm = 80
because it yields too small tile, Jstr = 1 Jend = 1
Decrease partition parameter: NtileJ

Looking further into the code, I found that a bound is set on the tile partition in the file MyDir/ROMS/Utility/inp_par.f90:
!-----------------------------------------------------------------------
! Check tile partition starting and ending (I,J) indices for illegal
! domain decomposition parameters NtileI and NtileJ in standard input
! file.
!-----------------------------------------------------------------------

      IF ((BOUNDS(ng)%Jend(tile)-                                   &
           BOUNDS(ng)%Jstr(tile)+1).lt.2) THEN
        WRITE (stdout,80) ng, 'NtileJ = ', NtileJ(ng),              &
                              'Mm = ', Mm(ng),                      &
                              'Jstr = ', BOUNDS(ng)%Jstr(tile),     &
                              ' Jend = ', BOUNDS(ng)%Jend(tile),    &
                              'NtileJ'


I would like to know why these bounds are set and how I can resolve this scaling issue, since I want to run on a larger number of processors.

wilkin
Posts: 922
Joined: Mon Apr 28, 2003 5:44 pm
Location: Rutgers University

Re: Issue with scaling of Nested grid applications

#2 Unread post by wilkin »

This isn't a nesting issue, it's simply that you are asking for tile partitions that are too small.
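
To see how a request for NtileJ = 15 on Mm = 80 can end up with Jstr = 1 and Jend = 1, here is a minimal Python sketch of one plausible partitioning rule. It is an assumption, not code taken from ROMS: the chunk size is the ceiling of Mm/NtileJ, the excess is split into a margin on each side, and the edge tiles are clipped to the grid. The function names `tile_bounds` and `smallest_tile` are hypothetical. Under that assumption it reproduces the values in the error message above.

```python
def tile_bounds(n_points, n_tiles):
    """Return 1-based inclusive (start, end) index pairs for each tile.

    ASSUMPTION: ceiling-sized chunks, with the excess split into a margin
    on each side and the edge tiles clipped to [1, n_points]. This is an
    illustrative rule, not a copy of the ROMS source.
    """
    chunk = (n_points + n_tiles - 1) // n_tiles   # ceiling division
    margin = (n_tiles * chunk - n_points) // 2    # excess shared by both edges
    bounds = []
    for t in range(n_tiles):
        start = 1 + t * chunk - margin            # may fall below 1
        end = start + chunk - 1                   # may exceed n_points
        bounds.append((max(start, 1), min(end, n_points)))
    return bounds


def smallest_tile(n_points, n_tiles):
    """Width (in grid points) of the narrowest tile in the partition."""
    return min(end - start + 1 for start, end in tile_bounds(n_points, n_tiles))


if __name__ == "__main__":
    Mm = 80
    for ntile_j in (8, 10, 15, 16):
        first = tile_bounds(Mm, ntile_j)[0]
        print(f"NtileJ={ntile_j:2d}  first tile={first}  "
              f"smallest width={smallest_tile(Mm, ntile_j)}")
```

With this rule, 15 tiles of chunk size 6 overshoot the 80 points by 10, so the first and last tiles are clipped down to a single row, which is what the check in inp_par.f90 rejects. When NtileJ divides Mm evenly (8, 10 or 16 here) the margin is zero and every tile has the same width. Passing the size check does not by itself mean the partition will perform well.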

See this link in WikiROMS https://www.myroms.org/wiki/File:communications.png to visualize the tiles and their associated halo regions that communicate the data from adjacent tiles to complete the numerical stencil.

Your request for 15 divisions along the dimension Mm = 80 will give tiles about 5 elements wide (80/15 ≈ 5.3), pretty much exactly as in the plot. In this extreme illustration, there are 25 physical grid points in the tile that ROMS will compute on, but there are 56 grid points of information in the halo regions being passed back and forth by MPI. Even if this were allowed, we would expect it to scale poorly. ROMS is catching this and telling you to rethink: you can't just throw an infinite number of tiles at the task and expect it to keep speeding up.
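
The 25-versus-56 count can be checked with a few lines of Python. This is only a sketch. It assumes a halo two points wide on every side of the tile, corners included, which is the width that reproduces the numbers quoted above. The helper names `halo_points` and `halo_ratio` are hypothetical.

```python
def halo_points(ni, nj, halo=2):
    """Number of halo (ghost) points around an ni x nj tile.

    ASSUMPTION: a halo `halo` points wide on all four sides, corners
    included. halo=2 matches the 5 x 5 example (25 interior, 56 halo).
    """
    return (ni + 2 * halo) * (nj + 2 * halo) - ni * nj


def halo_ratio(ni, nj, halo=2):
    """Halo points exchanged per interior point computed."""
    return halo_points(ni, nj, halo) / (ni * nj)


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(f"{n:2d} x {n:2d} tile: interior={n * n:4d}  "
              f"halo={halo_points(n, n):3d}  ratio={halo_ratio(n, n):.2f}")
```

For a square tile the interior grows with the square of the side length while the halo grows roughly linearly, so the ratio drops quickly as tiles get larger. That is why very small tiles spend proportionally more effort on communication than on computation.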
John Wilkin: DMCS Rutgers University
71 Dudley Rd, New Brunswick, NJ 08901-8521, USA. ph: 609-630-0559 jwilkin@rutgers.edu
