Hi All,
As a new user to ROMS, I was wondering what the approach was to debug the model blowing up. I have recently been running an application for the Delaware Estuary and obtained:
ROMS/TOMS -Blows Up .....................exit_flag: 1
Saving the latest model state into RESTART file
MAIN: Abnormal termination: Blowup.
I have several questions:
1. What triggers the BLOWUP? Are NaNs searched for, and if so, in which fields?
2. Can you point me to the appropriate routines and locations within ROMS, where the BLOWUP is detected?
3. Presumably one examines the restart file, but if NaNs are present, it may be difficult to debug.
4. How would one change the BLOWUP criteria to be a condition of maximum velocity exceeding 100 m/s as is done in POM?
5. In ROMS, how does one examine and determine the cause of the BLOWUP?
Any suggestions would be much appreciated. Thanks....Dick Schmalz
ROMS Debug Approach
-
- Posts: 24
- Joined: Thu Oct 04, 2007 4:14 am
- Location: NOAA
In reviewing the code in diag.F, I note that if either the kinetic or the potential energy summed over the domain is reported as N, n, or * (that is, a NaN or a value that exceeds 8 places), then exit flag 1 is triggered and the model blows up.
I will consider adding a supplemental check, whereby a blow-up condition is triggered if either the u or the v velocity component exceeds 10 m/s. That way, when the restart file is written, one can see where the velocity limit was exceeded.
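To make the string-based test described above concrete, here is a minimal standalone sketch of the idea: write the diagnostic value into a character buffer and look for N, n, or * in the result. The program name, the helper blown_up, and the (1pe15.8) format are assumptions chosen for illustration; this is not a copy of diag.F.

```fortran
program energy_string_check
  use, intrinsic :: ieee_arithmetic, only : ieee_value, ieee_quiet_nan
  implicit none
  integer, parameter :: r8 = selected_real_kind(12, 300)
  real(r8) :: avgke

  ! A healthy domain-averaged energy should not be flagged.
  avgke = 1.2345e-3_r8
  print *, 'healthy value flagged: ', blown_up(avgke)

  ! A NaN is written as text containing 'N', so it should be flagged.
  avgke = ieee_value(avgke, ieee_quiet_nan)
  print *, 'NaN value flagged:     ', blown_up(avgke)

contains

  ! Hypothetical helper: format the value as text and search for the
  ! characters that indicate a NaN/Inf ('N', 'n') or a format overflow ('*').
  logical function blown_up(val)
    real(r8), intent(in) :: val
    character(len=15) :: text
    write (text, '(1pe15.8)') val      ! assumed format, for illustration
    blown_up = scan(text, 'Nn*') > 0
  end function blown_up

end program energy_string_check
```

A test of this kind only fires once the summed energy itself is no longer a finite number, which is why the restart record written at that point tends to contain NaNs already.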
Good points. I too have found the writing of the restart record with the NaNs in it to be pretty useless. Oh, for the good old days when the Cray compiler would let you dump core during the actual operation that caused the NaN/Inf so you could see what was going on.
You might also want to check for outrageous values of T and S.
In diag.F, I believe one can add the following checks after the comment:
  Compute and report out volume averaged kinetic, potential
  total energy, and volume.
In the 3D inner loop (the & in column 6 marks a fixed-form continuation line):
      if ( ABS(u(i,j,k,nstp)) .gt. 10.0 .or.
     &     ABS(v(i,j,k,nstp)) .gt. 10.0) exit_flag=1
In the 2D inner loop:
      if ( ABS(ubar(i,j,krhs)) .gt. 10.0 .or.
     &     ABS(vbar(i,j,krhs)) .gt. 10.0) exit_flag=1
In this manner, an exit condition will occur for excessive velocity component values, and the variables needed for debugging will be written to the restart file without NaNs.
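As a self-contained illustration of the threshold idea above, the sketch below runs the same kind of test on a small toy array and also reports where the limit was exceeded, which is the information one wants when opening the restart file. Everything in it (the program name, the check_speed helper, the toy grid sizes, and the 10 m/s limit as a named constant) is an assumption for illustration and is not ROMS code.

```fortran
program velocity_limit_check
  implicit none
  integer, parameter :: r8 = selected_real_kind(12, 300)
  real(r8), parameter :: max_speed = 10.0_r8    ! m/s, assumed threshold
  integer, parameter :: Lm = 4, Mm = 3, Nz = 2  ! toy grid, not a real domain
  real(r8) :: u(Lm, Mm, Nz), v(Lm, Mm, Nz)
  integer :: exit_flag

  exit_flag = 0
  u = 0.1_r8
  v = -0.2_r8
  u(3, 2, 1) = 25.0_r8                 ! plant one runaway value

  call check_speed('u', u, max_speed, exit_flag)
  call check_speed('v', v, max_speed, exit_flag)
  print *, 'exit_flag =', exit_flag

contains

  ! Hypothetical helper: set flag to 1 and report the grid indices if the
  ! largest magnitude in fld exceeds limit.
  subroutine check_speed(name, fld, limit, flag)
    character(len=*), intent(in) :: name
    real(r8), intent(in) :: fld(:, :, :)
    real(r8), intent(in) :: limit
    integer, intent(inout) :: flag
    integer :: idx(3)

    idx = maxloc(abs(fld))
    if (abs(fld(idx(1), idx(2), idx(3))) > limit) then
      flag = 1
      print '(a,a,a,3i5,a,es12.4)', 'limit exceeded in ', name, &
            ' at (i,j,k) =', idx, '  value =', fld(idx(1), idx(2), idx(3))
    end if
  end subroutine check_speed

end program velocity_limit_check
```

Note that a comparison against a NaN is always false, so a threshold test like this is only useful if it fires while the values are still finite; it complements the existing NaN detection rather than replacing it.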
kate wrote: [...] Oh, for the good old days when the Cray compiler would let you dump core during the actual operation that caused the NaN/Inf so you could see what was going on.
You can blame such default behavior (continuing to run way after infinities and NaNs have been produced by operations) on the IEEE standard for floating-point arithmetic, on systems that implement it.
Fortunately, there is a way to recover the Cray-like behavior. Under IRIX, one can set a runtime environment variable (TRAP_FPE) to a string that specifies the behavior for divide by zero, overflow, underflow, etc.; with Intel compilers under Linux (on x86_64 or ia64 processors) you enable it by compiling the main program with the "-fpe0" option.
Other systems or compilers may have other ways to alter the default floating point exception handling.
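For compilers that implement the Fortran 2003 IEEE intrinsic modules, halting on floating-point exceptions can also be requested from within the program itself. The sketch below is a standalone toy, not ROMS code, and it is a different mechanism from the TRAP_FPE variable and the -fpe0 option mentioned above; whether halting is actually supported depends on the compiler and processor, which is why each flag is queried first.

```fortran
program trap_fpe_sketch
  use, intrinsic :: ieee_exceptions
  implicit none
  real :: numer
  real, volatile :: denom      ! volatile: keep the division out of compile time
  integer :: k

  ! ieee_usual holds the overflow, divide-by-zero, and invalid flags.
  ! Ask for halting on each one, where the processor supports it.
  do k = 1, size(ieee_usual)
    if (ieee_support_halting(ieee_usual(k))) then
      call ieee_set_halting_mode(ieee_usual(k), .true.)
    end if
  end do

  numer = 1.0
  denom = 2.0                  ! set this to 0.0 to make the division halt
  print *, 'result =', numer / denom

end program trap_fpe_sketch
```

With halting enabled, the run stops at the operation that raises the exception instead of carrying infinities and NaNs forward, so a debugger or core dump points at the offending line.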
Saludos,
Gerardo
I believe what Richard wants is a way to see the fields after they've gone bad but before they are unplottable NaNs.
Here's one option for getting back the Cray-like behavior of old:
http://www.arsc.edu/support/news/HPCnews/HPCne ... l#article3