Hi all,
When I submit a job, it stops immediately and the log file gives
Caught error: Segmentation fault (signal 11)
It seems there is something wrong with MPI?
But even when I turn off MPI, it gives an error too:
READ_PhyPar - Error while processing line:
pu9 60848864 207 954289 59062278 40855 0 220 0 0
cpu10 60307697 88 907827 59667940 23801 0 425 0 0
cpu11 60824558 123 871282 59186992 21580 0 346 0 0
cpu12 60532883 0 903659 59451675 20867 0 371 0 0
cpu13 61129422 0 862952 58896509 21886 0 156 0 0
cpu14 60
Is this caused by the compilation? Or are there some problems with the source code?
Thanks in advance!
Fan
Caught error: Segmentation fault (signal 11)
Re: Caught error: Segmentation fault (signal 11)
Note that when turning off MPI, you need to "make clean" and rebuild the thing. It looks like you didn't turn off MPI there.
From my email to you:
This is without USE_DEBUG, isn't it? Could you try again with it? You might get more useful information about where in read_phypar the thing is failing. If you get a line number, check your read_phypar.f90 file to see what's on that line. In any case, read_phypar is the routine that reads your ocean.in file. So, what are the differences between my branch and the trunk for read_phypar and ocean.in in the region of interest (trouble)?
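One way to look for such differences in ocean.in is to compare which keywords each version of the file assigns. The following is only a minimal sketch: it assumes the usual `KEYWORD = value` / `KEYWORD == value` layout with `!` starting a comment, and the function names `input_keywords` and `compare_inputs` are made up for illustration, not part of ROMS.

```python
import re

# Assumed layout: a keyword at the start of the line followed by "=" or "==".
# Parentheses are allowed so that indexed keywords such as LBC(...) match.
_KEY = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_()]*)\s*==?\s*")


def input_keywords(text):
    """Return the set of keyword names assigned in ocean.in-style text."""
    keys = set()
    for line in text.splitlines():
        line = line.split("!", 1)[0]  # drop everything after a "!" comment
        match = _KEY.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def compare_inputs(old_text, new_text):
    """Return (keywords only in new_text, keywords only in old_text)."""
    old = input_keywords(old_text)
    new = input_keywords(new_text)
    return sorted(new - old), sorted(old - new)
```

To use it, read both files with `open(path).read()` and pass the two strings to `compare_inputs`. Keywords that appear only in the newer file are candidates for entries the old file is missing. Continuation lines are ignored, so this checks names only, not values.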
Re: Caught error: Segmentation fault (signal 11)
Thanks very much for your reply.
I turned debug on, but it is still not working properly.
log file with MPI:
Time is Mon Jul 8 16:03:16 EDT 2013
This jobs runs on the following processors:
iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu iw-k30-24.pace.gatech.edu
This job has allocated 16 nodes
Model Input Parameters: ROMS/TOMS version 3.6
Monday - July 8, 2013 - 4:03:17 PM
-----------------------------------------------------------------------------
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
log file without mpi:
Please see the attached file. It contains plenty of unreadable symbols, including on the line for READ_PhyPar.
I also tried the previous main source code from roms.org; it runs for 6 steps and then blows up. So the problem may come from the compilation rather than from the server/cluster.
Fan
Attachment: log.ross-nud-zice.txt (215.29 KiB)
Re: Caught error: Segmentation fault (signal 11)
That's a singularly unhelpful log file there. It's full of null characters, nothing useful.
I would recompile this with USE_DEBUG and without USE_MPI. You can then run it in a debugger such as gdb in serial mode. Perhaps then you can see what the problem is, or at least see which line of read_phypar is giving you trouble.
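For a log like that, a quick check is to count the null bytes and see whether any readable lines are left once they are removed. This is a rough sketch only; the helper name `summarize_log` is hypothetical and it assumes the remaining bytes are plain text.

```python
def summarize_log(data: bytes):
    """Count NUL bytes in raw log data and return the non-empty text lines."""
    nul_count = data.count(b"\x00")
    cleaned = data.replace(b"\x00", b"")
    # Undecodable bytes are replaced rather than raising an error.
    text = cleaned.decode("utf-8", errors="replace")
    lines = [line for line in text.splitlines() if line.strip()]
    return nul_count, lines
```

For example, `summarize_log(open("log.ross-nud-zice.txt", "rb").read())` would return the number of null bytes and whatever text survives, which makes it easier to spot a line such as the READ_PhyPar error.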
Re: Caught error: Segmentation fault (signal 11)
Thanks very much!
Now the problem has been solved and it is running smoothly.
It turns out that the problem was caused by the wrong header file.
I was using the header file from the old source code, which lacks the ice boundary conditions.
Again, Kate, thanks very much for your continuous patience and time!
Fan