The Segmentation fault of ROMS 3.6

General scientific issues regarding ROMS

Moderators: arango, robertson

cathyyangfeng
Posts: 24
Joined: Fri Aug 29, 2008 4:13 am
Location: Virginia Institute of Marine Sciences

The Segmentation fault of ROMS 3.6

#1 Unread post by cathyyangfeng »

I searched this forum for segmentation faults but could not find a solution to this problem with the new ROMS 3.6. Please help.

I installed ROMS 3.6 on our institute cluster and ran the upwelling test case.
With debug on, the program compiled and produced oceanG, but with a warning at the end:

make: warning: Clock skew detected. Your build may be incomplete.

Then I entered

./oceanG < XX/ocean_upwelling.in

After a lot of output, the program stopped within 30 s with this message:

STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME
C => (i,j,k) Cu Cv Cw Max Speed

0 0 00:00:00 0.000000E+00 6.585677E+02 6.585677E+02 3.884376E+11
(00,00,00) 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
DEF_HIS - creating history file: ocean_his.nc
Segmentation fault


I tried ROMS 3.4 on the same cluster and no such problem occurred; the upwelling case ran smoothly.

Please help; any response will be highly appreciated.

-Cathy

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: The Segmentation fault of ROMS 3.6

#2 Unread post by kate »

What system are you on, with what compiler? I tried gfortran on Mac, no problem. Since you are compiling in debug mode, do you also have array-bounds checking turned on? That is one common cause of seg faults. Do you have a debugger in which you can view a stack trace from a core file?

cathyyangfeng
Posts: 24
Joined: Fri Aug 29, 2008 4:13 am
Location: Virginia Institute of Marine Sciences

Re: The Segmentation fault of ROMS 3.6

#3 Unread post by cathyyangfeng »

Thanks Kate. I use Linux with the PGI Fortran compiler. Could you tell me more about how to turn on array-bounds checking? I have never done that before.

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: The Segmentation fault of ROMS 3.6

#4 Unread post by kate »

Array bounds checking should be on for you already - it's the "-C" option to the compiler. I just tried Linux-pgi and got no problem. Are you sure you have the right input file?

cathyyangfeng
Posts: 24
Joined: Fri Aug 29, 2008 4:13 am
Location: Virginia Institute of Marine Sciences

Re: The Segmentation fault of ROMS 3.6

#5 Unread post by cathyyangfeng »

The right input file is ocean_upwelling.in, yes? I am sure I supplied it correctly and changed NtileI and NtileJ to match the number of CPUs I gave to the machine. What else do I need for a correct input file?

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: The Segmentation fault of ROMS 3.6

#6 Unread post by kate »

Ah, this is a parallel run? You probably need to use:


mpirun -np xx ./oceanG XX/ocean_upwelling.in
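
A minimal sketch of checking the process count against the tiling before a parallel launch. It assumes the input file sets the tiling with lines of the form "NtileI == 4", and that the number given to -np should equal NtileI*NtileJ; tile_count is a hypothetical helper name, not part of ROMS.

```shell
#!/bin/sh
# Hypothetical helper: print NtileI*NtileJ read from a ROMS-style input file.
# Assumes lines such as "NtileI == 4   ! I-direction partition".
tile_count() {
    awk '$1 == "NtileI" { i = $3 }
         $1 == "NtileJ" { j = $3 }
         END { print i * j }' "$1"
}

# Example launch (commented out because it needs a built oceanG and MPI):
# mpirun -np "$(tile_count XX/ocean_upwelling.in)" ./oceanG XX/ocean_upwelling.in
```

Reading the count from the input file keeps the two numbers from drifting apart when the tiling is edited.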

cathyyangfeng
Posts: 24
Joined: Fri Aug 29, 2008 4:13 am
Location: Virginia Institute of Marine Sciences

Re: The Segmentation fault of ROMS 3.6

#7 Unread post by cathyyangfeng »

I tried a parallel run first, which gave the segmentation fault, and then changed back to a serial run, which shows the same problem.

With debug turned off, it shows the following (./oceanS < ocean_upwelling.in):

STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME
C => (i,j,k) Cu Cv Cw Max Speed

0 0 00:00:00 0.000000E+00 6.585677E+02 6.585677E+02 3.884376E+11
(00,00,00) 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
DEF_HIS - creating history file: ocean_his.nc
0: ALLOCATE: 68719476836 bytes requested; not enough memory


I tried limit (as another post suggested); it showed:
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 8192 kbytes
coredumpsize 0 kbytes
memoryuse 2985370 kbytes
vmemoryuse 6170560 kbytes
descriptors 1024
memorylocked 2985326 kbytes
maxproc 27903


Then I removed the limits on stacksize, memoryuse, coredumpsize, and vmemoryuse with "limit xx unlimited", after which limit showed:
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 8192
memorylocked 2985326 kbytes
maxproc 27903

I ran oceanS again and got the same message.

Any other suggestions?

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: The Segmentation fault of ROMS 3.6

#8 Unread post by kate »

You need to find out which allocate is asking for so very much memory and why. I would run it in a debugger; some would use print statements. Either way, we can't do it for you if we can't reproduce the problem.
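
For scale, the arithmetic on the numbers quoted in the previous post can be sketched as follows. The request is 64 GiB plus 100 bytes, roughly ten times the original vmemoryuse limit. A size that close to a power of two may hint at a garbage or overflowed dimension rather than a real array, but that is only a guess, not something the log proves.

```shell
#!/bin/bash
# Size of the failed ALLOCATE request from the log above, in bytes.
request=68719476836

# Whole GiB in the request (integer division by 2^30).
echo "GiB requested: $(( request / (1 << 30) ))"

# Offset from 2^36 bytes, which is exactly 64 GiB.
echo "bytes above 2^36: $(( request - (1 << 36) ))"

# vmemoryuse limit reported by `limit`, in kbytes, converted to whole GiB.
vmem_kb=6170560
echo "vmemoryuse limit in GiB: $(( vmem_kb / (1 << 20) ))"
```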

mathieu

Re: The Segmentation fault of ROMS 3.6

#9 Unread post by mathieu »

The key warning is "Clock skew detected. Your build may be incomplete."
This happens when you compile on a remote computer whose clock is offset from that of your own computer. The solution is to set your clocks correctly using NTP and/or to rebuild from a clean state with make clean && make.

This problem is classic. Google would have helped you.
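
A minimal sketch of looking for the skew itself, assuming the temporary directory used by mktemp is stamped by the local clock while the build tree may live on a file server with a different clock. find_future_files is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory whose modification time
# is later than "now" on the local clock, a typical sign of clock skew.
find_future_files() {
    dir="${1:-.}"
    ref="$(mktemp)"                  # reference file stamped with the current time
    find "$dir" -type f -newer "$ref"
    rm -f "$ref"
}

# Example: find_future_files .
# If anything is listed, fix the clocks and rebuild with: make clean && make
```

An empty result does not rule out skew in the other direction; it only shows that no file currently appears to come from the future.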

cathyyangfeng
Posts: 24
Joined: Fri Aug 29, 2008 4:13 am
Location: Virginia Institute of Marine Sciences

Re: The Segmentation fault of ROMS 3.6

#10 Unread post by cathyyangfeng »

Many thanks for offering help. My office mate helped me find a way to get the executable working. I would like to post it here for people who may meet the same problem in the future.

It is a conflict between ROMS and the settings of my institute cluster. The Compilers/xx.mk file needs to be changed.

The original file:

Line 26: FFLAGS :=
Line 28: CPPFLAGS := -P -traditional


My office mate changed it to:

FFLAGS := -mcmodel=medium
CPPFLAGS := -P -traditional -mcmodel=medium


However, I do not understand what FFLAGS and CPPFLAGS are or what -mcmodel=medium means. If anybody would like to provide further explanation, that would be highly appreciated!

tony1230
Posts: 87
Joined: Wed Mar 31, 2010 3:29 pm
Location: SKLEC,ECNU,Shanghai,China

Re: The Segmentation fault of ROMS 3.6

#11 Unread post by tony1230 »

It's a tricky thing. I can run my own applications smoothly, but not the UPWELLING test case. Everything is left as it is; I just use upwelling.h instead of my own file, and then it does not work. So I don't think anything is wrong in Compilers/Linux-ifort.mk. I also got something strange:
analytical.f90(394): error #5082: Syntax error, found ':' when expecting one of ...
ana_grid.h: (this is the place) no values provided for Xsize, Esize...
I was so confused by that, since I never touched anything related to UPWELLING. :shock: :shock:

Any ideas?

tony

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: The Segmentation fault of ROMS 3.6

#12 Unread post by kate »

This should be a new thread...

Anyway, how are you telling it that you want the UPWELLING case? Are you using the build script or the makefile? Are you telling it to look in some specific directory for the ana_xx.h files, or letting it default to the ROMS/Functionals directory?

linzhenhua
Posts: 64
Joined: Mon Oct 17, 2005 2:02 am
Location: Institute of Oceanology,Chinese Academy of Sciences

Re: The Segmentation fault of ROMS 3.6

#13 Unread post by linzhenhua »

New or existing programs that use less than 2GB of memory will probably not require any modifications to their source code or to the way they are compiled. A large-memory program, on the other hand, in addition to the data size issues mentioned above, may need additional compiler flags, depending on the compiler used and how data is implemented.

Typically, large data objects are declared as either static global arrays or dynamically allocated memory. These are associated with the data and heap memory segments, respectively (see Understanding Memory for a detailed description of these terms). Large heap (and stack) segments do not need any special attention. However, if the size of the data segment (sum of all static global arrays) exceeds 2GB, use the appropriate flags from the following two tables.

Vendor     C/C++ compiler   Flag
Portland   pgcc             -mcmodel=medium
           pgCC             -mcmodel=medium
Intel      icc              –
GNU        gcc              –
           g++              –

Table 1 C/C++ compiler flags for accommodating large data segments.


Vendor     Fortran compiler   Flag
Portland   pgf77              -mcmodel=medium
           pgf90/pgf95        -mcmodel=medium [-i8]¹
Intel      ifort              -mcmodel=medium -i-dynamic² [-i8]¹
GNU        g77                -mcmodel=medium

¹ Makes intrinsic array enquiry functions return INTEGER*8 values.
² Add $INTEL_LIB_PATH to LD_LIBRARY_PATH.

Table 2 Fortran compiler flags for accommodating large data segments.


http://www.ualberta.ca/CNS/RESEARCH/Lin ... 4-bit.html
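
A rough sketch of the 2GB rule quoted above, assuming the static data size is taken as the sum of the "data" and "bss" columns printed by the binutils size tool, and that "2GB" means 2 GiB. exceeds_2gb is a hypothetical helper, and the byte counts in the example are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if static data (data + bss, in bytes) exceeds
# 2 GiB, the threshold quoted above for needing -mcmodel=medium.
exceeds_2gb() {
    local data=$1 bss=$2
    (( data + bss > 2 * 1024 * 1024 * 1024 ))
}

# The two numbers can be read from the "data" and "bss" columns of, e.g.,
#   size ./oceanS
# Illustrative values only: 1 MiB of data and 3 GiB of bss.
if exceeds_2gb 1048576 3221225472; then
    echo "static data > 2 GiB: consider -mcmodel=medium"
fi
```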

tony1230
Posts: 87
Joined: Wed Mar 31, 2010 3:29 pm
Location: SKLEC,ECNU,Shanghai,China

Re: The Segmentation fault of ROMS 3.6

#14 Unread post by tony1230 »

Two problems:
1) If I use a brand-new copy of the ROMS 3.6 code, the upwelling case runs to the end with USE_DEBUG ?= on (oceanG), but does not keep going with that option off (oceanM). It shows me the following awful notice:
STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME
C => (i,j,k) Cu Cv Cw Max Speed

0 0 00:00:00 0.000000E+00 6.585677E+02 6.585677E+02 3.884376E+11
(00,00,00) 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
DEF_HIS - creating history file: ocean_his.nc
WRT_HIS - wrote history fields (Index=1,1) into time record = 0000001
DEF_AVG - creating average file: ocean_avg.nc
DEF_DIAGS - creating diagnostics file: ocean_dia.nc
1 0 00:00:30 8.842193E-15 6.585677E+02 6.585677E+02 3.884376E+11
(01,01,01) 1.742432E-11 3.216616E-08 0.000000E+00 1.516534E-06
2 0 00:01:00 3.411147E-14 6.585677E+02 6.585677E+02 3.884376E+11
(01,80,16) 2.026023E-08 8.033716E-09 7.554283E-07 2.787508E-06
rank 15 in job 1 c0132_36016 caused collective abort of all ranks
exit status of rank 15: killed by signal 9
rank 14 in job 1 c0132_36016 caused collective abort of all ranks
exit status of rank 14: killed by signal 9
rank 10 in job 1 c0132_36016 caused collective abort of all ranks
exit status of rank 10: killed by signal 9
rank 9 in job 1 c0132_36016 caused collective abort of all ranks
...
2) If I run the upwelling case in the folder where I have run many other applications, it dies whether USE_DEBUG is on or off. The following is shown when I type the make command after make clean:
analytical.f90(394): error #5082: Syntax error, found ':' when expecting one of: % . = =>
ana_grid.h: no values provided for Xsize, Esize, depth, f0, beta.
----------------^
analytical.f90(1349): error #5082: Syntax error, found ':' when expecting one of: % . = =>
ana_vmix.h: no values provided for Akv.
----------------^
analytical.f90(1368): error #5082: Syntax error, found ':' when expecting one of: % . = =>
ana_vmix.h: no values provided for Akt.
----------------^
compilation aborted for analytical.f90 (code 1)
make: *** [Build/analytical.o] Error 1
In addition, to answer kate's question: I use the makefile script, and all the ana_xx.h files are in the default directory ROMS/Functionals.

Ha-ha, it doesn't seem like a profound question, just a troublesome one.
Thanks for any suggestion or reply.

Weiwei Shou

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: The Segmentation fault of ROMS 3.6

#15 Unread post by arango »

It is very clear that you misspelled the UPWELLING C-preprocessing option or the build script cannot access the appropriate ana_grid.h. The error message is very clear:


analytical.f90(394): error #5082: Syntax error, found ':' when expecting one of: % . = =>
ana_grid.h: no values provided for Xsize, Esize, depth, f0, beta.
The ana_grid routine being compiled cannot find your application. This is very easy to track down with a little curiosity by checking the processed file Build/analytical.f90.
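
The check of the processed file can be sketched as below, assuming the placeholder text "no values provided" survives preprocessing whenever no application option matched, as the compiler errors above suggest. show_unset_blocks is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: show the lines (with line numbers) of a preprocessed
# source file where no application-specific block was selected, leaving the
# "no values provided" placeholder text behind.
show_unset_blocks() {
    grep -n 'no values provided' "$1"
}

# Example: show_unset_blocks Build/analytical.f90
```

The reported line numbers should match those in the compiler errors, which points to the routine whose application block was skipped.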

You must be doing something wrong. The upwelling application has been run by thousands of users worldwide. You may download the ROMS test repository. WikiROMS gives clear instructions on how to download this repository:

svn checkout https://www.myroms.org/svn/src/test MyTest

Everything needed to run all the applications provided with ROMS is there. It is provided to help users become familiar with the ROMS structure. You just need to read the information provided in WikiROMS. There is a lot to learn. Otherwise, it will be extremely difficult to set up your application.

Some of the messages posted in this thread indicate to me a lack of familiarity with UNIX, Fortran, and compiling. Unfortunately, we cannot teach you that. Perhaps you can find someone at your institution who can help you get started.

tony1230
Posts: 87
Joined: Wed Mar 31, 2010 3:29 pm
Location: SKLEC,ECNU,Shanghai,China

Re: The Segmentation fault of ROMS 3.6

#16 Unread post by tony1230 »

Thank you arango

Yes, my carelessness caused the second problem mentioned above, but I still have the first one. These days I am also running a real application with the Fennel biological CPP options on, and it blows up: without BIO_FENNEL the model runs smoothly, but if I #define BIO_FENNEL, then after I qsub the job it gives no hints, just the following:
rank 54 in job i c0124_50128 caused collective abort of all ranks
exit status of rank 54: killed by signal 9
...
rank 9 in job i c0124_50128 caused collective abort of all ranks
exit status of rank 9: killed by signal 9
I don't know whether the problem comes from something wrong in my cluster or something wrong in the ROMS 3.4 code.

Any suggestion or reply would be appreciated.

Weiwei Shou
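
A small sketch for reading the "killed by signal 9" lines above. Signal 9 is SIGKILL, which is sent from outside the process, so the model gets no chance to print its own error. One possible cause, assumed here and not confirmed anywhere in the thread, is the kernel or batch system killing ranks that exceed a memory limit, which would fit the larger footprint with BIO_FENNEL defined.

```shell
#!/bin/bash
# Translate signal number 9 into its name.
kill -l 9

# Per-process limits in effect for this shell and for jobs started from it.
ulimit -a
```

Running the same two commands inside the qsub job script shows the limits the compute nodes actually apply, which can differ from those on the login node.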
