segmentation fault for large grid

Report or discuss software problems and other woes

Moderators: arango, robertson

wendy
Posts: 15
Joined: Thu Jan 06, 2005 5:19 pm
Location: Institute of Ocean Sciences

segmentation fault for large grid

#1 Unread post by wendy »

I've been getting a segmentation fault when I try to run my ROMS application, but I have always been able to run the upwelling example without difficulty, so I adjusted the upwelling example to see where the problem lay. I've found that by changing only the horizontal dimensions of the grid in the upwelling case, I get a segmentation fault, and it occurs earlier in the initialization for larger grids: upwelling runs fine until Lm*Mm is roughly 17500, where I get a seg fault right after the creation of the diagnostics file; this repeats until Lm*Mm is roughly 25600, where the seg fault moves to just after the 'Power filter parameters' line. I haven't tested much beyond that.

The only non-standard aspect of my setup I can think of is that I'm using an xlf compiler on Linux; there was no compiler file for this option, so I had to create my own using the same flags as in version 2.2. I had a similar problem with 2.2, though: for tiles that were approximately square the model would run fine, but for tiles that were too rectangular I would get a seg fault. I didn't worry too much about it at the time since I could get it to run.

Also, I'm using OpenMP.

Is anyone else compiling with xlf on Linux, and if so would you mind sharing your compiler file?
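In case it is useful for comparison, here is roughly the shape of the compiler file I put together (a sketch only: the layout mimics the stock Compilers/*.mk files, and the flags are just the ones I carried over from 2.2, so treat all of it as assumptions):

       FC := xlf95_r                 # IBM XL Fortran, thread-safe driver
   FFLAGS := -q64 -qsuffix=f=f90 -qmaxmem=-1
      CPP := /usr/bin/cpp
 CPPFLAGS := -P -traditional
       LD := $(FC)
  LDFLAGS := -q64

ifdef USE_OPENMP
   FFLAGS += -qsmp=omp               # enable the XL OpenMP runtime
endif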

If anyone has any other thoughts of where the problem could be, they would be much appreciated.

Thanks!
Wendy

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

#2 Unread post by kate »

If you are using N=16 in these tests, you are having trouble with smaller grids than I would have expected, though UPWELLING does carry the biological tracers. I figure the BENCHMARK2 problem takes about 1 GB, so it should be addressable by 32-bit systems, while BENCHMARK3 is closer to 5 GB and needs 64-bit addressing.

Did you try this flag? LDFLAGS += -bmaxdata:0x70000000
The compiler default for xlf on AIX is more like 256 MB and this expands it. Other things to check include:

f2n1 1% ulimit -a
time(seconds) 7200
file(blocks) unlimited
data(kbytes) 2097152
stack(kbytes) unlimited
memory(kbytes) 2097152
coredump(blocks) 20000
nofiles(descriptors) 2000

or

cygnus.arsc.edu 540% limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 10240 kbytes
coredumpsize 0 kbytes
memoryuse unlimited
vmemoryuse unlimited
descriptors 1021
memorylocked 32 kbytes
maxproc 40960

(limit is a shell built-in for the csh family.) I had to change my limits in the batch script under AIX. Check both datasize and memoryuse.
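In my case that meant something like this near the top of the job script (csh syntax; a sketch only, and oceanO is assumed here as the name of the OpenMP executable):

#!/bin/csh
limit datasize unlimited
limit stacksize unlimited
limit memoryuse unlimited
./oceanO < ocean_upwelling.in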

m.hadfield
Posts: 521
Joined: Tue Jul 01, 2003 4:12 am
Location: NIWA

#3 Unread post by m.hadfield »

Hi Kate

Have you tried increasing the number of tiles?

For BENCHMARK1 with Intel Fortran (an old version, 8.1) I can use 1x4 tiling in serial mode, but for OpenMP I have to increase this to something like 8x4.
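In ocean_benchmark1.in terms, that is something like (values illustrative):

NtileI == 8        ! I-direction partition
NtileJ == 4        ! J-direction partition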

guille
Posts: 1
Joined: Tue Oct 31, 2006 3:41 pm
Location: IMEDEA (csic-uib)

segmentation fault

#4 Unread post by guille »

Hi Kate
Try increasing the stack memory:
ulimit -s unlimited
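If the crash persists under OpenMP, it may be the slave-thread stacks rather than the process stack. With IBM's XL runtime those are controlled through the XLSMPOPTS environment variable, for example (size in bytes; the value here is only an illustration):

export XLSMPOPTS="stack=268435456"   # ~256 MB per OpenMP slave thread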

wendy
Posts: 15
Joined: Thu Jan 06, 2005 5:19 pm
Location: Institute of Ocean Sciences

#5 Unread post by wendy »

Hi Kate, thanks for the suggestions. I tried compiling with the -bmaxdata option, but got the error:

ld: invalid BFD target: 'maxdata:0x70000000'
I don't really understand this (and Google didn't help); my knowledge in this area is a bit slim, but the computer guy here said that flag is for 32-bit executables, and I'm compiling 64-bit (with the -q64 flag). Would this inconsistency cause that error?

ulimit -a gave me:
wigginsw@pactsc:~> ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 125440
virtual memory (kbytes, -v) unlimited

This seems to be fine compared to yours. The kernel parameter maxstacksize is 32 GB.

I have tried increasing the number of tiles, but without much luck. I'm currently testing upwelling with a grid of 504x532x18 (the same size as my ROMS application), and even with tiles of 126x133 points (i.e., a 4x4 partition), I still get a seg fault.

I'm currently trying to track down where the seg fault is occurring, but if there are any other thoughts or suggestions, that would be great!

Thanks,
Wendy

cvl
Posts: 18
Joined: Tue Jun 03, 2003 7:39 pm

OpenMP broken?

#6 Unread post by cvl »

This is somewhat consistent with my experience. I did get OpenMP BENCHMARK1 working at 4x4 on x86_64 Fedora Core 6, Intel Fortran, 4 cores, but consistently got segmentation faults with fewer tiles. I also ran into other issues that implied to me that the OpenMP code is broken.

My suggestion would be to punt and run the MPI version. Scaling has been poor on this machine, but I speculate that's a result of limited memory bandwidth rather than poor parallelization.
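With the ROMS build system that switch is usually just a rebuild (a sketch; it assumes the top-level makefile accepts command-line overrides and that an MPI environment is already configured):

make clean
make USE_MPI=on                      # builds the distributed-memory executable
mpirun -np 16 ./oceanM < ocean_benchmark1.in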

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University
Contact:

#7 Unread post by arango »

Obviously, your problem is too large for your computer. You cannot use OpenMP for this problem because of your memory limitations. As far as I know, OpenMP in ROMS is fine. The issue here is cache memory and not a bug in ROMS.

The design of ROMS for shared-memory (OpenMP) and distributed-memory (MPI) is quite different. In shared-memory, all arrays are allocated with the global dimensions of the grid regardless of the tile partition. If your problem is large, you need to make sure that you have the memory required to run such a program on your computer. Otherwise you get into trouble, and it usually manifests as segmentation faults. If your problem is not too large, you can play with some of the suggestions mentioned above, up to a point; however, you will be penalized with a lot of memory page faults and your simulation will take much longer to run. If you exceed the available memory by too much, there is nothing you can do but make your problem smaller.
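To put rough numbers on it for the grid discussed in this thread (the array count is a ballpark, not an exact inventory of ROMS state):

  504 x 532 x 18 points x 8 bytes   ~ 38.6 MB per 3D global array
  on the order of 100 such arrays   ~ 3.9 GB resident, before I/O buffers

So a shared-memory run at that size sits near the 4 GB mark regardless of the tiling.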

For this kind of problem it is much more advantageous to run in distributed-memory, because the arrays are only allocated to the size of the tile plus the ghost points. This reduces the memory requirements substantially. You can then play with the tile partition so that each tile fits in cache. There is an optimal partition for a particular problem, which you need to find. You cannot make the tiles too small, though, because you will be penalized by the communication between tiles.
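For example (illustrative numbers; in distributed-memory mode NtileI * NtileJ must equal the number of MPI processes):

NtileI == 8        ! I-direction partition, tiles of ~63 interior points
NtileJ == 8        ! J-direction partition, tiles of ~67 interior points

On 64 processes, each process then allocates arrays roughly 64 times smaller than the global ones.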

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

#8 Unread post by kate »

Did you try this flag? LDFLAGS += -bmaxdata:0x70000000
The compiler default for xlf on AIX is more like 256 MB and this expands it.
Sorry, this is for 32-bit compiling. For 64-bit, you shouldn't have to worry about this flag. So it seems the advice to try other tilings or just MPI would be more to the point. I don't have a lot of experience with OpenMP (or luck with it either).

wendy
Posts: 15
Joined: Thu Jan 06, 2005 5:19 pm
Location: Institute of Ocean Sciences

#9 Unread post by wendy »

Thanks for all of the information. I gave the problem to one of the computer guys here to explore what the limitations of our machine are. He has asked me to post the following:

---------

I am running into a "segmentation fault" with this ocean_upwelling.in file:

Lm == 504 ! Number of I-direction INTERIOR RHO-points
Mm == 508 ! Number of J-direction INTERIOR RHO-points
N == 18 ! Number of vertical levels

NtileI == 8 ! I-direction partition
NtileJ == 8 ! J-direction partition

1. I have changed NtileI and NtileJ to many values (multiples of threads=32) to no avail.
2. If I change Mm to 507, it works, running the ocean model with memory usage close to 4 GB.
3. Machine: IBM p5 series, SUSE (SLES) not AIX, 64-bit OS and 64-bit code, xlf_r compiler, 32 GB physical memory, 16 GB swap space on RAID, no other jobs running.

Any suggestions for getting to my desired configuration of Lm=504, Mm=532 without a segmentation fault? I am wondering whether there are any 4 GB limits on this 64-bit IBM machine.

----------

Thanks,
Wendy
