MPI error ? and how to run MPI ?

FengZhou
Posts: 52
Joined: Wed Apr 07, 2004 10:48 pm
Location: 2nd Institute of Oceanography,SOA

MPI error ? and how to run MPI ?

#1 Unread post by FengZhou »

I run into trouble when I try to run ROMS with MPI on 2 or more nodes, but
it works with only one node. The basic configuration of my model is:

parameter (LLm0=167, MMm0=244, N=20) ! <--East China Sea

parameter (NP_XI=2, NP_ETA=3, NNODES=NP_XI*NP_ETA)
parameter (NSUB_X=1, NSUB_E=2, NPP=1)

so the horizontal grid is divided among 2*3 = 6 nodes.

The model stopped with this output on screen:
...
GET_INITIAL -- Processing data for time = 15.00 record = 1
bm_list_5711: (1.774467) wakeup_slave: unable to interrupt slave 0 pid 5695
p2_2288: p4_error: interrupt SIGSEGV: 11
p5_1380: p4_error: interrupt SIGSEGV: 11
p1_2313: p4_error: interrupt SIGSEGV: 11
p4_7908: p4_error: interrupt SIGSEGV: 11
p3_4358: p4_error: interrupt SIGSEGV: 11

Should something else be taken into account? And what is the problem behind this
p4_error?

Could anyone tell me what's wrong?

Could anyone post the basic steps for starting an MPI run of ROMS?

Many thanks!

ZHOU

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: MPI error ? and how to run MPI ?

#2 Unread post by kate »

FengZhou wrote: parameter (LLm0=167, MMm0=244, N=20) ! <--East China Sea

parameter (NP_XI=2, NP_ETA=3, NNODES=NP_XI*NP_ETA)
parameter (NSUB_X=1, NSUB_E=2, NPP=1)

so the horizontal grid is divided among 2*3 = 6 nodes.

ZHOU
I'm not familiar with these variables at all. In the ROMS I have, you simply compile with MPI enabled (-DMPI plus appropriate compiler switches). Then in the input file, set:

NtileI == 2 ! I-direction partition
NtileJ == 3 ! J-direction partition

and make sure to ask MPI for 6 processes (NtileI * NtileJ = 2 * 3). The way to ask for 6 processes depends on your system. I run in batch mode on an IBM or Cray, where the number of processes is one of the job script parameters. If you run interactively with mpirun, the number is one of its options. That's all you need. The number is not compiled in with a modern east-coast ROMS.
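
As a rough illustration of the bookkeeping described above: the number of MPI processes requested at launch has to equal NtileI * NtileJ. The sketch below is not ROMS code. The function names, the even-split rule for tile sizes, and the executable and input file names (`./oceanM`, `ocean.in`) are assumptions made for illustration, and the grid dimensions are borrowed from the first post.

```python
def mpi_process_count(ntile_i: int, ntile_j: int) -> int:
    """Number of MPI processes needed for an ntile_i x ntile_j partition."""
    if ntile_i < 1 or ntile_j < 1:
        raise ValueError("tile counts must be positive integers")
    return ntile_i * ntile_j


def split_evenly(n_points: int, n_tiles: int) -> list[int]:
    """Split n_points into n_tiles chunks whose sizes differ by at most one.

    This is only an illustrative rule; the partitioning used inside ROMS
    may differ in detail.
    """
    base, extra = divmod(n_points, n_tiles)
    return [base + 1 if k < extra else base for k in range(n_tiles)]


# Values from the thread: a 2 x 3 partition of a 167 x 244 grid.
NTILE_I, NTILE_J = 2, 3
nprocs = mpi_process_count(NTILE_I, NTILE_J)
xi_sizes = split_evenly(167, NTILE_I)
eta_sizes = split_evenly(244, NTILE_J)

# Hypothetical launch line; the executable and input file names are assumed.
launch = ["mpirun", "-np", str(nprocs), "./oceanM", "ocean.in"]

if __name__ == "__main__":
    print("processes:", nprocs)
    print("tile sizes in I:", xi_sizes)
    print("tile sizes in J:", eta_sizes)
    print(" ".join(launch))
```

With these values the launch line requests 6 processes; on a batch system the same count would go into the job script instead.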

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

#3 Unread post by arango »

You are using the UCLA version of ROMS. Perhaps you can get advice from someone in their group.

Good luck, H

FengZhou
Posts: 52
Joined: Wed Apr 07, 2004 10:48 pm
Location: 2nd Institute of Oceanography,SOA

Another question: how to start MPI with ROMS/Rutgers 2.2

#4 Unread post by FengZhou »

Many thanks to both of you. Yes, I am running the west coast and east coast versions of ROMS at the same time. The former is easy to start and I have powerful preprocessing tools (Roms_tools) for it. The results look good, but I can't run it with MPI; the problem is as described above. Anyway, thank you so much!


I am now testing ROMS/Rutgers 2.2 with the Damee input data, and it works in serial mode. To save time I would like to always run with MPI. Could anyone tell me how to get started with MPI?

I modified cppdefs.h like this:
MPI := on
IFORT ?= mpif90 (mpi)
and compiled with make -f makefile.mpi, which produced this error:

/usr/bin/cpp -P -traditional -DLINUX -I./netcdf_ifc -DMPI -DLINUX -DI686 -DMPIF90 -IInclude -INonlinear -IDrivers Modules/mod_kinds.F > mod_kinds.f90
Bin/cpp_clean mod_kinds.f90
mpif90 -c -g -check bounds mod_kinds.f90
/opt/mpich-1.2.5.10-ch_p4-gcc/bin/mpif90: line 332: eval: -c: invalid option
eval: usage: eval [arg ...]
make: *** [mod_kinds.o] Error 2
rm mod_kinds.f90

What does this error mean?

ZHOU

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

#5 Unread post by kate »

That's a compiler question, not a ROMS question. I'm afraid I've never used mpich, but you should say what computer you are running on and with what compiler. Maybe check that mpich is configured correctly by building some trivial MPI example.
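
One way to sanity-check the wrapper before building a full MPI example is to ask it which compiler command it would run. The sketch below assumes the wrapper accepts a `-show` option (MPICH wrappers document one; other MPI distributions may use a different flag such as `-showme`), and the helper names are hypothetical. If the reported command starts with an option instead of a compiler name, the wrapper may have no Fortran 90 compiler configured, which would be one possible explanation for the `eval: -c: invalid option` message. That reading is an assumption, not something confirmed in this thread.

```python
import shutil
import subprocess


def show_wrapper_command(wrapper="mpif90", flag="-show"):
    """Return what the MPI compiler wrapper says it would run, or None.

    None means the wrapper was not found on PATH. The "-show" flag is an
    assumption about the installed MPI; adjust it for your distribution.
    """
    path = shutil.which(wrapper)
    if path is None:
        return None
    result = subprocess.run(
        [path, flag], capture_output=True, text=True, check=False, timeout=30
    )
    return (result.stdout + result.stderr).strip()


def starts_with_option(command_line):
    """Heuristic: True if the command is empty or begins with an option.

    A healthy wrapper is expected to print a compiler name first.
    """
    tokens = command_line.split()
    return not tokens or tokens[0].startswith("-")


if __name__ == "__main__":
    shown = show_wrapper_command()
    if shown is None:
        print("mpif90 not found on PATH")
    elif starts_with_option(shown):
        print("wrapper does not name a compiler:", repr(shown))
    else:
        print("wrapper would run:", shown)
```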

WangLei

#6 Unread post by WangLei »

Zhou Feng,

Try checking the Linux_mpif90.mk file in the Compilers subdirectory.

In my case, compilation failed while the following line was present. After getting rid of it, compilation was OK.

# LDFLAGS := -Vaxlib

Wang Lei
