Problem with mpich



#1 Post by mathieu

Dear All,

I tried to run ROMS with MPI on an HP cluster of 4 computers, each with 2 hyperthreaded Xeons.

The problem occurs in the mp_exchange procedure. After the calls to mpi_irecv and mpi_send, mpi_wait is called and the process blocks there.

However, only the process of rank 0 (i.e. the master) ever runs mp_exchange. The process of rank 1 never enters this procedure. Hence the blocking, which happens whenever NtileI*NtileJ>1. I have checked with different applications and the blocking point is always this mp_exchange.
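
To make the symptom concrete, here is a toy model of it. This is only a sketch using Python threads and queues, not MPI and not ROMS code; the names exchange and run, the "halo" message, and the timeout are all made up for illustration. It mimics the send followed by a wait on the receive, and shows that rank 0 gets nothing back if rank 1 never enters the exchange. A real mpi_wait has no timeout, so there the wait would simply never return.

```python
import queue
import threading


def exchange(rank, inbox, outbox, timeout):
    """Toy stand-in for the irecv / send / wait sequence (hypothetical helper)."""
    outbox.put(("halo", rank))  # plays the role of mpi_send
    try:
        # Plays the role of mpi_wait on the posted receive.
        return inbox.get(timeout=timeout)
    except queue.Empty:
        # Assumption for the sketch: give up after `timeout` seconds.
        # A real mpi_wait would block here indefinitely.
        return None


def run(peer_participates, timeout=0.5):
    """Run two toy 'ranks'; rank 1 joins the exchange only if asked to."""
    to_rank0 = queue.Queue()
    to_rank1 = queue.Queue()
    results = {}

    def rank0():
        results[0] = exchange(0, to_rank0, to_rank1, timeout)

    def rank1():
        if peer_participates:
            results[1] = exchange(1, to_rank1, to_rank0, timeout)

    threads = [threading.Thread(target=rank0), threading.Thread(target=rank1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run(True))   # both ranks receive the other's message
    print(run(False))  # rank 0 receives nothing because rank 1 never joins
```

In the second call rank 0 comes back with None, which corresponds to the hang I see in mpi_wait.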

It seems reasonably clear to me that the problem is related to the system chosen, i.e. Debian Linux 2.6.8-2-686-smp with mpich version 1.2.6, and not to the ROMS code itself.

Has anyone encountered similar problems with mpich?
