Hi all,
I set up a Linux box with an Intel i7, Fedora 11 (64-bit), ifort 11, and mpich2. I could compile and run the test case smoothly. Happy! But something weird happens. Whenever I edit anything in the "*.in" file (like NHIS or NINFO, nothing important) and rerun the previously successful case, it blows up and NaN values show up within several steps. If I then run it again without changing anything, sometimes it works and sometimes it needs a third try.
We also have a 16-node cluster with 32 AMD CPUs, Red Hat Enterprise 3, and ifort 9. I ran the same case there (same ROMS version, same CPP options...) but got different results. Taking salinity as an example, the discrepancy at some points can be as high as 3 psu.
What do you think causes this? The Linux box configuration? ROMS itself? The different Intel Fortran versions? Or do different platforms and operating systems simply give different results?
Sorry, I'm totally confused, and the above may confuse you too.
Thank you.
Peng
Weird experience with ROMS3.72
- Posts: 64
- Joined: Mon Oct 17, 2005 2:02 am
- Location: Institute of Oceanology,Chinese Academy of Sciences
Re: Weird experience with ROMS3.72
Below is part of a discussion from Martin Schmidt on the MOM4 mailing list. I guess it may help:
The modern Intel architecture allows for several models of how floating-point operations are defined. The default is a "sloppy" mode where speed gain is preferred over reproducibility.
Accurate results are obtained with "-O -fp-model strict".
Otherwise results may differ even between repeated runs with the same binary.
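To illustrate why a relaxed floating-point model can change results, here is a minimal Python sketch (not ROMS or ifort code; the values are chosen only for illustration). Floating-point addition is not associative, so if an optimizer or a parallel reduction reorders a sum, the rounded result can differ:

```python
# Illustrative values only; they are not taken from any model run.
a, b, c = 1.0e16, -1.0e16, 1.0

# Same three numbers, two evaluation orders.
left_to_right = (a + b) + c   # a + b is exactly 0.0, then + 1.0
reordered = a + (b + c)       # b + c rounds back to -1.0e16, so c is lost

print(left_to_right, reordered)
print(left_to_right == reordered)
```

In a time-stepping model, a difference in the last bit like this can grow over many steps, which is one plausible way two builds of the same code drift apart.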
Re: Weird experience with ROMS3.72
That makes a lot of sense. But it's still weird to me that, in order to get a certain case running, I have to try two or even three times. Just now I submitted one case three times, and after blowing up twice it is now running smoothly. I don't want to do this sort of thing for every case.
You know, I wasted several days trying to figure out what makes my model blow up. I checked the boundary, the forcing, the initial... And in the end, it turns out to have something to do with the machine itself.
Re: Weird experience with ROMS3.72
Can you get consistent results using the strict compile flag?
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
Re: Weird experience with ROMS3.72
I get identical results with the -fp-model precise flag. This is the flag used in the distributed configuration files.
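One way to check whether two runs are really identical is to compare the output values bit for bit rather than by eye. The sketch below is only an assumption about how such a check could look: the function names and the salinity lists are hypothetical, and real output would first have to be read from the model's history files.

```python
import struct

def bitwise_equal(x, y):
    """True only if the two doubles have identical bit patterns."""
    return struct.pack("<d", x) == struct.pack("<d", y)

def compare_fields(run_a, run_b):
    """Return (number of differing values, largest absolute difference)."""
    n_diff = 0
    max_diff = 0.0
    for x, y in zip(run_a, run_b):
        if not bitwise_equal(x, y):
            n_diff += 1
            max_diff = max(max_diff, abs(x - y))
    return n_diff, max_diff

# Hypothetical salinity values (psu) from three runs of the same case.
salt_run1 = [34.5, 35.0, 33.25]
salt_run2 = [34.5, 35.0, 33.25]
salt_run3 = [34.5, 35.0, 36.25]

print(compare_fields(salt_run1, salt_run2))  # identical runs
print(compare_fields(salt_run1, salt_run3))  # one point differs
```

A count of zero differing values means the two runs reproduce exactly; anything else shows how many points differ and by how much.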