Hello, I just ran the upwelling test case on 3 different architectures, and the results are not what I expected.
I used OpenMP with the Intel compilers. I installed zlib, hdf5, m4, netcdf, and netcdff.
(1) Intel(R) Xeon(R) CPU E5-2650 v1 @ 2.00GHz, 16 cores. 4*4 grid: wall time 17s
(2) Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, 16 cores. 4*4 grid: wall time 27s
(3) Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz, 64 cores (KNL). 8*8 grid: wall time 28s
Any idea why the oldest, slowest chip could be fastest, and by such a wide margin? Thanks.