Hello All,
I have investigated the imbalance within nf_fwrite2d.F further, and here are the results:
Figure 1: Further analysis of the parallel writing imbalance
Observations and Conclusions:
1. The imbalance can be traced down the call chain shown below. It originates in the nf90_put_var calls inside nf_fwrite2d.F during the define phase.
Total CPU time ==> Output time ==> Define file ==> Wrt_Info ==> nf_fwrite2d.F ==> nf90_put_var (new result; see Figure 1)
2. The nf90_put_var calls inside nf_fwrite2d.F are imbalanced and take significant time during the define phase (see Figure 1), whereas the same calls during the write phase are balanced and take insignificant time (figure not included because the time is insignificant).
3. The nf90_put_var calls inside nf_fwrite3d.F during the write phase are balanced but take significant time (see Figure 2).
4. Across different runs with exactly the same setup, the imbalance pattern changes: the overshoots occur on different PEs (see Figure 3).
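For anyone who wants to make the same kind of comparison, here is a rough sketch of how the imbalance and the overshooting PEs in observation 4 could be quantified from per-PE timings. It is only an illustration: the function names, the 2x-median threshold, and the timing values are all hypothetical and are not the measured data behind the figures.

```python
from statistics import mean, median

def imbalance_factor(times):
    """Slowest PE's time divided by the mean time; 1.0 means perfectly balanced."""
    return max(times) / mean(times)

def overshooting_pes(times, threshold=2.0):
    """Indices of PEs whose time exceeds threshold x the median time.

    The threshold is an arbitrary choice made for this illustration.
    """
    cutoff = threshold * median(times)
    return [pe for pe, t in enumerate(times) if t > cutoff]

# Made-up per-PE times in seconds for the nf90_put_var calls of two runs
# with an identical setup (NOT measured values).
run_a = [1.0, 1.0, 5.0, 1.0]
run_b = [1.0, 5.0, 1.0, 1.0]

for name, times in (("run_a", run_a), ("run_b", run_b)):
    print(name, imbalance_factor(times), overshooting_pes(times))
```

With these made-up numbers both runs have the same imbalance factor, but the overshoot sits on a different PE in each run, which is the kind of pattern described in observation 4.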
My Questions:
1. Why are the nf90_put_var calls inside nf_fwrite2d.F imbalanced during the define phase but balanced during the write phase, even though the data is distributed uniformly in both cases?
2. Is there any likely cause for this imbalance? I am working on resolving it in order to improve the overall I/O performance of the ROMS model.
Thanks,
Koushik Sen