Dear All,
I am new to the business of 4D-Var. I am trying to implement IS4DVAR in the Northwest Pacific Ocean. My grid is 601x401x30.
I am running the experiments on an Azure virtual machine with 20 cores, configured with Ubuntu Linux. The WC3 test went well on this machine with both the exact and the randomization methods. The experiment on my grid went well with the randomization method, but the experiment with the exact method is taking forever to complete.
I wonder if something is wrong, and my first guess is my eastern boundary, which is partially open.
Is there a problem with a partially open boundary for the computation of the normalization coefficients?
In the c4dvar.in, the Lobc flag for this boundary is set to "T" and, in the ocean.in, the LBC keywords are the same as for the other open boundaries.
Thanks in advance.
Ivan
Normalization coefficients
Ivan Dias Soares
Senior Research Scientist
Atlantech Environmental Sciences
ivan@atlantech.com.br
https://www.atlantech.com.br
Florianopolis, SC, BRAZIL
Re: Normalization coefficients
Hi Ivan,
I think nothing is wrong; it just takes time.
Your grid is not that small and, depending on your de-correlation scales and given the small number of cores, it takes quite a while.
My recipe is to compute the exact solution only (!) for zeta (it is 2D, so it should be faster) and then compare that with the randomized solution, just to get an idea of the (mis)match and of what RAND number I should use to capture the features of the bathymetry/domain. If you want, you can do that for ubar/vbar as well. This is what I do for big grids; for smaller ones I can do full exact solutions.
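For the comparison step, a quick check like the Python sketch below is enough; it assumes the exact and randomized normalization files both store the 2D coefficients in a variable named "zeta", and the file names are just placeholders (check yours with ncdump -h):

```python
# Rough sketch of the zeta comparison (not ROMS code). Variable and file
# names are assumptions -- adjust to whatever your normalization files use.
import numpy as np
from netCDF4 import Dataset

with Dataset("nrm_exact.nc") as nc_e, Dataset("nrm_random.nc") as nc_r:
    zeta_exact = np.ma.masked_invalid(np.squeeze(nc_e.variables["zeta"][:]))
    zeta_rand  = np.ma.masked_invalid(np.squeeze(nc_r.variables["zeta"][:]))

diff = zeta_rand - zeta_exact
rel  = np.ma.abs(diff) / np.ma.maximum(np.ma.abs(zeta_exact), 1e-12)

print("max |difference|    :", float(np.ma.abs(diff).max()))
print("RMS difference      :", float(np.sqrt(np.ma.mean(diff ** 2))))
print("median relative err :", float(np.ma.median(rel)))
```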
There is an option to just create and/or write the NetCDF files, so you can do multiple parts in parallel.
For example, in one sweep only create the NetCDF files for all the variables separately, and then in the next multiple submits calculate and write each variable into its own file. At the end just combine them into one file and use it. That way you can dramatically speed things up.
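For the final combine step, a minimal Python/netCDF4 sketch could look like the one below; the way the variables are split across files, and all the file names, are placeholder assumptions for however you divide the work:

```python
# Rough sketch (assumed file/variable split, not ROMS code): merge the
# per-variable normalization files from separate runs into a single file.
import shutil
from netCDF4 import Dataset

# One file per group of state variables -- placeholder names.
parts = {
    "nrm_zeta.nc": ["zeta"],
    "nrm_2d_momentum.nc": ["ubar", "vbar"],
    "nrm_tracers.nc": ["temp", "salt"],
}

merged = "nrm_combined.nc"
shutil.copyfile(next(iter(parts)), merged)          # start from the first part

with Dataset(merged, "a") as out:
    for fname, names in parts.items():
        with Dataset(fname) as src:
            for name in names:
                if name in out.variables:            # already copied
                    continue
                var = src.variables[name]
                for dim in var.dimensions:           # copy any missing dimensions
                    if dim not in out.dimensions:
                        out.createDimension(dim, len(src.dimensions[dim]))
                fill = getattr(var, "_FillValue", None)
                new = out.createVariable(name, var.dtype, var.dimensions,
                                         fill_value=fill)
                new.setncatts({a: var.getncattr(a) for a in var.ncattrs()
                               if a != "_FillValue"})
                new[:] = var[:]
```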
Note: DA is quite demanding on CPU/memory, so that grid with 20 cores is hardly doable.
Cheers,
Ivica
Re: Normalization coefficients
Hi Ivica,
Thanks a lot for the hints. I guess you are right, 20 cores is not good enough to run a grid of this size (601x401x30).
I am doing as you suggested, running the exact and the randomization methods for the 2D variables only to get some insight into my case. Thanks for the hint.
One last question: do you know anything about the balance operators? I have noticed that the computation of the normalization coefficients becomes much more time consuming when I set the balance operator logical switches to "T" for all state variables. What is the advantage of using these balance operators? Is there a problem if I set all of them to "F"?
cheers,
Ivan
Ivan Dias Soares
Senior Research Scientist
Atlantech Environmental Sciences
ivan@atlantech.com.br
https://www.atlantech.com.br
Florianopolis, SC, BRAZIL
Re: Normalization coefficients
Regarding cores:
I've run some benchmarks on a Cray XC40 (dragonfly topology, ~70 Gb/s InfiniBand), and they agree well with what Kate wrote a long time ago: the sweet spot is tiles of about 20x20 points. In other words, you get a reasonable speedup per CPU cost as long as your tiles are around 20x20 or larger. For your 600x400 grid that means you can go up to 30x20 = 600 cores. If you are not in a hurry, then near-linear speedup is easy up to 192 cores even with slower networks (in my case Mellanox FDR 56 Gb/s, 24 real CPUs x 8 nodes), which is a reasonable price to pay (I pay per node, so 8 nodes). Note that your grid is more or less the same size as mine.
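To make that arithmetic concrete, here is a throwaway Python sketch (nothing ROMS-specific; the 20-point minimum is just the rule of thumb above, and it only considers tile counts that divide the grid evenly, which ROMS does not actually require):

```python
# Throwaway sketch of the tile arithmetic above (not ROMS code).
Lm, Mm = 600, 400        # interior points of the 601x401 grid
min_tile = 20            # smallest tile edge that still scales well

options = []
for ntile_i in range(1, Lm // min_tile + 1):
    for ntile_j in range(1, Mm // min_tile + 1):
        if Lm % ntile_i == 0 and Mm % ntile_j == 0:   # even partitions only
            options.append((ntile_i, ntile_j, ntile_i * ntile_j,
                            Lm // ntile_i, Mm // ntile_j))

print("NtileI  NtileJ  cores  tile size")
for ni, nj, cores, tx, ty in sorted(options, key=lambda r: -r[2])[:5]:
    print(f"{ni:6d}  {nj:6d}  {cores:5d}  {tx}x{ty}")
# top line: NtileI=30, NtileJ=20 -> 600 cores with 20x20 tiles
```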
Regarding the balance operators, well, you'll have to figure that out yourself and read Andy's & Hernan's papers.
I am not using them, as I have many islands and rough topography...
Cheers,
Ivica
Re: Normalization coefficients
Thanks a lot.
Your comments have helped me a lot.
I will let you know when I succeed in running IS4DVAR.
cheers
Ivan
Ivan Dias Soares
Senior Research Scientist
Atlantech Environmental Sciences
ivan@atlantech.com.br
https://www.atlantech.com.br
Florianopolis, SC, BRAZIL