Hi friends,
I have the following questions about restarting the model.
Suppose I submit a long-term job (10 years) on an HPC system, saving his and avg files for each month.
Partway through, because of a technical problem on the HPC system, the model stops at some time step after writing a certain number of avg files. Suppose I have the first 12 months of avg and his NetCDF files and the model stops there.
So my questions are:
(1) How do I restart the model?
(2) How can I restart the model from where it stopped and append the data after the time step at which it stopped?
I have already tried this, but I could not sort out the time step issue. For example, my model stops at time step 10000. Now I want to restart the model from that time step onwards, not from the first time step, and the log file should show this.
Any comments or suggestions?
Thank you!
about restarting the model
Re: about restarting the model
Are you also writing to a restart file?
I typically need to change three things from the initial run to the restart runs:
- Change nrrec from 0 to -1 to read newest record.
- Change ininame to point to the restart file instead of the initial conditions file.
- Change ldefout to false to append to existing output files.
Oh, and after every chunk I change:
- ntimes
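The changes listed above can be sketched as a small script that rewrites the input file text for a restart run. This is only an illustration, not a tool that ships with ROMS: it assumes the input file uses `KEYWORD == value` lines, and the file names `ocean_ini.nc` and `ocean_rst.nc` and the NTIMES values are hypothetical.

```python
import re

def set_param(text, key, value):
    """Replace the value on a 'KEY = value' or 'KEY == value' line."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*==?[ \t]*)(\S+)",
        re.MULTILINE | re.IGNORECASE,
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count == 0:
        raise KeyError(f"{key} not found in input text")
    return new_text

def make_restart_input(text, restart_file, ntimes):
    """Apply the restart changes: NRREC, ININAME, LDEFOUT and NTIMES."""
    text = set_param(text, "NRREC", "-1")           # read the newest restart record
    text = set_param(text, "ININAME", restart_file) # start from the restart file
    text = set_param(text, "LDEFOUT", "F")          # append to existing output files
    text = set_param(text, "NTIMES", str(ntimes))   # new end of run, in time steps
    return text

# Hypothetical fragment of an initial-run input file.
initial = """\
     NTIMES == 8640
      NRREC == 0
    LDEFOUT == T
    ININAME == ocean_ini.nc
"""

restart = make_restart_input(initial, "ocean_rst.nc", 17280)
print(restart)
```

Commented-out lines (starting with `!`) are left alone because the keyword must be the first thing on the line.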
Re: about restarting the model
Hi Kate,
Thanks for the suggestion, but I am still confused about ntimes when restarting the model.
Are you also writing to a restart file?
Yes
I typically need to change three things from the initial run to the restart runs:
* Change nrrec from 0 to -1 to read newest record.
YES
* Change ininame to point to the restart file instead of the initial conditions file.
YES
* Change ldefout to false to append to existing output files.
Ok
Oh, and after every chunk I change:
* ntimes
I do not understand this part. It would help if you could give an example.
For example, I am writing monthly averages for 10 years, with a separate file for each month, so the files look like:
his_rmed16_grd_e_0006.nc
avg_rmed16_grd_e_0006.nc
dia_rmed16_grd_e_0006.nc
The model stopped at this record, and I now need to restart it from here onwards.
How should I set ntimes when restarting the model?
Thank you.
Re: about restarting the model
Dear all -
I also have some doubts about restart/perfect_restart with ROMS. I have read the information I found here and I think I understand the theory behind it. Nevertheless, I performed three experiments to evaluate how perfect_restart does (or does not) affect a short run:
Exp. A: a 30-day run with 2 restarts, one every 10 days.
Exp. B: the same as Exp. A, but using an executable compiled with the 'perfect_restart' option activated.
Exp. C: a continuous run over the same 30-day period.
From what I read, I expected the last snapshots from Exp. B and Exp. C to be identical, while Exp. A could be slightly different. However, plotting the differences between them showed that all three experiments end in distinct states. The differences [Exp.C - Exp.B] are smaller than [Exp.C - Exp.A], but they can be significant in some areas (mostly around the open boundaries, but in the domain interior as well). The file.h for each experiment follows below. I am probably missing something simple but crucial here. Any help clarifying perfect_restart behavior is much appreciated!
Thanks for the attention,
João Marcelo Absy
INPE - Brazil
Exp. A and C file.h:
#define UV_ADV
#define UV_QDRAG
#define UV_COR
#define DJ_GRADPS
#define BULK_FLUXES
#undef LONGWAVE
#define EMINUSP
#define TS_U3HADVECTION
#define TS_C4VADVECTION
#define SOLVE3D
#define SALINITY
#define NONLIN_EOS
#define MASKING
#define SPLINES
#undef QCORRECTION
#undef SCORRECTION
#define SOLAR_SOURCE
#define CURVGRID
#define AVERAGES
#define LMD_MIXING
#ifdef LMD_MIXING
# define LMD_RIMIX
# define LMD_CONVEC
# define LMD_SKPP
# define LMD_NONLOCAL
#endif
#undef ZCLIMATOLOGY
#undef M2CLIMATOLOGY
#undef M3CLIMATOLOGY
#undef TCLIMATOLOGY
#undef ZCLM_NUDGING
#undef M2CLM_NUDGING
#undef M3CLM_NUDGING
#undef TCLM_NUDGING
#undef SSH_TIDES
#undef UV_TIDES
#undef RAMP_TIDES
#undef ADD_FSOBC
#undef ADD_M2OBC
#undef SPONGE
#define RADIATION_2D
#define EAST_FSCHAPMAN
#define EAST_M2FLATHER
#define EAST_M3RADIATION
#define EAST_M3NUDGING
#define EAST_TRADIATION
#define EAST_TNUDGING
#define SOUTHERN_WALL
#define WEST_FSCHAPMAN
#define WEST_M2FLATHER
#define WEST_M3RADIATION
#define WEST_M3NUDGING
#define WEST_TRADIATION
#define WEST_TNUDGING
#define NORTH_FSCHAPMAN
#define NORTH_M2FLATHER
#define NORTH_M3RADIATION
#define NORTH_M3NUDGING
#define NORTH_TRADIATION
#define NORTH_TNUDGING
#define ANA_BSFLUX
#define ANA_BTFLUX
#define ANA_CLOUD
Exp. B (perfect_restart on) file.h:
#define UV_ADV
#define UV_QDRAG
#define UV_COR
#define DJ_GRADPS
#define BULK_FLUXES
#undef LONGWAVE
#define EMINUSP
#define TS_U3HADVECTION
#define TS_C4VADVECTION
#define SOLVE3D
#define SALINITY
#define NONLIN_EOS
#define MASKING
#define SPLINES
#undef QCORRECTION
#undef SCORRECTION
#define SOLAR_SOURCE
#define CURVGRID
#define AVERAGES
#define PERFECT_RESTART
#ifdef PERFECT_RESTART
# undef AVERAGES
# undef DIAGNOSTICS_BIO
# undef DIAGNOSTICS_TS
# undef DIAGNOSTICS_UV
# define OUT_DOUBLE
#endif
#define LMD_MIXING
#ifdef LMD_MIXING
# define LMD_RIMIX
# define LMD_CONVEC
# define LMD_SKPP
# define LMD_NONLOCAL
#endif
#undef ZCLIMATOLOGY
#undef M2CLIMATOLOGY
#undef M3CLIMATOLOGY
#undef TCLIMATOLOGY
#undef ZCLM_NUDGING
#undef M2CLM_NUDGING
#undef M3CLM_NUDGING
#undef TCLM_NUDGING
#undef SSH_TIDES
#undef UV_TIDES
#undef RAMP_TIDES
#undef ADD_FSOBC
#undef ADD_M2OBC
#undef SPONGE
#define RADIATION_2D
#define EAST_FSCHAPMAN
#define EAST_M2FLATHER
#define EAST_M3RADIATION
#define EAST_M3NUDGING
#define EAST_TRADIATION
#define EAST_TNUDGING
#define SOUTHERN_WALL
#define WEST_FSCHAPMAN
#define WEST_M2FLATHER
#define WEST_M3RADIATION
#define WEST_M3NUDGING
#define WEST_TRADIATION
#define WEST_TNUDGING
#define NORTH_FSCHAPMAN
#define NORTH_M2FLATHER
#define NORTH_M3RADIATION
#define NORTH_M3NUDGING
#define NORTH_TRADIATION
#define NORTH_TNUDGING
#define ANA_BSFLUX
#define ANA_BTFLUX
#define ANA_CLOUD
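As a complement to plotting the differences, the question of whether two final snapshots are identical can be checked numerically. The sketch below is a minimal illustration and assumes the same field from two runs has already been read into flat lists of floats; reading the NetCDF files is not shown, and the sample values are made up.

```python
def compare_states(a, b):
    """Return (bit_identical, max_abs_difference) for two equal-length fields."""
    if len(a) != len(b):
        raise ValueError("fields must have the same number of points")
    # Exact equality is the criterion for a restart that reproduces the run.
    identical = all(x == y for x, y in zip(a, b))
    # The largest absolute difference shows how far apart the states are.
    max_diff = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    return identical, max_diff

# Made-up values standing in for one field from two experiments.
continuous_run = [1.0, 2.0, 3.0]
restarted_run = [1.0, 2.5, 3.0]

print(compare_states(continuous_run, continuous_run))
print(compare_states(continuous_run, restarted_run))
```

Exact equality is only meaningful when the compared fields keep full precision, which is presumably why the PERFECT_RESTART block above also defines OUT_DOUBLE.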
P.S. Concerning the original post: Mashinde, ntimes is the duration of your run in time steps and should be set in your file.in. For example, my model runs 96 time steps per day (DT = 900 s), so to run it for 90 days I set NTIMES = 8640 (90 days x 96 time steps/day). Does that help?
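The arithmetic in this example can be written as a short helper. This is just a sketch of the calculation; the function name `ntimes_for` is made up for illustration.

```python
SECONDS_PER_DAY = 86400

def ntimes_for(days, dt_seconds):
    """Number of time steps needed to run `days` model days with step DT."""
    steps_per_day, remainder = divmod(SECONDS_PER_DAY, dt_seconds)
    if remainder:
        raise ValueError("DT does not divide one day evenly")
    return days * steps_per_day

# DT = 900 s gives 96 steps per day, so 90 days needs 8640 steps.
print(ntimes_for(90, 900))
```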
Re: about restarting the model
mashinde wrote:
Oh, and after every chunk I change:
* ntimes
Here I do not understand. It would help if you could give an example.

In my example, the computer I run on has a queue limit of 24 hours, so I need to restart the job each (real) day. In those 24 hours I can run, say, 150 model days. I want to stop cleanly right at day 150 rather than trying to squeeze in a couple of extra days. What I want to avoid is ending up with partial days and then having to clean up my hourly floats file. So I set NTIMES to match 150 days on the first day, 300 days on the second day, etc.
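The scheme of setting NTIMES to 150 days for the first job, 300 days for the second, and so on can be tabulated with a few lines of Python. This is a sketch only: the time step DT = 900 s is borrowed from the earlier example in this thread and is an assumption here, as is the function name.

```python
SECONDS_PER_DAY = 86400

def cumulative_ntimes(chunk_days, n_chunks, dt_seconds):
    """NTIMES to set for each successive job, counted from the start of the run."""
    steps_per_day, remainder = divmod(SECONDS_PER_DAY, dt_seconds)
    if remainder:
        raise ValueError("DT does not divide one day evenly")
    # Job k stops at the end of model day chunk_days * k.
    return [chunk_days * k * steps_per_day for k in range(1, n_chunks + 1)]

# Three 150-day chunks with an assumed DT of 900 s (96 steps per day).
for job, ntimes in enumerate(cumulative_ntimes(150, 3, 900), start=1):
    print(f"job {job}: NTIMES = {ntimes}")
```

The values are cumulative because, as described above, NTIMES grows with each restart instead of being reset to the length of one chunk.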