Adjoint blow-up if misfit forcing is too far after initial time
Adjoint blow-up if misfit forcing is too far after initial time
I am making simple experiments with 4dvar and sometimes I get blow-ups in the first iteration of the adjoint model (it doesn't actually terminate during the adjoint, but the output is clearly nonsensical).
In one experiment I use:
* ARGO profiles from a single float, with one profile every 3 days
* assimilation window of 3 days
* initial conditions that have drifted away from current observations, e.g. a climatology
If I shift the start of the assimilation window far before the observation time, the adjoint model sometimes explodes in the first iteration. If I shift the window such that the adjoint forcing is close to the inital time, it runs fine. I think the closer the observation is to the start, the more it becomes "3dvar-like", i.e. the adjoint back-propagation becomes shorter and theoretically vanishes if the initial time coincides with the observation time, in which case only 3dvar is left.
Question: Is the blow-up of the adjoint model a misconfiguration, or is this normal when an assimilative model is spun up from a climatology?
In case it is normal, does it sometimes cause problems when a new instrument appears during an assimilation?
Background: I use strong constraint and the deprecated primal I4DVAR for debugging, because I can modify NADJ to write more freqently during the first inner loop. I found this good for debugging, since the growing instabilities make the final adjoint output of the first inner loop already completely nonsensical (i.e. during the adjoint run the instabilities spread from the vicinity of the misfit forcing to remote regions and are much stronger there).
In one experiment I use:
* ARGO profiles from a single float, with one profile every 3 days
* assimilation window of 3 days
* initial conditions that have drifted away from current observations, e.g. a climatology
If I shift the start of the assimilation window far before the observation time, the adjoint model sometimes explodes in the first iteration. If I shift the window such that the adjoint forcing is close to the inital time, it runs fine. I think the closer the observation is to the start, the more it becomes "3dvar-like", i.e. the adjoint back-propagation becomes shorter and theoretically vanishes if the initial time coincides with the observation time, in which case only 3dvar is left.
Question: Is the blow-up of the adjoint model a misconfiguration, or is this normal when an assimilative model is spun up from a climatology?
In case it is normal, does it sometimes cause problems when a new instrument appears during an assimilation?
Background: I use strong constraint and the deprecated primal I4DVAR for debugging, because I can modify NADJ to write more freqently during the first inner loop. I found this good for debugging, since the growing instabilities make the final adjoint output of the first inner loop already completely nonsensical (i.e. during the adjoint run the instabilities spread from the vicinity of the misfit forcing to remote regions and are much stronger there).
- jivica
- Posts: 172
- Joined: Mon May 05, 2003 2:41 pm
- Location: The University of Western Australia, Perth, Australia
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
Hi Stef,
we need more info about your experiment (do you alter your boundary conditions, forcing, how frequently, and/or only the inital state, how you computed your model error cov, obs erros?) to help you more.
Do you start your application from a balanced run (i.e. some restart of already spun simulation)?
Just a few thoughts to think about;
Adjoint is all about perturbations and in that sense, you shouldn't start too far from the prior (you are not going along the basic principles of 4D-Var). In reality, you fix obvious problems within your system first and then you apply DA.
In your example, you provide one vertical transect (say ~30 points vertically) only once within the assimilation window and hope for some visible improvement while your system is (just a wild guess) 100x100x30x7 variables?
Departures from your background state will be hard if you have "B" matrix playing, remember you have 2 parts in the cost function.
If you are using 4DVar then you are assimilating at the exact space-time location, not sure what you mean with 3DVar type?
If you want to emulate 3DVar (not sure there is an option inside the current code) you might consider extending the obs time for more timesteps/duration.
In case you want to play with a similar example (not sure you are right now) then start from the already good solution, sample the system at some location, and perturb the initial state (taking care you are not introducing funny stuff i.e. inversion etc).
From the sampled data create synthetic observation you'll assimilate later, this is the truth you want to use and help you get to the original state you used to sample obs. This concept is well known - ocean sensitivity experiment and you can find vast literature.
Regarding the blow-up, as I said, ADJ and TLM are working in the perturbation space, and be sure you are within the basic principles of DA (we are evaluating only the first-order perturbations as in Taylor expansion, TLM is linearized hence L in name).
Cheers,
Ivica
we need more info about your experiment (do you alter your boundary conditions, forcing, how frequently, and/or only the inital state, how you computed your model error cov, obs erros?) to help you more.
Do you start your application from a balanced run (i.e. some restart of already spun simulation)?
Just a few thoughts to think about;
Adjoint is all about perturbations and in that sense, you shouldn't start too far from the prior (you are not going along the basic principles of 4D-Var). In reality, you fix obvious problems within your system first and then you apply DA.
In your example, you provide one vertical transect (say ~30 points vertically) only once within the assimilation window and hope for some visible improvement while your system is (just a wild guess) 100x100x30x7 variables?
Departures from your background state will be hard if you have "B" matrix playing, remember you have 2 parts in the cost function.
If you are using 4DVar then you are assimilating at the exact space-time location, not sure what you mean with 3DVar type?
If you want to emulate 3DVar (not sure there is an option inside the current code) you might consider extending the obs time for more timesteps/duration.
In case you want to play with a similar example (not sure you are right now) then start from the already good solution, sample the system at some location, and perturb the initial state (taking care you are not introducing funny stuff i.e. inversion etc).
From the sampled data create synthetic observation you'll assimilate later, this is the truth you want to use and help you get to the original state you used to sample obs. This concept is well known - ocean sensitivity experiment and you can find vast literature.
Regarding the blow-up, as I said, ADJ and TLM are working in the perturbation space, and be sure you are within the basic principles of DA (we are evaluating only the first-order perturbations as in Taylor expansion, TLM is linearized hence L in name).
Cheers,
Ivica
Re: Adjoint blow-up if misfit forcing is too far after initial time
Hi Ivica,
Thanks for the suggestions! Yes I haven't provided enough details about the run, I will try to write it up in a blog post. My present set up is more of a test case for learning 4dvar than a realistic application, i.e. an addition to the tutorial excercises. Alongside I am reading the literature on theory.
That said, the far and distant objective is to do tropical cyclone/storm surge forecasting, so I pick a test case within that context. I should probably take a look at the ROMS example cases around the U.S. Coast, but instead I use the Bay of Bengal. I'm starting from a CMEMS NEMO simulation and do 5 days without any simulation to let the initial condition adjust a little bit. I'm aware that for real operational DA this may not make sense, but for experimenting/learning it should be fine (?)
The map below shows the domain. The track is from Cyclone Sitrang, but it's a distraction because the problems I'm currently looking at occur before the storm. Also, for storm DA the mixed layer is important, and the BoB is special because of the barrier layers etc. I'm not sure if the implications of the salinity-dominated mixed-layer depth in the BoB are well understood in terms of DA (in particular the assumption of temperature-dominated stratification in the balance equation), and I still have to read a lot about it. Looking at this would be current research and I think/hope my problem is much more basic than that.
Unfortunately my workstation keeps freezing for unkown reasons (bios update ... very bad idea) and I'm in the process of temporarily moving my workflow to my laptop. So I'm currently a bit limited in terms of experiments, but I was able to produce some figures.
Figure 1 is the domain (again, ignore the TC track).
Figures 2 and 3 are ARGO profiles, the first figure is from the float inside the domain, the second one from the one outside . I have satellite data too, but I intentionally would like to work on one instrument only until I sufficiently understand what's going on.
Black lines are the instrument.
Green lines are EN4 climatology.
Red lines are NEMO-CMEMS.
Blue lines are ROMS free run starting from Sept. 30, initialized from NEMO-CMEMS.
There are large temperature misfits in the thermocline.
*) I adjust initial conditions only.
*) I don't use the balance operator.
*) the background error cov. is the simple full variance of CMEMS over 10 years, without removing any trend, seasonal cycle or interannual variations of the seasonal anomalies. Does this yield a too large error cov. because I leave all the long-term signals in? But even if so, how could this cause a blow-up in the first adjoint iteration? Via preconditioning perhaps? I don't think I use preconditioning, I have Lprecond=F. Besides that, would it make any sense to precondition without balancing? In Moore et al. 2011 part I. it says that preconditioning is applied to the unbalanced increment, so I don't think it would?
Thanks again for your help!
Thanks for the suggestions! Yes I haven't provided enough details about the run, I will try to write it up in a blog post. My present set up is more of a test case for learning 4dvar than a realistic application, i.e. an addition to the tutorial excercises. Alongside I am reading the literature on theory.
That said, the far and distant objective is to do tropical cyclone/storm surge forecasting, so I pick a test case within that context. I should probably take a look at the ROMS example cases around the U.S. Coast, but instead I use the Bay of Bengal. I'm starting from a CMEMS NEMO simulation and do 5 days without any simulation to let the initial condition adjust a little bit. I'm aware that for real operational DA this may not make sense, but for experimenting/learning it should be fine (?)
The map below shows the domain. The track is from Cyclone Sitrang, but it's a distraction because the problems I'm currently looking at occur before the storm. Also, for storm DA the mixed layer is important, and the BoB is special because of the barrier layers etc. I'm not sure if the implications of the salinity-dominated mixed-layer depth in the BoB are well understood in terms of DA (in particular the assumption of temperature-dominated stratification in the balance equation), and I still have to read a lot about it. Looking at this would be current research and I think/hope my problem is much more basic than that.
Unfortunately my workstation keeps freezing for unkown reasons (bios update ... very bad idea) and I'm in the process of temporarily moving my workflow to my laptop. So I'm currently a bit limited in terms of experiments, but I was able to produce some figures.
Figure 1 is the domain (again, ignore the TC track).
Figures 2 and 3 are ARGO profiles, the first figure is from the float inside the domain, the second one from the one outside . I have satellite data too, but I intentionally would like to work on one instrument only until I sufficiently understand what's going on.
Black lines are the instrument.
Green lines are EN4 climatology.
Red lines are NEMO-CMEMS.
Blue lines are ROMS free run starting from Sept. 30, initialized from NEMO-CMEMS.
There are large temperature misfits in the thermocline.
*) I adjust initial conditions only.
*) I don't use the balance operator.
*) the background error cov. is the simple full variance of CMEMS over 10 years, without removing any trend, seasonal cycle or interannual variations of the seasonal anomalies. Does this yield a too large error cov. because I leave all the long-term signals in? But even if so, how could this cause a blow-up in the first adjoint iteration? Via preconditioning perhaps? I don't think I use preconditioning, I have Lprecond=F. Besides that, would it make any sense to precondition without balancing? In Moore et al. 2011 part I. it says that preconditioning is applied to the unbalanced increment, so I don't think it would?
What I meant was if all observations occur at the initial time of a 4-dvar run, the 4-dvar run becomes much like a free-running forward model starting from a 3-dvar assimilated initial condition. Because there are no data assimilated other than at initial time.If you are using 4DVar then you are assimilating at the exact space-time location, not sure what you mean with 3DVar type?
Yes, that's great! I have done a similar thing by creating artifical obs. by sampling the model, and then perturb the observations. But only in pervious experiments with along-track satellite altimetry (which turned out to be too advanced for me at the moment). But I haven't thought about perturbing the initial conditions!In case you want to play with a similar example (not sure you are right now) then start from the already good solution, sample the system at some location, and perturb the initial state (taking care you are not introducing funny stuff i.e. inversion etc).
From the sampled data create synthetic observation you'll assimilate later, this is the truth you want to use and help you get to the original state you used to sample obs. This concept is well known - ocean sensitivity experiment and you can find vast literature.
Thanks again for your help!
Re: Adjoint blow-up if misfit forcing is too far after initial time
Hmm, regarding what I wrote above about preconditioning:
In i4dvar.f90 (as noted in the first post I use I4DVAR for debugging), there is a segment
which is executed after the first adjoint iteration, independent of the Lprecond switch. The Lprecond switch appears in cgradient.f90 only and refers to 'second-stage' preconditioning only. This is explained in Moore et al. 2011 chapter 6.2.
So apparently I do use the background error cov. to precondition, even though I don't use the balance operator. I have to read more to check if that makes sense. In any case, this is done only after the adjoint iteration to compute the gradient in 'x-space', so I doubt it's the cause of the blow-up?
In i4dvar.f90 (as noted in the first post I use I4DVAR for debugging), there is a segment
Code: Select all
! Convert observation cost function gradient, GRADx(Jo), from model
! space (x-space) to minimization space (v-space)
So apparently I do use the background error cov. to precondition, even though I don't use the balance operator. I have to read more to check if that makes sense. In any case, this is done only after the adjoint iteration to compute the gradient in 'x-space', so I doubt it's the cause of the blow-up?
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
Why do you use the primal formulation I4DVAR? That algorithm works, but it is deprecated. Please use the dual formulation RBL4DVAR. It is faster! Check the WC13 test cases for RBL4DVAR. The dual formulation works on the space span by the observations and allows impacts and sensitivities. We also coded the (4DVAR)T. That is the adjoint of the full 4DVAR algorithm for impacts and sensitivities, including the forecast.
Also, it is tough to make linear codes blow up. The TLM and ADM kernels start from the rest. The ADM is forced by the adjoint of the misfit between the observation and the prior. Work on the convolution parameters for the background error covariance.
Also, it is tough to make linear codes blow up. The TLM and ADM kernels start from the rest. The ADM is forced by the adjoint of the misfit between the observation and the prior. Work on the convolution parameters for the background error covariance.
Re: Adjoint blow-up if misfit forcing is too far after initial time
Thanks for the reply!
I used RBL4DVAR with RPCG, as in Exercise 3 of the tutorial. It worked in some cases, but then I got blow-ups and didn't now how to investigate the causes. That's when I switched to I4DVAR and modified the NADJ to write out the adjoint solutions more frequently (I'm aware that this may break the code when modifying this, but I terminated those debug runs anyway after the first iteration of the adjoint code. It was just to look what the adjoint model does.)
I used RBL4DVAR with RPCG, as in Exercise 3 of the tutorial. It worked in some cases, but then I got blow-ups and didn't now how to investigate the causes. That's when I switched to I4DVAR and modified the NADJ to write out the adjoint solutions more frequently (I'm aware that this may break the code when modifying this, but I terminated those debug runs anyway after the first iteration of the adjoint code. It was just to look what the adjoint model does.)
That's very useful information! I have zero experience. So there must be some more fundamental problem.Also, it is tough to make linear codes blow up.
Ok, thanks for the hint!Work on the convolution parameters for the background error covariance.
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
You cannot change NADJ in I4DVAR since the primal formulation is only a strong constraint algorithm (the model is assumed to be perfect). However, RBL4DVAR is both strong- and weak-constraint. But I suggest staying in the strong-constraint regime since modeling the model error, Q, is not trivial. NADJ should be the same size as ntimes, so the analysis adjustment due to data assimilation is for the initial state vector.
Re: Adjoint blow-up if misfit forcing is too far after initial time
Yes, I'm trying to look at the simplest case possible.But I suggest staying in the strong-constraint regime since modeling the model error, Q, is not trivial.
That's what I thought after I first tried to change it. Because if the input is lower than that, it 'automatically' resets itself to ntimes. In read_phypar.f90:NADJ should be the same size as ntimes.
Code: Select all
! If strong constraint, write only final adjoint solution since only
! we are estimating initial conditions.
!
nADJ(ng)=ntimes(ng)
I couldn't find any other way, so I decided to uncomment the line above (being aware that it may break the code), to write out the adjoint more frequently. Then after I had a couple of adjoint timesteps in the netcdf file, I terminated that run, because I was aware that I may have broken the rest of the code by modifying that line. Looking at the more frequent adjoint output helped me, in the sense that I could see that the initial misfit forcing was indeed located at the correct horizontal position, and that the problems at least started at the correct location. This was not visible from the final adjoint solution.
So from my beginner perspective, there is value in writing out the adjoint more frequently, and while I'm aware that uncommenting the line above breaks the code, I do it anyways because I do not see another option. I terminate after the first adjoint iteration, because the rest of the model run doesn't make sense any more.
Is there a way to track the initial adjoint output more frequently to an auxiliary field, without breaking the subsequent model run?
In case there isn't, I could modify the code more sensibly by creating a custom auxiliary output field, right? And leave nADJ(ng) untouched. This wouldn't break the model, right?
Would this also be possible for the dual code? I read that in the dual formulation (e.g. RBL4DVAR), there is the variable transformation from x to 'w', and if I understood correctly, 'w' has no straightforward physical interpretation. So I'm not sure if there is any value in writing out the adjoint more often (I mean to a different auxiliary field) in the dual formulation code.
Re: Adjoint blow-up if misfit forcing is too far after initial time
Sorry may I ask a question about the Tutorial in this thread (it may be related to my problem discussed here):
I noticed that the tutorial does not seem to work with git branch 'main'.
Currently 'main' is at 21d8ab47ad4b (from June 2023).
The debugger complains about what looks like related to the bug reported here:
viewtopic.php?t=6374
It's about state vector variables' indices. Hernan writes on the thread the he made some updates, but I assume these were for branch 'develop'.
Are you backporting these fixes to 'main', or should 'develop' be used for the tutorial?
I noticed that the tutorial does not seem to work with git branch 'main'.
Currently 'main' is at 21d8ab47ad4b (from June 2023).
The debugger complains about what looks like related to the bug reported here:
viewtopic.php?t=6374
It's about state vector variables' indices. Hernan writes on the thread the he made some updates, but I assume these were for branch 'develop'.
Are you backporting these fixes to 'main', or should 'develop' be used for the tutorial?
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
Yes, use the ROMS version on GitHub. We will no longer update all the SVN versions after January 1, 2025. I fixed all the indices in the code.
Re: Adjoint blow-up if misfit forcing is too far after initial time
Hmm, not sure if I understand. Which ROMS version on github, the 'develop' branch or the 'main' branch? Which revision should we use with the currently available tutorial?Yes, use the ROMS version on GitHub. We will no longer update all the SVN versions after January 1, 2025. I fixed all the indices in the code.
Thanks again for all the help!
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
I updated all the versions of ROMS in the svn and git repositories, but I recommend the GitHub version since it has branches of other developments that we are working on and will be merged to the default develop branch in the future. We will stop updating the other repositories on January 1, 2025, but keep them for historical reasons.
Re: Adjoint blow-up if misfit forcing is too far after initial time
You should use the default develop branch on GitHub for pretty much everything. Think of develop as the latest svn revision and main as the most recent tagged revision (currently 'roms-4.1').
We also suggest that you use ROMS test cases from github (https://github.com/myroms/roms_test.git) as the testcases available in the svn repository will also no longer be updated starting January 1st, 2025. As long as you keep both repositories (roms and roms_test) updated the tutorials should continue to work.
Re: Adjoint blow-up if misfit forcing is too far after initial time
Oh, I did not see the roms_test repository on github! On the 4dvar tutorial page still mentions svn:
https://www.myroms.org/wiki/4DVar_Tutorial_Introduction
(accessed 2024/02/14)
I might have missed some announcements...
Thanks for clarifying this again.
https://www.myroms.org/wiki/4DVar_Tutorial_Introduction
(accessed 2024/02/14)
I might have missed some announcements...
Excellent! I'll retire every ROMS-related svn repo I have on my computer.As long as you keep both repositories (roms and roms_test) updated the tutorials should continue to work.
Thanks for clarifying this again.
- arango
- Site Admin
- Posts: 1367
- Joined: Wed Feb 26, 2003 4:41 pm
- Location: DMCS, Rutgers University
- Contact:
Re: Adjoint blow-up if misfit forcing is too far after initial time
I updated the tutorial information in WikiROMS. Thank you! We are usually swamped and will take time to update the information. We also have minimal information and instructions in the GitHub repositories wiki. Some users are unfamiliar with Git and GitHub, and we are given plenty of time to learn.