Multiple input files

General scientific issues regarding ROMS

Moderators: arango, robertson

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Multiple input files

#1 Unread post by kate »

I just want to point out that the multiple input file feature does not behave the way one long input file would. Say you have fifty annual climatology files and you want to run smoothly from one to the next. Say you can run for 100 days at a time and you don't want to fuss with the input file all the time. You set it up with maybe five years worth, using the | between files:

Code: Select all

     CLMNAME == /wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_1961.nc |
                /wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_1962.nc |
                /wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_1963.nc |
                /wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_1964.nc |
                /wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_1965.nc
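For a long run, a block like the one above can be generated rather than typed by hand. This is a minimal Python sketch, assuming the yearly file-name pattern from the example; the helper name `clmname_block` is hypothetical and not part of ROMS.

```python
def clmname_block(years, pattern, keyword="CLMNAME", indent=5):
    """Return a multi-file input entry with '|' between consecutive files."""
    paths = [pattern.format(year=y) for y in years]
    head = " " * indent + keyword + " == "
    pad = " " * len(head)  # align continuation lines under the first path
    lines = []
    for i, path in enumerate(paths):
        prefix = head if i == 0 else pad
        # Every file except the last is followed by the '|' separator.
        suffix = " |" if i < len(paths) - 1 else ""
        lines.append(prefix + path + suffix)
    return "\n".join(lines)


# File-name pattern taken from the example; the year range is illustrative.
pattern = "/wrkdir/kate/NEP/Files/SODA_2.1.6_clim_NEP6_{year}.nc"
block = clmname_block(range(1961, 1966), pattern)
print(block)
```

The output can be pasted into the input file in place of a hand-maintained list.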
Surprise, surprise, on startup in initial, it will only read the first file (1961). If you are currently in the midst of 1962, this is no problem from the point of view of ROMS, it's well before the current ocean_time. Then in the first call to get_data from main3d, we jump into the second file and find some future records, all is well. Perhaps ROMS is happy, but it didn't do exactly what I expected. :P

So then we get to the end of 1962 and do another restart. It once again gets a "before" record from 1961, then it checks the second file and still can't find any "after" records and fails before checking the third file.

Code: Select all

 GET_CYCLE - ending time for multi-file variable: ocean_time
             is less than current model time. 
             TMAX =      22993.5000  TDAYS =      23006.5000
What if you only give it two files? It gets to the end of the second and dies in a cryptic way:

Code: Select all

[n62:517516] *** Process received signal *** 
[n62:517516] Signal: Segmentation fault (11)
[n62:517516] Signal code: Address not mapped (1)
[n62:517516] Failing at address: 0x48
[n62:517516] *** End of error message ***
It knows it's in multi-file mode, but it doesn't know how to gracefully stop at the end of its file list. :?
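The behaviour one might expect on restart is a search over the whole file list for the first file that still holds a record at or after the current model time, with a clean error when no such file exists. The sketch below is plain Python, not ROMS code. The function name `find_start_file` and all time ranges are assumptions made for illustration, except TMAX = 22993.5 and TDAYS = 23006.5, which come from the error message above.

```python
def find_start_file(ranges, tdays):
    """Return the index of the first file whose last record is at or after
    tdays, or None if the model time is past the end of every file.

    ranges: list of (tmin, tmax) per file, in list order.
    The "before" record may still live in the previous file.
    """
    for i, (tmin, tmax) in enumerate(ranges):
        if tmax >= tdays:
            return i
    return None


# Assumed annual files; only 22993.5 is taken from the log.
three_files = [(22264.5, 22628.5), (22629.5, 22993.5), (22994.5, 23358.5)]
tdays = 23006.5

print(find_start_file(three_files, tdays))      # index of the third file
print(find_start_file(three_files[:2], tdays))  # None: ran off the list
```

Returning `None` gives the caller a chance to set an error flag and stop cleanly instead of indexing past the end of the list.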

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Multiple input files

#2 Unread post by arango »

Hmmm, so are you telling me that you need to remove the 1961 file from the list if you restart in 1962? If the time clock is linear, it should jump to the next file. I added the generic routine ROMS/Utility/inquire.F to manage the multiple file option. It shouldn't be difficult to debug your case in that routine and find out why it is not jumping to the next file. I wonder if there is a problem with the time clock, tdays(ng)? It may have the wrong value at initialization, but why? Also, we need to check the value of Fcount in the file structure.

This is complicated logic and I tried to make it as transparent and modular as possible. Maybe I missed a combination of options. It is difficult to come up with code that works with everybody's set-up. Can you please check this and let me know, since I don't have your application to reproduce the problem?

The other thing for which we need to add logic is terminating gracefully if no data is found at the end of the file list. I thought that I had coded all the possible combinations of problems with the exit_flag.

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#3 Unread post by kate »

The inquire routine is getting nfiles set to 1. The only ones with a larger value are Forcing files, when nfiles is set to the number of different forcings. The CLM structure knows it has five files, but that's not what's in the get_data call to get_3dfld:

Code: Select all

!  KLUDGE - how many tracers to read? It depends...
      DO i=1,NAT
        CALL get_3dfld (ng, iNLM, idTclm(i), CLM(ng)%ncid,              &
     &                  1, CLM(ng), update(1),                          &
     &                  LBi, UBi, LBj, UBj, 1, N(ng), 2, 1,             &
#   ifdef MASKING
     &                  GRID(ng) % rmask(LBi,LBj),                      &
#   endif
     &                  CLIMA(ng) % tclmG(LBi,LBj,1,1,i))
        IF (exit_flag.ne.NoError) RETURN
      END DO
I didn't watch the other case in the debugger - can do that now.

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Multiple input files

#4 Unread post by arango »

OK, you need to examine several variables in the CLM structure. Yes, the forcing files are special. There is another layer in the file counter so we can split the different fields into other files. The number of climatology files is in CLM(ng)%Nfiles. I wonder if we need additional code to initialize the value of CLM(ng)%Fcount so that it goes directly to the right file according to the initialization time. I have similar code in ROMS/Utility/close_io.F, but for a different purpose.

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#5 Unread post by kate »

From initial, Lmulti is false and nfiles in inquire is 1. Only the first file is searched. CLM%Fcount is 1.

The next time through get_data, Lmulti is true and Fcount gets set to 2.

The check for falling off the end looks like:

Code: Select all

IF ((1.gt.Fcount).and.(Fcount.gt.S(ifile)%Nfiles)) THEN
but I don't get the first condition. When is that ever true?
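The question can be checked by brute force. The Python sketch below transcribes the quoted test literally and compares it with an `.or.` version. The `.or.` form is an assumption about what an out-of-range check would normally look like, not a confirmed fix, and both function names are made up for this example.

```python
def and_check(fcount, nfiles):
    # Literal transcription of: (1.gt.Fcount).and.(Fcount.gt.Nfiles)
    return (1 > fcount) and (fcount > nfiles)


def or_check(fcount, nfiles):
    # Assumed intent: Fcount lies outside the valid range 1..Nfiles.
    return (1 > fcount) or (fcount > nfiles)


nfiles = 4
for fcount in range(-2, 8):
    print(fcount, and_check(fcount, nfiles), or_check(fcount, nfiles))
```

For any Nfiles of 1 or more, no integer can be both below 1 and above Nfiles, so the `.and.` form never fires and falling off the end of the list goes undetected.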

sarahgid
Posts: 7
Joined: Thu Feb 03, 2011 11:46 pm
Location: University of Washington

Re: Multiple input files

#6 Unread post by sarahgid »

The multiple input file feature gives me the same error Kate describes above, except that my run only makes it through the last time step in the first climatology file. In the .in file we have:

Code: Select all

     CLMNAME == Ocn/ocean_clm_1.nc | 
  Ocn/ocean_clm_2.nc |
  Ocn/ocean_clm_3.nc |
  Ocn/ocean_clm_4.nc
     BRYNAME == Ocn/ocean_bry_1.nc |
  Ocn/ocean_bry_2.nc |
  Ocn/ocean_bry_3.nc |
  Ocn/ocean_bry_4.nc 
The error occurs after reading in the first time step from the second bry file and then reading in only 2 variables from the first time step from the second clm file... Here is part of the output to help diagnose the problem.

ROMS recognizes there are 4 climatology and boundary files:

Code: Select all

          Input Climatology File:  Ocn/ocean_clm_1.nc
                                   Ocn/ocean_clm_2.nc
                                   Ocn/ocean_clm_3.nc
                                   Ocn/ocean_clm_4.nc
             Input Boundary File:  Ocn/ocean_bry_1.nc
                                   Ocn/ocean_bry_2.nc
                                   Ocn/ocean_bry_3.nc
                                   Ocn/ocean_bry_4.nc
It runs fine through the last time step in the first climatology and boundary files and then midway through reading in the first time step in the second climatology file it gives an error:

Code: Select all

    GET_NGFLD   - free-surface western boundary condition,   t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
    GET_NGFLD   - free-surface southern boundary condition,  t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
    GET_NGFLD   - 2D u-momentum western boundary condition,  t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -7.95148206E-02 Max =  4.86689307E-02)
    GET_NGFLD   - 2D v-momentum western boundary condition,  t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -2.40771787E-02 Max =  1.29980451E-01)
    GET_NGFLD   - 2D u-momentum southern boundary condition, t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -1.97140224E-02 Max =  2.14538498E-02)
    GET_NGFLD   - 2D v-momentum southern boundary condition, t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -9.09582488E-02 Max =  3.03645809E-02)
    GET_NGFLD   - 3D u-momentum western boundary condition,  t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -2.42433726E-01 Max =  4.26087780E-01)
    GET_NGFLD   - 3D v-momentum western boundary condition,  t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -1.91268487E-01 Max =  5.16416169E-01)
    GET_NGFLD   - 3D u-momentum southern boundary condition, t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -8.10158132E-02 Max =  3.27968764E-01)
    GET_NGFLD   - 3D v-momentum southern boundary condition, t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -1.81970499E-01 Max =  5.49691095E-02)
    GET_NGFLD   - temperature western boundary condition,    t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  1.60688434E+00 Max =  2.00000000E+01)
    GET_NGFLD   - salinity western boundary condition,       t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  0.00000000E+00 Max =  3.46601500E+01)
    GET_NGFLD   - temperature southern boundary condition,   t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  1.60629153E+00 Max =  2.00000000E+01)
    GET_NGFLD   - salinity southern boundary condition,      t =    92 00:00:00
                   (Rec=0003, Index=2, File: ocean_bry_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  0.00000000E+00 Max =  3.46597309E+01)
    GET_2DFLD   - sea surface height climatology,            t =    92 00:00:00
                   (Rec=0003, Index=1, File: ocean_clm_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
    GET_2DFLD   - vertically integrated u-momentum climatologt =    92 00:00:00
                   (Rec=0003, Index=1, File: ocean_clm_2.nc)
                   (Tmin=         90.0000 Tmax=        182.0000)
                   (Min = -1.35839046E-01 Max =  1.58233688E-01)
[n002:02681] *** Process received signal ***
[n002:02681] Signal: Segmentation fault (11)
[n002:02681] Signal code: Address not mapped (1)
[n002:02681] Failing at address: 0x85
[n002:02681] [ 0] /lib64/libpthread.so.0 [0x2af44481b4c0]
[n002:02681] [ 1] /usr/local/openmpi5-pgi7ib/lib/openmpi/mca_btl_openib.so [0x2af4485924c7]
[n002:02681] [ 2] /usr/local/openmpi5-pgi7ib/lib/libmpi.so.1(opal_progress+0x5a) [0x2af443ada49a]
[n002:02681] [ 3] /usr/local/openmpi5-pgi7ib/lib/libmpi.so.1 [0x2af443a46615]
[n002:02681] [ 4] /usr/local/openmpi5-pgi7ib/lib/openmpi/mca_coll_tuned.so [0x2af449a69bcd]
[n002:02681] [ 5] /usr/local/openmpi5-pgi7ib/lib/openmpi/mca_coll_tuned.so [0x2af449a6a0c7]
[n002:02681] [ 6] /usr/local/openmpi5-pgi7ib/lib/openmpi/mca_coll_tuned.so [0x2af449a5fd61]
[n002:02681] [ 7] /usr/local/openmpi5-pgi7ib/lib/openmpi/mca_coll_sync.so [0x2af44985c2d9]
[n002:02681] [ 8] /usr/local/openmpi5-pgi7ib/lib/libmpi.so.1(MPI_Bcast+0x171) [0x2af443a525d1]
[n002:02681] [ 9] /usr/local/openmpi5-pgi7ib/lib/libmpi_f77.so.1(pmpi_bcast+0x7d) [0x2af4437ef21d]
[n002:02681] [10] oceanM_pgi(distribute_mod_mp_scatter2d_+0xa21) [0x46d9d1]
[n002:02681] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 2681 on node n002 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Note that ocean_clm_1.nc goes from t = 0 to t = 91 and ocean_clm_2.nc goes from t = 90 to t = 182, yet ROMS correctly jumped from ocean_clm_1 at t = 91 to ocean_clm_2 at t = 92. I also tested a version where ocean_clm_2.nc started at t = 92 to see if the overlap was the problem, but I got the same error as above, only one field sooner while reading the clm (i.e. right after reading ssh from clm_2).

Kate, am I understanding your post correctly that your run made it through the first two files but only had a problem when restarting not at the beginning of the list?

Has anyone used this new multiple input feature successfully? Maybe there are some rules about the setup of the files?

Thanks for help from anyone!

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#7 Unread post by kate »

Yes, mine successfully goes from file 1 to file 2.

Can you recompile with array bounds checking on and restart just before the switch?

sarahgid
Posts: 7
Joined: Thu Feb 03, 2011 11:46 pm
Location: University of Washington

Re: Multiple input files

#8 Unread post by sarahgid »

Recompiling with array bounds checking on, restarting before the switch results in an error at the same time (switching from the first climatology file to the second, clm1 to clm2) with the additional information:

Code: Select all

0: Subscript out of range for array s%nrec (inquire.f90: 283)
    subscript=2013021384, lower bound=1, upper bound=4, dimension=1
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 18993 on
node n002 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
This appears to point to line 283 in inquire.f90 which checks the dimension of the time variable. I have double checked all of my clm files however and they all have the proper time variables and dimensions.

Also, I think of particular interest is that when I run the code with only 3 clm files:

Code: Select all

     CLMNAME == Ocn/ocean_clm_1.nc | 
  Ocn/ocean_clm_2.nc |
  Ocn/ocean_clm_3.nc
OR restarting from a rst file on a day during the overlap of the two climatology files (i.e. day t = 90.5 where clm1 includes t = 0-91 and clm2 includes t = 90-182) and using

Code: Select all

     CLMNAME == Ocn/ocean_clm_2.nc |
  Ocn/ocean_clm_3.nc |
  Ocn/ocean_clm_4.nc
It does not have any problems and successfully goes from each clm file to the next reading in the proper time stamps along the way.

Is there some sort of limit on the number of clm files that can be input? Or does it dislike the fact that, because my clm files overlap, the total number of time stamps across all files is greater than the number of time stamps I am running for?

Any thoughts as always would be much appreciated!
Thanks,
Sarah

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#9 Unread post by kate »

Wild guess? It probably doesn't like the overlap.
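One way to test that guess before rerunning is to compare the time ranges of consecutive files. Below is a minimal Python sketch, assuming the first and last time of each file have already been read by other means. The ranges are the ones quoted earlier in the thread (clm1 covers t = 0 to 91, clm2 covers t = 90 to 182), and the function name `overlapping_pairs` is hypothetical.

```python
def overlapping_pairs(ranges):
    """Return (i, i + 1) index pairs where file i + 1 starts at or before
    the last record of file i.

    ranges: list of (tmin, tmax) per file, in the order given in the list.
    """
    return [
        (i, i + 1)
        for i in range(len(ranges) - 1)
        if ranges[i + 1][0] <= ranges[i][1]
    ]


with_overlap = [(0.0, 91.0), (90.0, 182.0)]     # as first described
without_overlap = [(0.0, 91.0), (92.0, 182.0)]  # the second test case

print(overlapping_pairs(with_overlap))
print(overlapping_pairs(without_overlap))
```

Since the same failure was reported when the second file started at t = 92, a clean result from this check would point away from overlap as the only cause.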

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#10 Unread post by kate »

OK, I've got a case where I bet I'm going to run into a lot of trouble with this. I've got some fields from SODA on a 5-daily frequency, ice fields daily, plus snow monthly. Right now, the time from the SODA files is called ocean_time, with the unlimited dimension. I've been creating the boundary conditions a year at a time, but I can see I'll need to do ncrcat on each type of BC independently, then mash them together. I can't keep each year as its own thing because starting in early January some year, I'll need to start with the previous year for the monthly fields, but the current year for the daily fields, one or the other for the 5-daily fields. :roll:

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#11 Unread post by kate »

I did indeed run into trouble and I think my easiest road will be to allow multiple BC files like there are multiple forcing files. With everything in one file per year, the first field to flip to the second file causes all to do so. Before the switch:

Code: Select all

    GET_NGFLD   - ice u-momentum western boundary condition, t = 31395 12:00:00
                   (Rec=0351, Index=1, File: SODA_bry_1985.nc)
                   (Tmin=      31045.5000 Tmax=      31409.5000)
                   (Min = -1.36604703E-02 Max =  0.00000000E+00)
    GET_NGFLD   - ice stress component 11 western boundary cot = 31396 00:00:00
                   (Rec=0353, Index=1, File: SODA_bry_1985.nc)
                   (Tmin=      31044.0000 Tmax=      31414.0000)
                   (Min = -7.74782324E+02 Max =  0.00000000E+00)
Here comes the switch (snow is once a month):

Code: Select all

    GET_NGFLD   - snow thickness western boundary condition, t = 31410 12:00:00
                   (Rec=0001, Index=2, File: SODA_bry_1986.nc)
                   (Tmin=      31410.5000 Tmax=      31760.2917)
                   (Min =  0.00000000E+00 Max =  7.66819235E-02)
After the switch:

Code: Select all

    GET_NGFLD   - ice u-momentum western boundary condition, t = 31761 12:00:00
                   (Rec=0352, Index=2, File: SODA_bry_1985.nc)
                   (Tmin=      31045.5000 Tmax=      31409.5000)
                   (Min = -4.53778547E-03 Max =  1.04682107E-02) 
    GET_NGFLD   - ice stress component 11 western boundary cot = 31768 00:00:00
                   (Rec=0354, Index=2, File: SODA_bry_1985.nc)
                   (Tmin=      31044.0000 Tmax=      31414.0000)
                   (Min = -1.40982982E+03 Max =  4.01483834E+01)
Note the Index is the file counter and the time is about a year later since the record counter didn't get reset.
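The symptom suggests that the file counter and the record counter need to move together, and separately for each field. The Python sketch below illustrates that idea only. It is not how ROMS or any branch implements it, the class `FieldCursor` is hypothetical, and the record counts per file (365 daily, 12 monthly) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class FieldCursor:
    """Per-field position in a list of time-sequential files."""
    file_index: int = 0  # 0-based position in the file list
    rec: int = 0         # last record read in the current file (1-based)

    def advance(self, nrec_per_file):
        """Step to the next record, rolling to the next file when needed."""
        if self.rec < nrec_per_file[self.file_index]:
            self.rec += 1
            return
        if self.file_index + 1 >= len(nrec_per_file):
            raise IndexError("no more files in the list")
        self.file_index += 1
        self.rec = 1  # reset the record counter on every file switch


ice = FieldCursor()   # daily field, assumed 365 records per file
snow = FieldCursor()  # monthly field, assumed 12 records per file

for _ in range(351):
    ice.advance([365, 365])
for _ in range(13):
    snow.advance([12, 12])

print(ice)   # still in the first file
print(snow)  # already at record 1 of the second file
```

With one cursor per field, the monthly field moving to the next year does not drag the daily field along with it.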

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#12 Unread post by kate »

Those who get their code from me should know that I updated my branch to allow multiple boundary files. Note that this changes the ocean.in file.

nma

Re: Multiple input files

#13 Unread post by nma »

Hi Kate,

May I ask you for your code that allows multiple input files? I have multiple (monthly) input files for forcing (ncep2), climatology and boundary (soda), and I have been wondering for quite some time how to use them to make a long inter-annual simulation. I hope your code might help me solve that problem. Correct me if I am wrong!

Have you also made a small test run? If so, I would like to use it too.

thanks and regards,
nilima

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Multiple input files

#14 Unread post by kate »

If you are asking about having multiple files that are sequential in time (a file for March, a file for April, etc.) then that functionality is in the distributed ROMS now. What I added is multiple boundary files so that various fields can each live in their own file. I did it because my fields had four different record timings between them, and each time variable needed to be on an unlimited dimension for use with ncrcat, etc.

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Multiple input files

#15 Unread post by arango »

I will soon add the capability to split the input fields into different files for both climatology and lateral boundary condition NetCDF files. Currently, we can only split records in time. This will be equivalent to what we do with input forcing NetCDF files, which are split by both field (winds, surface tracer fluxes, tides, rivers, etc.) and time record (monthly, annual, etc.). This will be done in the context of ROMS 3.6. We are making a lot of progress with a Python GUI for ROMS input scripts. The ROMS input script is getting very complex with nesting applications.

yang
Posts: 56
Joined: Mon Sep 25, 2006 2:37 pm
Location: Institute of Oceanology, Chinese Academy of Sciences

Re: Multiple input files

#16 Unread post by yang »

How should the time attribute 'cycle_length' be set when using multiple input files?
I am running a climatology case (for example, 10 model years of spin-up). When the climatology is given as a single file (cycle_length = 360), the model is nudged correctly. But when the climatology is split into monthly files, ROMS cannot get the correct time index, especially when the model time is greater than 1 model year.
Input multiple Climatology Files:
Ocn/ocean_clm_month01.nc |
Ocn/ocean_clm_month02.nc |
Ocn/ocean_clm_month03.nc |
Ocn/ocean_clm_month04.nc |
......
Ocn/ocean_clm_month12.nc
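The lookup being asked for can be written down explicitly: wrap the model time into the cycle, then pick the monthly file that covers the wrapped time. The Python sketch below shows only that arithmetic. It assumes twelve files that each span an equal 30-day share of a 360-day cycle, the function name `cyclic_file_and_time` is hypothetical, and it says nothing about how ROMS itself treats cycle_length with multiple files.

```python
def cyclic_file_and_time(tdays, cycle_length=360.0, nfiles=12):
    """Map model time (days) to (0-based file index, time within the cycle)."""
    t_cyc = tdays % cycle_length          # wrap into [0, cycle_length)
    days_per_file = cycle_length / nfiles
    ifile = int(t_cyc // days_per_file)   # which monthly file covers t_cyc
    return ifile, t_cyc


for t in (15.0, 359.5, 375.0, 400.0, 720.0):
    print(t, cyclic_file_and_time(t))
```

Model day 375 in the second year maps back to day 15 of the cycle, which lies in the first monthly file.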

johnluick

Re: Multiple input files

#17 Unread post by johnluick »

Hernan, did you ever get around to enabling the use of multiple input files for climatology?

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Multiple input files

#18 Unread post by arango »

Yes, both the boundary conditions and climatology files can be split into multiple files along the time record dimension. For example, we can have monthly, seasonal, or annual files.
