%files memory allocation: Seg fault with NDEFHIS and/or NDEFAVG > 0

kearneyb10k
Posts: 14
Joined: Tue Oct 16, 2018 4:26 am
Location: University of Washington, JISAO

#1 Unread post by kearneyb10k »

Can someone clarify where memory for the HIS(ng)%files and AVG(ng)%files pointer arrays is allocated?

I'm hitting an odd segmentation fault related to these arrays. If I set NDEFHIS and NDEFAVG to 0, everything is fine. But if I choose a nonzero value, I get a seg fault associated with assigning a value to these arrays (e.g., in output.F):

Code: Select all

HIS(ng)%files(Fcount)=TRIM(HIS(ng)%name)

Interestingly, the error log (in debug mode) identifies the seg fault as occurring at the next print to standard output:

Code: Select all

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
romsG_phys_202207  00000000032EC901  Unknown               Unknown  Unknown
romsG_phys_202207  00000000032EAA3B  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003294084  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003293E96  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003243389  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003246976  Unknown               Unknown  Unknown
libpthread-2.17.s  00002B5E71FC5630  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003255BFB  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003252729  Unknown               Unknown  Unknown
romsG_phys_202207  000000000327F477  Unknown               Unknown  Unknown
romsG_phys_202207  0000000001CFBC59  def_his_                   59  def_his.f90
romsG_phys_202207  00000000005113CB  output_                   105  output.f90
romsG_phys_202207  0000000000417716  main3d_                   254  main3d.f90
romsG_phys_202207  000000000040CBD9  ocean_control_mod         167  ocean_control.f90
romsG_phys_202207  000000000040B696  MAIN__                     86  master.f90
romsG_phys_202207  000000000040B11E  Unknown               Unknown  Unknown
libc-2.17.so       00002B5E723F8555  __libc_start_main     Unknown  Unknown
romsG_phys_202207  000000000040B029  Unknown               Unknown  Unknown

where line 59 of def_his.f90 is the first WRITE in this block:

Code: Select all

      IF (FoundError(exit_flag, NoError, 87,                      &
     &               "ROMS/Utility/def_his.F")) RETURN
      ncname=HIS(ng)%name
!
      IF (Master) THEN
        IF (ldef) THEN
          WRITE (stdout,10) ng, TRIM(ncname)  ! <-- line 59 of def_his.f90
        ELSE
          WRITE (stdout,20) ng, TRIM(ncname)
        END IF
      END IF

but if I add a quick print statement to output.f90, the error migrates to the print that follows the assignment:

Code: Select all

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
romsG_phys_202207  00000000032ECEF1  Unknown               Unknown  Unknown
romsG_phys_202207  00000000032EB02B  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003294674  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003294486  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003243979  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003246F66  Unknown               Unknown  Unknown
libpthread-2.17.s  00002B2522EA6630  Unknown               Unknown  Unknown
romsG_phys_202207  00000000032561EB  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003252D19  Unknown               Unknown  Unknown
romsG_phys_202207  0000000003285EFC  Unknown               Unknown  Unknown
romsG_phys_202207  0000000000511286  output_                   103  output.f90
romsG_phys_202207  0000000000417716  main3d_                   254  main3d.f90
romsG_phys_202207  000000000040CBD9  ocean_control_mod         167  ocean_control.f90
romsG_phys_202207  000000000040B696  MAIN__                     86  master.f90
romsG_phys_202207  000000000040B11E  Unknown               Unknown  Unknown
libc-2.17.so       00002B25232D9555  __libc_start_main     Unknown  Unknown
romsG_phys_202207  000000000040B029  Unknown               Unknown  Unknown

Code: Select all

            IF (Master) THEN
              WRITE (HIS(ng)%name,10) TRIM(HIS(ng)%base), ifile
  10          FORMAT (a,'_',i5.5,'.nc')
            END IF
            print *, "TRIM(HIS(ng)%name)=", TRIM(HIS(ng)%name)
            HIS(ng)%files(Fcount)=TRIM(HIS(ng)%name)
            print *, "TRIM(HIS(ng)%name)=", TRIM(HIS(ng)%name) ! <-- line 103 of output.f90
            IF (HIS(ng)%ncid.ne.-1) THEN
              CALL netcdf_close (ng, iNLM, HIS(ng)%ncid)
            END IF
            CALL def_his (ng, NewFile)

I'm assuming that the actual error is related to something I broke during biogeochemical module development, but once again, the true mistake seems to be well away from the symptom. I'd like to track the HIS(ng)%files array to see where the memory is getting incorrectly deallocated (or never allocated in the first place?), but I can't seem to find those places in the code. Any advice appreciated!

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

#2 Unread post by arango »

It has a problem reporting to standard output. Add a print statement to check the values of ng and ncname. You can modify def_his.f90 in the Build_romsG directory and recompile with the -noclean option. You need a little debugging to figure out what is going on.

kearneyb10k
Posts: 14
Joined: Tue Oct 16, 2018 4:26 am
Location: University of Washington, JISAO

#3 Unread post by kearneyb10k »

As mentioned, the trigger for the seg fault was assignment to an unallocated pointer, not the write itself. Not quite sure why the seg fault didn't occur immediately and instead waited for the next print to standard output... but that's not really relevant to diagnosing the problem.
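
A minimal sketch of a guard that would flag this kind of assignment before it corrupts memory. It is a standalone free-form program, not ROMS code: the io_demo type and the store_name routine are hypothetical stand-ins, and only the files/name component names are borrowed from the thread.

```fortran
program files_guard_demo
  implicit none
  ! Hypothetical stand-in for the ROMS I/O structure, not the real type.
  type :: io_demo
    character (len=256) :: name
    character (len=256), pointer :: files(:) => null()
  end type io_demo
  type (io_demo) :: HIS

  HIS%name='his_00001.nc'
  call store_name (HIS, 1)      ! files never allocated
  allocate ( HIS%files(0) )
  call store_name (HIS, 1)      ! files allocated with zero length
  deallocate ( HIS%files )
  allocate ( HIS%files(1) )
  call store_name (HIS, 1)      ! valid slot, assignment goes ahead

contains

  subroutine store_name (S, Fcount)
    ! Assign the file name only if slot Fcount actually exists.
    type (io_demo), intent(inout) :: S
    integer, intent(in) :: Fcount
    if (.not.associated(S%files)) then
      print *, 'files is not associated; skipping assignment'
    else if ((Fcount.lt.1).or.(Fcount.gt.size(S%files))) then
      print *, 'Fcount =', Fcount, ' is outside 1..', size(S%files)
    else
      S%files(Fcount)=trim(S%name)
      print *, 'stored ', trim(S%files(Fcount))
    end if
  end subroutine store_name
end program files_guard_demo
```

Without such a check, an out-of-bounds store can silently overwrite neighboring memory, which may be why the crash only surfaced at a later, unrelated statement. Compiling with runtime bounds checking (for example -check bounds with the Intel compiler or -fcheck=bounds with gfortran) should catch the zero-length case at the assignment itself.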

I did track down where HIS(ng)%files and the other similar multi-part output file arrays are allocated: in read_PhyPar. My version of the code (forked from the Hedstrom sea ice+extra bio branch) was missing some more recent logic, the increment-OutFiles-by-one part of the code below, which handles situations where the total number of time steps being run is less than the number of time steps in each file. I was running a short two-week test simulation with 10-week output files, so the integer division set OutFiles to 0, hence the memory issues when the code tried to stick a value in a 0-length array.

Code: Select all

      IF ((nHIS(ng).gt.0).and.(ndefHIS(ng).gt.0)) THEN
        OutFiles=ntimes(ng)/ndefHIS(ng)
        IF ((nHIS(ng).eq.ndefHIS(ng)).or.                               &
     &      (MOD(ntimes(ng),ndefHIS(ng)).ge.nHIS(ng))) THEN
          OutFiles=OutFiles+1
        END IF
        CALL edit_file_struct (ng, OutFiles, HIS)
      END IF

I updated my code with the proper logic and the problem is solved.
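
To make the arithmetic concrete, here is a small standalone sketch (free-form Fortran, not ROMS code). The numbers are assumptions chosen only to mimic the scenario: one time step per day, a 14-step run, a history record every step, and a new history file every 70 steps.

```fortran
program outfiles_demo
  implicit none
  ! Hypothetical values: 2-week run, 10-week files, 1 step per day.
  integer :: ntimes, nHIS, ndefHIS, OutFiles

  ntimes=14
  nHIS=1
  ndefHIS=70

  ! Integer division alone truncates 14/70 to zero files.
  OutFiles=ntimes/ndefHIS
  print *, 'without increment: OutFiles =', OutFiles

  ! Same increment test as in the read_PhyPar snippet above: the run
  ! is shorter than one file but still writes at least one record.
  if ((nHIS.eq.ndefHIS).or.                                           &
      (mod(ntimes,ndefHIS).ge.nHIS)) then
    OutFiles=OutFiles+1
  end if
  print *, 'with increment:    OutFiles =', OutFiles
end program outfiles_demo
```

With these values the first print reports 0 and the second reports 1, so the files array gets the single slot that the short run needs.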

arango
Site Admin
Posts: 1367
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

#4 Unread post by arango »

Great, that's the way to do it. One needs to get deeper into debugging to solve the problems. Also, users learn the code more intimately and don't get scared of making changes.

kearneyb10k
Posts: 14
Joined: Tue Oct 16, 2018 4:26 am
Location: University of Washington, JISAO

#5 Unread post by kearneyb10k »

Haha, definitely not afraid to make changes... that's what gets me into these problems! For once the error wasn't of my own making, which is why I had such trouble tracking it down; I just assumed I had broken something and so was concentrating on the portions of code I had modified.
