Segmentation Fault when executing oceanM

Report or discuss software problems and other woes

Moderators: arango, robertson

nirnimesh
Posts: 27
Joined: Tue Sep 11, 2007 6:57 pm
Location: University of Washington

Segmentation Fault when executing oceanM

#1 Unread post by nirnimesh »

Hello Everybody

I am trying to set up ROMS on a new Linux cluster and am having some issues. Before I jump to the issue, I will provide some details of the cluster:

The uname -a command gives the following output:

Code:

Linux acm-chem.psc.sc.edu 2.6.18-128.1.14.el5 #1 SMP Wed Jun 17 06:38:05 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Some of the other components being used are:
OpenMPI 1.2.8
NetCDF-4 with HDF5
Intel Fortran Compiler (ifort 11.1)

Now I am working with the coupled ROMS-SWAN system to study rip currents. I am able to compile my case successfully. When executing the compiled code, I use one processor each for ROMS and SWAN (a sketch of the launch command is at the end of this post). This is the error I get:

Code:

Running on hosts 1
NSLOTS is 2
Time is Sun May 16 13:56:33 EDT 2010
Directory is /home/kumar/Projects/FRFNC/Case1/CODE
 Coupled Input File name = ../RUNFILES/coupling_ripfrfnc.in



 Model Coupling Parallel Threads:

   Ocean Model MPI nodes:   000 - 000
   Waves Model MPI nodes:   001 - 001


   Ocean Export: bath:SSH:Ubar:Vbar
   Waves Export: Wdir:Wamp:Wlen:Wptop:Wpbot:Wdiss:Wbrk:Wubot


 Process Information:

 Node #  0 (pid=   12034) is active.

 Model Input Parameters:  ROMS/TOMS version 3.2  
                          Sunday - May 16, 2010 -  1:56:35 PM
 -----------------------------------------------------------------------------

 SWAN is preparing computation ...


 HOLE7s Massachusets

 Operating system : Linux
 CPU/hardware     : x86_64
 Compiler system  : ifort
 Compiler command : /usr/mpi/intel/openmpi-1.2.8/bin/mpif90
 Compiler flags   : -heap-arrays -fp-model precise -ip -O3 -xW -I/share/apps/MCT/intel/include -free

 Input Script  : ../RUNFILES/ocean_ripfrfnc.in

 SVN Root URL  : https://www.myroms.org/svn/src/trunk
 SVN Revision  : exported

 Local Root    : /home/kumar/Projects/FRFNC/Case1/CODE
 Header Dir    : /home/kumar/Projects/FRFNC/Case1/RUNFILES/
 Header file   : ripfrfnc.h
 Analytical Dir: /home/kumar/Projects/FRFNC/Case1/RUNFILES/

 Resolution, Grid 01: 0126x0158x020,  Parallel Nodes:   1,  Tiling: 001x001


 Physical Parameters, Grid: 01
 =============================

      14400  ntimes          Number of timesteps for 3-D equations.
      0.250  dt              Timestep size (s) for 3-D equations.
         20  ndtfast         Number of timesteps for 2-D equations between
                               each 3D timestep.
          1  ERstr           Starting ensemble/perturbation run number.
          1  ERend           Ending ensemble/perturbation run number.
          0  nrrec           Number of restart records to read from disk.
          T  LcycleRST       Switch to recycle time-records in restart file.
        360  nRST            Number of timesteps between the writing of data
                               into restart fields.
          1  ninfo           Number of timesteps between print of information
                               to standard output.
          T  ldefout         Switch to create a new output NetCDF file(s).
       2400  nHIS            Number of timesteps between the writing fields
                               into history file.
          1  ntsAVG          Starting timestep for the accumulation of output
                               time-averaged data.
       2400  nAVG            Number of timesteps between the writing of
                               time-averaged data into averages file.
          1  ntsDIA          Starting timestep for the accumulation of output
                               time-averaged diagnostics data.
       2400  nDIA            Number of timesteps between the writing of
                               time-averaged data into diagnostics file.
 1.0000E-01  visc2           Horizontal, harmonic mixing coefficient (m2/s)
                               for momentum.
 5.0000E-06  Akt_bak(01)     Background vertical mixing coefficient (m2/s)
                               for tracer 01: temp
 5.0000E-05  Akv_bak         Background vertical mixing coefficient (m2/s)
                               for momentum.
 5.0000E-06  Akk_bak         Background vertical mixing coefficient (m2/s)
                               for turbulent energy.
 5.0000E-06  Akp_bak         Background vertical mixing coefficient (m2/s)
                               for turbulent generic statistical field.
      3.000  gls_p           GLS stability exponent.
      1.500  gls_m           GLS turbulent kinetic energy exponent.
     -1.000  gls_n           GLS turbulent length scale exponent.
 7.6000E-06  gls_Kmin        GLS minimum value of turbulent kinetic energy.
 1.0000E-12  gls_Pmin        GLS minimum value of dissipation.
 5.4770E-01  gls_cmu0        GLS stability coefficient.
 1.4400E+00  gls_c1          GLS shear production coefficient.
 1.9200E+00  gls_c2          GLS dissipation coefficient.
-4.0000E-01  gls_c3m         GLS stable buoyancy production coefficient.
 1.0000E+00  gls_c3p         GLS unstable buoyancy production coefficient.
 1.0000E+00  gls_sigk        GLS constant Schmidt number for TKE.
 1.3000E+00  gls_sigp        GLS constant Schmidt number for PSI.
   1400.000  charnok_alpha   Charnok factor for Zos calculation.
      0.500  zos_hsig_alpha  Factor for Zos calculation using Hsig(Awave).
      0.250  sz_alpha        Factor for Wave dissipation surface tke flux .
    100.000  crgban_cw       Factor for Craig/Banner surface tke flux.
 6.0000E-04  rdrg            Linear bottom drag coefficient (m/s).
 3.0000E-03  rdrg2           Quadratic bottom drag coefficient.
 1.5000E-02  Zob             Bottom roughness (m).
 2.0000E-02  Zos             Surface roughness (m).
 2.0000E-01  Dcrit           Minimum depth for wetting and drying (m).
          1  Vtransform      S-coordinate transformation equation.
          1  Vstretching     S-coordinate stretching function.
 0.0000E+00  theta_s         S-coordinate surface control parameter.
 0.0000E+00  theta_b         S-coordinate bottom  control parameter.
      0.000  Tcline          S-coordinate surface/bottom layer width (m) used
                               in vertical coordinate stretching.
   1025.000  rho0            Mean density (kg/m3) for Boussinesq approximation.
      0.000  dstart          Time-stamp assigned to model initialization (days).
       0.00  time_ref        Reference time for units attribute (yyyymmdd.dd)
 0.0000E+00  Tnudg(01)       Nudging/relaxation time scale (days)
                               for tracer 01: temp
 0.0000E+00  Znudg           Nudging/relaxation time scale (days)
                               for free-surface.
 0.0000E+00  M2nudg          Nudging/relaxation time scale (days)
                               for 2D momentum.
 0.0000E+00  M3nudg          Nudging/relaxation time scale (days)
                               for 3D momentum.
 0.0000E+00  obcfac          Factor between passive and active
                               open boundary conditions.
     10.000  T0              Background potential temperature (C) constant.
     30.000  S0              Background salinity (PSU) constant.
   1027.000  R0              Background density (kg/m3) used in linear Equation
                               of State.
 1.7000E-04  Tcoef           Thermal expansion coefficient (1/Celsius).
 7.6000E-04  Scoef           Saline contraction coefficient (1/PSU).
      1.000  gamma2          Slipperiness variable: free-slip (1.0) or 
                                                    no-slip (-1.0).
          T  Hout(idFsur)    Write out free-surface.
          T  Hout(idUbar)    Write out 2D U-momentum component.
          T  Hout(idVbar)    Write out 2D V-momentum component.
          T  Hout(idUvel)    Write out 3D U-momentum component.
          T  Hout(idVvel)    Write out 3D V-momentum component.
          T  Hout(idWvel)    Write out W-momentum component.
          T  Hout(idOvel)    Write out omega vertical velocity.
          T  Hout(idTvar)    Write out tracer 01: temp
          T  Hout(idUbms)    Write out bottom U-momentum stress.
          T  Hout(idVbms)    Write out bottom V-momentum stress.
          T  Hout(idW2xx)    Write out 2D radiation stress, Sxx.
          T  Hout(idW2xy)    Write out 2D radiation stress, Sxy.
          T  Hout(idW2yy)    Write out 2D radiation stress, Syy.
          T  Hout(idU2rs)    Write out total 2D u-radiation stress.
          T  Hout(idV2rs)    Write out total 2D v-radiation stress.
          T  Hout(idU2Sd)    Write out 2D u-momentum stokes velocity.
          T  Hout(idV2Sd)    Write out 2D v-momentum stokes velocity.
          T  Hout(idW3xx)    Write out 3D horizonrtal radiation stress, Sxx.
          T  Hout(idW3xy)    Write out 3D horizonrtal radiation stress, Sxy.
          T  Hout(idW3yy)    Write out 3D horizonrtal radiation stress, Syy.
          T  Hout(idW3zx)    Write out 3D vertical radiation stress, Szx.
          T  Hout(idW3zy)    Write out 3D vertical radiation stress, Szy.
          T  Hout(idU3rs)    Write out total 3D u-radiation stress.
          T  Hout(idV3rs)    Write out total 3D v-radiation stress.
          T  Hout(idU3Sd)    Write out 3D u-momentum stokes velocity.
          T  Hout(idV3Sd)    Write out 3D v-momentum stokes velocity.
          T  Hout(idWamp)    Write out wave height.
          T  Hout(idWlen)    Write out wave length.
          T  Hout(idWdir)    Write out wave direction.
          T  Hout(idVvis)    Write out vertical viscosity coefficient.
          T  Hout(idMtke)    Write out turbulent kinetic energy.
          T  Hout(idMtls)    Write out turbulent generic length-scale.

 Output/Input Files:

             Output Restart File:  ../OUTPUT/rip_frfnc_5_1_rst.nc
             Output History File:  ../OUTPUT/rip_frfnc_5_1_his.nc
            Output Averages File:  ../OUTPUT/rip_frfnc_5_1_avg.nc
         Output Diagnostics File:  ../OUTPUT/rip_frfnc_5_1_dia.nc
        Physical parameters File:  ../RUNFILES/ocean_ripfrfnc.in
                 Input Grid File:  ../../Forcing/ripgrd_frf_nc_5_1.nc
    Input Nonlinear Initial File:  ../../Forcing/ripgrd_init_frf_nc_5_1.nc
           Input Forcing File 01:  ../../Forcing/ripgrd_wind_frf_nc_5_1.nc

 Tile partition information for Grid 01:  0126x0158x0020  tiling: 001x001

     tile     Istr     Iend     Jstr     Jend     Npts

        0        1      126        1      158   398160

 Tile minimum and maximum fractional grid coordinates:
   (interior points only)

     tile     Xmin     Xmax     Ymin     Ymax     grid

        0     0.50   126.50     0.50   158.50  RHO-points

        0     1.00   126.00     0.50   158.50    U-points

        0     0.50   126.50     1.00   158.00    V-points

 Maximum halo size in XI and ETA directions:

               HaloSizeI(1) =     411
               HaloSizeJ(1) =     507
                TileSide(1) =     163
                TileSize(1) =   21353


 Activated C-preprocessing Options:

 RIPFRFNC            HOLE7s Massachusets
 ANA_BSFLUX          Analytical kinematic bottom salinity flux.
 ANA_BTFLUX          Analytical kinematic bottom temperature flux.
 ANA_FSOBC           Analytical free-surface boundary conditions.
 ANA_M2OBC           Analytical 2D momentum boundary conditions.
 ANA_SSFLUX          Analytical kinematic surface salinity flux.
 ANA_STFLUX          Analytical kinematic surface temperature flux.
 ASSUMED_SHAPE       Using assumed-shape arrays.
 AVERAGES            Writing out time-averaged fields.
 DIAGNOSTICS_UV      Computing and writing momentum diagnostic terms.
 DJ_GRADPS           Parabolic Splines density Jacobian (Shchepetkin, 2002).
 DOUBLE_PRECISION    Double precision arithmetic.
 EAST_FSCHAPMAN      Eastern edge, free-surface, Chapman condition.
 EAST_M2FLATHER      Eastern edge, 2D momentum, Flather condition.
 EAST_M3GRADIENT     Eastern edge, 3D momentum, gradient condition.
 EAST_TGRADIENT      Eastern edge, tracers, gradient condition.
 GLS_MIXING          Generic Length-Scale turbulence closure.
 KANTHA_CLAYSON      Kantha and Clayson stability function formulation.
 MASKING             Land/Sea masking.
 MCT_LIB             Using Model Coupling Toolkit library.
 MIX_S_UV            Mixing of momentum along constant S-surfaces.
 MPI                 MPI distributed-memory configuration.
 NEARSHORE_MELLOR    Nearshore RAdiation Stress Terms.
 NONLINEAR           Nonlinear Model.
 !NONLIN_EOS         Linear Equation of State for seawater.
 NORTHERN_WALL       Wall boundary at Northern edge.
 N2S2_HORAVG         Horizontal smoothing of buoyancy and shear.
 OUT_DOUBLE          Double precision output fields in NetCDF files.
 POWER_LAW           Power-law shape time-averaging barotropic filter.
 PROFILE             Time profiling activated .
 K_GSCHEME           Third-order upstream advection of TKE fields.
 !RST_SINGLE         Double precision fields in restart NetCDF file.
 SOLVE3D             Solving 3D Primitive Equations.
 SOUTHERN_WALL       Wall boundary at Southern edge.
 SPLINES             Conservative parabolic spline reconstruction.
 SWAN_COUPLING       Two-way SWAN/ROMS coupling.
 THREE_GHOST         Using three Ghost Points in halo regions.
 TS_MPDATA           Recursive flux corrected MPDATA 3D advection of tracers.
 UV_ADV              Advection of momentum.
 UV_U3HADVECTION     Third-order upstream horizontal advection of 3D momentum.
 UV_C4VADVECTION     Fourth-order centered vertical advection of momentum.
 UV_LOGDRAG          Logarithmic bottom stress.
 UV_VIS2             Harmonic mixing of momentum.
 VAR_RHO_2D          Variable density barotropic mode.
 WAVES_OCEAN         Two-way wave-ocean models coupling.
 WESTERN_WALL        Wall boundary at Western edge.
 WET_DRY             Wetting and drying activated.

 INITIAL: Configuring and initializing forward nonlinear model ...


 Vertical S-coordinate System: 

 level   S-coord     Cs-curve          at_hmin  over_slope     at_hmax

    20   0.0000000   0.0000000           0.000       0.000       0.000
    19  -0.0500000  -0.0500000           0.096      -0.080      -0.256
    18  -0.1000000  -0.1000000           0.192      -0.160      -0.512
    17  -0.1500000  -0.1500000           0.288      -0.240      -0.767
    16  -0.2000000  -0.2000000           0.384      -0.319      -1.023
    15  -0.2500000  -0.2500000           0.480      -0.399      -1.279
    14  -0.3000000  -0.3000000           0.577      -0.479      -1.535
    13  -0.3500000  -0.3500000           0.673      -0.559      -1.790
    12  -0.4000000  -0.4000000           0.769      -0.639      -2.046
    11  -0.4500000  -0.4500000           0.865      -0.719      -2.302
    10  -0.5000000  -0.5000000           0.961      -0.798      -2.558
     9  -0.5500000  -0.5500000           1.057      -0.878      -2.814
     8  -0.6000000  -0.6000000           1.153      -0.958      -3.069
     7  -0.6500000  -0.6500000           1.249      -1.038      -3.325
     6  -0.7000000  -0.7000000           1.345      -1.118      -3.581
     5  -0.7500000  -0.7500000           1.441      -1.198      -3.837
     4  -0.8000000  -0.8000000           1.537      -1.278      -4.092
     3  -0.8500000  -0.8500000           1.633      -1.357      -4.348
     2  -0.9000000  -0.9000000           1.730      -1.437      -4.604
     1  -0.9500000  -0.9500000           1.826      -1.517      -4.860
     0  -1.0000000  -1.0000000           1.922      -1.597      -5.116

 Time Splitting Weights: ndtfast =  20    nfast =  29

    Primary            Secondary            Accumulated to Current Step

  1-0.0009651193358779 0.0500000000000000-0.0009651193358779 0.0500000000000000
  2-0.0013488780126037 0.0500482559667939-0.0023139973484816 0.1000482559667939
  3-0.0011514592651645 0.0501156998674241-0.0034654566136460 0.1501639558342179
  4-0.0003735756740661 0.0501732728306823-0.0038390322877122 0.2003372286649002
  5 0.0009829200513762 0.0501919516143856-0.0028561122363360 0.2505291802792858
  6 0.0029141799764308 0.0501428056118168 0.0000580677400948 0.3006719858911026
  7 0.0054132615310267 0.0499970966129952 0.0054713292711215 0.3506690825040978
  8 0.0084687837865132 0.0497264335364439 0.0139401130576347 0.4003955160405417
  9 0.0120633394191050 0.0493029943471183 0.0260034524767397 0.4496985103876600
 10 0.0161716623600090 0.0486998273761630 0.0421751148367486 0.4983983377638230
 11 0.0207585511322367 0.0478912442581626 0.0629336659689853 0.5462895820219856
 12 0.0257765478740990 0.0468533167015507 0.0887102138430843 0.5931428987235363
 13 0.0311633730493853 0.0455644893078458 0.1198735868924696 0.6387073880313821
 14 0.0368391158442262 0.0440063206553765 0.1567127027366958 0.6827137086867585
 15 0.0427031802506397 0.0421643648631652 0.1994158829873354 0.7248780735499237
 16 0.0486309868367616 0.0400292058506332 0.2480468698240970 0.7649072794005569
 17 0.0544704302037591 0.0375976565087951 0.3025173000278562 0.8025049359093520
 18 0.0600380921294285 0.0348741349986072 0.3625553921572847 0.8373790709079592
 19 0.0651152103984763 0.0318722303921358 0.4276706025557610 0.8692513013000949
 20 0.0694434033194839 0.0286164698722119 0.4971140058752449 0.8978677711723068
 21 0.0727201499285569 0.0251442997062377 0.5698341558038018 0.9230120708785445
 22 0.0745940258796570 0.0215082922098099 0.6444281816834588 0.9445203630883544
 23 0.0746596950216180 0.0177785909158270 0.7190878767050768 0.9622989540041814
 24 0.0724526566618460 0.0140456061647461 0.7915405333669228 0.9763445601689276
 25 0.0674437485167025 0.0104229733316538 0.8589842818836253 0.9867675335005814
 26 0.0590334053485720 0.0070507859058187 0.9180176872321973 0.9938183194064002
 27 0.0465456732896125 0.0040991156383901 0.9645633605218099 0.9979174350447904
 28 0.0292219798521905 0.0017718319739095 0.9937853403740005 0.9996892670186999
 29 0.0062146596259994 0.0003107329813000 0.9999999999999999 0.9999999999999999

 ndtfast, nfast =   20  29   nfast/ndtfast = 1.45000

 Centers of gravity and integrals (values must be 1, 1, approx 1/2, 1, 1):

    1.000000000000 1.060707743385 0.530353871693 1.000000000000 1.000000000000

 Power filter parameters, Fgamma, gamma =  0.28400   0.14200

 Minimum X-grid spacing, DXmin =  4.00000000E-03 km
 Maximum X-grid spacing, DXmax =  4.00000000E-03 km
 Minimum Y-grid spacing, DYmin =  4.00000000E-03 km
 Maximum Y-grid spacing, DYmax =  4.00000000E-03 km
 Minimum Z-grid spacing, DZmin = -9.60853172E-02 m
 Maximum Z-grid spacing, DZmax =  2.55775430E-01 m

 Minimum barotropic Courant Number =  4.24519066E-03
 Maximum barotropic Courant Number =  3.13071788E-02
 Maximum Coriolis   Courant Number =  3.64500000E-05


 NLM: GET_STATE - Read state initial conditions,             t =     0 00:00:00
                   (File: ripgrd_init_frf_nc_5_1.nc, Rec=0001, Index=1)
                - free-surface
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
                - vertically integrated u-momentum component
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
                - vertically integrated v-momentum component
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
                - u-momentum component
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
                - v-momentum component
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
                - potential temperature
                   (Min =  1.00000000E+01 Max =  1.00000000E+01)
    GET_2DFLD   - surface u-momentum stress,                 t =     0 00:00:00
                   (Rec=0001, Index=1, File: ripgrd_wind_frf_nc_5_1.nc)
                   (Tmin=          0.0000 Tmax=       3600.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
    GET_2DFLD   - surface v-momentum stress,                 t =     0 00:00:00
                   (Rec=0001, Index=1, File: ripgrd_wind_frf_nc_5_1.nc)
                   (Tmin=          0.0000 Tmax=       3600.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)

 Maximum grid stiffness ratios:  rx0 =   5.857685E+00 (Beckmann and Haidvogel)
                                 rx1 =   2.284497E+02 (Haney)


 Initial basin volumes: TotVolume =  8.6976144384E+05 m3
                        MinVolume = -1.2499783885E+00 m3
                        MaxVolume =  4.0640533610E+00 m3
                          Max/Min = -3.2512989013E+00

      OCN2WAV   - (07) imported and (04) exported fields,    t =     0 00:00:00
                - ROMS coupling exchanges wait clock (s):
                   (Recv= 1.16152350E+01 Send= 1.99900000E-03)
                - ROMS Import: wave direction
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Import: significant wave height
                   (Min=  0.00000000E+00 Max=  2.00687870E-02)
                - ROMS Import: average wave length
                   (Min=  0.00000000E+00 Max=  3.11923289E+00)
                - ROMS Import: surface wave relative peak period
                   (Min=  0.00000000E+00 Max=  1.90365362E+00)
                - ROMS Import: bottom wave period
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Import: wave energy dissipation
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Import: percent wave breaking
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Import: wave bottom orbital velocity
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Export: bathymetry
                   (Min= -1.92170634E+00 Max=  5.11550860E+00)
                - ROMS Export: free-surface
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Export: vertically integrated u-momentum component
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)
                - ROMS Export: vertically integrated v-momentum component
                   (Min=  0.00000000E+00 Max=  0.00000000E+00)


NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 00000001 - 00014400)
    GET_2DFLD   - surface u-momentum stress,                 t =  1800 00:00:00
                   (Rec=0002, Index=2, File: ripgrd_wind_frf_nc_5_1.nc)
                   (Tmin=          0.0000 Tmax=       3600.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
    GET_2DFLD   - surface v-momentum stress,                 t =  1800 00:00:00
                   (Rec=0002, Index=2, File: ripgrd_wind_frf_nc_5_1.nc)
                   (Tmin=          0.0000 Tmax=       3600.0000)
                   (Min =  0.00000000E+00 Max =  0.00000000E+00)
      WAV2OCN   - (04) imported and (07) exported fields,    t = 20000101.000000
                - SWAN coupling exchanges wait clock (s):
                   (Recv= 3.99900000E-03 Send= 4.99900000E-03)

   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME

      0     0 00:00:00  0.000000E+00  1.723853E+01  1.723853E+01  8.826878E+05
[compute-1-2:12034] *** Process received signal ***
[compute-1-2:12034] Signal: Segmentation fault (11)
[compute-1-2:12034] Signal code: Address not mapped (1)
[compute-1-2:12034] Failing at address: 0xfffffff98c396e80
[compute-1-2:12034] [ 0] /lib64/libpthread.so.0 [0x2b49984894c0]
[compute-1-2:12034] [ 1] ./oceanM(radiation_stress_mod_mp_radiation_stress_tile_+0xaa7a) [0x45ce7a]
[compute-1-2:12034] [ 2] ./oceanM(radiation_stress_mod_mp_radiation_stress_+0x550) [0x4523d0]
[compute-1-2:12034] [ 3] ./oceanM(main3d_+0x36f) [0x44cecf]
[compute-1-2:12034] [ 4] ./oceanM(ocean_control_mod_mp_roms_run_+0x58) [0x446488]
[compute-1-2:12034] [ 5] ./oceanM(MAIN__+0x20d) [0x44631d]
[compute-1-2:12034] [ 6] ./oceanM(main+0x3c) [0x4460ec]
[compute-1-2:12034] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x387c81d974]
[compute-1-2:12034] [ 8] ./oceanM [0x445ff9]
[compute-1-2:12034] *** End of error message ***
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source             
oceanM             00000000009C0998  Unknown               Unknown  Unknown
oceanM             000000000070CC32  Unknown               Unknown  Unknown
oceanM             00000000006D8E49  Unknown               Unknown  Unknown
oceanM             000000000071077C  Unknown               Unknown  Unknown
oceanM             000000000069D0AA  Unknown               Unknown  Unknown
oceanM             0000000000680F84  Unknown               Unknown  Unknown
oceanM             000000000044640E  Unknown               Unknown  Unknown
oceanM             00000000004460EC  Unknown               Unknown  Unknown
libc.so.6          000000387C81D974  Unknown               Unknown  Unknown
oceanM             0000000000445FF9  Unknown               Unknown  Unknown
mpirun noticed that job rank 0 with PID 12034 on node compute-1-2.local exited on signal 11 (Segmentation fault). 

*****************************************************************************************************
My understanding from the error is that there is a tiling bug in radiation_stress.F which is creating this error. The radiation_stress.F file in my case is not the same as the one distributed through the Rutgers Subversion repository; it has been modified to include a few more processes that Dr. John Warner and I have been working on. Since I am new to coding in ROMS, I would not be surprised to find a tiling error.

What surprises me, though, is that I also have a different cluster, the details of which are provided below, and the same simulation runs with no error on that cluster.

Code:

Linux wave 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
Other details:
Compiler: PGI
MPICH
NetCDF-4 with HDF5

******************************************************************************************************
I have tried using the PGI compiler along with OpenMPI 1.2.8, NetCDF-4, and HDF5 on the first cluster, and I still get the same error, so I am not sure what is going wrong. It would be very nice if someone could point me to a possible reason for the segmentation fault. Feel free to ask me for any information you need.
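For completeness, this is roughly how the job is launched (a sketch only; the actual job script on the cluster sets the host list and the NSLOTS value shown in the log above):

Code:

# Sketch of the coupled ROMS-SWAN launch; paths are the ones from the log above.
# One MPI process goes to ROMS and one to SWAN, as assigned in coupling_ripfrfnc.in.
cd /home/kumar/Projects/FRFNC/Case1/CODE
mpirun -np 2 ./oceanM ../RUNFILES/coupling_ripfrfnc.in > run.log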

Thanks
Nirnimesh

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#2 Unread post by jcwarner »

This is not clear.
I suggest you do the following:
1) Try to compile the code with DEBUG=on and run oceanG to see if that gives any more info (a sketch follows below).
2) If there is a "tiling bug" as you say, then running it with one processor for SWAN and one for ROMS does NOT introduce any tiling, so I am not sure how this could be a tiling bug.
3) Does it work if you use the radiation_stress.F that is distributed on the Rutgers site?
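A minimal sketch of step 1, assuming the usual ROMS build.bash workflow (adjust to however you actually compile):

Code:

# In build.bash, turn on the Fortran debugging flags:
#   export USE_DEBUG=on
# then rebuild; the resulting executable is named oceanG instead of oceanM.
./build.bash
# Run the coupled case exactly as before, but with the debug executable:
mpirun -np 2 ./oceanG ../RUNFILES/coupling_ripfrfnc.in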

-j

nirnimesh
Posts: 27
Joined: Tue Sep 11, 2007 6:57 pm
Location: University of Washington

Re: Segmentation Fault when executing oceanM

#3 Unread post by nirnimesh »

(1) The errors after compiling with DEBUG=on and running oceanG are shown below:

Code:

Running on hosts 1
NSLOTS is 2
Time is Sun May 16 20:32:33 EDT 2010
Directory is /home/kumar/Projects/FRFNC/Case1/CODE
 Coupled Input File name = ../RUNFILES/coupling_ripfrfnc.in

 Model Coupling Parallel Threads:

   Ocean Model MPI nodes:   000 - 000
   Waves Model MPI nodes:   001 - 001


   Output/Input Files:

             Output Restart File:  ../OUTPUT/rip_frfnc_5_1_rst.nc
             Output History File:  ../OUTPUT/rip_frfnc_5_1_his.nc
            Output Averages File:  ../OUTPUT/rip_frfnc_5_1_avg.nc
         Output Diagnostics File:  ../OUTPUT/rip_frfnc_5_1_dia.nc
        Physical parameters File:  ../RUNFILES/ocean_ripfrfnc.in
                 Input Grid File:  ../../Forcing/ripgrd_frf_nc_5_1.nc
    Input Nonlinear Initial File:  ../../Forcing/ripgrd_init_frf_nc_5_1.nc
           Input Forcing File 01:  ../../Forcing/ripgrd_wind_frf_nc_5_1.nc

 Tile partition information for Grid 01:  0126x0158x0020  tiling: 001x001

     tile     Istr     Iend     Jstr     Jend     Npts

        0        1      126        1      158   398160

 Tile minimum and maximum fractional grid coordinates:
   (interior points only)

     tile     Xmin     Xmax     Ymin     Ymax     grid

        0     0.50   126.50     0.50   158.50  RHO-points

        0     1.00   126.00     0.50   158.50    U-points

        0     0.50   126.50     1.00   158.00    V-points

 Maximum halo size in XI and ETA directions:

               HaloSizeI(1) =     411
               HaloSizeJ(1) =     507
                TileSide(1) =     163
                TileSize(1) =   21353


 
[compute-1-2:18844] *** Process received signal ***
[compute-1-2:18844] Signal: Floating point exception (8)
[compute-1-2:18844] Signal code: Invalid floating point operation (7)
[compute-1-2:18844] Failing at address: 0x87ed66
[compute-1-2:18844] [ 0] /lib64/libpthread.so.0 [0x2b75b9f2a4c0]
[compute-1-2:18844] [ 1] ./oceanG(swrbc_+0x1458) [0x87ed66]
[compute-1-2:18844] [ 2] ./oceanG(swan_initialize_+0x1008) [0x86c5c0]
[compute-1-2:18844] [ 3] ./oceanG(MAIN__+0x4ba) [0x4466e2]
[compute-1-2:18844] [ 4] ./oceanG(main+0x3c) [0x44620c]
[compute-1-2:18844] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x387c81d974]
[compute-1-2:18844] [ 6] ./oceanG [0x446119]
[compute-1-2:18844] *** End of error message ***
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source             
libopen-pal.so.0   00002ABC35FF90F8  Unknown               Unknown  Unknown
libmpi.so.0        00002ABC35AEAED8  Unknown               Unknown  Unknown
libmpi.so.0        00002ABC35B1E565  Unknown               Unknown  Unknown
libmpi_f77.so.0    00002ABC358A503C  Unknown               Unknown  Unknown
oceanG             0000000000C29AB9  Unknown               Unknown  Unknown
oceanG             0000000000C292BA  Unknown               Unknown  Unknown
oceanG             0000000000447FAB  ocean_coupler_mod         160  ocean_coupler.f90
oceanG             0000000000446B30  ocean_control_mod         105  ocean_control.f90
oceanG             0000000000446758  MAIN__                    100  master.f90
oceanG             000000000044620C  Unknown               Unknown  Unknown
libc.so.6          000000387C81D974  Unknown               Unknown  Unknown
oceanG             0000000000446119  Unknown               Unknown  Unknown
mpirun noticed that job rank 1 with PID 18844 on node compute-1-2.local exited on signal 8 (Floating point exception). 
Now I am not sure what exactly the error is.

(2) Well, then I should not call it a "tiling bug" but rather an error that occurs when the code tries to access a variable that has not been assigned a value.

(3) Yes, it works fine when I use the radiation_stress.F from the Rutgers site. To check this, I have also run simulations of the inlet_test case.
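As an aside, the reason the oceanG run now reports source files and line numbers in the traceback (ocean_coupler.f90 line 160, ocean_control.f90 line 105, master.f90 line 100) is the debug flag set. Roughly, USE_DEBUG=on switches ifort to options like the ones sketched below; this is an assumption based on a typical Compilers/Linux-ifort.mk, so check the .mk file in your source tree for the exact list:

Code:

# Illustrative ifort debug options of the kind USE_DEBUG=on selects:
#   -g              keep symbols so tracebacks show file/line information
#   -traceback      print a Fortran traceback on run-time errors
#   -check bounds   trap out-of-bounds array accesses
# e.g., for a single preprocessed file in the Build directory:
mpif90 -g -traceback -check bounds -free -c mod_kinds.f90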

Thanks

Nirnimesh

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#4 Unread post by jcwarner »

what is on line 160 of ocean_coupler.f90?

nirnimesh
Posts: 27
Joined: Tue Sep 11, 2007 6:57 pm
Location: University of Washington

Re: Segmentation Fault when executing oceanM

#5 Unread post by nirnimesh »

This is line 160 of ocean_coupler.f90:

Code:

CALL MCTWorld_init (Nmodels, MPI_COMM_WORLD, OCN_COMM_WORLD,      &
     &                    OCNid)
Also, if required, line 105 of ocean_control.f90 is:

Code:

 CALL initialize_ocn2wav_coupling (ng, MyRank)

tmortlock
Posts: 4
Joined: Thu Dec 31, 2015 1:05 pm
Location: Macquarie University

Re: Segmentation Fault when executing oceanM

#6 Unread post by tmortlock »

Hi all

Did anyone solve this problem in the end? I have a similar issue with coawstM, rather than oceanM, but I am guessing it is the same cause.

I have been trying to run the coupled Inlet Test Case example for COAWST. So far I've built coawstM.exe, and then I try to run it with mpirun:

cd /home/ThomasMortlock/COAWST/MyWorkingDir/Inlet_test/Coupled/
mpirun -np 2 ./coawstM coupling_inlet_test.in > output.out

and I get this error (similar to above):

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0xFFFFFFFFFFFFFFFF
#1 0xFFFFFFFFFFFFFFFF
#2 0xFFFFFFFFFFFFFFFF
#3 0xFFFFFFFFFFFFFFFF
#4 0xFFFFFFFFFFFFFFFF
#5 0xFFFFFFFFFFFFFFFF
#6 0xFFFFFFFFFFFFFFFF
#7 0xFFFFFFFFFFFFFFFF
#8 0xFFFFFFFFFFFFFFFF
#9 0xFFFFFFFFFFFFFFFF
#10 0xFFFFFFFFFFFFFFFF
#11 0xFFFFFFFFFFFFFFFF
#12 0xFFFFFFFFFFFFFFFF
#13 0xFFFFFFFFFFFFFFFF
#14 0xFFFFFFFFFFFFFFFF
#15 0xFFFFFFFFFFFFFFFF
#16 0xFFFFFFFFFFFFFFFF
#17 0xFFFFFFFFFFFFFFFF
#18 0xFFFFFFFFFFFFFFFF
#19 0xFFFFFFFFFFFFFFFF

At the end of my .out file it says:

Process Information:
Node # 0 (pid= 0) is active.
mpirun noticed that process rank 1 with PID 7640 on node SCI-7495 exited on signal 11 (Segmentation fault).

I have attached the .out file here.

Does anyone have any idea how I can resolve this issue? I am using Cygwin (64-bit). After some searching, I think it may have to do with the amount of virtual memory allocated to Cygwin on Windows, but I am unsure.

Any ideas would be most appreciated, thanks, Tom

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#7 Unread post by jcwarner »

Seeing that you are pointing coawstM directly at the coupling.in file, I will assume this issue is related to the paths of all the files. The coupling file has paths to ocean.in and swan.in, and those in turn have paths to the grids, etc. Make sure all the files are at the correct paths.
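A quick way to sanity-check those paths (a sketch; the keyword names OCN_name/WAV_name and GRDNAME/ININAME/FRCNAME are the usual ones in COAWST/ROMS input scripts, and ocean_inlet_test.in is assumed as the ocean input file name, so adjust to your setup):

Code:

# Verify that every file referenced from the input scripts exists relative to
# the directory mpirun is launched from.
cd /home/ThomasMortlock/COAWST/MyWorkingDir/Inlet_test/Coupled/
grep -E 'OCN_name|WAV_name' coupling_inlet_test.in        # paths to the ocean and SWAN input scripts
grep -E 'GRDNAME|ININAME|FRCNAME' ocean_inlet_test.in     # grid, initial, and forcing file paths
for f in $(grep -hoE '[^ =!]+\.(in|nc|bot|grd)' coupling_inlet_test.in ocean_inlet_test.in); do
  [ -f "$f" ] && echo "ok:      $f" || echo "MISSING: $f"
done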

ywang152
Posts: 20
Joined: Wed Mar 27, 2019 2:31 am
Location: Stevens Institute of Technology

Re: Segmentation Fault when executing oceanM

#8 Unread post by ywang152 »

tmortlock wrote: [...] I am using Cygwin (64-bit). After some searching, I think it may have to do with the amount of virtual memory allocated to Cygwin on Windows, but I am unsure.

Hi Tom,

I have the same problem as you in Cygwin. My run works only if I use a single node. Did you resolve this problem in the end? I would appreciate your advice.
Thanks in advance.

Best regards
Yifan

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#9 Unread post by jcwarner »

I am not sure what you mean by a similar problem. How about we start over, and you explain the issue with as much information as you can? The last few lines of stdout saying "#0 0xFFFFFFFFFFFFFFFF" don't really tell me anything, so if you have something similar I would need to see the whole stdout. Also, try compiling with DEBUG=on first to see if that provides useful information.

-j

ywang152
Posts: 20
Joined: Wed Mar 27, 2019 2:31 am
Location: Stevens Institute of Technology

Re: Segmentation Fault when executing oceanM

#10 Unread post by ywang152 »

jcwarner wrote: [...] Also, try compiling with DEBUG=on first to see if that provides useful information.

-j
Thanks for your reply! I plan to work on running ROMS on a Linux server first.
I have completed the upwelling test case by running oceanG, built with build.bash and 'USE_DEBUG=on', on the Linux server. However, the goal is to test parallel runs, so I tried to build oceanM with the following setup in build.bash:

Code:

 export           USE_MPI=            # distributed-memory parallelism
 export        USE_MPIF90=on            # compile with mpif90 script
#export         which_MPI=mpich         # compile with MPICH library
#export         which_MPI=mpich2        # compile with MPICH2 library
 export         which_MPI=openmpi       # compile with OpenMPI library

#export        USE_OpenMP=on            # shared-memory parallelism

#export              FORT=ifort
#export              FORT=gfortran
export              FORT=pgi

# export         USE_DEBUG=on            # use Fortran debugging flags
 export         USE_LARGE=on            # activate 64-bit compilation
export       USE_NETCDF4=on            # compile with NetCDF-4 library
export   USE_PARALLEL_IO=on            # Parallel I/O with Netcdf-4/HDF5
Then I got a lot of error messages:

Code:

....
cd /home/ywang/roms/Projects/Upwelling/Build; /opt/pgi_with_netcdf/linux86-64/16.10/bin/pgf90 -c  -Kieee -O3 -Mfree mod_kinds.f90
/usr/bin/cpp -P -traditional -DLINUX -DX86_64 -DPGI -D'ROOT_DIR="/home/ywang/roms/trunk"' -DUPWELLING -D'HEADER="upwelling.h"' -D'ROMS_HEADER="/home/ywang/roms/Proling/upwelling.h"' -DNestedGrids= -D'ANALYTICAL_DIR="/home/ywang/roms/Projects/Upwelling"' -D'MY_ANALYTICAL="on"' -D'SVN_REV="904M"' -IROMS/Include -I/home/ywang/rs/Upwelling -IROMS/Nonlinear -IROMS/Nonlinear/Biology -IROMS/Nonlinear/Sediment -IROMS/Utility -IROMS/Drivers -IROMS/Functionals -I/home/ywang/roms/Projects/Upwellr -ICompilers -D'HEADER_DIR="/home/ywang/roms/Projects/Upwelling"'  ROMS/Modules/mod_param.F > /home/ywang/roms/Projects/Upwelling/Build/mod_param.f90
ROMS/Bin/cpp_clean /home/ywang/roms/Projects/Upwelling/Build/mod_param.f90
cd /home/ywang/roms/Projects/Upwelling/Build; /opt/pgi_with_netcdf/linux86-64/16.10/bin/pgf90 -c  -Kieee -O3 -Mfree mod_param.f90
/tmp/pgf902hxwI21SDUtT.s: Assembler messages:
/tmp/pgf902hxwI21SDUtT.s:7230: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm1'
/tmp/pgf902hxwI21SDUtT.s:7240: Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm0'
/tmp/pgf902hxwI21SDUtT.s:7254: Error: no such instruction: `vinserti128 $1,%xmm2,%ymm2,%ymm3'
/tmp/pgf902hxwI21SDUtT.s:7263: Error: no such instruction: `vinserti128 $1,%xmm2,%ymm2,%ymm2'
/tmp/pgf902hxwI21SDUtT.s:7282: Error: suffix or operands invalid for `vpaddd'
/tmp/pgf902hxwI21SDUtT.s:7284: Error: suffix or operands invalid for `vpsrld'
/tmp/pgf902hxwI21SDUtT.s:7285: Error: suffix or operands invalid for `vpaddd'
/tmp/pgf902hxwI21SDUtT.s:7286: Error: suffix or operands invalid for `vpsrad'
/tmp/pgf902hxwI21SDUtT.s:7287: Error: suffix or operands invalid for `vpaddd'
/tmp/pgf902hxwI21SDUtT.s:7289: Error: suffix or operands invalid for `vpsrld'
/tmp/pgf902hxwI21SDUtT.s:7290: Error: suffix or operands invalid for `vpaddd'
....
I just realized I cannot even build oceanS. If I comment out 'USE_DEBUG=on' and 'USE_PARALLEL_IO=on', I still get those error messages. I have attached the build.bash and Linux-pgi.mk files in case they are helpful.

Thanks for your attention!

Best regards
Yifan
Attachments
Linux-pgi.mk
(5.88 KiB) Downloaded 504 times
build.bash
(18.28 KiB) Downloaded 470 times

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#11 Unread post by jcwarner »

- I have not used PGI in a while, so I did a Google search on
"Error: no such instruction: `vinserti128 $1,%xmm0,%ymm0,%ymm1'"
and found this:
https://www.pgroup.com/userforum/viewto ... 5965caa1f6
It looks like a PGI issue.

- Another thought: you would not need parallel I/O for oceanS (serial). I suggest you try to compile without parallel I/O.

- Also, DEBUG=on can work with MPI, serial, or OpenMP builds.

But overall it looks like a PGI issue; try searching those forums.
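One workaround that has been reported for this class of assembler error (an assumption here, not verified on this system) is to tell pgf90 to target a more generic processor so it stops emitting AVX2 instructions such as vinserti128, which the old system assembler does not understand; alternatively, update binutils. A quick test outside the build system, reusing the compile line from the error log:

Code:

# Recompile one of the generated files with a generic target processor (-tp px).
# If this succeeds, add "-tp px" to FFLAGS in Compilers/Linux-pgi.mk, or update
# binutils so /usr/bin/as understands the AVX2 instructions.
cd /home/ywang/roms/Projects/Upwelling/Build
/opt/pgi_with_netcdf/linux86-64/16.10/bin/pgf90 -c -Kieee -O3 -Mfree -tp px mod_param.f90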

ywang152
Posts: 20
Joined: Wed Mar 27, 2019 2:31 am
Location: Stevens Institute of Technology

Re: Segmentation Fault when executing oceanM

#12 Unread post by ywang152 »

jcwarner wrote: [...] But overall it looks like a PGI issue; try searching those forums.
Thank you very much!
I tried another compiler, gfortran, and I finally got oceanS to build!
Next, I would like to test a parallel run. It doesn't have to be the upwelling case; I just want to make sure this command works on my Linux cluster:
'mpirun -np 2 oceanM ocean_upwelling.in', since I have already installed PnetCDF and HDF5 for parallel support. I hope I can get it to work.
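A sketch of the build.bash settings for that parallel test, following the snippet quoted earlier in this thread (note that in that snippet USE_MPI itself was left blank; it has to be set to "on" for oceanM to be built):

Code:

 export           USE_MPI=on            # distributed-memory parallelism
 export        USE_MPIF90=on            # compile with mpif90 script
 export         which_MPI=openmpi       # compile with OpenMPI library
 export              FORT=gfortran
 export         USE_LARGE=on            # activate 64-bit compilation
# Then rebuild and run; make sure NtileI*NtileJ in ocean_upwelling.in equals
# the number of MPI processes (2 here):
./build.bash
mpirun -np 2 ./oceanM ocean_upwelling.in > upwelling.out
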
Thank you again!

Best regards
Yifan

HONGWANG
Posts: 14
Joined: Wed Jan 19, 2022 1:52 pm
Location: UM

Re: Segmentation Fault when executing coawstM:Case SANDY

#13 Unread post by HONGWANG »

Hi all

I know this is an old thread, but I have a similar issue with coawstM when I run the classic SANDY case for COAWST. Did anyone solve this problem in the end?

I've built coawstM.exe, and then I try to run it with mpirun:

cd /home/coawst/Build_COAWST/COAWST/Projects/Sandy/
mpirun -np 3 ./coawstM coupling_sandy.in > log2.file

and I get this error:

SWAN grid 1 is preparing computation
SWAN grid 2 is preparing computation
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
module_io_quilt_old.F 2931 F
Quilting with 1 groups of 0 I/O tasks.
#0 0x2BA09CE526D7
#1 0x2BA09CE52D1E
#2 0x2BA09D8E53FF
#3 0x8CD7C3 in __interp_swan_mod_MOD_swan_ref_init
#4 0x8A299B in __waves_control_mod_MOD_swan_driver_init
#5 0x40FCE7 in MAIN__ at master.f90:?

At the end of my .out file it says:

Process Information:

Node # 0 (pid= 1868) is active.
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 1869 RUNNING AT group5-wh.novalocal
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

Does anyone have any idea how I can resolve this issue? Any ideas would be most appreciated. Thanks, Hong.

jcwarner
Posts: 1200
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: Segmentation Fault when executing oceanM

#14 Unread post by jcwarner »

Can you post this message here:
https://github.com/jcwarner-usgs/COAWST/issues
