Do all work under /work00/"username".

Edit the Makefile of the data-splitting tool (util_split-2010-09-25) as follows.

    Line 16: NCDIR = /work00/lib-mpi/netcdf-c-4.6.3-mpi
    Line 27: NCDIR = /work00/lib-mpi/netcdf-fortran-4.4.5-mpi

Set the library search path:

    export LD_LIBRARY_PATH="/work00/lib-mpi/zlib-1.2.11-mpi/lib:/work00/lib-mpi/hdf5-1.10.5-mpi/lib:/work00/lib-mpi/netcdf-c-4.6.3-mpi/lib:/work00/lib-mpi/netcdf-fortran-4.4.5-mpi/lib:/work00/lib-mpi/gtool5-20160613-mpi/lib:/work00/lib-mpi/ispack-1.0.4-mpi:/work00/lib-mpi/spml-0.8.0-mpi/lib"

@/work00/ono/dcpam

    $ ldd bin/dcpam_main

This checks whether every library needed to run dcpam can be found on the library path. Run the command on the compute nodes as well, not only on the management node. If any entry in the output says "not found", the program fails at startup and cannot run.

The job ran with the following script.

    #!/bin/sh
    #PBS -N earth_test
    #PBS -j oe
    #PBS -o /work00/ono/dcpam
    #PBS -q long
    #PBS -l select=2:ncpus=1
    # Make the MPI-enabled libraries visible to the dynamic loader.
    export LD_LIBRARY_PATH="/work00/lib-mpi/zlib-1.2.11-mpi/lib:/work00/lib-mpi/hdf5-1.10.5-mpi/lib:/work00/lib-mpi/netcdf-c-4.6.3-mpi/lib:/work00/lib-mpi/netcdf-fortran-4.4.5-mpi/lib:/work00/lib-mpi/gtool5-20160613-mpi/lib:/work00/lib-mpi/ispack-1.0.4-mpi:/work00/lib-mpi/spml-0.8.0-mpi/lib"
    # Move to the directory from which the job was submitted.
    cd "$PBS_O_WORKDIR"
    # Run dcpam on 2 processes; stdout and stderr both go to earth.log.
    /work00/openmpi1/bin/mpiexec --hostfile /work00/openmpi1/host -np 2 ./bin/dcpam_main -N=./conf/dcpam_E_T21L26.conf > earth.log 2>&1
    exit 0

Data-merging tool

    $ wget http://www.gfd-dennou.org/library/dcpam/related-program/util_merge-2011-03-28-2.tgz
    $ tar zxvf util_merge-2011-03-28-2.tgz
    $ cd util_merge-2011-03-28-2/
    $ vim Makefile
    $ vim merge.nml
    $ cp merge.nml ../
    $ cp merge_ncf ../
    $ ./merge_ncf
    $ exit

@local Ubuntu

    $ sftp ono@nozomi
    $ cd /work00/ono/dcpam2
    $ get U.nc
    $ get V.nc
    $ get Temp.nc
    $ exit
    $ gpview Temp.nc@Temp,time=360,sig=5

Commands for generating the initial data and the surface data, followed by the error output recorded when the surface-data command could not find libnetcdff.so.6:

    /work00/openmpi1/bin/mpiexec -n 2 ./bin/dcpam_init_data -N=./conf/init_data_E_T21L26.conf
    /work00/openmpi1/bin/mpiexec -n 2 ./bin/dcpam_init_data_surface -N=./conf/surface_data_E_T21.conf

    /dcpam_init_data_surface -N=./conf/surface_data_E_T21.conf
    ./bin/dcpam_init_data_surface: error while loading shared libraries: libnetcdff.so.6: cannot open shared object file: No such file or directory
    ./bin/dcpam_init_data_surface: error while loading shared libraries: libnetcdff.so.6: cannot open shared object file: No such file or directory
    ./bin/dcpam_init_data_surface: error while loading shared libraries: libnetcdff.so.6: cannot open shared object file: No such file or directory
    --------------------------------------------------------------------------
    Primary job terminated normally, but 1 process returned
    a non-zero exit code. Per user-direction, the job has been aborted.
    --------------------------------------------------------------------------
    ./bin/dcpam_init_data_surface: error while loading shared libraries: libnetcdff.so.6: cannot open shared object file: No such file or directory
    --------------------------------------------------------------------------
    mpiexec detected that one or more processes exited with non-zero status, thus causing
    the job to be terminated. The first process to do so was:

      Process name: [[55648,1],0]
      Exit code:    127
    -------------------------------------------------------------------
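
The `ldd bin/dcpam_main` check in these notes can be scripted so that missing libraries such as libnetcdff.so.6 are caught before a job is submitted. The following is only a minimal sketch: the helper name `missing_libs` is hypothetical (it is not part of dcpam), and it assumes that `ldd` marks unresolved libraries with the text `=> not found`, as glibc's `ldd` does.

```shell
#!/bin/sh
# missing_libs (hypothetical helper, not part of dcpam):
# reads ldd-style output on stdin and prints the name of every shared
# library that the dynamic loader reports as "not found".
# Assumption: unresolved entries look like "libfoo.so.N => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

A possible use is `ldd bin/dcpam_main | missing_libs`, run after the `export LD_LIBRARY_PATH=...` line; an empty result means every library was resolved on that node. Because the notes require the same check on the compute nodes, the pipeline would also have to be run there, for example from inside the PBS job script before the `mpiexec` line.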