m000% tracejob -n 30 11999
Job: 11999.m000
11/22/2005 10:36:53 S Job Queued at request of user@m000.cfca.nao.ac.jp,
owner = user@m000.cfca.nao.ac.jp, job name = test.sh,
queue = short
11/22/2005 10:36:53 S Job Modified at request of
Scheduler@m000.cfca.nao.ac.jp
11/22/2005 10:36:53 S enqueuing into short, state 1 hop 1
11/22/2005 10:36:53 A queue=short
11/22/2005 13:26:53 L Server job limit reached
11/22/2005 13:35:27 L Considering job to run
11/22/2005 13:35:27 S Job Modified at request of
Scheduler@m000.cfca.nao.ac.jp
11/22/2005 13:35:27 S Job Run at request of Scheduler@m000.cfca.nao.ac.jp
on hosts m004
11/22/2005 13:35:28 L Job run
11/22/2005 13:35:28 A user=user group=naocc jobname=test.sh queue=short
ctime=1132623413 qtime=1132623413 etime=1132623413
start=1132634128 exec_host=m004/0
Resource_List.cput=00:30:00 Resource_List.ncpus=1
Resource_List.neednodes=1 Resource_List.nodect=1
Resource_List.nodes=1
11/22/2005 13:39:28 S Obit received
11/22/2005 13:39:28 S Exit_status=0 resources_used.cpupercent=8
resources_used.cput=00:00:28 resources_used.mem=4324kb
resources_used.ncpus=1 resources_used.vmem=21992kb
resources_used.walltime=00:03:59
11/22/2005 13:39:28 A user=user group=naocc jobname=test.sh queue=short
ctime=1132623413 qtime=1132623413 etime=1132623413
start=1132634128 exec_host=m004/0
Resource_List.cput=00:30:00 Resource_List.ncpus=1
Resource_List.neednodes=1 Resource_List.nodect=1
Resource_List.nodes=1 session=24075 end=1132634368
Exit_status=0 resources_used.cpupercent=8
resources_used.cput=00:00:28 resources_used.mem=4324kb
resources_used.ncpus=1 resources_used.vmem=21992kb
resources_used.walltime=00:03:59
11/22/2005 13:45:44 S dequeuing from short, state 5
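
The ctime, qtime, start, and end fields in the accounting (A) records are Unix epoch seconds, so queue wait and run time can be derived by subtraction. The following is a minimal sketch using the values from the record above; the helper name hms and the variable names are illustrative and not part of PBS.

```shell
#!/bin/sh
# Epoch timestamps copied from the accounting record above.
qtime=1132623413   # job entered the queue
start=1132634128   # job started running
end=1132634368     # job finished

# hms SECONDS: format a number of seconds as HH:MM:SS (hypothetical helper).
hms() {
    printf '%02d:%02d:%02d\n' $(($1 / 3600)) $(($1 % 3600 / 60)) $(($1 % 60))
}

wait_time=$(hms $((start - qtime)))
run_time=$(hms $((end - start)))

echo "queue wait: $wait_time"
echo "run time:   $run_time"
```

The queue wait agrees with the log timestamps (queued at 10:36:53, run at 13:35:28). The run time computed this way can differ by a second from resources_used.walltime, which the record reports as 00:03:59.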
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
244.m000         JOB_A            user             00:00:09 R short
245.m000         JOB_B            user             00:00:00 H short
246.m000         JOB_C            user             00:00:00 R short
stty: standard input: Invalid argument
forrtl: severe (174): SIGSEGV, possible program stack overflow occurred.
Program requirements exceed current stacksize resource limit.
Superusers may try increasing this resource by 'limit stacksize xxx',
where xxx is unlimited or something larger than your current limit.
Other users should contact your system administrator for help.
#!/bin/sh
#PBS -r y
#PBS -m ae
#PBS -q long
#PBS -l nodes=1
# This job's working directory
echo "Working directory is $PBS_O_WORKDIR"
cd "$PBS_O_WORKDIR" || exit 1
echo "Running on host $(hostname)"
echo "Time is $(date)"
echo "Directory is $(pwd)"
# Run your executable
ulimit -s unlimited # ★ add this line
./a.out
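
Before relying on the ulimit line in the job script, it may help to check which stack limits the shell actually has. The sketch below is an assumption-laden illustration, not part of the original script: it prints the soft and hard limits, tries to raise the soft limit to unlimited, and falls back to the hard limit if that request is refused.

```shell
#!/bin/sh
# Report the soft and hard stack size limits for this shell.
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# Try to raise the soft limit to unlimited.
# If the hard limit forbids that, raise the soft limit as far as allowed.
if ! ulimit -S -s unlimited 2>/dev/null; then
    ulimit -S -s "$hard"
fi
echo "stack limit now: $(ulimit -S -s)"
```

If the final value is still a finite number, the hard limit on that node is the ceiling, and raising it further is a matter for the system administrator, as the forrtl message suggests.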
This can happen when communication with the compute node is poor.
Try running the qdel command with the -W force option:
qdel -W force job_ID ...
The -W force option means "Deletes the job whether or not the job's execution host is reachable."
See man qdel for details.
If the job still cannot be deleted with qdel even with this option, the job may have run into some kind of fault.
In that case, please report it from this page.
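
The procedure above (a normal qdel first, then qdel -W force, then a report if both fail) can be sketched as a small shell function. The function name delete_job and its messages are hypothetical; only qdel and its -W force option come from the text.

```shell
#!/bin/sh
# delete_job JOBID: try a normal qdel first, then retry with -W force.
# delete_job is a hypothetical wrapper, not a PBS command.
delete_job() {
    if qdel "$1"; then
        echo "deleted $1"
    elif qdel -W force "$1"; then
        echo "force-deleted $1"
    else
        # Both attempts failed: the job itself may be in a faulty state.
        echo "could not delete $1; please report it" >&2
        return 1
    fi
}

# Example call, using a sample job ID from the qstat listing:
# delete_job 244.m000
```

The function returns a non-zero status only when both attempts fail, which is the case where a report to the administrators is appropriate.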
(Last updated: 2024 or earlier)