SGE job submission for the cc2-login cluster


This page describes how to submit jobs with the SGE system.
Author: Sébastien RAVEL
Creation Date: 30/03/2018
Last Modified Date: 30/03/2018

Keywords : qsub, qrsh, job, cc2-login

Summary

How to run jobs correctly

!!! WARNING !!! Do not start programs directly on the master node of the cluster (cf: cc2-admin).

1 - Use Module load

On the cluster, tools are not loaded by default: each user must “load” the programs they want to use. This system makes it possible to have several versions of the same tool without creating conflicts. The disadvantage is that if you forget to load the module, the program does not work… Here are 3 ways to manage modules:

A - Load modules by default on connection

If you often use the same programs, you can add the module load command to your .bashrc file. For example, for Git and Python 2.7, which I use often:

gedit /homedir/{YourName}/.bashrc
# Copy/paste the following line to load the modules permanently:
module load system/git/2.8.3 system/python/2.7.9
# Save the file, quit gedit and reload your configuration (or kill and open a new connection):
source ~/.bashrc
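
To check what is available and what is currently loaded, the standard Environment Modules commands can be used (a quick sketch; the module name below is just the example from above):

# list all the modules available on the cluster
module avail
# list the modules currently loaded in your session
module list
# unload a module you no longer need
module unload system/git/2.8.3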

B - Loading before job submission (before qsub)

This method loads the modules before running qsub. The -V option is then MANDATORY so that your environment (including the loaded modules) is passed on to the compute node.

##### load modules
module load mpi/openmpi/1.6.5 compiler/gcc/4.9.2 bioinfo/RAxML/8.2.4
##### run job
qsub -V -N raxmlALL -cwd -q long.q -pe parallel_smp 24 ./run_raxml.sh

C - Loading in Job Script

This is the most reliable way to make the job work (we always forget to load modules before qsub…). To have the modules loaded directly on the compute node, simply add the module load command in the script.sh, before the program command. Example in the file __run_raxml.sh__:

##### content of run_raxml.sh
module load mpi/openmpi/1.6.5 compiler/gcc/4.9.2 bioinfo/RAxML/8.1.17
raxmlHPC-PTHREADS -T 24 -n 2241Ortho-82souches -f d -m GTRGAMMA -p 12345 -s /work/sravel/phylogenomique3/raxmlConcat/2241Ortho-82souches.fasta

##### run job
qsub -V -N raxmlALL -cwd -q long.q -pe parallel_smp 24 ./run_raxml.sh

PS: keep the -V argument in your qsub command, because it is sometimes still needed even when the modules are loaded in the script.
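
For reference, a complete version of run_raxml.sh could look like this (a sketch: the shebang line is an assumption, the module versions and the program call are those of the example above):

#!/bin/bash
# run_raxml.sh: load the required modules, then call the program
module load mpi/openmpi/1.6.5 compiler/gcc/4.9.2 bioinfo/RAxML/8.1.17
raxmlHPC-PTHREADS -T 24 -n 2241Ortho-82souches -f d -m GTRGAMMA -p 12345 -s /work/sravel/phylogenomique3/raxmlConcat/2241Ortho-82souches.fasta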

2 - Run Job

A - qsub mode

To submit a job, use the qsub command:

# For normal job
qsub -V -N NomJob -cwd -q long.q script.sh

# example if the program command is in a script.sh (here run_Lorma.sh)
qsub -V -N Lorma -cwd -q long.q run_Lorma.sh

# OR you can pass the command directly (avoid this: it leaves no trace of the parameters used)
qsub -V -N Lorma -cwd -q long.q /work/sravel/lorma.sh -s /work/sravel/MinION/Minion_Sanger/eBSMYV_SeqSangerBAC29H14.fasta

With the following arguments:

Job submission arguments
-i file              Use file as the standard input of the job
-o file              Write the job's standard output (what would normally be printed in the terminal) to file
-e file              Write the job's standard error output to file (useful when the job returns errors)
-N name              Name the job; used in the names of the output files (default is STDIN)
-cwd                 Use the current working directory for input and output, rather than /homedir/username/
-q queue             Specify a queue
-M email_address     Receive job information by e-mail:
  -m beas            Select the events to be notified about (any combination of the letters below):
    b: beginning of the job
    e: end of the job
    a: job aborted
    s: job suspended
-l mem_free=nG       Run the job with "n" GB of RAM (see below)
-pe parallel_smp n   Run the job with "n" threads (see below)
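
As an illustration, here is a sketch of a submission combining several of these options (the job name, output file names, e-mail address and resource values are placeholders):

# 8 threads, 20 GB of RAM, dedicated output/error files, e-mail at the end of the job
qsub -V -N myJob -cwd -q long.q -pe parallel_smp 8 -l mem_free=20G \
     -o myJob.out -e myJob.err -M me@example.org -m e script.sh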

B - qrsh mode (for testing scripts)

To avoid running tests on the master node, you can request an interactive job. This type of job has advantages but also disadvantages:

It should therefore be used ONLY for debugging.

The command, qrsh, is similar to qsub and takes at least the -q argument:

# For interactive job
qrsh -q normal.q
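
A typical debugging session could then look like this (a sketch; my_test.py is a placeholder, the python module is the one used earlier on this page):

# open an interactive session on a compute node
qrsh -q normal.q
# load what you need and run your test
module load system/python/2.7.9
python my_test.py
# release the node as soon as the test is finished
exit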

3 - Knowing / requesting resources

In computing, a program relies on 2 types of resources:

The CPU: a machine has a number of cores, and each core can be divided into threads. A program uses the CPU when it needs to perform many calculations.

RAM, expressed in GB or TB, corresponds to the working memory of the computer. Information is preloaded into RAM so that calculations run faster (access time is much faster than reading from disk). A program that needs to load files into memory uses more RAM.

Once this theory is understood, you will notice that programs expose parameters such as the number of threads to use or the amount of memory to allocate.

These are the parameters to tune to increase the computing power given to a job.
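
For example, with the RAxML job from section 1-C, the -T 24 option passed to raxmlHPC-PTHREADS should match the number of threads requested from SGE with -pe parallel_smp:

# inside run_raxml.sh, the program is told to use 24 threads:
#   raxmlHPC-PTHREADS -T 24 ...
# so the job must request 24 slots from the scheduler:
qsub -V -N raxmlALL -cwd -q long.q -pe parallel_smp 24 ./run_raxml.sh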

On the cluster, the available resources are quite large:

Queue                  Threads   RAM
long.q and normal.q    48        200 GB
bigmem.q               96        2.6 TB

4 - Job resources

By default, a job uses 1 thread and 10 GB of RAM.

# for a job with default parameters
qsub -V -N NomJob -cwd -q long.q script.sh

To speed up the job, you have to request more resources.

A - Request more RAM

# for a job with 20 GB of RAM:
qsub -V -N NomJob -cwd -q long.q -l mem_free=20G script.sh

B - Request more Threads

# for a parallel job with 24 threads:
qsub -V -N NomJob -cwd -q long.q -pe parallel_smp 24 script.sh
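
Both options can also be combined in a single submission (a sketch; the values are examples):

# for a parallel job with 24 threads and 40 GB of RAM:
qsub -V -N NomJob -cwd -q long.q -pe parallel_smp 24 -l mem_free=40G script.sh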

5 - Get information about running jobs

Job status commands
qstat            Displays the status of all jobs
qstat -f         Displays the status of all queues (long list)
qstat -u "*"     Displays the status of the jobs of all users
qstat -g c       Displays the resources available
qstat -j jobid   Displays the status of a particular job (jobid = first column of qstat)
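
Two convenient variants for monitoring your own jobs (a sketch; $USER expands to your login name):

# show only your own jobs
qstat -u $USER
# refresh the full job list every 10 seconds (Ctrl+C to quit)
watch -n 10 qstat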

6 - Delete job

# delete one job
qdel jobID

# delete  multiple jobs:
qdel `seq -f "%.0f" 876775 876778`

# where 876775 is the first job to delete and 876778 the last
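
Most SGE installations also accept qdel -u to remove all jobs belonging to a user; treat this as an assumption and check with cc2-admin before relying on it:

# delete all of your own jobs at once (assumes qdel -u is enabled on this cluster)
qdel -u $USER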

7 - More info

https://doc.cc.in2p3.fr/en:ge_submit_a_job_qsub