Using the GPU node

South Green tutorials pages

Description	Know how to use the GPU node of the i-Trop cluster
Author	Julie ORJUELA (julie.orjuela_at_ird.fr) and Aurore COMTE (aurore.comte_at_ird.fr)
Creation date	27/01/2020
Modification date	27/01/2020

Summary

Objectives
Launch jobs on the GPU node with Slurm
Resource monitoring with nvidia-smi
Links
License

Objectives

Know how to launch a Slurm job on the GPU node of the i-Trop cluster and how to monitor jobs running on the GPUs.

Basecalling with guppy-gpu using the i-Trop GPU node

The GPU node of the i-Trop cluster has 8 RTX2080 graphics cards, each with 124G of RAM. In total, this node has 24 threads.

Guppy is a data processing toolkit that contains the Oxford Nanopore Technologies’ basecalling algorithms, and several bioinformatic post-processing features.

Basecalling with guppy can be launched using the guppy-gpu tool. In the guppy command you have to specify the folder containing the raw fast5 read files (-i), the output directory where the fastq files are written (-s), the number of worker threads per basecaller (--cpu_threads_per_caller) and the number of parallel basecallers to create (--num_callers). We recommend compressing the fastq output (--compress_fastq).

We recommend basecalling a data set on a single graphics card so that all results end up in one folder. If you split the data you can use all of the graphics cards, but the results will be spread across several folders. Reads in different result folders can share names, so you can lose information if you decide to merge the folders.
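If you do split a data set across cards, the result folders have to be merged with care. The following is a minimal sketch, not part of the original workflow: merge_results is a hypothetical helper that copies the fastq files into one folder and prefixes each copy with part1_, part2_ and so on, so that files with identical names do not overwrite each other. It assumes that each folder passed to it directly contains the fastq files, and it does not change the read names inside the files.

```shell
#!/bin/bash
# Hypothetical helper (not part of guppy): merge fastq files from several
# result folders into one folder without overwriting same-named files.
# Usage: merge_results DEST RESULT_DIR [RESULT_DIR ...]
merge_results() {
    local dest=$1
    shift
    mkdir -p "$dest"
    local n=0 dir file
    for dir in "$@"; do
        n=$((n + 1))
        for file in "$dir"/*.fastq*; do
            # Skip folders that contain no fastq files
            [ -e "$file" ] || continue
            # Prefix the copy with the position of its source folder
            cp "$file" "$dest/part${n}_$(basename "$file")"
        done
    done
}

# Example call (placeholder paths):
# merge_results /path/to/merged /path/to/fastq_cuda0 /path/to/fastq_cuda1
```

Because the prefix comes from the position of the folder on the command line, keep a note of the order you used if you need to trace a file back to its source.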

Creating a Slurm script for basecalling on the GPU

Copy the data to /scratch on node26 before launching basecalling.
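The copy step could look like the sketch below. It rests on several assumptions that the tutorial does not confirm: that node26 is reachable over ssh from where the data lives, and that a per-user folder /scratch/$USER is an acceptable destination. stage_to_scratch is a hypothetical helper name; setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/bash
# Hypothetical helper: copy a fast5 folder to /scratch on node26.
# Assumptions: node26 accepts ssh/scp, and /scratch/$USER is only an
# example destination chosen for this sketch.
# Usage: stage_to_scratch SOURCE_DIR [DEST_DIR]
stage_to_scratch() {
    local src=$1
    local dest=${2:-/scratch/$USER}
    local run=""
    # With DRY_RUN=1 the commands are printed, not executed
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Create the destination folder on the node, then copy recursively
    $run ssh node26 "mkdir -p '$dest'"
    $run scp -r "$src" "node26:$dest/"
}

# Example call (placeholder path):
# DRY_RUN=1 stage_to_scratch /path/to/fast5
```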

Create an sbatch script to allocate resources through Slurm. Here, the sbatch script lauchGuppyGPU.sbash requests 4 threads (-c 4) to launch guppy-gpu on the gpu partition (-p gpu). Users of the i-Trop GPU node belong to gpu_group, so pass this account to Slurm with the -A option.

 
#!/bin/bash
#SBATCH -J Basecalling
#SBATCH -p gpu
#SBATCH -A gpu_group
#SBATCH -c 4
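# Positional arguments: fast5 input folder, fastq output folder, CUDA card index (0 to 7)
INPUT=$1
OUTPUT=$2
CUDA=$3

# Load the guppy-gpu module
module load bioinfo/guppy-gpu/3.2.4

# Run basecalling on the selected graphics card
guppy_basecaller -c dna_r9.4.1_450bps_hac.cfg -i "${INPUT}" -r -s "${OUTPUT}" --num_callers 4 --gpu_runners_per_device 8 --qscore_filtering --min_qscore 7 -x "cuda:${CUDA}"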

Now you can launch the lauchGuppyGPU.sbash script, giving it the input folder, the output folder and the CUDA graphics card to use (from 0 to 7).

In this example, basecalling runs only on cuda 0:

$ sbatch lauchGuppyGPU.sbash /path/to/fast5 /path/to/fastq 0
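If you choose to split the data and use several cards, the same script can be submitted once per card. This is a minimal sketch under stated assumptions: the fast5 files were split beforehand into subfolders named part0 to part7 (hypothetical names), and submit_per_card and the SBATCH_CMD variable are illustrative, not part of Slurm. Each job writes to its own output folder, so the results end up in several folders.

```shell
#!/bin/bash
# Sketch: submit one basecalling job per graphics card.
# Assumption: INPUT_ROOT contains subfolders part0, part1, ... (one per card).
# Usage: submit_per_card INPUT_ROOT OUTPUT_ROOT [NUMBER_OF_CARDS]
submit_per_card() {
    local input_root=$1
    local output_root=$2
    local n_cards=${3:-8}
    # Set SBATCH_CMD=echo to print the submissions instead of sending them
    local sbatch_cmd=${SBATCH_CMD:-sbatch}
    local card
    for ((card = 0; card < n_cards; card++)); do
        $sbatch_cmd lauchGuppyGPU.sbash \
            "${input_root}/part${card}" \
            "${output_root}/cuda${card}" \
            "${card}"
    done
}

# Example call (placeholder paths), printing the 8 submissions only:
# SBATCH_CMD=echo submit_per_card /path/to/fast5_split /path/to/fastq
```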

Note: Besides the path to the fast5 files folder (-i), the basecaller requires an output path (-s) and either a config file or a flowcell/kit combination. To list the available flowcell/kit combinations and config files, use:

$ guppy_basecaller --print_workflows
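The list printed by this command can be long, so it may help to keep only the lines for your flowcell or kit. A small sketch: filter_workflows is a hypothetical helper that reads the list on standard input, and FLO-MIN106 is used only as an example flowcell name.

```shell
#!/bin/bash
# Sketch: keep only the lines that mention a given flowcell or kit.
# The match is case-insensitive and treats the pattern as plain text.
# Usage: guppy_basecaller --print_workflows | filter_workflows PATTERN
filter_workflows() {
    local pattern=$1
    grep -i -F -- "$pattern"
}

# Example call:
# guppy_basecaller --print_workflows | filter_workflows FLO-MIN106
```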

Resource monitoring with nvidia-smi

$ nvidia-smi
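nvidia-smi also accepts query options, which can be handy for checking how busy each card is before choosing a cuda index. The following is a minimal sketch: gpu_usage and least_used_gpu are hypothetical helper names, and picking the card with the lowest utilization is only one possible rule, not a cluster policy.

```shell
#!/bin/bash
# Print one CSV line per card: index, GPU utilization (%), memory used (MiB).
gpu_usage() {
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used \
               --format=csv,noheader,nounits
}

# Read such lines on standard input and print the index of the card
# with the lowest utilization (the first card wins on ties).
least_used_gpu() {
    awk -F', *' '
        NR == 1 || $2 + 0 < min { min = $2 + 0; idx = $1 }
        END { if (NR > 0) print idx }
    '
}

# Example calls on the GPU node:
# gpu_usage | least_used_gpu
# watch -n 5 nvidia-smi    # refresh the full report every 5 seconds
```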

Links

Related courses: Slurm Trainings

License

The resource material is licensed under the Creative Commons Attribution 4.0 International License.