South Green Trainings pages

Description: Hands On Lab Exercises for HPC
Related-course materials: HPC
Authors: Ndomassi TANDO (ndomassi.tando@ird.fr)
Creation Date: 25/11/2019
Last Modified Date: 30/03/2020

Preamble

Getting connected to a Linux server from Windows with the SSH (Secure Shell) protocol

  Software    Description
  mobaXterm   An enhanced terminal for Windows with an X11 server and a tabbed SSH client
  PuTTY       Connects to a Linux server from a Windows workstation

Transferring and copying files from your computer to a Linux server with SFTP (SSH File Transfer Protocol)

  Software    Description
  FileZilla   FTP and SFTP client

Viewing and editing files on your computer before transferring them to the Linux server, or directly on the remote server

  Type                          Software
  Remote, console mode          nano
  Remote, console mode          vi
  Remote, graphical mode        Komodo Edit
  Linux & Windows based editor  Notepad++

Practice 1: Connect to a Linux server with SSH

In mobaXterm:

  1. Click the session button, then click SSH.
    • In the remote host text box, type: bioinfo-master.ird.fr
    • Check the "Specify username" box and enter your user name
  2. In the console, enter the password when prompted. Once you are successfully logged in, you will use this console for the rest of the lecture.
  3. Type the command sinfo and comment on the result
  4. Type the command sinfo -N nodes --long and note what has been added

Practice 2: Reserve one core of a node using srun and create your working folder

  1. Type the command squeue and note the result
  2. Type the command squeue -u your_login, replacing your_login with your account name, and note the difference
  3. Get more details with the command: squeue -O "username,name:40,partition,nodelist,NumCPUs,state,timeused,timelimit"
  4. Type the command srun -p short --pty bash -i, then squeue again
  5. Create your own working folder in the /scratch of your node, where login is a name of your choice:
cd /scratch
mkdir login
  6. Type the following command, replacing nodeX with any node except the one you are already connected to:
ssh nodeX "ls -al /scratch/"

Practice 3: Transferring files with FileZilla (SFTP)

Download and install FileZilla
Open FileZilla and save the IRD cluster into the site manager

In the FileZilla menu, go to File > Site Manager. Then go through these 5 steps:

  1. Click New Site.
  2. Add a custom name for this site.
  3. Add the hostname bioinfo-nas.ird.fr to have access to /data2/formation
  4. Set the Logon Type to “Normal” and enter the username and password you use to connect to the IRD cluster
  5. Press the “Connect” button.

Transferring files

  1. From your computer to the cluster: click and drag a text file from the left (local) column to the right (remote) column
  2. From the cluster to your computer: click and drag a text file from the right (remote) column to the left (local) column
  3. Retrieve the file HPC_french.pdf from the folder /data/projects/formation/ in the right (remote) window

Practice 4: Transfer your data from the NAS server to the node

  1. Using scp, transfer the folder TPassembly located in /data2/formation/Slurm into your working directory
  2. Check your result with ls
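
A minimal sketch of the two steps above. It assumes that the data are served by bioinfo-nas.ird.fr (the host used in Practice 3) and that your working folder is /scratch/login as created in Practice 2; adapt both to your case. The command is printed first and is only executed when RUN=1 is set. RUN is a hypothetical switch added here so that the snippet can be tried safely anywhere; it is not an scp option.

```shell
# Assumptions: NAS hostname from Practice 3, working folder from Practice 2.
nas_host="bioinfo-nas.ird.fr"
src_dir="/data2/formation/Slurm/TPassembly"
dest_dir="/scratch/login"        # replace login with your folder name

# -r copies the folder recursively; the remote side is written host:path.
scp_cmd="scp -r $nas_host:$src_dir $dest_dir/"
echo "$scp_cmd"

# Run the copy only on request (RUN is a hypothetical switch, see above).
if [ "${RUN:-0}" = "1" ]; then
    $scp_cmd
    ls "$dest_dir/TPassembly"    # step 2: check the result
fi
```

On the node, either run the printed command directly or start the snippet with RUN=1.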

Practice 5: Use module environment to load your tools

  1. Load the ea-utils v2.7 module
  2. Check that the tool is loaded
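
One possible way to carry out these two steps is sketched below. The module name bioinfo/ea-utils/2.7 is an assumption modelled on bioinfo/abyss/1.9.0; confirm the exact name with module avail before loading it. check_tool is a hypothetical helper written for this example.

```shell
# Assumption: the module name follows the bioinfo/<tool>/<version> pattern
# of bioinfo/abyss/1.9.0; confirm the exact name with "module avail".
module_name="bioinfo/ea-utils/2.7"

# Hypothetical helper: report whether a command is reachable through PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

# The module command only exists where environment modules are installed.
if command -v module >/dev/null 2>&1; then
    module load "$module_name" || echo "could not load $module_name"
    module list                  # the loaded module should appear here
fi

# fastq-stats is one of the ea-utils programs (it is used in Practice 6).
check_tool fastq-stats || echo "load the module first"
```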

Practice 6: Launch analyses

Get stats on fastq

  1. Go into the folder TPassembly/Ebola
  2. Launch the command fastq-stats ebola1.fastq
  3. Launch the command fastq-stats -D ebola1.fastq

Perform an assembly with abyss-pe

With the ABySS software, we reassemble the sequences from the two FASTQ files ebola1.fastq and ebola2.fastq.

Launch the commands:

module load bioinfo/abyss/1.9.0
abyss-pe k=35 in='ebola1.fastq ebola2.fastq' name=k35

NB: you can do the same thing using srun directly from the master, assuming that the data have been transferred to the /scratch of your node and that you know the node name:

From the master, type the following commands:

module load bioinfo/abyss/1.9.0
srun -p partitionname --nodelist=nodename --chdir=/scratch/login/TPassembly/Ebola abyss-pe k=35 -j1 np=1 in='ebola1.fastq ebola2.fastq' name=k35

In this command:

    • -p indicates the partition to use; replace partitionname with the name of the partition
    • --nodelist indicates the node to use; replace nodename with the name of the node
    • --chdir sets the working directory, that is, the directory on the node in which the analysis runs


Practice 7: Transferring data to the NAS server

  1. Using scp, transfer your results from your /scratch/login to your /home/login
  2. Check if the transfer is OK with ls

Practice 8: Deleting your temporary folder

cd /scratch
rm -r login
exit

Practice 9: Launch a job with sbatch

Following the steps performed in the previous practices, create a script that launches the analysis from Practice 6:

1st step: create the Slurm section in your script

1) Set a name for your job

2) Specify your email address

3) Choose the short partition

2nd step: type the commands you want the script to launch:

1) Create a personal folder in /scratch with mkdir

2) Using scp, transfer the folder TPassembly located in /data2/formation into your working directory

3) Load abyss version 1.9.0 with module load

4) In the folder TPassembly/Ebola, launch the following command:

abyss-pe k=35 in='ebola1.fastq ebola2.fastq' name=k35

5) Using scp, transfer your results from your /scratch/login to your /home/login

6) Delete the personal folder in /scratch
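
The two steps above can be sketched as follows. This is one possible layout, not a reference solution. It rests on several assumptions: the file name abyss_job.sh, the job name and the email address are placeholders; the working folder is named after your account ($USER); the data and your home directory are reached through bioinfo-nas.ird.fr (the host used in Practice 3); and scp can run from the node without a password prompt. The snippet writes the script to a file and checks its syntax; it does not submit anything.

```shell
# Write the job script (file name abyss_job.sh is a placeholder).
cat > abyss_job.sh <<'EOF'
#!/bin/bash
# 1st step: Slurm section
#SBATCH --job-name=abyss_k35          # 1) job name (placeholder)
#SBATCH --mail-user=you@example.org   # 2) placeholder address, use your own
#SBATCH --mail-type=END,FAIL          #    send a mail when the job ends or fails
#SBATCH --partition=short             # 3) the short partition

# Stop at the first error so that /scratch is not cleaned after a failed copy.
set -e

# Assumptions: NAS host from Practice 3, folder named after your account.
nas=bioinfo-nas.ird.fr
workdir="/scratch/$USER"

# 2nd step: commands
mkdir -p "$workdir"                                          # 1) personal folder
scp -r "$nas:/data2/formation/TPassembly" "$workdir/"        # 2) fetch the data
module load bioinfo/abyss/1.9.0                              # 3) load abyss
cd "$workdir/TPassembly/Ebola"                               # 4) run the assembly
abyss-pe k=35 in='ebola1.fastq ebola2.fastq' name=k35
scp -r "$workdir/TPassembly/Ebola" "$nas:/home/$USER/"       # 5) save the results
cd /scratch                                                  # 6) clean up
rm -r "$workdir"
EOF

# Check the syntax without running the job.
bash -n abyss_job.sh && echo "abyss_job.sh: syntax OK"
```

On the master, the script would then be submitted with sbatch abyss_job.sh, which prints the job ID to use with the commands below.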

Launch the following commands to obtain info on the finished job:

seff <JOB_ID>
sacct --format=JobID,elapsed,ncpus,ntasks,state,node -j <JOB_ID>

Bonus:

We are going to launch a four-step analysis:

1) Perform a multiple alignment with the nucmer tool

2) Filter these alignments with the delta-filter tool

3) Generate a tab-delimited file that is easy to parse with the show-coords tool

4) Generate a png image with mummerplot
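
To illustrate what a script chaining these four steps might look like, here is a hedged sketch. It is not the alignment.sh used in this practice, and it is written to a separate file, alignment_sketch.sh, so that nothing is overwritten. The input files reference.fasta and query.fasta, the prefix ref_qry, the job name and the choice of the short partition are all hypothetical. The module that provides MUMmer is not named in this practice, so the script only reminds you to load it.

```shell
# Write an illustrative script (hypothetical name and inputs, see above).
cat > alignment_sketch.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=alignment
#SBATCH --partition=short

# Load MUMmer here with "module load"; the module name is not given in this
# practice, so look it up with "module avail" first.

# Hypothetical inputs and output prefix.
ref=reference.fasta
qry=query.fasta
prefix=ref_qry

# 1) Align the query to the reference; writes $prefix.delta
nucmer --prefix="$prefix" "$ref" "$qry"

# 2) Keep the best hit for each query (-q) and each reference (-r) position
delta-filter -q -r "$prefix.delta" > "$prefix.filter"

# 3) Tab-delimited output (-T), sorted by reference (-r),
#    with coverage (-c) and sequence lengths (-l)
show-coords -r -c -l -T "$prefix.filter" > "$prefix.coords"

# 4) Dot plot as a PNG image; writes $prefix.png
mummerplot --png -p "$prefix" "$prefix.filter"
EOF

# Check the syntax without running anything.
bash -n alignment_sketch.sh && echo "alignment_sketch.sh: syntax OK"
```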

sbatch alignment.sh
seff <JOB_ID>
sacct --format=JobID,elapsed,ncpus,ntasks,state,node -j <JOB_ID>
sh /opt/scripts/scratch-scripts/scratch_use.sh

Then choose the number of the node that was used.



License

The resource material is licensed under the Creative Commons Attribution 4.0 International License.