FastQC performs simple quality control checks to ensure that the raw data look good and that there are no problems or biases in the data that could affect downstream use.
Loop over every file in a directory and print the command to the terminal
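The command for this step is not shown here. A minimal sketch, modelled on the write-to-file command given further down in this section, could look like this. The queue name normal.q and the output directory ~/work/FastQC come from that command; the function name print_fastqc_cmds is a hypothetical helper introduced for illustration.

```shell
# Dry run: print one FastQC submission command per FASTQ file in the
# current directory, without submitting anything.
print_fastqc_cmds() {
    for i in *fastq; do
        [ -e "$i" ] || continue   # skip the literal pattern when nothing matches
        echo qsub -b y -q normal.q -N fastqc -V fastqc --format fastq --outdir ~/work/FastQC "$i"
    done
}

print_fastqc_cmds
```

Checking the printed commands before writing them to a file makes it easy to spot a wrong file pattern or output directory.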
Loop over every file in a directory and write the command to a file (cmd_fastqc.sh)
[droc@cc2-login RNASeq_fastq_MGX]$ for i in *fastq; do echo qsub -b y -q normal.q -N fastqc -V fastqc --format fastq --outdir ~/work/FastQC $i >> cmd_fastqc.sh;done
Using cutadapt to remove adapters and to trim reads based on quality
Cutadapt is a tool specifically designed to remove adapters from NGS data.
https://code.google.com/p/cutadapt/
cutadapt command
-q 30,30 : with a single cutoff, only the 3’ end of each read is quality-trimmed. To trim the 5’ end as well, give the -q option two comma-separated cutoffs (the 5’ cutoff first, then the 3’ cutoff).
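As an illustration of the option described above, a cutadapt call could be sketched as follows. The adapter sequence, the minimum read length of 35, the output directory cutadapt_out and the function name are assumptions made for illustration, not values from this tutorial; replace them with those matching your libraries. The commands are printed rather than executed, following the dry-run pattern used for FastQC.

```shell
# Assumed values: replace with the adapter and thresholds for your libraries.
ADAPTER=AGATCGGAAGAGC   # assumption: common Illumina adapter prefix
MIN_LEN=35              # assumption: discard reads shorter than this after trimming
OUTDIR=cutadapt_out     # assumption: create it first with: mkdir -p cutadapt_out

# Dry run: print one cutadapt command per FASTQ file.
print_cutadapt_cmds() {
    for i in *.fastq; do
        [ -e "$i" ] || continue   # skip the literal pattern when nothing matches
        # -q 30,30 trims both read ends at quality 30 (5' cutoff first, then 3')
        echo cutadapt -q 30,30 -a "$ADAPTER" -m "$MIN_LEN" -o "$OUTDIR/$i" "$i"
    done
}

print_cutadapt_cmds
```

Writing the trimmed reads to a separate directory keeps them from being picked up again by the *.fastq pattern on a second run.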
Mapping reads with hisat2
https://ccb.jhu.edu/software/hisat2/index.shtml
There are several steps involved in mapping sequence reads.
Creating an index of the reference genome if necessary
Performing mapping
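The two steps above can be sketched as follows. The file names genome.fa, genome_index and sample.fastq, the single-end input (-U) and the function name are assumptions for illustration. The --dta option is included because the alignments are later passed to StringTie. The commands are printed rather than executed, so they can be reviewed or redirected to a command file.

```shell
GENOME=genome.fa      # assumption: reference genome in FASTA format
INDEX=genome_index    # assumption: basename chosen for the HISAT2 index

# Dry run: print the indexing command (only if no index is found) and the
# mapping command for one single-end FASTQ file.
print_hisat2_cmds() {
    reads=$1
    # Step 1: build the index, needed only once per reference genome
    if [ ! -e "$INDEX.1.ht2" ]; then
        echo hisat2-build "$GENOME" "$INDEX"
    fi
    # Step 2: map the reads and write the alignments in SAM format
    echo hisat2 --dta -x "$INDEX" -U "$reads" -S "${reads%.fastq}.sam"
}

print_hisat2_cmds sample.fastq
```

For paired-end libraries, the -U option would be replaced by -1 and -2 with the two mate files.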
A description of the contents of the SAM file is available here: https://samtools.github.io/hts-specs/SAMv1.pdf
Do the same for all the libraries. For this, you can use and adapt this Perl script
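The Perl script is not reproduced in this section. As an alternative, a minimal shell sketch that loops over all libraries could look like this. It is not the script referred to above; the index basename genome_index, the single-end input (-U), the .sam output naming and the function name are assumptions.

```shell
INDEX=genome_index   # assumption: basename of an existing HISAT2 index

# Dry run: print one mapping command per FASTQ file in the current directory.
print_mapping_cmds() {
    for i in *.fastq; do
        [ -e "$i" ] || continue   # skip the literal pattern when nothing matches
        echo hisat2 --dta -x "$INDEX" -U "$i" -S "${i%.fastq}.sam"
    done
}

print_mapping_cmds
```

The output can be redirected to a file (for example cmd_hisat2.sh, a hypothetical name) in the same way as the FastQC commands.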
Convert and sort SAM to BAM with samtools
http://samtools.sourceforge.net/
To process all the files in batch, you can use a for loop
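A minimal sketch of such a loop is given below, assuming samtools 1.3 or later, where samtools sort accepts SAM input directly and writes the sorted BAM file named with -o. The .sorted.bam naming and the function name are assumptions; with older samtools versions the syntax differs.

```shell
# Dry run: print one conversion-and-sort command per SAM file.
print_sam2bam_cmds() {
    for i in *.sam; do
        [ -e "$i" ] || continue   # skip the literal pattern when nothing matches
        # samtools sort reads the SAM file and writes a coordinate-sorted BAM
        echo samtools sort -o "${i%.sam}.sorted.bam" "$i"
    done
}

print_sam2bam_cmds
```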
Transcript assembly and quantification with StringTie
https://ccb.jhu.edu/software/stringtie/
For all samples
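A loop over all samples could be sketched as follows. The annotation file name reference.gtf, the .sorted.bam input naming and the function name are assumptions for illustration; -G supplies the reference annotation, -l sets the prefix for assembled transcript names and -o names the output GTF file.

```shell
ANNOTATION=reference.gtf   # assumption: reference annotation in GTF format

# Dry run: print one StringTie command per sorted BAM file.
print_stringtie_cmds() {
    for i in *.sorted.bam; do
        [ -e "$i" ] || continue   # skip the literal pattern when nothing matches
        sample=${i%.sorted.bam}
        echo stringtie "$i" -G "$ANNOTATION" -l "$sample" -o "$sample.gtf"
    done
}

print_stringtie_cmds
```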
Merge transcripts from all samples
Here, mergelist.txt is a text file listing the gene transfer format (GTF) files created in the previous step, with one file name per line.
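One way to build such a list and run the merge is sketched below. The names reference.gtf and merged.gtf and the function name are assumptions; the sketch also assumes that the per-sample GTF files are the only other .gtf files in the current directory.

```shell
ANNOTATION=reference.gtf   # assumption: reference annotation in GTF format

# Print the per-sample GTF file names, one per line, leaving out the
# reference annotation and any previous merge result.
list_sample_gtfs() {
    for g in *.gtf; do
        [ -e "$g" ] || continue   # skip the literal pattern when nothing matches
        case $g in "$ANNOTATION"|merged.gtf) continue ;; esac
        printf '%s\n' "$g"
    done
}

# Review the list first, then save it with: list_sample_gtfs > mergelist.txt
list_sample_gtfs

# Dry run: print the merge command that reads mergelist.txt.
echo stringtie --merge -G "$ANNOTATION" -o merged.gtf mergelist.txt
```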