Of demultiplexing and fastqc

Interpretation of the demultiplexing and fastqc outputs

Once you have demultiplexed your raw data and run FastQC on the resulting FastQ files (see the next sections), it is time to interpret the results and decide whether the data quality is high enough to pursue the bioinformatics analysis.

This is not always as easy as it seems. To help you in this task, we provide:

Demultiplexing test cases on the https://github.com/sequana/sequana_demultiplex/wiki wiki page
FastQC test cases on the https://github.com/sequana/sequana_fastqc/wiki wiki page
More generally, a FastQC and demultiplexing manual provided by the Biomics platform: please see the current version download

Data example

We provide a very simple raw data set from our iSeq 100. The 4 samples have been anonymised. The data are stored as a compressed tar archive of 700 MB: just uncompress it and you are ready to try the sequana_demultiplex pipeline.
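As a minimal sketch, assuming the bundle is a gzip-compressed tar file, the archive can be checked and then extracted as follows. The helper name unpack_bundle and the file name in the example are hypothetical; use the name of the file you actually downloaded.

```shell
# Hypothetical helper: check, then extract, a gzip-compressed tar archive.
# Usage: unpack_bundle ARCHIVE [DESTINATION]
unpack_bundle() {
    archive="$1"
    dest="${2:-.}"                 # default: extract into the current directory
    mkdir -p "$dest" || return 1
    # Listing the contents first verifies that the archive is readable
    tar -tzf "$archive" > /dev/null || return 1
    # -x extract, -z gunzip, -f archive file, -C target directory
    tar -xzf "$archive" -C "$dest"
}

# Example (hypothetical file name):
# unpack_bundle example_run.tar.gz .
```

If the archive uses another compression scheme, the -z option would need to be adapted.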

Perform the demultiplexing using sequana (bcl2fastq)

You can install the sequana_demultiplex pipeline following the instructions from the pipeline page: https://github.com/sequana/sequana_demultiplex summarized as follows:

pip install sequana
pip install sequana_demultiplex

Then, prepare the analysis by providing the sample sheet and the location of the BCL raw data from the sequencer:

sequana_demultiplex --bcl-directory bcl --sample-sheet SampleSheet.csv --merging-strategy merge

Note that for a NextSeq sequencer you may want to merge the 4 lanes using the --merging-strategy argument.
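Before launching the pipeline, it can help to check which samples the sample sheet declares. The sketch below assumes the usual Illumina layout, in which a [Data] section starts with a header row containing a Sample_ID column. list_samples is a hypothetical helper, not part of sequana.

```shell
# Hypothetical helper: print the Sample_ID values found in the [Data]
# section of an Illumina-style sample sheet (CSV).
# Usage: list_samples SampleSheet.csv
list_samples() {
    awk -F',' '
        { sub(/\r$/, "") }                    # tolerate Windows line endings
        /^\[Data\]/ { in_data = 1; header = 1; next }
        /^\[/       { in_data = 0 }           # any other section ends [Data]
        in_data && header {                   # locate the Sample_ID column
            for (i = 1; i <= NF; i++) if ($i == "Sample_ID") col = i
            header = 0; next
        }
        in_data && col && $col != "" { print $col }
    ' "$1"
}

# Example:
# list_samples SampleSheet.csv
```

For the example data set, the list should contain the 4 anonymised samples.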

Finally you can execute the pipeline:

cd demultiplex
sh demultiplex.sh

Note that on a SLURM cluster, you will type (for example):

srun -c 1 sh demultiplex.sh

See the https://github.com/sequana/sequana_demultiplex page and its wiki https://github.com/sequana/sequana_demultiplex/wiki for more information.

An example of the resulting HTML page is available here
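A quick sanity check on the demultiplexed files is the number of reads per file. A FastQ record spans four lines, so the read count is the line count divided by four. The helper below is a hypothetical sketch (count_reads is not a sequana command) and assumes gzip-compressed files with four-line records.

```shell
# Hypothetical helper: print "file<TAB>number_of_reads" for each
# gzip-compressed FastQ file given as argument.
count_reads() {
    for fq in "$@"; do
        # Decompress to stdout and count lines; 4 lines = 1 read
        lines=$(gzip -dc "$fq" | wc -l)
        printf '%s\t%d\n' "$fq" "$((lines / 4))"
    done
}

# Example:
# count_reads *.fastq.gz
```

A sample with far fewer reads than the others, or a very large undetermined file, is usually worth investigating before going further.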

Perform the QC of your FastQ files using the sequana_fastqc pipeline

You can install the sequana_fastqc pipeline following the instructions from the pipeline page: https://github.com/sequana/sequana_fastqc summarized as:

pip install sequana
pip install sequana_fastqc

Then, you can perform the fastqc on your set of samples using these commands:

cd fastq_directory
sequana_fastqc
cd fastqc
sh fastqc.sh

Note that on a SLURM cluster, you will type (for example):

srun -c 1 sh fastqc.sh

See the https://github.com/sequana/sequana_fastqc page and its wiki https://github.com/sequana/sequana_fastqc/wiki for more information.

An example of the resulting HTML page is available here
