Interpretation of the demultiplexing and FastQC outputs
Once you have demultiplexed your raw data and run FastQC on the resulting FastQ files (see the next sections), it is time to interpret the results and decide whether the data quality is high enough to pursue the bioinformatics analysis.
This is not always as easy as it seems. To help you in this task, we provide:
- Demultiplexing test cases in the https://github.com/sequana/sequana_demultiplex/wiki wiki page
- FastQC test cases in the https://github.com/sequana/sequana_fastqc/wiki wiki page
- More generally, a FastQC and demultiplexing manual provided by the Biomics platform: please see the current version download
Data example
We provide a very simple raw data set from our iSeq 100. The 4 samples have been anonymised. The data are stored as a zipped tar bundle. Just uncompress it and you are ready to try the sequana_demultiplex pipeline. It is a 700Mb archive.
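The unpacking step can be sketched as follows. This is only an illustration: the archive name `iseq_data.tar.gz` and the `unpack_archive` helper are hypothetical, and the sketch assumes the bundle is a gzip-compressed tar file. Replace the name with that of the file you actually downloaded.

```shell
# Hypothetical archive name (assumption); replace with the downloaded file.
ARCHIVE="${ARCHIVE:-iseq_data.tar.gz}"

# unpack_archive ARCHIVE [TARGET_DIR]
# Extract a gzip-compressed tar archive into TARGET_DIR (default: current dir).
unpack_archive() {
    archive="$1"
    target="${2:-.}"
    mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}

# Only attempt the extraction when the archive is actually present.
if [ -f "$ARCHIVE" ]; then
    unpack_archive "$ARCHIVE" .
fi
```

Listing the content first with `tar -tzf` is a cheap way to see where the BCL directory and the sample sheet end up before extracting.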
Perform the demultiplexing using sequana (bcl2fastq)
You can install the sequana_demultiplex pipeline following the instructions from the pipeline page: https://github.com/sequana/sequana_demultiplex summarized as follows:
pip install sequana
pip install sequana_demultiplex
Then, prepare the analysis by providing the sample sheet and the location of the BCL raw data from the sequencer:
sequana_demultiplex --bcl-directory bcl --sample-sheet SampleSheet.csv --merging-strategy merge
Note that for a NextSeq sequencer you may want to merge the 4 lanes using the --merging-strategy argument.
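Before preparing the analysis, it can help to check that the two inputs are in place. The sketch below is not part of sequana: `check_inputs` is a hypothetical helper, and it assumes a standard Illumina run folder (with a `RunInfo.xml` file at its top level) and a sample sheet containing a `[Data]` section, as expected by bcl2fastq.

```shell
# check_inputs BCL_DIR SAMPLE_SHEET   (hypothetical helper, not a sequana command)
# Returns 0 when both inputs look usable, 1 otherwise; messages go to stderr.
check_inputs() {
    bcl_dir="$1"
    sample_sheet="$2"
    status=0

    # The run folder must exist and contain RunInfo.xml (assumed Illumina layout).
    if [ ! -d "$bcl_dir" ]; then
        echo "missing BCL directory: $bcl_dir" >&2
        status=1
    elif [ ! -f "$bcl_dir/RunInfo.xml" ]; then
        echo "no RunInfo.xml in $bcl_dir" >&2
        status=1
    fi

    # The sample sheet must exist and contain a [Data] section.
    if [ ! -f "$sample_sheet" ]; then
        echo "missing sample sheet: $sample_sheet" >&2
        status=1
    elif ! grep -q '^\[Data\]' "$sample_sheet"; then
        echo "no [Data] section in $sample_sheet" >&2
        status=1
    fi

    return "$status"
}

# Example use, with the names from the command above:
#   check_inputs bcl SampleSheet.csv && \
#       sequana_demultiplex --bcl-directory bcl --sample-sheet SampleSheet.csv
```

The check is deliberately shallow: it catches a wrong path or a truncated sample sheet, but it does not validate the indexes or the sample names.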
Finally you can execute the pipeline:
cd demultiplex
sh demultiplex.sh
Note that on a SLURM cluster, you will type (for example):
srun -c 1 sh demultiplex.sh
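If the same instructions are used both on a workstation and on a cluster, the choice between the two launch commands can be wrapped in a small function. This is a sketch only: `run_pipeline_script` is a hypothetical name, and it assumes that the presence of `srun` on the `PATH` means the script should go through SLURM.

```shell
# run_pipeline_script SCRIPT   (hypothetical helper)
# Run the generated pipeline script through srun when SLURM is available,
# and directly with sh otherwise.
run_pipeline_script() {
    script="$1"
    if command -v srun >/dev/null 2>&1; then
        # Same invocation as in the SLURM example above.
        srun -c 1 sh "$script"
    else
        sh "$script"
    fi
}

# Example use, from inside the working directory:
#   run_pipeline_script demultiplex.sh
```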
See the https://github.com/sequana/sequana_demultiplex page and its wiki https://github.com/sequana/sequana_demultiplex/wiki for more information.
An example of the resulting HTML page is available here
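Alongside the HTML report, a quick sanity check on the demultiplexed data is to count the reads in each FastQ file, since a sample with very few reads usually points to a problem in the sample sheet. The sketch below is not part of sequana: the `count_reads` and `report_reads` helpers are hypothetical, and it assumes gzip-compressed files named `*.fastq.gz` with the usual four lines per read.

```shell
# count_reads FILE   (hypothetical helper)
# Print the number of reads in a gzip-compressed FastQ file.
# Assumes the standard 4-lines-per-record layout (no wrapped sequences).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# report_reads [DIR]   (hypothetical helper)
# Print "file<TAB>reads" for every *.fastq.gz file in DIR (default: current dir).
report_reads() {
    dir="${1:-.}"
    for fq in "$dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue   # skip when the pattern matches nothing
        printf '%s\t%s\n' "$(basename "$fq")" "$(count_reads "$fq")"
    done
}

# Example use:
#   report_reads path/to/fastq_files
```

For paired-end data, the R1 and R2 files of a sample are expected to report the same number of reads.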
Perform the QC of your FastQ files using the sequana_fastqc pipeline
You can install the sequana_fastqc pipeline following the instructions from the pipeline page: https://github.com/sequana/sequana_fastqc summarized as:
pip install sequana
pip install sequana_fastqc
Then, you can perform the fastqc on your set of samples using these commands:
cd fastq_directory
sequana_fastqc
cd fastqc
sh fastqc.sh
Note that on a SLURM cluster, you will type (for example):
srun -c 1 sh fastqc.sh
See the https://github.com/sequana/sequana_fastqc page and its wiki https://github.com/sequana/sequana_fastqc/wiki for more information.
An example of the resulting HTML page is available here