HiC scaffolding pipeline
Snakemake pipeline for scaffolding a genome assembly with HiC reads using yahs.
Prerequisites
This pipeline has been tested with Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:
snakemake --use-conda --cores N
where N is the number of cores to use. A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring the cluster.json file, run:
./run_cluster
Before starting
You need to create a temporary folder and specify its path in the config.yaml file. This folder must be able to hold the temporary files created when sorting the .pairsam file (100s of GB or even many TBs).
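Because the sort step can need a very large amount of scratch space, it may help to check the free space on the temporary folder before launching. The following is a minimal sketch and not part of the pipeline; the function names, the example path and the 500 GB threshold are all assumptions chosen for illustration.

```python
import shutil


def free_space_gb(path):
    """Return the free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path, required_gb):
    """Return True if at least `required_gb` GB are free at `path`."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    tmp_dir = "."       # hypothetical: replace with the temporary folder set in config.yaml
    required_gb = 500   # hypothetical threshold; the real need depends on the size of the HiC data
    status = "OK" if has_enough_space(tmp_dir, required_gb) else "too small"
    print(f"{tmp_dir}: {free_space_gb(tmp_dir):.0f} GB free ({status})")
```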
The path to the genome assembly must be given in the config.yaml file.
The HiC reads should be paired and named as follows: Library_1.fastq.gz Library_2.fastq.gz. The pipeline can accept any number of paired HiC read files, but the naming must be consistent. The folder containing these files must be provided in the config.yaml file.
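The naming convention above can be checked before a run. The sketch below is an illustration only and is not how the pipeline itself discovers its inputs; the helper name find_hic_pairs is hypothetical. It assumes every library has exactly one _1.fastq.gz file and one _2.fastq.gz file in the same folder.

```python
from pathlib import Path

SUFFIX_1 = "_1.fastq.gz"
SUFFIX_2 = "_2.fastq.gz"


def find_hic_pairs(read_dir):
    """Map each library name to its (forward, reverse) FASTQ paths.

    Raises ValueError if a file of one mate has no matching partner.
    """
    read_dir = Path(read_dir)
    pairs = {}
    # Every *_1.fastq.gz file must have a *_2.fastq.gz partner.
    for fwd in sorted(read_dir.glob("*" + SUFFIX_1)):
        library = fwd.name[: -len(SUFFIX_1)]
        rev = read_dir / (library + SUFFIX_2)
        if not rev.exists():
            raise ValueError(f"{fwd.name} has no mate {rev.name}")
        pairs[library] = (fwd, rev)
    # Every *_2.fastq.gz file must belong to a library found above.
    for rev in sorted(read_dir.glob("*" + SUFFIX_2)):
        library = rev.name[: -len(SUFFIX_2)]
        if library not in pairs:
            raise ValueError(f"{rev.name} has no mate {library + SUFFIX_1}")
    return pairs
```

Calling find_hic_pairs on the HiC read folder returns one entry per library, so an unpaired or misnamed file is reported before any compute time is spent.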
Version History
Version 2 (latest) Created 21st Jun 2024 at 10:42 by Tom Brown
Add cluster json for execution on slurm
Frozen
Version-2
efc9e4b
Version 1 (earliest) Created 16th Mar 2024 at 09:01 by Tom Brown
Initial commit
Frozen
Version-1
cd486a3
Created: 16th Mar 2024 at 09:01