Workflow Type: Snakemake
Stable

HiC scaffolding pipeline

Snakemake pipeline for scaffolding a genome assembly with HiC reads using yahs.

Prerequisites

This pipeline has been tested with Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda --cores N

where N is the number of cores to use. A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring the cluster.json file, run:

./run_cluster

Before starting

You need to create a temporary folder and specify its path in the config.yaml file. The folder must be able to hold the temporary files created when sorting the .pairsam file (hundreds of GB, or even many TB).
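
Because the sort step can fill the disk, it may help to check the temporary folder before launching a run. The following is a minimal sketch, not part of the pipeline: the helper name check_tmp_dir and the min_free_gb threshold are hypothetical, and the right threshold depends on the size of your HiC data.

```python
import shutil
from pathlib import Path


def check_tmp_dir(tmp_dir, min_free_gb):
    """Return the free space (in GB) of tmp_dir.

    Raises FileNotFoundError if the folder does not exist and
    RuntimeError if less than min_free_gb is available.
    Hypothetical helper; the pipeline itself does not ship this check.
    """
    path = Path(tmp_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"Temporary folder does not exist: {path}")

    # disk_usage reports bytes for the filesystem that holds the folder
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        raise RuntimeError(
            f"Only {free_gb:.0f} GB free in {path}, need at least {min_free_gb} GB"
        )
    return free_gb
```

For example, check_tmp_dir("/path/to/tmp", min_free_gb=500) would stop early if less than 500 GB is free; both the path and the 500 GB figure are placeholders.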

The path to the genome assembly must be given in the config.yaml file.

The HiC reads must be paired and named as follows: Library_1.fastq.gz and Library_2.fastq.gz, where Library is the library name. The pipeline accepts any number of paired HiC read files, but the naming must be consistent. The folder containing these files must be given in the config.yaml file.
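
The naming rule can be verified before a run with a short script. This is a sketch under the assumption that every read file matches <Library>_1.fastq.gz or <Library>_2.fastq.gz exactly; the function name find_hic_pairs is hypothetical and is not taken from the pipeline's own code.

```python
import re
from pathlib import Path

# Assumed convention from the text: <Library>_1.fastq.gz and <Library>_2.fastq.gz
PAIR_RE = re.compile(r"^(?P<lib>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_hic_pairs(read_dir):
    """Map each library name to its (mate 1, mate 2) paths.

    Files that do not match the naming convention are ignored.
    Raises ValueError if any library is missing one of its mates.
    """
    mates = {}
    for path in sorted(Path(read_dir).iterdir()):
        match = PAIR_RE.match(path.name)
        if match:
            mates.setdefault(match["lib"], {})[match["mate"]] = path

    # A library is complete only if both _1 and _2 files were found
    incomplete = sorted(lib for lib, found in mates.items() if set(found) != {"1", "2"})
    if incomplete:
        raise ValueError(f"Unpaired HiC libraries: {', '.join(incomplete)}")

    return {lib: (found["1"], found["2"]) for lib, found in mates.items()}
```

Calling find_hic_pairs on the read folder returns one entry per library, so a missing or misnamed mate file is reported before Snakemake starts.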

Version History

Version 2 (latest) Created 21st Jun 2024 at 10:42 by Tom Brown

Add cluster json for execution on slurm


Frozen Version-2 efc9e4b

Version 1 (earliest) Created 16th Mar 2024 at 09:01 by Tom Brown

Initial commit


Frozen Version-1 cd486a3
Citation
Brown, T. (2024). HiC scaffolding pipeline. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.796.2