HiC scaffolding pipeline
Version 1

Workflow Type: Snakemake

Snakemake pipeline for scaffolding a genome assembly with HiC reads using yahs.


This pipeline has been tested using Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda --cores N

where N is the number of cores to use. A set of configuration and run scripts is provided for execution on a Slurm queueing system. Configure the cluster.json file before running them.


Before starting

You need to create a temporary folder and specify its path in the config.yaml file. The folder must be able to hold the temporary files created when sorting the .pairsam file, which can total hundreds of GB or even many TB.
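Because the temporary files can be very large, it may help to check the free space on the target filesystem before starting. The following is a minimal sketch, not part of the pipeline: the function name free_space_gb and the example path are hypothetical, and the amount of space you need depends on your data.

```python
import shutil
from pathlib import Path


def free_space_gb(tmp_dir):
    """Create tmp_dir if it does not exist and return the free space
    on its filesystem in gigabytes (1 GB = 1e9 bytes)."""
    path = Path(tmp_dir)
    # Hypothetical helper: creates the folder so its path can then be
    # entered in config.yaml by hand.
    path.mkdir(parents=True, exist_ok=True)
    return shutil.disk_usage(path).free / 1e9
```

For example, `free_space_gb("/scratch/hic_tmp")` (an illustrative path) returns the free space in GB, which you can compare against the hundreds of GB or more that sorting may require.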

The path to the genome assembly must be given in the config.yaml.

The HiC reads should be paired and named as follows: Library_1.fastq.gz Library_2.fastq.gz. The pipeline can accept any number of paired HiC read files, but the naming must be consistent. The folder containing these files must be provided in the config.yaml.
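The naming convention above can be checked before launching a run. The sketch below is an illustration only: the regular expression is an assumption about what "consistent naming" means (any library name followed by _1 or _2 and .fastq.gz), and the pipeline's own matching rule may differ. The function name find_read_pairs is hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern: <Library>_1.fastq.gz / <Library>_2.fastq.gz
PAIR_RE = re.compile(r"^(?P<library>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_read_pairs(filenames):
    """Group file names by library; return (pairs, problems).

    pairs    -- list of (mate1, mate2) file names, sorted by library
    problems -- names that do not match the pattern, followed by
                any mates that are missing
    """
    mates = defaultdict(set)
    problems = []
    for name in filenames:
        match = PAIR_RE.match(name)
        if match is None:
            problems.append(f"unexpected name: {name}")
        else:
            mates[match["library"]].add(match["mate"])

    pairs = []
    for library in sorted(mates):
        if mates[library] == {"1", "2"}:
            pairs.append((f"{library}_1.fastq.gz", f"{library}_2.fastq.gz"))
        else:
            missing = ({"1", "2"} - mates[library]).pop()
            problems.append(f"missing mate: {library}_{missing}.fastq.gz")
    return pairs, problems
```

Passing the output of `os.listdir()` for the reads folder gives the complete pairs and a list of files that would need renaming or are missing a mate.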

Version History

Version 1 (earliest) Created 16th Mar 2024 at 09:01 by Tom Brown

Initial commit

Frozen Version-1 cd486a3
Creators and Submitter
Brown, T. (2024). HiC scaffolding pipeline. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.796.1

Created: 16th Mar 2024 at 09:01
