Research Object Crate for HiC scaffolding pipeline

Original URL: https://workflowhub.eu/workflows/796/ro_crate?version=1

# HiC scaffolding pipeline

Snakemake pipeline for scaffolding a genome with HiC reads using yahs.

## Prerequisites

This pipeline has been tested with `Snakemake v7.32.4` and requires conda to install the required tools.

To run the pipeline, use the command:

`snakemake --use-conda --cores N`

where N is the number of cores to use.

A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring the `cluster.json` file, run:

`./run_cluster`

## Before starting

You need to create a temporary folder and specify its path in the `config.yaml` file. This folder must be able to hold the temporary files created when sorting the `.pairsam` file (hundreds of GB, or even many TB).

The path to the genome assembly must be given in the `config.yaml`.

The HiC reads should be paired and named as follows: `Library_1.fastq.gz Library_2.fastq.gz`. The pipeline accepts any number of paired HiC read files, but the naming must be consistent. The folder containing these files must be provided in the `config.yaml`.
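
Because the pipeline depends on consistent read-file naming, it can help to check the reads folder before launching a long run. The following is a minimal sketch, not part of the pipeline itself: the function name `find_read_pairs` and the regular expression are assumptions for illustration, based only on the `Library_1.fastq.gz` / `Library_2.fastq.gz` convention described above.

```python
import re
from pathlib import Path

# Assumed pattern: <library>_1.fastq.gz / <library>_2.fastq.gz
PAIR_RE = re.compile(r"^(?P<library>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_read_pairs(filenames):
    """Group read files by library name.

    Returns (complete, incomplete): sorted lists of library names that
    have both mates, and library names that are missing one mate.
    Files that do not match the naming pattern are ignored.
    """
    mates = {}
    for name in filenames:
        match = PAIR_RE.match(Path(name).name)
        if match is None:
            continue
        mates.setdefault(match["library"], set()).add(match["mate"])

    complete = sorted(lib for lib, found in mates.items() if found == {"1", "2"})
    incomplete = sorted(lib for lib, found in mates.items() if found != {"1", "2"})
    return complete, incomplete


if __name__ == "__main__":
    # Hypothetical folder name; use the reads folder set in config.yaml.
    reads_dir = Path("hic_reads")
    if reads_dir.is_dir():
        ok, missing = find_read_pairs(p.name for p in reads_dir.iterdir())
        print("complete pairs:", ok)
        print("missing a mate:", missing)
```

Any library reported as missing a mate would need to be renamed or completed before running the pipeline.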

Author
Tom Brown
License
CC-BY-4.0

Contents

Main Workflow: HiC scaffolding pipeline
Size: 4471 bytes