A workflow for marine Genomic Observatories data analysis
Version 1

Workflow Type: Common Workflow Language


An EOSC-Life project


The workflows developed in the framework of this project are based on pipeline-v5 of the MGnify resource.

This branch is a child of the pipeline_5.1 branch that contains all CWL descriptions of the MGnify pipeline version 5.1.

The following text comes from the original repository and describes how to obtain the required databases.


pipeline-v5

This repository contains all CWL descriptions of the MGnify pipeline version 5.0.

Documentation

Thorough documentation is available on Read the Docs.


We recommend using the MGnify resource for data processing.

If you want to run the pipeline locally, we recommend using our pre-built Docker containers.

Requirements to run the pipeline

  • python3 [v 3.6+]

  • docker [v 19.+] or singularity

  • cwltool [v 3.+] or toil [v 4.2+]

  • about 133 GB of disk space for the databases
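The requirements above can be checked before installing anything. The following is a minimal sketch, not part of the workflow itself: the helper names (version_ge, have_cmd, free_gb, check_requirements) are hypothetical, version_ge assumes GNU `sort -V` is available, and toil-cwl-runner is assumed to be the Toil entry point on your PATH. Only the Python version is compared against its minimum; for the other tools the sketch only checks that they are present.

```shell
#!/usr/bin/env bash
# Sketch: check the prerequisites listed above. Helper names are hypothetical.

# Succeeds if version $1 >= version $2 (assumes GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Free space, in whole GB, on the filesystem holding directory $1.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Report every unmet requirement; returns non-zero if any is missing.
# $1: directory that will hold the databases (defaults to the current one).
check_requirements() {
    local db_dir="${1:-.}"
    local ok=0
    local py=""
    if have_cmd python3; then
        py="$(python3 -c 'import platform; print(platform.python_version())')"
        version_ge "$py" "3.6" || { echo "python3 $py is older than 3.6"; ok=1; }
    else
        echo "python3 not found"; ok=1
    fi
    have_cmd docker || have_cmd singularity \
        || { echo "neither docker nor singularity found"; ok=1; }
    have_cmd cwltool || have_cmd toil-cwl-runner \
        || { echo "neither cwltool nor toil-cwl-runner found"; ok=1; }
    [ "$(free_gb "$db_dir")" -ge 133 ] \
        || { echo "less than 133 GB free under $db_dir"; ok=1; }
    return "$ok"
}
```

After sourcing the file, `check_requirements .` prints one line per unmet requirement and returns a non-zero status if anything is missing.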

Docker

All the tools are containerized.

Unfortunately, the antiSMASH and InterProScan containers are very large. We provide two options:

  1. Pre-install these tools. The instructions on how to set up the environment are here.

  2. Use containers. First, uncomment the hints in InterProScan-v5.cwl and antismash_v4.cwl. Then pre-pull the containers from https://hub.docker.com/u/microbiomeinformatics

docker pull microbiomeinformatics/pipeline-v5.interproscan:v5.36-75.0
docker pull microbiomeinformatics/pipeline-v5.antismash:v4.2.0
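On systems where Singularity is used instead of Docker, the same images can usually be fetched with `singularity pull docker://...`. The sketch below only prints the commands so they can be reviewed first; the helper name singularity_pull_cmd is hypothetical, and whether the workflow picks up the resulting image files without further configuration is an assumption not covered by this document.

```shell
# Sketch: print the Singularity equivalent of the docker pull commands above.

# Print the pull command for one Docker Hub image reference ($1).
singularity_pull_cmd() {
    printf 'singularity pull docker://%s\n' "$1"
}

for image in \
    microbiomeinformatics/pipeline-v5.interproscan:v5.36-75.0 \
    microbiomeinformatics/pipeline-v5.antismash:v4.2.0
do
    singularity_pull_cmd "$image"
done
```

Piping the output to `sh` would run the pulls once the printed commands look right.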

Installation

Create conda environment

Get the EOSC-Life marine GOs workflow

git clone https://github.com/EBI-Metagenomics/pipeline-v5.git 
cd pipeline-v5

Download the necessary databases

You can download the databases for the EOSC-Life GOs workflow by running the download_dbs.sh script. If one or more of them are already on your system, create a symbolic link pointing at the ref-dbs folder instead of downloading them again.
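The symbolic-link step can be sketched as follows. This assumes the intent is for each database that already exists elsewhere to appear as an entry inside ref-dbs; the helper name link_existing_db and the example path are hypothetical, and the directory names that the workflow actually expects inside ref-dbs are not listed here.

```shell
# Sketch: link an existing database directory into the ref-dbs folder.
# $1: path to the existing database (hypothetical), $2: the ref-dbs folder.
link_existing_db() {
    src="$1"
    dest_dir="$2"
    [ -e "$src" ] || { echo "no such database: $src" >&2; return 1; }
    mkdir -p "$dest_dir"
    # Resolve to an absolute path so the link stays valid from any directory.
    abs_src="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
    ln -s "$abs_src" "$dest_dir/$(basename "$src")"
}
```

For example, `link_existing_db /data/shared/my_db ref-dbs` would create ref-dbs/my_db as a link to the existing copy, where /data/shared/my_db stands in for wherever the database already lives.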

How to run

  • activate the conda env

  • edit the gos_wf.yml file to set the parameter values of your choice

  • if you are working on an HPC system with Singularity, enable Singularity

  • run

./run_wf.sh -n false -n osd-short -d short-test-case -f test_input/wgs-paired-SRR1620013_1.fastq.gz -r test_input/wgs-paired-SRR1620013_2.fastq.gz

If you are using Docker, it is strongly recommended not to install it through snap.

RuntimeError: slurm currently does not support shared caching, because it does not support cleaning up a worker after the last job finishes. Set the --disableCaching flag if you want to use this batch system.
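The error message above names the --disableCaching flag as the remedy. If Toil is invoked directly on a Slurm cluster, the command line might look like the sketch below. The helper name toil_slurm_cmd is hypothetical, the workflow and input file names are placeholders, and how run_wf.sh passes options on to Toil is not described here, so treat this as an illustration of the flag rather than the project's own invocation.

```shell
# Sketch: print a toil-cwl-runner command for Slurm with caching disabled.
# $1: CWL workflow file, $2: YAML inputs file (both supplied by the caller).
toil_slurm_cmd() {
    printf 'toil-cwl-runner --batchSystem slurm --disableCaching %s %s\n' "$1" "$2"
}

# Placeholder file names, for illustration only.
toil_slurm_cmd workflow.cwl inputs.yml
```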

Version History

eosc-life-gos @ 28122db (earliest) Created 19th Sep 2022 at 19:00 by Haris Zafeiropoulos

running version with workaround in conditionals


Created: 19th Sep 2022 at 19:00

Last updated: 20th Sep 2022 at 11:44

Total size: 110 MB