A workflow for marine Genomic Observatories data analysis
Version 1

Workflow Type: Common Workflow Language

An EOSC-Life project

The workflows developed in the framework of this project are based on pipeline-v5 of the MGnify resource.

This branch is a child of the pipeline_5.1 branch that contains all CWL descriptions of the MGnify pipeline version 5.1.

The following comes from the initial repo and describes how to get the databases required.


This repository contains all CWL descriptions of the MGnify pipeline version 5.0.


For thorough documentation, see the project's Read the Docs pages.

We recommend using the MGnify resource for data processing.

If you want to run the pipeline locally, we recommend using our pre-built Docker containers.

Requirements to run the pipeline

  • python3 [v 3.6+]

  • docker [v 19.+] or singularity

  • cwltool [v 3.+] or toil [v 4.2+]

  • disk space for databases: ~133 GB
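
The requirements above can be checked before launching anything. The following is a minimal sketch, not part of the repository: the helper names `check_tools` and `have_any` are hypothetical, and the script only tests whether a command is on PATH, not whether its version meets the minimum listed above. It assumes toil's CWL runner is installed under its usual executable name, `toil-cwl-runner`.

```shell
# check_tools NAME...
# Print one "missing: NAME" line for every command that is not on PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
check_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# have_any NAME...
# Succeed if at least one of the given commands is on PATH.
# Useful for "either/or" requirements such as docker or singularity.
have_any() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      return 0
    fi
  done
  return 1
}

# Example usage (uncomment to run on your own machine):
# check_tools python3
# have_any docker singularity      || echo "need docker or singularity"
# have_any cwltool toil-cwl-runner || echo "need cwltool or toil"
```

Version checks are left out on purpose, since each tool reports its version in a different format.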


All the tools are containerized.

Unfortunately, the antiSMASH and InterProScan containers are very large. We provide two options:

  1. Pre-install these tools. The instructions on how to set up the environment are here.

  2. Use containers. First, uncomment the hints in InterProScan-v5.cwl and antismash_v4.cwl. Then pre-pull the containers from https://hub.docker.com/u/microbiomeinformatics

docker pull microbiomeinformatics/pipeline-v5.interproscan:v5.36-75.0
docker pull microbiomeinformatics/pipeline-v5.antismash:v4.2.0


Create conda environment

Get the EOSC-Life marine GOs workflow

git clone https://github.com/EBI-Metagenomics/pipeline-v5.git 
cd pipeline-v5

Download the necessary databases

You can download the databases for the EOSC-Life GOs workflow by running the download_dbs.sh script. If one or more of them are already on your system, create a symbolic link to each of them in the ref-dbs folder instead of downloading it again.
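
Linking an existing database can be sketched as follows. This is an illustration only: the helper name `link_db` and the example path are hypothetical, and it assumes that the workflow looks for each database under its own name inside ref-dbs. Check download_dbs.sh for the folder names it actually expects.

```shell
# link_db SRC [DEST_DIR]
# Place a symbolic link to an already-downloaded database SRC inside
# DEST_DIR (default: ref-dbs), creating DEST_DIR if it does not exist.
# SRC should be an absolute path so the link stays valid from any directory.
link_db() {
  src=$1
  dest_dir=${2:-ref-dbs}
  if [ ! -e "$src" ]; then
    echo "no such database: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir" || return 1
  # -s: symbolic link, -f: replace a stale link of the same name,
  # -n: treat an existing link to a directory as a file, not a target directory
  ln -sfn "$src" "$dest_dir/$(basename "$src")"
}

# Example usage with a hypothetical location (uncomment to run):
# link_db /data/shared/my-db ref-dbs
```

A link keeps a single copy of each database on disk, which matters given the disk space the full set requires.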

How to run

  • activate the conda env

  • edit the gos_wf.yml file to set the parameter values of your choice

  • if you are working on an HPC system with Singularity, enable Singularity

  • run

./run_wf.sh -n false -n osd-short -d short-test-case -f test_input/wgs-paired-SRR1620013_1.fastq.gz -r test_input/wgs-paired-SRR1620013_2.fastq.gz

If you are using Docker, it is strongly recommended that you do not install it through snap.

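If you run the workflow with toil on a Slurm cluster, you may see the following error. As the message says, set the --disableCaching flag to work around it:

RuntimeError: slurm currently does not support shared caching, because it does not support cleaning up a worker after the last job finishes. Set the --disableCaching flag if you want to use this batch system.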

Version History

eosc-life-gos @ 28122db (earliest) Created 19th Sep 2022 at 19:00 by Haris Zafeiropoulos

running version with workaround in conditionals

Frozen eosc-life-gos 28122db

Created: 19th Sep 2022 at 19:00

Last updated: 20th Sep 2022 at 11:44

Last used: 6th Oct 2022 at 19:51

Total size: 110 MB