mvgwas-nf
Version 1

Workflow Type: Nextflow
Stable

mvgwas-nf

nextflow CI-checks

A pipeline for multi-trait genome-wide association studies (GWAS) using MANTA.

The pipeline performs the following analysis steps:

  • Split genotype file
  • Preprocess phenotype and covariate data
  • Test for association between phenotypes and genetic variants
  • Collect summary statistics

The pipeline uses Nextflow as the execution backend. Please check Nextflow documentation for more information.

Requirements

  • Unix-like operating system (Linux, MacOS, etc.)
  • Java 8 or later
  • Docker (v1.10.0 or later) or Singularity (v2.5.0 or later)

Quickstart (~2 min)

  1. Install Nextflow:

    curl -fsSL get.nextflow.io | bash
    
  2. Make a test run:

    nextflow run dgarrimar/mvgwas-nf -with-docker
    

Notes: move the nextflow executable to a directory in your $PATH. Set -with-singularity to use Singularity instead of Docker.

(*) Alternatively you can clone this repository:

git clone https://github.com/dgarrimar/mvgwas-nf
cd mvgwas-nf
nextflow run mvgwas.nf -with-docker

Pipeline usage

Launching the pipeline with the --help parameter shows the help message:

nextflow run mvgwas.nf --help
N E X T F L O W  ~  version 20.04.1
Launching `mvgwas.nf` [amazing_roentgen] - revision: 56125073b7

mvgwas-nf: A pipeline for multivariate Genome-Wide Association Studies
==============================================================================================
Performs multi-trait GWAS using using MANTA (https://github.com/dgarrimar/manta)

Usage:
nextflow run mvgwas.nf [options]

Parameters:
--pheno PHENOTYPES          phenotype file
--geno GENOTYPES            indexed genotype VCF file
--cov COVARIATES            covariate file
--l VARIANTS/CHUNK          variants tested per chunk (default: 10000)
--t TRANSFOMATION           phenotype transformation: none, sqrt, log (default: none)
--i INTERACTION             test for interaction with a covariate: none,  (default: none)
--ng INDIVIDUALS/GENOTYPE   minimum number of individuals per genotype group (default: 10)
--dir DIRECTORY             output directory (default: result)
--out OUTPUT                output file (default: mvgwas.tsv)

Input files and format

mvgwas-nf requires the following input files:

  • Genotypes. bgzip-compressed and indexed VCF genotype file.

  • Phenotypes. Tab-separated file with phenotype measurements (quantitative) for each sample (i.e. n samples x q phenotypes). The first column should contain sample IDs. Columns should be named.

  • Covariates. Tab-separated file with covariate measurements (quantitative or categorical) for each sample (i.e. n samples x k covariates). The first column should contain sample IDs. Columns should be named.

Example data is available for the test run.

Pipeline results

An output text file containing the multi-trait GWAS summary statistics (default: ./result/mvgwas.tsv), with the following information:

  • CHR: chromosome
  • POS: position
  • ID: variant ID
  • REF: reference allele
  • ALT: alternative allele
  • F: pseudo-F statistic
  • R2: fraction of variance explained by the variant
  • P: P-value

The output folder and file names can be modified with the --dir and --out parameters, respectively.

Cite mvgwas-nf

If you find mvgwas-nf useful in your research please cite the related publication:

Garrido-Martín, D., Calvo, M., Reverter, F., Guigó, R. A fast non-parametric test of association for multiple traits. bioRxiv (2022). https://doi.org/10.1101/2022.06.06.493041

Version History

master @ aaa979d (earliest) Created 15th Feb 2023 at 11:58 by Diego Garrido-Martín

add citation


Frozen master aaa979d
help Creators and Submitter
Activity

Views: 931

Created: 15th Feb 2023 at 11:58

help Attributions

None

Total size: 7.8 MB
Powered by
(v.1.14.1)
Copyright © 2008 - 2023 The University of Manchester and HITS gGmbH