Workflow Type: Galaxy
ATACseq Workflow
This workflow closely follows the corresponding training material. More information about ATAC-seq analysis is available in the slides and the tutorial.
Input dataset
- The workflow needs a single input: a list of paired datasets in fastqsanger format.
Input values
- reference_genome: the selectable values are restricted to the genomes available for both bowtie2 and bedtools slopbed (dbkeys table)
- effective_genome_size: this is used by macs2 and may be entered manually (suggested values are provided for commonly used genomes)
Processing
- The workflow removes Nextera adapters and low-quality bases, and filters out any read shorter than 15 bp.
- The filtered reads are mapped with bowtie2, allowing dovetailing and a fragment length of up to 1 kb.
- The BAM is filtered to keep only pairs that pass the MAPQ30 filter, are concordant, and map outside the mitochondrial chromosome.
- PCR duplicates are removed with Picard.
- The BAM is converted to BED so that macs2 takes both reads of each pair into account.
- Peaks are called with macs2, which also generates a coverage file.
- The coverage file is converted to bigwig.
- The number of reads within 500 bp of summits and the total number of reads are computed, in case further normalization is wanted.
- Other QC steps are performed:
  - A histogram of fragment lengths is computed.
  - The percentage of reads mapped to chrM or MT is computed.
  - MultiQC is run to give an overview of the QC.
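The mitochondrial QC step above can be sketched in Python. This is a minimal illustration, not the awk script the workflow actually runs: it assumes the standard tab-separated `samtools idxstats` layout (reference name, length, mapped reads, unmapped reads), and the function name `percent_mitochondrial` is hypothetical.

```python
def percent_mitochondrial(idxstats_text, mito_names=("chrM", "MT")):
    """Return the percentage of mapped reads assigned to chrM or MT.

    Assumes samtools idxstats columns: name, length, mapped, unmapped.
    """
    total = 0
    mito = 0
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # The "*" row only counts reads without a reference; skip it.
            continue
        mapped = int(mapped)
        total += mapped
        if name in mito_names:
            mito += mapped
    # Avoid a division by zero on empty input.
    return 100.0 * mito / total if total else 0.0


# Hypothetical toy input: 900 reads on chr1, 100 reads on chrM.
example = "chr1\t1000\t900\t0\nchrM\t100\t100\t0\n*\t0\t0\t5\n"
print(percent_mitochondrial(example))  # 10.0
```

A high percentage here usually means that many reads were lost to mitochondrial DNA before the filtering step.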
Warning
- The coverage output is not normalized.
- The reference_genome parameter value is used to select references in bowtie2 and bedtools slopbed. Only references that are present in bowtie2 and bedtools slopbed are selectable. If your favorite reference genome is not available, ask your administrator to make sure that each bowtie2 reference has a corresponding len file for use in bedtools slopbed.
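Because the coverage output is not normalized, one possible downstream approach is to rescale it using the read counts the workflow reports. The sketch below is an assumption, not part of the workflow: it scales coverage values to "per million reads in summit regions", and the function names and the one-million target are hypothetical choices.

```python
def scale_factor(reads_in_summits, per=1_000_000):
    """Factor that rescales coverage to `per` reads in summit regions."""
    if reads_in_summits <= 0:
        raise ValueError("reads_in_summits must be positive")
    return per / reads_in_summits


def normalize(coverage_values, reads_in_summits):
    """Multiply each coverage value by the scale factor."""
    factor = scale_factor(reads_in_summits)
    return [value * factor for value in coverage_values]


# Hypothetical sample with 2 million reads within 500 bp of summits.
print(normalize([2, 4], 2_000_000))  # [1.0, 2.0]
```

The same function works with the total number of reads instead of the reads in summit regions, if normalizing to sequencing depth is preferred.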
Inputs
ID | Name | Description | Type |
---|---|---|---|
PE fastq input | PE fastq input | Should be a paired collection with ATAC-seq fastqs | |
effective_genome_size | effective_genome_size | Used by macs2: H. sapiens: 2700000000, M. musculus: 1870000000, D. melanogaster: 120000000, C. elegans: 90000000 | |
reference_genome | reference_genome | reference_genome | |
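When launching the workflow programmatically, the suggested effective genome sizes can be kept in a small lookup. The values below are copied from the effective_genome_size description; the dictionary and function names are hypothetical, and any other genome needs a value you supply yourself.

```python
# Suggested values from the effective_genome_size input description.
EFFECTIVE_GENOME_SIZE = {
    "H. sapiens": 2_700_000_000,
    "M. musculus": 1_870_000_000,
    "D. melanogaster": 120_000_000,
    "C. elegans": 90_000_000,
}


def effective_genome_size(species):
    """Return the suggested macs2 effective genome size for a species."""
    try:
        return EFFECTIVE_GENOME_SIZE[species]
    except KeyError:
        raise ValueError(
            f"No suggested size for {species!r}; enter a value manually."
        ) from None


print(effective_genome_size("M. musculus"))  # 1870000000
```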
Steps
ID | Name | Description |
---|---|---|
3 | Cutadapt (remove adapter + bad quality bases) | toolshed.g2.bx.psu.edu/repos/lparsons/cutadapt/cutadapt/4.0+galaxy1 |
4 | Bowtie2 map on reference | toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.5.0+galaxy0 |
5 | filter MAPQ30 concordant pairs and not mitochondrial pairs | toolshed.g2.bx.psu.edu/repos/devteam/bamtools_filter/bamFilter/2.5.1+galaxy0 |
6 | Get number of reads per chromosome | toolshed.g2.bx.psu.edu/repos/devteam/samtools_idxstats/samtools_idxstats/2.0.4 |
7 | remove PCR duplicates | toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MarkDuplicates/2.18.2.3 |
8 | reads in chrM/MT for multiQC | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.2 |
9 | convert BAM to BED to improve peak calling | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_bamtobed/2.30.0+galaxy2 |
10 | Compute fragment length histogram | toolshed.g2.bx.psu.edu/repos/iuc/pe_histogram/pe_histogram/1.0.1 |
11 | number of reads | toolshed.g2.bx.psu.edu/repos/iuc/samtools_view/samtools_view/1.15.1+galaxy0 |
12 | Call Peak with MACS2 | toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak/2.2.7.1+galaxy0 |
13 | get summits +/-500kb | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_slopbed/2.30.0+galaxy1 |
14 | summary of MACS2 | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
15 | Bigwig from MACS2 | wig_to_bigWig |
16 | Merge summits +/-500kb | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_mergebed/2.30.0 |
17 | Compute coverage on summits +/-500kb | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_coveragebed/2.30.0 |
18 | number of reads in peaks | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.2 |
19 | Combine number of reads in peaks with total number of reads | cat1 |
20 | reads in peaks multiQC | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.2 |
21 | MultiQC | toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 |
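Steps 13 and 16 extend each summit and merge the resulting regions. The sketch below illustrates that logic in plain Python rather than reproducing the bedtools tools: it assumes a 500 bp flank on each side (as in the "1kb around summits" output), clips regions to chromosome boundaries using a chromosome-size table (the role of the len file), and merges overlapping or touching regions. All names are hypothetical.

```python
def slop(summits, chrom_sizes, flank=500):
    """Extend each (chrom, start, end) interval by `flank` on both sides,
    clipping to the range [0, chromosome length]."""
    extended = []
    for chrom, start, end in summits:
        new_start = max(0, start - flank)
        new_end = min(chrom_sizes[chrom], end + flank)
        extended.append((chrom, new_start, new_end))
    return extended


def merge(intervals):
    """Merge overlapping or touching intervals on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical summits on a 5200 bp chromosome.
summits = [("chr1", 100, 101), ("chr1", 900, 901), ("chr1", 5000, 5001)]
regions = merge(slop(summits, {"chr1": 5200}))
print(regions)  # [('chr1', 0, 1401), ('chr1', 4500, 5200)]
```

Merging before counting matters because nearby summits produce overlapping windows, and reads in the overlap would otherwise be counted twice.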
Outputs
ID | Name | Description | Type |
---|---|---|---|
mapping stats | mapping stats | n/a | |
MarkDuplicates metrics | MarkDuplicates metrics | n/a | |
BAM filtered rmDup | BAM filtered rmDup | n/a | |
histogram of fragment length | histogram of fragment length | n/a | |
MACS2 narrowPeak | MACS2 narrowPeak | n/a | |
MACS2 report | MACS2 report | n/a | |
Coverage from MACS2 (bigwig) | Coverage from MACS2 (bigwig) | n/a | |
1kb around summits | 1kb around summits | n/a | |
Nb of reads in summits +-500bp | Nb of reads in summits +-500bp | n/a | |
MultiQC on input dataset(s): Stats | MultiQC on input dataset(s): Stats | n/a | |
MultiQC webpage | MultiQC webpage | n/a | |
Version History
v0.1 (earliest) Created 21st Oct 2022 at 03:01 by WorkflowHub Bot
Creators and Submitter
Creator
Additional credit
Lucille Delisle
Submitter
License
Activity
Created: 21st Oct 2022 at 03:01
Last updated: 17th Jan 2023 at 03:01