Workflows
We present an R script that describes the workflow for analysing honey bee (Apis mellifera) wing shape. It is based on a large dataset of wing images and landmark coordinates available at Zenodo: https://doi.org/10.5281/zenodo.7244070. The dataset can be used as a reference for the identification of unknown samples. As unknown samples, we used data from Nawrocka et al. (2018), available at Zenodo: https://doi.org/10.5281/zenodo.7567336. Among others, the script can be used to identify the geographic ...
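This workflow takes as input a collection of paired fastq files. Adapters are removed with cutadapt, and read pairs are mapped with bowtie2, allowing dovetailing. Only concordant pairs with a MAPQ of at least 30 are kept. The BAM file is converted to BED, and peaks are called with MACS2 using "ATAC" parameters.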
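The filtering step (MAPQ of at least 30, concordant pairs only) can be illustrated with a minimal Python sketch that works on SAM text records. It assumes the aligner marks concordant pairs with the SAM proper-pair flag (0x2). The function names are hypothetical and are not part of the workflow, which may well use a dedicated tool for this step.

```python
# Minimal sketch, not the workflow's actual implementation.
# Assumption: concordant pairs carry the SAM "proper pair" flag bit (0x2).
PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned
UNMAPPED = 0x4     # SAM FLAG bit: segment unmapped


def keep_alignment(sam_line, min_mapq=30):
    """Return True if a SAM record is mapped, properly paired, and has MAPQ >= min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record: bitwise FLAG
    mapq = int(fields[4])  # column 5 of a SAM record: MAPQ
    if flag & UNMAPPED:
        return False
    return bool(flag & PROPER_PAIR) and mapq >= min_mapq


def filter_sam(lines, min_mapq=30):
    """Yield header lines unchanged, plus the alignment lines that pass the filter."""
    for line in lines:
        if line.startswith("@") or keep_alignment(line, min_mapq):
            yield line
```

Applied to an uncompressed SAM stream, filter_sam(open("in.sam")) would yield only the header and the retained records; the file name here is a placeholder.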
This workflow takes as input a collection of paired fastq files. It uses HiCUP to go from fastq to a validPair file. The pairs are filtered by MAPQ and sorted by cooler to generate a tabix dataset. Cooler is then used to generate a balanced cool file at the desired resolution.
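To make "the desired resolution" concrete, the following Python sketch bins valid pairs into fixed-size genomic bins and counts contacts per bin pair. It is an illustration only: the function name and the tuple layout of a pair are hypothetical, bin pairs are ordered by chromosome name rather than by a chromosome-sizes file, and the balancing step performed by cooler is not shown.

```python
from collections import Counter


def bin_pairs(pairs, resolution):
    """Count contacts per pair of bins.

    pairs: iterable of (chrom1, pos1, chrom2, pos2) tuples (hypothetical layout).
    resolution: bin size in bp.
    Returns a Counter keyed by ((chrom, bin), (chrom, bin)), smaller bin first.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        a = (chrom1, pos1 // resolution)  # bin index of the first mate
        b = (chrom2, pos2 // resolution)  # bin index of the second mate
        if b < a:
            a, b = b, a  # store only one orientation of each bin pair
        counts[(a, b)] += 1
    return counts
```

With a resolution of 10000, positions 1200 and 15300 on the same chromosome fall into bins 0 and 1, so that pair contributes one contact between those two bins.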
Type: Nextflow
Creators: Pablo Riesgo Ferreiro, Thomas Bukur, Patrick Sorn
Submitter: Pablo Riesgo Ferreiro
This workflow takes as input a collection of paired fastq files. It removes low-quality bases and adapters with cutadapt and maps the reads with Bowtie2 in end-to-end mode. It then removes reads mapped to MT, non-concordant pairs, pairs with mapping quality below 30, and PCR duplicates. It computes the pile-up of the 5' ends ± 100 bp, calls peaks, and counts the number of reads falling in the 1 kb region centered on each summit. Finally, it plots the number of reads for each fragment length.
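The summit-centered counting can be sketched in Python as follows. This is a minimal illustration, assuming 0-based coordinates, half-open windows, and that each read is represented by a single position (for example its 5' end); the workflow itself may define overlap differently. Both function names are hypothetical.

```python
from bisect import bisect_left


def summit_window(summit, width=1000, chrom_len=None):
    """Return a half-open [start, end) window of `width` bp centered on `summit`."""
    half = width // 2
    start = max(0, summit - half)  # clip at the chromosome start
    end = summit + half
    if chrom_len is not None:
        end = min(end, chrom_len)  # clip at the chromosome end when known
    return start, end


def count_in_window(sorted_positions, start, end):
    """Count positions p with start <= p < end in an ascending-sorted list."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)
```

For a summit at position 1000, the window is [500, 1500), so reads at 500 and 1499 are counted while a read at 1500 is not.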
This workflow takes as input a list of single-read fastqs. Adapters and low-quality bases are removed with cutadapt. Reads are mapped with STAR using ENCODE parameters, and reads per gene are counted at the same time. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks. Coverage (per million mapped reads) is computed with bedtools on uniquely mapped reads.
This workflow takes as input a list of paired-end fastqs. Adapters and low-quality bases are removed with cutadapt. Reads are mapped with STAR using ENCODE parameters, and reads per gene are counted at the same time. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks. Coverage (per million mapped reads) is computed with bedtools on uniquely mapped reads (with R2 orientation inverted).
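The two normalizations mentioned in the RNA-seq descriptions above can be written out as a short Python sketch. The per-million scaling divides raw depth by the number of uniquely mapped reads in millions. The FPKM function shows only the textbook definition (fragments per kilobase per million mapped fragments); cufflinks estimates abundances with its own statistical model, so its values will generally differ. All function names are hypothetical.

```python
def per_million_scale(n_uniquely_mapped):
    """Scale factor that converts raw depth to depth per million uniquely mapped reads."""
    if n_uniquely_mapped <= 0:
        raise ValueError("number of uniquely mapped reads must be positive")
    return 1_000_000 / n_uniquely_mapped


def normalize_coverage(depths, n_uniquely_mapped):
    """Apply the per-million scale factor to a list of raw depth values."""
    scale = per_million_scale(n_uniquely_mapped)
    return [d * scale for d in depths]


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of feature per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)
```

With 2 million uniquely mapped reads the scale factor is 0.5, so a raw depth of 10 becomes 5 per million mapped reads.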
ChIP-seq paired-end Workflow
Input dataset
- The workflow needs a single input: a list of dataset pairs in fastqsanger format.
Input values
- adapter sequences: this depends on the library preparation. If you don't know, use FastQC to determine whether it is TruSeq or Nextera.
- reference_genome: the choices offered in this field are adapted to the genomes available for bowtie2.
- effective_genome_size: this is used by MACS2 and may be entered manually (suggested values are provided for commonly used genomes).
...