# RNA-Seq pipeline Here we provide the tools to perform paired end or single read RNA-Seq analysis including raw data quality control, differential expression (DE) analysis and functional annotation. As input files you may use either zipped fastq-files (.fastq.gz) or mapped read data (.bam files). In case of paired end reads, corresponding fastq files should be named using *.R1.fastq.gz* and *.R2.fastq.gz* suffixes. ## Pipeline Workflow All analysis steps are illustrated in the pipeline [flowchart](https://www.draw.io/?lightbox=1&highlight=0000ff&edit=_blank&layers=1&nav=1&title=NGSpipe2go_RNAseq_pipeline.html#R7R1Zk5s489e4Knmwi%2Ft4nDOz2WSyO5OtbPYlJZCw%2BYLBATzXr%2F90AOYQNtgY8Ex2UjuDACG1Wn13ayJfLJ8%2BhGC1%2BBxA5E0kAT5N5MuJJImKJE3IPwE%2BsxZd11nDPHRh8tCm4d59QUmjkLSuXYiiwoNxEHixuyo22oHvIzsutIEwDB6LjzmBV%2FzqCsxRpeHeBl619ZsL40XSKmrm5sYNcueL5NOGlMzPAvbPeRis%2FeR7fuAjdmcJ0m6SOUYLAIPHXJN8NZEvwiCI2V%2FLpwvkEbCmEGPvXdfczYYcIj9u8sLiRdX%2F%2FXT3%2FOR9Xj7NbwX7r3%2FtaTa4%2BDmFBYIYNMllEMaLYB74wLvatJ7T%2BSLSrYCv%2FrdertLn52CFWzZvfQoC3HApksdQHD8nKw%2FWcYCbFvHSS%2B6iJzf%2Bl3Q4U5Or77k7l0%2FJt%2BjFc3rhx%2BFz7iVy%2BT1%2Fb%2FMavUrfi%2BIw%2BJmtM16H8yooE%2BhGwTq0E7g82Ld%2F339bf1xfvvz39Y%2Bv19%2Bjh%2F%2BmcoIgMQjnKN7yoJIAmkA394lkqT6gYInwGPEDIfJA7D4UERMk%2BD3PntssNP4jWWv%2But8IH53FrRF7D4s76%2Fbm%2FvrHy8003XgPwFsnn5pImodncA7dB%2FJFz5379Ib2a01Q9Dyk6J9d4r%2FmyW%2F6mhWWW%2FDAaF9pawnPcqvvAQt559lWugi8IKQPydf0P%2B6iOYEfJ%2FgkamTcIFpQpBQLKEquHNfz8p1K5CfrNL1D9658Pg8BdDEilJrtYOnaCQbNPRBFKTal21qgn43x4gUEcFNT2IZZDyiM0dNWTEjuGmpCaxLaOjU1dv2Yo1RGQoAXOSKlpIt8CPZwkVl80%2BhzpZGfY6BPB9giGUIBWzIWm8MWRTCr2KJ1gS0v3%2BLLr8K%2F%2F328vrsKlyL8%2BuH5j6l8MIt5SwylIT%2BRlaH4CXeNVfG1LzLy4RmRNPGlTfYv2c2k8dr10uFU8aBAVHpACnkopBDlx9sb9zE4%2B3O%2B%2BvLjW%2FDxXBGnEodJ9I8kXJAcdx0S%2FtgD3LcNk8OcCdMrLEDKjsmNaURBeIYfEJXVE4dXX0iTs%2FMlcH0CP3eFPBfzNtx4vp2Ns68Wm5mYsO9AuplO0ku0An5FHCk3ALLublz6TuiDCP2apbCYYQEkeHjOfypdqbQFqMCWFc1GpiVA3QKCaiqm4uiOo5qaIAlT2wRQRbbqiFDSZR0auqBrjgUdVRUECymKAnUgI6vwkUWInMJnFnFMVOczgkjS9dyNF2trhmUQfOEuLSfACIz
%2FvP1wT4YuzQN8YXmBRdANRDHCq3idTirCf9%2FdnuF54j92T5iBrfa5FCcAB08qQOdiE2%2B5apCsLEKOBvWmlCK5MbDw27m9JAkQOa7vUh1CEt5Z5M771ptr1PP2AgAjSgujFbIpNZQEOlFiNQng2iMod%2FrzQ1GEGYwLCLkGPswmNyXTdh0sP1AmBDDNJ%2Fttjwl3gINPyF7HBN45HKwZSb3%2B1YXaYooFtcWoai2yztFaROlYOq6yW3jButuK%2FInp6gpreWTyTBZJRUAjbchshwoXXFu5eGMYihWQpcDJQyxtayyKJN%2F7K3AptiUfm0pqYcH0YgeB40RYRCovQjboA2wP5pCaRvr394LWsUvTKOgZG7WjB3VSa6o5DKU4bBt1rXWpSvGs4IkQPNefM5JnBSFE4RQ34wsmIyX7ghJEAW8CNbuTN%2BKT2xNJRgr5yZ5YAQizvqVt%2FKMiw%2FDEFb4waWNyG2LpK5rFTwVLWVmGhKYGRV2Fmg5EYAiKaIgSFiNFxZIMgGQw1R1bMETTAVARJB0CSVJkU9ZsQVRtLGealiqogoNFxkFkyNp5MiCVb2%2BTGGtkwcoi8PlnKn5V0akeV8qsFS0LQLQxduNB2RRZch0KlacylMo9JdXjVWwF8LnSWBEIY1hucdMGvGD405dXVBwB3nPk4lXB%2B1OYMxGMeq6EFPoE6ckSCkwmwM9pYEmcO74VrdgssFQ%2FY1D488tD9O0rnsq7P79Mv319P8vB3eWsRXWU1RausFt9jDRWodAGLh%2Fwzl%2FhSfhYEosYNcM7zs%2BoZhkPxzqhhv2%2FkYGRRt6WIe1s29fItUTecm3gnSUejJhICOepP8NDDnkrwE85HjUGOtT%2Bl5cVHhdujO4xESA9PoZgtY2DtxCRi4Z9WZQ48p5YFfiMo3mB1N%2BSWHNJLF2u0Yhi%2F0Qo%2FGL9j4RTYCpPXGhvR9QqUvZaQUu1DV3XHMlxREUxVUeVbRGopmTqjgMdxZqKGJUtyTItLFxZQFZMXVRUJBkOhIajiwKybSQpUB1G0KqZJQMQn7n9FrJaCVkRlom8zGSyN6Nzy%2F1RYaS1zNGaQ2Pe1eHgQwSgjbc2oSekZzYHISd3vlv77q81xYa1NcXUlawNIReU2cDg%2FbYpH0tMmTP5rysozHPiJJGqHTLbApr0ObkQrTwszcSdLnPWJaGXR5pUO%2Fltu5wi5RhoIticoMCXBSPuEvjEPYQBfJmTB1oIgZLxWwhsLgSmRuSTEQJXXM69xMPHGjJlzsJqI%2Fex9indQeSelruHkT6eJrvqjLJnn%2Fk9dkl3f22cYzx3SUUM4VKdVbltUXlqr6ky8kBuGnxR9ZwKuW3l08x3NHsAYdTArWxZqmBAKIuyqEhYaIVAFIGlSw6CyNEgmiITmhDaClINBe9QoBu6ZGL6AQSkmo4sQQgVU5Hso0uqmYdvI6funC0DWd1jjWXXRONogybHQYoSR12FwYMLCYLnfYY8ZN%2FKYukIKHRv8YsFIBVmzvkue43MleubFN4Fq3jWXjrLgXMnl5a2c%2Bnx8eMs6WEXPzb11mR5f3bMi%2F3p1W2ZfnC0fkuluIrFDo7nt5R%2B%2By0369W%2F%2FMPHBimNhE6wQUv3T9oFk%2BiStzZIUe3IKMYviErJ6spmXOnoLAzBc%2B6xFXkg2jJgVeMOuHZcpefTcW2Qm42gW1TnxU6MUZ60A99x5%2BuQ5mzkg454tq1RiJCgkVSGBflZ6K1mPiFltZLZdCOdQfRAMvquGUBIO3lsxi5rBTL%2BQ7XiWI0dkRt%2BGfKnXydB9wGSVYjA0sKiay08OE%2FsYVftSQbtEav4xukg8OpBWbpbC8aq2NkCywYT%2BCsNNOqTUKTHIPxJAghp3z6YU%2Bl7m9zNazukIUSM5UUtvvuuuwFiQIC1RxiYm4wggWXkrcOCF4BELrjEuOoH5HE
fISIo4cXb%2FRGCv0noY5LTlXbrBTbRvXIL2Ghe7ydpiCfvLkfNyihF9iJBeGKCwQNImBFWusji%2F0SI4OQGMFg%2BjV1%2FHu2MsihpX6NWplRZ52gBHDVA1EoSWmfu7DSgcUDVSWkLxVenOnGT1MXqQgyapD6MMtVVLlnjZDLtKDpYW41HF4wC1kmatlXjKT8vG%2BohGs8uA3mmKVyDKP77gkoBTWzTv9aYCMfPkyTijhJ%2F6isMwSOFKU0iIPPAktPPZHN0aX3Fj%2BH%2FO%2FjGL3vGmjN5a6exjiFRrbGORyZLPCfx33%2BiV5eyVsNpaJZ20rEw4eT1HcSE1DJuyZzUeo1DUTWhHssPMOnxU2sPJn49kbrjkSylIclSjmM2akuylFIOvpxYmepIVuV5xZwcQLKaodWOPArkWcHjb5Saqp3XbekEpRRVOj6K7CgO9BtFEqpjjhNFxD5QZHvdhzeCIsd2U6hiaWlTZ1LH3oUKI9J6YESpQjL%2B0iGd1nlpTF06V8MOK%2BazPd7rJDZ8ppNnF6xTaZdS3sdy68OQFE3rh6QoRrckhWsw0oekKCNztXeHmnLncs6eQYmZWeX%2B69ldY4vLHUrcKqsVDTs%2Fom0lc3JFGKjtjStKDqzHNK4ct4SDlNZrGMiYwodt1ciPF2t6T1ZL2EQGlKgHiTXIQTkJNShapwgUy6HkSxdCSlxKC7JpuUvAIE9qygnSipI04J3t%2FITe4mGp5%2FgfBt0FIRfqJfGIqefi5hr%2FI4%2BH8UXg466BS1cWYWx9RFE8qVZA7AAHZIN8vIAFSgUJuH4dWWmNBA1XvL6YwG5fLy2cknh8hT2CT76s49Wa5sMEHiQoc1ahYfjmL3ucoSeZuzLIpuEynyXL6dmMu0n05%2BEpF3mETUt4XoOl6xFcu0HeAyK9TvpwX%2BpKsSirrKrN8HwPYtcQz%2FWR4blQQfUQRWsvHmmcVT5zLdo2nGN8HLqOg8Is9Bs9rTCsIlYErAq1Xob04QsZiB%2B69mKJ%2FM33x7vVj7CtVaEqw3C39dGS7FN9e8Tb2gvmI93TeGTTlGXZgedhYZHGCDH2hYC9mGyCcl49MityjzwqejGMp2vBVeX%2FxPnl8i5a3vijO22gnaq%2Bs%2F4vvko%2BScTpxICCQpdm0dxnCetdK%2FBpqPlOBd4czJS4ddyjIW1VyTxERIEaKXXbN7nw8opNa3a3hPleKzmF0BINZDuSbkuOYQFoObZkAmSqOhJMQZyKliWZJoIahrJsAMG0LUMVBc2EDrB1UVQUAxi6WvzIMXIKabQw%2Fp0t13W0cP3nH6wK7Q%2FWjDffDxZleV0HAQa90t22IdxtY9d5CzSj48fDmC2Qt0JhNLvbtlJANGwk6tAwdFs2bVGWbVWWoQmRIUBoy1PbkgQJiQCJoi7aUHdMVdaBaMqGZTqKSRZMEnTdGuFK7QBFumT8x8Yafv8XHiim9BmFIXQoJIDMlwKmksssh0FZkO%2FnACsMhFBTDBVYGQ%2FM1mMiz9jrKKZdbaq8bXI5cwHHeHmjBSZ%2BWc2zNIR6z6Di16fUa5wqtPygZFk6nEfyg78OL6%2BQl5gSo2L%2FIlPy4Y28dFS%2F%2FJHiwDixpsX0wTRwa5cTrcOwU9cnhoOnFRUz%2FbixNwQSarBkhIZYhylNSLO%2FWWAqDUf1XCsECSIf3V1Smkxrz4nSV1jqccNQZXFYzwmfEh2evzwcJSq5%2F9uSpQMokToMJWoaIdQhJao0RGuLhLFLWLaLn1eIr0W19N5SKy2rGsbK02LaQYSZ5xXqh0SVJ9WeRqmvg0aZI6RRqRNx%2FKFkYzhqTGlaHF4Z7KwxbnSPNuQin1p0T%2BNFlo8TeNY6XiylF1n61o7cidLz6mG5E425XcIHfjC34aFczEEgXof9srD2nEs7Sc5VjkuS1RFyrh3VgcYRVju6XNbUXrOTuKX
VecYTulhpsMDygljHyEnlhwjKNt7btDar5c6%2FufM%2Bkkjx0O1k6O2Jiv4qiIpqGrOqd3VwsiKfAlnpVNwxBhN3DqUIH778uMrCbRpv902ETtHin6u6PUe09E4fokVhDu1pgfE6aMHA5rvzG1G8mL%2B4yvJGhnfXcPnpCU5PpvrgcKa646Rx7jTVZTFm%2FZnq4Hp1ByAI9%2FIW4Jdp%2FfckOpEKG33QFvzdkAy6PVkxT5KslC1uysD5FPzSPIcnb55caZ4CheqSAqlNA7lGWppH3WGrGao0DxGAzgP4jPUrqTHJC1NTTaqTCeSvRJwiH0T%2BnMRb9EH6yCfJYRh4MFJr8qfmz6I%2FXfKnjtFsIw9anOfkHA6Nk6CNnpSzZkzuZCowjcLh0DideJwOB0XbzsSGcjhcXv24vLpHv5pzsMstSUUbS0EvojvCv9tzLuV1JEIr8rCca6tFfDTB99W8ojgE9s%2BRxt4z4zqZcjLIxOD24EaksGOUqMdvMWWOawk%2FWp4RP1bt1YcINMhGasHUj37MRqnas14%2BHaPxMRumVOior2M29OKxGeM4ZsMY9TEbd4gKHzbRlT8gP1jy3ZoNSWT%2FeVW0tgw5CB6iJ5o0Oi9Mopzlk0VwCLSWv0vpP%2F3GPHb6Ti%2BpJIxsStAnuSL5OvP0rPtdh6vtdeDXqEvO62K15LyYRjUWsjuEDvgUfwcPXnI%2BoyFvt%2BQ8P1N5e13L3u3aRVNPJkb0LVMMlOFscoIYucsmjivD2ayvPURPSucQfit4IiefUzgS%2Bm4FIdaQprg54wH156ZbWBeZUyxMVSoZKeQneyI7EJ3clrbxjpzhIUmHpWi5XRt7a%2Be%2FM5AU6yPhX%2B8wiuNVY7UmWF5nMak4zdX8iQWF8P3mGksO1FaT1XxhSViAohrvyACWt5WL%2B4hitIryJ8M0Odv7WKePt%2Bz%2FjQxs0upo81M4qdxMi4emibMm5xyF1OLUuWjFNwEMWgN0P89M5%2By6Sx4simqVCQ96dnkzHpyNexBL5x9%2Bkyoj4BGCGIzT0klPiJlsSv%2B9e3FXKwTfbxtZLzqtD5YshQJzwweiu1LHBjPDroDLuC9KRKGmNQ16HH8ElisPibM7cUZhPJu%2FjHuc0h7jPGnTdslgkF7urAZxNHtB6kgasdOGlFcuFBEZESmzwHJSKvD4plwzvIrM%2FRaAO%2FnC7HqfoRSpH3l3vEznhxQdmsxAqzHdJUWemgYsJAAhHJ3Zqikvx3v7JwweWSlUViyJjProWZJJPSkyg9aRC2Ye%2FqcbuaArVYLRZ%2BTCjfDRWdwasfewuLNub%2B6vf7zcTOsDFyqoRegsl4%2BUDUwKj118QnMiPm5Ql3VXYRdWLbtI6sbnFolb451wiVJTpdR8bWl5HlYUSWYnnEQuchK5ghdKGq%2FJcwMcwki4SCBXkIAeRj%2FZlHNt4EbZveW6AJ1Rio8wqluK50Epe7Y7A90hXmOtLEPm7K0pNDeNFK48BQNrFESBsyLyK1fxbrM4g8uKewqIbYxxR0CuzNiRxy6Fh13qkbCrT4%2FmVvR%2Buw5NLlhO0BOVGMOvmVNly3Z8a16nrcB4s26V0Q5sQH9PByym7NvhyS88DtPFgT5cSjbkOScuM%2BdfByVb2Cij2N6AhauMnKJYPW6KL%2F9oR8LO%2BuM6Mkv5HzEGp50PbN8Y0el6riOaaJggW1aX%2BdUql2UhVpOaLeLRtMuqmf2S5fxSsxNiQZOQJMETGYMXl0HDMN7QkvEsy%2BnBvgXfiHn4knEtyydT36K7HL1qgGHtyuZNzd9N7%2BNX948r6%2F7%2BKXgOnj9%2B%2F%2FLPtKmlOT2OaOjM83Isr1zIJN%2F5vKL1k3meHDpEbRyfyDFJjat4Batnuh7RT%2FIJesKSgNEvF2uWNKYSyBFt4ks3ImoN8FGwJtfJrKjFLbO
NS1zreD2ynZxxvJyQrgvDFvzipipzzC0Dxy23oo34IgsnLtPLtod17%2BmJ42Js05zm1H8yksR1aWTVWTrEhhL3lPtHhqHz1o3iQbspi%2BOytOrx72ViVpal9y5C1UxcS%2FW0V4iZJTo1MJniQ38oKrVt1I1cBuvQez4nKdco3s30N%2FjCrmKWlS1fTs2WkG8uJSi6XDoBO9uoBd2oKibIegc%2BGf5yH17A5IhFuqS21UFbVQhstJn62SQ1qkm5HKB5nJRmXSyqQJrZjwr0ee3F7t8XzYOA1sslCAkKSQJ%2BLYvX26Hf4CHMQm8184n7rFbHmW70HIgekFdQc1gx4yUZ7y97W8hPPbqcnFJTjvjhHXnWp1KzNauv1nlZDT0I6UjbGcULbsQSvcxBnSL7eebfTA11E0m%2Bpv9xaVbePq6VaViRT%2BWtf7jTK4381JkK5yGALl7%2BUjNGM5IOQingnCSHpMR0ASCllQKPIXaAT6YszLRiLQRFUbKmPAtUjZnICRkSDXV2LEYo1aeC1CFSbQwLfWvtlVu8zJb%2Fd9ksTA9i9AKW1sfCGWmJQY9Roi2Y6VWi7DctyZESJO6apUNEZFWjuOhh2NbDPxf3hM6mRRoEjBiPzANRrMCz79B%2BrYEfk2PldhfF2dbbp%2FQkOhom5BEhgfG%2FCEXRshg0tHVU9MQomjJF14BUxcJvFwa3o4sPrAQkc%2BrmSkQ2evme5lOkggOCPqIDwbt%2BTQ7iYyeUW6Tjz5f3OZRhqZ4LBGK81vsvyraqYO8ur95POGXEk3MB6cgoMoMo3n8EV7tLlmf5r3t%2B4gxCWoEDeMVPuL7trSHbJ5TRp%2BkKdC%2B2%2FR5urBKAHYzkcBdQIi4M6wAySwGhilgNC5BEns9uj7hGfImlsDgvmuJpLT4HkEidV%2F8H). Specify the desired analysis details for your data in the *essential.vars.groovy* file (see below) and run the pipeline *rnaseq.pipeline.groovy* as described [here](https://gitlab.rlp.net/imbforge/NGSpipe2go/-/blob/master/README.md). A markdown file *DEreport.Rmd* will be generated in the output reports folder after running the pipeline. Subsequently, the *DEreport.Rmd* file can be converted to a final html report using the *knitr* R-package. 
### The pipeline includes
- quality control of raw data with FastQC and MultiQC
- Read mapping to the reference genome using STAR
- generation of bigWig tracks for visualisation of alignment with deeptools
- Characterization of insert size for paired-end libraries
- Read quantification with featureCounts (Subread)
- Library complexity assessment with dupRadar
- RNA class representation
- Check for strand specificity
- Visualization of gene body coverage
- Illustration of sample relatedness with MDS plots and heatmaps
- Differential Expression Analysis for the specified group comparisons with DESeq2
- Enrichment analysis for DE results with clusterProfiler and ReactomePA
- Additional DE analysis including multimapped reads

### Pipeline parameter settings
- targets.txt: tab-separated text file giving information about the analysed samples. The following columns are required:
    - sample: sample identifier for use in plots and tables
    - file: read counts file name (a unique sub-string of the file name is sufficient; this sub-string is grepped against the count file names produced by the pipeline)
    - group: variable for sample grouping (e.g. by condition)
    - replicate: replicate number of samples belonging to the same group
- contrasts.txt: the intended group comparisons for differential expression analysis, e.g. *KOvsWT=(KO-WT)* if targets.txt contains the groups *KO* and *WT*. Give one contrast per line.
- essential.vars.groovy: essential parameters describing the experiment, including:
    - ESSENTIAL_PROJECT: your project folder name
    - ESSENTIAL_STAR_REF: path to STAR indexed reference genome
    - ESSENTIAL_GENESGTF: genome annotation file in gtf-format
    - ESSENTIAL_PAIRED: either paired end ("yes") or single read ("no") design
    - ESSENTIAL_STRANDED: strandedness of library (no|yes|reverse)
    - ESSENTIAL_ORG: UCSC organism name
    - ESSENTIAL_READLENGTH: read length of library
    - ESSENTIAL_THREADS: number of threads for parallel tasks
- additional (more specialized) parameters can be given in the var.groovy files of the individual pipeline modules

## Programs required
- Bedtools
- DESeq2
- deeptools
- dupRadar (provided by another project from imbforge)
- FastQC
- MultiQC
- Picard
- R packages DESeq2, clusterProfiler, ReactomePA
- RSeQC
- Samtools
- STAR
- Subread
- UCSC utilities
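
The *targets.txt* and *contrasts.txt* formats described under *Pipeline parameter settings* can be sanity-checked before starting a run. The following Python sketch is not part of the pipeline; it only illustrates the two file layouts. The sample names, file sub-strings and the helper functions `read_targets` and `check_contrasts` are hypothetical, and the contrast pattern is an assumption derived from the single *KOvsWT=(KO-WT)* example, so more complex contrast expressions may need a looser check.

```python
import csv
import io
import re

# Columns that targets.txt must provide, as listed in the parameter settings.
REQUIRED_COLUMNS = ("sample", "file", "group", "replicate")

# Assumed contrast syntax, modelled on the example KOvsWT=(KO-WT).
CONTRAST_RE = re.compile(r"^(\w+)=\((\w+)-(\w+)\)$")


def read_targets(text):
    """Parse tab-separated targets text and check the required columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"targets.txt lacks columns: {missing}")
    return list(reader)


def check_contrasts(text, groups):
    """Return (name, group_a, group_b) per line; reject unknown groups."""
    contrasts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        match = CONTRAST_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised contrast: {line!r}")
        name, group_a, group_b = match.groups()
        for group in (group_a, group_b):
            if group not in groups:
                raise ValueError(f"group {group!r} not in targets.txt")
        contrasts.append((name, group_a, group_b))
    return contrasts


# Hypothetical example data: two KO and two WT replicates.
TARGETS = (
    "sample\tfile\tgroup\treplicate\n"
    "KO_1\tko_rep1\tKO\t1\n"
    "KO_2\tko_rep2\tKO\t2\n"
    "WT_1\twt_rep1\tWT\t1\n"
    "WT_2\twt_rep2\tWT\t2\n"
)
CONTRASTS = "KOvsWT=(KO-WT)\n"

rows = read_targets(TARGETS)
groups = {row["group"] for row in rows}
print(check_contrasts(CONTRASTS, groups))
```

In practice the two strings would be read from the actual *targets.txt* and *contrasts.txt* files; a contrast that names a group missing from *targets.txt* raises a `ValueError` instead of failing later in the DE analysis.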