Teams: ParslRNA-Seq: an efficient and scalable RNAseq analysis workflow for studies of differentiated gene expression
Organizations: National Laboratory of Scientific Computing
ORCID: https://orcid.org/0000-0002-2151-7418
I am a bioinformatician and phylogeneticist. I really love working on problems at the intersection of high-performance computing and scientific workflows applied to omics.
Phylogenetic reconstruction using genome-wide and single-gene alignment data. Here we use the maximum likelihood reconstruction program IQ-TREE. Data can be prepared with the phylogenetic data preparation workflow before reconstruction. The resulting trees can be viewed interactively using Galaxy's 'Phyloviz' or 'Phylogenetic Tree Visualization'.
This workflow begins from a set of genome assemblies of different samples, strains, or species. Each genome is first annotated with Funannotate. Predicted proteins are further annotated with BUSCO. Next, 'ProteinOrtho' finds orthologs across the samples and groups them into orthogroups. Orthogroups in which all samples are represented are extracted. The orthologs in each orthogroup are then aligned with ClustalW. Test dataset: https://zenodo.org/record/6610704#.Ypn3FzlBw5k
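The orthogroup extraction step can be illustrated with a minimal Python sketch. It is not the workflow's own implementation. It assumes a Proteinortho-style tab-separated table: three leading summary columns, then one column per sample, with `*` marking a sample that has no gene in that group. The function name `complete_orthogroups`, the sample file names, and the gene identifiers are hypothetical.

```python
import csv
import io


def complete_orthogroups(table_text, n_meta_cols=3, missing="*"):
    """Keep only orthogroups in which every sample has at least one gene.

    Assumes a tab-separated table whose first `n_meta_cols` columns are
    summary fields and whose remaining columns hold one sample each.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    samples = header[n_meta_cols:]
    kept = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and any further comment lines
        genes = row[n_meta_cols:]
        represented = all(g.strip() not in ("", missing) for g in genes)
        if len(genes) == len(samples) and represented:
            # Co-orthologs within one sample are comma-separated.
            kept.append(dict(zip(samples, (g.split(",") for g in genes))))
    return samples, kept


# Hypothetical three-sample table; the second group lacks strainB.
example = (
    "# Species\tGenes\tAlg.-Conn.\tstrainA.faa\tstrainB.faa\tstrainC.faa\n"
    "3\t3\t1\tA_001\tB_007\tC_012\n"
    "2\t2\t1\tA_002\t*\tC_030\n"
    "3\t4\t0.8\tA_003,A_004\tB_010\tC_044\n"
)
samples, groups = complete_orthogroups(example)
for group in groups:
    print(group)
```

In this example the first and third rows are kept and the second is dropped. The sketch keeps groups that contain more than one gene per sample; whether such groups should also be excluded before alignment is a separate choice that the description above does not specify.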