Hi! I'm Russell.
I'm a microbiologist who uses graph theory and machine learning to study the relationships that bacteria and archaea form with their host organisms, and lately giant viruses and their hosts. Or, I'm a computer scientist who builds software that uses concepts from evolution to extract knowledge about ecology from large datasets. Or, I'm a data scientist who uses Python to explore biological systems. Or, I'm a physicist who went rogue and defected to the squishy side of science. ...
Phylogenetic reconstruction using genome-wide and single-gene alignment data. Here we use the maximum likelihood reconstruction program IQTree. Data can be prepared with the phylogenetic data preparation workflow before phylogenetic reconstruction. The resulting trees can be viewed interactively using Galaxy's 'Phyloviz' or 'Phylogenetic Tree Visualization'.
This workflow begins from a set of genome assemblies of different samples, strains, or species. Each genome is first annotated with Funannotate. The predicted proteins are further annotated with BUSCO. Next, 'ProteinOrtho' finds orthologs across the samples and groups them into orthogroups. Orthogroups in which all samples are represented are extracted, and the orthologs in each orthogroup are aligned with ClustalW. Test dataset: https://zenodo.org/record/6610704#.Ypn3FzlBw5k
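
The step that keeps only orthogroups in which every sample is represented can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the function name `complete_orthogroups`, the in-memory dictionary layout, and the sample and gene names are all hypothetical, and real ProteinOrtho output would need to be parsed into this shape first.

```python
from typing import Dict, List, Set

# Hypothetical layout: orthogroup name -> {sample name -> list of gene IDs}.
Orthogroups = Dict[str, Dict[str, List[str]]]


def complete_orthogroups(orthogroups: Orthogroups, samples: Set[str]) -> Orthogroups:
    """Keep only orthogroups in which every sample has at least one gene."""
    return {
        name: members
        for name, members in orthogroups.items()
        # A missing key or an empty gene list both count as "not represented".
        if all(members.get(sample) for sample in samples)
    }


# Made-up example data, for illustration only.
samples = {"strainA", "strainB", "strainC"}
orthogroups = {
    "OG1": {"strainA": ["a1"], "strainB": ["b1"], "strainC": ["c1"]},
    "OG2": {"strainA": ["a2"], "strainB": [], "strainC": ["c2"]},  # strainB empty
    "OG3": {"strainA": ["a3"], "strainC": ["c3"]},                 # strainB missing
}

kept = complete_orthogroups(orthogroups, samples)
print(sorted(kept))  # ['OG1']
```

Only `OG1` survives, because it is the only group with a gene from all three samples. Each surviving group would then be passed on to the alignment step.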