Publications

13 Publications visible to you, out of a total of 13

Abstract (Expand)

Preprint: https://arxiv.org/abs/2110.02168 The landscape of workflow systems for scientific applications is notoriously convoluted with hundreds of seemingly equivalent workflow systems, many isolatedd research claims, and a steep learning curve. To address some of these challenges and lay the groundwork for transforming workflows research and development, the WorkflowsRI and ExaWorks projects partnered to bring the international workflows community together. This paper reports on discussions and findings from two virtual "Workflows Community Summits" (January and April, 2021). The overarching goals of these workshops were to develop a view of the state of the art, identify crucial research challenges in the workflows community, articulate a vision for potential community efforts, and discuss technical approaches for realizing this vision. To this end, participants identified six broad themes: FAIR computational workflows; AI workflows; exascale challenges; APIs, interoperability, reuse, and standards; training and education; and building a workflows community. We summarize discussions and recommendations for each of these themes.

Authors: Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Ilkay Altintas, Rosa M Badia, Bartosz Balis, Taina Coleman, Frederik Coppens, Frank Di Natale, Bjoern Enders, Thomas Fahringer, Rosa Filgueira, Grigori Fursin, Daniel Garijo, Carole Goble, Dorran Howell, Shantenu Jha, Daniel S. Katz, Daniel Laney, Ulf Leser, Maciej Malawski, Kshitij Mehta, Loic Pottier, Jonathan Ozik, J. Luc Peterson, Lavanya Ramakrishnan, Stian Soiland-Reyes, Douglas Thain, Matthew Wolf

Date Published: 1st Nov 2021

Publication Type: Journal

Abstract (Expand)

In the recent years, the improvement of software and hardware performance has made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of the technologies accepted in other bioinformatics fields like automated deployment systems, workflow orchestration, or the use of software containers. We present here a comprehensive exercise to bring biomolecular simulations to the “bioinformatics way of working”. The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBB’s are built as Python wrappers to provide an interoperable architecture. BioBB’s have been integrated in a chain of usual software management tools to generate data ontologies, documentation, installation packages, software containers and ways of integration with workflow managers, that make them usable in most computational environments.

Authors: Pau Andrio, Adam Hospital, Javier Conejero, Luis Jordá, Marc Del Pino, Laia Codo, Stian Soiland-Reyes, Carole Goble, Daniele Lezzi, Rosa M. Badia, Modesto Orozco, Josep Ll. Gelpi

Date Published: 1st Dec 2019

Publication Type: Journal

Abstract (Expand)

Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.

Authors: Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober

Date Published: 2020

Publication Type: Journal

Abstract

Not specified

Authors: Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober

Date Published: 2020

Publication Type: Journal

Abstract (Expand)

Background A new era of flu surveillance has already started based on the genetic characterization and exploration of influenza virus evolution at whole-genome scale. Although this has been prioritizedd by national and international health authorities, the demanded technological transition to whole-genome sequencing (WGS)-based flu surveillance has been particularly delayed by the lack of bioinformatics infrastructures and/or expertise to deal with primary next-generation sequencing (NGS) data. Results We developed and implemented INSaFLU (“INSide the FLU”), which is the first influenza-oriented bioinformatics free web-based suite that deals with primary NGS data (reads) towards the automatic generation of the output data that are actually the core first-line “genetic requests” for effective and timely influenza laboratory surveillance (e.g., type and sub-type, gene and whole-genome consensus sequences, variants’ annotation, alignments and phylogenetic trees). By handling NGS data collected from any amplicon-based schema, the implemented pipeline enables any laboratory to perform multi-step software intensive analyses in a user-friendly manner without previous advanced training in bioinformatics. INSaFLU gives access to user-restricted sample databases and projects management, being a transparent and flexible tool specifically designed to automatically update project outputs as more samples are uploaded. Data integration is thus cumulative and scalable, fitting the need for a continuous epidemiological surveillance during the flu epidemics. Multiple outputs are provided in nomenclature-stable and standardized formats that can be explored in situ or through multiple compatible downstream applications for fine-tuned data analysis. This platform additionally flags samples as “putative mixed infections” if the population admixture enrolls influenza viruses with clearly distinct genetic backgrounds, and enriches the traditional “consensus-based” influenza genetic characterization with relevant data on influenza sub-population diversification through a depth analysis of intra-patient minor variants. This dual approach is expected to strengthen our ability not only to detect the emergence of antigenic and drug resistance variants but also to decode alternative pathways of influenza evolution and to unveil intricate routes of transmission. Conclusions In summary, INSaFLU supplies public health laboratories and influenza researchers with an open “one size fits all” framework, potentiating the operationalization of a harmonized multi-country WGS-based surveillance for influenza virus.

Authors: Vítor Borges, Miguel Pinheiro, Pedro Pechirra, Raquel Guiomar, João Paulo Gomes

Date Published: 1st Dec 2018

Publication Type: InProceedings

Abstract (Expand)

This report reviews the current state-of-the-art applied approaches on automated tools, services and workflows for extracting information from images of natural history specimens and their labels. We consider the potential for repurposing existing tools, including workflow management systems; and areas where more development is required. This paper was written as part of the SYNTHESYS+ project for software development teams and informatics teams working on new software-based approaches to improve mass digitisation of natural history specimens.

Authors: Stephanie Walton, Laurence Livermore, Olaf Bánki, Robert W. N. Cubey, Robyn Drinkwater, Markus Englund, Carole Goble, Quentin Groom, Christopher Kermorvant, Isabel Rey, Celia M Santos, Ben Scott, Alan Williams, Zhengzhe Wu

Date Published: 14th Aug 2020

Publication Type: Journal

Abstract (Expand)

We here introduce the concept of Canonical Workflow Building Blocks (CWBB), a methodology of describing and wrapping computational tools, in order for them to be utilized in a reproducible manner from multiple workflow languages and execution platforms. We argue such practice is a necessary requirement for FAIR Computational Workflows [Goble 2020] to improve widespread adoption and reuse of a computational method across workflow language barriers.

Authors: Stian Soiland-Reyes, Genís Bayarri, Pau Andrio, Robin Long, Douglas Lowe, Ania Niewielska, Adam Hospital

Date Published: 7th Mar 2021

Publication Type: Journal

Powered by
(v.1.12.0)
Copyright © 2008 - 2022 The University of Manchester and HITS gGmbH

By continuing to use this site you agree to the use of cookies