This workflow demonstrates how FAIR principles can be brought into the workflow management ecosystem by integrating provenance across two tools: Autosubmit, a workflow manager developed at the Barcelona Supercomputing Center (BSC), and SUNSET (SUbseasoNal to decadal climate forecast post-processing and aSSEssmenT suite), an R-based verification workflow also developed at BSC.
Autosubmit can generate data provenance information based on RO-Crate, producing machine-actionable digital objects that encapsulate detailed metadata about its executions. However, the provenance metadata provided by Autosubmit focuses on the workflow process and does not capture the details of the data transformation processes. This is where SUNSET plays a complementary role. SUNSET's provenance information is based on the METACLIP (METAdata for CLImate Products) ontologies, which offer a semantic approach to describing climate products and their provenance. This framework enables SUNSET to provide specific, fine-grained provenance metadata for its operations, improving transparency and compliance with FAIR principles. The generated files record each transformation the data has undergone, together with the data's state, location, structure, and associated source code, all represented in a tree-like structure.
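To make the idea of tree-like provenance concrete, the following is a minimal Python sketch of how such a structure could be walked to list the lineage of a product. The field names (`step`, `derived_from`) and the step names are hypothetical and chosen only for illustration; they are not the METACLIP schema or the actual contents of the SUNSET-generated files.

```python
def lineage(node, depth=0):
    """Yield (depth, step name) pairs in depth-first order."""
    yield depth, node["step"]
    # "derived_from" is a hypothetical key holding the upstream nodes.
    for parent in node.get("derived_from", []):
        yield from lineage(parent, depth + 1)


# Hypothetical provenance tree: a plot derived from a metric,
# which is in turn derived from two loaded datasets.
product = {
    "step": "scorecard_plot",
    "derived_from": [
        {
            "step": "skill_metric",
            "derived_from": [
                {"step": "load_forecast"},
                {"step": "load_reference"},
            ],
        },
    ],
}

steps = list(lineage(product))
for depth, name in steps:
    # Indent each step by its distance from the final product.
    print("  " * depth + name)
```

Reading the output from top to bottom goes from the final product back to its sources, which is the kind of question a provenance tree is meant to answer.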
The workflow uses a SUNSET configuration file, referred to as a "recipe," to generate a set of JSON files containing the provenance information of the workflow execution based on the METACLIP ontologies. To do so, we compute skill metrics and scorecard plots with SUNSET, using Autosubmit to dispatch jobs in parallel. In the recipe, we request three start dates, one each for January, February, and March (0101, 0201, 0301). SUNSET splits the recipe into three atomic recipes, and Autosubmit runs three jobs, processing the verification for each atomic recipe in parallel. Once all the scorecards have been generated, the "transfer_provenance" job is triggered, transferring the SUNSET-generated provenance files to the Autosubmit experiment folder. Finally, an RO-Crate object is created, encapsulating the entire process description.
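The splitting and dependency logic described above can be sketched in a few lines of Python. This is an illustration only, not SUNSET or Autosubmit code: the recipe keys (`start_dates`, `outputs`), the `verification_*` job names, and both helper functions are hypothetical. Only the three start dates and the "transfer_provenance" job name come from the description.

```python
from copy import deepcopy


def split_recipe(recipe):
    """Return one atomic recipe per requested start date."""
    atomic = []
    for sdate in recipe["start_dates"]:
        single = deepcopy(recipe)
        single["start_dates"] = [sdate]
        atomic.append(single)
    return atomic


def build_job_graph(atomic_recipes):
    """Map each job name to the list of jobs it depends on."""
    graph = {}
    for r in atomic_recipes:
        # Verification jobs have no dependencies, so they can run in parallel.
        graph[f"verification_{r['start_dates'][0]}"] = []
    verification_jobs = list(graph)
    # The transfer job waits for every verification job to finish.
    graph["transfer_provenance"] = verification_jobs
    return graph


# Hypothetical recipe layout; the real SUNSET recipe format may differ.
recipe = {
    "start_dates": ["0101", "0201", "0301"],
    "outputs": ["skill_metrics", "scorecards"],
}

atomic_recipes = split_recipe(recipe)
job_graph = build_job_graph(atomic_recipes)

for job, deps in job_graph.items():
    print(job, "<-", deps)
```

Representing the jobs as a mapping from job name to dependencies keeps the parallel part (three independent verification jobs) and the synchronisation point (the transfer job) explicit.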
Currently, this workflow can only be executed within the BSC infrastructure. The complete use case is described in the Use Case Documentation.
The METACLIP-based JSON files can be interactively visualized using the METACLIP Interpreter.
Version History
Version 1 (earliest) Created 12th Feb 2025 at 09:25 by Albert Puiggros


