CWL Molecular Structure Checking
Version 1

Workflow Type: Common Workflow Language
Stable

Molecular Structure Checking using BioExcel Building Blocks (biobb)


This tutorial aims to illustrate the process of checking a molecular structure before using it as an input for a Molecular Dynamics simulation. The workflow uses the BioExcel Building Blocks library (biobb). The particular structure used is the crystal structure of human Adenylate Kinase 1A (AK1A), in complex with the AP5A inhibitor (PDB code 1Z83).

Structure checking is a key step before setting up a protein system for simulations. A number of common issues found in structures from the Protein Data Bank may compromise the success of the simulation, or may suggest that longer equilibration procedures are necessary.

The workflow shows how to:

  • Run basic manipulations on structures (selection of models, chains, alternative locations)
  • Detect and fix amide assignments and wrong chiralities
  • Detect and fix protein backbone issues (missing fragments and atoms, capping)
  • Detect and fix missing side-chain atoms
  • Add hydrogen atoms according to several criteria
  • Detect and classify atomic clashes
  • Detect possible disulfide bonds (SS)
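
To make the first item in the list above concrete, here is a minimal sketch of what "models, chains and alternative locations" look like at the file level. It is not how the biobb tools are implemented; it only reads the fixed-column PDB format with the Python standard library. The function name summarize_pdb and the four sample records are made up for illustration and are not taken from 1Z83.

```python
# Illustrative only: the sample records are invented, not taken from 1Z83.
SAMPLE_PDB = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA ASER A   2      26.266  25.413   2.842  0.60 10.38           C",
    "ATOM      3  CA BSER A   2      26.300  25.400   2.900  0.40 10.38           C",
    "ATOM      4  N   GLY B   1      10.000  11.000  12.000  1.00 12.00           N",
])


def summarize_pdb(pdb_text):
    """Return model count, chain IDs, and residues carrying alternate locations."""
    models = 0
    chains = set()
    altloc_residues = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            models += 1
        elif record in ("ATOM", "HETATM"):
            alt_loc = line[16]          # column 17: alternate location indicator
            res_name = line[17:20]      # columns 18-20: residue name
            chain_id = line[21]         # column 22: chain identifier
            res_seq = int(line[22:26])  # columns 23-26: residue sequence number
            chains.add(chain_id)
            if alt_loc != " ":
                altloc_residues.add((chain_id, res_name.strip(), res_seq))
    return {
        # A file without MODEL records holds a single implicit model.
        "models": max(models, 1),
        "chains": sorted(chains),
        "altlocs": sorted(altloc_residues),
    }


summary = summarize_pdb(SAMPLE_PDB)
```

In this sample, residue SER 2 of chain A appears with two alternate locations (A and B), which is the kind of ambiguity that has to be resolved to a single conformation before a simulation.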

An implementation of this workflow in a web-based Graphical User Interface (GUI) can be found in the https://mmb.irbbarcelona.org/biobb-wfs/ server (see https://mmb.irbbarcelona.org/biobb-wfs/help/create/structure#check).


Copyright & Licensing

This software has been developed in the MMB group at the BSC & IRB for the European BioExcel, funded by the European Commission (EU H2020 823830, EU H2020 675728).

Licensed under the Apache License 2.0, see the file LICENSE for details.


Inputs

ID Name Description Type
step0_structure_check_init_input_structure_path Input file Input structure file path.
  • File
step0_structure_check_init_output_summary_path Output file Output summary checking results.
  • string
step1_extract_model_input_structure_path Input file Input structure file path.
  • File
step1_extract_model_output_structure_path Output file Output structure file path.
  • string
step1_extract_model_config Config file Configuration file for biobb_structure_utils.extract_model tool.
  • string
step2_extract_chain_output_structure_path Output file Output structure file path.
  • string
step2_extract_chain_config Config file Configuration file for biobb_structure_utils.extract_chain tool.
  • string
step3_fix_altlocs_output_pdb_path Output file Output PDB file path.
  • string
step3_fix_altlocs_config Config file Configuration file for biobb_model.fix_altlocs tool.
  • string
step4_fix_ssbonds_output_pdb_path Output file Output PDB file path.
  • string
step5_remove_molecules_ions_output_molecules_path Output file Output molecules file path.
  • string
step5_remove_molecules_ions_config Config file Configuration file for biobb_structure_utils.remove_molecules tool.
  • string
step6_remove_molecules_ligands_output_molecules_path Output file Output molecules file path.
  • string
step6_remove_molecules_ligands_config Config file Configuration file for biobb_structure_utils.remove_molecules tool.
  • string
step7_reduce_remove_hydrogens_output_path Output file Path to the output file.
  • string
step8_remove_pdb_water_output_pdb_path Output file Output PDB file path.
  • string
step9_fix_amides_output_pdb_path Output file Output PDB file path.
  • string
step10_fix_chirality_output_pdb_path Output file Output PDB file path.
  • string
step11_fix_side_chain_output_pdb_path Output file Output PDB file path.
  • string
step12_fix_backbone_input_fasta_canonical_sequence_path Input file Input FASTA file path.
  • File
step12_fix_backbone_output_pdb_path Output file Output PDB file path.
  • string
step13_leap_gen_top_output_pdb_path Output file Output 3D structure PDB file matching the topology file.
  • string
step13_leap_gen_top_output_top_path Output file Output topology file (AMBER ParmTop).
  • string
step13_leap_gen_top_output_crd_path Output file Output coordinates file (AMBER crd).
  • string
step13_leap_gen_top_config Config file Configuration file for biobb_amber.leap_gen_top tool.
  • string
step14_sander_mdrun_output_traj_path Output file Output trajectory file.
  • string
step14_sander_mdrun_output_rst_path Output file Output restart file.
  • string
step14_sander_mdrun_output_log_path Output file Output log file.
  • string
step14_sander_mdrun_config Config file Configuration file for biobb_amber.sander_mdrun tool.
  • string
step15_amber_to_pdb_output_pdb_path Output file Structure PDB file.
  • string
step16_fix_pdb_output_pdb_path Output file Output PDB file path.
  • string
step17_structure_check_output_summary_path Output file Output summary checking results.
  • string
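
As a rough illustration of how the input table above maps onto a CWL job file, the sketch below builds a job object in Python and serializes it as JSON. In CWL, inputs typed as File are passed as objects with "class": "File" and a path, while string inputs are passed as plain strings. The helper name build_job, all file names, and the content of the config string are assumptions made for this example. Only a few of the string inputs are filled in; the remaining ones would follow the same pattern.

```python
import json

# Input IDs typed as File in the table above.
FILE_INPUTS = (
    "step0_structure_check_init_input_structure_path",
    "step1_extract_model_input_structure_path",
    "step12_fix_backbone_input_fasta_canonical_sequence_path",
)


def build_job(file_paths, string_values):
    """Build a CWL job object: File inputs become File objects, strings pass through."""
    missing = [key for key in FILE_INPUTS if key not in file_paths]
    if missing:
        raise ValueError(f"missing File inputs: {missing}")
    job = {key: {"class": "File", "path": file_paths[key]} for key in FILE_INPUTS}
    job.update(string_values)
    return job


job = build_job(
    file_paths={
        # Hypothetical local file names.
        "step0_structure_check_init_input_structure_path": "1z83.pdb",
        "step1_extract_model_input_structure_path": "1z83.pdb",
        "step12_fix_backbone_input_fasta_canonical_sequence_path": "1z83.fasta",
    },
    string_values={
        # Hypothetical output names; the other string inputs follow the same pattern.
        "step0_structure_check_init_output_summary_path": "summary_init.json",
        "step1_extract_model_output_structure_path": "model.pdb",
        # Assumed config content, shown only to illustrate a JSON string value.
        "step1_extract_model_config": json.dumps({"models": [1]}),
    },
)

job_text = json.dumps(job, indent=2)
```

The resulting text could be saved to a job file and passed to a CWL runner together with the workflow description.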

Steps

ID Name Description
step0_structure_check_init structure_check This class is a wrapper of the Structure Checking tool to generate summary checking results in a JSON file.
step1_extract_model extract_model This class is a wrapper of the Structure Checking tool to extract a model from a 3D structure.
step2_extract_chain extract_chain This class is a wrapper of the Structure Checking tool to extract a chain from a 3D structure.
step3_fix_altlocs fix_altlocs Fix alternate locations from residues.
step4_fix_ssbonds fix_ssbonds Fix SS bonds from residues.
step5_remove_molecules_ions remove_molecules Class to remove molecules from a 3D structure using Biopython.
step6_remove_molecules_ligands remove_molecules Class to remove molecules from a 3D structure using Biopython.
step7_reduce_remove_hydrogens reduce_remove_hydrogens Removes hydrogen atoms from small molecules.
step8_remove_pdb_water remove_pdb_water This class is a wrapper of the Structure Checking tool to remove water molecules from PDB 3D structures.
step9_fix_amides fix_amides Creates a new PDB file flipping the clashing amide groups.
step10_fix_chirality fix_chirality Creates a new PDB file fixing stereochemical errors in residue side-chains by changing their chirality.
step11_fix_side_chain fix_side_chain Reconstructs the missing side chains and heavy atoms of the given PDB file.
step12_fix_backbone fix_backbone Reconstructs the missing backbone atoms of the given PDB file.
step13_leap_gen_top leap_gen_top Generates an MD topology from a molecule structure using the tLeap tool from the AmberTools MD package.
step14_sander_mdrun sander_mdrun Runs energy minimization, molecular dynamics, and NMR refinements using the sander tool from the AmberTools MD package.
step15_amber_to_pdb amber_to_pdb Generates a PDB structure from AMBER topology (parmtop) and coordinates (crd) files, using the ambpdb tool from the AmberTools MD package.
step16_fix_pdb fix_pdb Renumbers residues in a PDB structure according to a reference sequence from UniProt.
step17_structure_check structure_check This class is a wrapper of the Structure Checking tool to generate summary checking results in a JSON file.

Outputs

ID Name Description Type
step0_structure_check_init_out1 output_summary_path Output summary checking results.
  • File
step1_extract_model_out1 output_structure_path Output structure file path.
  • File
step2_extract_chain_out1 output_structure_path Output structure file path.
  • File
step3_fix_altlocs_out1 output_pdb_path Output PDB file path.
  • File
step4_fix_ssbonds_out1 output_pdb_path Output PDB file path.
  • File
step5_remove_molecules_ions_out1 output_molecules_path Output molecules file path.
  • File
step6_remove_molecules_ligands_out1 output_molecules_path Output molecules file path.
  • File
step7_reduce_remove_hydrogens_out1 output_path Path to the output file.
  • File
step8_remove_pdb_water_out1 output_pdb_path Output PDB file path.
  • File
step9_fix_amides_out1 output_pdb_path Output PDB file path.
  • File
step10_fix_chirality_out1 output_pdb_path Output PDB file path.
  • File
step11_fix_side_chain_out1 output_pdb_path Output PDB file path.
  • File
step12_fix_backbone_out1 output_pdb_path Output PDB file path.
  • File
step13_leap_gen_top_out1 output_pdb_path Output 3D structure PDB file matching the topology file.
  • File
step13_leap_gen_top_out2 output_top_path Output topology file (AMBER ParmTop).
  • File
step13_leap_gen_top_out3 output_crd_path Output coordinates file (AMBER crd).
  • File
step14_sander_mdrun_out1 output_traj_path Output trajectory file.
  • File
step14_sander_mdrun_out2 output_rst_path Output restart file.
  • File
step14_sander_mdrun_out3 output_log_path Output log file.
  • File
step15_amber_to_pdb_out1 output_pdb_path Structure PDB file.
  • File
step16_fix_pdb_out1 output_pdb_path Output PDB file path.
  • File
step17_structure_check_out1 output_summary_path Output summary checking results.
  • File
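
All outputs listed above are of type File. A CWL runner reports such outputs as a JSON object keyed by output ID, where each File value carries fields such as "class", "location" and "basename". The sketch below shows one way to turn that object into a mapping from output ID to local path. The helper name output_paths and the sample object (including its paths) are hypothetical and are not real results of this workflow.

```python
import json
from urllib.parse import unquote, urlparse


def output_paths(cwl_output_json):
    """Map each File output ID to the local path taken from its 'location' URI."""
    outputs = json.loads(cwl_output_json)
    paths = {}
    for output_id, value in outputs.items():
        # Skip anything that is not a CWL File object.
        if isinstance(value, dict) and value.get("class") == "File":
            paths[output_id] = unquote(urlparse(value["location"]).path)
    return paths


# Hypothetical runner output, trimmed to two of the output IDs listed above.
SAMPLE_OUTPUT = json.dumps({
    "step16_fix_pdb_out1": {
        "class": "File",
        "location": "file:///tmp/run/fixed.pdb",
        "basename": "fixed.pdb",
    },
    "step17_structure_check_out1": {
        "class": "File",
        "location": "file:///tmp/run/summary_final.json",
        "basename": "summary_final.json",
    },
})

paths = output_paths(SAMPLE_OUTPUT)
```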

Version History

Version 1 (earliest) Created 5th Mar 2024 at 08:41 by Genís Bayarri

Initial commit


Frozen Version-1 24d3421
Citation
Hospital, A., & Bayarri, G. (2024). CWL Molecular Structure Checking. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.776.1