Workflow Type: Galaxy
Stable

High-Performance Computing (HPC) environments are integral to quantum chemistry and other computationally intensive research, yet their complexity poses challenges for non-HPC experts. Researchers without extensive computational knowledge struggle to navigate these environments, which hinders efficient use of domain-specific research software. As a result, the prediction of mass spectra for in silico annotation is inaccessible to many wet-lab scientists. Our main goal is to help non-experts navigate this complexity and to make semi-empirical Quantum Chemistry (QC)-based predictions available without requiring advanced computational skills. To address this challenge, we propose a comprehensive approach. We chose specific file formats for storing molecular structures to ensure compatibility across diverse tools and platforms. We use the xTB quantum chemistry package for molecular geometry optimization because it balances accuracy against computational cost, which makes it well suited to non-HPC-focused applications. Integrating QC-based Mass Spectrometry (QCxMS) into Galaxy enables the prediction of mass spectra and offers insights into molecular composition and properties. Our workflow demonstrates the utility of computing spectra using QCxMS along with complementary tools. We also present runtime performance metrics for four distinct molecules. This work shows how non-HPC users can run these predictions with ease. Additionally, a Docker image encapsulates the necessary tools and is accompanied by user-friendly wrappers, simplifying the entire process for non-expert users. Within this context, we consider potential improvements, in particular improving the performance of the Conda package by incorporating Fortran and Intel compiler optimizations. These considerations play a crucial role in refining the proposed methodology, enhancing user experience, and expanding the reach of semi-empirical quantum chemistry for mass spectra prediction.

Inputs

ID Name Description Type
Input Molecules with SMILES and NAME without a header. Input Molecules with SMILES and NAME without a header. The first column should contain the name of the molecule, the second the SMILES code.
  • File
Number of conformers to generate Number of conformers to generate Defaults to one conformer.
  • int?
Optimization Levels Optimization Levels Level of accuracy for the optimization
  • string
QC Method QC Method Available: GFN1-xTB and GFN2-xTB
  • string
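The input table described above can be prepared and checked with a few lines of Python. This is a minimal sketch, assuming the file is tab-separated (the usual Galaxy tabular layout, which this page does not state explicitly). The function names `write_molecule_table` and `read_molecule_table` and the two example molecules are illustrative and not part of the workflow.

```python
import csv
import io


def write_molecule_table(molecules, handle):
    """Write (name, SMILES) pairs as a header-less, tab-separated table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for name, smiles in molecules:
        writer.writerow([name, smiles])


def read_molecule_table(handle):
    """Read the table back, rejecting rows without exactly two non-empty columns."""
    rows = []
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if len(row) != 2 or not row[0].strip() or not row[1].strip():
            raise ValueError(
                f"line {line_number}: expected NAME<TAB>SMILES, got {row!r}"
            )
        rows.append((row[0].strip(), row[1].strip()))
    return rows


# Example molecules (illustrative, not taken from the workflow's test data).
example = [("ethanol", "CCO"), ("benzene", "c1ccccc1")]

buffer = io.StringIO()
write_molecule_table(example, buffer)
table_text = buffer.getvalue()

# Round-trip check: name in the first column, SMILES in the second, no header.
parsed = read_molecule_table(io.StringIO(table_text))
```

Checking the table locally before upload catches swapped columns or a stray header line early, which would otherwise only surface as a failed conversion step.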

Steps

ID Name Description
4 Cut SMILES column toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cut_tool/9.3+galaxy1
5 Cut NAME column toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cut_tool/9.3+galaxy1
6 Split file toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2
7 Split file toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2
8 Parse parameter value param_value_from_file
9 Convert compounds from SMILES to SDF and add the name as title. toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0
10 Merge the individual SDF files toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/9.3+galaxy1
11 Generate conformers Generate 3D conformers from the SDF input for each molecule. Requires the number of conformers as an input parameter; the default value is 1. toolshed.g2.bx.psu.edu/repos/bgruening/ctb_im_conformers/ctb_im_conformers/1.1.4+galaxy0
12 Molecular format conversion Convert the conformer to Cartesian coordinate (XYZ) format toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0
13 xTB molecular optimization Semi-empirical optimization toolshed.g2.bx.psu.edu/repos/recetox/xtb_molecular_optimization/xtb_molecular_optimization/6.6.1+galaxy3
14 QCxMS neutral run Prepare the input files needed for the production runs toolshed.g2.bx.psu.edu/repos/recetox/qcxms_neutral_run/qcxms_neutral_run/5.2.1+galaxy4
15 QCxMS production run Calculate the mass spectrum for a given molecule using QCxMS. It generates .res files, which are collected and converted into MSP format in the last step. toolshed.g2.bx.psu.edu/repos/recetox/qcxms_production_run/qcxms_production_run/5.2.1+galaxy3
16 Filter failed datasets Remove failed runs __FILTER_FAILED_DATASETS__
17 QCxMS get results Write the simulated mass spectra to a file in MSP format. toolshed.g2.bx.psu.edu/repos/recetox/qcxms_getres/qcxms_getres/5.2.1+galaxy2
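Steps 4 to 7 take the two-column table apart so that each molecule can be processed on its own. The following is a rough sketch of that idea in plain Python, not the Galaxy tools themselves. It assumes a tab-separated table with the name in the first column and the SMILES in the second. The helper names `cut_column` and `split_into_records` are hypothetical, and the actual tool settings are not shown on this page.

```python
def cut_column(table_text, column_index):
    """Return one column of a tab-separated table (analogous to steps 4 and 5)."""
    return [
        line.split("\t")[column_index]
        for line in table_text.splitlines()
        if line.strip()
    ]


def split_into_records(values):
    """Turn a column into one single-line record per molecule (analogous to steps 6 and 7)."""
    return [value + "\n" for value in values]


# Illustrative input: name in the first column, SMILES in the second.
table_text = "ethanol\tCCO\nbenzene\tc1ccccc1\n"

names = cut_column(table_text, 0)
smiles = cut_column(table_text, 1)

name_files = split_into_records(names)
smiles_files = split_into_records(smiles)

# Keeping both lists in the same order lets each SMILES be paired with its
# name again later, for example when the name is used as the SDF title.
paired = list(zip(names, smiles))
```

Splitting into one record per molecule is what allows the later conversion, optimization and QCxMS steps to run once per molecule and to be filtered individually if a run fails.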

Outputs

ID Name Description Type
conformer_output conformer_output n/a
  • File
XYZ output XYZ output n/a
  • File
optimized output optimized output n/a
  • File
[.in] output [.in] output n/a
  • File
[.start] output [.start] output n/a
  • File
[.xyz] output [.xyz] output n/a
  • File
res output res output n/a
  • File
MSP output MSP output n/a
  • File
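The final MSP output is a text file that can be read back for downstream annotation. Below is a minimal parsing sketch, assuming the common MSP layout: `Key: value` metadata lines, then one `m/z intensity` pair per line, with a blank line between spectra. The exact fields written by the QCxMS tools are not listed on this page, so the function `parse_msp` and the sample text (with made-up peak values) are illustrative only.

```python
def parse_msp(text):
    """Parse MSP-style text into a list of {'metadata': dict, 'peaks': list} records."""
    spectra = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current spectrum record.
            if current is not None:
                spectra.append(current)
                current = None
            continue
        if current is None:
            current = {"metadata": {}, "peaks": []}
        if ":" in line:
            # Metadata line such as "NAME: ethanol" or "Num Peaks: 2".
            key, value = line.split(":", 1)
            current["metadata"][key.strip().lower()] = value.strip()
        else:
            # Peak line: m/z followed by intensity, separated by whitespace.
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
    if current is not None:
        spectra.append(current)
    return spectra


# Made-up example content, not real QCxMS output.
sample = (
    "NAME: ethanol\n"
    "Num Peaks: 2\n"
    "31.0 100.0\n"
    "45.0 50.0\n"
    "\n"
    "NAME: benzene\n"
    "Num Peaks: 1\n"
    "78.0 100.0\n"
)

spectra = parse_msp(sample)
```

Metadata keys are lower-cased so that lookups such as `spectra[0]["metadata"]["name"]` do not depend on how a particular writer capitalizes field names.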

Version History

Galaxy Workflow End-to-end EI mass spectra prediction workflow using QCxMS (latest) Created 5th Aug 2024 at 14:53 by Helge Hecht

New version starting from a table with SMILES and NAMES to generate an SDF and then run the previous workflow.


Open master 1a50bdb

Version 1 (earliest) Created 3rd Jun 2024 at 14:52 by Wudmir Rojas

qcxms galaxy workflow


Frozen Version-1 0007a6a
Creators and Submitter
Creators
Additional credit

RECETOX SpecDat

Created: 3rd Jun 2024 at 14:52

Last updated: 5th Aug 2024 at 15:12


Total size: 318 KB