MetaboAnnotatoR is designed to perform metabolite annotation of features from LC-MS All-ion fragmentation (AIF) datasets, using ion fragment databases. It requires raw LC-MS AIF chromatograms acquired/transformed in centroid mode.
To install this package, start R (version “4.5.0” or higher) and enter:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("MetaboAnnotatoR")
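Before continuing, it can be useful to confirm that the installation succeeded. The check below is a minimal sketch using only base R functions; the variable name `installed` is illustrative and not part of the package.

```r
# Check whether MetaboAnnotatoR is available without attaching it.
installed <- requireNamespace("MetaboAnnotatoR", quietly = TRUE)

if (installed) {
    # Report the installed version.
    message("MetaboAnnotatoR version: ",
            as.character(packageVersion("MetaboAnnotatoR")))
} else {
    message("MetaboAnnotatoR is not installed; run the BiocManager command above.")
}
```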
An example of feature annotation using LC-MS AIF chromatograms processed with the xcms and RAMClustR packages is illustrated here. For full details of how the example dataset was obtained, see the original MetaboAnnotatoR paper: https://pubs.acs.org/doi/10.1021/acs.analchem.1c03032.
For more details on the RAMClustR object, see the original publication: https://pubs.acs.org/doi/10.1021/ac501530d.
First, load the library and its dependencies:
library(MetaboAnnotatoR)
As input, MetaboAnnotatoR requires a data frame containing the features to be annotated and either a raw AIF LC-MS chromatogram (as .mzML or CDF) or a processed dataset composed of two objects: a RAMClustR object (containing the pseudo-MS/MS spectra) and an XCMS object (containing the peak-picked data). Additionally, the fragment libraries need to be specified.
First, a data table (targets) containing the features to annotate needs to be loaded. MetaboAnnotatoR includes an example feature table (targetTable.csv), which will be used in this example.
tfile <- system.file("extdata", "targetTable.csv", package="MetaboAnnotatoR")
targets <- read.csv(tfile)
This table contains 6 features from an LC-MS lipidomics (ESI+) chromatogram to be annotated.
The example in this vignette uses processed data included in the package. These consist of: 1) an xcmsSet object (xset) containing the processed data from 100 AIF LC-MS chromatograms of human serum samples and 2) the respective pseudo-MS/MS spectra obtained by processing the xcmsSet data with RAMClustR (RC). The data can be loaded as follows:
data("xset")
data("RC")
Since the features come from an ESI+ lipidomics experiment, annotation can be performed using the default lipid positive-mode libraries, “LipidPos”. For this, the default libraries must first be loaded into the workspace:
data("LipidPos")
Annotations can then be performed using the annotateRC function. The results will be stored in an object (annotations):
annotations <- annotateRC(targets, xcmsObject=xset, ramclustObj=RC,
                          libs="LipidPos")
#> No RT information provided...
#> ... Processing feature 1 of 6 ...
#> Searching candidates...
#> ... Processing feature 2 of 6 ...
#> Searching candidates...
#> ... Processing feature 3 of 6 ...
#> Searching candidates...
#> Matching fragments to pseudo-MS/MS and highCE spectra...
#> ... Processing feature 4 of 6 ...
#> Searching candidates...
#> Matching fragments to pseudo-MS/MS and highCE spectra...
#> ... Processing feature 5 of 6 ...
#> Searching candidates...
#> Matching fragments to pseudo-MS/MS and highCE spectra...
#> ... Processing feature 6 of 6 ...
#> Searching candidates...
#> Matching fragments to pseudo-MS/MS and highCE spectra...
#> Job done!
The most significant annotations (rank 1 annotations) for each feature are summarised in the global results object within the annotations object:
annotations$global
#> feature.mz feature.rt metabolite feature.type ion.type isotope mz.metabolite
#> 1 286.1442 40.77069 <NA> <NA> <NA> M+0 NA
#> 2 585.2692 72.79411 <NA> <NA> <NA> M+0 NA
#> 3 468.3095 82.92009 LPC(14:0) parent [M+H]+ M+0 468.3085
#> 4 520.3409 100.62388 LPC(18:2) parent [M+H]+ M+0 520.3398
#> 5 496.3410 113.59412 <NA> <NA> <NA> M+0 NA
#> 6 478.2938 104.22690 LPE(18:2) parent [M+H]+ M+0 478.2928
#> matched.mz mz.error pseudoMSMS fraction score
#> 1 NA NA FALSE <NA> NA
#> 2 NA NA TRUE <NA> NA
#> 3 468.3085 2.026865 TRUE 3 of 4 0.5716864
#> 4 520.3398 2.014641 FALSE 3 of 4 0.4231832
#> 5 NA NA FALSE <NA> NA
#> 6 478.2928 2.017588 FALSE 1 of 5 0.2540706
Three of the six features were annotated as a lipid.
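To list only the annotated features, the rows with a non-missing metabolite can be selected. The sketch below uses a small stand-in data frame (`global_example`, a hypothetical name) holding a few columns copied from the printed output; with the real results the same subsetting would be applied to annotations$global, assuming it behaves as a regular data frame, as the printed output suggests.

```r
# Stand-in for annotations$global, with values copied from the printed output.
global_example <- data.frame(
    feature.mz = c(286.1442, 585.2692, 468.3095, 520.3409, 496.3410, 478.2938),
    feature.rt = c(40.77069, 72.79411, 82.92009, 100.62388, 113.59412, 104.22690),
    metabolite = c(NA, NA, "LPC(14:0)", "LPC(18:2)", NA, "LPE(18:2)"),
    score = c(NA, NA, 0.5716864, 0.4231832, NA, 0.2540706)
)

# Keep only the features that received an annotation.
annotated <- global_example[!is.na(global_example$metabolite), ]

# Order the annotated features by decreasing score.
annotated <- annotated[order(annotated$score, decreasing = TRUE), ]
print(annotated)
```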
It is also possible to inspect if there were other candidate annotations for a given feature, for instance feature 3: 468.3095 m/z, 82.92009 s. This information can be accessed from the rankedResult object stored in the annotations. For feature 3, it is accessed as follows:
annotations$rankedResult[[3]]
#> feature.mz feature.rt metabolite feature.type ion.type
#> 17 468.3095 82.92009 LPC(14:0) parent [M+H]+
#> 24.3 468.3095 82.92009 PC(20:0) PC(6:0_14:0) fragment [LPC_tail2]+
#> 24.4 468.3095 82.92009 PC(21:3) PC(7:3_14:0) fragment [LPC_tail2]+
#> 24.1 468.3095 82.92009 PC(33:1) PC(14:0_19:1) fragment [LPC_tail1]+
#> 24.2 468.3095 82.92009 PC(33:4) PC(14:0_19:4) fragment [LPC_tail1]+
#> 19 468.3095 82.92009 LPE(17:0) parent [M+H]+
#> isotope mz.metabolite matched.mz mz.error pseudoMSMS fraction score
#> 17 M+0 468.3085 468.3085 2.026865 TRUE 3 of 4 0.5716864
#> 24.3 M+0 566.3817 468.3087 1.717241 TRUE 4 of 9 0.4765815
#> 24.4 M+0 574.3504 468.3087 1.717241 TRUE 3 of 9 0.4203315
#> 24.1 M+0 746.5696 468.3087 1.717241 TRUE 3 of 9 0.3161648
#> 24.2 M+0 740.5226 468.3087 1.717241 TRUE 3 of 9 0.3161648
#> 19 M+0 468.3085 468.3085 2.069572 TRUE 2 of 5 0.2665959
#> rank
#> 17 1
#> 24.3 2
#> 24.4 3
#> 24.1 4
#> 24.2 4
#> 19 5
The rank 1 annotation is LPC(14:0). However, this feature could also be annotated (with a lower score, and hence lower confidence) as a fragment of several PCs that also contain the 14:0 fatty acyl chain.
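The rank column in the output for feature 3 is consistent with a dense ranking of the scores: tied candidates share a rank and the next rank is not skipped. The sketch below reproduces this from the printed scores and also converts the "k of n" strings in the fraction column to numeric proportions. It is an illustration in base R only, not necessarily how the package computes ranks internally; the names `dense_rank` and `matched_prop` are hypothetical.

```r
# Scores and fractions copied from the ranked output for feature 3.
score <- c(0.5716864, 0.4765815, 0.4203315, 0.3161648, 0.3161648, 0.2665959)
fraction <- c("3 of 4", "4 of 9", "3 of 9", "3 of 9", "3 of 9", "2 of 5")

# Dense ranking: position of each score among the distinct scores, highest first.
dense_rank <- match(score, sort(unique(score), decreasing = TRUE))

# Convert each "k of n" string into the numeric proportion k / n.
parts <- strsplit(fraction, " of ", fixed = TRUE)
matched_prop <- vapply(parts,
                       function(p) as.numeric(p[1]) / as.numeric(p[2]),
                       numeric(1))

print(data.frame(score, fraction, matched_prop, dense_rank))
```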
It is possible to visualise the spectra containing the ions matched to each candidate. The example code below plots the rank 1 candidate for the annotation of the 3rd feature of the targets table:
plotResultSpec(annotations, 3, 1)
It is possible to save the annotation results to a user-specified directory. By default, the global annotations are saved to the specified directory. The annotation options can also be saved, as can the pseudo-MS/MS spectra of each matched candidate (as .pdf) and any pseudo-MS/MS spectra (as .mgf files). For this example we’ll make use of a temporary directory.
exampleDir <- tempdir()
saveAnnotations(annotations, DirPath=exampleDir, saveOptions=TRUE,
                saveXCMSoptions=FALSE, saveRanked=TRUE,
                saveRankedSpec=TRUE, savePseudoMSMS=TRUE)
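Because tempdir() is shared by the whole R session, writing to a dedicated subfolder can make the saved files easier to find. The sketch below uses base R only; the folder name and the variables `outDir` and `savedFiles` are hypothetical, and the saveAnnotations call is left as a comment since it needs the annotations object created earlier.

```r
# Create a dedicated output folder inside the session's temporary directory.
outDir <- file.path(tempdir(), "MetaboAnnotatoR_example")  # hypothetical folder name
dir.create(outDir, showWarnings = FALSE, recursive = TRUE)

# With the package loaded, results could then be written there, e.g.:
# saveAnnotations(annotations, DirPath=outDir, saveOptions=TRUE,
#                 saveXCMSoptions=FALSE, saveRanked=TRUE,
#                 saveRankedSpec=TRUE, savePseudoMSMS=TRUE)

# List whatever has been written to the folder (empty until results are saved).
savedFiles <- list.files(outDir, recursive = TRUE)
print(savedFiles)
```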
sessionInfo()
#> R version 4.6.0 alpha (2026-04-05 r89794)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] MetaboAnnotatoR_0.99.21 MSnbase_2.37.3 ProtGenerics_1.43.0
#> [4] S4Vectors_0.49.1 mzR_2.45.1 Rcpp_1.1.1
#> [7] Biobase_2.71.0 BiocGenerics_0.57.0 generics_0.1.4
#> [10] xcms_4.9.2 BiocParallel_1.45.0 BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] DBI_1.3.0 rlang_1.2.0
#> [3] magrittr_2.0.5 clue_0.3-68
#> [5] MassSpecWavelet_1.77.0 otel_0.2.0
#> [7] matrixStats_1.5.0 compiler_4.6.0
#> [9] PTMods_0.99.6 systemfonts_1.3.2
#> [11] vctrs_0.7.3 reshape2_1.4.5
#> [13] stringr_1.6.0 crayon_1.5.3
#> [15] pkgconfig_2.0.3 MetaboCoreUtils_1.19.2
#> [17] fastmap_1.2.0 magick_2.9.1
#> [19] XVector_0.51.0 labeling_0.4.3
#> [21] rmarkdown_2.31 preprocessCore_1.73.0
#> [23] ragg_1.5.2 tinytex_0.59
#> [25] purrr_1.2.2 xfun_0.57
#> [27] MultiAssayExperiment_1.37.4 cachem_1.1.0
#> [29] jsonlite_2.0.0 progress_1.2.3
#> [31] DelayedArray_0.37.1 prettyunits_1.2.0
#> [33] parallel_4.6.0 cluster_2.1.8.2
#> [35] R6_2.6.1 bslib_0.10.0
#> [37] stringi_1.8.7 RColorBrewer_1.1-3
#> [39] limma_3.67.1 GenomicRanges_1.63.2
#> [41] jquerylib_0.1.4 Seqinfo_1.1.0
#> [43] bookdown_0.46 SummarizedExperiment_1.41.1
#> [45] iterators_1.0.14 knitr_1.51
#> [47] IRanges_2.45.0 Matrix_1.7-5
#> [49] igraph_2.2.3 tidyselect_1.2.1
#> [51] dichromat_2.0-0.1 abind_1.4-8
#> [53] yaml_2.3.12 doParallel_1.0.17
#> [55] codetools_0.2-20 affy_1.89.0
#> [57] lattice_0.22-9 tibble_3.3.1
#> [59] plyr_1.8.9 withr_3.0.2
#> [61] S7_0.2.1 evaluate_1.0.5
#> [63] Spectra_1.21.7 pillar_1.11.1
#> [65] affyio_1.81.0 BiocManager_1.30.27
#> [67] MatrixGenerics_1.23.0 foreach_1.5.2
#> [69] MALDIquant_1.22.3 ncdf4_1.24
#> [71] hms_1.1.4 ggplot2_4.0.2
#> [73] scales_1.4.0 MsExperiment_1.13.1
#> [75] glue_1.8.0 MsFeatures_1.19.0
#> [77] lazyeval_0.2.3 tools_4.6.0
#> [79] mzID_1.49.1 data.table_1.18.2.1
#> [81] QFeatures_1.21.2 vsn_3.79.6
#> [83] fs_2.0.1 XML_3.99-0.23
#> [85] grid_4.6.0 impute_1.85.0
#> [87] tidyr_1.3.2 MsCoreUtils_1.23.7
#> [89] PSMatch_1.15.3 cli_3.6.6
#> [91] textshaping_1.0.5 S4Arrays_1.11.1
#> [93] dplyr_1.2.1 AnnotationFilter_1.35.0
#> [95] pcaMethods_2.3.0 gtable_0.3.6
#> [97] sass_0.4.10 digest_0.6.39
#> [99] SparseArray_1.11.13 farver_2.1.2
#> [101] htmltools_0.5.9 lifecycle_1.0.5
#> [103] statmod_1.5.1 MASS_7.3-65