Rega 0.99.6
library(Rega)
The Rega package follows security best practices by storing sensitive information (such as API keys or passwords) in a credential store or as environment variables rather than hard-coding it into your scripts.
To keep your credentials secure, we offer two options (see below for details):
You can add an entry to your operating system credential store using the keyring
package. By default, Rega looks for the REGA_EGA service name. You should
also specify your username to avoid typing it every time you connect to the
API. Avoid storing more than one user for this service: for simplicity,
Rega only retrieves the first username.
# You will be prompted for password
keyring::key_set(
service = "REGA_EGA",
username = "<your-ega-username>"
)
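Because only the first username is retrieved, it can help to check how many usernames are stored for the service. The following is a minimal sketch, not part of Rega: the helper `usernames_for_service()` is hypothetical, and the data frame is a stand-in for the output of `keyring::key_list()`, which has `service` and `username` columns.

```r
# Hypothetical helper (not part of Rega): list the usernames stored for a service.
# `keys` is expected to look like the data frame returned by keyring::key_list().
usernames_for_service <- function(keys, service = "REGA_EGA") {
  keys$username[keys$service == service]
}

# Stand-in data so the sketch runs without touching a real credential store.
# In practice you would use: keys <- keyring::key_list(service = "REGA_EGA")
keys <- data.frame(
  service = c("REGA_EGA", "other-service"),
  username = c("<your-ega-username>", "someone-else"),
  stringsAsFactors = FALSE
)

users <- usernames_for_service(keys)
if (length(users) > 1) {
  warning("More than one username is stored for REGA_EGA; only the first is used.")
}
users
```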
httr2 secret
# Run this in your R console to generate a key
httr2::secret_make_key()
[1] "FPAc6dQWJ3FTXrblIcFW7Q"
To make this key available every time you open R, you must store it in your
user-level .Renviron file.
Run usethis::edit_r_environ() to open the file and add the following line:
REGA_KEY="<your-generated-key>"
Important: Restart R after saving to ensure the variable is loaded into your environment.
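After restarting R, you may want to confirm that the variables were actually loaded. The sketch below uses only base R; the helper name `missing_env_vars()` is hypothetical, and the list of names covers REGA_KEY plus the two variables set in the later steps of this section.

```r
# Hypothetical helper (not part of Rega): report which environment
# variables are unset or empty in the current R session.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_env_vars(
  c("REGA_KEY", "REGA_EGA_USERNAME", "REGA_EGA_PASSWORD")
)
if (length(missing) > 0) {
  message("Not set in this session: ", paste(missing, collapse = ", "))
}
```

If a variable is reported as missing after you edited .Renviron, the most likely cause is that R has not been restarted yet.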
Now, use your master key (REGA_KEY) to encrypt your actual EGA password. This ensures that even if someone sees your .Renviron file, they cannot read your password.
Run httr2::secret_encrypt("<your-ega-password>", "REGA_KEY") and copy the
encrypted password.
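To see how the master key and the encrypted string relate, here is a small round-trip sketch. It uses a throwaway key stored in a demo variable (`REGA_KEY_DEMO`, a name chosen only for this illustration) and a dummy password, so it does not touch your real credentials.

```r
# Generate a throwaway key and expose it through a demo environment variable.
demo_key <- httr2::secret_make_key()
Sys.setenv(REGA_KEY_DEMO = demo_key)

# Encrypt a dummy password; the second argument is the NAME of the
# environment variable that holds the key, not the key itself.
encrypted <- httr2::secret_encrypt("not-a-real-password", "REGA_KEY_DEMO")

# Decrypting with the same key recovers the original string.
decrypted <- httr2::secret_decrypt(encrypted, "REGA_KEY_DEMO")
stopifnot(identical(decrypted, "not-a-real-password"))

# Clean up the demo variable.
Sys.unsetenv("REGA_KEY_DEMO")
```

The encrypted string is only useful together with the key, which is why the two are stored as separate entries.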
Finally, store the encrypted string (not your plain-text password) in your .Renviron file.
REGA_EGA_PASSWORD="<your-encrypted-string>"
Run usethis::edit_r_environ() to open the file and add your username as well:
REGA_EGA_USERNAME="<your-ega-username>"
Download the empty MS Excel template from
inst/extdata/ega_full_template_v3.xlsx and fill it in according to the
instructions in the ‘Instructions’ tab.
The default parser is pre-configured to handle the bundled xlsx template
(inst/extdata/ega_full_template_v3.xlsx) automatically. As long as the
template is filled out according to the provided instructions, the default
parameters will work seamlessly, and no manual adjustments are required.
If you need to customize the parser’s behavior, such as toggling the c4gh
file extension, you can modify the settings via the YAML configuration. To do
this, create a local copy of inst/extdata/default_parser_params.yaml,
adjust the values as needed, and pass the path of your new file to the
param_file argument in the default_parser function.
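The copy-and-edit step described above could be sketched as follows. The helper `copy_for_editing()` is hypothetical and uses only base R; the runnable part works on a temporary stand-in file, while the commented lines show how the same helper might be applied to the bundled file, assuming Rega is installed and `metadata_file` points to your filled-in template.

```r
# Hypothetical helper (not part of Rega): copy a bundled file to a local,
# editable location without overwriting an existing copy.
copy_for_editing <- function(src, dest) {
  if (!nzchar(src) || !file.exists(src)) stop("Source file not found: ", src)
  if (file.exists(dest)) stop("Refusing to overwrite existing file: ", dest)
  if (!file.copy(src, dest)) stop("Copy failed: ", dest)
  dest
}

# Stand-in demo with a temporary file, so the sketch runs without Rega.
demo_src <- tempfile(fileext = ".yaml")
writeLines("example_setting: true", demo_src)
demo_copy <- copy_for_editing(demo_src, tempfile(fileext = ".yaml"))
readLines(demo_copy)

# With Rega installed, the real call would look like:
# src <- system.file("extdata/default_parser_params.yaml", package = "Rega")
# my_params <- copy_for_editing(src, "my_parser_params.yaml")
# parsed_metadata <- default_parser(metadata_file, param_file = my_params)
```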
metadata_file <- system.file(
"extdata/submission_example.xlsx",
package = "Rega"
)
parsed_metadata <- default_parser(metadata_file)
head(parsed_metadata)
$aliases
$aliases$studies
[1] "Study1"
$aliases$experiments
[1] "Experiment1"
$aliases$datasets
[1] "Dataset1"
$aliases$samples
[1] "Sample1"
$aliases$runs
[1] "Run1"
$aliases$analyses
[1] "Analysis1"
$files
# A tibble: 1 × 2
file ega_file
<chr> <chr>
1 example.fastq.gz /example.fastq.gz.c4gh
$submission
# A tibble: 1 × 1
title
<chr>
1 Your submission name
$studies
# A tibble: 1 × 4
study title description study_type
<chr> <chr> <chr> <chr>
1 Study1 Example Study Example Study Description Transcriptome Analysis
$samples
# A tibble: 1 × 4
alias phenotype biological_sex subject_id
<chr> <chr> <chr> <chr>
1 Sample1 Control male S1
$experiments
# A tibble: 1 × 8
study experiment design_description library_selection instrument_model_id
<chr> <chr> <chr> <chr> <int>
1 Study1 Experiment1 Expermient Design RANDOM 1
# ℹ 3 more variables: library_layout <chr>, library_strategy <chr>,
# library_source <chr>
To catch problems before submission, the package includes a client-side validation layer. It automatically checks your metadata against the schema requirements of both the EGA API and the underlying target database. Address all flagged validation failures and errors before you submit.
validation_summary <- default_validator(parsed_metadata)
validation_summary
name items passes fails nNA error warning
1 study_is_na 1 1 0 0 FALSE FALSE
2 study_is_unique 1 1 0 0 FALSE FALSE
3 study_in_aliases 1 1 0 0 FALSE FALSE
4 study_all_aliases 1 1 0 0 FALSE FALSE
5 experiment_is_na 1 1 0 0 FALSE FALSE
6 experiment_is_unique 1 1 0 0 FALSE FALSE
7 experiment_in_aliases 1 1 0 0 FALSE FALSE
8 experiment_all_aliases 1 1 0 0 FALSE FALSE
9 alias_is_na 1 1 0 0 FALSE FALSE
10 alias_is_unique 1 1 0 0 FALSE FALSE
11 alias_in_aliases 1 1 0 0 FALSE FALSE
12 alias_all_aliases 1 1 0 0 FALSE FALSE
13 run_is_na 1 1 0 0 FALSE FALSE
14 run_is_unique 1 1 0 0 FALSE FALSE
15 run_in_aliases 1 1 0 0 FALSE FALSE
16 run_all_aliases 1 1 0 0 FALSE FALSE
17 dataset_is_na 1 1 0 0 FALSE FALSE
18 dataset_is_unique 1 1 0 0 FALSE FALSE
19 dataset_in_aliases 1 1 0 0 FALSE FALSE
20 dataset_all_aliases 1 1 0 0 FALSE FALSE
21 submission_title_is_na 1 1 0 0 FALSE FALSE
22 run_experiment_is_na 1 1 0 0 FALSE FALSE
23 run_sample_is_na 1 1 0 0 FALSE FALSE
24 run_file_type_is_na 1 1 0 0 FALSE FALSE
25 run_file_is_na 1 1 0 0 FALSE FALSE
26 run_file_is_unique 1 1 0 0 FALSE FALSE
27 run_experiment_in_aliases 1 1 0 0 FALSE FALSE
28 run_sample_in_aliases 1 1 0 0 FALSE FALSE
29 studies_title_is_unique 1 1 0 0 FALSE FALSE
30 studies_description_is_unique 1 1 0 0 FALSE FALSE
31 studies_title_length 1 0 1 0 FALSE FALSE
32 studies_description_length 1 0 1 0 FALSE FALSE
33 dataset_title_is_unique 1 1 0 0 FALSE FALSE
34 dataset_description_is_unique 1 1 0 0 FALSE FALSE
35 dataset_run_in_aliases 1 1 0 0 FALSE FALSE
36 dataset_all_aliases_in_run 1 1 0 0 FALSE FALSE
37 dataset_title_length 1 0 1 0 FALSE FALSE
38 dataset_description_length 1 0 1 0 FALSE FALSE
expression
1 !is.na(study)
2 is_unique(study)
3 study %vin% aliases[["studies"]]
4 aliases[["studies"]] %vin% study
5 !is.na(experiment)
6 is_unique(experiment)
7 experiment %vin% aliases[["experiments"]]
8 aliases[["experiments"]] %vin% experiment
9 !is.na(alias)
10 is_unique(alias)
11 alias %vin% aliases[["samples"]]
12 aliases[["samples"]] %vin% alias
13 !is.na(run)
14 is_unique(run)
15 run %vin% aliases[["runs"]]
16 aliases[["runs"]] %vin% run
17 !is.na(dataset)
18 is_unique(dataset)
19 dataset %vin% aliases[["datasets"]]
20 aliases[["datasets"]] %vin% dataset
21 !is.na(title)
22 !is.na(experiment)
23 !is.na(alias)
24 !is.na(run_file_type)
25 !is.na(files)
26 is_unique(unlist(files))
27 experiment %vin% aliases[["experiments"]]
28 alias %vin% aliases[["samples"]]
29 is_unique(title)
30 is_unique(description)
31 get_word_number(title) >= 3 & get_word_number(title) <= 20
32 get_sentence_number(description) >= 3 & get_sentence_number(description) <= 5
33 is_unique(title)
34 is_unique(description)
35 unlist(runs) %vin% aliases[["runs"]]
36 aliases[["runs"]] %vin% unlist(runs)
37 (get_word_number(title) >= 3) & (get_word_number(title) <= 20)
38 (get_sentence_number(description) >= 3) & (get_sentence_number(description) <= 5)
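With this many rules, it is convenient to show only the ones that need attention. The sketch below assumes the summary behaves like a regular data frame with the columns printed above; the helper `flagged_rules()` is hypothetical, and the demo data reuses two rows from the output so that the block runs on its own.

```r
# Hypothetical helper (not part of Rega): keep only rules with failures,
# errors or warnings.
flagged_rules <- function(summary_df) {
  keep <- summary_df$fails > 0 | summary_df$error | summary_df$warning
  summary_df[keep, c("name", "fails", "error", "warning", "expression"), drop = FALSE]
}

# Stand-in for validation_summary, built from two rows of the output above.
demo_summary <- data.frame(
  name = c("study_is_unique", "studies_title_length"),
  items = c(1L, 1L),
  passes = c(1L, 0L),
  fails = c(0L, 1L),
  nNA = c(0L, 0L),
  error = c(FALSE, FALSE),
  warning = c(FALSE, FALSE),
  expression = c(
    "is_unique(study)",
    "get_word_number(title) >= 3 & get_word_number(title) <= 20"
  ),
  stringsAsFactors = FALSE
)

flagged <- flagged_rules(demo_summary)
flagged
```

In the example output above, the four flagged rules are the title and description length checks for studies and datasets.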
new_submission workflow
responses <- new_submission(parsed_metadata, logfile = "log.yaml")
If you encounter errors during metadata submission and would like more details, you can create a client with verbose logging.
Extract the EGA API from the bundled YAML specification and create a client that uses
the embedded httr2 OAuth authentication (default) with increased verbosity.
api <- extract_api()
ega <- create_client(api, verbosity = 3)
Run the new_submission workflow with the custom client.
responses <- new_submission(parsed_metadata, client = ega)
This will create your metadata submission in EGA and fill in all provided
information. However, this workflow does not finalize your submission. To
finalize the submission, either use the EGA Submitter Portal GUI
or run finalise_submission("<returned_submission_id>", "<release_date>").
Note that the release date should ideally be about 2 weeks after the
metadata submission to allow for review by the EGA team.
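A release date roughly two weeks ahead can be computed with base R. This is only a sketch: the helper `suggest_release_date()` is hypothetical, and the ISO 8601 format (YYYY-MM-DD) is an assumption, so check the finalise_submission help page for the format it expects.

```r
# Hypothetical helper (not part of Rega): a date `days` after `from`,
# formatted as YYYY-MM-DD (assumed format).
suggest_release_date <- function(from = Sys.Date(), days = 14) {
  format(as.Date(from) + days, "%Y-%m-%d")
}

# Fixed start date so the result is reproducible.
release_date <- suggest_release_date(as.Date("2026-04-05"))
release_date
# [1] "2026-04-19"

# Possible use with the workflow named above:
# finalise_submission("<returned_submission_id>", release_date)
```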
There are several other workflows available:
get_submission
get_entry_by_title
delete_submission_contents
delete_submission
rollback_submission
Please see the corresponding help pages for more details.
You can retrieve detailed records from individual tables (submissions, studies,
samples, experiments, runs, analyses and datasets) whose title column contains a
specific string using the get_entry_by_title function.
# checks all tables
resp <- get_entry_by_title("RNASeq")
# checks only samples and studies, logs responses
resp <- get_entry_by_title(
"RNASeq", type = c("samples", "studies"), logfile = "log.yaml"
)
You can also delete the entire contents of the current submission via the
delete_submission_contents workflow, or delete the entire submission
using the delete_submission workflow.
# 00001 stands in for your submission ID; R parses the bare literal 00001 as the number 1
resp <- delete_submission_contents(00001, ega)
resp <- delete_submission(00001, ega)
If you wish to create your own templates for EGA submissions, we provide a few functions to retrieve properties and enums through the API and save them in text files. We will use the API and the client created above.
Relevant functions include:
get_schemas()
get_properties()
For testing, debugging and prototyping purposes, you can pass a generated bearer token directly to the API when creating the client. You are then responsible for tracking the token’s validity and refreshing it as necessary.
bt <- ega_token()
ega <- create_client(api, bt$access_token)
ega$get__enums()
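Tracking token validity yourself could look like the sketch below. It assumes the token object exposes an `expires_at` field holding a Unix timestamp in seconds, as httr2 token objects typically do; whether the object returned by ega_token() has this field is an assumption to verify. The helper `token_needs_refresh()` is hypothetical.

```r
# Hypothetical helper (not part of Rega): TRUE when the token has expired,
# is about to expire within `margin_secs`, or has no known expiry time.
token_needs_refresh <- function(expires_at, now = Sys.time(), margin_secs = 60) {
  is.null(expires_at) || (as.numeric(expires_at) - as.numeric(now)) < margin_secs
}

# Fixed reference time so the demo is reproducible.
now <- as.POSIXct("2026-04-05 12:00:00", tz = "UTC")
token_needs_refresh(as.numeric(now) + 3600, now = now) # FALSE: valid for another hour
token_needs_refresh(as.numeric(now) + 10, now = now)   # TRUE: expires in 10 seconds

# Possible use with the objects above (field name is an assumption):
# if (token_needs_refresh(bt$expires_at)) {
#   bt <- ega_token()
#   ega <- create_client(api, bt$access_token)
# }
```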
A workflow for updating the submission metadata with the PUT method is not available.
For this use case, create the client with
ega <- create_client(extract_api()) and use the individual functions prefixed with
put__, e.g. ega$put__samples__accession_id, to update the submission
metadata.
sessionInfo()
R version 4.6.0 alpha (2026-04-05 r89794)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.4 LTS
Matrix products: default
BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rega_0.99.6 knitr_1.51 BiocStyle_2.39.0
loaded via a namespace (and not attached):
[1] jsonlite_2.0.0 dplyr_1.2.1 compiler_4.6.0
[4] BiocManager_1.30.27 tidyselect_1.2.1 validate_1.1.7
[7] stringr_1.6.0 tidyr_1.3.2 jquerylib_0.1.4
[10] yaml_2.3.12 fastmap_1.2.0 readxl_1.4.5
[13] jsonvalidate_1.5.0 R6_2.6.1 generics_0.1.4
[16] httr2_1.2.2 tibble_3.3.1 bookdown_0.46
[19] openssl_2.3.5 bslib_0.10.0 pillar_1.11.1
[22] rlang_1.2.0 utf8_1.2.6 cachem_1.1.0
[25] stringi_1.8.7 xfun_0.57 sass_0.4.10
[28] otel_0.2.0 cli_3.6.6 magrittr_2.0.5
[31] grid_4.6.0 digest_0.6.39 settings_0.2.7
[34] keyring_1.4.1 rappdirs_0.3.4 askpass_1.2.1
[37] lifecycle_1.0.5 vctrs_0.7.3 evaluate_1.0.5
[40] glue_1.8.0 cellranger_1.1.0 rmarkdown_2.31
[43] purrr_1.2.2 tools_4.6.0 pkgconfig_2.0.3
[46] htmltools_0.5.9