1 Overview
2 Load Reference Resources
3 Build Sample Manifest
- 3.1 Quick start wrapper (get_splicing_impact)
4 Read Splicing Events
5 Sample-Level QC and Exploration
- 5.1 Compare HITindex Between Conditions
- 5.2 PSI Distribution Overview
6 Differential Inclusion (DI)
7 Match Events to Sequences and Build Transcript Pairs
8 Sequence-Level Consequences
9 Domain-Level Changes and Domain Enrichment
10 PPI Rewiring
11 Gene Enrichment Foreground Probes
12 Visualize Transcript and Protein Consequences
13 Inspect a Single Event
14 Integrative Visualization
15 Output Description (hits_final, data, res)
16 S4 Applications and Accessors
- 16.1 S4-first Main Workflow (Code Only)
17 Custom Input Entry Points
18 Session Info

1 Overview

SpliceImpactR links differential splicing to protein-level outcomes, including sequence changes, domain gain/loss, and potential PPI rewiring.

SpliceImpactR is a self-contained and efficient pipeline designed to identify how alternative RNA processing events affect resulting proteins. After initial alternative RNA processing quantification, the method runs entirely within R and does not require additional external software.

The pipeline begins by importing alternative RNA processing events that differ across conditions. It is built to support highly flexible input, requiring only inclusion and exclusion coordinates for each event isoform. This makes it compatible with both user-defined events and outputs from existing tools.

Custom input functions detect multiple classes of alternative RNA processing, including alternative first exons (AFE), hybrid first exons (HFE), retained introns (RI), skipped exons (SE), alternative 5′ and 3′ splice sites (A5SS and A3SS), mutually exclusive exons (MXE), hybrid last exons (HLE), and alternative last exons (ALE).

SpliceImpactR then performs differential inclusion analysis to identify condition-specific transcript pairs that differ by at least one alternative RNA processing event. These events are mapped to transcript and protein annotations to evaluate their effects on coding potential, sequence similarity, and frameshifts.

The pipeline next uses curated and user-provided protein feature sets to annotate domain-level changes, assess shifts in domain usage across conditions, and integrate domain-domain interaction data, domain-motif interaction data, and protein-protein interaction information to predict downstream effects on protein interaction networks.

Finally, SpliceImpactR performs a global analysis to detect co-regulated splicing events and quantify the relative contribution of each alternative RNA processing type. The pipeline is also available through an R Shiny interface to improve accessibility and ease of use.

SpliceImpactR is built to handle both S4 SpliceImpactResult object with custom accession functions and standard data.table input / workflow. SpliceImpactResult objects contain all relevant data along with detailed sequence information for each event. This object allows for maximal interoperability with any level of data from this pipeline.

This vignette uses bundled test data so every core step is runnable.

1.1 Installation

Install with one of the following methods.

1.1.1 Option 1: BiocManager

if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("SpliceImpactR")

1.1.2 Option 2: devtools (GitHub)

if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("fiszbein-lab/SpliceImpactR")

1.2 External tools and acronyms

rMATS (replicate Multivariate Analysis of Transcript Splicing) provides event-level splicing tables.
HITindex quantifies first/last exon usage and labels hybrid exons.
PPIDM Protein-Protein Interaction Domain Miner for domain-domain Interaction derived from PPI and 3did’s domain-domain interactions
ELM Eukaryotic Linear Motif Database for accessing Short Linear Motif occurences and domain-motif interactions
BiomaRt accesses data from InterPro, PFAM, SignalP, TMHMM, CDD, Mobidb-lite

1.3 Workflow map

Load reference resources (annotations + protein features).
Build sample manifest (sample_frame).
(Optional) run the quick-start wrapper for end-to-end execution.
Read splicing events for stepwise analysis.
Run QC summaries and plots.
Run differential inclusion.
Match DI events to transcript/protein sequences and build pairs.
Quantify sequence and domain impacts.
Infer PPI rewiring.
Build enrichment foreground gene sets from table or S4 inputs.
Visualize transcript and protein consequences for one event pair.
Generate integrated multi-layer summaries.
Use S4 container workflows and accessors.
Run optional custom-input entry points.

2 Load Reference Resources

get_protein_features() supports these feature sources: - interpro: integrated domain/family/superfamily signatures. - pfam: protein domain HMM families. - cdd: NCBI Conserved Domain Database annotations. - gene3d: CATH/Gene3D structural domain annotations. - signalp: signal peptide calls. - tmhmm: transmembrane helix calls. - ncoils: coiled-coil region predictions. - seg: low-complexity segments. - mobidblite: intrinsically disordered region calls. - elm: short linear motifs (SLiMs; retrieved from ELM, not BioMart).

For BioMart-backed sources, each feature must provide three attributes: {feature}, {feature}_start, and {feature}_end (for example pfam, pfam_start, pfam_end). You can add non-BioMart custom features with get_manual_features() (explained further down in the Custom Input Entry Points section) and merge everything through get_comprehensive_annotations().

We load test annotations and three representative feature sources (interpro, signalp, elm) and combine them. Loading multiple databases in one call is possible, but not recommended due to timeouts. These are automatically cached and loaded from cache if the correct signature (organism, version, etc) is found. BiomaRt can give difficulties in loading these, so this procedure will sometimes take a few tries. If this is unsuccessful, restart R and/or try suggested methods related to this issue (eg: httr::reset_config())

library(SpliceImpactR)
annotation_df <- get_annotation(load = "test")
#> [INFO] Loading bundled test annotation data

interpro_features <- get_protein_features(
  biomaRt_databases = c("interpro"),
  gtf_df = annotation_df$annotations,
  test = TRUE
)

signalp_features <- get_protein_features(
  biomaRt_databases = c("signalp"),
  gtf_df = annotation_df$annotations,
  test = TRUE
)

elm_features <- get_protein_features(
  biomaRt_databases = c("elm"),
  gtf_df = annotation_df$annotations,
  test = TRUE
)

protein_feature_total <- get_comprehensive_annotations(
  list(interpro_features, signalp_features, elm_features)
)

exon_features <- get_exon_features(
  annotation_df$annotations,
  protein_feature_total
)

table(protein_feature_total$database)
#> 
#>      elm interpro  signalp 
#>       18     1449       13

3 Build Sample Manifest

Each sample directory contains event files exported by rMATS and/or HITindex. The manifest below points to the bundled test directories. Using this function, SpliceImpactR will search each directory for the outputs provided by rMATS and HIT Index output files ending in .AFEPSI, .ALEPSI, and .exon. Label one condition case and one condition control.

Required sample manifest columns: - path: per-sample directory containing splicing output files. - sample_name: unique sample identifier. - condition: case/control label used in DI and downstream pairing.

Manifest expectations: - One row per sample. - path values must point to readable sample directories. - Use exactly two conditions for this standard workflow (case, control). - Include replicate samples per condition for robust differential modeling.

sample_frame <- data.frame(
  path = c(
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/control_S5/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/control_S6/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/control_S7/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/control_S8/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/case_S1/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/case_S2/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/case_S3/"),
    file.path(system.file("extdata", package = "SpliceImpactR"),
              "rawData/case_S4/")
  ),
  sample_name = c("S5", "S6", "S7", "S8", "S1", "S2", "S3", "S4"),
  condition = c(
    "control", "control", "control", "control",
    "case", "case", "case", "case"
  ),
  stringsAsFactors = FALSE
)

3.1 Quick start wrapper (`get_splicing_impact`)

If your inputs already follow the standard layout, run the full workflow with the top-level wrapper and choose table or S4 return mode.

# data.table-first return
out_dt <- get_splicing_impact(
  sample_frame = sample_frame,
  source_data = "both",
  annotation_df = annotation_df,
  protein_feature_total = protein_feature_total,
  exon_features = exon_features,
  return_class = "data.table"
)

# S4-first return (single object carries all slots)
out_s4 <- get_splicing_impact(
  sample_frame = sample_frame,
  source_data = "both",
  annotation_df = annotation_df,
  protein_feature_total = protein_feature_total,
  exon_features = exon_features,
  return_class = "S4"
)

4 Read Splicing Events

For the stepwise workflow in this vignette, load event data explicitly:

data <- get_rmats_hit(
  sample_frame,
  event_types = c("ALE", "AFE", "MXE", "SE", "A3SS", "A5SS", "RI")
)
#> [INFO] Filtered 0 unannotated ALE/AFE events (465 -> 465)

DT <- data.table::as.data.table(data)
DT[, .(
  n_rows = .N,
  n_events = data.table::uniqueN(event_id),
  n_genes = data.table::uniqueN(gene_id)
)]
#>    n_rows n_events n_genes
#>     <int>    <int>   <int>
#> 1:   5863      605      41

If rMATS and HITindex outputs are stored in separate per-sample directories, load them independently and combine on shared columns. In this test-data example, both manifests point to the same bundled paths; in real analyses, sample_frame_rmats$path and sample_frame_hit$path can be different.

sample_frame_rmats <- sample_frame
sample_frame_hit <- sample_frame

rmats_only <- get_rmats(
  load_rmats(
    sample_frame_rmats,
    use = "JCEC",
    event_types = c("MXE", "SE", "A3SS", "A5SS", "RI")
  )
)

hit_only <- get_hitindex(
  sample_frame_hit,
  keep_annotated_first_last = TRUE
)
#> [INFO] Filtered 0 unannotated ALE/AFE events (465 -> 465)

shared_cols <- intersect(names(rmats_only), names(hit_only))
data_separate <- data.table::rbindlist(
  list(
    rmats_only[, ..shared_cols],
    hit_only[, ..shared_cols]
  ),
  use.names = TRUE,
  fill = TRUE
)

data_separate[, .(
  n_rows = .N,
  n_events = data.table::uniqueN(event_id),
  n_genes = data.table::uniqueN(gene_id)
)]
#>    n_rows n_events n_genes
#>     <int>    <int>   <int>
#> 1:   5863      605      41

5 Sample-Level QC and Exploration

5.1 Compare HITindex Between Conditions

This summary compares event-level mean HIT values across conditions.

hit_compare <- compare_hit_index(
  sample_frame,
  condition_map = c(control = "control", test = "case")
)

hit_compare$plot

Figure 1: HITindex condition comparison

head(hit_compare$results[order(fdr)], 6)
#>                               event_key    control        test   diff_HIT
#>                                  <char>      <num>       <num>      <num>
#> 1: ENSG00000158286|chr1:6210370-6211118 -0.1890000  0.12766667  0.3166667
#> 2: ENSG00000158286|chr1:6211867-6212053 -0.1905000 -0.23400000 -0.0435000
#> 3: ENSG00000158286|chr1:6212218-6212416 -0.1783333 -0.14533333  0.0330000
#> 4: ENSG00000158286|chr1:6212682-6213183  0.4350000  0.04666667 -0.3883333
#> 5: ENSG00000158286|chr1:6218289-6218369 -0.2993333 -0.05766667  0.2416667
#> 6: ENSG00000142599|chr1:8355419-8355599  0.1010000  0.10800000  0.0070000
#>    delta_HIT n_control n_test   p_value   fdr
#>        <num>     <int>  <int>     <num> <num>
#> 1: 0.3166667         2      3 1.0000000     1
#> 2: 0.0435000         2      3 1.0000000     1
#> 3: 0.0330000         3      3 0.8337307     1
#> 4: 0.3883333         2      3 1.0000000     1
#> 5: 0.2416667         3      3 0.2567179     1
#> 6: 0.0070000         4      4 0.9338296     1

5.2 PSI Distribution Overview

This panel summarizes event count depth-normalized metrics and PSI eCDF.

overview_plot <- overview_spicing_comparison(
  events = data,
  sample_df = sample_frame,
  depth_norm = "exon_files",
  event_type = "AFE"
)

overview_plot

Figure 2: Depth-normalized event counts and PSI overview

6 Differential Inclusion (DI)

We run quasi-binomial GLMs and filter significant events. Verbose filtering messages are line-wrapped so progress logs remain readable without horizontal scrolling. For efficient processing, set parallel_glm to TRUE and provide a BiocParallel object to BPPARAM.

Default thresholds in keep_sig_pairs is FDR < 0.05 and |DELTA PSI| > 0.1.

res <- get_differential_inclusion(
  data,
  min_total_reads = 10,
  parallel_glm = TRUE,
  BPPARAM = BiocParallel::SerialParam()
)
#> [INFO] Input contains 41 genes, 605 events, and 2932 event instances.
#> [PROCESSING/INFO] Filtering out low-coverage rows removed 892 event instances; 4971
#> remain.
#> [PROCESSING/INFO] Completing AFE/ALE with zeros per sample (total strategy: event_max)
#> filled 407 AFE rows and 160 ALE rows (totals: AFE=608, ALE=424).
#> [PROCESSING/INFO] Filtering genes with no nonzero values per sample/event_type removed
#> 37 genes from specific sample groups; 40 genes remain overall.
#> [PROCESSING/INFO] Filtering by minimum condition presence removed 211 events; 351
#> events remain.
#> [PROCESSING/INFO] Removed 62 non-changing events; 289 events remain.
#> [PROCESSING/INFO] Filtering events not represented in both conditions removed 30
#> events; 259 events remain.
#> [PROCESSING] Fitting quasi-binomial GLMs per site...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%
#> 
#> [DONE] Fitted 531 sites in 1 chunks.
#> [INFO] Done.

res_di <- keep_sig_pairs(res)

res[, .(
  n_rows = .N,
  n_sig = sum(padj <= 0.05 & abs(delta_psi) >= 0.1, na.rm = TRUE)
)]
#>    n_rows n_sig
#>     <int> <int>
#> 1:    533    73

plot_di_volcano_dt(res)

Figure 3: Differential inclusion volcano plot

7 Match Events to Sequences and Build Transcript Pairs

Significant DI events are mapped to annotation and sequence data, then collapsed into case/control transcript pairs. This matching is done through a strict hierarchy:

Prefilter by chr, strand, and gene_id to keep only compatible annotation intervals.
Match inclusion (inc) coordinates to exons and require all inclusion parts to be covered for a transcript candidate.
Remove candidates that overlap exclusion (exc) coordinates.
Prioritize transcript/exon choices by event-type-consistent exon class (first, internal, last), then reciprocal overlap and intersection width; protein-linked transcripts are preferred when available.
Build case/control pairs in get_pairs(source = "multi") by joining all positive delta_psi rows (case) with all negative rows (control) for each event_id, then ordering by strongest |delta_psi|.

matched <- get_matched_events_chunked(
  res_di,
  annotation_df$annotations,
  chunk_size = 2000
)
#> Chunk 1/1: rows 1..94

hits_sequences <- attach_sequences(matched, annotation_df$sequences)
pairs <- get_pairs(hits_sequences, source = "multi")

pairs[, .(
  n_pairs = .N,
  n_events = data.table::uniqueN(event_id)
)]
#>    n_pairs n_events
#>      <int>    <int>
#> 1:      36       28

Once pairing is done, we can probe whether there is a proximal/distal shift in AFE/ALE usage.

proximal_output <- get_proximal_shift_from_hits(pairs)
proximal_output$plot

Figure 4: Proximal versus distal terminal exon summary

8 Sequence-Level Consequences

compare_sequence_frame() computes protein-coding class, alignment identity, frameshift/rescue status, and length deltas.

Key labels used downstream: - protein_coding: both isoforms have protein IDs (both sides coding). - onePC: only one side has a protein ID. - noPC: neither side has a protein ID. - Match: protein sequences are identical. - FrameShift: frame is disrupted between paired isoforms.

seq_compare <- compare_sequence_frame(pairs, annotation_df$annotations)
#> [1] "[INFO] Processing 36 transcript and protein sequence alignments, this may take a little..."
#> [1] "[Processing] Identifying frame shifts and rescues"
#> [1] "[INFO] 1 frameshifts (0 rescues) and 27 non-frameshifts were identified, "

alignment_plot <- plot_alignment_summary(seq_compare)
alignment_plot

Figure 5: Protein alignment summary by event type

length_plot <- plot_length_comparison(seq_compare)
length_plot

Figure 6: Case versus control isoform length comparison

9 Domain-Level Changes and Domain Enrichment

Background transcript pairs are derived from annotations and compared against observed event-linked domain changes.

bg <- get_background(
  source = "annotated",
  annotations = annotation_df$annotations,
  protein_features = protein_feature_total,
  BPPARAM = BiocParallel::SerialParam()
)

hits_domain <- get_domains(seq_compare, exon_features)

hits_domain[, .(
  n_hits = .N,
  n_domain_changed = sum(diff_n > 0, na.rm = TRUE)
)]
#>    n_hits n_domain_changed
#>     <int>            <int>
#> 1:     36               12

enriched_domains <- enrich_domains_hypergeo(
  hits = hits_domain,
  background = bg,
  db_filter = "interpro"
)

domain_plot <- plot_enriched_domains_counts(enriched_domains, top_n = 20)
domain_plot

Figure 7: Top enriched InterPro domain differences

You can also stratify enrichment by event type or by feature database.

enriched_afe_interpro <- enrich_by_event(
  hits = hits_domain,
  background = bg,
  events = "AFE",
  db_filter = "interpro"
)

enriched_interpro_only <- enrich_by_db(
  hits = hits_domain,
  background = bg,
  dbs = "interpro"
)

list(
  by_event_rows = nrow(enriched_afe_interpro),
  by_db_rows = nrow(enriched_interpro_only)
)
#> $by_event_rows
#> [1] 1
#> 
#> $by_db_rows
#> [1] 1

10 PPI Rewiring

PPI effects are inferred from domain and motif changes. Domain-domain and domain-motif interactions are sourced from 3did/PPDIM and ELM. PPI rewiring currently only works for human inputs.

ppi <- get_ppi_interactions()

hits_final <- get_ppi_switches(
  hits_domain,
  ppi,
  protein_feature_total
)
#> Registered S3 method overwritten by 'bit64':
#>   method          from 
#>   print.bitstring tools
#> 

ppi_plot <- plot_ppi_summary(hits_final)
ppi_plot

Figure 8: Summary of predicted case/control PPI changes

11 Gene Enrichment Foreground Probes

get_gene_enrichment() extracts foreground genes for enrichment.

mode = "di" expects a DI-like table (res).
mode = "ppi"/"domain" expects a hits-like table (hits_final or hits_domain).
S4 examples are shown in Section 13 so object setup stays in one place.

fg_di <- get_gene_enrichment(
  mode = "di",
  res = res,
  padj_threshold = 0.1,
  delta_psi_threshold = 0.1
)

fg_domain <- get_gene_enrichment(mode = "domain", hits = hits_domain)
fg_ppi <- get_gene_enrichment(mode = "ppi", hits = hits_final)

lengths(list(di = fg_di, domain = fg_domain, ppi = fg_ppi))
#>     di domain    ppi 
#>     40     11      6
intersect(fg_di, fg_ppi)
#> [1] "ENSG00000116171" "ENSG00000142599" "ENSG00000140983" "ENSG00000101391"
#> [5] "ENSG00000116406"

The optional get_enrichment() step depends on additional packages and may return no significant terms on small toy datasets. If you see all NA statistics, foreground genes often map too sparsely into the selected background and ontology. p_adjust_cutoff and top_n_plot are set to allow for example visualization, but should be adjusted to standard (eg: 0.05, 10) for non-toy data uses.

needed <- c("clusterProfiler", "msigdbr", "org.Hs.eg.db")
has_needed <- all(vapply(
  needed,
  requireNamespace,
  FUN.VALUE = logical(1),
  quietly = TRUE
))
#> 
#> 

if (has_needed) {
  enrichment_di <- get_enrichment(
    foreground = fg_domain,
    background = bg$gene_id,
    species = "human",
    gene_id_type = "ensembl",
    sources = "GO:BP",
    p_adjust_cutoff = 1, ## CHANGE FOR NON-TOY DATA USAGE, eg: 0.05
    top_n_plot = 1 ## CHANGE FOR NON-TOY DATA USAGE, eg: NULL or 10
  )

  enrichment_di$plot
} else {
  message(
    "Skipping get_enrichment(): install ",
    paste(needed, collapse = ", ")
  )
}
#> 'select()' returned 1:1 mapping between keys and columns
#> 'select()' returned 1:1 mapping between keys and columns
#> Warning: semData is not provided. It will be calculated automatically.

12 Visualize Transcript and Protein Consequences

plot_two_transcripts_with_domains_unified() shows matched isoforms in either transcript (genome-coordinate) or protein (amino-acid) view.

viz_pair <- hits_final[
  !is.na(transcript_id_case) &
    !is.na(transcript_id_control) &
    transcript_id_case != "" &
    transcript_id_control != ""
][1]

if (nrow(viz_pair) == 0) {
  stop("No transcript pairs available for visualization.")
}

tx_pair <- c(viz_pair$transcript_id_case, viz_pair$transcript_id_control)

transcript_centric <- plot_two_transcripts_with_domains_unified(
  transcripts = tx_pair,
  gtf_df = annotation_df$annotations,
  protein_features = protein_feature_total,
  feature_db = c("interpro"),
  combine_domains = TRUE,
  view = "transcript"
)

protein_centric <- plot_two_transcripts_with_domains_unified(
  transcripts = tx_pair,
  gtf_df = annotation_df$annotations,
  protein_features = protein_feature_total,
  feature_db = c("interpro"),
  combine_domains = TRUE,
  view = "protein"
)

transcript_centric

Figure 9: Transcript and protein views for one paired event

protein_centric

Figure 10: Transcript and protein views for one paired event

13 Inspect a Single Event

probe_individual_event() plots PSI for one event across samples.

event_to_probe <- res$event_id[1]
probe <- probe_individual_event(data, event = event_to_probe)
probe$plot

Figure 11: PSI distribution for one selected event

head(probe$data)
#>    event_id sample condition   psi i.condition   form
#>      <char> <char>    <char> <num>      <char> <char>
#> 1:  A3SS:26     S1      case 0.483        case    INC
#> 2:  A3SS:26     S2      case 0.483        case    INC
#> 3:  A3SS:26     S3      case 0.525        case    INC
#> 4:  A3SS:26     S4      case 0.345        case    INC
#> 5:  A3SS:26     S5   control 0.208     control    INC
#> 6:  A3SS:26     S6   control 0.085     control    INC

14 Integrative Visualization

integrated_event_summary() combines sequence, domain, and PPI summaries.

Panel guide: - Top-left: event classification composition (Match, FrameShift, etc.) by event type. - Top-right: alignment score distributions by event type. - Middle-left: domain-change prevalence (Any, case-only, control-only, both). - Middle-center/right: fraction and counts of predicted PPI rewiring. - Bottom-left: relative event retention from pre-filter DI to final hits. - Bottom-right: gene-level event-type coordination (Jaccard heatmap).

int_summary <- integrated_event_summary(hits_final, res)
int_summary$plot

Figure 12: Integrated multi-layer summary

int_summary$summaries$relative_use
#>    event_type n_pre n_post relative_use
#>        <fctr> <int>  <int>        <num>
#> 1:        AFE    22     17   0.77272727
#> 2:       A3SS    35      5   0.14285714
#> 3:         RI    35      2   0.05714286
#> 4:         SE   135      4   0.02962963
#> 5:       A5SS    11      0   0.00000000
#> 6:        ALE    10      0   0.00000000
#> 7:        MXE    11      0   0.00000000

15 Output Description (`hits_final`, `data`, `res`)

The pipeline returns three core tables in data.table mode: - data: sample-level event measurements before differential modeling. - res: differential inclusion results per tested event/site. - hits_final: paired case/control isoform effects with sequence, domain, and PPI annotations.

Suffix convention used throughout: - _case = case-preferred isoform values. - _control = control-preferred isoform values.

15.1 `hits_final` (integrated event-level output)

Use this table for biological interpretation and downstream plotting.

Event and isoform identifiers: event_id, event_type, gene_id, chr, strand, transcript_id_case, transcript_id_control, protein_id_case, protein_id_control, form_case, form_control, exons_case, exons_control.
Event coordinates and DI statistics: inc_case, inc_control, exc_case, exc_control, delta_psi_case, delta_psi_control, p.value_case, p.value_control, padj_case, padj_control, n_samples_case, n_samples_control, n_case_case, n_case_control, n_control_case, n_control_control.
Sequence and coding context: transcript_seq_case, transcript_seq_control, protein_seq_case, protein_seq_control, pc_class, length columns (prot_len_*, tx_len_*, exon_cds_len_*, exon_len_*) and their *_diff / *_diff_abs derivatives.
Alignment/frame classification: DNA alignment (dna_pid, dna_score, dna_width), protein alignment (prot_pid, prot_score, prot_width), frame diagnostics (frame_call, rescue, frame_check_exon1, frame_check_exon2), and summary_classification.
Domain-level change annotations: domains_exons_case, domains_exons_control, case_only_domains, control_only_domains, case_only_domains_list, control_only_domains_list, either_domains_list, case_only_n, control_only_n, diff_n.
Predicted interaction rewiring: case_ppi, control_ppi, n_case_ppi, n_control_ppi, n_ppi, case_ppi_drivers, control_ppi_drivers.

15.2 `data` (raw sample-level input table)

Use data to inspect per-sample evidence feeding DI.

Core columns: event_id, event_type, form, gene_id, chr, strand, inc, exc, inclusion_reads, exclusion_reads, psi, sample, condition, source_file.

Often present depending on import path: HITindex metadata such as HITindex, class, nFE, nLE, nUP, nDOWN, nTXPT, psi_original, total_reads, source.

15.3 `res` (differential inclusion output)

Use res to rank significant events before pairing/domain/PPI steps.

Core columns: site_id, event_id, event_type, gene_id, chr, strand, inc, exc, form, n_samples, n_control, n_case, mean_psi_ctrl, mean_psi_case, delta_psi, p.value, padj, cooks_max, n, n_used.

16 S4 Applications and Accessors

Use SpliceImpactResult when you want all pipeline pieces in one object, with consistent filtering across event-linked slots.

SpliceImpactResult is a custom S4 container that keeps all major pipeline parts synchronized: - raw_events (SummarizedExperiment): sample-level table + ranges/assays. - di_events / res_di (GRanges): differential inclusion rows. - matched (DFrame): annotation-matched rows (and sequence-attached rows). - paired_hits (GRanges) + segments (GRangesList): final case/control event impacts. - sample_frame (DFrame): sample manifest metadata.

obj <- as_splice_impact_result(
  data = data,
  res = res,
  res_di = res_di,
  matched = hits_sequences,
  sample_frame = sample_frame,
  hits_final = hits_final
)

get_hits_*() wrappers provide compact views for common downstream tasks: - get_hits_core(): event IDs, transcript/protein IDs, and key DI metadata. - get_hits_domain(): domain gain/loss columns (case_only_*, control_only_*, diff_n, mapped domain lists). - get_hits_ppi(): PPI partner changes (n_ppi, case/control partner lists, PPI drivers). - get_hits_sequence(): sequence/alignment/frame fields (pc_class, frame_call, protein/transcript length deltas, identity metrics).

raw_dt <- as_dt_from_s4(obj, "raw_events")
di_dt <- as_dt_from_s4(obj, "di_events")
hits_dt <- as_dt_from_s4(obj, "paired_hits")

hits_core <- get_hits_core(obj)
hits_domain_view <- get_hits_domain(obj)
hits_ppi_view <- get_hits_ppi(obj)
hits_sequence_view <- get_hits_sequence(obj)

c(
  raw_rows = nrow(raw_dt),
  di_rows = nrow(di_dt),
  hits_rows = nrow(hits_dt),
  hits_core_rows = nrow(hits_core),
  hits_domain_rows = nrow(hits_domain_view),
  hits_ppi_rows = nrow(hits_ppi_view),
  hits_sequence_rows = nrow(hits_sequence_view)
)
#>           raw_rows            di_rows          hits_rows     hits_core_rows 
#>               5863                533                 36                 36 
#>   hits_domain_rows      hits_ppi_rows hits_sequence_rows 
#>                 36                 36                 36

obj_focus <- filter_spliceimpact_hits(
  obj,
  diff_n > 0,
  n_ppi > 0
)

focus_hits <- as_dt_from_s4(obj_focus, "paired_hits")
focus_hits[, .N, by = event_type][order(-N)]
#>    event_type     N
#>        <char> <int>
#> 1:        AFE     2
#> 2:         RI     2
#> 3:         SE     2

fg_di_s4 <- get_gene_enrichment(
  mode = "di",
  x = obj,
  padj_threshold = 0.05,
  delta_psi_threshold = 0.1
)

fg_ppi_s4 <- get_gene_enrichment(mode = "ppi", x = obj)

fg_ppi_focus <- get_gene_enrichment(mode = "ppi", x = obj_focus)
list(
  di_equal_table = identical(sort(fg_di), sort(fg_di_s4)),
  ppi_equal_table = identical(sort(fg_ppi), sort(fg_ppi_s4)),
  focused_ppi_genes = head(fg_ppi_focus)
)
#> $di_equal_table
#> [1] TRUE
#> 
#> $ppi_equal_table
#> [1] TRUE
#> 
#> $focused_ppi_genes
#> [1] "ENSG00000116171" "ENSG00000142599" "ENSG00000140983" "ENSG00000164692"
#> [5] "ENSG00000116406" "ENSG00000101391"

spliceimpact_s4_schema()
#> $slots
#> $slots$raw_events
#> [1] "SummarizedExperiment: sample/form rows with rowRanges + rowData"
#> 
#> $slots$di_events
#> [1] "GRanges: differential inclusion rows with mcols"
#> 
#> $slots$res_di
#> [1] "GRanges: threshold-filtered differential rows with mcols"
#> 
#> $slots$matched
#> [1] "S4Vectors::DataFrame: annotation-matched DI rows (and sequence-attached rows when present)"
#> 
#> $slots$sample_frame
#> [1] "S4Vectors::DataFrame: sample manifest (`path`, `sample_name`, `condition`)"
#> 
#> $slots$paired_hits
#> [1] "GRanges: paired case/control rows with mcols"
#> 
#> $slots$segments
#> [1] "GRangesList: per-pair genomic segments (inc_case/inc_control/exc_case/exc_control)"
#> 
#> $slots$metadata
#> [1] "list: provenance and pipeline metadata"
#> 
#> 
#> $keys
#> $keys$raw_events
#> [1] "raw_key (rowData)"
#> 
#> $keys$di_events
#> [1] "di_key (mcols)"
#> 
#> $keys$res_di
#> [1] "di_key (mcols)"
#> 
#> $keys$paired_hits
#> [1] "pair_key (mcols)"
#> 
#> $keys$segments
#> [1] "names(segments) == pair_key"
#> 
#> 
#> $assays
#> [1] "psi"             "inclusion_reads" "exclusion_reads"

16.1 S4-first Main Workflow (Code Only)

This mirrors the table workflow, but keeps everything in a single SpliceImpactResult object.

obj_flow <- as_splice_impact_result(
  data = data,
  sample_frame = sample_frame
)

obj_flow <- get_differential_inclusion(
  obj_flow,
  min_total_reads = 10,
  parallel_glm = TRUE,
  BPPARAM = BiocParallel::SerialParam(),
  return_class = "S4"
)
obj_flow <- keep_sig_pairs(obj_flow, return_class = "S4")

obj_flow <- get_matched_events_chunked(
  obj_flow,
  annotation_df$annotations,
  chunk_size = 2000,
  return_class = "S4"
)
obj_flow <- attach_sequences(
  obj_flow,
  annotation_df$sequences,
  return_class = "S4"
)
obj_flow <- get_pairs(obj_flow, source = "multi", return_class = "S4")
obj_flow <- compare_sequence_frame(
  obj_flow,
  annotation_df$annotations,
  return_class = "S4"
)
obj_flow <- get_domains(obj_flow, exon_features, return_class = "S4")
obj_flow <- get_ppi_switches(
  obj_flow,
  ppi,
  protein_feature_total,
  return_class = "S4"
)

hits_core_flow <- get_hits_core(obj_flow)
hits_domain_flow <- get_hits_domain(obj_flow)
hits_ppi_flow <- get_hits_ppi(obj_flow)
hits_sequence_flow <- get_hits_sequence(obj_flow)

17 Custom Input Entry Points

These helpers support non-standard inputs at multiple stages of the workflow.

17.1 Add Manual Protein Features

get_manual_features() lets you add user-defined peptide-level features.

ann_dt <- data.table::as.data.table(annotation_df$annotations)
coding_tx <- unique(ann_dt[type == "exon" & cds_has == TRUE, transcript_id])
n_manual <- min(3L, length(coding_tx))

if (n_manual < 1L) {
  stop("No coding transcripts found for manual feature example.")
}

manual_df <- data.frame(
  ensembl_transcript_id = coding_tx[seq_len(n_manual)],
  ensembl_peptide_id = rep(NA_character_, n_manual),
  name = paste0("demo_feature_", seq_len(n_manual)),
  start = c(20L, 45L, 80L)[seq_len(n_manual)],
  stop = c(35L, 58L, 92L)[seq_len(n_manual)],
  database = rep("manual", n_manual),
  alt_name = rep(NA_character_, n_manual),
  feature_id = rep(NA_character_, n_manual),
  stringsAsFactors = FALSE
)

manual_features <- get_manual_features(
  manual_features = manual_df,
  gtf_df = annotation_df$annotations
)
#> 0 domain(s) not matched to genomic coords

head(manual_features)
#>    ensembl_transcript_id feature_id start  stop    chr strand     clean_name
#>                   <char>     <char> <num> <num> <char> <char>         <char>
#> 1:       ENST00000337907       <NA>    20    35   chr1      - demo_feature_1
#> 2:       ENST00000400908       <NA>    45    58   chr1      - demo_feature_2
#> 3:       ENST00000476556       <NA>    80    92   chr1      - demo_feature_3
#>    alt_name database ensembl_peptide_id method
#>      <char>   <char>             <char> <char>
#> 1:     <NA>   manual               <NA> manual
#> 2:     <NA>   manual               <NA> manual
#> 3:     <NA>   manual               <NA> manual
#>                                   name
#>                                 <char>
#> 1: demo_feature_1;chr1:8656030-8656077
#> 2: demo_feature_2;chr1:8656105-8656146
#> 3: demo_feature_3;chr1:8362842-8361798

17.2 Pre-DI Custom Table

Use get_user_data() when you already have sample-level reads/PSI.

example_df <- data.frame(
  event_id = rep("A3SS:1", 8),
  event_type = "A3SS",
  form = rep(c("INC", "EXC"), each = 4),
  gene_id = "ENSG00000158286",
  chr = "chrX",
  strand = "-",
  inc = c(rep("149608626-149608834", 4), rep("149608626-149608829", 4)),
  exc = c(rep("", 4), rep("149608830-149608834", 4)),
  inclusion_reads = c(30, 32, 29, 31, 2, 3, 4, 3),
  exclusion_reads = c(1, 1, 2, 1, 28, 27, 26, 30),
  sample = c("S1", "S2", "S3", "S4", "S1", "S2", "S3", "S4"),
  condition = rep(c("case", "case", "control", "control"), 2),
  stringsAsFactors = FALSE
)
example_df$psi <- example_df$inclusion_reads /
  (example_df$inclusion_reads + example_df$exclusion_reads)

user_data <- get_user_data(example_df)
head(user_data)
#>    event_id event_type   form         gene_id    chr strand                 inc
#>      <char>     <char> <char>          <char> <char> <char>              <char>
#> 1:   A3SS:1       A3SS    INC ENSG00000158286   chrX      - 149608626-149608834
#> 2:   A3SS:1       A3SS    INC ENSG00000158286   chrX      - 149608626-149608834
#> 3:   A3SS:1       A3SS    INC ENSG00000158286   chrX      - 149608626-149608834
#> 4:   A3SS:1       A3SS    INC ENSG00000158286   chrX      - 149608626-149608834
#> 5:   A3SS:1       A3SS    EXC ENSG00000158286   chrX      - 149608626-149608829
#> 6:   A3SS:1       A3SS    EXC ENSG00000158286   chrX      - 149608626-149608829
#>                    exc inclusion_reads exclusion_reads        psi sample
#>                 <char>           <num>           <num>      <num> <char>
#> 1:                                  30               1 0.96774194     S1
#> 2:                                  32               1 0.96969697     S2
#> 3:                                  29               2 0.93548387     S3
#> 4:                                  31               1 0.96875000     S4
#> 5: 149608830-149608834               2              28 0.06666667     S1
#> 6: 149608830-149608834               3              27 0.10000000     S2
#>    condition source_file
#>       <char>      <char>
#> 1:      case            
#> 2:      case            
#> 3:   control            
#> 4:   control            
#> 5:      case            
#> 6:      case

17.3 Post-DI Custom Table

Use get_user_data_post_di() when you already have event-level DI summaries.

example_user_data <- data.frame(
  event_id = rep("A3SS:1", 8),
  event_type = "A3SS",
  gene_id = "ENSG00000158286",
  chr = "chrX",
  strand = "-",
  form = rep(c("INC", "EXC"), each = 4),
  inc = c(rep("149608626-149608834", 4), rep("149608626-149608829", 4)),
  exc = c(rep("", 4), rep("149608830-149608834", 4)),
  stringsAsFactors = FALSE
)

user_res <- get_user_data_post_di(example_user_data)
head(user_res)
#>                                                                  site_id
#>                                                                   <char>
#> 1:                    A3SS|ENSG00000158286|chrX|149608626-149608834||INC
#> 2:                    A3SS|ENSG00000158286|chrX|149608626-149608834||INC
#> 3:                    A3SS|ENSG00000158286|chrX|149608626-149608834||INC
#> 4:                    A3SS|ENSG00000158286|chrX|149608626-149608834||INC
#> 5: A3SS|ENSG00000158286|chrX|149608626-149608829|149608830-149608834|EXC
#> 6: A3SS|ENSG00000158286|chrX|149608626-149608829|149608830-149608834|EXC
#>    event_type event_id         gene_id    chr strand                 inc
#>        <char>   <char>          <char> <char> <char>              <char>
#> 1:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608834
#> 2:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608834
#> 3:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608834
#> 4:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608834
#> 5:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608829
#> 6:       A3SS   A3SS:1 ENSG00000158286   chrX      - 149608626-149608829
#>                    exc n_samples n_control n_case mean_psi_ctrl mean_psi_case
#>                 <char>     <num>     <num>  <num>         <num>         <num>
#> 1:                            -1        -1     -1            -1            -1
#> 2:                            -1        -1     -1            -1            -1
#> 3:                            -1        -1     -1            -1            -1
#> 4:                            -1        -1     -1            -1            -1
#> 5: 149608830-149608834        -1        -1     -1            -1            -1
#> 6: 149608830-149608834        -1        -1     -1            -1            -1
#>    delta_psi p.value  padj cooks_max   form     n n_used
#>        <num>   <num> <num>     <num> <char> <num>  <num>
#> 1:         1       0     0        -1    INC    -1     -1
#> 2:         1       0     0        -1    INC    -1     -1
#> 3:         1       0     0        -1    INC    -1     -1
#> 4:         1       0     0        -1    INC    -1     -1
#> 5:        -1       0     0        -1    EXC    -1     -1
#> 6:        -1       0     0        -1    EXC    -1     -1

17.4 Post-DI rMATS Import

Use get_rmats_post_di() to ingest already-computed rMATS result tables.

rmats_df <- data.frame(
  ID = 1L,
  GeneID = "ENSG00000182871",
  geneSymbol = "COL18A1",
  chr = "chr21",
  strand = "+",
  longExonStart_0base = 45505834L,
  longExonEnd = 45505966L,
  shortES = 45505837L,
  shortEE = 45505966L,
  flankingES = 45505357L,
  flankingEE = 45505431L,
  ID.2 = 2L,
  IJC_SAMPLE_1 = "1,1,1",
  SJC_SAMPLE_1 = "1,1,1",
  IJC_SAMPLE_2 = "1,1,1",
  SJC_SAMPLE_2 = "1,1,1",
  IncFormLen = 52L,
  SkipFormLen = 49L,
  PValue = 0.6967562,
  FDR = 1,
  IncLevel1 = "0.0,0.0,0.0",
  IncLevel2 = "1.0,1.0,1.0",
  IncLevelDifference = 1.0,
  stringsAsFactors = FALSE
)

user_res_rmats <- get_rmats_post_di(rmats_df, event_type = "A3SS")
head(user_res_rmats)
#>    event_id event_type   form         gene_id    chr strand               inc
#>      <char>     <char> <char>          <char> <char> <char>            <char>
#> 1:   A3SS:1       A3SS    INC ENSG00000182871  chr21      + 45505834-45505966
#> 2:   A3SS:1       A3SS    EXC ENSG00000182871  chr21      + 45505837-45505966
#>                  exc   p.value delta_psi  padj
#>               <char>     <num>     <num> <num>
#> 1:                   0.6967562         1     1
#> 2: 45505834-45505836 0.6967562        -1     1

17.5 Transcript-Pair Entry Point

Use compare_transcript_pairs() when you want to start directly from chosen transcript pairs instead of event matching.

tx_ids <- unique(annotation_df$annotations$transcript_id)
tx_ids <- tx_ids[!is.na(tx_ids) & tx_ids != ""]

if (length(tx_ids) < 4L) {
  stop("Need at least four transcripts for transcript-pair example.")
}

transcript_pairs <- data.frame(
  transcript1 = tx_ids[1:2],
  transcript2 = tx_ids[3:4],
  stringsAsFactors = FALSE
)

user_matched <- compare_transcript_pairs(
  transcript_pairs = transcript_pairs,
  annotations = annotation_df$annotations
)
#> 2 out of 2 transcript pairs contained within annotations

head(user_matched)
#>    event_id event_type   form         gene_id    chr strand
#>      <char>     <char> <char>          <char> <char> <char>
#> 1:  TXCMP:1      TXCMP    EXC ENSG00000158286   chr1      +
#> 2:  TXCMP:1      TXCMP    INC ENSG00000158286   chr1      +
#> 3:  TXCMP:2      TXCMP    EXC ENSG00000158286   chr1      +
#> 4:  TXCMP:2      TXCMP    INC ENSG00000158286   chr1      +
#>                                                                inc
#>                                                             <char>
#> 1: 6205475-6206101;6206196-6206302;6206536-6206726;6207379-6208307
#> 2: 6205475-6206101;6206196-6206302;6206536-6206726;6207379-6208307
#> 3:                 6206129-6206302;6206536-6206726;6207379-6207564
#> 4:                 6206129-6206302;6206536-6206726;6207379-6207564
#>                                                                                                                                                                exc
#>                                                                                                                                                             <char>
#> 1:                                                                 6208618-6209025;6209115-6209196;6209268-6209539;6209924-6209970;6210223-6210295;6210370-6210900
#> 2:                                                                 6208618-6209025;6209115-6209196;6209268-6209539;6209924-6209970;6210223-6210295;6210370-6210900
#> 3: 6208860-6209025;6209414-6209539;6209924-6209970;6210223-6210295;6210370-6210438;6211867-6212053;6212218-6212416;6212682-6213183;6218289-6218369;6219236-6220805
#> 4: 6208860-6209025;6209414-6209539;6209924-6209970;6210223-6210295;6210370-6210438;6211867-6212053;6212218-6212416;6212682-6213183;6218289-6218369;6219236-6220805
#>    delta_psi p.value  padj n_samples n_control n_case   transcript_id
#>        <num>   <num> <num>     <num>     <num>  <num>          <char>
#> 1:        -1       0     0         0         0      0 ENST00000485539
#> 2:         1       0     0         0         0      0 ENST00000466994
#> 3:        -1       0     0         0         0      0 ENST00000496676
#> 4:         1       0     0         0         0      0 ENST00000484435
#>                                                                                                                                                              exons
#>                                                                                                                                                             <char>
#> 1:                                                                 ENSE00001934791;ENSE00003528881;ENSE00001948578;ENSE00003560800;ENSE00003516232;ENSE00001921263
#> 2:                                                                                                 ENSE00001843998;ENSE00001829482;ENSE00003686018;ENSE00001935169
#> 3: ENSE00001866172;ENSE00003662600;ENSE00003560800;ENSE00003516232;ENSE00003531305;ENSE00003488296;ENSE00001883823;ENSE00001858590;ENSE00003562079;ENSE00001844921
#> 4:                                                                                                                 ENSE00001551004;ENSE00003686018;ENSE00001828693

18 Session Info

sessionInfo()
#> R version 4.6.0 alpha (2026-04-05 r89794)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] BiocParallel_1.45.0  SpliceImpactR_0.99.4 BiocStyle_2.39.0    
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.6.0               BiocIO_1.21.0              
#>   [3] bitops_1.0-9                ggplotify_0.1.3            
#>   [5] tibble_3.3.1                polyclip_1.10-7            
#>   [7] enrichit_0.1.3              XML_3.99-0.23              
#>   [9] lifecycle_1.0.5             httr2_1.2.2                
#>  [11] pwalign_1.7.0               rstatix_0.7.3              
#>  [13] processx_3.8.7              lattice_0.22-9             
#>  [15] MASS_7.3-65                 backports_1.5.1            
#>  [17] magrittr_2.0.5              sass_0.4.10                
#>  [19] rmarkdown_2.31              jquerylib_0.1.4            
#>  [21] yaml_2.3.12                 otel_0.2.0                 
#>  [23] ggtangle_0.1.1              cowplot_1.2.0              
#>  [25] DBI_1.3.0                   RColorBrewer_1.1-3         
#>  [27] abind_1.4-8                 GenomicRanges_1.63.2       
#>  [29] purrr_1.2.1                 BiocGenerics_0.57.0        
#>  [31] msigdbr_26.1.0              RCurl_1.98-1.18            
#>  [33] yulab.utils_0.2.4           tweenr_2.0.3               
#>  [35] rappdirs_0.3.4              aisdk_1.1.0                
#>  [37] gdtools_0.5.0               IRanges_2.45.0             
#>  [39] S4Vectors_0.49.1            enrichplot_1.31.5          
#>  [41] ggrepel_0.9.8               tidytree_0.4.7             
#>  [43] codetools_0.2-20            DelayedArray_0.37.1        
#>  [45] DOSE_4.5.1                  ggforce_0.5.0              
#>  [47] tidyselect_1.2.1            aplot_0.2.9                
#>  [49] farver_2.1.2                matrixStats_1.5.0          
#>  [51] stats4_4.6.0                Seqinfo_1.1.0              
#>  [53] GenomicAlignments_1.47.0    jsonlite_2.0.0             
#>  [55] Formula_1.2-5               systemfonts_1.3.2          
#>  [57] tools_4.6.0                 ggnewscale_0.5.2           
#>  [59] treeio_1.35.0               PFAM.db_3.22.0             
#>  [61] Rcpp_1.1.1                  glue_1.8.0                 
#>  [63] gridExtra_2.3               SparseArray_1.11.13        
#>  [65] xfun_0.57                   qvalue_2.43.0              
#>  [67] MatrixGenerics_1.23.0       dplyr_1.2.1                
#>  [69] withr_3.0.2                 BiocManager_1.30.27        
#>  [71] fastmap_1.2.0               callr_3.7.6                
#>  [73] digest_0.6.39               R6_2.6.1                   
#>  [75] gridGraphics_0.5-1          GO.db_3.23.1               
#>  [77] dichromat_2.0-0.1           RSQLite_2.4.6              
#>  [79] cigarillo_1.1.0             tidyr_1.3.2                
#>  [81] generics_0.1.4              fontLiberation_0.1.0       
#>  [83] data.table_1.18.2.1         rtracklayer_1.71.3         
#>  [85] httr_1.4.8                  htmlwidgets_1.6.4          
#>  [87] S4Arrays_1.11.1             scatterpie_0.2.6           
#>  [89] pkgconfig_2.0.3             gtable_0.3.6               
#>  [91] blob_1.3.0                  S7_0.2.1                   
#>  [93] XVector_0.51.0              clusterProfiler_4.19.7     
#>  [95] htmltools_0.5.9             fontBitstreamVera_0.1.1    
#>  [97] carData_3.0-6               bookdown_0.46              
#>  [99] scales_1.4.0                Biobase_2.71.0             
#> [101] png_0.1-9                   ggfun_0.2.0                
#> [103] knitr_1.51                  reshape2_1.4.5             
#> [105] rjson_0.2.23                nlme_3.1-169               
#> [107] curl_7.0.0                  org.Hs.eg.db_3.23.1        
#> [109] cachem_1.1.0                stringr_1.6.0              
#> [111] parallel_4.6.0              AnnotationDbi_1.73.1       
#> [113] restfulr_0.0.16             pillar_1.11.1              
#> [115] grid_4.6.0                  vctrs_0.7.2                
#> [117] ggpubr_0.6.3                car_3.1-5                  
#> [119] tidydr_0.0.6                cluster_2.1.8.2            
#> [121] evaluate_1.0.5              tinytex_0.59               
#> [123] magick_2.9.1                cli_3.6.5                  
#> [125] compiler_4.6.0              Rsamtools_2.27.1           
#> [127] rlang_1.2.0                 crayon_1.5.3               
#> [129] ggsignif_0.6.4              labeling_0.4.3             
#> [131] ps_1.9.2                    plyr_1.8.9                 
#> [133] fs_2.0.1                    ggiraph_0.9.6              
#> [135] stringi_1.8.7               babelgene_22.9             
#> [137] assertthat_0.2.1            Biostrings_2.79.5          
#> [139] lazyeval_0.2.3              GOSemSim_2.37.2            
#> [141] fontquiver_0.2.1            Matrix_1.7-5               
#> [143] patchwork_1.3.2             bit64_4.6.0-1              
#> [145] ggplot2_4.0.2               KEGGREST_1.51.1            
#> [147] SummarizedExperiment_1.41.1 igraph_2.2.3               
#> [149] broom_1.0.12                memoise_2.0.1              
#> [151] bslib_0.10.0                ggtree_4.1.1               
#> [153] bit_4.6.0                   ape_5.8-1                  
#> [155] gson_0.1.0

SpliceImpactR

April 8, 2026

Package

Contents

1 Overview

1.1 Installation

1.1.1 Option 1: BiocManager

1.1.2 Option 2: devtools (GitHub)

1.2 External tools and acronyms

1.3 Workflow map

2 Load Reference Resources

3 Build Sample Manifest

3.1 Quick start wrapper (`get_splicing_impact`)

4 Read Splicing Events

5 Sample-Level QC and Exploration

5.1 Compare HITindex Between Conditions

5.2 PSI Distribution Overview

6 Differential Inclusion (DI)

7 Match Events to Sequences and Build Transcript Pairs

8 Sequence-Level Consequences

9 Domain-Level Changes and Domain Enrichment

10 PPI Rewiring

11 Gene Enrichment Foreground Probes

12 Visualize Transcript and Protein Consequences

13 Inspect a Single Event

14 Integrative Visualization

15 Output Description (`hits_final`, `data`, `res`)

15.1 `hits_final` (integrated event-level output)

15.2 `data` (raw sample-level input table)

15.3 `res` (differential inclusion output)

16 S4 Applications and Accessors

16.1 S4-first Main Workflow (Code Only)

17 Custom Input Entry Points

17.1 Add Manual Protein Features

17.2 Pre-DI Custom Table

17.3 Post-DI Custom Table

17.4 Post-DI rMATS Import

17.5 Transcript-Pair Entry Point

18 Session Info

SpliceImpactR

April 8, 2026

Package

Contents

1 Overview

1.1 Installation

1.1.1 Option 1: BiocManager

1.1.2 Option 2: devtools (GitHub)

1.2 External tools and acronyms

1.3 Workflow map

2 Load Reference Resources

3 Build Sample Manifest

3.1 Quick start wrapper (get_splicing_impact)

4 Read Splicing Events

5 Sample-Level QC and Exploration

5.1 Compare HITindex Between Conditions

5.2 PSI Distribution Overview

6 Differential Inclusion (DI)

7 Match Events to Sequences and Build Transcript Pairs

8 Sequence-Level Consequences

9 Domain-Level Changes and Domain Enrichment

10 PPI Rewiring

11 Gene Enrichment Foreground Probes

12 Visualize Transcript and Protein Consequences

13 Inspect a Single Event

14 Integrative Visualization

15 Output Description (hits_final, data, res)

15.1 hits_final (integrated event-level output)

15.2 data (raw sample-level input table)

15.3 res (differential inclusion output)

16 S4 Applications and Accessors

16.1 S4-first Main Workflow (Code Only)

17 Custom Input Entry Points

17.1 Add Manual Protein Features

17.2 Pre-DI Custom Table

17.3 Post-DI Custom Table

17.4 Post-DI rMATS Import

17.5 Transcript-Pair Entry Point

18 Session Info

3.1 Quick start wrapper (`get_splicing_impact`)

15 Output Description (`hits_final`, `data`, `res`)

15.1 `hits_final` (integrated event-level output)

15.2 `data` (raw sample-level input table)

15.3 `res` (differential inclusion output)