% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/convertIdentifiers.R
\name{convertIdentifiers}
\alias{convertIdentifiers}
\alias{convertIdentifiers,BiocSet-method}
\alias{convertIdentifiers,GeneSetDb-method}
\title{Converts internal feature identifiers in a GeneSetDb to a set of new ones.}
\usage{
convertIdentifiers(
  x,
  from = NULL,
  to = NULL,
  id.type = c("ensembl", "entrez", "symbol"),
  xref = NULL,
  extra.cols = NULL,
  allow.cartesian = FALSE,
  method = c("orthogene", "babelgene"),
  min_support = 3,
  top = TRUE,
  ...
)

\S4method{convertIdentifiers}{BiocSet}(
  x,
  from = NULL,
  to = NULL,
  id.type = c("ensembl", "entrez", "symbol"),
  xref = NULL,
  extra.cols = NULL,
  allow.cartesian = FALSE,
  method = c("orthogene", "babelgene"),
  min_support = 3,
  top = TRUE,
  ...
)

\S4method{convertIdentifiers}{GeneSetDb}(
  x,
  from = NULL,
  to = NULL,
  id.type = c("ensembl", "entrez", "symbol"),
  xref = NULL,
  extra.cols = NULL,
  allow.cartesian = FALSE,
  method = c("orthogene", "babelgene"),
  min_support = 3,
  top = TRUE,
  ...
)
}
\arguments{
\item{x}{The GeneSetDb with identifiers to convert}

\item{from, to}{If you are doing identifier and/or species conversion using
babelgene, \code{to} is the species you want to convert to, and \code{from} is the
species of \code{x}. If you are only doing id type conversion within the same
species, specify the current species in \code{from}.
If you are providing a data.frame map of identifiers in \code{xref}, \code{to} is
the name of the column that holds the new identifiers, and \code{from} is the
name of the column that holds the current identifiers.}

\item{id.type}{If you are using babelgene conversion, this specifies the
type of identifier you want to convert to. It can be any of \code{"ensembl"},
\code{"entrez"}, or \code{"symbol"}.}

\item{xref}{a data.frame used to map current identifiers to target ones.}

\item{extra.cols}{a character vector of columns from \code{to} to add to the
features of the new GeneSetDb. If you want to keep the original identifiers
of the remapped features, include \code{"original_id"} as one of the values
here.}

\item{allow.cartesian}{a boolean used to temporarily set the
\code{datatable.allow.cartesian} global option. If you are doing a 1:many
map of your identifiers, the join may trigger data.table's cartesian-join
error. You can temporarily turn this option/error off by setting
\code{allow.cartesian = TRUE}. The option will be restored to its
"pre-function call" value \code{on.exit}.}

\item{method}{The method used to convert identifiers, either \code{"orthogene"} or
\code{"babelgene"}. \code{"orthogene"} (the default) is more powerful, supports
more organisms, and (unlike \code{"babelgene"}) can map between any two
arbitrary species -- babelgene requires one of the species in the mapping
to be human. The downside to \code{"orthogene"} is that you need internet access
to run it.}

\item{min_support, top}{Parameters used in the internal call to
\code{\link[babelgene:orthologs]{babelgene::orthologs()}}}

\item{...}{pass through args (not used)}
}
\value{
A new GeneSetDb object with converted identifiers. We try to retain
any metadata in the original object, but no guarantees are given. If
\code{id_type} was stored previously in the collectionMetadata, that will be
dropped.
}
\description{
The various GeneSetDb data providers (MSigDb, KEGG, etc.) limit the
identifier types that they return. Use this function to map the given
identifiers to whichever type you like.
}
\details{
For best results, provide your own identifier mapping reference, but we
provide a convenience wrapper around the \code{\link[babelgene:orthologs]{babelgene::orthologs()}} function to
change between identifier types and species.

When there are multiple target ids for a source id, they will all be
returned. When there is no target id for a source id, the source feature
will be dropped.
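
The mapping behavior described above can be illustrated with plain data
frames. This is only a sketch of the semantics using base R's \code{merge()},
not the package's internal implementation; the identifiers and column names
are made up for illustration.
\preformatted{
# Hypothetical source features and a hypothetical mapping table.
features <- data.frame(feature_id = c("g1", "g2", "g3"))
xref <- data.frame(
  current_id = c("g1", "g1", "g2"),
  new_id = c("A", "B", "C"))

# An inner join keeps every target for "g1" (1:many) and drops "g3",
# which has no target identifier.
mapped <- merge(features, xref, by.x = "feature_id", by.y = "current_id")
mapped
}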
}
\section{Methods (by class)}{
\itemize{
\item \code{convertIdentifiers(BiocSet)}: converts identifiers in a BiocSet

\item \code{convertIdentifiers(GeneSetDb)}: converts identifiers in a GeneSetDb

}}
\section{Custom Mapping}{

You need to provide a data.frame via the \code{xref} parameter that has a column
for the current identifiers and another column for the target identifiers.
The columns are specified by the \code{from} and \code{to} parameters, respectively.
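
Before passing a table as \code{xref}, it can help to confirm that it has the
shape described above. The check below is a base R sketch;
\code{check_xref()} is a hypothetical helper, not part of this package, and
the column names are illustrative. Dropping incomplete and duplicated rows is
an assumption about what makes a tidy mapping table, not a documented
requirement.
\preformatted{
# Hypothetical helper: validate and tidy a two-column mapping table.
check_xref <- function(xref, from, to) {
  stopifnot(
    is.data.frame(xref),
    is.character(from), length(from) == 1L,
    is.character(to), length(to) == 1L,
    from != to)
  missing_cols <- setdiff(c(from, to), colnames(xref))
  if (length(missing_cols) > 0L) {
    stop("xref is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Drop rows that cannot map anything, then exact duplicate pairs.
  keep <- !is.na(xref[[from]]) & !is.na(xref[[to]])
  unique(xref[keep, c(from, to), drop = FALSE])
}

xref <- data.frame(
  current_id = c("g1", "g1", "g2", NA),
  new_id = c("A", "A", "C", "D"))
clean <- check_xref(xref, from = "current_id", to = "new_id")
}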
}

\section{Convenience identifier and species mapping}{

If you don't provide a data.frame, you can provide a species name. We will
rely on the \code{{babelgene}} package for the conversion, so you will have to
provide a species name that it recognizes.
}

\section{Species and Identifier Conversion via babelgene}{

We plan to provide a quick wrapper to babelgene's ortholog mapping function
to make identifier conversion easier through this function. You can track
this in \href{https://github.com/lianos/sparrow/issues/2}{sparrow issue #2}.
}

\section{Species and Identifier Conversion via orthogene}{

Babelgene is great, but it does not support all species (cynomolgus monkeys,
for example). We can rely on the orthogene package for those. The downside to
orthogene is that it requires online access.
}

\examples{
# You can convert the identifiers within a GeneSetDb to some other type
# by providing a "translation" table. Check out the unit tests for more
# examples.
gdb <- exampleGeneSetDb() # this has no symbols in it

# Define a silly conversion table.
xref <- data.frame(
  current_id = featureIds(gdb),
  new_id = paste0(featureIds(gdb), "_symbol"))
gdb2 <- convertIdentifiers(gdb, from = "current_id", to = "new_id",
                           xref = xref, extra.cols = "original_id")
geneSet(gdb2, name = "BIOCARTA_AGPCR_PATHWAY")

# Convert entrez to ensembl ids using babelgene
\dontrun{
# The conversion functionality via babelgene isn't yet implemented, but
# will look like this.

# 1. convert the human entrez identifiers to ensembl
gdb.ens <- convertIdentifiers(gdb, "human", id.type = "ensembl")

# 2. convert the human entrez to mouse entrez
gdb.entm <- convertIdentifiers(gdb, "human", "mouse", id.type = "entrez")

# 3. convert the human entrez to mouse ensembl
gdb.ensm <- convertIdentifiers(gdb, "human", "mouse", id.type = "ensembl")
}
}
