% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/easyRNASeq-synthetic-transcripts.R
\name{createSyntheticTranscripts,AnnotParamCharacter-method}
\alias{createSyntheticTranscripts,AnnotParamCharacter-method}
\alias{createSyntheticTranscripts}
\alias{createSyntheticTranscripts,character-method}
\title{Methods to create synthetic transcripts}
\usage{
\S4method{createSyntheticTranscripts}{AnnotParamCharacter}(
  obj,
  features = c("mRNA", "miRNA", "tRNA", "transcript"),
  verbose = TRUE
)

\S4method{createSyntheticTranscripts}{character}(
  obj,
  features = c("mRNA", "miRNA", "tRNA", "transcript"),
  verbose = TRUE,
  output = c("Genome_intervals", "GRanges"),
  input = c("gff3", "gtf")
)
}
\arguments{
\item{obj}{a \code{\linkS4class{AnnotParamCharacter}} object or the
annotation filename as a \code{character} string}

\item{features}{one or more of 'mRNA', 'miRNA', 'tRNA', 'transcript'}

\item{verbose}{increase the verbosity (default TRUE)}

\item{output}{the output type, one of 'Genome_intervals' or 'GRanges'}

\item{input}{the type of input, one of 'gff3' or 'gtf'}
}
\value{
The return value depends on the class of \code{obj}:
\itemize{
  \item \code{AnnotParamCharacter}: an \code{AnnotParamObject} object
  \item a \code{character} filename: depending on the selected \code{output}
  value, a \code{\link[genomeIntervals:Genome_intervals-class]{Genome_intervals}}
  or a \code{\linkS4class{GRanges}} object.
}
}
\description{
This function creates a set of synthetic transcripts from a provided
annotation file in "gff3" or "gtf" format. As detailed in
\url{http://www.epigenesys.eu/en/protocols/bio-informatics/1283-guidelines-for-rna-seq-data-analysis},
one major caveat of estimating gene expression using aligned RNA-Seq reads
is that a single read, which originated from a single mRNA molecule, might
sometimes align to several features (e.g. transcripts or genes) with
alignments of equivalent quality. This, for example, might happen as a result
of gene duplication and the presence of repetitive or common domains.
To avoid counting unique mRNA fragments multiple times, the
stringent approach is to keep only uniquely mapping reads, while being aware
of the potential consequences. "Multiple counting" can arise not only from
biological causes, but also from technical artifacts, introduced mostly
by poorly formatted gff3/gtf annotation files. To avoid this, it is best
practice to adopt a conservative approach by collapsing all existing
transcripts of a single gene locus into a "synthetic" transcript containing
every exon of that gene. In the case of overlapping exons, the longest
genomic interval is kept, i.e. an artificial exon is created. This process
results in a flattened transcript, i.e. a gene structure with a one (gene) to
one (transcript) relationship.
}
\details{
The \code{createSyntheticTranscripts} function implements this, taking
advantage of the hierarchical structure of the gff3/gtf file. Exon
features are related to their parent transcript, which itself derives
from its parent gene. Using this relationship, exons are combined per gene
into a flattened transcript structure. Note that this might not avoid multiple
counting if genes overlap on opposing strands. In such cases, only
strand-specific sequencing data can disentangle the overlapping genes.

As gff3/gtf files can contain a large number of feature types, the
\code{createSyntheticTranscripts} function currently only supports: \emph{mRNA},
\emph{miRNA}, \emph{tRNA} and \emph{transcript}. Please contact me if you
need additional features to be considered. Note, however, that I will only
add features that are part of the \url{http://www.sequenceontology.org} SOFA
(SO_Feature_Annotation) ontology.
}
\examples{

  # get the example file
  Dm.gtf <- fetchData("Drosophila_melanogaster.BDGP5.77.with-chr.gtf.gz")

  # create the AnnotParam
  annotParam <- AnnotParam(
    datasource=Dm.gtf,
    type="gtf")

  # create the synthetic transcripts
  annotParam <- createSyntheticTranscripts(annotParam,verbose=FALSE)
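
  # A minimal sketch of the character method, assuming the same example file:
  # the annotation filename is passed directly, and the 'input' and 'output'
  # arguments select the file format and the class of the returned object.
  # The variable name 'synthTrx' is only illustrative.
  synthTrx <- createSyntheticTranscripts(
    Dm.gtf,
    input="gtf",
    output="GRanges",
    verbose=FALSE)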

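  # An illustration of the "flattening" idea only, not necessarily how
  # createSyntheticTranscripts is implemented: overlapping exons of a made-up
  # gene are merged into artificial exons using reduce() from GenomicRanges.
  # The coordinates and the 'exons' object are hypothetical.
  library(GenomicRanges)
  exons <- GRanges(
    seqnames="chr1",
    ranges=IRanges(start=c(100, 150, 400), end=c(200, 300, 500)),
    strand="+")
  # the first two exons overlap and are merged; the third is kept as is
  reduce(exons)
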
}
\seealso{
\itemize{
\item{For the input:
\itemize{
\item \code{\linkS4class{AnnotParam}}
}}
\item{For the output:
\itemize{
\item \code{\linkS4class{AnnotParam}}
\item \code{\link[genomeIntervals:Genome_intervals-class]{Genome_intervals}}
\item \code{\linkS4class{GRanges}}
}}}
}
\author{
Nicolas Delhomme
}
\keyword{methods}
