% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/generateRmdCodeDiffExpPhylo.R
\name{DESeq2.length.createRmd}
\alias{DESeq2.length.createRmd}
\title{Generate a \code{.Rmd} file containing code to perform differential expression analysis with DESeq2 and a custom model matrix}
\usage{
DESeq2.length.createRmd(
  data.path,
  result.path,
  codefile,
  fit.type,
  test,
  beta.prior = TRUE,
  independent.filtering = TRUE,
  cooks.cutoff = TRUE,
  impute.outliers = TRUE,
  extra.design.covariates = NULL,
  nas.as.ones = FALSE
)
}
\arguments{
\item{data.path}{The path to a \code{.rds} file containing the \code{phyloCompData} object that will be used for the differential expression analysis.}

\item{result.path}{The path to the file where the result object will be saved.}

\item{codefile}{The path to the file where the code will be written.}

\item{fit.type}{The fitting method used to get the dispersion-mean relationship. Possible values are \code{"parametric"}, \code{"local"} and \code{"mean"}.}

\item{test}{The test to use. Possible values are \code{"Wald"} and \code{"LRT"}.}

\item{beta.prior}{Whether or not to put a zero-mean normal prior on the non-intercept coefficients. Default is \code{TRUE}.}

\item{independent.filtering}{Whether or not to perform independent filtering of the data. With \code{independent.filtering = TRUE}, the adjusted p-values for genes not passing the filter threshold are set to \code{NA}. Default is \code{TRUE}.}

\item{cooks.cutoff}{The cutoff value on the Cook's distance above which a value is considered an outlier. Set to \code{Inf} or \code{FALSE} to disable outlier detection. For genes with detected outliers, the p-value and adjusted p-value will be set to \code{NA}. Default is \code{TRUE}.}

\item{impute.outliers}{Whether or not the outliers should be replaced by a trimmed mean and the analysis rerun. Default is \code{TRUE}.}

\item{extra.design.covariates}{A vector containing the names of extra control variables to be passed to the design matrix of \code{DESeq2}. Each covariate must be a column of the \code{sample.annotations} data frame from the \code{\link{phyloCompData}} object, with a matching column name. A covariate can be a numeric vector or a factor. Note that the "condition" factor column is always included, and should not be added here. See Details.}

\item{nas.as.ones}{Whether or not adjusted p-values that are returned as \code{NA} by \code{DESeq2} should be set to \code{1}. This option is useful for comparisons with other methods. For more details, see section "I want to benchmark DESeq2 comparing to other DE tools" from the \code{DESeq2} vignette (available by running \code{vignette("DESeq2", package = "DESeq2")}). Default is \code{FALSE}.}
}
\value{
The function generates a \code{.Rmd} file containing the code for performing the differential expression analysis. This file can be executed using e.g. the \code{knitr} package.
}
\description{
A function to generate code that can be run to perform differential expression analysis of RNA-seq data (comparing two conditions) using the DESeq2 package. The code is written to a \code{.Rmd} file. This function is generally not called directly by the user; the main interface for performing differential expression analysis is the \code{\link{runDiffExp}} function.
}
\details{
For more information about the methods and the interpretation of the parameters, see the \code{DESeq2} package and the corresponding publications.


The lengths matrix is used as a normalization factor and applied to the \code{DESeq2}
model as described in \code{\link[DESeq2]{normalizationFactors}}
(see the examples of that function).
The provided matrix is multiplied by the default normalization factor
obtained through the \code{\link[DESeq2]{estimateSizeFactors}} function.
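
As a minimal sketch of this scheme (not the exact code written to the \code{.Rmd} file),
the following assumes a hypothetical, strictly positive lengths matrix \code{len} with one row
per gene and one column per sample. The row centering follows the example in
\code{\link[DESeq2]{normalizationFactors}}; how the generated code combines the two
factors internally is an assumption here:

\preformatted{
library(DESeq2)

## Toy data set from DESeq2 (100 genes, 4 samples)
dds <- makeExampleDESeqDataSet(n = 100, m = 4)

## Hypothetical lengths matrix, same dimensions and names as the counts
len <- matrix(runif(nrow(dds) * ncol(dds), 500, 1500),
              nrow = nrow(dds), ncol = ncol(dds),
              dimnames = dimnames(dds))

## Center each row so that its geometric mean is 1
len <- len / exp(rowMeans(log(len)))

## Multiply each column by the default per-sample size factor
dds <- estimateSizeFactors(dds)
norm.factors <- sweep(len, 2, sizeFactors(dds), "*")

## Gene- and sample-specific factors take precedence over size factors
normalizationFactors(dds) <- norm.factors
}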

The \code{design} formula passed to \code{\link[DESeq2]{DESeqDataSetFromMatrix}}
uses the "condition" column of the \code{sample.annotations} data frame from the \code{\link{phyloCompData}} object
as well as all the covariates named in \code{extra.design.covariates}.
For example, if \code{extra.design.covariates = c("var1", "var2")}, then
\code{sample.annotations} must have two columns named "var1" and "var2", and the design formula
in the \code{\link[DESeq2]{DESeqDataSetFromMatrix}} function will be:
\code{~ condition + var1 + var2}.
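
One way to build such a formula is sketched below. The count matrix and annotation
data frame are made up for illustration, and the use of \code{reformulate} is an assumption,
not necessarily how the generated code constructs the design:

\preformatted{
library(DESeq2)

extra.design.covariates <- c("var1", "var2")

## Hypothetical counts (10 genes, 6 samples) and sample annotations
counts <- matrix(rpois(60, 50), nrow = 10,
                 dimnames = list(paste0("g", 1:10), paste0("s", 1:6)))
annot <- data.frame(condition = factor(rep(1:2, each = 3)),
                    var1 = factor(rep(1:2, times = 3)),
                    var2 = rnorm(6),
                    row.names = colnames(counts))

## "condition" always comes first, followed by the extra covariates
design.formula <- reformulate(c("condition", extra.design.covariates))

dds <- DESeqDataSetFromMatrix(countData = counts, colData = annot,
                              design = design.formula)
}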
}
\examples{
try(
if (require(DESeq2)) {
tmpdir <- normalizePath(tempdir(), winslash = "/")
## Simulate data
mydata.obj <- generateSyntheticData(dataset = "mydata", n.vars = 1000, 
                                    samples.per.cond = 5, n.diffexp = 100, 
                                    id.species = 1:10,
                                    lengths.relmeans = rpois(1000, 1000),
                                    lengths.dispersions = rgamma(1000, 1, 1),
                                    output.file = file.path(tmpdir, "mydata.rds"))
## Add covariates
## Model fitted is count.matrix ~ condition + test_factor + test_reg
sample.annotations(mydata.obj)$test_factor <- factor(rep(1:2, each = 5))
sample.annotations(mydata.obj)$test_reg <- rnorm(10, 0, 1)
saveRDS(mydata.obj, file.path(tmpdir, "mydata.rds"))
## Diff Exp
runDiffExp(data.file = file.path(tmpdir, "mydata.rds"), result.extent = "DESeq2", 
           Rmdfunction = "DESeq2.length.createRmd", 
           output.directory = tmpdir, fit.type = "parametric",
           test = "Wald",
           extra.design.covariates = c("test_factor", "test_reg"))
})
}
\references{
Anders S and Huber W (2010): Differential expression analysis for sequence count data. Genome Biology 11:R106

Love MI, Huber W and Anders S (2014): Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15:550. doi:10.1186/s13059-014-0550-8
}
\author{
Charlotte Soneson, Paul Bastide, Mélina Gallopin
}
