% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/subsampleCounts.R
\name{subsampleCounts}
\alias{subsampleCounts}
\alias{rarifyCounts}
\alias{subsampleCounts,SummarizedExperiment-method}
\title{Subsample Counts}
\usage{
subsampleCounts(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  min_size = min(colSums2(assay(x))),
  replace = TRUE,
  name = "subsampled",
  verbose = TRUE,
  ...
)

\S4method{subsampleCounts}{SummarizedExperiment}(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  min_size = min(colSums2(assay(x))),
  replace = TRUE,
  name = "subsampled",
  verbose = TRUE,
  ...
)
}
\arguments{
\item{x}{A
\code{SummarizedExperiment} object.}

\item{assay.type}{A single character value selecting the
\code{SummarizedExperiment} \code{assay} used for random subsampling.
Only raw counts are meaningful as input; transformed data will give
meaningless output.}

\item{assay_name}{A single \code{character} value specifying which
assay to use for the calculation.
(Please use \code{assay.type} instead. At some point \code{assay_name}
will be disabled.)}

\item{min_size}{A single integer value giving the number of counts to
draw for each sample. This can be the lowest total count found in any
sample (the default) or a user-specified number.}

\item{replace}{Logical. Whether to subsample with replacement. The default
is \code{TRUE} (with replacement).
See \code{\link[phyloseq:rarefy_even_depth]{phyloseq::rarefy_even_depth}}
for details on implications of this parameter.}

\item{name}{A single character value specifying the name of the subsampled
abundance table.}

\item{verbose}{Logical. Default is \code{TRUE}. When \code{TRUE}, an additional
message about the random number used is printed.}

\item{...}{Additional arguments (not used).}
}
\value{
\code{subsampleCounts} returns \code{x} with subsampled data.
}
\description{
\code{subsampleCounts} randomly subsamples counts in a
\code{SummarizedExperiment} and returns a modified object in which each
sample has the same number of total observations/counts/reads.
}
\details{
Although the subsampling approach is highly debated in microbiome research,
we include the \code{subsampleCounts} function because there may be some
instances where it can be useful.
Note that the output of \code{subsampleCounts} is not equivalent to the
input, and any results have to be verified against the original dataset.
To maintain reproducibility, please set the seed using \code{set.seed()}
before running this function.
}
\examples{
# Samples in the TreeSE whose total counts are less than the specified
# min_size will be removed. Features that are not present in any of the
# samples after subsampling will also be removed.
data(GlobalPatterns)
tse <- GlobalPatterns
set.seed(123)
tse.subsampled <- subsampleCounts(tse,
                                  min_size = 60000,
                                  name = "subsampled")
tse.subsampled
dim(tse)
dim(tse.subsampled)
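
# A minimal sketch, not part of the original example: assuming the
# subsampled table is stored as an assay under the value given to `name`
# ("subsampled" here), every retained sample should have exactly
# min_size total counts.
subsampled.totals <- colSums(assay(tse.subsampled, "subsampled"))
all(subsampled.totals == 60000)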

}
\references{
McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data
is inadmissible. PLoS computational biology. 2014 Apr 3;10(4):e1003531.

Gloor GB, Macklaim JM, Pawlowsky-Glahn V & Egozcue JJ (2017)
Microbiome Datasets Are Compositional: And This Is Not Optional.
Frontiers in Microbiology 8: 2224. doi: 10.3389/fmicb.2017.02224

Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, Lozupone C,
Zaneveld JR, Vázquez-Baeza Y, Birmingham A, Hyde ER. Normalization and
microbial differential abundance strategies depend upon data characteristics.
Microbiome. 2017 Dec;5(1):1-8.
}
\author{
Sudarshan A. Shetty and Felix G.M. Ernst
}
