% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/filterData.R
\name{filterData}
\alias{filterData}
\title{Filter the positions of interest}
\usage{
filterData(
  data,
  cutoff = NULL,
  index = NULL,
  filter = "one",
  totalMapped = NULL,
  targetSize = 8e+07,
  ...
)
}
\arguments{
\item{data}{Either a list of Rle objects or a DataFrame with the coverage
information.}

\item{cutoff}{The base-pair level cutoff to use. Its behavior is controlled
by \code{filter}.}

\item{index}{A logical Rle with the positions of the chromosome that passed
the cutoff. If \code{NULL} it is assumed that this is the first time using
\link{filterData} and thus no previous index exists.}

\item{filter}{Has to be either \code{'one'} (default) or \code{'mean'}. In
the first case, at least one sample has to have coverage above \code{cutoff}.
In the second case, the mean coverage has to be greater than \code{cutoff}.}

\item{totalMapped}{A vector with the total number of reads mapped for each
sample. The vector should be in the same order as the samples in \code{data}.
When provided, the coverage is adjusted to a library of \code{targetSize}
reads prior to filtering. See \link{getTotalMapped} for
calculating this vector.}

\item{targetSize}{The target library size to adjust the coverage to. Used
only when \code{totalMapped} is specified. By default, it adjusts to
libraries with 80 million reads.}

\item{...}{Arguments passed to other methods and/or advanced arguments.
Advanced arguments:
\describe{
\item{verbose }{ If \code{TRUE} basic status updates will be printed along
the way.}
\item{returnMean }{ If \code{TRUE} the mean coverage is included in the
result. \code{FALSE} by default.}
\item{returnCoverage }{ If \code{TRUE}, the coverage DataFrame is returned.
\code{TRUE} by default.}
\item{colnames }{ Specifies the column names to be used for the results
DataFrame. If \code{NULL}, names from \code{data} are used.}
\item{smoothMean }{ Whether to smooth the mean. Used only when
\code{filter = 'mean'}. This option is used internally by
\link{regionMatrix}.}
}
Other arguments are passed to the internal function \code{.smootherFstats};
see \link{findRegions}.}
}
\value{
A list with up to three components.

\describe{
\item{coverage }{ is a DataFrame object where each column represents a
sample. The number of rows depends on the number of base pairs that passed
the cutoff and the information stored is the coverage at that given base.
Included only when \code{returnCoverage = TRUE}.}
\item{position }{ is a logical Rle with the positions of the chromosome
that passed the cutoff.}
\item{meanCoverage }{ is a numeric Rle with the mean coverage at each base.
Included only when \code{returnMean = TRUE}.}
}
}
\description{
For a group of samples this function reads the coverage information for a
specific chromosome directly from the BAM files. It then merges them into a
DataFrame and removes the bases that do not pass the cutoff. This is a
helper function for \link{loadCoverage} and \link{preprocessCoverage}.
}
\details{
If \code{cutoff} is \code{NULL} then the data is grouped into a
DataFrame without applying any cutoffs. This can be useful if you want to
use \link{loadCoverage} to build the coverage DataFrame without applying any
cutoffs for other downstream purposes like plotting the coverage values of a
given region. You can always specify the \code{colsubset} argument in
\link{preprocessCoverage} to filter the data before calculating the F
statistics.
}
\examples{
## Construct some toy data
library("IRanges")
x <- Rle(round(runif(1e4, max = 10)))
y <- Rle(round(runif(1e4, max = 10)))
z <- Rle(round(runif(1e4, max = 10)))
DF <- DataFrame(x, y, z)

## Filter the data
filt1 <- filterData(DF, 5)
filt1

## Filter again but only using the first two samples
filt2 <- filterData(filt1$coverage[, 1:2], 5, index = filt1$position)
filt2
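
## The calls below are a sketch based only on the arguments documented
## above. They reuse the toy DF built at the start of these examples.

## Filter on the mean coverage across samples instead of requiring at least
## one sample above the cutoff, and also return the mean coverage
filt3 <- filterData(DF, 5, filter = "mean", returnMean = TRUE)
filt3$meanCoverage

## Adjust the coverage to a common library size before filtering.
## The totalMapped values are made up for illustration; in practice they
## would come from getTotalMapped(). One value per sample, in column order.
filt4 <- filterData(DF, 5,
    totalMapped = c(7e7, 8e7, 9e7),
    targetSize = 8e7
)
filt4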
}
\seealso{
\link{loadCoverage}, \link{preprocessCoverage},
\link{getTotalMapped}
}
\author{
Leonardo Collado-Torres
}
