\name{seeFastq}
\alias{seeFastq}
\alias{seeFastqPlot}
\title{
Quality reports for FASTQ files
}
\description{
The \code{seeFastq} and \code{seeFastqPlot} functions compute and plot a series of
quality statistics for a set of FASTQ files, including per-cycle quality
box plots, base proportions, base-level quality trends, relative k-mer
diversity, length and occurrence distribution of reads, number of reads above
quality cutoffs, and mean quality distribution. The functions allow processing
of reads with variable length, but most plots are only meaningful if the read
positions in the FASTQ file are aligned with the sequencing cycles. For
instance, constant length clipping of the reads on either end or variable
length clipping on the 3' end maintains this relationship, while variable
length clipping on the 5' end without reversing the reads erases it.

The function \code{seeFastq} computes the summary statistics and stores them in a relatively
small list object that can be saved to disk with \code{save()} and reloaded with
\code{load()} for later plotting. The argument \code{klength} specifies the k-mer length and
\code{batchsize} the number of reads to randomly sample from each FASTQ file.}
\usage{
seeFastq(fastq, batchsize, klength = 8)

seeFastqPlot(fqlist, arrange = c(1, 2, 3, 4, 5, 8, 6, 7), ...)
}
\arguments{
  \item{fastq}{
	Named character vector containing the paths to the FASTQ files in the data fields and the sample labels in the name slots.
}
  \item{batchsize}{
	Number of reads to randomly sample from each FASTQ file for the QC analysis. Smaller numbers reduce the memory footprint and compute time.
}
  \item{klength}{
	Specifies the k-mer length in the plot for the relative k-mer diversity.
}
  \item{fqlist}{
	\code{list} object returned by \code{seeFastq()}.
}
  \item{arrange}{
	Integer vector with values from 1 to 8 specifying the row order of the QC plot. Dropping numbers eliminates the corresponding plots.
}
  \item{\dots}{
	Additional plotting arguments to pass on to \code{seeFastqPlot()}.
}
}
\value{
	The function \code{seeFastq} returns the summary statistics in a \code{list} containing all information required for the quality plots.
	The function \code{seeFastqPlot} plots the information generated by \code{seeFastq} using \code{ggplot2}.
}
\author{
Thomas Girke
}
\examples{
\dontrun{
targets <- system.file("extdata", "targets.txt", package="systemPipeR")
dir_path <- system.file("extdata/cwl", package="systemPipeR")
args <- loadWorkflow(targets=targets, wf_file="hisat2/hisat2-mapping-se.cwl", 
                  input_file="hisat2/hisat2-mapping-se.yml", dir_path=dir_path)
args <- renderWF(args, inputvars=c(FileName="_FASTQ_PATH1_", SampleName="_SampleName_"))
fqlist <- seeFastq(fastq=infile1(args), batchsize=10000, klength=8)
pdf("fastqReport.pdf", height=18, width=4*length(fqlist))
seeFastqPlot(fqlist)
dev.off()
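
## Illustrative sketch only: the file names, the plot height and the chosen
## arrange subset below are arbitrary assumptions, not package defaults.
## It shows the workflow described above: store the summary list on disk
## with save(), reload it later with load(), and plot a subset of the rows
## by dropping numbers from the arrange argument.
## Assumes fqlist holds one entry per FASTQ file.
save(fqlist, file="fqlist.RData")
load("fqlist.RData")
pdf("fastqReportSubset.pdf", height=9, width=4*length(fqlist))
seeFastqPlot(fqlist, arrange=c(1, 2, 3, 4))
dev.off()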

}
}
\keyword{ utilities }
