% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fusedLasso.R
\name{fusedLasso}
\alias{fusedLasso}
\title{Generalized fused lasso to partition cell types by allelic imbalance}
\usage{
fusedLasso(
  sce,
  formula,
  model = c("binomial", "gaussian"),
  genecluster,
  niter = 1,
  pen.weights,
  lambda = "cv1se.dev",
  k = 5,
  adj.matrix,
  lambda.length = 25L,
  se.rule.nct = 8,
  se.rule.mult = 0.5,
  ...
)
}
\arguments{
\item{sce}{A SingleCellExperiment containing assays (\code{"ratio"},
\code{"counts"}) and colData \code{"x"}}

\item{formula}{A \code{\link[stats]{formula}} object which will typically
involve a fused lasso penalty: default is just using cell-type `x`:
\code{ratio ~ p(x, pen="gflasso")}. Other possibilities would be to use
the Graph-Guided Fused Lasso penalty, or add covariates want to be
adjusted for, which can include a gene-level baseline `gene`
\code{ratio ~ p(x, pen = "ggflasso") + gene + batch}
See \code{\link[smurf]{glmsmurf}} for more details}

\item{model}{Either \code{"binomial"} or \code{"gaussian"} used to fit
the generalized fused lasso}

\item{genecluster}{which gene cluster to run the fused lasso on.
Usually one first identifies an interesting gene cluster pattern by
\code{\link{summaryAllelicRatio}}}

\item{niter}{number of iterations to run; recommended to run 5 times
if allelic ratio differences across cell types are within [0.05, 0.1]}

\item{pen.weights}{argument as described in \code{\link[smurf]{glmsmurf}}}

\item{lambda}{argument as described in \code{\link[smurf]{glmsmurf}}.
Default lambda is determined by \code{"cv1se.dev"}
(cross-validation within 1 standard error rule(SE); deviance)}

\item{k}{number of cross-validation folds}

\item{adj.matrix}{argument as described in \code{\link[smurf]{glmsmurf}}}

\item{lambda.length}{argument as described in \code{\link[smurf]{glmsmurf}}}

\item{se.rule.nct}{the number of cell types to trigger a different SE-based rule than 1 SE
(to prioritize larger models, less fusing,
good for detecting smaller, e.g. 0.05, allelic ratio differences).
When the number of cell types is less than or equal to this value,
\code{se.rule.mult} SE rule is used. When the number of cell types
is larger than this value, the standard 1 SE rule is used.}

\item{se.rule.mult}{the multiplier of the SE in determining the lambda:
the chosen lambda is within \code{se.rule.mult} x SE of the minimum deviance.
Small values will prioritize larger models, less fusing.
Only used when number of cell types is equal to or less than \code{se.rule.nct}}

\item{...}{additional arguments passed to \code{\link[smurf]{glmsmurf}}}
}
\value{
A SummarizedExperiment with attached metadata and colData:
a matrix grouping factor partition
and the penalized parameter lambda
are returned in metadata \code{"partition"} and \code{"lambda"}.
Partition and logistic group allelic estimates are stored in
colData: \code{"part"} and \code{"coef"}.
}
\description{
Fits generalized fused lasso with either binomial(link="logit")
or Gaussian likelihood, leveraging functions from the
\code{smurf} package.
}
\details{
Usually, we used a Generalized Fused Lasso penalty for the
cell states in order to regularize all possible coefficient differences.
Another possibility would be to use the Graph-Guided Fused Lasso penalty
to only regularize the differences of coefficients of neighboring
cell states.

When using a Graph-Guided Fused Lasso penalty, the adjacency matrix
corresponding to the graph needs to be provided. The elements of this
matrix are zero when two levels are not connected, and one when they are
adjacent.

See the package vignette for more details and a complete description of a
use case.
}
\examples{
library(S4Vectors)
library(smurf)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1, ncores = 1)
metadata(sce_sub)$partition
metadata(sce_sub)$lambda

# can add covariates or `gene` to the formula
f2 <- ratio ~ p(x, pen = "gflasso") + gene
sce_sub <- fusedLasso(sce[1:5,],
  formula = f2, model = "binomial",
  genecluster = 1, ncores = 1)

# Suppose we have 4 cell states, if we only want neibouring cell states
# to be grouped together with other cell states. Note here the names of
# the cell states should be given as row and column names.
nct <- nlevels(sce$x)
adjmatrix <- makeOffByOneAdjMat(nct)
colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x)
f <- ratio ~ p(x, pen = "ggflasso") # use graph-guided fused lasso
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1,
  lambda = 0.5, ncores = 1,
  adj.matrix = list(x = adjmatrix)
)
metadata(sce_sub)$partition
}
\references{
This function leverages the glmsmurf function from the smurf package.
For more details see the following manuscript:

Devriendt S, Antonio K, Reynkens T, et al.
Sparse regression with multi-type regularized feature modeling[J].
Insurance: Mathematics and Economics, 2021, 96: 248-261.
}
\seealso{
\code{\link[smurf]{glmsmurf}},
\code{\link[smurf]{glmsmurf.control}},
\code{\link[smurf]{p}}, \code{\link[stats]{glm}}
}
