The function syntheticNMF generates random target matrices that follow a defined NMF model, and may be used to test NMF algorithms.
It is designed to produce data with known or clear classes of samples.
syntheticNMF(n, r, p, offset = NULL, noise = TRUE, factors = FALSE, seed = NULL)
Argument r may be a single numeric, in which case argument p is required and r groups of samples are generated, with sizes given by a single draw from a multinomial distribution with equal probabilities.
It may also be a numeric vector that contains the number of samples in each class (i.e. integers). In this case argument p is discarded and forced to be the sum of r.
Argument p is not used if r is a vector (see description of argument r).
Argument offset may be a numeric vector of length n, or a single numeric value that is used as the standard deviation of a centred normal distribution from which the actual offset values are drawn.
The function returns a matrix, or a list if argument factors=TRUE.
When factors=FALSE, the result is a matrix object, with the following attributes set:
coefficients: the true underlying coefficient matrix (i.e. H);
basis: the true underlying basis matrix (i.e. W);
pData: a list with one element 'Group' that contains a factor that indicates the true groups of samples, i.e. the most contributing basis component for each sample;
fData: a list with one element 'Group' that contains a factor that indicates the true groups of features, i.e. the basis component to which each feature contributes the most.
Moreover, the result object is an ExposeAttribute object, which means that relevant attributes are accessible via $, e.g., res$coefficients.
In particular, methods coef and basis will work as expected and return the true underlying coefficient and basis matrices respectively.
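As an illustration of the accessors described above, the following sketch generates a small dataset and retrieves the true factors. It assumes the NMF package is installed; the variable names res, W and H and the seed value are arbitrary choices made for this example.

```r
library(NMF)

# 50 features and 3 groups of 5, 5 and 8 samples;
# the seed is only there to make the draw reproducible
res <- syntheticNMF(50, c(5, 5, 8), seed = 123)

# the result can be used like an ordinary matrix
dim(res)

# true factors, retrieved with the methods mentioned in the text
W <- basis(res)
H <- coef(res)
dim(W)
dim(H)

# the same information is exposed through `$`
str(res$coefficients)
```

Comparing W and H with the factors estimated by an NMF algorithm is one way to check how well the algorithm recovers a known structure.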
# generate a synthetic dataset with known classes: 50 features, 18 samples (5+5+8)
n <- 50
counts <- c(5, 5, 8)
# no noise
V <- syntheticNMF(n, counts, noise=FALSE)
## Not run: aheatmap(V)
# with noise
V <- syntheticNMF(n, counts)
## Not run: aheatmap(V)
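The rules described for arguments r and offset can also be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper names group_sizes and draw_offset are hypothetical, and any further processing that syntheticNMF applies to the drawn offset is not covered.

```r
# Hypothetical helper: resolve the group sizes from r (and p when r is a single number)
group_sizes <- function(r, p = NULL) {
  if (length(r) == 1L) {
    if (is.null(p)) stop("argument p is required when r is a single number")
    # one multinomial draw with equal probabilities provides the r group sizes
    as.vector(rmultinom(1, size = p, prob = rep(1 / r, r)))
  } else {
    # r already holds the number of samples in each class; p is their sum
    as.integer(r)
  }
}

# Hypothetical helper: expand a single number into n offsets drawn from N(0, sd)
draw_offset <- function(offset, n) {
  if (length(offset) == 1L) rnorm(n, mean = 0, sd = offset) else offset
}

set.seed(1)
sizes <- group_sizes(3, p = 18)
sum(sizes)                        # 18: the sizes always add up to p
group_sizes(c(5, 5, 8))           # 5 5 8, so p is forced to 18
length(draw_offset(0.5, n = 50))  # 50
```

With a single number for r the group sizes vary from one draw to the next, which is why passing a vector of counts, as in the example above, is the more predictable way to get known classes.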