NMF Model - Standard model

Description

This class implements the standard model of Nonnegative Matrix Factorization. It provides a general structure and generic functions to manage factorizations that follow the standard NMF model, as defined by Lee and Seung (2001).

Details

Let V be an n \times m non-negative matrix and r a positive integer. In its standard form (see references below), an NMF of V is commonly defined as a pair of matrices (W, H) such that:

V \equiv W H,

where:

  • W and H are n \times r and r \times m matrices respectively with non-negative entries;
  • \equiv is to be understood with respect to some loss function. Common choices of loss function are based on the Frobenius norm or the Kullback-Leibler divergence.
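To make the role of the loss function concrete, the following base-R sketch evaluates the two common choices on a small random factorization. The helper names frobenius_loss and kl_divergence are hypothetical (they are not part of the NMF package), and the 1/2 scaling of the Frobenius loss and the generalized form of the Kullback-Leibler divergence are one common convention, assumed here for illustration.

```r
# Hypothetical helpers, written for illustration only (not part of the NMF package).

# Half the squared Frobenius norm of the residual V - WH.
frobenius_loss <- function(V, W, H) {
  sum((V - W %*% H)^2) / 2
}

# Generalized Kullback-Leibler divergence D(V || WH).
# Assumes strictly positive entries in V and WH so that the log is finite.
kl_divergence <- function(V, W, H) {
  WH <- W %*% H
  sum(V * log(V / WH) - V + WH)
}

set.seed(1)
n <- 6; m <- 4; r <- 2
W <- matrix(runif(n * r), n, r)   # n x r, non-negative
H <- matrix(runif(r * m), r, m)   # r x m, non-negative
V <- W %*% H                      # exact factorization: both losses are zero

frobenius_loss(V, W, H)
kl_divergence(V, W, H)

# Perturbing V makes both losses strictly positive
V_noisy <- V + 0.1
frobenius_loss(V_noisy, W, H)
kl_divergence(V_noisy, W, H)
```

Both functions return zero when V equals W H exactly and grow as the product moves away from the target, which is the sense in which \equiv is read above.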

Integer r is called the factorization rank. Depending on the context in which NMF is applied, the columns of W and the columns and rows of H are given different names:

  1. columns of W: basis vectors, metagenes, factors, sources, image basis
  2. columns of H: mixture coefficients, metagene sample expression profiles, weights
  3. rows of H: basis profiles, metagene expression profiles

NMF approaches have been successfully applied in several fields. For this reason, the NMF package uses names that are as generic as possible for its objects and methods.

The following terminology is used:

  1. samples: the columns of the target matrix V
  2. features: the rows of the target matrix V
  3. basis matrix: the first matrix factor W
  4. basis vectors: the columns of the first matrix factor W
  5. mixture matrix: the second matrix factor H
  6. mixture coefficients: the columns of the second matrix factor H
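The terminology above maps onto matrix dimensions as follows. This is a base-R sketch with made-up sizes; the dimnames (gene*, basis*, sample*) are illustrative only and are not required by the package.

```r
set.seed(42)
n <- 5   # number of features (rows of V)
p <- 3   # number of samples (columns of V)
r <- 2   # factorization rank

# Basis matrix W: one column per basis vector
W <- matrix(runif(n * r), n, r,
            dimnames = list(paste0("gene", 1:n), paste0("basis", 1:r)))

# Mixture matrix H: one column of mixture coefficients per sample
H <- matrix(runif(r * p), r, p,
            dimnames = list(paste0("basis", 1:r), paste0("sample", 1:p)))

V <- W %*% H   # target matrix: features x samples

dim(V)         # features x samples
W[, 1]         # first basis vector (length n)
H[, 1]         # mixture coefficients of the first sample (length r)

# Each sample (column of V) is a non-negative combination of the basis vectors
all.equal(V[, 1], drop(W %*% H[, 1]))
```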

However, because the package NMF was primarily implemented to work with gene expression microarray data, it also provides a layer to easily and intuitively work with objects from the Bioconductor base framework. See bioc-NMF for more details.

Slots

  1. W: A matrix that contains the basis matrix, i.e. the first matrix factor of the factorisation

  2. H: A matrix that contains the coefficient matrix, i.e. the second matrix factor of the factorisation

  3. bterms: a data.frame that contains the primary data that define fixed basis terms. See bterms.

  4. ibterms: integer vector that contains the indexes of the basis components that are fixed, i.e. for which only the coefficients are estimated.

    IMPORTANT: This slot is set on construction of an NMF model via nmfModel and should not be subsequently changed by the end-user.

  5. cterms: a data.frame that contains the primary data that define fixed coefficient terms. See cterms.

  6. icterms: integer vector that contains the indexes of the basis components that have fixed coefficients, i.e. for which only the basis vectors are estimated.

    IMPORTANT: This slot is set on construction of an NMF model via nmfModel and should not be subsequently changed by the end-user.

Methods

  1. .basis, signature(object = "NMFstd"): Get the basis matrix in standard NMF models

    This function returns slot W of object.

  2. .basis<-, signature(object = "NMFstd", value = "array"): Set the basis matrix in standard NMF models

    This function sets slot W of object.

  3. .basis<-, signature(object = "NMFstd", value = "matrix"): Replaces a slice of the basis array.

  4. bterms<-, signature(object = "NMFstd"): Default method tries to coerce value into a data.frame with as.data.frame.

  5. .coef, signature(object = "NMFstd"): Get the mixture coefficient matrix in standard NMF models

    This function returns slot H of object.

  6. .coef<-, signature(object = "NMFstd", value = "array"): Set the mixture coefficient matrix in standard NMF models

    This function sets slot H of object.

  7. .coef<-, signature(object = "NMFstd", value = "matrix"): Replaces a slice of the coefficient array.

  8. cterms<-, signature(object = "NMFstd"): Default method tries to coerce value into a data.frame with as.data.frame.

  9. fitted, signature(object = "NMFstd"): Compute the target matrix estimate in standard NMF models.

    The estimate matrix is computed as the product of the two matrix slots W and H:

    V ~ W H

  10. ibterms, signature(object = "NMFstd"): Method for standard NMF models, which returns the integer vector that is stored in slot ibterms when a formula-based NMF model is instantiated.

  11. icterms, signature(object = "NMFstd"): Method for standard NMF models, which returns the integer vector that is stored in slot icterms when a formula-based NMF model is instantiated.
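
As a quick check of the fitted method listed above, the sketch below builds a model from random matrices and compares fitted() with the explicit product of the two factors. It assumes the NMF package is installed, and it uses the exported accessors basis() and coef() rather than the internal .basis and .coef methods.

```r
library(NMF)

set.seed(123)
n <- 50; r <- 3; p <- 20
w <- rmatrix(n, r)   # random non-negative n x r matrix
h <- rmatrix(r, p)   # random non-negative r x p matrix
model <- nmfModel(W = w, H = h)

# Target estimate computed by the model
v_hat <- fitted(model)

# Same estimate computed by hand from the two factors
v_manual <- basis(model) %*% coef(model)

dim(v_hat)
all.equal(v_hat, v_manual, check.attributes = FALSE)
```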

References

Lee DD and Seung H (2001). "Algorithms for non-negative matrix factorization." _Advances in neural information processing systems_.

Examples



# create a completely empty NMFstd object
new('NMFstd')
## <Object of class:NMFstd>
## features: 0 
## basis/rank: 0 
## samples: 0
# create an NMF object based on one random matrix: the missing matrix is deduced
# Note: this only works when using the factory method nmfModel
n <- 50; r <- 3;
w <- rmatrix(n, r)
nmfModel(W=w)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 0
# create an NMF object based on random (compatible) matrices
p <- 20
h <- rmatrix(r, p)
nmfModel(W=w, H=h)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 20
# create an NMF object based on incompatible matrices: generates an error
h <- rmatrix(r+1, p)
try( new('NMFstd', W=w, H=h) )
try( nmfModel(w, h) )

# Giving target dimensions to the factory method allows it to cope with dimension
# incompatibility (a warning is thrown in that case)
nmfModel(r, W=w, H=h)
## Warning in .local(rank, target, ...): nmfModel - Objective rank [3] is
## lower than the number of rows in H [4]: only the first 3 rows of H will be
## used
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 20
# create an NMF array object based on random (compatible) arrays
# extra dimension (levels)
q <- 2
w <- array(seq(n*r*q), dim = c(n, r, q))
h <- rmatrix(r, p)
nmfModel(W = w, H = h)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 20 
## levels:  2 | 1