na.color to specify the color to use for NA values.
Add silhouette computation for NMF results, which can be performed on samples, features or the consensus matrix. The average silhouette width is also computed by the summary methods. (Thanks to Gordon Robertson for this suggestion)
New runtime option 'shared.memory' (or 'm') for toggling usage of shared memory (requires package synchronicity).
seed to associate them with a seeding method that is not the default method, maxIter to make default runs with more iterations (for algorithms defined as NMFStrategyIterative), or any other algorithm-specific parameter.
Subsetting an NMF object with a single index now returns an NMF object, except if argument drop is not missing (i.e. either TRUE or FALSE).
Due to the major changes made in the internal structure of the standard NMF
models, previous NMF fits are not compatible with this version.
The factory function nmfModel has been enhanced and provides new methods that make it easier to create NMF objects. See ?nmfModel.
A new heatmap drawing function 'aheatmap' (for annotated heatmap) is now used to generate the different heatmaps (basismap, coefmap and consensusmap). It is an enhancement/fork of the function pheatmap from package pheatmap and draws, I think, very nice heatmaps, providing a convenient and flexible interface to add annotation tracks to both the columns and rows, with sensible automatic legends.
When called with no arguments, method nmfModel no longer returns the list of available NMF models, but an empty NMF model. To list the available models, call nmfModels() directly.
The function rmatrix is now an S4 generic function. It gains methods for generating random matrices based on a template matrix or an NMF model. See ?rmatrix.
The function rnmf gains a method to generate a random NMF model given numerical dimensions.
The function nmfEstimateRank now returns the fits for each value of the rank in element 'fit' of the result list. See ?nmfEstimateRank.
The state of the random number generator is systematically stored in the 'NMFfit' object returned by function 'nmf'. It is stored in a new slot 'rng.seed' and can be accessed via the new method 'rngSeed' of class 'NMFfit'. See ?rngSeed for more details.
The number of cores to use in multicore computations can now also be specified by the 'p - parallel' runtime option (e.g. 'p4' to use 4 cores). Note that the value specified in option 'p' takes precedence over the one passed via argument '.pbackend'. See section 'Runtime options' in ?nmf for more details.
Function 'nmfApply' now allows values 3 and 4 for argument 'MARGIN' to apply a given function to the basis vectors (i.e. columns of the basis matrix W) and basis profiles (i.e. rows of the mixture coefficients H) respectively. See ?nmfApply for more details.
New S4 generics 'basiscor' and 'profcor' to compute the correlation matrices of the basis vectors and basis profiles respectively, from two NMF models or from an NMF model and a given compatible matrix. See ?basiscor or ?profcor for more details.
New S4 generic 'fitcmp' to compare the NMF models fitted by two different runs of function 'nmf' (i.e. two 'NMFfit' objects). See ?fitcmp for more details.
New S4 generic 'canFit' that tells if an NMF method is able to exactly or partly fit a given NMF model. See ?canFit for more details.
New function 'selectMethodNMF' that selects an appropriate NMF method to fit a given NMF model. See ?selectMethodNMF for more details.
The verbosity level can be controlled more finely by the 'v - verbose' runtime option (e.g. using .options='v1' or 'v2'). The higher the level, the more information is printed. The verbose outputs have been cleaned up and should be more consistent across run modes (sequential or multicore). See section 'Runtime options' in ?nmf for more details.
The standard update equations have been optimised further, by making them modify the factor matrices in place. This speeds up the computation and greatly improves the algorithms' memory footprint.
The package NMF now depends on the package digest, which is used to display the state of the random number generator in a concise way.
The methods 'metaHeatmap' are split into 3 new S4 generics: 'basismap' to plot a heatmap of the basis matrix [formerly plotted by the call 'metaHeatmap(object.of.class.NMF, 'features', …)'], 'coefmap' to plot a heatmap of the mixture coefficient matrix [formerly plotted by the call 'metaHeatmap(object.of.class.NMF, 'samples', …)'], and 'consensusmap' to plot a heatmap of the consensus matrix associated with multiple runs of NMF [formerly plotted by the call 'metaHeatmap(object.of.class.NMFfitX, …)'].
Method 'fcnnls' provides a user-friendly interface for the internal function '.fcnnls' to solve non-negative linear least squares problems, using the fast combinatorial approach from Van Benthem and Keenan (2004). See '?fcnnls' for more details.
New argument 'callback' in method 'nmf' allows providing a callback function when performing multiple runs in default mode, which only keeps the best result. The callback function is applied to the result of each run before it is possibly discarded. The results are stored in the miscellaneous slot '.callback', accessible via the '$' operator (e.g. res$.callback). See '?nmf' for more details.
New method 'niter' to retrieve the number of iterations performed to fit an NMF model. It is defined on objects of class 'NMFfit'.
New function 'isNMFfit' to check if an object is a result from NMF fitting.
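A brief sketch of the two helpers above, assuming the NMF package is installed. The input matrix, the rank of 3 and the seed value are arbitrary choices made for illustration:

```r
library(NMF)

# Arbitrary non-negative 20 x 10 input matrix (illustration only)
set.seed(1)
x <- rmatrix(20, 10)

# Single run with the default algorithm; rank 3 is an arbitrary choice
res <- nmf(x, 3)

isNMFfit(res)   # TRUE for an object returned by nmf()
isNMFfit(x)     # FALSE for a plain matrix
niter(res)      # number of iterations performed by this fit
```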
New function 'rmatrix' to easily generate random matrices and to specify the distribution from which the entries are drawn. For example:
* 'rmatrix(100, 10)' generates a 100x10 matrix whose entries are drawn
from the uniform distribution.
* 'rmatrix(100, 10, rnorm)' generates a 100x10 matrix whose entries are drawn
from the standard Normal distribution.
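The two calls above can be tried as follows, assuming the NMF package is installed. The seed is an arbitrary value used only to make the draws reproducible:

```r
library(NMF)

set.seed(42)                   # arbitrary seed, for reproducibility only
u <- rmatrix(100, 10)          # entries drawn from the uniform distribution
g <- rmatrix(100, 10, rnorm)   # entries drawn from the standard Normal distribution

dim(u)     # 100 10
range(u)   # stays within [0, 1] for the default uniform draws
range(g)   # Normal draws include negative values
```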
New methods 'basisnames' and 'basisnames<-' to retrieve and set the basis vector names. See '?basisnames'.
Add a CITATION file that provides the bibtex entries for citing the BMC Bioinformatics paper for the package (http://www.biomedcentral.com/1471-2105/11/367), the vignette and manual. See 'citation('NMF')' for complete references.
New argument 'ncol' in method 'nmfModel' to specify the target dimensions more easily by calls like 'nmfModel(r, n, p)' to create an r-rank NMF model that fits an n x p target matrix.
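As a small sketch of this calling convention, assuming the NMF package is installed (the rank and dimensions are arbitrary):

```r
library(NMF)

# Rank-3 model for a 100 x 20 target matrix (values chosen arbitrarily)
model <- nmfModel(3, 100, 20)

dim(model)      # target rows, target columns, rank
nbasis(model)   # rank of the factorization
```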
The subset method '[]' of objects of class 'NMF' has been changed to be more convenient. See
The major change is the explicit addition of the synchronicity package into the suggested package dependencies. Since the publication of versions 4.x of the bigmemory package, it is used by the NMF package to provide the mutex functionality required by multicore computations. Note: this is relevant only for Linux/Mac-like platforms, as the multicore package is not yet supported on MS Windows. Users of a recent version of the bigmemory package (i.e. >=4.x) DO NEED to install the synchronicity package to be able to run multicore NMF computations. Versions of bigmemory prior to 4.x include the mutex functionality.
Minor enhancement in error messages
Method 'nmfModel' can now be called with arguments 'rank' and 'target' swapped. This is for convenience and ease of use.
Function 'nmfEstimateRank' has been enhanced: run options can be passed to each internal call to the 'nmf' function. See ?nmfEstimateRank and ?nmf for details on the supported options.
In function 'plot.NMF.rank', a new argument 'na.rm' allows removing from the plots the ranks for which the measures are NAs (due to errors during the estimation process with 'nmfEstimateRank').
Method 'consensus' is now exported in the NAMESPACE file. Thanks to Gang Su for reporting this.
Warnings and messages about loading packages are now suppressed. This was particularly confusing for users that do not have the packages and/or platform required for parallel computation: warnings were printed although the computation was performed sequentially without problem. Thanks to Joe Maisog for reporting this.
The 'metaHeatmap' function was not correctly handling row labels when argument filter was not FALSE. All usual row formatting in heatmaps (labels and ordering) now works as expected. Thanks to Andreas Schlicker, from The Netherlands Cancer Institute, for reporting this.
An error was thrown on some environments/platforms (e.g. Solaris) where not all the packages required for parallel computation were available, even when using option 'p' ('p' in lower case), which should have switched the computation to sequential. This is fixed, and the error is now only thrown when running NMF with option 'P' (capital 'P').
Not all the options were passed (e.g. 't' for tracking) in sequential mode (option '-p').
Verbose/debug global nmf.options were not restored if a numerical random seed was used.
The 'metaHeatmap' function now supports passing the name of a filtering method in argument 'filter', which is passed to method 'extractFeatures'. See ?metaHeatmap.
Verbose and debug messages are better handled. When running a parallel computation of multiple runs, verbose messages from each run are shown only in debug mode.
Part of the code has been optimised in C++ for speed and memory efficiency:
- the multiplicative updates for reducing the KL divergence and the euclidean
distance have been optimised in C++. This significantly reduces the
computation time of the methods that make use of them: 'lee', 'brunet',
'offset', 'nsNMF' and 'lnmf'.
The old R versions of the algorithms are still accessible with the suffix '.R#'.
- the computation of the euclidean distance and KL divergence is implemented
in C++, and does not require the duplication of the original matrices as done
in R.
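For reference, the Euclidean multiplicative updates mentioned above can be sketched in plain R. This is a minimal illustration of the standard Lee and Seung update rules, not the package's C++ code. The function name 'mu_euclidean', the small constant 'eps' and the toy dimensions are assumptions made for the example:

```r
# Standard multiplicative updates for the squared Euclidean distance ||V - WH||^2.
# Hypothetical helper, written for illustration only.
mu_euclidean <- function(V, W, H, n_iter = 50, eps = 1e-9) {
  err <- numeric(n_iter + 1)
  err[1] <- sum((V - W %*% H)^2)
  for (i in seq_len(n_iter)) {
    # Update the mixture coefficients, then the basis matrix;
    # eps guards against division by zero
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
    err[i + 1] <- sum((V - W %*% H)^2)
  }
  list(W = W, H = H, err = err)
}

set.seed(123)                          # arbitrary seed
V  <- matrix(runif(20 * 10), 20, 10)   # toy non-negative target
W0 <- matrix(runif(20 * 3), 20, 3)     # random rank-3 starting point
H0 <- matrix(runif(3 * 10), 3, 10)

fit <- mu_euclidean(V, W0, H0)
fit$err[c(1, length(fit$err))]         # squared error before and after the updates
```

Because of R's copy semantics, this version allocates new matrices at every iteration, which is the kind of overhead a compiled implementation can avoid.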
Generic 'dimnames' is now defined for objects of class 'NMF' and returns
of the mixture coefficient matrix, and the column names of the basis matrix.
A new class structure has been developed to handle the results of multiple
- Class 'NMFfitX' defines a common interface for multiple NMF runs of a
- Class 'NMFfitX1' handles the sub-case where only the best fit is returned.
- Class 'NMFfitXn' handles the sub-case where the list of all the fits
- Class 'NMFList' handles the case of heterogeneous NMF runs (different
The vignette contains more examples and details about the use of the package.
The package is compatible with both versions 3.x and 4.x of the bigmemory package. This package is used when running multicore parallel computations. With version 4.x of bigmemory, the synchronicity package is also required as it provides the mutex functionality that used to be provided by bigmemory 3.x.
Running in multicore mode from the GUI on MacOS X is not allowed anymore, as it is not safe and was throwing an error ['The process has forked and …']. Thanks to Stephen Henderson from the UCL Cancer Institute (UK) for reporting this.
When called with a numeric seed, function 'nmf' now restores the random seed to the value it had before the call. This behaviour can be disabled with option 'restore.seed=FALSE' or '-r'.
Deprecated Generics/Methods 1) 'errorPlot' - S4 generic/methods remain with a .Deprecated message. It is replaced by a definition of the 'plot' method for signatures 'NMFfit,missing' and 'NMFList,missing'. It will be completely removed from the package in the next version.
Deprecated Class 1) 'NMFSet' - S4 class remains for backward compatibility, but is not used anymore. It is replaced by the classes 'NMFfitX1', 'NMFfitXn', 'NMFList'.
Function 'randomize' allows to randomise a target matrix or
over-fitting.
The 'random' method of class 'NMF' is renamed 'rnmf', but is still
Defunct Generics/Methods 1) 'extra' - S4 generic/methods remains with .Defunct message. It will be completely removed from the package in the next version.
Package 'Biobase' is not required anymore, and only suggested. The definition and export of the NMF-BioConductor layer is done at loading.
'nmfApply' S4 generic/method: an 'apply'-like method for objects of class 'NMF'.
'predict' S4 method for objects of class 'NMF': the method replaces the now deprecated 'clusters' method.
'featureNames' and 'sampleNames' S4 methods for objects of class 'NMFSet'.
sub-setting S4 method '[' for objects of class 'NMF': row subsets are applied to the rows of the basis matrix, column subsets are applied to the columns of the mixture coefficient matrix.
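The subsetting behaviour can be illustrated with a short sketch, assuming the NMF package is installed. The model here is a random one built with 'rnmf', and its rank and dimensions are arbitrary:

```r
library(NMF)

set.seed(7)               # arbitrary seed
m <- rnmf(3, 100, 20)     # random rank-3 model for a 100 x 20 target

# Keep 10 features (rows of the basis matrix) and
# 5 samples (columns of the mixture coefficient matrix)
sub <- m[1:10, 1:5]

dim(basis(sub))   # 10 x 3
dim(coef(sub))    # 3 x 5
```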
Methods 'basis<-' and 'coef<-' were not exported by file NAMESPACE.
Method 'featureNames' was returning the column names of the mixture coefficient matrix, instead of the row names of the basis matrix.
'metaHeatmap' with argument 'class' set would throw an error.