planc
Parallel Lowrank Approximation with Non-negativity Constraints
# OpenMP Non-negative Matrix Factorization

Install Instructions
---------------------

This program depends on:

- The Armadillo library, which can be found at https://arma.sourceforge.net
- OpenBLAS, available at https://github.com/xianyi/OpenBLAS. If you build with OpenBLAS
and MKL is discoverable by cmake, pass -DCMAKE_IGNORE_MKL=1 to cmake.

Once you have installed these libraries, set the following environment variables,
replacing the example paths with your own install locations.

````
export ARMADILLO_INCLUDE_DIR=/home/rnu/libraries/armadillo-6.600.5/include/
export LIB=$LIB:/home/rnu/libraries/openblas/lib:
export INCLUDE=$INCLUDE:/home/rnu/libraries/openblas/include:$ARMADILLO_INCLUDE_DIR:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/rnu/libraries/openblas/lib/:
export CPATH=$CPATH:$INCLUDE:
````

* Create a build directory.
* Change to the build directory.
* If you are using MKL, source ````$MKLROOT/bin/mkl_vars.sh intel64````.
* Run cmake [PATH TO THE DIRECTORY CONTAINING CMakeLists.txt].
* Run make.
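
The build steps above can be sketched as a small shell function. This is only an illustration, not part of the library: the function name `build_nmf` and its two arguments (source directory, build directory) are assumptions made for this example. If you use MKL, source the MKL environment script before calling it.

```shell
# Hypothetical helper: build_nmf SRC_DIR [BUILD_DIR]
# SRC_DIR is the directory that contains CMakeLists.txt.
# The parenthesised body runs in a subshell, so the cd below does not
# change the caller's working directory.
build_nmf() (
  [ $# -ge 1 ] || { echo "usage: build_nmf SRC_DIR [BUILD_DIR]" >&2; exit 2; }
  # Resolve to an absolute path so it is still valid after the cd.
  src_dir=$(cd "$1" && pwd) || exit 1
  build_dir="${2:-build}"
  mkdir -p "$build_dir" &&   # create a build directory
    cd "$build_dir" &&       # change to the build directory
    cmake "$src_dir" &&      # configure
    make                     # compile
)
```

Each step runs only if the previous one succeeded, so a failed cmake configuration does not go on to invoke make.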

Sparse NMF
----------
Run cmake with -DCMAKE_BUILD_SPARSE=1

Sparse Debug build
------------------
Run cmake with -DCMAKE_BUILD_SPARSE=1 -DCMAKE_BUILD_TYPE=Debug
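
As a rough sketch, the cmake flags mentioned in this README can be assembled from short variant names. The function name `nmf_cmake_flags` and the variant names `sparse`, `debug`, and `ignore-mkl` are invented for this example; only the `-D` flags themselves come from the text above, with the sparse flag written in its `=1` form.

```shell
# Hypothetical helper: print the cmake flags for the requested build variants.
nmf_cmake_flags() (
  flags=""
  for variant in "$@"; do
    case "$variant" in
      sparse)     flags="$flags -DCMAKE_BUILD_SPARSE=1" ;;
      debug)      flags="$flags -DCMAKE_BUILD_TYPE=Debug" ;;
      ignore-mkl) flags="$flags -DCMAKE_IGNORE_MKL=1" ;;
      *) echo "unknown variant: $variant" >&2; exit 1 ;;
    esac
  done
  echo "${flags# }"   # drop the leading space
)
```

It could then be used as `cmake [PATH] $(nmf_cmake_flags sparse debug)`, leaving the substitution unquoted so that each flag becomes a separate argument.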

Building on Cray-EOS/Titan
--------------------------
````CC=CC CXX=CC cmake ~/nmflibrary/mpi/ -DCMAKE_IGNORE_MKL=1````

Intel MKL vs OpenBLAS
=====================
To use Intel MKL:

- Set ````export LD_LIBRARY_PATH=MKL_LIB path````, where "MKL_LIB path" stands for the MKL library directory.
- Source ````$MKLROOT/bin/mkl_vars.sh intel64````.

Runtime usage
=============
Tell OpenBLAS (or MKL) how many threads to use. For example, on a quad-core system use the following.

````
export OPENBLAS_NUM_THREADS=4
export MKL_NUM_THREADS=4
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:MKL_LIB
````
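
Instead of hard-coding the thread count, one possible approach is to query the number of online CPUs. This is a sketch that assumes `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems) and falls back to 1 otherwise. The variable name `ncpu` is arbitrary.

```shell
# Count online logical CPUs. This counts hardware threads, which can exceed
# the number of physical cores on systems with simultaneous multithreading.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

export OPENBLAS_NUM_THREADS="$ncpu"
export MKL_NUM_THREADS="$ncpu"
echo "BLAS threads: $OPENBLAS_NUM_THREADS"
```

Whether one thread per logical CPU is the best setting depends on the machine and the workload, so treat the value as a starting point.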

Command Line options
====================

Each option has a single-letter short form and a string long form. For example,
"input" is the long form of the short option 'i'. A long option is typically
passed as "--algo=2" and a short option as "-a 0".
The following is a brief description of the command line options.

* {"input",'i'} - Path to a sparse or dense matrix file. If this option is
 not passed, a synthetic matrix is generated instead.
* {"algo",'a'} - Algorithm to use. The values are:
  * 0 - Multiplicative update (MU)
  * 1 - Hierarchical Alternating Least Squares (HALS)
  * 2 - ANLS/BPP implementation
* {"lowrank",'k'} - Low rank 'k'.
* {"iter",'t'} - Number of iterations.
* {"dimensions",'d'} - Applicable only to synthetic matrices. It takes the
 space-separated dimensions as a string. For example, "21000 16000" means 21000 rows
 and 16000 columns.
* {'o'} - File name to dump W and H. The suffixes _w and _h are appended to
 distinguish the W and H matrices.
* {"sparsity",'s'} - Density of the synthetic sparse matrix.

A few usage examples:

Usage 1 : Sparse/Dense NMF for an input file with lowrank k=20 for 20 iterations.
````NMFLibrary --algo=[0/1/2] --lowrank=20 --input=filename --iter=20 ````

Usage 2 : Sparse/Dense synthetic NMF for a 20000x10000 matrix.
````NMFLibrary --algo=[0/1/2] --lowrank=20 -d "20000 10000" --iter=20 ````

Usage 3 : Sparse/Dense NMF for an input file with lowrank k=20 for 20 iterations, starting
from the initialization matrices given in winit and hinit. Finally, it dumps the output
W and H to the specified files.
````NMFLibrary --algo=[0/1/2] --lowrank=20 --input=filename --winit=filename --hinit=filename --w=woutputfilename --h=outputfilename --iter=20````
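
To try the three listed algorithms on the same input, Usage 1 can be wrapped in a loop. The sketch below is an illustration under assumptions: the function name `run_all_algorithms` and the `NMF_DRY_RUN` switch are invented for this example, and `NMFLibrary` is assumed to be on the `PATH`. By default it only prints the commands, and setting `NMF_DRY_RUN=0` executes them.

```shell
# Hypothetical helper: run_all_algorithms INPUT_FILE RANK ITERATIONS
# NMF_DRY_RUN=1 (the default in this sketch) prints each command
# instead of running it.
run_all_algorithms() (
  input="$1"; rank="$2"; iters="$3"
  for algo in 0 1 2; do   # 0 = MU, 1 = HALS, 2 = ANLS/BPP
    # Reuse the positional parameters to hold the command line.
    set -- NMFLibrary --algo="$algo" --lowrank="$rank" --input="$input" --iter="$iters"
    if [ "${NMF_DRY_RUN:-1}" = 1 ]; then
      echo "$*"
    else
      "$@" || exit 1
    fi
  done
)
```

For example, `run_all_algorithms filename 20 20` prints the three commands of Usage 1, one per algorithm.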

Citation
========

If you use this OpenMP implementation, please cite:

James P. Fairbanks, Ramakrishnan Kannan, Haesun Park, David A. Bader, Behavioral clusters in dynamic graphs, Parallel Computing, Volume 47, August 2015, Pages 38-50, ISSN 0167-8191, http://dx.doi.org/10.1016/j.parco.2015.03.002.