Examples
Is there an example in tapkee that does the following? a.) Loads data from a text file b.) Calls a dimension reduction method c.) Writes the mapped data back to a text file.
Did you browse through the examples directory? Some examples load data from files and then plot the embedded data; others write the data to the standard output.
Can you point me to one such example? I would be interested in an API that is similar to the example shown on the main page of github.
www.github.com/lisitsyn/tapkee/blob/master/examples/rna/rna.cpp
Hey @skn123, are you talking about a command-line call example?
Hi @lisitsyn I am indeed talking about that. What I would like is a way to pass as input a text file with each row corresponding to a feature vector, along with the method that I would like to implement and its parameters. After the processing, the mapped output should be written out as a text file.
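The workflow requested above (rows of a text file in, rows of a text file out) can be sketched in plain C++ as below. This is only a minimal sketch of the file handling. The names `read_rows`, `write_rows`, `reduce_placeholder` and `embed_file` are hypothetical and are not part of Tapkee, and the reduction step is a stand-in that merely truncates each row; a real embedding call would replace it.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Read whitespace-separated numbers, one feature vector per line.
// Blank lines are skipped.
Matrix read_rows(std::istream& in)
{
    Matrix rows;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::vector<double> row;
        double value;
        while (fields >> value)
            row.push_back(value);
        if (!row.empty())
            rows.push_back(row);
    }
    return rows;
}

// Write one point per line, values separated by single spaces,
// using the default stream precision.
void write_rows(std::ostream& out, const Matrix& rows)
{
    for (const auto& row : rows)
    {
        for (std::size_t j = 0; j < row.size(); ++j)
            out << (j ? " " : "") << row[j];
        out << '\n';
    }
}

// HYPOTHETICAL placeholder for the dimension reduction step: it only keeps
// the first `target_dimension` coordinates of every row.
Matrix reduce_placeholder(const Matrix& data, std::size_t target_dimension)
{
    Matrix embedded;
    for (const auto& row : data)
    {
        assert(row.size() >= target_dimension);
        embedded.emplace_back(row.begin(),
                              row.begin() + static_cast<std::ptrdiff_t>(target_dimension));
    }
    return embedded;
}

// Load, reduce, save. Returns false if either file cannot be opened.
bool embed_file(const std::string& in_path, const std::string& out_path,
                std::size_t target_dimension)
{
    std::ifstream in(in_path);
    if (!in) return false;
    std::ofstream out(out_path);
    if (!out) return false;
    write_rows(out, reduce_placeholder(read_rows(in), target_dimension));
    return true;
}
```

A call such as `embed_file("input.dat", "output.dat", 2)` would then cover steps a.) and c.), with step b.) left as the part to swap out.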
@iglesias Indeed, that method is close enough! @lisitsyn have you considered hiving off the K-NN part of computing the neighborhood graph as an OpenCL module?
@skn123 quite easy. Use the tapkee_cli binary with the options -i input_file.dat and -o output_file.dat. Let me cite an example from the help:
Run locally linear embedding with k=10 with arpack eigensolver on data from input.dat saving embedding to output.dat
tapkee -i input.dat -o output.dat --method lle --eigen-method arpack -k 10
@skn123 yeah, there is a bunch of things I'd like to improve with OpenCL as well - it's just a matter of lacking time :(
@lisitsyn I am unable to build tapkee_cli using MinGW, the bug that I had mentioned earlier. If you can help me solve that issue then I would be set. Suppose I do not provide arpack as the eigen-method. Will it revert to some default eigensolver?
@skn123 ah sorry, I didn't recognize it's you who had that issue :) I'll try to resolve this issue tomorrow.
The default eigen-method is fine, yeah, you don't have to specify it explicitly.
@lisitsyn Maybe you can have a sub-project for Kd-tree / K-nn using OpenCL and that would be a nice fit for Tapkee.
@skn123 yeah or something to outsource things to :)
@lisitsyn I see that Tapkee uses ltsa as one of the methods. Can you point me to the eigensolver that this method uses to compute the eigenvalues, given that the matrices will be sparse?
@skn123 it uses ARPACK, as it only requires matrix-vector products, so the sparsity does not matter to ARPACK. It is also possible to use other methods from Eigen3, though.
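To illustrate what "only requires matrix-vector products" means, here is a minimal sketch of a solver that touches the matrix solely through a sparse product. It uses plain power iteration as a simple stand-in (ARPACK itself implements the implicitly restarted Arnoldi/Lanczos method, which is not shown here), and the `CsrMatrix` type and all function names are hypothetical, not Tapkee or ARPACK code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compressed sparse row storage: only nonzero entries are kept.
struct CsrMatrix
{
    std::size_t n;                       // matrix is n x n
    std::vector<std::size_t> row_start;  // size n + 1
    std::vector<std::size_t> column;     // column index of each nonzero
    std::vector<double> value;           // value of each nonzero
};

// y = A * x in O(number of nonzeros); A is never converted to dense form.
std::vector<double> multiply(const CsrMatrix& a, const std::vector<double>& x)
{
    assert(x.size() == a.n);
    std::vector<double> y(a.n, 0.0);
    for (std::size_t i = 0; i < a.n; ++i)
        for (std::size_t k = a.row_start[i]; k < a.row_start[i + 1]; ++k)
            y[i] += a.value[k] * x[a.column[k]];
    return y;
}

double dot(const std::vector<double>& u, const std::vector<double>& v)
{
    double s = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i)
        s += u[i] * v[i];
    return s;
}

// Power iteration: estimates the largest-magnitude eigenvalue of a symmetric
// matrix using nothing but calls to multiply().
double dominant_eigenvalue(const CsrMatrix& a, std::size_t iterations)
{
    // Start vector; assumed not orthogonal to the dominant eigenvector.
    std::vector<double> x(a.n, 1.0);
    double lambda = 0.0;
    for (std::size_t it = 0; it < iterations; ++it)
    {
        const double norm = std::sqrt(dot(x, x));
        for (double& xi : x)
            xi /= norm;
        const std::vector<double> y = multiply(a, x);
        lambda = dot(x, y);  // Rayleigh quotient for the unit vector x
        x = y;
    }
    return lambda;
}

// Small symmetric example [[2,1,0],[1,2,0],[0,0,1]] with eigenvalues 3, 1, 1.
CsrMatrix example_matrix()
{
    return CsrMatrix{3, {0, 2, 4, 5}, {0, 1, 0, 1, 2}, {2.0, 1.0, 1.0, 2.0, 1.0}};
}
```

Because each iteration costs one sparse product, the work scales with the number of nonzeros rather than with n squared, which is the property the reply above relies on.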
@lisitsyn suppose I don't specify the eigensolver (which means I don't want to use ARPACK), then how will it go about solving it? I may have a sparse matrix with 1 million entries!
TapkeeOutput embedKernelLocalTangentSpaceAlignment()
{
    Neighbors neighbors = findNeighborsWith(kernel_distance);
    SparseWeightMatrix weight_matrix =
        tangent_weight_matrix(begin,end,neighbors,kernel,p_target_dimension,p_eigenshift);
    DenseMatrix embedding =
        eigendecomposition(p_eigen_method,p_computation_strategy,SmallestEigenvalues,
                           weight_matrix,p_target_dimension).first;
    return TapkeeOutput(embedding, unimplementedProjectingFunction());
}
So the point is: without using ARPACK, and assuming that the matrix is sparse, how do you go about computing the eigenvectors?
@skn123 ha! good point, sorry I confused you about it. It seems that w/o ARPACK it would convert the sparse matrix to a dense one. This is a no-go for sure. I'll try to fix it as soon as I get some time (a few days I hope)
@lisitsyn Even if you were to use randomization (as in red-svd), you would still face the problem of finding the "bottom" eigenvectors. Do you have any thoughts on how we can compute the bottom eigenvectors without using ARPACK?
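One textbook option for the "bottom" end of the spectrum, sketched here purely as an illustration and not as what Tapkee does: for a symmetric positive semidefinite A and any upper bound s on its largest eigenvalue, the largest eigenvalue of s*I - A belongs to the same eigenvector as the smallest eigenvalue of A, so a matrix-vector-only iteration can be run on the shifted operator. The names `Operator`, `apply_tridiagonal` and `smallest_eigenvalue` are hypothetical, and the example operator is a small 1-D discrete Laplacian chosen because its eigenvalues are known in closed form.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vector = std::vector<double>;
using Operator = std::function<Vector(const Vector&)>;  // computes A * x

// Example operator: tridiagonal matrix with 2 on the diagonal and -1 beside it,
// applied without ever storing the matrix.
Vector apply_tridiagonal(const Vector& x)
{
    const std::size_t n = x.size();
    Vector y(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        y[i] = 2.0 * x[i];
        if (i > 0) y[i] -= x[i - 1];
        if (i + 1 < n) y[i] -= x[i + 1];
    }
    return y;
}

// Estimates the smallest eigenvalue of a symmetric positive semidefinite
// operator by power iteration on (shift * I - A). `shift` is assumed to be an
// upper bound on the largest eigenvalue of A.
double smallest_eigenvalue(const Operator& apply_a, std::size_t n, double shift,
                           std::size_t iterations)
{
    assert(n > 0);
    Vector x(n, 1.0);  // start vector; assumed not orthogonal to the target
    double lambda = 0.0;
    for (std::size_t it = 0; it < iterations; ++it)
    {
        double norm2 = 0.0;
        for (double xi : x) norm2 += xi * xi;
        const double norm = std::sqrt(norm2);
        for (double& xi : x) xi /= norm;

        const Vector ax = apply_a(x);
        lambda = 0.0;
        Vector next(n);
        for (std::size_t i = 0; i < n; ++i)
        {
            lambda += x[i] * ax[i];          // Rayleigh quotient of A at x
            next[i] = shift * x[i] - ax[i];  // (shift * I - A) * x
        }
        x = next;
    }
    return lambda;
}
```

For the 5 x 5 example, `smallest_eigenvalue(apply_tridiagonal, 5, 4.0, 300)` approaches 2 - sqrt(3), with the shift 4 taken from the Gershgorin bound on the rows. Two caveats apply: several eigenvectors would need a block version with re-orthogonalization, and convergence becomes slow when the small eigenvalues are tightly clustered.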