brown-cluster
C++ implementation of the Brown word clustering algorithm.
http://www.cs.berkeley.edu/~pliang/papers/meng-thesis.pdf is broken.
In case anyone is clustering large datasets: in my experiments (40M corpus and NofClusters=1000), turning on compiler optimization with "-O3" yields a speed-up of around 3x. I changed the following lines...
Hello! I was browsing the code and I saw opt_define_bool(paths2map, "paths2map", false, "Take the paths file and generate a map file."); Can this option be used? What is...
The length of the text is defined as an int in src, so what happens if the text is longer than INT_MAX?
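On the INT_MAX question: arithmetic overflow on a signed int is undefined behavior in C++, so a length counter that runs past INT_MAX cannot be relied on to behave in any particular way. Below is a minimal sketch, not taken from the brown-cluster source, of one way to guard against this: keep the length in a 64-bit type and check it before narrowing to int. The names fits_in_int and checked_length are hypothetical and used only for illustration.

```cpp
#include <climits>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not part of brown-cluster):
// returns true if n can be represented exactly as an int.
bool fits_in_int(std::int64_t n) {
    return n >= INT_MIN && n <= INT_MAX;
}

// Hypothetical helper: narrow a 64-bit length to int,
// throwing instead of silently producing a wrong value.
int checked_length(std::int64_t n) {
    if (!fits_in_int(n)) {
        throw std::overflow_error("text length does not fit in int");
    }
    return static_cast<int>(n);
}
```

With a check like this, an oversized input fails loudly at the point where the length is computed. Whether the rest of the code could instead switch to a 64-bit type throughout depends on how the length is used elsewhere in src, which this sketch does not assume.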
I was wondering if it's possible to make a library out of this code so that it can be included in other projects?