Gyeong-In Yu
@yunseong Does Vortex give LR accuracy similar to Dolphin's? (~66% for URL reputation data)
I found that the number of tasks affects the correctness of Dolphin's algorithm. I used the first 10000 lines of the URL reputation dataset and ran the LR job with the following two configurations: `-dim 3231961...
Supporting n-fold cross validation at the framework level could be another task to work on.
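To make the cross validation idea concrete, here is a rough sketch of the fold-splitting logic in plain Python. It is not Dolphin code: `k_fold_indices`, `cross_validate`, and the `train_and_score` callback are hypothetical names used only for illustration, and the folds are contiguous with no shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Hypothetical helper for illustration; fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def cross_validate(n_samples, k, train_and_score):
    """Average the score returned by `train_and_score(train_idx, test_idx)`.

    `train_and_score` is an assumed callback that trains on the training
    indices and returns a numeric score (e.g. accuracy) on the test indices.
    """
    scores = [train_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

A framework-level version would presumably have to decide how folds map onto data partitions across tasks, which this sketch does not address.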
This does not mean adding a new integration test that uses the ImageNet dataset to our codebase, since it would take too long.
I pushed a branch named `gy-lr-test` which uses the Scala library `breeze` instead of `mahout`. I made some changes to save memory and improve performance. With the entire URL reputation dataset...
Changes I made:
- Use `breeze` instead of `mahout` (`mahout` does not support in-place updates)
- Change vector computations to in-place updates to avoid allocating redundant new vectors
- Use...
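As a rough illustration of why the in-place change saves memory, the sketch below contrasts an allocating update with an in-place one for `w = w + alpha * x`. It is plain Python, not the `breeze` or `mahout` API, and both function names are hypothetical.

```python
def axpy_copy(w, alpha, x):
    """Return a NEW list holding w + alpha * x (allocates on every call)."""
    if len(w) != len(x):
        raise ValueError("vectors must have the same length")
    return [wi + alpha * xi for wi, xi in zip(w, x)]


def axpy_inplace(w, alpha, x):
    """Overwrite w with w + alpha * x and return the same list object."""
    if len(w) != len(x):
        raise ValueError("vectors must have the same length")
    for i, xi in enumerate(x):
        w[i] += alpha * xi  # reuses w's storage, no new vector is created
    return w
```

With a model vector of millions of dimensions, the allocating form creates a fresh vector on every update step, while the in-place form keeps reusing one buffer.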