datafu

Hadoop library for large-scale data processing, now an Apache Incubator project

Results 4 datafu issues

Several configuration options are available for calculating NDCG. You can specify positional values per range, use a standard logarithmic discounting function, or use a custom function.
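To make the three options concrete, here is a minimal Python sketch of NDCG with a pluggable discount. It is not DataFu's API: the names `log_discount`, `range_discount`, `dcg`, and `ndcg` are hypothetical, and the range form assumes weights that do not increase with rank, so that sorting by relevance gives the ideal ordering.

```python
import math
from typing import Callable, Sequence, Tuple

# A discount maps a 1-based rank to a weight (hypothetical convention).
Discount = Callable[[int], float]


def log_discount(rank: int) -> float:
    """Standard logarithmic discount: 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1)


def range_discount(ranges: Sequence[Tuple[int, float]]) -> Discount:
    """Positional values per range.

    `ranges` is a list of (last_rank_inclusive, weight) pairs in ascending
    rank order; ranks past the final range get weight 0.
    """
    def discount(rank: int) -> float:
        for last_rank, weight in ranges:
            if rank <= last_rank:
                return weight
        return 0.0
    return discount


def dcg(relevances: Sequence[float], discount: Discount = log_discount) -> float:
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel * discount(rank) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[float], discount: Discount = log_discount) -> float:
    """DCG divided by the DCG of the ideal (descending relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), discount)
    return dcg(relevances, discount) / ideal if ideal > 0 else 0.0
```

A custom function is just any callable from rank to weight, for example `ndcg(scores, discount=lambda rank: 1.0 / rank)`.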

Thanks for the fix to the SampleByKey issue. Please let us know when we can expect the release that contains this fix. Or, if the build instructions are documented somewhere I...

It would be great to be able to get back a long instead of a GUID. Ended up writing my own UDF to do this :/

I have 35M link pairs from 117K nodes and ran a PageRank job on a 3-node m2.2xlarge EMR cluster. Initially I got an out-of-memory error in the...