
Loss of log probability precision in PageRank implementation

Open xcharleslin opened this issue 7 years ago • 1 comments

The reference implementation of PageRank loses accuracy when calculating the missing mass (RunPageRankBasic.java:456): it converts the log-probability back into linear space to do the computation.

This introduces an error that grows as the missing mass gets smaller and as more iterations are run. Specifically, it will slightly affect the solution values for an assignment of a certain systems course taught using this library: by 0.00001 on public local test cases, and potentially more on blind cloud test cases. (It does not affect the values in the README example.)

The patch is here (too lazy to pull-request): https://pastebin.com/A9gV6jRi
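For readers who cannot open the paste, here is a minimal sketch of the general idea. It is not the linked patch and not the code in RunPageRankBasic.java. It assumes the missing mass is derived as 1 - exp(logTotal), where logTotal is the log of the total mass that survived the iteration. The class name `LogMissingMass` and both method names are hypothetical, made up for this illustration.

```java
public final class LogMissingMass {
    private LogMissingMass() {}

    /**
     * Hypothetical helper: returns log(1 - exp(x)) for x <= 0 without
     * forming 1 - exp(x) directly when x is close to 0.
     */
    public static double log1mexp(double x) {
        if (x > 0.0) {
            throw new IllegalArgumentException("x must be <= 0");
        }
        // Close to 0, exp(x) is close to 1, so use expm1 to avoid cancellation.
        // Far from 0, exp(x) is tiny, so log1p keeps the small term accurate.
        return (x > -Math.log(2.0))
            ? Math.log(-Math.expm1(x))
            : Math.log1p(-Math.exp(x));
    }

    /** Naive version (assumed shape of the problem): round-trips through linear space. */
    public static double naiveLogMissing(double logTotal) {
        return Math.log(1.0 - Math.exp(logTotal));
    }

    public static void main(String[] args) {
        // Total mass extremely close to 1, so the missing mass is about 1e-20.
        double logTotal = -1e-20;
        System.out.println("naive:    " + naiveLogMissing(logTotal));
        System.out.println("log1mexp: " + log1mexp(logTotal));
    }
}
```

With a total mass this close to 1, the naive version is expected to lose the missing mass entirely, because exp(logTotal) rounds to 1.0 in double precision, while the log-space version stays close to log(1e-20). Keeping the result as a log value also means it can be fed straight into the next log-space sum without another conversion.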

xcharleslin avatar Feb 25 '18 05:02 xcharleslin

Thanks for noting. Now that the semester is winding down, I'll have time to take a look at this. A PR would be even better...

lintool avatar Apr 03 '18 12:04 lintool