Radim Řehůřek

318 comments by Radim Řehůřek

Also bitten by this. I debugged and googled my way to this PR – I have nothing to add to @ja0x's and @TomiBelan's excellent analysis, so just a +1. I...

Yes, very confusing, especially given that there is also a `pybloomfilter` package (different from `pybloomfiltermmap`). Renaming to `pybloomfiltermmap` for consistency seems sensible, depending on how important backward compatibility is.

There is I/O -- processor caches can make a huge difference, and I think any write invalidates the corresponding cache line, even when the write is a no-op (no bits changed). Memory writes...

Done. Let me know if you can replicate the performance improvements, I wonder how other HW / compiler factors play into this (OS X here, Apple LLVM version 7.0.0, MacBookPro11,3)....

I think you're right -- I checked a version that simply writes instead of the `if`, and it's faster still. So, either was just Python overhead that makes this faster,...
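For reference, the two variants being compared can be sketched roughly as below. This is a minimal illustration in pure Python, not the package's actual code: the function names and the `bytearray` bit layout are assumptions made for the example. Pure Python will not show cache effects, so it only demonstrates that both variants leave the filter in the same state and differ only in whether a store is issued for an already-set bit.

```python
def set_bit_checked(bits: bytearray, index: int) -> bool:
    """Set bit `index`, writing only if it is currently clear.

    Returns True if a write actually happened. (Hypothetical helper.)
    """
    byte, mask = index >> 3, 1 << (index & 7)
    if bits[byte] & mask:
        return False  # bit already set: skip the no-op write
    bits[byte] |= mask
    return True


def set_bit_unconditional(bits: bytearray, index: int) -> None:
    """Set bit `index` with a plain write, even if it is already set.

    (Hypothetical helper.)
    """
    byte, mask = index >> 3, 1 << (index & 7)
    bits[byte] |= mask


# Both variants produce identical filter contents.
a, b = bytearray(4), bytearray(4)
for i in (3, 9, 9, 31):
    set_bit_checked(a, i)
    set_bit_unconditional(b, i)
```

Which variant is faster in compiled code depends on the hardware and compiler, so it is worth benchmarking both on the target machine.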

I can reproduce it, on OS X 10.7.5, using `pip install pybloomfiltermmap` as well as `easy_install pybloomfiltermmap`. The package being downloaded in both cases is `pybloomfiltermmap-0.3.14.macosx-10.9-intel.tar.gz`.

Installing from `tar.gz` worked fine.

No, from PyPI source tarball: https://pypi.python.org/packages/source/p/pybloomfiltermmap/pybloomfiltermmap-0.3.14.tar.gz#md5=9c711cf6efca7438fa9dd1829dfa9d05

I just checked and confirmed -- the singular values are ok, but the left/right singular vectors are complete rubbish. Actually, the issue seems to be not just with the identity matrix, but...
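A quick way to confirm this kind of failure is to check the factors directly: reconstruct the input from `U`, `s` and `Vt`, and test the vectors for orthonormality. The sketch below is pure Python with hypothetical helper names. It does not call the library under discussion (the excerpt does not name it), and it assumes a thin SVD with `U` of shape m×n, `s` of length n and `Vt` of shape n×n.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def svd_errors(a, u, s, vt):
    """Return (reconstruction error, U orthonormality error, V orthonormality error)."""
    n = len(s)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Scale column j of U by s[j], i.e. compute U * diag(s).
    us = [[row[j] * s[j] for j in range(n)] for row in u]
    return (
        max_abs_diff(matmul(us, vt), a),
        max_abs_diff(matmul(transpose(u), u), identity),
        max_abs_diff(matmul(vt, transpose(vt)), identity),
    )


eye = [[1.0, 0.0], [0.0, 1.0]]
good = svd_errors(eye, eye, [1.0, 1.0], eye)
# Correct singular values, but a non-orthonormal U: both checks flag it.
bad = svd_errors(eye, [[1.0, 1.0], [0.0, 1.0]], [1.0, 1.0], eye)
```

Correct singular values alone do not make the errors small; all three numbers should be near zero for a valid decomposition.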