gensim icon indicating copy to clipboard operation
gensim copied to clipboard

Using efficient np.linalg.multi_dot

Open maldil opened this issue 2 years ago • 8 comments

It is more succinct and effective to refactor code to use np.linalg.multi_dot rather than numerous np.dot routines. What do you think about this change that has practical value?

maldil avatar Jul 05 '22 02:07 maldil

Can you quantify what the benefit might be, perhaps with a demo showing the improvement?

gojomo avatar Jul 05 '22 05:07 gojomo

Please check the NumPy API documentation here (https://numpy.org/doc/stable/reference/generated/numpy.linalg.multi_dot.html) If you do not trust this doc, I can run a small performance check for you. Thanks for the comment.

maldil avatar Jul 05 '22 05:07 maldil

Where does Gensim multiply more than 2 matrices (so that multidot has any effect)?

Like @gojomo, a practical demonstration of the benefits would be nice.

piskvorky avatar Jul 05 '22 06:07 piskvorky

@piskvorky In the PR, there were two calls to np.dot, which we can replace with one call to np.linalg.multi_dot, Invoking several np.dot invocations is less efficient, according to the API documentation, than using np.linalg.multi_dot.

Please refer the blow demo (I am not sure this is the exact demo that you are looking for)

    from datetime import datetime

    
    A = np.random.random((10000, 100))
    B = np.random.random((100, 1000))
    C = np.random.random((1000, 5))
    D = np.random.random((5, 333))

    multi_dot_start = datetime.now()
    _ = np.linalg.multi_dot([A, B, C, D])
    multi_dot_end = datetime.now()
    print("multi_dot execution time",(multi_dot_end-multi_dot_start).microseconds)

    dot_start = datetime.now()
    _ = np.dot(np.dot(np.dot(A, B), C), D)
    dot_end = datetime.now()
    print("dot execution time",(dot_end - dot_start).microseconds)

Output

multi_dot execution time 17929
dot execution time 85056

multi_dot is 4.7 times faster, and it also succinct and clean which we should not forget.

maldil avatar Jul 05 '22 07:07 maldil

But where would the proposed replacement make any noticeable difference to a Gensim user?

Is the print_error() function that's changed in the PR a major contributor to the runtime of the svd_error.py script? How much would a representative usage of that script be sped up by this change?

gojomo avatar Jul 05 '22 07:07 gojomo

I proposed this patch as an external contributor to make the code more concise and efficient. But how frequently do you use the script svd error.py and what effect does that have? I believe the core contributors should make that decision. But rather than debating about the effects, I think the code has to be improved and adjusted. The impact is always subjective.

maldil avatar Jul 05 '22 07:07 maldil

Sorry, without a clear demonstration of its benefit, I'm -1 on introducing changes.

If there's a compelling use-case, such as a speed-up when training a new model, or calculating document similarities, please post it here.

Speeding up an internal test script by (milli? nano?) seconds is not it. We'd need something that impacts users noticeably.

piskvorky avatar Jul 05 '22 09:07 piskvorky

The PR submitted does not make the source code more 'succinct' or 'concise' – it's 7 characters longer than the original code.

The claim that it is more efficient, in this usage, remains an untested theory. (For example, it's possible that in this case multi_dot may wind up choosing the same order-of-operations as the prior code did – while adding overhead to make that decision. Usually, actual improvements can be objectively proven, not just asserted.)

As a matter of style, I suppose one could try to make a case that it'd be better to always prefer multi_dot when there are any chained dot operations like this, just in case its optimal reordering offers a big win.

However, more developers who will ever be reading or modifying this code are likely to know, offfhand, what np.dot is than what np.linalg.multi_dot is. Seeing multi_dot in the code might require looking up docs, where dot would not. And, it turns out there are even some (corner, perhaps contrived) cases where multi_dot is slower. So even a "good style just in case it helps" case isn't a slam dunk, on grounds of either readability or performance.

I would be favorable to the idea of preferring multi_dot in any cases where it has a chance of making a user-noticeable performance difference, though.

gojomo avatar Jul 07 '22 09:07 gojomo

Thank you for your effort @maldil . I see this change has pros and cons.

Pros: maybe more efficient Cons: more obscure

Given that efficiency is not a prime concern of the affected code, I think at this stage it's best not to merge this change, because the pros outweigh the cons.

mpenkov avatar Oct 16 '22 06:10 mpenkov