mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

Fast cythonized inverse array of unsorted, unique indice

Open mdaviesNCL opened this issue 3 years ago • 9 comments
trafficstars

Fixes #3387

Changes made in this Pull Request:

  • Added cython implementation of inverse of unsorted, unqiue array
  • Replaced python code in groups.py with new cython function
  • Added tests confirming inverse correctly maps unique array onto original

PR Checklist

  • [ ] Tests?
  • [ ] Docs?
  • [ ] CHANGELOG updated?
  • [ ] Issue raised/referenced?

mdaviesNCL avatar Mar 24 '22 18:03 mdaviesNCL

Hello @mdaviesNCL! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2022-04-12 16:44:55 UTC

pep8speaks avatar Mar 24 '22 18:03 pep8speaks

Artifical benchmarking shows speed up. Randomly generated arrays of integers of varying sizes and varying number of unique values. ie: np.random.randint(1, n_unique, n_indices, dtype=np.intp) And comparing the new function with the old version pulled from groups.py

def current(index_array, indices):
    mask = np.zeros_like(index_array)  
    for i, x in enumerate(indices):  
        values = np.where(index_array == x)[0]  
    mask[values] = i 
    return mask

I can add these result in the comments if they are sufficient. Or is there a more realistic test that should be run?

ghost avatar Mar 24 '22 18:03 ghost

Codecov Report

Merging #3573 (6852e2f) into develop (dd7ce7b) will decrease coverage by 0.00%. The diff coverage is 100.00%.

:exclamation: Current head 6852e2f differs from pull request most recent head 2c14033. Consider uploading reports for the commit 2c14033 to get more accurate results

@@             Coverage Diff             @@
##           develop    #3573      +/-   ##
===========================================
- Coverage    94.22%   94.21%   -0.01%     
===========================================
  Files          190      190              
  Lines        24790    24735      -55     
  Branches      3340     3320      -20     
===========================================
- Hits         23359    23305      -54     
+ Misses        1383     1382       -1     
  Partials        48       48              
Impacted Files Coverage Δ
package/MDAnalysis/core/groups.py 98.57% <100.00%> (-0.01%) :arrow_down:
package/MDAnalysis/lib/_cutil.pyx 100.00% <100.00%> (ø)
package/MDAnalysis/lib/util.py 90.46% <100.00%> (ø)
package/MDAnalysis/analysis/encore/covariance.py 87.69% <0.00%> (-6.16%) :arrow_down:
package/MDAnalysis/analysis/lineardensity.py 84.41% <0.00%> (-3.24%) :arrow_down:
package/MDAnalysis/topology/MOL2Parser.py 98.90% <0.00%> (-0.23%) :arrow_down:
MDAnalysisTests/coordinates/base.py 95.34% <0.00%> (-0.02%) :arrow_down:
package/MDAnalysis/coordinates/base.py 95.47% <0.00%> (-0.01%) :arrow_down:
package/MDAnalysis/analysis/pca.py 100.00% <0.00%> (ø)
package/MDAnalysis/analysis/rms.py 95.34% <0.00%> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update dd7ce7b...2c14033. Read the comment docs.

codecov[bot] avatar Mar 24 '22 18:03 codecov[bot]

@mdaviesNCL are you thinking of applying for GSOC or Outreachy with MDAnalysis this year? I ask because if you are thinking of applying, you will need to introduce your on the developers mailing list.

If not all good, its a good idea to introduce yourself on the mailing list regardless. :)

hmacdope avatar Apr 06 '22 01:04 hmacdope

Hi, I am new to open source. so, please tell me how can I start

1krishnasharma avatar Apr 14 '22 07:04 1krishnasharma

@1krishnasharma welcome to MDAnalysis! Are you a GSOC or Outreachy applicant? If so, please introduce yourself on our mailing list (https://groups.google.com/g/mdnalysis-devel)

As a first step, have a look through the user guide (https://userguide.mdanalysis.org/stable/index.html) to get familiar with MDAnalysis, particularly the quick start guide and contributing to MDAnalysis.

If you are applying to GSOC: Have a look at https://www.mdanalysis.org/2022/03/07/gsoc2022/. If you want to apply to the Google Summer of Code, you will need to make a contribution to MDAnalysis first; have a look at the starter issues: https://github.com/MDAnalysis/mdanalysis/labels/GSOC%20Starter

If you are applying to Outreachy: have a look at our wiki (https://github.com/MDAnalysis/mdanalysis/wiki/Outreachy) and FAQ (https://github.com/MDAnalysis/mdanalysis/wiki/Outreachy-FAQ).

If you have any further questions, please ask them on the mailing list or Discord instead of here. This page is for a specific pull request that fixes an issue in MDAnalysis. All discussion should be related to the PR.

lilyminium avatar Apr 14 '22 08:04 lilyminium

This is probably something simple but I am having issues pushing the latest changes. Specfically:

 ! [rejected]            cython-inverse-array -> cython-inverse-array (non-fast-forward)
error: failed to push some refs to 'https://github.com/mdaviesNCL/mdanalysis.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

I attempted fetch and rebase based on the contibution guide and this seemed successful but does not stop the issue.

ghost avatar Apr 21 '22 14:04 ghost

@mdaviesNCL hmm odd, that is usually the error message for when upstream has had additions. Worst case, you can rename your local branch to something "safe", re-check out this remote branch and cherry pick (git cherry-pick <HASH>) in your local commits onto the "safe" branch.

richardjgowers avatar Apr 21 '22 14:04 richardjgowers

@mdaviesNCL are you still having trouble pushing up changes?

richardjgowers avatar May 16 '22 10:05 richardjgowers