big-ann-benchmarks
OPtANNe
BIGANN submission for GraphANN using Intel Optane Persistent Memory (high QPS HW)
@sdongaonkar Hey, catching up with the PRs. Are the two OptANNe PRs intended to be different submissions? I think we already discussed this in email, but I just wanted to make sure:
https://github.com/harsha-simhadri/big-ann-benchmarks/pull/64
https://github.com/harsha-simhadri/big-ann-benchmarks/pull/63
@sdongaonkar I sent you an email on the same topic. We are seeing an improvement on 2 datasets (text2image and msspacev) but a big decrease on the rest (on the recall benchmark). Was that intended?
Thanks for the heads up. I just sent a reply. It seems for 3 of the datasets the last search window size value was picked for the summary. The attached file has all the values, and I've highlighted the intended search_window_size values for recall and QPS rankings.
Please let me know if your numbers are still different to these ones.
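The mix-up described above (the last search window size in the sweep being reported instead of the intended one) can be illustrated with a short sketch. This is not the benchmark's actual summary code. The `Run` type, the helper names, the threshold values, and all numbers below are hypothetical and only show the difference between "take the last run" and "take the best run that meets a threshold".

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    """One point of a parameter sweep (hypothetical structure)."""
    search_window_size: int
    recall: float
    qps: float


def best_recall_run(runs: List[Run], min_qps: float) -> Optional[Run]:
    """Highest-recall run among those meeting the QPS floor, or None."""
    eligible = [r for r in runs if r.qps >= min_qps]
    return max(eligible, key=lambda r: r.recall, default=None)


def best_qps_run(runs: List[Run], min_recall: float) -> Optional[Run]:
    """Highest-QPS run among those meeting the recall floor, or None."""
    eligible = [r for r in runs if r.recall >= min_recall]
    return max(eligible, key=lambda r: r.qps, default=None)


# Made-up sweep: larger windows trade throughput for recall.
sweep = [
    Run(10, 0.80, 9000.0),
    Run(20, 0.90, 5000.0),
    Run(40, 0.95, 2500.0),
    Run(80, 0.98, 1000.0),
]

# Taking the last entry ignores the throughput floor entirely.
last_run = sweep[-1]

# Assumed thresholds, for illustration only.
recall_pick = best_recall_run(sweep, min_qps=2000.0)
qps_pick = best_qps_run(sweep, min_recall=0.90)
print(last_run.search_window_size, recall_pick, qps_pick)
```

With these made-up numbers, the last run in the sweep is not the one either selection rule picks, which is the kind of discrepancy a summary built from the last value would show.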
OK @sdongaonkar, let's keep it as a GitHub issue/conversation for now. I'd like to track down what went wrong and what might still be wrong. The following is the algos.yaml we used when we ran the eval. Does it look correct?
https://github.com/harsha-simhadri/big-ann-benchmarks/blob/t3/eval_optanne_graphann/t3/optanne_graphann/algos.yaml
If it is correct, then something else besides the yaml is wrong. Thanks.
@sdongaonkar Hey, we are currently unable to run any more evaluations due to this error. Any thoughts on what to do here?
Trying to instantiate benchmark.algorithms.graphann.GraphANN(['euclidean', {'index_file': '/mnt/data/competition_indexes/bigann1b_index127-superflat.index', 'vectors_file': '/mnt/data/competition_indexes/bigann1b_vectors.bin', 'vectors_location': 'HUGE'}])
Activating environment at ~/BigANN/GraphANN/contrib/PyANN/Project.toml
Precompiling project...
✓ PyCall
7 dependencies successfully precompiled in 217 seconds (58 already precompiled)
1 dependency precompiled but a different version is currently loaded. Restart julia to access the new version
[ Info: Running in an exclusive environment. Populating thread affinities.
[ Info: Running in an exclusive environment. Populating thread affinities.
[ Info: Running in an exclusive environment. Populating thread affinities.
Running graphann on bigann-1B
Got 10000 queries
Copying index to PMem...
Process Process-1:
Traceback (most recent call last):
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/main.py", line 45, in run_worker
    run_no_docker(definition, args.dataset, args.count,
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/runner.py", line 339, in run_no_docker
    run_from_cmdline(cmd)
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/runner.py", line 230, in run_from_cmdline
    run(definition, args.dataset, args.count, args.runs, args.rebuild,
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/runner.py", line 95, in run
    elif rebuild or not algo.load_index(dataset):
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/algorithms/graphann.py", line 85, in load_index
    self.create_index_dir(ds),
  File "/home/bigann/BigANN/big-ann-benchmarks/benchmark/algorithms/graphann.py", line 151, in create_index_dir
    shutil.copy(self._index_file, graph_file0)
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/shutil.py", line 418, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/shutil.py", line 275, in copyfile
    _fastcopy_sendfile(fsrc, fdst)
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/shutil.py", line 166, in _fastcopy_sendfile
    raise err from None
  File "/home/bigann/.pyenv/versions/3.8.12/lib/python3.8/shutil.py", line 152, in _fastcopy_sendfile
    sent = os.sendfile(outfd, infd, offset, blocksize)
OSError: [Errno 28] No space left on device: '/mnt/data/competition_indexes/bigann1b_index127-superflat.index' -> '/mnt/pm0/public/graph.bin'
I was running the benchmark script on it. I just logged out. It should work now.
Ooops. Apologies...I should have checked if someone was logged in. Thanks!
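The `No space left on device` failure in the traceback above was raised by `shutil.copy` while the index was being written to `/mnt/pm0/public/graph.bin`, at a time when another session was still using the device. A minimal sketch of checking free space before such a copy follows. The helper names `has_room` and `copy_with_space_check` are hypothetical and are not part of the benchmark or GraphANN code.

```python
import errno
import os
import shutil


def has_room(src: str, dst_dir: str) -> bool:
    """True if the filesystem holding dst_dir has at least as many
    free bytes as the size of the src file."""
    return shutil.disk_usage(dst_dir).free >= os.path.getsize(src)


def copy_with_space_check(src: str, dst_dir: str, dst_name: str) -> str:
    """Copy src into dst_dir as dst_name, failing early with ENOSPC
    if the destination filesystem is too full. Returns the new path."""
    if not has_room(src, dst_dir):
        raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC), dst_dir)
    return shutil.copy(src, os.path.join(dst_dir, dst_name))
```

A check like this only fails faster and with a clearer message. It is inherently racy, since another process can fill the device between the check and the copy, so it does not replace coordinating access to a shared machine.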
@sdongaonkar @hildebrandmw We want to merge this branch to main. Could you please make any final changes to the PR before we merge? Also, please do not delete or modify data in the common yaml and eval files in your PR.
Hi there! Very, very sorry for the delay! As far as I know, there are unfortunately no plans to make the original code behind this submission publicly available. I'd be happy to try to resolve the merge conflicts, but would honestly recommend closing this PR (and PRs associated with this one) to avoid merging what is now essentially dead code into main.
If in the future the code does become available, I'd be willing to open a new PR with any updated interfaces and configurations. What do you think?
Thanks @hildebrandmw for the update. It's a pity, but it would of course be very much appreciated if you were to include it at a later stage. @sourcesync @harsha-simhadri Probably #63, #64, and #103 should be closed following Mark's suggestions.
@Martin Aumüller I agree, unless Mark sees a way to "resolve the merge conflicts" as he offered. But even then, I'm not sure how useful that would be. I also believe Intel has decided to discontinue the Optane hardware, so at some point it will be impossible to reincarnate this submission.
@sourcesync I think it's your call to say whether https://github.com/harsha-simhadri/big-ann-benchmarks/pull/103/files#diff-8deadc2d4adf2d0fbfc7cb5dc53dd739b5dc69e68779d7f53999985ec47374a6R59-R100 adds something useful to the infrastructure, independently of the implementation.
@Martin Aumüller Thanks for the diff. I don't see anything in there worth keeping for version 2. I would recommend we keep or archive the branch and just close the PR. I also believe the participant forked the repository, so it lives there as well, just in case. I think @Harsha Vardhan Simhadri should have the final say, but I'm good with at least closing the PR.