
[Do not merge][graphbolt] s3-fifo-cache for DiskBasedFeature

Open pyynb opened this issue 1 year ago • 7 comments

Description

Checklist

Please feel free to remove inapplicable items for your PR.

  • [ ] The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • [ ] I've leveraged the tools to beautify the Python and C++ code.
  • [ ] The PR is complete and small. Read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • [ ] All changes have test coverage
  • [ ] Code is well-documented
  • [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • [ ] Related issue is referred in this PR
  • [ ] If the PR is for a new model/paper, I've updated the example index here.

Changes

pyynb avatar Jun 03 '24 03:06 pyynb

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch]; For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

dgl-bot avatar Jun 03 '24 03:06 dgl-bot

Is the data required to run this code generated by some other code?

mfbalin avatar Jun 03 '24 03:06 mfbalin

Commit ID: 55d97f7cc140f95550bf36ba07a88fb6e1c3f80d

Build ID: 1

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot avatar Jun 03 '24 04:06 dgl-bot

Commit ID: 09dfc1fd42c6eea7dafe318f0683b343dec88825

Build ID: 2

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

dgl-bot avatar Jun 03 '24 04:06 dgl-bot

> Is the data required to run this code generated by some other code?

Yes. During a training run, I recorded all the indices and saved them in a txt file. The file is too large to push, so you can try recording it yourself.

pyynb avatar Jun 11 '24 12:06 pyynb
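For anyone who wants to record such a trace themselves, here is a minimal sketch of one way to do it. It is not the script used in this PR. The helper names `append_batch` and `read_trace` and the one-batch-per-line text layout are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, Iterator


def append_batch(path: Path, indices: Iterable[int]) -> None:
    """Append one mini-batch of feature indices as a single text line."""
    # Hypothetical layout: space-separated integers, one batch per line.
    with path.open("a", encoding="utf-8") as f:
        f.write(" ".join(str(int(i)) for i in indices) + "\n")


def read_trace(path: Path) -> Iterator[int]:
    """Stream the recorded indices back in access order."""
    # Reading line by line keeps memory flat even for very large traces.
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            for token in line.split():
                yield int(token)
```

In a training loop, the indices requested from the feature store in each iteration would be passed to `append_batch` (a tensor can be converted first with `.tolist()`). The resulting file can then be replayed against any cache policy.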

Commit ID: b404ae34844d5cc7c75bc865a685ca6c4976d24d

Build ID: 3

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

dgl-bot avatar Jun 11 '24 13:06 dgl-bot

@pyynb Could you run the cache hit rate benchmark with the cache in #7492? It can be accessed via gb.impl.FeatureCache.

mfbalin avatar Jul 02 '24 06:07 mfbalin
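As a rough illustration of what a cache hit rate benchmark over a recorded index trace measures, the sketch below replays a trace through a key-only S3-FIFO simulator in pure Python. It is neither the code in this PR nor `gb.impl.FeatureCache`. The names `S3FIFOSim` and `hit_rate`, the 10% small-queue ratio, and the ghost-queue size are assumptions, and the promotion step is simplified relative to the published algorithm.

```python
from collections import OrderedDict, deque
from typing import Hashable, Iterable


class S3FIFOSim:
    """Key-only S3-FIFO simulator for hit-rate measurement (stores no data)."""

    def __init__(self, capacity: int, small_ratio: float = 0.1):
        assert capacity >= 2 and 0.0 < small_ratio <= 1.0
        self.capacity = capacity
        self.small_target = max(1, int(capacity * small_ratio))
        self.small = deque()          # probationary FIFO, head on the left
        self.main = deque()           # main FIFO, head on the left
        self.freq = {}                # resident key -> counter capped at 3
        self.ghost = OrderedDict()    # keys recently evicted from `small`
        self.ghost_cap = max(1, capacity - self.small_target)
        self.hits = 0
        self.misses = 0

    def access(self, key: Hashable) -> bool:
        """Record one access; return True on a hit."""
        if key in self.freq:
            self.freq[key] = min(self.freq[key] + 1, 3)
            self.hits += 1
            return True
        self.misses += 1
        while len(self.freq) >= self.capacity:
            self._evict()
        if key in self.ghost:
            # Seen recently: skip probation and go straight to `main`.
            del self.ghost[key]
            self.main.appendleft(key)
        else:
            self.small.appendleft(key)
        self.freq[key] = 0
        return False

    def _evict(self) -> None:
        if len(self.small) >= self.small_target or not self.main:
            self._evict_small()
        else:
            self._evict_main()

    def _evict_small(self) -> None:
        while self.small:
            key = self.small.pop()  # oldest entry
            if self.freq[key] > 1:
                self.main.appendleft(key)  # promoted, still resident
            else:
                del self.freq[key]
                self.ghost[key] = None
                if len(self.ghost) > self.ghost_cap:
                    self.ghost.popitem(last=False)
                return

    def _evict_main(self) -> None:
        while self.main:
            key = self.main.pop()  # oldest entry
            if self.freq[key] > 0:
                self.freq[key] -= 1
                self.main.appendleft(key)  # second chance
            else:
                del self.freq[key]
                return


def hit_rate(trace: Iterable[Hashable], capacity: int) -> float:
    """Replay `trace` through the simulator and return hits / accesses."""
    sim = S3FIFOSim(capacity)
    for key in trace:
        sim.access(key)
    total = sim.hits + sim.misses
    return sim.hits / total if total else 0.0
```

Replaying the same trace through other policies with the same capacity gives directly comparable hit rates. Because the simulator tracks keys only, it can process a large trace without loading any feature data.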

Commit ID: 53107cfeba431ca74d2d539afe8339cb149bad65

Build ID: 4

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

dgl-bot avatar Jul 05 '24 07:07 dgl-bot

We have experimental results comparing different caching policies here: https://github.com/dmlc/dgl/tree/master/examples/graphbolt/disk_based_feature

mfbalin avatar Aug 21 '24 01:08 mfbalin