feat: Add LightGBM streaming execution mode

Open svotaw opened this issue 3 years ago • 40 comments

Summary

Add a streaming execution mode to the LightGBM wrapper. This mode uses almost no memory beyond what native LightGBM itself needs to execute.
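
To illustrate why a streaming mode can keep memory overhead small, here is a minimal conceptual sketch in Python. It is not the SynapseML or LightGBM implementation: `FakeDataset`, `push_rows`, `bulk_load`, `streaming_load`, and `make_rows` are hypothetical names invented for this example. It assumes that a bulk-style loader buffers a whole partition before handing it to the dataset, and that a streaming-style loader knows the row count up front and pushes small batches into preallocated storage.

```python
from typing import Iterable, Iterator, List, Tuple

Row = List[float]


def make_rows(n_rows: int, n_cols: int) -> Iterator[Row]:
    """Hypothetical stand-in for an iterator over one partition's rows."""
    for i in range(n_rows):
        yield [float(i + j) for j in range(n_cols)]


class FakeDataset:
    """Hypothetical stand-in for a native dataset with preallocated storage."""

    def __init__(self, n_rows: int, n_cols: int) -> None:
        self.data = [[0.0] * n_cols for _ in range(n_rows)]
        self.next_row = 0

    def push_rows(self, batch: List[Row]) -> None:
        # Copy each row's values into the preallocated slot.
        for row in batch:
            self.data[self.next_row][:] = row
            self.next_row += 1


def bulk_load(rows: Iterable[Row], n_rows: int, n_cols: int) -> Tuple[FakeDataset, int]:
    """Buffer every row first, then copy once; the buffer holds all rows at its peak."""
    buffered = list(rows)
    dataset = FakeDataset(n_rows, n_cols)
    dataset.push_rows(buffered)
    return dataset, len(buffered)


def streaming_load(
    rows: Iterable[Row], n_rows: int, n_cols: int, batch_size: int = 4
) -> Tuple[FakeDataset, int]:
    """Push small batches as they arrive; the buffer never exceeds batch_size rows."""
    dataset = FakeDataset(n_rows, n_cols)
    batch: List[Row] = []
    peak = 0
    for row in rows:
        batch.append(row)
        peak = max(peak, len(batch))
        if len(batch) == batch_size:
            dataset.push_rows(batch)
            batch.clear()
    if batch:  # flush the final partial batch
        dataset.push_rows(batch)
    return dataset, peak
```

Both loaders in this sketch produce the same dataset contents; the difference is only in how many rows sit in the intermediate buffer at once, which is the overhead the streaming mode is meant to avoid.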

Tests

Tests will be modified to run in both bulk and streaming mode before check-in. Before this PR was pushed, all LightGBMClassifier tests (which make up the bulk of the tests) were passing in streaming mode. There are also some new tests just for streaming components and instrumentation in Common.

Dependency changes

This PR cannot be checked in without the corresponding changes in the native LightGBM library (https://github.com/microsoft/LightGBM/pull/5299), or without pointing to a custom upload.

AB#1891953

svotaw avatar Jul 21 '22 22:07 svotaw

Hey @svotaw ! Thank you so much for contributing to our repository. Someone from SynapseML Team will be reviewing this pull request soon. We appreciate your patience and contributions!

github-actions[bot] avatar Jul 21 '22 22:07 github-actions[bot]

/azp run

svotaw avatar Jul 21 '22 23:07 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Jul 21 '22 23:07 azure-pipelines[bot]

/azp run

mhamilton723 avatar Jul 22 '22 16:07 mhamilton723

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Jul 22 '22 16:07 azure-pipelines[bot]

/azp run

svotaw avatar Aug 17 '22 06:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 17 '22 06:08 azure-pipelines[bot]

Codecov Report

Merging #1580 (5afbd85) into master (b205cc4) will increase coverage by 0.57%. The diff coverage is 94.58%.

@@            Coverage Diff             @@
##           master    #1580      +/-   ##
==========================================
+ Coverage   85.94%   86.51%   +0.57%     
==========================================
  Files         271      273       +2     
  Lines       14141    14420     +279     
  Branches      738      769      +31     
==========================================
+ Hits        12154    12476     +322     
+ Misses       1987     1944      -43     

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...soft/azure/synapse/ml/core/utils/ClusterUtil.scala | 68.49% <ø> (+2.73%) | :arrow_up: |
| ...e/synapse/ml/lightgbm/params/BaseTrainParams.scala | 98.36% <0.00%> (-1.64%) | :arrow_down: |
| ...oft/azure/synapse/ml/lightgbm/swig/SwigUtils.scala | 84.81% <76.92%> (-0.10%) | :arrow_down: |
| ...ft/azure/synapse/ml/lightgbm/TrainingContext.scala | 73.07% <85.71%> (+4.65%) | :arrow_up: |
| ...zure/synapse/ml/lightgbm/dataset/SampledData.scala | 90.00% <90.00%> (ø) | |
| ...e/synapse/ml/lightgbm/StreamingPartitionTask.scala | 97.10% <97.10%> (ø) | |
| .../azure/synapse/ml/lightgbm/BasePartitionTask.scala | 90.40% <100.00%> (+0.15%) | :arrow_up: |
| .../azure/synapse/ml/lightgbm/BulkPartitionTask.scala | 98.21% <100.00%> (ø) | |
| ...osoft/azure/synapse/ml/lightgbm/LightGBMBase.scala | 95.41% <100.00%> (+7.08%) | :arrow_up: |
| ...crosoft/azure/synapse/ml/lightgbm/TrainUtils.scala | 84.03% <100.00%> (-0.14%) | :arrow_down: |
| ... and 10 more | | |

codecov-commenter avatar Aug 17 '22 06:08 codecov-commenter

/azp run

svotaw avatar Aug 17 '22 19:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 17 '22 19:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 17 '22 19:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 17 '22 19:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 17 '22 21:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 17 '22 21:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 21 '22 05:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 21 '22 05:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 21 '22 17:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 21 '22 17:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 22 '22 21:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 22 '22 21:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 22 '22 21:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 22 '22 21:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 22 '22 22:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 22 '22 22:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 22 '22 23:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 22 '22 23:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 23 '22 16:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 23 '22 16:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 25 '22 22:08 svotaw

Pull request contains merge conflicts.

azure-pipelines[bot] avatar Aug 25 '22 22:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 25 '22 23:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 25 '22 23:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 26 '22 03:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 26 '22 03:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 26 '22 05:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 26 '22 05:08 azure-pipelines[bot]

/azp run

svotaw avatar Aug 26 '22 06:08 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Aug 26 '22 06:08 azure-pipelines[bot]

/azp run

svotaw avatar Sep 01 '22 22:09 svotaw

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Sep 01 '22 22:09 azure-pipelines[bot]