SynapseML
feat: Add LightGBM streaming execution mode
Summary
Add a streaming execution mode to the LightGBM wrapper. This mode uses almost no memory beyond what LightGBM itself needs to execute.
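One way to picture the memory claim above is the toy comparison below. It is a sketch under stated assumptions, not the PR's implementation: the description does not spell out the mechanism, so the idea that streaming copies rows in small batches into a pre-sized dataset is an assumption, and `IngestSketch`, `FixedDataset`, `loadBulk`, and `loadStreaming` are hypothetical names that do not exist in SynapseML or LightGBM.

```scala
// Illustrative only: none of these names come from SynapseML or LightGBM.
object IngestSketch {
  // Stand-in for a pre-sized dataset owned by the native library.
  final class FixedDataset(capacity: Int) {
    private val values = new Array[Double](capacity)
    private var count = 0

    // Copy one batch of rows into the next free slots.
    def push(batch: Array[Double]): Unit = {
      System.arraycopy(batch, 0, values, count, batch.length)
      count += batch.length
    }

    def rowCount: Int = count
    def sum: Double = values.take(count).sum
  }

  // Bulk (assumed behavior): materialize every row first, then copy once.
  // The second tuple element is the largest buffer held outside the dataset.
  def loadBulk(rows: Iterator[Double], capacity: Int): (FixedDataset, Int) = {
    val all = rows.toArray
    val ds = new FixedDataset(capacity)
    ds.push(all)
    (ds, all.length)
  }

  // Streaming (assumed behavior): copy rows batch by batch, so only one
  // small batch is ever buffered outside the dataset.
  def loadStreaming(rows: Iterator[Double], capacity: Int, batchSize: Int): (FixedDataset, Int) = {
    val ds = new FixedDataset(capacity)
    var maxBuffered = 0
    rows.grouped(batchSize).foreach { batch =>
      maxBuffered = math.max(maxBuffered, batch.length)
      ds.push(batch.toArray)
    }
    (ds, maxBuffered)
  }
}
```

Both loaders end with the same dataset contents. In this toy model, only the size of the temporary buffer differs: the whole partition for bulk, a single batch for streaming.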
Tests
Tests will be modified to run in both bulk and streaming mode before check-in. Before this PR was pushed, all LightGBMClassifier tests, which make up most of the test suite, were passing in streaming mode. There are also some new tests specifically for the streaming components and for the instrumentation in Common.
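Running the same test body in both modes could look roughly like the sketch below. Everything in it is hypothetical: `ModeMatrixSketch`, `Mode`, `runInAllModes`, and `trainAndScore` are illustrative names, and `trainAndScore` is a toy stand-in for real training. It does not reflect how the SynapseML test suite is actually structured.

```scala
// Illustrative only: a toy harness, not the SynapseML test infrastructure.
object ModeMatrixSketch {
  sealed trait Mode { def name: String }
  case object Bulk extends Mode { val name = "bulk" }
  case object Streaming extends Mode { val name = "streaming" }

  val allModes: Seq[Mode] = Seq(Bulk, Streaming)

  // Hypothetical stand-in for "train a model and compute a metric".
  // Both branches compute the mean; the streaming branch does it batch by batch.
  def trainAndScore(mode: Mode, data: Seq[Double]): Double = mode match {
    case Bulk =>
      data.sum / data.size
    case Streaming =>
      val (total, n) = data.grouped(2).foldLeft((0.0, 0)) {
        case ((accSum, accCount), batch) => (accSum + batch.sum, accCount + batch.size)
      }
      total / n
  }

  // Run one test body once per mode and collect the results by mode name.
  def runInAllModes[A](body: Mode => A): Map[String, A] =
    allModes.map(m => m.name -> body(m)).toMap
}
```

A test would then call `ModeMatrixSketch.runInAllModes(mode => ModeMatrixSketch.trainAndScore(mode, data))` and assert that the results for each mode agree within a tolerance.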
Dependency changes
This PR cannot be checked in until the corresponding changes are available in the native LightGBM library, unless the build points to a custom upload instead. https://github.com/microsoft/LightGBM/pull/5299
AB#1891953
Hey @svotaw ! Thank you so much for contributing to our repository. Someone from SynapseML Team will be reviewing this pull request soon. We appreciate your patience and contributions!
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
Merging #1580 (5afbd85) into master (b205cc4) will increase coverage by 0.57%. The diff coverage is 94.58%.
@@ Coverage Diff @@
## master #1580 +/- ##
==========================================
+ Coverage 85.94% 86.51% +0.57%
==========================================
Files 271 273 +2
Lines 14141 14420 +279
Branches 738 769 +31
==========================================
+ Hits 12154 12476 +322
+ Misses 1987 1944 -43
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...soft/azure/synapse/ml/core/utils/ClusterUtil.scala | 68.49% <ø> (+2.73%) | :arrow_up: |
| ...e/synapse/ml/lightgbm/params/BaseTrainParams.scala | 98.36% <0.00%> (-1.64%) | :arrow_down: |
| ...oft/azure/synapse/ml/lightgbm/swig/SwigUtils.scala | 84.81% <76.92%> (-0.10%) | :arrow_down: |
| ...ft/azure/synapse/ml/lightgbm/TrainingContext.scala | 73.07% <85.71%> (+4.65%) | :arrow_up: |
| ...zure/synapse/ml/lightgbm/dataset/SampledData.scala | 90.00% <90.00%> (ø) | |
| ...e/synapse/ml/lightgbm/StreamingPartitionTask.scala | 97.10% <97.10%> (ø) | |
| .../azure/synapse/ml/lightgbm/BasePartitionTask.scala | 90.40% <100.00%> (+0.15%) | :arrow_up: |
| .../azure/synapse/ml/lightgbm/BulkPartitionTask.scala | 98.21% <100.00%> (ø) | |
| ...osoft/azure/synapse/ml/lightgbm/LightGBMBase.scala | 95.41% <100.00%> (+7.08%) | :arrow_up: |
| ...crosoft/azure/synapse/ml/lightgbm/TrainUtils.scala | 84.03% <100.00%> (-0.14%) | :arrow_down: |
| ... and 10 more | | |
/azp run
Pull request contains merge conflicts.