SynapseML
[LightGBM] Weight column in LightGBM classifier is not working as per expectation
SynapseML version
2.12:0.9.5
System information
- Language version: Python 3.7, Scala 2.12
- Spark version: 3.3.0
- Spark platform: Databricks
Describe the problem
Hi, I am using LightGBMClassifier for a skewed binary classification problem. I have several features (A, B, C, and so on). I group by these features and compute a weight for class 0 and class 1 within each group.
For the test data, however, I set all weights to 1.
I can see that the loss on my test data is not converging. Is this the correct way to use weightCol?
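For illustration, the per-group class weighting described above could look like the following plain-Python sketch. The function name `balanced_weights`, the toy rows, and the specific formula (each class gets an equal share of its group's total weight) are assumptions made for this example; the actual weighting used in the issue is not shown.

```python
from collections import Counter


def balanced_weights(rows):
    """Return one weight per (group, label) row.

    Assumed scheme: within each group, every class present receives the
    same total weight, and the weights of a group sum to its row count.
    """
    pair_counts = Counter(rows)                  # (group, label) -> rows
    group_counts = Counter(g for g, _ in rows)   # group -> rows
    weights = []
    for group, label in rows:
        # Number of classes that actually occur in this group (1 or 2).
        n_classes = sum(1 for y in (0, 1) if pair_counts[(group, y)] > 0)
        weights.append(
            group_counts[group] / (n_classes * pair_counts[(group, label)])
        )
    return weights


# Toy data: group A is 9:1 skewed, group B is 3:1 skewed.
rows = [("A", 0)] * 9 + [("A", 1)] * 1 + [("B", 0)] * 3 + [("B", 1)] * 1
w = balanced_weights(rows)
```

With weights like these, the training loss is a weighted loss while a test set with all weights equal to 1 yields an unweighted loss, so the two curves are not directly comparable.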
One more observation: if I set isUnbalance to True during inference, the model gives random predictions and the AUC drops to 50%. So I had to set isUnbalance to False during inference. Please let me know if this is the expected behavior.
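One possible explanation worth checking is that isUnbalance and weightCol compensate for the same skew twice. The sketch below assumes that LightGBM's binary objective, when `is_unbalance` is set, multiplies the minority class by the majority/minority count ratio during training; that rule is stated here from memory and should be verified against the LightGBM documentation. The helper names and the counts are hypothetical.

```python
def unbalance_label_weights(n_pos, n_neg):
    """Assumed is_unbalance rule: upweight the minority class by
    majority_count / minority_count, leave the majority class at 1."""
    if n_pos == 0 or n_neg == 0:
        return {0: 1.0, 1: 1.0}
    if n_pos > n_neg:
        return {0: n_pos / n_neg, 1: 1.0}
    return {0: 1.0, 1: n_neg / n_pos}


def effective_weight(label, row_weight, label_weights):
    """Weight a row would carry if both mechanisms were applied."""
    return row_weight * label_weights[label]


# Hypothetical counts: 100 positives, 900 negatives.
n_pos, n_neg = 100, 900
label_w = unbalance_label_weights(n_pos, n_neg)

# Row weights that already balance the classes on their own
# (each class sums to half of the 1000 rows).
row_w = {0: 1000 / (2 * n_neg), 1: 1000 / (2 * n_pos)}

total_pos = n_pos * effective_weight(1, row_w[1], label_w)
total_neg = n_neg * effective_weight(0, row_w[0], label_w)
print(total_pos / total_neg)  # positives now outweigh negatives
```

Under these assumptions the positive class ends up carrying roughly nine times the total weight of the negative class, which could distort training when both options are enabled together.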
Code to reproduce issue
from synapse.ml.lightgbm import LightGBMClassifier

# Tuned hyperparameters; not every entry is passed to the classifier below.
params = {
    'baggingFraction': 0.8156468375795559,
    'featureFraction': 0.8609557255311693,
    'featuresCol': 'features',
    'labelCol': 'label',
    'learningRate': 0.1449558170049662,
    'maxDepth': 29,
    'minSumHessianInLeaf': 0.03753901648224433,
    'numIterations': 80,
    'numLeaves': 133,
    'weightCol': 'weight',
    'objective': 'binary',
    'useSingleDatasetMode': True,
    'isUnbalance': False,
    'useBarrierExecutionMode': True,
    'parallelism': 'voting_parallel',
    'metric': 'auc',
}

lgb = LightGBMClassifier(
    numIterations=params['numIterations'],
    numLeaves=params['numLeaves'],
    maxDepth=params['maxDepth'],
    baggingFraction=params['baggingFraction'],
    featureFraction=params['featureFraction'],
    minSumHessianInLeaf=params['minSumHessianInLeaf'],
    learningRate=params['learningRate'],
    objective=params['objective'],
    labelCol=params['labelCol'],
    featuresCol=params['featuresCol'],
    weightCol=params['weightCol'],  # per-row weights computed per feature group
    useSingleDatasetMode=True,
    # isUnbalance=False,
    useBarrierExecutionMode=True,
    # parallelism="voting_parallel",
    metric=params['metric'],
)
Other info / logs
No response
What component(s) does this bug affect?
- [ ] area/cognitive: Cognitive project
- [ ] area/core: Core project
- [ ] area/deep-learning: DeepLearning project
- [X] area/lightgbm: Lightgbm project
- [ ] area/opencv: Opencv project
- [ ] area/vw: VW project
- [ ] area/website: Website
- [ ] area/build: Project build system
- [ ] area/notebooks: Samples under notebooks folder
- [ ] area/docker: Docker usage
- [ ] area/models: models related issue
What language(s) does this bug affect?
- [ ] language/scala: Scala source code
- [X] language/python: Pyspark APIs
- [ ] language/r: R APIs
- [ ] language/csharp: .NET APIs
- [ ] language/new: Proposals for new client languages
What integration(s) does this bug affect?
- [ ] integrations/synapse: Azure Synapse integrations
- [ ] integrations/azureml: Azure ML integrations
- [X] integrations/databricks: Databricks integrations
Hey @coolcoder001 :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
We have released 0.11.2, which has newer features. We are no longer really supporting 0.9.5, and we will release the official 1.0 version soon.