LightGBM validationIndicatorCol: which values exactly go in this string column?

Open whiteneverdie opened this issue 5 years ago • 6 comments

Hi, folks!

validationIndicatorCol (str): Indicates whether the row is for training or validation

def setValidationIndicatorCol(value: String): this.type = set(validationIndicatorCol, value)

Does this mean just a string column with two literal values, "training" or "validation"?

Originally posted by @whiteneverdie in https://github.com/Azure/mmlspark/issues/689#issuecomment-644167974

whiteneverdie, Jun 15 '20 15:06

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it.

welcome[bot], Jun 15 '20 15:06

@whiteneverdie great question, and sorry about the confusion. The parameter value should just be the name of the column: https://github.com/Azure/mmlspark/blob/master/src/main/scala/com/microsoft/ml/spark/core/contracts/Params.scala#L179

The column itself should just contain booleans; we filter on it here: https://github.com/Azure/mmlspark/blob/master/src/main/scala/com/microsoft/ml/spark/lightgbm/LightGBMBase.scala#L175
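To make the boolean reading concrete, here is a minimal plain-Python sketch of how such a flag partitions rows. It is an illustration only, not the library's code: the column name `is_validation` and the helper `split_by_indicator` are made up for this example, and the "true means validation" convention is the one suggested further down in this thread.

```python
# Plain-Python stand-in for a DataFrame; "is_validation" is a hypothetical column name.
rows = [
    {"label": 0, "is_validation": False},
    {"label": 1, "is_validation": False},
    {"label": 1, "is_validation": True},
    {"label": 0, "is_validation": False},
]


def split_by_indicator(rows, indicator_col):
    """Partition rows on a boolean flag: True -> validation, False -> training."""
    train, valid = [], []
    for row in rows:
        flag = row[indicator_col]
        # Strings such as "training" / "validation" are rejected: the flag must be a bool.
        if not isinstance(flag, bool):
            raise TypeError(f"{indicator_col!r} must hold booleans, got {flag!r}")
        (valid if flag else train).append(row)
    return train, valid


train_rows, valid_rows = split_by_indicator(rows, "is_validation")
```

In Spark the same flag would be a BooleanType column on the input DataFrame, and only its name is passed to setValidationIndicatorCol.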

Note that there is an issue with having huge validation sets; hopefully that's not the case for your scenario: https://github.com/Azure/mmlspark/issues/689

Oh, actually, it looks like you commented on that issue already, sorry I didn't notice your comment there.

imatiach-msft, Jun 16 '20 02:06

So ValidationIndicatorCol holds boolean values, hmm. But where can we set which metric is used for our validation? @imatiach-msft thanks!

shuDaoNan9, Aug 29 '20 06:08

@JWenBin the metric for validation can be set here, using setMetric: https://github.com/Azure/mmlspark/blob/master/src/main/scala/com/microsoft/ml/spark/lightgbm/LightGBMParams.scala#L310
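For PySpark users, a rough sketch of how the two settings might be combined. The column names and the choice of "auc" are assumptions for illustration, and the import path assumes a SynapseML install (older mmlspark releases expose the class under mmlspark.lightgbm instead). The import is deferred into a hypothetical helper, `build_classifier`, so the snippet loads even without Spark.

```python
# Hypothetical column names; validationIndicatorCol and metric are the parameters
# discussed in this thread, labelCol/featuresCol are the usual Spark ML ones.
lgbm_params = {
    "labelCol": "label",
    "featuresCol": "features",
    "validationIndicatorCol": "is_validation",  # name of a boolean column
    "metric": "auc",  # assumed choice; any metric LightGBM supports for the task
}


def build_classifier(params):
    """Create the estimator; the import is deferred so this file loads without Spark."""
    # Assumption: SynapseML is installed. On older mmlspark releases, import
    # LightGBMClassifier from mmlspark.lightgbm instead.
    from synapse.ml.lightgbm import LightGBMClassifier

    return LightGBMClassifier(**params)
```

With a DataFrame `df` that already has the boolean column, training would then be `build_classifier(lgbm_params).fit(df)`.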

imatiach-msft, Aug 29 '20 20:08

Hi, I have a question: why does the validation data participate in model training when using setValidationIndicatorCol? Looking forward to your reply.

ICDI0906, Oct 08 '21 13:10

So I think the code documentation should describe this more clearly, for example:

  • The param contains the column name
  • That column's type is boolean: true if the row belongs to the validation set, false if it belongs to the training set

ttpro1995, Jul 05 '22 04:07