SynapseML
Simple and Distributed Machine Learning
**Describe the bug** When running the following lines, I get the error: parameter numItemBlocks given invalid value 0. access_anomaly = AccessAnomaly( tenantCol='tenant_id', userCol='user', resCol='res', likelihoodCol='likelihood', maxIter=100 ) access_anomaly_model = access_anomaly.fit(ptraining)...
If I do not set numBatches, I get a ‘NegativeArraySizeException’ or ‘OOM’ when training on a large dataset (about 26320507 rows), and CPU utilization stays below 90%. **But if...
**Describe the bug** Hi, I tried to migrate from local Python LightGBM to Spark LightGBM. The model trained successfully, but I got quite different results when predicting. **To Reproduce**...
Hi, folks! validationIndicatorCol (str): Indicates whether the row is for training or validation def setValidationIndicatorCol(value: String): this.type = set(validationIndicatorCol, value) } Does this mean just a string column with two values...
**Describe the bug** Vowpal Wabbit's [One Against All](https://github.com/VowpalWabbit/vowpal_wabbit/wiki/One-Against-All-(oaa)-multi-class-example) classifier does not work via the MMLSpark interface. **To Reproduce** ``` val vwClassifier = new VowpalWabbitClassifier() .setFeaturesCol("features") .setLabelCol("label") .setProbabilityCol("predictedProb") .setPredictionCol("predictedLabel") .setRawPredictionCol("rawPrediction") .setArgs("--oaa=2...
**Describe the bug** I'm trying to write the complete ML pipeline including StringIndexer, VectorAssembler and LightGBMRegressor to the disk using pipeline_model.write().overwrite().save("model_file"), but I'm unable to write to disk. **To Reproduce**...
Is there a UI or log that shows the number of iterations and the loss changes when running a model in mmlspark lightgbm? AB#1984522
**Describe the bug** `com.microsoft.azure.synapse.ml.featurize.DataConversion` doesn't implement read(). Saving works fine. This doesn't work when used on its own (`DataConversion().load()`), and also doesn't work when used with MLlib's Pipeline/PipelineModel. **To Reproduce**...
How to improve the speed of lightgbm on spark?
# Summary Adding Codacy Scanner