SynapseML
Simple and Distributed Machine Learning
perf: reduce lightgbm prediction time by 20%. Depends on native changes: https://github.com/microsoft/LightGBM/pull/3159. In testing, time went from 82159416923 ns to 65852779813 ns on the pima dataset (test added in PR but num...
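As a quick sanity check on the "20%" figure, the arithmetic can be reproduced from the two timings quoted above. This is only a sketch: the variable names are illustrative and nothing here comes from the PR's own test code.

```python
# Timings quoted in the PR description, in nanoseconds.
before_ns = 82_159_416_923
after_ns = 65_852_779_813

# Absolute time saved and relative reduction versus the original run.
saved_ns = before_ns - after_ns
reduction_pct = 100 * saved_ns / before_ns

# Equivalent speedup factor (old time divided by new time).
speedup = before_ns / after_ns

print(f"saved:     {saved_ns / 1e9:.2f} s")
print(f"reduction: {reduction_pct:.1f}%")
print(f"speedup:   {speedup:.2f}x")
```

The reduction comes out slightly under 20%, so the title's figure is a fair rounding of the measured numbers.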
**Describe the bug** Trying to run tests on OSX 10.14.6, I'm getting the following error indicating the shared library is failing to load. ``` Cause: java.lang.UnsatisfiedLinkError: /private/var/folders/bz/984j8jp91x37t980k20lx3z1gx50k5/T/mml-natives3454868880605903181/lib_lightgbm.dylib: dlopen(/private/var/folders/bz/984j8jp91x37t980k20lx3z1gx50k5/T/mml-natives3454868880605903181/lib_lightgbm.dylib, 1): Library...
Hi, I am training my VW CB model using SynapseML on Databricks. The code is the same as in: https://microsoft.github.io/SynapseML/docs/features/vw/Vowpal%20Wabbit%20-%20Overview/#vw-contextual-bandit The model and the featurizers/zipper pipeline are saved once training is complete. Now...
Setting ```useSingleDatasetMode``` to ```True``` gives ``` java: malloc.c:4033: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed. ``` I keep everything else the same and set that flag to...
TypeError: 'JavaPackage' object is not callable Traceback (most recent call last): File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 114, in wrapper return func(self, **kwargs) File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/synapse/ml/cognitive/EntityDetector.py", line 95, in __init__ self._java_obj = self._new_java_obj("com.microsoft.azure.synapse.ml.cognitive.EntityDetector", self.uid)...
I have already seen issue https://github.com/Azure/mmlspark/issues/542, but the answer there does not solve my problem. I have a dataset of nearly 72 GB with 145 columns. My Spark config is spark-submit \ --master...
My ```labelCol``` is named ```label```. After running VectorAssembler, I run ```df_modeling = df_modeling.withColumn('label', col('label').alias('label',metadata={'numClasses':self.num_classes}))``` and yet I get the warning: ```com.microsoft.azure.synapse.ml.lightgbm.LightGBMClassifier: com.microsoft.azure.synapse.ml.lightgbm.LightGBMClassifier inferred 2 classes for labelCol=LightGBMClassifier_4f7ef3fb1833__labelCol since numClasses was...
Adding some tests to contribute to code coverage. Submitting for PR to see if the tests run ok in the pipeline.
- Add setAsyncCompletionLogMessagePrefixValue to enable logging of async operations on completion (with duration)
- When reporting the retries exceeded error, include the last retrieved status to aid debugging
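The two behaviours described in this PR can be illustrated with a generic polling loop. This is a minimal sketch and not SynapseML's implementation: `poll_until_done`, `get_status`, `log_prefix`, and the `"Succeeded"` status string are all hypothetical names chosen for the example.

```python
import time


def poll_until_done(get_status, max_retries, delay_s=0.0, log=print, log_prefix=None):
    """Poll a hypothetical async operation until it reports success.

    get_status is any zero-argument callable returning a status string.
    When log_prefix is set, a completion message with the elapsed time is logged.
    """
    start = time.monotonic()
    last_status = None
    for attempt in range(1, max_retries + 1):
        last_status = get_status()
        if last_status == "Succeeded":
            if log_prefix is not None:
                elapsed = time.monotonic() - start
                log(f"{log_prefix} completed after {attempt} poll(s) in {elapsed:.3f}s")
            return last_status
        time.sleep(delay_s)
    # Include the last status seen so the failure is easier to debug.
    raise TimeoutError(
        f"retries exceeded ({max_retries}); last status: {last_status!r}"
    )


# Usage: the third poll succeeds, so one completion message is logged.
statuses = iter(["Running", "Running", "Succeeded"])
messages = []
result = poll_until_done(lambda: next(statuses), max_retries=5,
                         log=messages.append, log_prefix="job-1")
```

Carrying the last status into the exception message means a caller sees whether the operation was still running or had reached some other state when the retries ran out.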
https://github.com/microsoft/SynapseML/blob/0840e318bdba37b93fd39029867edffd466be92b/core/src/main/scala/com/microsoft/azure/synapse/ml/recommendation/SARModel.scala#L87 The "===" is a test function that may be excluded from the package. An alternative approach should be used to avoid issues if the following excludes are used. `.config('spark.jars.excludes',...