SynapseML
Feature: Autogen Frequent Pattern Matching (and other SparkML models) for .NET for Apache Spark project
Team, is there a plan to implement all of Spark's MLlib, especially ml.fpm (Frequent Pattern Mining), anytime soon? As of Spark v3.1.2 it has only two algorithms (FP-Growth and PrefixSpan), and both are very useful in some scenarios. It would be a great help if they were part of this library. Thanks.
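For readers unfamiliar with frequent pattern mining, here is a minimal sketch of what an algorithm like FP-Growth produces: every itemset that appears in at least a given fraction of transactions. This is not FP-Growth itself and not the Spark API. It is a naive enumeration in plain Python, and the function name `frequent_itemsets` and the toy baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Naive approach: enumerate every subset of every transaction and count it.
    FP-Growth reaches the same result without enumerating candidates, by
    compressing the transactions into an FP-tree first.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # dedupe and fix a canonical order
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    total = len(transactions)
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical toy baskets, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

The enumeration is exponential in transaction length, so it only works for toy data. That cost is the reason to want the real Spark implementation exposed through the bindings.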
AB#1946967
Hey @rrekapalli :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
Hey @rrekapalli, thanks for reaching out. I'm not sure what the request is here, but SparkML already seems to support these algorithms: https://spark.apache.org/docs/latest/ml-frequent-pattern-mining.html#:~:text=Mining%20frequent%20items%2C%20itemsets%2C%20subsequences,rule%20learning%20for%20more%20information. Our library is completely interoperable with SparkML, so feel free to mix this into your SynapseML models and pipelines.
Hi @mhamilton723 , thank you for the quick response!
I believe SynapseML depends on .NET for Apache Spark, which does not have a full implementation of SparkML. I thought SynapseML would have full interoperability (specifically, C# bindings) with SparkML, but I could not find any references to "org.apache.spark.ml.fpm" in this repo. I would appreciate it if you could point me to a reference for this feature.
Ahhhh yes, I get what you are saying now. @serena-ruan will eventually contribute the generated SparkML bindings back to the Spark.NET team. I will let her comment on timelines and whatnot.
Thank you, @mhamilton723 !
@rrekapalli Thanks for raising this feature request!
As you can see, not all SparkML models are currently supported in .NET for Apache Spark. But FPM looks like a typical case we could solve by applying our codegen bindings.
I can't give you a precise ETA at the moment, though: even after we contribute to the dotnet/spark repo, we need to wait until Microsoft.Spark cuts a new release before the feature can be used officially. But I'll give it a try this week or early next week and keep you updated :D
Thank you very much for taking this up, @serena-ruan! I'll be eagerly waiting for this to become part of this repo. Really appreciate your effort!