mleap
Question: How to extend DefaultLeapFrames with new operations
Hi,
I am currently working on an extension of the DefaultLeapFrame class. I want to add three operations that I think would help several people get things rolling: explode, implode and join. My goal is to join additional information onto the data that is sent to the serving REST server for predictions, so that it can be used to create new features that my pipeline needs.
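To make the intent of these operations concrete, here is a minimal sketch on plain Scala collections. It does not use the MLeap API: `Row`, `explode` and `implode` are hypothetical stand-ins, and the semantics are an assumption (explode emits one row per element of a sequence column, implode collects those rows back into one sequence per key).

```scala
object ExplodeImplodeSketch {
  // Hypothetical row model: column name -> value. Not the MLeap Row type.
  type Row = Map[String, Any]

  // explode: emit one row per element of the sequence stored in `column`.
  def explode(rows: Seq[Row], column: String): Seq[Row] =
    rows.flatMap { row =>
      row.get(column) match {
        case Some(values: Seq[_]) => values.map(v => row.updated(column, v))
        case _                    => Seq(row) // no sequence in this column: keep the row unchanged
      }
    }

  // implode: group rows by `key` (in order of first appearance) and collect
  // the values of `column` back into a single sequence per group.
  def implode(rows: Seq[Row], key: String, column: String): Seq[Row] = {
    val groups = rows.groupBy(_.get(key))
    rows.map(_.get(key)).distinct.map { k =>
      val group = groups(k)
      group.head.updated(column, group.flatMap(_.get(column)))
    }
  }
}
```

In this sketch, imploding an exploded frame on the same column restores the original rows, provided the key values are unique.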
I have a first draft, for which I touched the following files:
- LeapFrame.scala (adding the abstract methods there)
- DefaultLeapFrame.scala, which extends LeapFrame (implementing them there)
When I try to compile, I get one error complaining that I did not implement the new methods in RowTransformer (which also extends LeapFrame).
So my question is: do I need to add those methods to LeapFrame, which is higher in the class hierarchy, or is it enough to add them only to the leaf class DefaultLeapFrame (which I found to be used in the serving project)? As far as I currently understand it, RowTransformer is only needed to provide data frame functionality for the Transformer parts of the pipelines we want to model. So to my understanding, I would not need to add those methods there. Is this correct?
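The trade-off behind this question can be sketched with stand-in types. The names below (`Frame`, `JoinOps`, `SimpleFrame`, `RowOnlyFrame`) are hypothetical and are not the MLeap classes. The sketch assumes one possible design: the shared base trait stays unchanged and the new operation lives in an opt-in trait, so that only the leaf class has to implement it. Declaring the method abstract on the base trait instead would force every implementer to provide it, which is the kind of compile error described above.

```scala
object HierarchySketch {
  // Hypothetical row model: column name -> value.
  type Row = Map[String, Any]

  // Stand-in for the shared base trait. It is left unchanged, so other
  // implementers are not forced to implement anything new.
  trait Frame {
    def rows: Seq[Row]
  }

  // Hypothetical opt-in trait that carries the new operation.
  trait JoinOps { self: Frame =>
    def join(other: Frame, key: String): SimpleFrame
  }

  // Stand-in for the leaf class used at serving time.
  final case class SimpleFrame(rows: Seq[Row]) extends Frame with JoinOps {
    // Naive inner join: keep every pair of rows that agree on `key`.
    def join(other: Frame, key: String): SimpleFrame =
      SimpleFrame(for {
        l <- rows
        r <- other.rows
        if l.contains(key) && l.get(key) == r.get(key)
      } yield l ++ r)
  }

  // Stand-in for another implementer of the base trait. It compiles without
  // a join method because Frame itself did not gain an abstract member.
  final class RowOnlyFrame(val rows: Seq[Row]) extends Frame
}
```

The cost of this design is that code holding only a `Frame` reference cannot call `join`; it needs the leaf type or the opt-in trait.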
OK, I have implemented the operations for DefaultLeapFrame and for an additional LeapFrame subclass, HashIndexedLeapFrame (which allows creating in-memory indexes to speed up join operations). I also added some tests for them to LeapFrameSpec.scala.
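The idea behind an in-memory index for joins can be sketched as follows. This is not the code of HashIndexedLeapFrame; `Row`, `buildIndex` and `innerJoin` are hypothetical names on plain Scala collections. The index is built once over the side that holds the additional information and is then probed once per incoming row, instead of scanning that side for every row.

```scala
object HashJoinSketch {
  // Hypothetical row model: column name -> value.
  type Row = Map[String, Any]

  // Build the index once: key value -> all rows that carry that value.
  def buildIndex(rows: Seq[Row], key: String): Map[Any, Seq[Row]] =
    rows.filter(_.contains(key)).groupBy(row => row(key))

  // Inner join: look up each left row's key in the index and merge the
  // matching rows. Left rows without the key or without a match are dropped.
  def innerJoin(left: Seq[Row], index: Map[Any, Seq[Row]], key: String): Seq[Row] =
    for {
      l <- left
      k <- l.get(key).toSeq
      r <- index.getOrElse(k, Seq.empty[Row])
    } yield l ++ r
}
```

Because the index can be reused across requests, the cost of building it is paid once, which fits a serving setup where the lookup data changes rarely.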
If you would like to include those changes in the project, I can push them or open a pull request.