plugin: model: Add some new models!

Open johnandersen777 opened this issue 5 years ago • 17 comments

Add a model!

This issue is for discussion and requests for help while adding new models to DFFML.

First, get familiar with how models can be used via the DFFML command line: https://intel.github.io/dffml/master/plugins/dffml_model.html

Make sure you follow: https://intel.github.io/dffml/master/contributing/dev_env.html

Look at which libraries are already wrapped and which models have already been implemented. If you want to use a library that has not yet been integrated, refer to the new model tutorial: https://intel.github.io/dffml/master/tutorials/models/

If you want to create a new model using a library we already have a wrapper for, just start working in the packages that already exist under model/. Create a new file under dffml_model_library_name. Each library wrapper does things differently, so check how that wrapper interacts with the underlying library by looking at how the existing models are implemented.

johnandersen777 avatar Mar 20 '19 17:03 johnandersen777

Hey! I'd like to try adding a new model, but I'm not sure where to start. Can you suggest a model that you are, or were, thinking of adding? That would give me a clearer starting point. Thanks.

yashlamba avatar Mar 30 '19 19:03 yashlamba

Is this still open? If yes, what models are you looking to add? Also, the tutorial link given is not working.

aghinsa avatar Oct 07 '19 18:10 aghinsa

Hi @aghinsa, yes, it's still open. I've updated the issue. Right now we don't have any neural networks that aren't classifiers, so that would be the top priority. The quickest way to fix that is probably to copy the TensorFlow-based classifier and modify it to use: https://www.tensorflow.org/versions/r1.14/api_docs/python/tf/estimator/DNNEstimator

Or you could create a new package and use a machine learning framework other than TensorFlow.

johnandersen777 avatar Oct 07 '19 18:10 johnandersen777

Thanks. I'll go through the links. I'll go with your suggestion, as I'm more comfortable with TensorFlow than with other frameworks. PS: I'm pretty sure I'll have lots of questions, but I'm ready to invest the time, so please help me with this. Also, do we discuss related things here or on Gitter (I just found out about it)?

aghinsa avatar Oct 08 '19 03:10 aghinsa

Awesome! Yes I'm around to answer any questions. Thanks for the help!

johnandersen777 avatar Oct 08 '19 15:10 johnandersen777

So, I went through model/tensorflow/dffml_model_tensorflow/dnnc.py. To clarify things:

  1. We want to add a regression model which trains on all features.
  2. We aren't passing a separate model_fn to the estimator (are we?), but rather using the hidden_units from the config to specify the model.
  3. Should the model be such that warm_start is enabled? If yes, can I use the model_dir arg for that?

aghinsa avatar Oct 09 '19 05:10 aghinsa

  1. You'll probably want to make a class which subclasses from TensorflowModelContext (or maybe even DNNClassifierModelContext depending on how many methods you think don't need to be changed).
  2. Features we want to train on are passed to the __init__ method of the TensorflowModelContext class (I noticed #216, just so you know that the other method arguments aren't named correctly). We want to train on all features that we know how to make a feature column for using the TensorFlow API. The feature columns are created in TensorflowModelContext.__init__. The names of the features we care about training on will be in self.features, and the feature columns are in self.feature_columns. You'll notice that in the *_input_fn methods, self.features is passed to repo.features in order to get a dict where the keys are feature names and the values are the values of those features.
  3. Yes, we're just using hidden units (for now, we could change this later, after you get the first version working, if you want).
  4. Yes, warm_start should be enabled. And yes, using self.model_dir_path would be the right way to do that.

https://github.com/intel/dffml/blob/65e4ce4cac10327659a20a0f7a33b3812c829ae2/model/tensorflow/dffml_model_tensorflow/dnnc.py#L158-L175
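The subclassing approach described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not DFFML's actual code: `BaseModelContext`, `RegressorModelContext`, `StubClassifier`, `StubRegressor`, `input_dict`, and the `"DNNModel"` directory name are all hypothetical stand-ins. In a real implementation the stubs would be replaced by TensorFlow estimators (for example `tf.estimator.DNNRegressor`, which accepts `feature_columns`, `hidden_units`, and `model_dir`), and the feature columns would be built with the TensorFlow feature column API.

```python
import os


class StubClassifier:
    """Hypothetical stand-in for a classifier estimator; it only records its arguments."""

    kind = "classifier"

    def __init__(self, feature_columns, hidden_units, model_dir):
        self.feature_columns = feature_columns
        self.hidden_units = hidden_units
        self.model_dir = model_dir


class StubRegressor(StubClassifier):
    """Hypothetical stand-in for a regression estimator."""

    kind = "regressor"


class BaseModelContext:
    """Hypothetical stand-in for the shared context class (not DFFML's real class)."""

    # Subclasses swap this attribute to change which estimator gets built.
    estimator_cls = StubClassifier

    def __init__(self, features, hidden_units, directory):
        # Names of the features to train on.
        self.features = list(features)
        self.hidden_units = list(hidden_units)
        # Reusing one directory across runs is what makes warm starts possible:
        # an estimator pointed at the same model_dir can resume from saved state.
        self.model_dir_path = os.path.join(directory, "DNNModel")
        # One placeholder "feature column" per feature name.
        self.feature_columns = [("numeric", name) for name in self.features]

    def input_dict(self, records):
        """Return a dict mapping each feature name to its list of values."""
        return {
            name: [record[name] for record in records] for name in self.features
        }

    @property
    def model(self):
        return self.estimator_cls(
            feature_columns=self.feature_columns,
            hidden_units=self.hidden_units,
            model_dir=self.model_dir_path,
        )


class RegressorModelContext(BaseModelContext):
    """Only the estimator type changes; everything else is inherited."""

    estimator_cls = StubRegressor
```

With this layout, a regressor context reuses the feature handling and model directory logic unchanged, which is the reason for subclassing instead of copying the whole file.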

I'm sorry there aren't many comments in there. Another good way to start would be to copy the test file for the existing model to create a new test, and then run just that test.

$ cp tests/test_dnnc.py tests/test_dnnr.py
$ python3.7 setup.py test -s tests.test_dnnr
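As a rough idea of the shape such a copied test might take once adapted for regression, here is a minimal sketch. It does not reproduce the contents of tests/test_dnnc.py: `StubRegressionModel` is a hypothetical stand-in that fits y = slope * x by least squares so the example runs without TensorFlow, and the method names `train`, `accuracy`, and `predict` are assumptions chosen for illustration.

```python
import unittest


class StubRegressionModel:
    """Hypothetical stand-in for the model under test (fits y = slope * x)."""

    def __init__(self):
        self.slope = None

    def train(self, xs, ys):
        # Closed-form least squares for a line through the origin.
        self.slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    def predict(self, xs):
        return [self.slope * x for x in xs]

    def accuracy(self, xs, ys):
        # Mean squared error; lower is better.
        predictions = self.predict(xs)
        return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)


class TestDNNR(unittest.TestCase):
    def setUp(self):
        # Train a fresh model for every test so tests do not depend on order.
        self.xs = [1.0, 2.0, 3.0, 4.0]
        self.ys = [2.0, 4.0, 6.0, 8.0]
        self.model = StubRegressionModel()
        self.model.train(self.xs, self.ys)

    def test_train(self):
        self.assertIsNotNone(self.model.slope)

    def test_accuracy(self):
        # The training data is exactly linear, so the error should be tiny.
        self.assertLess(self.model.accuracy(self.xs, self.ys), 1e-9)

    def test_predict(self):
        self.assertAlmostEqual(self.model.predict([5.0])[0], 10.0)
```

For a regressor, the assertions check that predictions fall within a tolerance of the expected value instead of matching a class label, which is the main thing to change after copying a classifier test.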

johnandersen777 avatar Oct 09 '19 18:10 johnandersen777

Is someone working on the issue? If not I would love to work on it.

aditisingh2362 avatar Dec 24 '19 10:12 aditisingh2362

Hi, I'm interested in working for this sub-org under GSoC 2020. If you are planning to apply for GSoC 2020, could you please let me know where I can get started so that I can start contributing? Thanks for your time.

rohit901 avatar Feb 03 '20 09:02 rohit901

I want to implement some neural-network-based models. I am a GSoC 2020 aspirant and want to work on this topic.

sparkingdark avatar Mar 16 '20 06:03 sparkingdark

@rohit901 @darkdebo @aditisingh2362 Sorry for the late reply to those of you who commented on this a while ago. I'm sorry no one saw your comments. I've updated the issue to point to the new tutorial. Let me know if you have any questions.

johnandersen777 avatar Mar 16 '20 19:03 johnandersen777

Hello sir, I want to contribute to this project under GSoC 2020. Please point me to any tutorials or videos that would help me get familiar with the concept of adding ML models.

purnimapatel avatar Mar 27 '20 06:03 purnimapatel

> Hello sir, I want to contribute to this project under GSoC 2020. Please point me to any tutorials or videos that would help me get familiar with the concept of adding ML models.

@purnimapatel Please see the New Model Tutorial

johnandersen777 avatar Mar 27 '20 14:03 johnandersen777

@pdxjohnny thanks sir

purnimapatel avatar Mar 28 '20 07:03 purnimapatel

Hey. I'd love to contribute to this issue, if it's still open. Are there any specific models you're looking to add? I had a few ideas, and would love to discuss them with you. @pdxjohnny

spur19 avatar Oct 20 '20 13:10 spur19

Hi, I am a GSoC 2021 participant. I'd love to contribute to this issue if it's still open. Are there any specific models you're looking to add? I am comfortable with Python and PyTorch. I am contributing for the very first time, and I'd really appreciate any help or suggestions you can provide to get me started.

Soumyajain29 avatar Feb 27 '21 05:02 Soumyajain29

Something that might be useful:

Mark Tenenholtz (@marktenenholtz) Tweeted: TL;DR:

Tabular: XGBoost/LightGBM/RF
Time series: XGBoost/LightGBM/RF
Image: ResNet/EffNet
Text: RoBERTa
Audio: ResNet/EffNet

Your best bet is usually to start with these and then experiment from there.

Nothing in ML is an end-all-be-all!

^ https://mobile.twitter.com/marktenenholtz/status/1501905757842731014

johnandersen777 avatar Mar 10 '22 18:03 johnandersen777