TransmogrifAI: The effect of random seeds on results?

Open shenzgang opened this issue 3 years ago • 5 comments

When I run the Titanic tests, I get very different estimates each time. Does the random seed really have that much of an impact?

Aug 10 '21 08:08 shenzgang

Yes, indeed. To get predictable behavior, you can set the random seed in your tests. Where you set the seed depends on how your tests are structured. For example: https://github.com/salesforce/TransmogrifAI/blob/master/helloworld/src/main/scala/com/salesforce/hw/titanic/OpTitanic.scala#L50
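To illustrate why a fixed seed matters, here is a minimal sketch in plain Scala. It does not use the TransmogrifAI API: `SeedDemo` and `trainIndices` are hypothetical names, and the 80/20 split is only an assumption chosen for the illustration. The point is that anything driven by a random generator (data splits, cross-validation folds, model initialization) is repeated exactly only when the seed is the same.

```scala
import scala.util.Random

object SeedDemo {
  // Hypothetical helper: shuffle the row indices with the given seed
  // and keep the first 80% as the training set.
  def trainIndices(n: Int, seed: Long): Seq[Int] = {
    val rng = new Random(seed)
    rng.shuffle((0 until n).toVector).take((n * 0.8).toInt)
  }

  def main(args: Array[String]): Unit = {
    val a = trainIndices(10, seed = 42L)
    val b = trainIndices(10, seed = 42L)
    val c = trainIndices(10, seed = 43L)
    // Same seed: the two splits are identical.
    println(s"same seed, same split: ${a == b}")
    // Different seed: the split usually differs, and so do the estimates.
    println(s"different seed, same split: ${a == c}")
  }
}
```

In a real workflow the same idea applies to every component that accepts a seed, so one seed value is typically defined once and passed everywhere it is needed.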

Sep 09 '21 05:09 tovbinm

Thanks for your reply! I also have a question about how to use the trained model to predict on unlabeled test data. Are there any examples of using a model for prediction?

Sep 09 '21 05:09 shenzgang

You can save a trained model, then load it later, set a new scoring reader / a new input dataset, and finally compute scores by invoking score().

You can also use transmogrifai-local for online serving of your model (e.g. over an HTTP API).

Sep 09 '21 06:09 tovbinm

The data set used for model training has a label column, while the test data does not. When I call score(), an exception is thrown.

Sep 09 '21 07:09 shenzgang

You will need to create an empty label column.
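A minimal sketch of one way to do this, assuming the unlabeled data is available as a Spark DataFrame. It uses only the standard Spark SQL API (`withColumn`, `lit`, `cast`). The helper `withEmptyLabel`, the column name `survived`, and the sample rows are hypothetical. Whether the placeholder should be null or a dummy value such as 0.0 is not stated in this thread and depends on how the label feature is defined, so both options are shown.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.DoubleType

object AddEmptyLabel {
  // Hypothetical helper: add the label column if it is missing.
  // placeholder = None gives a null column, Some(v) fills it with v.
  def withEmptyLabel(
    df: DataFrame,
    labelCol: String,
    placeholder: Option[Double] = None
  ): DataFrame = {
    if (df.columns.contains(labelCol)) df
    else {
      val value = placeholder.map(lit(_)).getOrElse(lit(null)).cast(DoubleType)
      df.withColumn(labelCol, value)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-label")
      .getOrCreate()
    import spark.implicits._

    // Made-up unlabeled rows, for illustration only.
    val unlabeled = Seq((1, 22.0), (2, 38.0)).toDF("id", "age")
    val scoringInput = withEmptyLabel(unlabeled, "survived")
    scoringInput.printSchema()

    spark.stop()
  }
}
```

The resulting DataFrame has the same columns as the training data, so it can be handed to the model as the new input dataset before calling score().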

Sep 09 '21 14:09 leahmcguire
