Improve test coverage for transformers & estimators

Open tovbinm opened this issue 6 years ago • 4 comments

Problem: Some of our transformers & estimators are not thoroughly tested or not tested at all.

Solution: Use the OpTransformerSpec and OpEstimatorSpec base test specs to provide tests for all existing transformers & estimators.

tovbinm avatar Apr 11 '19 04:04 tovbinm

After a quick survey of com.salesforce.op.stages.impl.feature, I found the following:

These classes don't appear to have a test with a matching name (some are covered by a differently named test, as noted):

  • FilterMap
  • OpIndexToString (just a wrapper for Spark IndexToString, though)
  • OpLDA (has test, OpLdaTest, but names don't match)
  • OpOneHotVectorizer
  • OpScalarStandardScaler (OpStandardScalerTest exists but names don't match)
  • RealNNVectorizer
  • TextMapPivotVectorizer (tested in both OPMapVectorizerTest and TextMapVectorizerTest but names don't match)
  • Transmogrifier (TransmogrifyTest exists but names don't match)

These tests exist but do not extend OpTransformerSpec or OpEstimatorSpec:

  • Base64VectorizerTest
  • DateMapVectorizerTest
  • DateTimeVectorizerTest
  • DateVectorizerTest
  • EmailParserTest
  • EmailVectorizerTest
  • FillMissingWithMeanTest
  • GeolocationVectorizerTest
  • HashingTFTest
  • IntegralVectorizerTest
  • IsotonicRegressionCalibratorTest
  • LinearScalerTest
  • MultiPickListMapVectorizerTest
  • NGramSimilarityTest
  • NGramTest
  • NumericBucketizerTest
  • NumericVectorizerTest
  • OPCollectionHashingVectorizerTest
  • OPCollectionTransformerTest
  • OpCountVectorizerTest
  • OpIndexToStringNoFilterTest
  • OpLdaTest
  • OPMapVectorizerTest
  • OpSetVectorizerTest
  • OpStandardScalerTest
  • OpStringIndexerNoFilterTest
  • OpStringIndexerTest
  • OpWord2VecTest
  • PercentileCalibratorTest
  • PhoneNumberParserTest
  • RealVectorizerTest
  • ScalerMetadataTest
  • ScalerTest
  • SmartTextMapVectorizerTest
  • TextMapVectorizerTest
  • TextTokenizerTest
  • TextTransmogrifyTest
  • TextVectorizerTest
  • ToOccurTransformerTest
  • TransmogrifyTest
  • UniqueCountTest
  • URLVectorizerTest
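
A survey like the one above could be reproduced with a small script. This is only a rough sketch, not how the lists above were actually produced. It assumes one class per `.scala` file, that a test for `Foo` is expected to be named `FooTest`, and that extending a base spec shows up as a literal `extends OpTransformerSpec` or `extends OpEstimatorSpec` in the test source. The function name `survey` and its parameters are made up for illustration.

```python
import re
from pathlib import Path

# Heuristic text match, not a Scala parser: looks for
# "extends OpTransformerSpec" or "extends OpEstimatorSpec" (whitespace may include newlines).
SPEC_PATTERN = re.compile(r"\bextends\s+(?:OpTransformerSpec|OpEstimatorSpec)\b")


def survey(main_dir, test_dir):
    """Return (stages lacking a same-named test, tests not extending a base spec).

    Assumes one class per file and the naming convention Foo -> FooTest.
    """
    stages = {p.stem for p in Path(main_dir).glob("*.scala")}
    tests = {p.stem: p for p in Path(test_dir).glob("*Test.scala")}

    # Stage classes for which no file named <Stage>Test.scala exists.
    untested = sorted(s for s in stages if f"{s}Test" not in tests)

    # Test files whose source never mentions extending one of the base specs.
    no_spec = sorted(
        name
        for name, path in tests.items()
        if not SPEC_PATTERN.search(path.read_text(encoding="utf-8"))
    )
    return untested, no_spec
```

Calling `survey(main_dir, test_dir)` with the main and test source directories of the feature package returns the two lists. The name match is case-sensitive, so a pair such as OpLDA and OpLdaTest is reported as unmatched, and the results still need the kind of manual review noted in the list above.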

crupley avatar Apr 15 '19 20:04 crupley

Hi guys, I would like to help by contributing, and this seems like a good place to start. Any tips or places to look to help me get up and running with implementing some of these tests?

Sammyalhashe avatar Sep 28 '19 03:09 Sammyalhashe

Hi @Sammyalhashe! There are still some tests in the list above that need to be updated. I would go through it and find the tests that don't extend OpTransformerSpec or OpEstimatorSpec. You can look at the tests that do extend them, and at the PRs that reference this issue, for examples of what to do.

crupley avatar Sep 28 '19 15:09 crupley

Hey @crupley! Thanks for the reply, I'll start looking into some

Sammyalhashe avatar Sep 29 '19 20:09 Sammyalhashe