spark icon indicating copy to clipboard operation
spark copied to clipboard

Included StringIndexer and StringIndexerModel along with related test…

Open ramanathanv opened this issue 4 years ago • 7 comments

Hi,

I am creating this pull request that implements the basic methods in StringIndexer and StringIndexerModel class along with Test cases.

This is related to "Implement ML Features" #381

ramanathanv avatar Jan 06 '21 15:01 ramanathanv

@ramanathanv Is this PR ready to review? I see some StringIndexerTests tests failing in the pipeline.

Niharikadutta avatar Jan 15 '21 19:01 Niharikadutta

@ramanathanv Is this PR ready to review? I see some StringIndexerTests tests failing in the pipeline.

Hi @Niharikadutta ,

In the test case, I am trying to compare two List objects using Assert.Equal(). I notice in the test results that the Lists have the same content (see image below). Still there is an error. image

I assume that there must be a different way to compare the Lists. Can you please suggest me a way an alternate way to do this comparison?

Thanks.

ramanathanv avatar Jan 18 '21 06:01 ramanathanv

I notice in the test results that the Lists have the same content (see image below).

I see "..." in the output. Can you compare elements one by one to see where the difference comes from?

imback82 avatar Jan 19 '21 03:01 imback82

I notice in the test results that the Lists have the same content (see image below).

I see "..." in the output. Can you compare elements one by one to see where the difference comes from?

Hi @imback82 , I noticed that even the raw log file provides only 3 dots "...". Is there another way for me to check this?

ramanathanv avatar Jan 28 '21 14:01 ramanathanv

What I meant was if you are comparing two lists, you can compare elements in those lists one by one instead of comparing them at the list level. Does it make sense?

imback82 avatar Jan 28 '21 17:01 imback82

In StringIndexer there are a few calls which are missing the get* at the beginning - if I change them all to get* then the test passes for me.

public string GetStringOrderType() => (string)_jvmObject.Invoke("stringOrderType");

should be

public string GetStringOrderType() => (string)_jvmObject.Invoke("getStringOrderType");
public string[] GetInputCols() => (string[])_jvmObject.Invoke("inputCols");

should be

public string[] GetInputCols() => (string[])_jvmObject.Invoke("getInputCols");
public string GetInputCol() => (string)_jvmObject.Invoke("inputCol");

should be

public string GetInputCol() => (string)_jvmObject.Invoke("getInputCol");
public string GetHandleInvalid() => (string)_jvmObject.Invoke("handleInvalid");

should be

public string GetHandleInvalid() => (string)_jvmObject.Invoke("getHandleInvalid");

If you change those then the tests should pass :)

GoEddie avatar Feb 01 '21 21:02 GoEddie

If you change those then the tests should pass :)

Thanks @GoEddie for identifying the issue . I have fixed the property names. But the test fails for unknown reasons. Can you please re-initiate the build/tests?

ramanathanv avatar Feb 03 '21 18:02 ramanathanv