String.substring always returns an empty string, so corpus weighting is always null
At line 173 in Vectorize20NewsGroups.java [1], the substring call is from startIndex 1 to endIndex 1 which always returns an empty string. So, the CorpusWeighting cw is always going to be null.
Did you run it to see if it works? :)
[1] https://github.com/tdunning/knn/commit/c09d742febf5242899b1c187c802d3bbb5164f0d#L0R173
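For anyone skimming: the second argument of Java's `String.substring(begin, end)` is an exclusive end index, not a length, so `substring(1, 1)` is always empty. Here is a minimal sketch of the difference. The class name, the `codeAt` helper, and the `"tfn"` spec string are made up for illustration and are not taken from Vectorize20NewsGroups.java.

```java
public class SubstringIndexDemo {
    // Hypothetical helper: returns the one-character code at position i.
    // The end index is exclusive, so it must be i + 1, not i.
    static String codeAt(String spec, int i) {
        return spec.substring(i, i + 1);
    }

    public static void main(String[] args) {
        String spec = "tfn"; // made-up three-character weighting spec

        // begin == end selects zero characters, so the result is "".
        System.out.println(spec.substring(1, 1).isEmpty()); // prints true

        // begin = 1, end = 2 selects exactly the character at index 1.
        System.out.println(codeAt(spec, 1)); // prints f
    }
}
```

If the weighting lookup returns null for a code it does not recognize, an empty string would plausibly hit that path every time, which would match the null described above.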
Ouch. Bit again by the index-or-length issue.
This is not a substring of the word... it is a substring of the code for controlling the word weighting. I have run the code and don't understand how it avoided an NPE here.
Looks like I never ran this version:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.mahout.knn.Vectorize20NewsGroups$CorpusWeighting.parse(Vectorize20NewsGroups.java:175)
    at org.apache.mahout.knn.Vectorize20NewsGroups.main(Vectorize20NewsGroups.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Yeah, no worries, I patched it up and ran it. Could you please look at the thread on the mailing list? :)