tntsearch
Replacement of count() with !empty() in Italian Stemmer
With commit b23234f @wetfire2k replaced the count() in the if checks with !empty().
That's fine, but I would like to ask: how can you be 100% sure that the function deleteStuff() is not returning an empty string?
My question relates to the following reasoning:
- the deleteStuff() function may only return either a string or a null value
- the count() function in the original code returns 1 if the deleteStuff() function returned a string, including the empty string ('')
I may assume that the original intent of using count() was to pass the if check when the returned value is a string including the empty string. So, unless this assumption is wrong, we can replace count() with another functionally-equivalent function, but !empty() is functionally-equivalent if and only if the deleteStuff() function never returns an empty string!
How can we know if this can happen?
I guess it all boils down to the logic of the algorithm and the inner logic of that function. The last return instruction (in the last meaningful row) of the function deleteStuff() is the only one which may (potentially) return an empty string:
return mb_substr($str, 0, $pos + $str_len - $r_len);
It depends on the values of the arguments passed to mb_substr(): it returns an empty string when $str is empty, when $pos + $str_len - $r_len evaluates to 0, or when that expression is negative enough to trim the whole string. Unfortunately I've not been able to demonstrate whether any of these cases can actually happen.
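To make those cases concrete, here is a minimal sketch of how mb_substr() behaves for a few length values. The inputs are hypothetical and are not taken from the stemmer; they only exercise the built-in function, and say nothing about whether deleteStuff() can ever reach these values.

```php
<?php
// Hypothetical inputs, chosen only to exercise mb_substr();
// they do not come from the Italian stemmer itself.
$cases = [
    'empty input'     => ['', 3],
    'zero length'     => ['amare', 0],
    'negative length' => ['amare', -5], // trims all five characters
    'positive length' => ['amare', 2],
];

$results = [];
foreach ($cases as $label => [$str, $length]) {
    // Same call shape as the return statement quoted above:
    // start at offset 0 and keep $length characters.
    $results[$label] = mb_substr($str, 0, $length);
}

var_dump($results);
```

Only the last case yields a non-empty string, so the question is whether the stemmer's arithmetic can ever produce one of the first three.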
So, here comes my question again: how can you be 100% sure that the function deleteStuff() is not returning an empty string?
If it were to return an empty string, then using !empty() would alter the original code's behavior (assuming it was the correct implementation of the algorithm), and a different replacement should be used instead: for instance, null !== and is_string() are both functionally equivalent to count() in this specific situation.
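The difference between the candidate checks can be sketched as a small truth table. The legacyCount() helper below is hypothetical: it only mimics what count() returned for a string or null before PHP 7.2, so that the comparison also runs on newer PHP versions where calling count() on a string is an error.

```php
<?php
// Hypothetical helper: mimics the result count() gave for non-countable
// values before PHP 7.2 (0 for null, 1 for any other scalar), without
// actually passing a string to count().
function legacyCount($value): int
{
    if ($value === null) {
        return 0;
    }
    return (is_array($value) || $value instanceof Countable) ? count($value) : 1;
}

// The three kinds of value deleteStuff() is assumed to return.
$candidates = ['stem' => 'amar', 'empty string' => '', 'null' => null];

$table = [];
foreach ($candidates as $label => $value) {
    $table[$label] = [
        'legacy count()' => (bool) legacyCount($value),
        '!empty()'       => !empty($value),
        'null !=='       => null !== $value,
        'is_string()'    => is_string($value),
    ];
}

print_r($table);
```

The checks agree for a non-empty stem and for null; only the empty string separates !empty() from the other three. Note that !empty() also rejects the string '0', which is a further difference from the old count() behavior.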
PS: by the way, I was wondering why you renamed the function delete_suff() (as in the original code) to deleteStuff(): the original meaning was "delete suffix" and not "delete stuff" (i.e. delete things) :-)
Nice question. On the one hand, with PHP >= 7.2 the count() parameter must be an array or an object that implements Countable, and a string is neither. On the other hand, if the value returned by deleteStuff() is empty ('' or null), why should you go ahead?
As for the question about the deleteStuff function, I can't help you, because it was already named "deleteStuff" before my changes. So you'll have to ask @GaspariLab why.
Thank you for your feedback, and apologies for my delay.
Your question makes sense, but so far I have not been able to demonstrate whether an empty value is something significant in the stemming algorithm.
What if the original intent really was to have the if clause evaluate to true when deleteStuff() returned an empty string?
Unless we recognize that the original author "abused" count() and should have used !empty() instead. That may well be the case; my question aims at resolving exactly this doubt :)