underscore.string icon indicating copy to clipboard operation
underscore.string copied to clipboard

Trim Algorithm Is Overly Simplistic

Open machineghost opened this issue 13 years ago • 6 comments
trafficstars

If you try doing _.trim('someProperty', 'Property'), you might be surprised at the result: instead of "some" (as you might expect), you get "som".

This happens because the current trim implementation just takes every character in your second arg and throws them inside "[]" in a regex. So if a letter in the second arg (eg. 'e') exists in the part that shouldn't be trimmed, that letter (or letters) get trimmed also. In other words, _.trim(anyStringWhatsoever, 'a-zA-Z1-0') == ''.

I think a much less error-prone approach would be to check whether arg #1 startsWith/endsWith arg #2(or both, depending on whether this is a trim/ltrim/rtrim); if not, just return arg #1. If it does start/end with arg #2, simply use .substring to do the trimming: ltrim = arg1.substring(arg2.length); rtrim = arg1.substring(0, arg1.length - arg2.length);

machineghost avatar Oct 17 '12 19:10 machineghost

Just to make things easier, here are revised versions of ltrim/rtrim/trim (yes I know I could make a pull request, but I'm feeling lazy, and at least I did the hard part of writing the functions). Hopefully they are helpful:

trim: function(str, characters){
  if (str == null) return '';
  if (!characters && nativeTrim) return nativeTrim.call(str);
  characters = characters || ' ';
  while(str && _s.startsWith(str, characters)) {
      str = str.substring(characters.length);
  }
  while (str && _s.endsWith(str, characters)) {
      str = str.substring(0, str.length - characters.length);
  }
  return str;
},

ltrim: function(str, characters){
  if (str == null) return '';
  if (!characters && nativeTrimLeft) return nativeTrimLeft.call(str);
  characters = characters || ' ';
  while(str && _s.startsWith(str, characters)) {
      str = str.substring(characters.length);
  }
  return str;
},

rtrim: function(str, characters){
  if (str == null) return '';
  if (!characters && nativeTrimRight) return nativeTrimRight.call(str);
  characters = characters || ' ';
  while (str && _s.endsWith(str, characters)) {
      str = str.substring(0, str.length - characters.length);
  }
  return str;
},

machineghost avatar Oct 17 '12 20:10 machineghost

This actually makes sense to me. Will think about introducing these changes in next major release.

/cc @epeli @edtsech

rwz avatar Oct 18 '12 05:10 rwz

Awesome, glad I could help improve the library :-)

machineghost avatar Oct 18 '12 17:10 machineghost

(potentially/maybe/if you ultimately decide to use the changes)

machineghost avatar Oct 18 '12 17:10 machineghost

@rwz: This also helps resolve #168 and an error I was going to report regarding _.words where certain delimiters (such as /\s+/) would fail with a Syntax Error.

terinjokes avatar Feb 07 '13 03:02 terinjokes

A correct trim implementation should be something like this, but more underscory.

https://github.com/es-shims/es5-shim/blob/v4.0.3/es5-shim.js#L1398-L1415

wyuenho avatar Aug 31 '14 00:08 wyuenho