cl-strings icon indicating copy to clipboard operation
cl-strings copied to clipboard

Digressive discussions

Open pnathan opened this issue 8 years ago • 9 comments

Some remarks.

First - I like the idea. CL is clunky in certain ways, and these functions are often reimplemented. It is important not to take the criticism of the legendarily pushy Lisp community too much to self, and to have a very thick skin. You can find me on Twitter or email if you want to reach out!

Now, that said, I have many critiques. ;-)

cl-strings sounds quite authoritative. Perhaps useful-strings might be a better name.

Having a package-nickname s is, I think, not a good idea. That feels very liable to a collision with another package. And 10 characters is not much. :-)

In general, I'd suggest not outputting to streams as much? I would guess that is fairly slow, which tends to be undesirable in these sorts of functions.

You should definitely check your system against ABCL, CCL, and SBCL. Those are the major open source Common Lisps today.

http://www.cliki.net/parse-number exists already.

format-number should be condensed into FORMAT under the hood; maybe even removed.

split - is partially implemented by the split-sequence library. Maybe just wrap that?

join uses output to a stream: would suggest FORMAT instead.

clean handles both prefix and postfix space, as well as interior space duping. I think that you should let STRING-TRIM handle the prefix/postfix spaces, and have a dedup function for the interior.

pnathan avatar Mar 10 '16 17:03 pnathan

Hi Paul,

First, thanks for your awesome work on articulate-common-lisp! And also, ofcourse, for taking the time to review and give feedback on this :)

cl-strings sounds quite authoritative (...) Having a package-nickname s is, I think, not a good idea (...)

Yes, cl-strings was what I called it for myself, more for lack or originality than presumption. Not sure whether to change the name or try to make this worthy of it (probably the former...). I also see the potential problems with the nickname. Do you think it would be an improvement to just remove the nickname and maybe change the name of truncate so that it's possible to do use-package without conflicts? That seems to make everything better.

In general, I'd suggest not outputting to streams as much?

I actually read somewhere that this can be very fast instead of many concatenations. I then confirmed it on sbcl, with some tests. Maybe these are sbcl specific optimizations? I'll look into this.

You should definitely check your system against ABCL, CCL, and SBCL.

I developed it on sbcl, and tested for portability only on ecl. I need to follow your suggestion and do that on ABCL and CCL asap.

format-number should be condensed into FORMAT under the hood; maybe even removed.

It uses format, but twice, once for the integer part, another for the decimal part. This is because I didn't see how it was possible to use format directly to accommodate for some options, like having floats with order separators (if i remember correctly). If i'm wrong here, i'd love to simplify it.

clean handles both prefix and postfix space, as well as interior space duping. I think that you should let STRING-TRIM handle the prefix/postfix spaces, and have a dedup function for the interior.

Awesome, a dedup function sounds great! That sounds better than what i did: clean uses string-trim and than implements some weird dedup functionality internally.

I'm thinking about your other points related to the existing libraries (which are awesome). I started this with the thinking that it would be nice to have cl-strings be dependency free. But I'm not at all sure this is the right vision, do you have any opinions on the pluses and minuses of a package not bringing up dependencies in the common lisp ecosystem?

Again, many thanks for taking the time to help out!

diogoalexandrefranco avatar Mar 10 '16 21:03 diogoalexandrefranco

I started this with the thinking that it would be nice to have cl-strings be dependency free. But I'm not at all sure this is the right vision, do you have any opinions on the pluses and minuses of a package not bringing up dependencies in the common lisp ecosystem?

Depends on your target audience. I would be looking for dependency-free libraries when doing something with mocl, ECL, etc, the more fringe Common Lisps. If I was using this for desktop or server usage, I would not even trouble myself for dependency-free libraries, I would just want the tools that worked well.

pnathan avatar Mar 10 '16 22:03 pnathan

About string streams: I think what pnathan is getting at is that, if you can easily calculate the length of the final string, you can do better than string streams by allocating the final string and updating it destructively. In insert, for example, I think it would be more natural to use replace than to write to a string stream.

ruricolist avatar Mar 11 '16 02:03 ruricolist

Yeah, I also think using so many with-output-to-string is unnecessary and just cluttering the code base with minimum (if any) performance improvement. Remember that premature optimization is the root of all evil, and I would be very reluctant to use any library that try to re-invent functionalities that already existed in the language instead of building on top of what's already there.

And there are many built-in functions in Common Lisp that you could leveraged to simplified the library implementation while still retaining portability.

For example, the chars function can be implemented using just a single call to concatenate, eliminating the need for a loop.

Another example is the toggle-case function, where instead of looping over each character in a string changing its case and writing one character at a time out to a stream, you could instead use string-upcase and string-downcase to change the case of the entire string in one function call (which I'm pretty sure would also be a lot faster than the current incarnation of toggle-case).

Also, you should really spend some time and learn advance features of Common Lisp's format. Format is a very powerful and efficient function for creating and manipulating strings, and you should use it more in your library instead of rolling your own solution.

I hope I haven't offended you. Again, just trying to give (what I think is) a constructive feedbacks.

zodmaner avatar Mar 11 '16 03:03 zodmaner

@ruricolist Thanks for your feedback. I 100% agree, i'll see where I can make it more natural / improve performance. I must admit I still didn't thoroughly test the performance of many of these functions, but it's something I want to do soon (also, if you feel like contributing, that would be awesome).

@zodmaner Thank you also! Any feedback is appreciated.

the chars function can be implemented using just a single call to concatenate, eliminating the need for a loop

Oh, concatenate with a 'list type, I always forget about that. Of course that still loops, under the hood. I will compare the performance of both, and go for concatenate if it's anywhere close, since it does look even simpler. Edit: From my tests on sbcl, it seems that concatenate is a bit slower than the loop version.

Another example is the toggle-case function (...)

I think your suggestion doesn't work here, notice that this function toggles the case of each char individually because they can have different cases, instead of the entire string.

diogoalexandrefranco avatar Mar 11 '16 08:03 diogoalexandrefranco

Chipping in w/ the suggestions

Currently cl-strings is not portable, as make-template-parser as for-as clauses after a conditional clause. That is, while is intermingled with for clauses. This is not something that one can depend upon on a conforming CL implementation.

Furthermore, portability cannot be tested by just running. For example, if one code cons walked in SBCL prior to version 1.2.2 it would work. However because it was not portable it broke from version 1.22 onwards. Unfortunately the adjective portable is sometimes used to refer to cross implementation compatibility layers like bourdeaux-threads.

Another detail, the error invocation code is unncesarily verbose (and it is good style for error messages to end in a period)

(error (make-condition 'simple-type-error
                           :format-control "Delimiters can't be empty strings"))

could be rewritten as

(error 'simple-type-error
       :format-control "The empty string is not a valid delimiter.")

Do not use :nickname option in defpackage

As Didier Verna explains, if the library author defines the nicknames, nickname conflicts are inevitable. Especially if the nickname is a single letter. This has happened before with binary-types and bourdeaux-threads. Following the suggestion, cl-strings would export the following function

(defun nickname-package (&optional (nickname :s))
  (let ((cl-strings-package *package*))
    ;; Is it wise to use *package* here or better to use (find-package
    ;; :cl-strings)?
    (rename-package cl-strings-package (package-name cl-strings-package)
                    (adjoin nickname (package-nicknames cl-strings-package)
                            :test #'string-equal))))

My 2c, Lisp Long and propser

PuercoPop avatar Mar 22 '16 21:03 PuercoPop

Thank you for chipping in Javier, great points!

Currently cl-strings is not portable, as make-template-parser as for-as clauses after a conditional clause.

I did notice today that clisp doesn't like my use of for-as clauses interleaved with conditional clauses, so this is why. I was fooled by the fact that it ran correctly on the implementations I tried. Since by portable I mean the definition you linked above, I'll open an issue with the bug label (edit: created the portability label instead). It shouldn't be hard to fix, and I believe is the only place in the code where this happens.

Another detail, the error invocation code is unncesarily verbose (and it is good style for error messages to end in a period)

Thanks for pointing this out, condition signaling was also a bit inconsistent. I just pushed a commit that normalizes error messages, make-condition verbosity, periods, etc..

Do not use :nickname option in defpackage

This is the second time it is mentioned to me, so I removed the nickname. I'm still not sure about exporting the nickname-package function though. This looks like it can be done by the user from the outside, so maybe I just document how to do it? My concern is that i feel like i'm polluting the api with a sort of meta function, since it has more to do with the lib than with strings.

Again, thanks for taking the time to help!

diogoalexandrefranco avatar Mar 22 '16 23:03 diogoalexandrefranco

per commit https://github.com/diogoalexandrefranco/cl-strings/commit/516ead136d0eca4f60c44e2a8686465633157fd2 it looks like the portable issue of make-template-parser is fixed.

ps: this is a much needed library IMO ! It just has to be referenced a bit more on cliki and the lists out there, trying to. I too wrote a little string manipulation library to address the same needs (cl-str), before I discovered this one. I myself rely more on built-ins or existing libraries.

vindarel avatar Jan 24 '17 16:01 vindarel

There is no more the s nickname per commit diogoalexandrefranco/cl-strings@642b468

vindarel avatar Jan 24 '17 17:01 vindarel