cl-str icon indicating copy to clipboard operation
cl-str copied to clipboard

Ensure functions

Open vindarel opened this issue 2 years ago • 16 comments

Example:

  (is (ensure-starts-with "/" "/abc") "/abc")

Feedback welcome (especially for naming).

vindarel avatar Jun 02 '22 13:06 vindarel

@tdrhq @charJe Hello, feedback welcome for the naming of these new functions :) ensure-starts-with, ensure-enclosed-by

vindarel avatar Jun 03 '22 05:06 vindarel

I like it. If you want to make the names shorter, maybe ensure-start?

charJe avatar Jun 03 '22 12:06 charJe

The words "prefix" and "suffix" are more concise and descriptive than "starts-with" and "ends-with". "Enclosed-by" seems awkward; "wrapped-in" feels more natural to me, and is probably more accessible to non-native English speakers.

alphapapa avatar Jun 03 '22 12:06 alphapapa

Thanks. Indeed, I hesitated with ensure-prefix. But I judged that the prefix & suffix related functions are a bit special on their own (for one, they don't accept the usual parameters with the full string at the end), and I thought that auto-completing str:start… would give more accurate / predictive results.

I see your update: wrapped-in sounds good, thanks. (tempted by ensure-wrapped-in !)

vindarel avatar Jun 03 '22 13:06 vindarel

Why would the ensure- functions not have the full string at the end?

charJe avatar Jun 03 '22 13:06 charJe

Why would the ensure- functions not have the full string at the end?

sorry, I badly expressed myself: the existing functions with "prefix" have a different signature:

(defun prefix (items) 

(defun common-prefix (items)

(defun prefix? (items prefix)  ;; <-- not S, as a full word 

so using "start-with…" would feel more consistent.

vindarel avatar Jun 03 '22 13:06 vindarel

(ensure-starts-with "/" "/abc") isnt "/" considered the full string. It is not the last argument. This is confusing to me because starts-with-p has the full string as the last argument.

Disregard, I was confused, all is well.

charJe avatar Jun 03 '22 13:06 charJe

Could you clarify a typical use case for these functions? It could help with the naming.

In particular just calling it prefix and suffix makes me ask, what happens in this case: (ensure-starts-with "bar" "/ba")

Just from the name, should that give us "/bar" or "/babar"? The code clearly suggests it's the latter, but without knowing the use case the function name (and even documentation) feels a little ambiguous.

Many of your test-cases seem to suggest single character prefixes, maybe that's where the use case lies?

Another naming direction you could with is something like concat-if-required (not that exactly, but something of that manner)

tdrhq avatar Jun 03 '22 15:06 tdrhq

What's the difference with starts-with-p. I'd prefer the common CL naming scheme for predicates.

mdbergmann avatar Jun 03 '22 15:06 mdbergmann

The ensure- functions return a string that has the specified prefix or suffix, appened if necessary.

charJe avatar Jun 03 '22 16:06 charJe

My typical use-case if for slashes with URIs:

(str:ensure-starts-with "/" "foo/bar") ;; => /foo/bar/ (edited for middle / )
(str:ensure-enclosed-by "/" "9278-uuid-123") ;; => "/9278-uuid-123/"

in that case, that's a poor man URI handling when I don't want (or need) to reach for a proper lib like Quri.

@tdrhq

 (str:ensure-starts-with "bar" "/ba")
"bar/ba"

vindarel avatar Jun 03 '22 19:06 vindarel

@vindarel Oh sorry I meant (str:ensure-starts-with "/ba" "bar"). The shortest way to "ensure" that "bar" starts with "/ba" is to just prepend a "/"

tdrhq avatar Jun 03 '22 19:06 tdrhq

I agree with your use case btw, I can see myself using that frequently. Just not sure about the semantics or usefulness of multi-character prefixes/suffixes.

tdrhq avatar Jun 03 '22 19:06 tdrhq

I was trying to figure out real world situations where multi-character prefixs and suffixes come up, this is the best I got:

(str:ensure-ends-with "/index.html" "https://example.com/") --> I would expect this to return "https://example.com/index.html" not "https://example.com//index.html".

(str:ensure-starts-with "https://" "www.example.com") -> both behaviours would probably be reasonable

tdrhq avatar Jun 03 '22 19:06 tdrhq

yeah, good catch for the multi-character prefixes/suffixes. I didn't have such a use-case, so gotta give them more thought.

For this use-case though: (str:ensure-ends-with "/index.html" "https://example.com/") I am thinking about another function, to join several elements as in a URI, to add a "/" if necessary and to avoid double slashes: (str:join-as-uri "http://example.com/" "/index.html") -> the right thing, no double "/" (and for more use cases, we refer to Quri).

You might be on to something with this use case:

(str:ensure-starts-with "https://" "www.example.com")  ;; = logical
(str:ensure-starts-with "https://" "https://www.example.com")  ;; = nothing to do here
;; but here:
(str:ensure-starts-with "https:// "http://www.example.com") ;; => should it give "https://www.example.com" ?

if it should be the default behavior, another function or a keyword argument, I'll see. But I didn't meet the use case, our examples are URI-specific, so I'll probably stay simple with the current behavior.

vindarel avatar Jun 03 '22 20:06 vindarel

(str:ensure-starts-with "https:// "http://www.example.com") ;; => should it give "https://www.example.com" ?

This one should definitely not be part of cl-str. My URI examples were mostly for "first-pass", "hacky" implementations. It's convenient to just to string handling when you're just prototyping something.

PS. I went through my code base trying to find where this pattern shows up, the trailing slash shows up in several locations. The exact pattern of checking if it's a suffix and only then appending happens only at one place: https://github.com/screenshotbot/screenshotbot-oss/blob/main/src/util/store.lisp#L46 (copied from my internal repo), but there are several other places where I just do (format nil "~a/" var), and ensure-ends-with would read a lot better. They're all related to the "/" as far as I can tell.

tdrhq avatar Jun 03 '22 20:06 tdrhq

I renamed the functions to ensure-start, ensure-end, ensure-wrapped-in, I added wrapped-in-p.

I noticed a difference in a hedge case:

(str:starts-with-p nil "rst")
NIL
CL-USER> (str:wrapped-in-p nil "rst")
"rst"

I didn't add something for (str:join-as-uri "http://example.com/" "/index.html") in this PR, which I'd have the need for. But maybe I should just use Quri. BTW, off-topic for this PR, but this library seems to do what I want: be able to join URIs, but keep manipulating strings, because hey sometimes it's more convenient / quicker:

CL-USER > (url-parse "google.com" :path "/index.html")
http://google.com/index.html
;; and then:
CL-USER > (url-parse * :scheme "https")
https://google.com/?s=common+lisp

https://github.com/massung/url (not on QL)

With Quri… ok, not so bad.

CL-USER> (quri:merge-uris (quri:uri "/index.html") (quri:uri "https://en.wikipedia.org/wiki/URL"))
#<QURI.URI.HTTP:URI-HTTPS https://en.wikipedia.org/index.html>
CL-USER> (format nil "~a" *)
"https://en.wikipedia.org/index.html"

There's also a commit that says "Let merge-uris accept string parameters." "As it did in version 0.4.0." and this was many months ago, looks like I'm lagging behind)

vindarel avatar Feb 20 '23 18:02 vindarel

The idea of ensure function is nice.

One more option could be having one function with start, end, wrapped-in as keyword parameter, where either wrapped-in or start and/or end can be supplied. E.g.

(defun ensure (s &key start end wrapped-in)

Or with different parameter-names

(defun ensure (s &key prefix suffix wrapped-in)

kilianmh avatar Feb 23 '23 22:02 kilianmh

Good idea. Something like

(defun ensure (s &key start end wrapped-in)
  (cond
    (wrapped-in
     (ensure-wrapped-in wrapped-in s))
    ((and start end)
     (ensure-start start (ensure-end end s)))
    (start
     (ensure-start start s))
    (end
     (ensure-end end s)))
    (t
     s)))

in that case, a ":start" key is weird and too different than its usual meaning, so probably ":prefix" is better. This leads again to my hesitation to use ensure-prefix instead of ensure-start (because we have starts-with-p and the existing "*prefx" functions are a bit unusual in their signature. But "ensure-start" may not be has explicit has "ensure-starts-with" which is too long.). Shall we vote? :D

vindarel avatar Feb 24 '23 10:02 vindarel

re. naming, I now tend to ensure-prefix/suffix, along with (ensure s :prefix … :suffix …) @kilianmh thoughts?

vindarel avatar Feb 28 '23 19:02 vindarel

Yes, prefix/suffix makes more sense, even if you look up dictionary: prefix / suffix.

Here I have an exemplary implementation of partial prefix adding

  (defun ensure-prefix (prefix s)
    (labels ((add-prefix (remaining-prefix)
               (cond ((emptyp remaining-prefix)
                      (concat prefix s))
                     ((starts-with-p remaining-prefix s)
                      (concat (subseq prefix 0 (1+ (length remaining-prefix)))
                              s))
                     (t
                      (add-prefix (s-rest remaining-prefix))))))
      (if (or (null prefix)
              (starts-with-p prefix s))
          s
          (add-prefix (s-rest prefix)))))

(ensure-prefix "abc" "c3235") ;=> "abc3235"

We could implement ensure-suffix either similiarly or reuse ensure-prefix (less efficient, but less code)

  (defun ensure-suffix (suffix s)
    (reverse
     (ensure-prefix (reverse suffix)
                    (reverse s))))

kilianmh avatar Mar 01 '23 13:03 kilianmh

Alright, thanks, let's name them properly and merge quick, I have code using these functions now!

(ensure-prefix "abc" "c3235") ;=> "abc3235"

but this is too weird, I need a strong rationale to add this, but definitely not in the "ensure" functions nor in this PR. I questioned the possibility, but it was discarded by the discussion. For URLs, there are URL handling libraries.

vindarel avatar Mar 08 '23 18:03 vindarel