shlink icon indicating copy to clipboard operation
shlink copied to clipboard

Allow long URL to be case sensitive, affecting findIfExists flag

Open davidstoker opened this issue 4 years ago • 4 comments

How Shlink is set-up

  • Shlink Version: 2.9.1
  • PHP Version: 8
  • How do you serve Shlink: Docker image
  • Database engine used: MySQL 5.7 (AWS Aurora)

Summary

Case-sensitive URLs are returning the same short link URL even when findIfExists is set to true.

Current behavior

In my deployed setup I have the original_url column with a utf8_unicode_ci collation.

  1. If I pass a URL like http://example.com/test with findIfExists set to true, I get a short link http://foo.com/abc123
  2. If a pass a different URL (anything after domain should be case sensitive) like http://example.com/TeSt and findIfExists set to true, I get the same http://foo.com/abc123 short link back

Expected behavior

I expected differing results. It makes sense that an existing short code is already seen as existing since the utf8_unicode_ci is case-insensitive.

I see that the short_code column is intentionally set to use a utf8_bin collation to ensure the UNIQUE constraint is case sensitive for short codes. Would it make sense to have a similar configuration for original_url to ensure that lookup of existing URLs is case-sensitive as well?

How to reproduce

  1. Check MySQL database configuration and see if a case-insensitive collation is in use
  2. Generate short link for a URL like http://example.com/test and note the result
  3. Generate short link for URL like http://example.com/TeSt with findIfExists set to true and note that the same short code from step 1 was returned

davidstoker avatar Oct 19 '21 02:10 davidstoker

Would it make sense to have a similar configuration for original_url

Short answer without checking anything else, yes.

However, changing that is a breaking change that could potentially have many side effects, so it needs to be treated as a feature you can optionally opt-in.

acelaya avatar Oct 19 '21 05:10 acelaya

@acelaya Thanks for the quick response! Could you clarify why it should be treated as an optional feature? Since the path information in a URL is case-sensitive (scheme, host and others aren't), would it not be desirable to have the default behavior be that URLs and lookups against them like findIfExists be case-sensitive?

For a further view on my use, we use 6 digit case-sensitive alpha-numeric IDs in URLs similar to YouTube and others. We convert to short URLs for notifications and ran into repeated use of previous URLs since some of our IDs had 1-letter case differences. This naturally results in sending users to the wrong intended destination.

davidstoker avatar Oct 19 '21 13:10 davidstoker

Yes, it would be desirable, but the fact it will change the way Shlink behaves, can potentially break how others are using it, even if, technically speaking, we could say they are using it in the "wrong" way.

Because of that, it needs to be optional (I can see a stream of new tickets being opened otherwise). But from your point if view, there's no difference if it's treated as a feature or a bug. The only practical difference is that you will have to provide an env var when the new version is released.

acelaya avatar Oct 19 '21 14:10 acelaya

Thanks for the insight @acelaya! I'm wrapping up testing, but on my end it looks viable for me to change original_url to utf8_bin collation to solve for the behavior for now.

davidstoker avatar Oct 19 '21 15:10 davidstoker

This is no longer applicable, now that Shlink supports strict and loose modes.

acelaya avatar May 29 '23 07:05 acelaya