Determine full list of acceptable characters for table names

Open justinclift opened this issue 7 years ago • 4 comments

An error on the dev1 console shows someone uploaded a database with a space in its table name.

The upload succeeded, but the database table can't be viewed as our internal checks barf when they see the space and refuse to process it.

This is related to #23.

justinclift avatar Apr 13 '17 15:04 justinclift

As I've said in #23, it's probably best to eventually just allow all characters once we're sure that quoting works fine.
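For reference, a minimal sketch of what "quoting" could look like in Go, assuming the standard SQL rule (which SQLite accepts) of wrapping an identifier in double quotes and doubling any embedded double quote. The helper names `quoteIdentifier` and `buildSelect` are hypothetical and not taken from the DBHub.io code base.

```go
package main

import "strings"

// quoteIdentifier wraps a table or column name in double quotes and doubles
// any embedded double quote, following the standard SQL identifier escaping
// rule. Hypothetical helper, for illustration only.
func quoteIdentifier(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// buildSelect shows the quoted name being used in a query string.
// Also illustrative; the real query-building code may look different.
func buildSelect(table string) string {
	return "SELECT * FROM " + quoteIdentifier(table)
}
```

With this in place, a table called `my table` would be queried as `SELECT * FROM "my table"`, so the space no longer needs to be rejected up front.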

MKleusberg avatar Apr 15 '17 19:04 MKleusberg

Well, I can think of some potential exceptions, or at least characters we'd really want to consider carefully first, e.g. the "/" char in database names or usernames, as that would make things like this confusing:

  • someuser / somedatabase ← Standard layout, nothing weird here
  • someuser / somefolder / somedatabase ← Similar, but with a folder in the path. No weirdness
  • "someuser / somefolder" / somedatabase ← Looks like the previous entry, but it's actually a user with a username of "someuser / somefolder". Obviously a contrived example, but it demonstrates the namespace clash.

So I'm kind of thinking that we just completely disallow "/" in user, folder, and database names. In table names though they might be workable.

justinclift avatar Apr 18 '17 14:04 justinclift

You're right! We should definitely disallow the "/" character for database, user, and directory names.

For table names it should be fine. I also feel like we are free to restrict the database, user, and directory names in any way we like but can't really do that for table or column names. The first are names on our platform, the latter are names inside the users' data and we shouldn't mess with that.
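A rough sketch of that split in Go, assuming a single check for platform-level names (user, folder, database) and no "/" restriction at all for names inside the user's data. The function names are hypothetical, and any further rules (length limits, empty names, and so on) are deliberately left out.

```go
package main

import "strings"

// validPlatformName reports whether a user, folder, or database name is
// acceptable under the one rule discussed here: it must not contain "/",
// because "/" separates the parts of a path like user/folder/database.
// Hypothetical helper, for illustration only.
func validPlatformName(name string) bool {
	return !strings.Contains(name, "/")
}

// validTableNameSlashRule always returns true: table and column names belong
// to the user's data, so "/" is not restricted there in this sketch.
func validTableNameSlashRule(name string) bool {
	return true
}
```

Under this rule the contrived username `someuser / somefolder` is rejected, while a table named `sales/2017` is left alone.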

MKleusberg avatar Apr 29 '17 11:04 MKleusberg

Haven't yet gotten around to this, as our present validation system uses (Golang) regexes, which I don't (yet) have a deep understanding of.

We'll need to figure out a more complete method for determining "safe" vs "unsafe" characters:

  • If something isn't valid Unicode, reject it.
  • Not all Unicode characters are safe.
    • https://github.com/qntm/base65536gen#what-makes-a-character-safe
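As a very rough first pass at the two points above, here is a sketch in Go using only the standard library. The choice of "unsafe" categories (control, format, and private-use characters) is an assumption made for illustration and is not the list used by base65536gen. The name `isSafeName` is hypothetical.

```go
package main

import (
	"unicode"
	"unicode/utf8"
)

// isSafeName reports whether name is valid UTF-8 and contains none of the
// character categories treated as unsafe in this sketch.
// Assumption: "unsafe" means control (Cc), format (Cf), or private-use (Co)
// characters. A real validator would likely need a more complete list.
func isSafeName(name string) bool {
	// Reject anything that is not valid UTF-8 outright.
	if !utf8.ValidString(name) {
		return false
	}
	// Walk the string rune by rune and reject the assumed unsafe categories.
	for _, r := range name {
		if unicode.In(r, unicode.Cc, unicode.Cf, unicode.Co) {
			return false
		}
	}
	return true
}
```

Spaces and accented letters pass this check, while a tab, a right-to-left override (U+202E), or a stray `0xff` byte does not.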

To do this properly I'm thinking we'll first need to look through (and really grok) the approach taken by the above base65536gen project. Then figure out which parts are relevant to us and implement a validator using those.

Expecting that'll be a non-trivial task. :wink:

justinclift avatar Aug 14 '17 13:08 justinclift