substrait icon indicating copy to clipboard operation
substrait copied to clipboard

Clarify name matching for functions, datatypes, and options

Open gforsyth opened this issue 1 year ago • 1 comments

#504 led to a discussion in one of the fortnightly calls about how to perform matching:

Snippet from the agenda notes from the Substrait fortnightly meeting on May 24th, 2023:

[Jacques] Let’s start with “does anyone support or want ADD to match add?”
<silence>
[Jacques] Ok, how about we just stick with “plain binary matching?”
[Gil] That seems the best approach, given we aren’t experts in languages
[Gil] We might also want to update the rules for option matching so we can have consistent matching everywhere.
[Jacques] That sounds reasonable, Victor, do you want to put together a PR?
[Victor] Sounds reasonable
[David] Do we want to recommend (not require) that everything be lowercase?
[Jacques] I think that’s fine
[Victor] What about also recommending snake_case?
[Jacques] Maybe lets move this to the ticket now

the general consensus was that we should preserve case in matching function names, e.g. ADD ~= add.

We also (relatively) recently codified that option names perform case insensitive matching. It was also discussed in the meeting that having inconsistent matching rules is not great for anyone.

Does anyone have any feelings about whether datatypes should also be matched exactly?

Should i8 be equivalent to I8? Should we require them to be lowercase?

Items to do for this:

  • [ ] codify in documentation that scalar and aggregate function names should be matched exactly
  • [ ] revert / change documentation and state that extension function options are matched exactly
  • [ ] clarify datatype matching policy (once we sort out what that is)

gforsyth avatar Jun 28 '23 12:06 gforsyth