The definitions of `.` and `source` in `shtab_builtins[]` are different

Open krader1961 opened this issue 5 years ago • 7 comments

While working on the conversion from the AST optget() function to the borg standard getopt_long() I noticed a weird discrepancy. The definitions of the . (dot) command and source in shtab_builtins[] are different. The former includes the BLT_SPC flag while the latter does not.

This discrepancy does not exist in the ksh93u+ release. That's because in that release source is an alias for `command .`, so there is no entry for source in the shtab_builtins[] array. In the ksh93v- release the alias has been removed and an entry for source added to shtab_builtins[], but without the BLT_SPC flag that is present in the definition of the . (dot) command. This is clearly a mistake. The question is why it doesn't appear to cause any problems. Is the BLT_SPC flag even needed for the . command?

krader1961 avatar Sep 26 '19 05:09 krader1961

POSIX specifies . as a “special builtin”. I'm assuming that is what the BLT_SPC flag does. So yes, it is needed per POSIX.

POSIX does not specify source at all, so there is no reason for it to be a special builtin. Special builtins really only exist for compatibility with historical behaviour. Making new ones is not desirable.

The old alias definition, command ., used the command builtin to disable the special-builtin properties* of ., so that it would act like a regular builtin. The current definition correctly preserves that behaviour.

So I would conclude that this is not a bug.


* Part of the specification of the command builtin is: “If the command_name is the same as the name of one of the special built-in utilities, the special properties in the enumerated list at the beginning of Special Built-In Utilities shall not occur.”

McDutchie avatar Sep 26 '19 21:09 McDutchie

Thanks, @McDutchie. I had answered most of my question(s) after opening this issue when it occurred to me to carefully read the ksh(1) man page. Nonetheless, your comment was helpful. Especially the link to the relevant POSIX text which we can compare to that in the first paragraph of the "Execution" section of the man page. Guess what? The POSIX page explaining what "special" means has no resemblance to the ksh explanation:

Each time a command is read, the above substitutions are carried out. If the command name matches one of the Special Built-in Commands listed below, it is executed within the current shell process. Next, the command name is checked to see if it matches a user defined function. If it does, the positional parameters are saved and then reset to the arguments of the function call. A function is also executed in the current shell process. When the function completes or issues a return, the positional parameter list is restored. For functions defined with the function name syntax, any trap set on EXIT within the function is executed. The exit value of a function is the value of the last command executed. If a command name is not a special built-in command or a user defined function, but it is one of the built-in commands listed below, it is executed in the current shell process.

The main problem is that 99.9% of users will expect the . (dot) and source builtins to behave identically. The difference in behavior is only indicated by the presence, or not, of a "dagger" (or hyphen) char at the front of the command description in the ksh(1) man page. Few people will notice that marker, and fewer still will figure out what it means. So we can do one of two things:

a) Explicitly document the meaning of "special" builtin and why . is special and source is not, or

b) Make the two, seemingly identical, commands behave the same.

Prior to creating a 2020.0.0 release branch a few weeks ago the second option would be unthinkable. The question now is whether we can make the two cases truly synonymous. The answer is almost certainly: no. Especially given that command name is documented as a way to suppress the legacy "special builtin" behavior. Having said that I'm going to try and find some time to experiment and see if the specialness of . and source really matters.

krader1961 avatar Sep 27 '19 04:09 krader1961

Also, the builtin documentation in the ksh(1) man page says:

Special built-ins cannot be bound to a pathname or deleted.

Which means you can't create a dynamically loaded plugin that defines an alternative implementation for a "special" builtin. Why should that be true for a "special" builtin like . (dot) but not for source, which 99.9% of people expect to be synonymous? And then there is this behavior:

KSH PROMPT:4: function source { echo WTF source; }
KSH PROMPT:5: source jklsdf
/bin/ksh: .: jklsdf: cannot open [No such file or directory]

Notice that due to the alias source 'command .' baked into ksh93u+ you can't actually use a function of that name. Which is a different, and unrelated, problem with the POSIX/ksh model for these objects that we will probably never be able to change. So the question is how to more explicitly document these behaviors.

krader1961 avatar Sep 27 '19 04:09 krader1961

Gah! I also meant to include this text from the "Built-in Commands" section of the ksh(1) man page in my previous comment:

Commands  that are preceded by one or two - symbols are
       special built-in commands and are treated specially in the following ways:
       1.     Variable  assignment  lists  preceding the command remain in effect
              when the command completes.
       2.     I/O redirections are processed after variable assignments.
       3.     Errors cause a script that contains them to abort.
       4.     They are not valid function names.
       5.     Words following a command preceded by -- that are in the format  of
              a  variable  assignment are expanded with the same rules as a vari-
              able assignment.  This means that tilde substitution  is  performed
              after  the  = sign and field splitting and file name generation are
              not performed.  These are called declaration built-ins.
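
Point 3 in the list above can be checked without an interactive session. The following is only a minimal sketch, not something taken from the ksh sources or test suite: `PROBE_SHELL` and the function name `survives` are names invented for this example, and `/nonexistent/dot_probe_file` is assumed not to exist. The result depends on which shell is probed, so no expected output is shown.

```shell
#!/bin/sh
# Sketch: does a failed dot command abort a non-interactive shell?
# PROBE_SHELL is a hypothetical knob for this example; set it to the
# ksh binary you want to examine. It defaults to plain sh.
shell=${PROBE_SHELL:-sh}
missing=/nonexistent/dot_probe_file   # assumed not to exist

# survives CMD: run CMD in a child shell and succeed only if the child
# went on to execute the command that follows it.
survives() {
    out=$("$shell" -c "$1 2>/dev/null; echo survived" 2>/dev/null)
    [ "$out" = survived ]
}

for cmd in ". $missing" "command . $missing"; do
    if survives "$cmd"; then
        echo "$cmd: script continued"
    else
        echo "$cmd: script aborted"
    fi
done
```

For shells that provide source, adding "source $missing" to the loop extends the comparison to the command discussed in this issue.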

TBD is whether this distinction for the . (dot) and source commands is a net positive or negative. If a net positive the difference in behavior needs to be more clearly documented. If a net negative we should make the two commands aliases so that there is no need to explain to users why they behave differently.

krader1961 avatar Sep 27 '19 04:09 krader1961

Point number one in my previous comment, from the ksh documentation, is obviously misleading if not outright wrong. Using a ksh93u+ binary:

$ /bin/ksh
KSH PROMPT:1: x=y . /dev/stdin
echo yes
yes
KSH PROMPT:2: typeset -p x
x=y
KSH PROMPT:3: x=z source /dev/stdin
echo no
no
KSH PROMPT:4: typeset -p x
x=z

Notice that the non-special source command also preserves the "variable assignment lists" after it completes.

krader1961 avatar Sep 27 '19 05:09 krader1961

Notice that the non-special source command also preserves the "variable assignment lists" after it completes.

That would be a bug in ksh93u+ (with command . to which source is aliased in that version).

But my testing is showing me that this is already fixed in the current code for both source and command .; the assignment is correctly not preserved beyond the command in the current version (1b34915).

$ echo $KSH_VERSION
Version A 2020.0.0-beta1-92-g1b349151
$ x=z source /dev/stdin
echo no
no
$ echo $x

$ x=z command . /dev/stdin
echo no
no
$ echo $x

$ x=z . /dev/stdin
echo no
no
$ echo $x
z
$ 
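
The interactive checks above can also be packaged as a small script so the same probe is easy to repeat against different ksh builds. This is a hedged sketch rather than an official test: the scratch file name and the function names `probe_dot` and `probe_command_dot` are made up here, and the script only reports what the running shell does, so the output will differ between shells and versions.

```shell
#!/bin/sh
# Sketch: does an assignment prefixed to a command outlive that command?
# Run it with the shell under test, e.g. "ksh probe.sh".
file=${TMPDIR:-/tmp}/dot_probe.$$   # scratch file name made up for this example
: > "$file"                          # an empty file is enough to source

# Each probe runs in a subshell, so x starts out unset every time.
probe_dot()         ( unset x; x=kept . "$1";         echo "${x:-discarded}" )
probe_command_dot() ( unset x; x=kept command . "$1"; echo "${x:-discarded}" )

echo ". file:         $(probe_dot "$file")"
echo "command . file: $(probe_command_dot "$file")"
rm -f "$file"
```

A third probe that uses source in place of . can be added in the same way for shells that have that command.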

McDutchie avatar Sep 28 '19 01:09 McDutchie

@McDutchie, Yes, you're correct that this has been fixed. The "correct" behavior also occurs in the ksh93v- release. I use scare quotes because it is not clear why . and source should have different semantics. We have two options:

  1. The last stable release, ksh93u+, seems to have the same behavior for both variants. So we could simply revert to that behavior by adding BLT_SPC to the flags for the source command definition.

  2. Retain the difference and improve the documentation to make that difference much clearer.

Also, some of the points about "special" builtins in my comment quoting from the ksh.1 man page are questionable or wrong. For example, "4. They are not valid function names." is obviously not true since 98% of the special builtins have names that are valid function names.

krader1961 avatar Sep 28 '19 02:09 krader1961